The multishift QR algorithm is efficient for computing all the eigenvalues of
a dense, large-scale, non-Hermitian matrix. Most of its computation can be performed
as matrix-matrix multiplications, which makes it well suited to modern processors
with hierarchical memory. A recently proposed variant of this algorithm performs
an even larger share of the computation as matrix-matrix multiplications. This variant is
especially well suited to recent coprocessors that contain many processing elements,
such as the CSX600. However, its performance depends strongly on
parameter settings such as the number of shifts and the number of divisions used in the algorithm,
and the optimal settings vary with the matrix size and the computational environment.
In this paper, we construct a performance model that predicts the parameter setting
that minimizes the execution time of the algorithm. Experimental results with
the CSX600 coprocessor show that our model can be used to find the optimal setting.
Keywords: eigenvalues, multishift QR algorithm, bulge-chasing, performance modeling
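
As a rough illustration of model-based parameter selection, the sketch below picks the pair (number of shifts, number of divisions) with the smallest predicted time from a list of candidates. It is a minimal sketch under stated assumptions, not the model constructed in this paper: the functions gemm_rate, toy_time and best_setting, the form of the cost terms, and every constant in them are hypothetical and chosen only to make the example runnable.

```python
from itertools import product


def gemm_rate(k, peak=5.0e10, half=64.0):
    """Hypothetical matrix-multiply speed in flop/s.

    The speed rises with the inner dimension k and saturates at `peak`;
    `half` is the value of k at which half of the peak is reached.
    """
    return peak * k / (k + half)


def toy_time(n, m, q):
    """Toy execution-time model (an assumption, not the paper's model).

    n: matrix size, m: number of shifts, q: number of divisions.
    """
    # Part done as matrix-matrix multiplication; more shifts give larger,
    # more efficient multiplications.
    t_gemm = 10.0 * n**3 / gemm_rate(m)
    # Part not done as matrix-matrix multiplication; assumed to grow with m,
    # shrink with q, and run at a slow fixed rate of 1e9 flop/s.
    t_small = 4.0 * n**2 * m / q / 1.0e9
    # Assumed fixed overhead per division, paid once per group of m shifts.
    t_overhead = 1.0e-4 * q * (n / m)
    return t_gemm + t_small + t_overhead


def best_setting(n, shifts, divisions, model=toy_time):
    """Return the (m, q) candidate with the smallest predicted time."""
    return min(product(shifts, divisions), key=lambda s: model(n, *s))


if __name__ == "__main__":
    m, q = best_setting(6000, range(20, 201, 20), [1, 2, 4, 8])
    print("predicted best setting: shifts =", m, ", divisions =", q)
```

Because the search only evaluates the model, trying every candidate is cheap compared with running the eigenvalue computation once per setting. A model fitted to measurements on the target machine would be passed in through the model argument in place of toy_time.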