Nalazite se na CroRIS probnoj okolini. Ovdje evidentirani podaci neće biti pohranjeni u Informacijskom sustavu znanosti RH. Ako je ovo greška, CroRIS produkcijskoj okolini moguće je pristupi putem poveznice www.croris.hr
izvor podataka: crosbi

Parallel reduction of four matrices to the condensed form for a general matrix eigenvalue algorithm (CROSBI ID 698519)

Prilog sa skupa u zborniku | sažetak izlaganja sa skupa

Bosner, Nela Parallel reduction of four matrices to the condensed form for a general matrix eigenvalue algorithm // Programme and Abstracts, 10th International Workshop onParallel Matrix Algorithms and Applications(PMAA 18). 2018. str. 21-21

Podaci o odgovornosti

Bosner, Nela

engleski

Parallel reduction of four matrices to the condensed form for a general matrix eigenvalue algorithm

The VZ algorithm proposed by Charles F. Van Loan (SIMA, 1975) is a QR-type process for the solution of the gen-eral matrix eigenvalue problemACx=λBDx, whereA, B∈Rn×m, C, D∈Rm×n, andm≥n. Especially, this algorithmis suitable for solving the generalized singular value problemATAx=μ2BTBx. Transforming the general eigen-value problem to the standard form(BD)−1(AC)x=λxrepresents a possible numerical danger, since formation ofthe productsACandBD, as well as formation of the inverse(BD)−1can produce a result with a large backwarderror. Thus, the VZ algorithm attempts to solve the problem without forming these products and inverse. Thisapproach transforms each of the four matrices separately into a suitable form, which is a generalization of theSchur decomposition. Actually, the algorithm computes orthogonal matricesQ, U∈Rn×nandV, Z∈Rm×msuchthatQAZis upper quasi-triangular, andQBV, ZTCUandVTDUare upper triangular. The VZ algorithm begins byreducing the matricesA, B, C, andDto an equivalent condensed form by the finite step initial reduction. Thisreduction finds orthogonal matricesQ0, U0, V0andZ0, such thatQ0AZ0is upper Hessenberg, andQ0BV0, ZT0CU0andVT0DU0are upper triangular. Then, the VZ iterations are applied to the matrices in the condensed form. Inthe initial reduction, Ais reduced to the upper Hessenberg form, while simultaneously preserving triangularityof the other three matrices. This is done by the Givens rotations, annihilating one by one element ofA, and bygenerating three more rotations applied to the other matrices per each annihilation. Such an algorithm is quiteinefficient. In our work, we propose a blocked algorithm for the initial reduction, based on the aggregated Givensrotations and matrix–matrix multiplications, which are applied in the outer loop updates. This algorithm has an-other level of blocking, exploited in the inner loop. Further, application of a sequence of the rotations in the innerloop is parallelized, with balanced operation count per thread. Since a large number of aggregated rotations isproduced in every outer loop step, they are simultaneously accumulated before the outer loop updates. We alsoconsider a variant of the algorithm in a hybrid CPU–GPU framework, where the compute-intensive outer loopupdates are performed on GPU, and can be overlapped with the reduction in the next step performed on CPU.This adjustments speed up the original initial reduction considerably, and the efficiency of the whole VZ algorithmis increased.

Hessenberg-triangular-triangular-triangular form ; generalized singular value and eigenvalue problem ; Givens rotations ; block and parallel implementations

nije evidentirano

nije evidentirano

nije evidentirano

nije evidentirano

nije evidentirano

nije evidentirano

Podaci o prilogu

21-21.

2018.

objavljeno

Podaci o matičnoj publikaciji

Podaci o skupu

10th International Workshop on Parallel Matrix Algorithms and Applications

predavanje

27.06.2018-29.06.2018

Zürich, Švicarska

Povezanost rada

Matematika, Računarstvo