Bibliographic record number: 1236123
Vectorization of a thread-parallel Jacobi singular value decomposition method
Vectorization of a thread-parallel Jacobi singular value decomposition method // SIAM journal on scientific computing, 45 (2023), 3; C73-C100 doi:10.1137/22M1478847 (international peer review, article, scientific)
CROSBI ID: 1236123
Title
Vectorization of a thread-parallel Jacobi singular value decomposition method
Authors
Novaković, Vedran
Source
SIAM journal on scientific computing (1064-8275) 45 (2023), 3; C73-C100
Type, subtype and category of work
Journal papers, article, scientific
Keywords
batched eigendecomposition of Hermitian matrices of order two ; SIMD vectorization ; singular value decomposition ; parallel one-sided Jacobi-type SVD method
Abstract
The eigenvalue decomposition (EVD) of (a batch of) Hermitian matrices of order two has a role in many numerical algorithms, of which the one-sided Jacobi method for the singular value decomposition (SVD) is the prime example. In this paper the batched EVD is vectorized, with a vector-friendly data layout and the AVX-512 SIMD instructions of Intel CPUs, alongside other key components of a real and a complex OpenMP-parallel Jacobi-type SVD method, inspired by the sequential xGESVJ routines from LAPACK. These vectorized building blocks should be portable to other platforms that support similar vector operations. Unconditional numerical reproducibility is guaranteed for the batched EVD, sequential or threaded, and for the column transformations, that are, like the scaled dot-products, presently sequential but can be threaded if nested parallelism is desired. No avoidable overflow of the results can occur with the proposed EVD or the whole SVD. The measured accuracy of the proposed EVD often surpasses that of the xLAEV2 routines from LAPACK. While the batched EVD outperforms the matching sequence of xLAEV2 calls, speedup of the parallel SVD is modest but can be improved and is already beneficial with enough threads. Regardless of their number, the proposed SVD method gives identical results, but of somewhat lower accuracy than xGESVJ.
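As an illustration of the batched EVD mentioned in the abstract, the following is a minimal sketch for the real symmetric case only, using the textbook Jacobi rotation formulas rather than the paper's vectorized algorithm. The function name `batched_sym2_evd` and the three-list input layout (one list per matrix entry, loosely mimicking a vector-friendly structure-of-arrays layout) are assumptions made for this example. It does not reproduce the paper's scaling, its overflow guarantees, or its AVX-512 code.

```python
import math

def batched_sym2_evd(a11, a12, a22):
    """EVD of a batch of real symmetric 2x2 matrices [[a11, a12], [a12, a22]].

    Hypothetical helper: inputs are three equally long lists, one per matrix
    entry. For each matrix it returns the cosine and sine of a Jacobi rotation
    and the two eigenvalues, such that (cs, -sn) is an eigenvector for lam1
    and (sn, cs) is an eigenvector for lam2.
    """
    cs, sn, lam1, lam2 = [], [], [], []
    for p, b, q in zip(a11, a12, a22):
        if b == 0.0:
            t = 0.0  # already diagonal: identity rotation
        else:
            # cot(2*phi); hypot avoids squaring theta explicitly
            theta = (q - p) / (2.0 * b)
            # smaller-magnitude root of t^2 + 2*theta*t - 1 = 0, t = tan(phi)
            t = math.copysign(1.0, theta) / (abs(theta) + math.hypot(theta, 1.0))
        c = 1.0 / math.hypot(t, 1.0)
        cs.append(c)
        sn.append(t * c)
        lam1.append(p - t * b)
        lam2.append(q + t * b)
    return cs, sn, lam1, lam2

# Usage: three matrices processed as one batch.
cs, sn, lam1, lam2 = batched_sym2_evd([2.0, 4.0, 1.0],
                                      [1.0, 0.0, 2.0],
                                      [2.0, 1.0, -2.0])
```

In a SIMD setting the loop body would be applied to many lanes at once, which is why each matrix entry is stored in its own contiguous array here; the pure-Python loop only stands in for that idea.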
Original language
English
Scientific fields
Mathematics, Computer science
Note
Accepted for publication on 8 December 2022.
Published online on 2 June 2023.
Preprint: https://doi.org/10.48550/arXiv.2202.08361
LINKED TO THIS WORK
Projects:
HRZZ-IP-2014-09-3670 - Matrix Factorizations and Block Diagonalization Algorithms (MFBDA) (Hari, Vjeran, HRZZ - 2014-09) (CroRIS)
Profiles:
Vedran Novaković (author)
The journal is indexed in:
- Current Contents Connect (CCC)
- Web of Science Core Collection (WoSCC)
- Science Citation Index Expanded (SCI-EXP)
- SCI-EXP, SSCI and/or A&HCI
- Scopus
Included in other bibliographic databases:
- Compendex (EI Village)
- Compu-Math Citation Index
- INSPEC
- MathSciNet
- Zentralblatt für Mathematik/Mathematical Abstracts