Bibliographic record number: 932608
Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs
Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs // Proceedings of Machine Learning Research v. 70
Sydney, Australia, 2017, pp. 1733-1741 (poster, international peer review, full paper (in extenso), scientific)
CROSBI ID: 932608
Title
Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs
Authors
Jing, Li ; Shen, Yichen ; Dubček, Tena ; Peurifoy, John ; Skirlo, Scott ; LeCun, Yann ; Tegmark, Max ; Soljačić, Marin
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Proceedings of Machine Learning Research v. 70
2017, pp. 1733-1741
Conference
International Conference on Machine Learning
Place and date
Sydney, Australia, 6-11 August 2017
Type of participation
Poster
Type of peer review
International peer review
Keywords
neural network, deep learning, unitary neural network
Abstract
Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data. This approach appears particularly promising for Recurrent Neural Networks (RNNs). In this work, we present a new architecture for implementing an Efficient Unitary Neural Network (EUNN); its main advantages can be summarized as follows. Firstly, the representation capacity of the unitary space in an EUNN is fully tunable, ranging from a subspace of SU(N) to the entire unitary space. Secondly, the computational complexity for training an EUNN is merely O(1) per parameter. Finally, we test the performance of EUNNs on the standard copying task, the pixel-permuted MNIST digit recognition benchmark, as well as the Speech Prediction Test (TIMIT). We find that our architecture significantly outperforms both other state-of-the-art unitary RNNs and the LSTM architecture, in terms of the final performance and/or the wall-clock training speed. EUNNs are thus promising alternatives to RNNs and LSTMs for a wide variety of applications.
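The norm-preservation property behind the abstract's first claim can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one common parametrization in which a unitary map is built from layers of 2x2 rotations, each with a hypothetical parameter pair (theta, phi) acting on two adjacent coordinates. The function names (rotation_layer, unitary_apply, random_layers) are invented for this illustration. Each rotation does a constant amount of work, which is one way to read the O(1)-per-parameter cost, and the number of layers controls how much of the unitary space is reachable.

```python
import cmath
import math
import random


def rotation_layer(x, thetas, phis, offset):
    """Apply 2x2 unitary rotations to adjacent coordinate pairs of x.

    Pairs start at index `offset` (0 or 1). Each pair (i, i+1) uses one
    (theta, phi) parameter pair, so the work is constant per parameter.
    Assumed block: [[e^{i phi} cos t, -e^{i phi} sin t], [sin t, cos t]].
    """
    y = list(x)
    for k, i in enumerate(range(offset, len(x) - 1, 2)):
        c, s = math.cos(thetas[k]), math.sin(thetas[k])
        phase = cmath.exp(1j * phis[k])
        a, b = x[i], x[i + 1]
        y[i] = phase * (c * a - s * b)
        y[i + 1] = s * a + c * b
    return y


def unitary_apply(x, layers):
    """Apply the layers in order, alternating the pair offset 0, 1, 0, ..."""
    for depth, (thetas, phis) in enumerate(layers):
        x = rotation_layer(x, thetas, phis, depth % 2)
    return x


def random_layers(n, depth, rng):
    """Draw random (theta, phi) parameters for `depth` layers on n coordinates."""
    layers = []
    for d in range(depth):
        pairs = len(range(d % 2, n - 1, 2))
        thetas = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(pairs)]
        phis = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(pairs)]
        layers.append((thetas, phis))
    return layers


def norm(x):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(v) ** 2 for v in x))


if __name__ == "__main__":
    rng = random.Random(0)
    layers = random_layers(8, 4, rng)
    h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(8)]
    start = norm(h)
    for _ in range(1000):  # mimic 1000 recurrent steps
        h = unitary_apply(h, layers)
    print("norm before:", start, "norm after:", norm(h))
```

Because every block is unitary, the hidden-state norm stays the same (up to floating-point rounding) no matter how many recurrent steps are applied, whereas repeated multiplication by a general matrix would typically shrink or blow up the state.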
Original language
English
Scientific fields
Physics, Computer Science
AFFILIATIONS
Institutions:
Prirodoslovno-matematički fakultet, Zagreb