Deep Convolutional Oscillator: Synthesizing Waveforms from Timbral Descriptors

Kreković, Gordan

Bibliographic record number: 1197683

Deep Convolutional Oscillator: Synthesizing Waveforms from Timbral Descriptors // Proceedings of the 19th Sound and Music Computing Conference
Saint-Étienne, France, 2022, pp. 200-206, doi:10.5281/zenodo.6573045 (lecture, international peer review, full paper (in extenso), scientific)

CROSBI ID: 1197683

Title
Deep Convolutional Oscillator: Synthesizing Waveforms from Timbral Descriptors

Authors
Kreković, Gordan

Type, subtype, and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Proceedings of the 19th Sound and Music Computing Conference, 2022, pp. 200-206

Conference
Sound and Music Computing 2022 (SMC-22)

Place and date
Saint-Étienne, France, 5-12 June 2022

Type of participation
Lecture

Type of review
International peer review

Keywords
sound synthesis; wavetable synthesis; deep learning; convolutional neural networks; timbral attributes; generative neural network

Abstract
This paper presents a novel deep learning model for synthesizing single-cycle waveforms from timbral attributes. The motivation was to investigate a viable alternative to traditional wavetable oscillators with intuitive control. Based on a thorough literature review and practical considerations, we selected three attributes appropriate for describing timbral characteristics of steady and harmonic tones: bright, warm, and rich. A deep learning network was designed to map magnitudes of these attributes to single-cycle waveforms. The architecture was based on stacking upsampling and convolutional layers to model temporal dependencies within the waveform. The network was trained on a large number of waveforms extracted from the NSynth dataset. Audio features closely related to the selected attributes were used as inputs, while a custom loss function was employed to minimize the difference in normalized power spectra between outputs and training waveforms. Four models with different hyperparameters were trained, and the best one was selected using the validation dataset. Further experiments with the selected model showed that synthesized waveforms generally match the input attributes well: the mean absolute errors for normalized attributes were 0.07, 0.05, and 0.18 for bright, warm, and rich, respectively, on the testing dataset.
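The abstract mentions a custom loss that compares normalized power spectra of generated and training waveforms. The record does not give its exact form, so the following is only a minimal sketch under stated assumptions: the spectrum is normalized to unit sum, the distance is a mean squared difference, and the names `power_spectrum`, `normalize`, `spectral_loss`, and `sine_cycle` are hypothetical helpers, not taken from the paper. A naive DFT from the standard library is used to keep the example self-contained.

```python
import cmath
import math


def power_spectrum(waveform):
    """Power of each non-negative-frequency DFT bin (naive O(N^2) DFT)."""
    n = len(waveform)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(waveform)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def normalize(spectrum, eps=1e-12):
    """Scale bins to sum to 1 so overall level does not affect the comparison.

    Unit-sum normalization is an assumption; the paper may normalize differently.
    """
    total = sum(spectrum) + eps
    return [p / total for p in spectrum]


def spectral_loss(generated, target):
    """Mean squared difference between normalized power spectra (assumed form)."""
    a = normalize(power_spectrum(generated))
    b = normalize(power_spectrum(target))
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def sine_cycle(n, harmonic=1, phase=0.0, amplitude=1.0):
    """One cycle of length n containing a single sinusoidal harmonic."""
    return [
        amplitude * math.sin(2 * math.pi * harmonic * t / n + phase)
        for t in range(n)
    ]


if __name__ == "__main__":
    reference = sine_cycle(64, harmonic=1)
    shifted = sine_cycle(64, harmonic=1, phase=1.0, amplitude=0.5)
    octave = sine_cycle(64, harmonic=2)
    print("same harmonic, other phase/level:", spectral_loss(reference, shifted))
    print("different harmonic:", spectral_loss(reference, octave))
```

Because only power spectra are compared, a loss of this kind ignores phase and, after normalization, overall amplitude. Two waveforms with the same harmonic balance therefore score as nearly identical even if their shapes in the time domain differ.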

Original language
English

Scientific fields
Computer science

WORK ASSOCIATIONS

Profiles:

GORDAN KREKOVIĆ (author)

Links to the full text of the work:

Full-text access: doi:10.5281/zenodo.6573045 (zenodo.org)

CROSBI: Croatian Scientific Bibliography
