Data source: CROSBI

Assessing and Modeling the Role of the Noticeability of Sound Events and Attention in Urban Sound Perception (CROSBI ID 428264)

Thesis | doctoral dissertation

Filipan, Karlo. Assessing and Modeling the Role of the Noticeability of Sound Events and Attention in Urban Sound Perception / Botteldooren, Dick ; De Coensel, Bert (mentor); Gent, Belgium, 2018

Responsibility information

Filipan, Karlo

Botteldooren, Dick ; De Coensel, Bert

English

Assessing and Modeling the Role of the Noticeability of Sound Events and Attention in Urban Sound Perception

People who live in urban areas are often bothered by noise in their daily lives. The increase of mechanical noise in the urban environment in the twentieth century stimulated research into the negative effects of noise on health and well-being. The investigation led the regulatory authorities to propose policies and legislation for the reduction of these negative effects. However, this approach has not been very effective and noise remains a problem in many places. That is why, in the nineties, a different approach emerged that considered the urban sound environment as a whole, including the sounds that contribute to the pleasantness, individuality and the relaxing character of the environment. This concept was called soundscape and it emphasized not only the physical sound levels, but also their perception and the way in which these sounds are interpreted by the citizens and the society as a whole. Soundscape research has shown that it is mainly the individual sounds that people notice which contribute to the perception of the overall quality of the sound environment. Noticeability of the sounds is closely related to attention which can be divided into two parts: bottom-up (inward-oriented, involuntary) attention based on the physical characteristics of the sound, and top-down (outward-oriented, voluntary) attention formed by the higher cognitive processes. Although models and psychoacoustic metrics that are used today in soundscape research are related to perception, they do not explicitly take the attention system into account. This work aims to unravel the influence of attention on the perception of the sound environment and to present calculation models for the analysis of sonic environments based on this knowledge. 
The thesis is structured in three parts: mobile measurements and assessment of urban soundscape using the computational machine listening model, modeling of bottom-up auditory attention from the spectrotemporal modulation features, and investigation of the effect of the higher cognitive processes on the attended sounds. The creation of the models for bottom-up auditory attention, i.e. auditory saliency (a metric that describes how much a sound stands out from its environment), is inspired by both physiological knowledge from previous experimental studies in neuroscience and psychoacoustical knowledge. In this thesis, the studied urban sound environment was an urban park: a public space that is accessible to everyone and which is often regarded as a restorative place. To represent the spatial and the temporal variation of the sonic environment, a methodology is proposed for recording the sound using mobile measurements with a combination of GPS and a self-developed sound meter. This methodology was tested in a study of eight city parks in Antwerp where both the mobile noise measurements and the questioning of the visitors were carried out. It is shown that the mapped physical and psychoacoustical characteristics of the sound environment could be related to the quality of the sound environment perceived by the visitors. On the same sound measurements, machine listening was also applied using a model, developed by Michiel Boes, which used a multi-layered recursive neural network to simulate the dynamics of human hearing (attention, forgetting, inhibition, etc.). However, this model makes use of a simplified estimate of the saliency of the sound based on the output of the current generation of sound level meters and an equivalence with visual saliency.
Nevertheless, it can be shown that the audible sounds estimated by the machine listening model give a better prediction of the quality of the sound environment as assessed by the people than the classical acoustical and psychoacoustical indicators. Based on this analysis, it can be stated that it makes sense to analyze the sound features that attract attention more accurately. It is plausible that the sounds to which the human auditory system is specifically tuned are also the ones that attract attention. Several literature sources underline the selectivity and tuning of the human auditory cortex to specific spectrotemporal modulations, i.e. dynamic ripples corresponding to the simultaneous modulation in time and frequency domain. Because of their close relation to the physiological reaction, it was decided to use the spectrotemporal modulation features in the computational model for auditory saliency. To test the relevance of these features, the response to simple sounds (car honk and music) and their counterparts with the same amplitude spectrum but scrambled phase was tested. The comparison with other feature extractors shows that the spectrotemporal features discriminate between the original and the adapted version of the sound, just as a person can easily make this distinction. Because environmental sounds are often short, the transient response of the model is very important in the analysis of environmental sound. Therefore, a physiologically-inspired computational model for auditory saliency that uses the investigated spectrotemporal modulation features was developed. The model is divided into two parts: auditory periphery and a central processing stage. A state-of-the-art ear model by Sarah Verhulst is used for the periphery. However, due to the detailed implementation of the ear dynamics, this model is too slow to apply to large datasets of environmental noise.
To overcome this problem, a fast alternative was created which uses a Gammatone filter bank and a demodulation based on squaring and filtering. The stage of the saliency model that simulates the central processing comprises a simulation of the auditory cortex, which evaluates the modulation content, and an excitation-inhibition model that responds to the changes in time of the modulation content of the sound. Compared to the earlier models reported in the literature, the proposed model maps the transient response better by using a combination of damped resonators and time delays as well as the inhibition for the implementation of sensitivity to spectrotemporal modulations. Different stimuli were used to analyze the response of the created model. The subtle transition from a pure tone to a modulated or a rough tone can easily be detected by a human being. It is shown that the model for auditory saliency also reacts to such transitions. Masking could only be determined when the advanced ear periphery is used. In electrophysiological measurements of the human response to deviant sounds, i.e. mismatch negativity, sequences of tones with a sporadically deviating frequency are used. The model responds to these sequences in a way similar to human hearing. Finally, an environmental noise specifically designed to stand out from its background was also examined: a siren from an emergency vehicle embedded in background traffic noise. The results show that the model correctly predicts that the increase in the level when adding an emergency siren is more salient than a comparable increase in the level when adding traffic noise. A number of applications of the computational model for auditory saliency were explored within this doctoral research. The current industrial noise legislation includes a dose penalty for the very annoying sounds. However, there are no clear guidelines for identifying these sounds.
Therefore, the response to impulsive sounds and rapid increases in amplitude was tested using several models applicable to soundscape research. It was shown that the model for auditory saliency can predict the presence of the annoying impulse noise and the noise with short rise times within a context of traffic noise. This result advocates the application of the created model as an assessment method for new noise regulations. Another application of the model is the interpretation of a listening experiment, in which the participants evaluated the pleasantness of the recorded soundwalks. The data were analyzed for the causality prediction using Granger causality methodology between three metrics: sound amplitude, calculated saliency and the change in pleasantness rating. The findings suggest that the saliency of a sound is a better predictor for the change in the assessment of the pleasantness than the sound amplitude. To further assess the importance of this auditory saliency within an audiovisual context, a comparison between experiments with and without the video was performed. It is shown that the auditory saliency becomes an even better predictor for the change in the assessment of the pleasantness when visual stimulus is removed from the experimental setting. Although auditory saliency is an important and sometimes dominant component of attention, voluntary attention also plays an important role in noticing the specific sound events. The component of the voluntary attention is related to the meaning and belief that the people assign to a sound environment. This outward-oriented attention was investigated in the third part of the thesis. The perceptual dataset from the Antwerp parks was evaluated based on the meaning that the people assign to tranquility. 
According to the methodology proposed by Pauline Delaitre and Catherine Lavandier, visitors of the parks were assigned to one of the three tranquility viewpoint groups (natural sound sources, social relationships, silence) based on their responses to a questionnaire. Using an analysis of variance, membership in a tranquility viewpoint group was related to the reported hearing of sounds from three categories: human, natural and mechanical. The results show that the visitors notice the sounds outside their tranquility viewpoint group more often: the participants who belonged to the group that associated tranquility with natural sound sources reported having heard more mechanical sounds. Furthermore, the same visitors rate the overall quality of the sound environment as less pleasant in comparison to the visitors belonging to the groups with a different viewpoint on tranquility. The models and methodologies for investigation of the noticeability of sound events presented in this thesis could serve as a tool for investigating urban soundscape. By focusing on the elements of the urban sound environment that people pay attention to, it would be possible to condense the large amounts of data from soundscape studies, so that researchers and designers could concentrate on the sounds and their sources which are relevant for improving the perceived quality of the sonic environment.
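The abstract describes the fast peripheral stage only as "a Gammatone filter bank and a demodulation based on squaring and filtering". As a rough illustration of the demodulation step alone, the following minimal sketch recovers the envelope of a single narrowband signal by squaring it and low-pass filtering the result. It is not the thesis implementation: the Gammatone filter bank is omitted, and the one-pole low-pass filter, the 50 Hz cutoff, the 16 kHz sampling rate, the test tones, and all function names are assumptions chosen for the example.

```python
import math


def square_law_envelope(signal, sample_rate, cutoff_hz):
    """Estimate an amplitude envelope by squaring, low-pass filtering, and taking the root."""
    # One-pole low-pass coefficient (illustrative filter choice, not taken from the thesis).
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    smoothed = 0.0
    envelope = []
    for x in signal:
        # Low-pass filter the squared signal.
        smoothed += alpha * (x * x - smoothed)
        # A sinusoid of amplitude A has mean square A^2 / 2, hence the factor 2.
        envelope.append(math.sqrt(2.0 * smoothed))
    return envelope


def am_tone(sample_rate, duration_s, carrier_hz, mod_hz, depth):
    """Generate a sinusoidal carrier with sinusoidal amplitude modulation."""
    n = int(sample_rate * duration_s)
    return [
        (1.0 + depth * math.cos(2.0 * math.pi * mod_hz * i / sample_rate))
        * math.sin(2.0 * math.pi * carrier_hz * i / sample_rate)
        for i in range(n)
    ]


def fluctuation(envelope, sample_rate):
    """Peak-to-peak envelope variation after skipping the first half second."""
    settled = envelope[sample_rate // 2:]  # skip the filter's start-up transient
    return max(settled) - min(settled)


SAMPLE_RATE = 16000  # Hz, hypothetical
pure = am_tone(SAMPLE_RATE, 1.0, 1000.0, 4.0, 0.0)       # unmodulated 1 kHz tone
modulated = am_tone(SAMPLE_RATE, 1.0, 1000.0, 4.0, 0.8)  # same tone with 4 Hz modulation

pure_fluct = fluctuation(square_law_envelope(pure, SAMPLE_RATE, 50.0), SAMPLE_RATE)
mod_fluct = fluctuation(square_law_envelope(modulated, SAMPLE_RATE, 50.0), SAMPLE_RATE)
print(f"pure tone: {pure_fluct:.3f}, modulated tone: {mod_fluct:.3f}")
```

In this sketch the envelope of the pure tone stays nearly flat, while that of the modulated tone swings with the 4 Hz modulation. This is the kind of modulation content that a later central stage, such as the one summarized in the abstract, would evaluate.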

Urban Soundscape ; Computational Modeling ; Auditory Attention ; Auditory Saliency

Publication information

200

26.11.2018.

defended

Degree-granting institution

Gent, Belgium

Related information

Electrical Engineering, Computing