Bibliographic record no.: 1175659
Person localization model based on a fusion of acoustic and visual inputs
Person localization model based on a fusion of acoustic and visual inputs // Electronics (Basel), 11 (2022), 3; 440, 13 doi:10.3390/electronics11030440 (international peer review, article, scientific)
CROSBI ID: 1175659
Title
Person localization model based on a fusion of acoustic and visual inputs
Authors
Koren, Leon ; Stipancic, Tomislav ; Ricko, Andrija ; Orsag, Luka
Source
Electronics (Basel) (2079-9292) 11 (2022), 3; 440, 13
Type, subtype and category of work
Journal articles, article, scientific
Keywords
spatial location ; residual neural network ; digital filter ; person separation ; cognitive robotics ; multimodal signal processing ; sensors ; HRI
Abstract
PLEA is an interactive, biomimetic robotic head with non-verbal communication capabilities. PLEA's reasoning is based on a multimodal approach that combines video and audio inputs to infer the current emotional state of a person. PLEA expresses emotions using facial expressions generated in real time and projected onto the 3D projection face surface. In this paper, a more sophisticated computation mechanism is developed and evaluated. The Model for Audio-Visual Person Separation can locate a talking person in a crowded place by combining the input from the ResNet network with the input from a hand-crafted algorithm. While the first input is used to find human faces in the room, the second input is used to determine the direction of the sound and to focus attention on a single person. After an information fusion procedure is performed, the face of the person speaking is matched with the corresponding sound direction. As a result of this procedure, the robot can start an interaction with the person based on non-verbal signals. The model is tested and evaluated under laboratory conditions in interaction with users. The results suggest that the methodology can be efficiently used to focus a robot's attention on the localized person.
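The fusion step described in the abstract, matching a detected face to a sound direction, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Face` type, the function names, the pinhole-camera assumption, the shared camera and microphone origin, and the 10-degree tolerance are all hypothetical choices made for illustration. It assumes a face detector has already produced bounding boxes and a separate algorithm has already estimated the sound azimuth.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Face:
    """Hypothetical face detection: bounding box in pixels."""
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float


def face_azimuth_deg(face: Face, image_width: int, hfov_deg: float) -> float:
    """Horizontal angle of the face centre relative to the optical axis.

    Assumes an ideal pinhole camera; negative values are left of centre.
    """
    cx = face.x + face.width / 2.0
    # Focal length in pixels, derived from the horizontal field of view.
    focal_px = (image_width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    return math.degrees(math.atan2(cx - image_width / 2.0, focal_px))


def match_speaker(
    faces: Sequence[Face],
    sound_azimuth_deg: float,
    image_width: int,
    hfov_deg: float,
    max_error_deg: float = 10.0,  # assumed tolerance, not from the paper
) -> Optional[int]:
    """Return the index of the face closest to the sound direction.

    Returns None when no face lies within max_error_deg of the sound.
    Assumes camera and microphone array share the same origin and axis.
    """
    best_index: Optional[int] = None
    best_error = float("inf")
    for i, face in enumerate(faces):
        error = abs(face_azimuth_deg(face, image_width, hfov_deg) - sound_azimuth_deg)
        if error <= max_error_deg and error < best_error:
            best_index, best_error = i, error
    return best_index
```

For example, with a 640-pixel-wide image and an assumed 90-degree horizontal field of view, a sound arriving from roughly 40 degrees to the right would be matched to a face near the right edge of the frame rather than to one in the centre.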
Original language
English
Scientific fields
Computer science, Mechanical engineering, Interdisciplinary technical sciences
AFFILIATIONS OF THE WORK
Projects:
HRZZ-UIP-2020-02-7184 - Affective multimodal interaction based on constructed robot cognition (AMICORC) (Stipančić, Tomislav, HRZZ - 2020-02) (CroRIS)
Institutions:
Faculty of Mechanical Engineering and Naval Architecture, Zagreb
Journal indexed in:
- Current Contents Connect (CCC)
- Web of Science Core Collection (WoSCC)
- Science Citation Index Expanded (SCI-EXP)
- Social Science Citation Index (SSCI)
- SCI-EXP, SSCI and/or A&HCI
- Scopus