Human action prediction in collaborative environments based on shared-weight LSTMs with feature dimensionality reduction

Petković, Tomislav; Petrović, Luka; Marković, Ivan; Petrović, Ivan

doi:10.1016/j.asoc.2022.109245

Bibliographic record number: 1205498

Human action prediction in collaborative environments based on shared-weight LSTMs with feature dimensionality reduction // Applied Soft Computing, 126 (2022), 109245, 12 doi:10.1016/j.asoc.2022.109245 (international peer review, article, scientific)

CROSBI ID: 1205498

Title
Human action prediction in collaborative environments based on shared-weight LSTMs with feature dimensionality reduction

Authors
Petković, Tomislav ; Petrović, Luka ; Marković, Ivan ; Petrović, Ivan

Source
Applied Soft Computing (1568-4946) 126 (2022); 109245, 12

Type, subtype and category of work
Journal papers, article, scientific

Keywords
Human action prediction ; Long short-term memory networks ; Feature dimensionality reduction ; Correlation ; Autoencoder ; Gaze estimation

Abstract
As robots are progressing towards being ubiquitous and an indispensable part of our everyday environments, such as homes, offices, healthcare, education, and manufacturing shop floors, efficient and safe collaboration and cohabitation become imperative. Given that, such environments could benefit greatly from accurate human action prediction. In addition to being accurate, human action prediction should be computationally efficient, in order to ensure a timely reaction, and capable of dealing with changing environments, since unstructured interaction and collaboration with humans usually do not assume static conditions. In this paper, we propose a model for human action prediction based on motion cues and gaze using shared-weight Long Short-Term Memory networks (LSTMs) and feature dimensionality reduction. LSTMs have proven to be a powerful tool in processing time series data, especially when dealing with long-term dependencies; however, to maximize their performance, LSTM networks should be fed with informative and quality inputs. Given that, in this paper, we furthermore conducted an extensive input feature analysis based on (i) signal correlation and their strength to act as stand-alone predictors, and (ii) a multilayer perceptron inspired by the autoencoder architecture. We validated the proposed model on the publicly available MoGaze dataset for human action prediction, as well as on a smaller dataset recorded in our laboratory. Our model outperformed alternatives, such as recurrent neural networks, a fully connected LSTM network, and the strongest stand-alone signals (baselines), and can run in real-time on a standard laptop CPU. Since eye gaze might not always be available in a real-world scenario, we have implemented and tested a multilayer perceptron for gaze estimation from more easily obtainable motion cues, such as head orientation and hand position. The estimated gaze signal can be utilized during inference of our LSTM-based model, thus making our action prediction pipeline suitable for real-time practical applications.
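
As an illustration of the correlation-based part of the input feature analysis mentioned in the abstract, the following minimal sketch ranks candidate input signals by the absolute value of their Pearson correlation with a target signal. It is not the authors' implementation: the function names `pearson` and `rank_features`, the feature names, and the toy values are all hypothetical, and the paper may use a different correlation measure or selection rule.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant signal carries no linear information about the target.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, target):
    """Return (name, correlation) pairs sorted by |correlation|, strongest first."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy signals, invented for this example (not taken from MoGaze).
target = [0.0, 1.0, 2.0, 3.0, 4.0]
features = {
    "gaze_to_object_distance": [4.0, 3.0, 2.0, 1.0, 0.0],  # decreases as target grows
    "hand_speed": [0.0, 2.0, 1.0, 3.0, 5.0],               # roughly follows the target
    "head_yaw": [1.0, 1.0, 1.0, 1.0, 1.0],                 # constant, uninformative
}

ranking = rank_features(features, target)
for name, score in ranking:
    print(f"{name}: {score:+.3f}")
```

Ranking by absolute value treats a strongly anti-correlated signal as equally informative as a strongly correlated one; a constant signal is scored as zero and would be a natural candidate for removal before training.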

Original language
English

Scientific fields
Electrical Engineering, Computing, Basic Technical Sciences

RELATED RECORDS

Projects:
KK.01.1.1.01.009 - Advanced Methods and Technologies in Data Science and Cooperative Systems (DATACROSS) (Šmuc, Tomislav; Lončarić, Sven; Petrović, Ivan; Jokić, Andrej; Palunko, Ivana)

Institutions:
Faculty of Electrical Engineering and Computing, Zagreb

Profiles:

Ivan Petrović (author)