Bibliographic record number: 1154222
Preparation of Simplified Molecular Input Line Entry System Notation Datasets for use in Convolutional Neural Networks
Preparation of Simplified Molecular Input Line Entry System Notation Datasets for use in Convolutional Neural Networks // The 21st IEEE International Conference on BioInformatics and BioEngineering / Nenad Filipović (ed.).
Kragujevac: Institute of Electrical and Electronics Engineers (IEEE), 2021. Paper ID #1 44, 6 (lecture, international peer review, full paper (in extenso), other)
CROSBI ID: 1154222
Title
Preparation of Simplified Molecular Input Line Entry System Notation Datasets for use in Convolutional Neural Networks
Authors
Baressi Šegota, Sandi ; Anđelić, Nikola ; Lorencin, Ivan ; Musulin, Jelena ; Štifanić, Daniel ; Car, Zlatan
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), other
Source
The 21st IEEE International Conference on BioInformatics and BioEngineering
/ Nenad Filipović - Kragujevac : Institute of Electrical and Electronics Engineers (IEEE), 2021
ISBN
978-86-81037-69-0
Conference
21st IEEE International Conference on BioInformatics and BioEngineering (BIBE 2021)
Place and date
Kragujevac, Serbia, 25.10.2021 - 27.10.2021
Type of participation
Lecture
Type of peer review
International peer review
Keywords
artificial intelligence, convolutional neural networks, data processing and transformation, machine learning, SMILES
Abstract
Simplified Molecular Input Line Entry System (SMILES) is a type of chemical notation. The SMILES format allows chemical structures to be represented in a form easily readable by computer programs. This allows many techniques, such as Artificial Neural Networks (ANNs), to be applied to SMILES-formatted data. One of the highest-performing ANN types is the Convolutional Neural Network (CNN), designed to work on images or matrix-shaped data. In this paper, the authors present the preparation of a SMILES dataset for use by CNNs. The paper starts with a brief description of the SMILES format, followed by an explanation of the dataset transformation into an NPY matrix-based format, with an example of utilization via the application of popular CNN architectures to the transformed dataset. The proposed architecture achieves satisfactory results (AUC=0.92), with the transformation algorithm speed also proving satisfactory (0.08 seconds per data point).
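The abstract does not specify how the SMILES strings are mapped to matrices, so the following is only a minimal sketch of one common approach, character-level one-hot encoding, and not the authors' algorithm. The vocabulary `VOCAB`, the fixed length `max_len`, and the function names `smiles_to_matrix` and `save_npy` are all hypothetical and chosen for illustration.

```python
from typing import List, Sequence

# Hypothetical character vocabulary (an assumption, not taken from the paper).
# A real dataset would derive its vocabulary from the SMILES strings it contains.
VOCAB = ["C", "N", "O", "c", "n", "o", "=", "#", "(", ")", "1", "2"]


def smiles_to_matrix(
    smiles: str, vocab: Sequence[str] = VOCAB, max_len: int = 16
) -> List[List[int]]:
    """One-hot encode a SMILES string into a max_len x len(vocab) matrix.

    Each row corresponds to one character position; rows past the end of
    the string stay all-zero (padding). Strings longer than max_len are
    truncated. Splitting by single characters is a simplification: it does
    not handle two-letter atoms such as "Cl" or "Br".
    """
    index = {ch: i for i, ch in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in range(max_len)]
    for row, ch in enumerate(smiles[:max_len]):
        if ch not in index:
            raise ValueError(f"character {ch!r} is not in the vocabulary")
        matrix[row][index[ch]] = 1
    return matrix


def save_npy(matrices: Sequence[List[List[int]]], path: str) -> None:
    """Stack the encoded matrices and write them to a single NPY file."""
    import numpy as np  # imported here so the encoder itself needs no NumPy

    np.save(path, np.asarray(matrices, dtype=np.uint8))
```

Under these assumptions, `save_npy([smiles_to_matrix(s) for s in ["CCO", "c1ccccc1"]], "dataset.npy")` would write an array of shape (2, 16, 12), which can be loaded with `numpy.load` and passed to a CNN as a batch of single-channel matrices.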
Original language
English
Scientific fields
Computer science
AFFILIATIONS OF THE WORK
Projects:
Ostalo-CEI - 305.6019-20 - Use of regressive artificial intelligence (AI) and machine learning (ML) methods in modelling of COVID-19 spread (COVIDAi) (Car, Zlatan, Other - CEI Extraordinary Call for Proposals 2020) (CroRIS)
InoUstZnVO-CIII-HR-0108-10 - Concurrent Product and Technology Development - Teaching, Research and Implementation of Joint Programs Oriented in Production and Industrial Engineering (Car, Zlatan, InoUstZnVO - CEEPUS) (CroRIS)
KK.01.1.1.01.009 - Advanced Methods and Technologies in Data Science and Cooperative Systems (DATACROSS) (Šmuc, Tomislav; Lončarić, Sven; Petrović, Ivan; Jokić, Andrej; Palunko, Ivana) (CroRIS)
Institutions:
Faculty of Engineering, Rijeka
Profiles:
Zlatan Car (author)
Jelena Musulin (author)
Nikola Anđelić (author)
Sandi Baressi Šegota (author)
Ivan Lorencin (author)
Daniel Štifanić (author)