An overview and comparison of free Python libraries for data mining and big data analysis

Stančin, Igor; Jović, Alan

Bibliographic record number: 1002799

An overview and comparison of free Python libraries for data mining and big data analysis // MIPRO 2019 Proceedings / Skala, Karolj (ed.).
Rijeka: Hrvatska udruga za informacijsku i komunikacijsku tehnologiju, elektroniku i mikroelektroniku - MIPRO, 2019, pp. 1161-1166, doi:10.23919/MIPRO.2019.8757088 (lecture, international peer review, full paper (in extenso), scientific)

CROSBI ID: 1002799

Title
An overview and comparison of free Python libraries for data mining and big data analysis

Authors
Stančin, Igor; Jović, Alan

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
MIPRO 2019 Proceedings / Skala, Karolj (ed.). Rijeka: Hrvatska udruga za informacijsku i komunikacijsku tehnologiju, elektroniku i mikroelektroniku - MIPRO, 2019, pp. 1161-1166

Conference
42nd International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO 2019)

Place and date
Opatija, Croatia, 20-24 May 2019

Type of participation
Lecture

Type of review
International peer review

Keywords
data science; python; data mining; machine learning library; big data analysis; framework

Abstract
The popularity of Python is growing, especially in the field of data science. Consequently, there is an increasing number of free libraries available for use. The aim of this review paper is to describe and compare the characteristics of different data mining and big data analysis libraries in Python. There is currently no paper dealing with the subject and describing the pros and cons of all these libraries. Here we consider more than 20 libraries and separate them into six groups: core libraries, data preparation, data visualization, machine learning, deep learning and big data. Besides the functionalities of a given library, important factors for comparison are the number of contributors developing and maintaining the library and the size of the community. A bigger community means a better chance of easily finding a solution to a given problem. We currently recommend: pandas for data preparation; Matplotlib, seaborn or Plotly for data visualization; scikit-learn for machine learning; TensorFlow, Keras and PyTorch for deep learning; and Hadoop Streaming and PySpark for big data.

Original language
English

Scientific fields
Computing

AFFILIATIONS OF THE WORK

Institutions:
Faculty of Electrical Engineering and Computing, Zagreb

Profiles:

Igor Stančin (author)

Alan Jović (author)

Links to the full text of the work:

Full-text access: doi, ieeexplore.ieee.org
