Bibliographic record number: 1252222

Empirical Study: How Issue Classification Influences Software Defect Prediction


Afrić, Petar; Vukadin, Davor; Šilić, Marin; Delač, Goran
Empirical Study: How Issue Classification Influences Software Defect Prediction // IEEE access, 11 (2023), 11732-11748 doi:10.1109/ACCESS.2023.3242045 (international peer review, article, scientific)


CROSBI ID: 1252222

Title
Empirical Study: How Issue Classification Influences Software Defect Prediction

Authors
Afrić, Petar ; Vukadin, Davor ; Šilić, Marin ; Delač, Goran

Source
IEEE access (2169-3536) 11 (2023); 11732-11748

Type, subtype and category of work
Journal papers, article, scientific

Keywords
Issue tracking ; Version Control Systems ; Natural language processing ; Issue classification ; Software defect prediction ; RoBERTa

Abstract
Software defect prediction (SDP) aims to identify potentially defective software modules so that limited quality assurance resources can be better allocated. Practitioners often do this with supervised models trained on historical data, which is gathered by mining version control and issue tracking systems. Version control commits are linked to the issues they address; if the linked issue is classified as a bug report, the change is considered bug-fixing. The problem is that issues are often incorrectly classified within issue tracking systems, which introduces noise into the gathered datasets. In this paper, we investigate the influence issue classification has on software defect prediction dataset quality and on the resulting model performance. To do this, we mine data from 7 popular open-source repositories and create issue classification and software defect prediction datasets for each of them. We investigate issue classification using four different methods: a simple keyword heuristic, an improved keyword heuristic, the FastText model and the RoBERTa model. Our results show that using the RoBERTa model for issue classification produces the best software defect prediction datasets, containing on average 14.3641% mislabeled instances. SDP models trained on such datasets outperform those trained on SDP datasets created using the other issue classification methods in 65 out of 84 experiments, with 55 of them being statistically relevant. Furthermore, in 17 out of 28 experiments we could not show a statistically relevant performance difference between SDP models trained on RoBERTa-derived software defect prediction datasets and those created using manually labeled issues.
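The labeling rule described in the abstract (a commit counts as bug-fixing when its linked issue is classified as a bug report) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the keyword list, the JIRA-style issue-key pattern, the function names and the toy data are all assumptions made for illustration, since this record does not specify them.

```python
import re

# Hypothetical keyword list; the record does not say which keywords the
# paper's heuristics actually use.
BUG_KEYWORDS = ("bug", "fix", "error", "crash", "fail")
# Hypothetical issue-key pattern in the style of JIRA keys such as "DEMO-12".
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def is_bug_report(issue_text: str) -> bool:
    """Simple keyword heuristic: a bug report is any issue containing a keyword."""
    text = issue_text.lower()
    return any(keyword in text for keyword in BUG_KEYWORDS)


def label_commits(commits: dict[str, str], issues: dict[str, str]) -> dict[str, bool]:
    """Mark a commit as bug-fixing if any issue it references is classified as a bug."""
    labels = {}
    for commit_id, message in commits.items():
        linked = [key for key in ISSUE_KEY.findall(message) if key in issues]
        labels[commit_id] = any(is_bug_report(issues[key]) for key in linked)
    return labels


def mislabel_rate(predicted: dict[str, bool], reference: dict[str, bool]) -> float:
    """Share of commits whose label disagrees with a manually labeled reference."""
    wrong = sum(predicted[c] != reference[c] for c in reference)
    return wrong / len(reference)


# Toy data (invented). DEMO-3 is an improvement request that happens to
# contain the word "error", so the heuristic misclassifies it as a bug.
issues = {
    "DEMO-1": "Crash when parsing empty file",
    "DEMO-2": "Add dark mode to settings page",
    "DEMO-3": "Improve error messages in the CLI",
}
commits = {
    "c1": "DEMO-1 guard against empty input",
    "c2": "DEMO-2 implement dark mode",
    "c3": "DEMO-3 reword CLI messages",
    "c4": "update README",
}
reference = {"c1": True, "c2": False, "c3": False, "c4": False}

labels = label_commits(commits, issues)
print(labels)
print(mislabel_rate(labels, reference))
```

In this toy run the keyword heuristic mislabels one of the four commits (c3), which is the kind of dataset noise the abstract attributes to incorrect issue classification. Swapping `is_bug_report` for a learned classifier would leave the rest of the sketch unchanged.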

Original language
English

Scientific fields
Computing



RELATED TO THIS WORK


Projects:
HRZZ-IP-2018-01-6423 - Reliable Composite Applications Based on Web Services (RELS) (Srbljić, Siniša, HRZZ) (CroRIS)

Institutions:
Faculty of Electrical Engineering and Computing, Zagreb

Profiles:

Petar Afrić (author)

Davor Vukadin (author)

Goran Delač (author)

Marin Šilić (author)

Links to the full text of the work:

doi ieeexplore.ieee.org

Cite this publication:

Afrić, Petar; Vukadin, Davor; Šilić, Marin; Delač, Goran
Empirical Study: How Issue Classification Influences Software Defect Prediction // IEEE access, 11 (2023), 11732-11748 doi:10.1109/ACCESS.2023.3242045 (international peer review, article, scientific)
Afrić, P., Vukadin, D., Šilić, M. & Delač, G. (2023) Empirical Study: How Issue Classification Influences Software Defect Prediction. IEEE access, 11, 11732-11748 doi:10.1109/ACCESS.2023.3242045.
@article{article,
  author = {Afri\'{c}, Petar and Vukadin, Davor and \v{S}ili\'{c}, Marin and Dela\v{c}, Goran},
  title = {Empirical Study: How Issue Classification Influences Software Defect Prediction},
  journal = {IEEE access},
  volume = {11},
  year = {2023},
  pages = {11732-11748},
  issn = {2169-3536},
  doi = {10.1109/ACCESS.2023.3242045},
  keywords = {Issue tracking, Version Control Systems, Natural language processing, Issue classification, Software defect prediction, RoBERTa}
}

The journal is indexed in:


  • Current Contents Connect (CCC)
  • Web of Science Core Collection (WoSCC)
    • Science Citation Index Expanded (SCI-EXP)
    • SCI-EXP, SSCI and/or A&HCI
  • Scopus

