A Statistical Framework for the Prediction of Fault-Proneness

Ma, Yan; Guo, Lan; Cukic, Bojan

Bibliographic record number: 330563

A Statistical Framework for the Prediction of Fault-Proneness // Advances in Machine Learning Application in Software Engineering / Zhang, Du ; Tsai, Jeffrey (eds.).
Hershey (PA): Idea Group, 2007. pp. 237-264

CROSBI ID: 330563

Title
A Statistical Framework for the Prediction of Fault-Proneness

Authors
Ma, Yan ; Guo, Lan ; Cukic, Bojan

Type, subtype and category of work
Book chapters, scientific

Book
Advances in Machine Learning Application in Software Engineering

Editor(s)
Zhang, Du ; Tsai, Jeffrey

Publisher
Idea Group

City
Hershey (PA)

Year
2007

Page range
237-264

ISBN
1591409411

Keywords
(none)

Abstract
Accurate prediction of fault-prone modules in the software development process enables effective discovery and identification of defects. Such prediction models are especially valuable for large-scale systems, where verification experts need to focus their attention and resources on problem areas in the system under development. This paper presents a methodology for predicting fault-prone modules using a modified random forests algorithm. Random forests improve classification accuracy by growing an ensemble of classification trees and letting them vote on the classification decision. We applied the methodology to five NASA public domain defect data sets. These data sets vary in size, but all typically contain a small number of defect samples in the learning set. For instance, in project PC1, only around 7% of the instances are defects. If maximizing overall accuracy is the goal, then learning from such data usually results in a biased classifier, i.e., the majority of samples are classified into the non-defect class. To obtain better prediction of fault-proneness, two strategies are investigated: a proper sampling technique for constructing the tree classifiers, and threshold adjustment for determining the winning class. Both are found to be effective for the accurate prediction of fault-prone modules. In addition, the paper presents a thorough and statistically sound comparison of these methods against ten other classifiers frequently used in the literature.
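
The two strategies named in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the sampling scheme or the threshold value, so the choices here are assumptions: each tree's training sample is a balanced per-class bootstrap, and a module is labelled defective when the share of trees voting "defect" reaches a configurable threshold. The function names `balanced_bootstrap` and `vote_decision`, the 100-module data set, and the ten tree votes are all hypothetical and are not taken from the chapter.

```python
import random
from collections import Counter


def balanced_bootstrap(labels, rng):
    """Draw a bootstrap sample with equally many indices per class.

    Assumed scheme: each class contributes as many draws (with
    replacement) as the minority class has members, so the majority
    class cannot dominate the sample a tree is grown on.
    """
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    per_class = min(len(indices) for indices in by_class.values())
    sample = []
    for indices in by_class.values():
        sample.extend(rng.choices(indices, k=per_class))
    return sample


def vote_decision(tree_votes, threshold=0.5):
    """Return 1 (defect) when the share of trees voting 1 reaches the threshold."""
    defect_share = sum(tree_votes) / len(tree_votes)
    return 1 if defect_share >= threshold else 0


# Hypothetical data: 100 modules, 7 defective (roughly the 7% rate quoted for PC1).
labels = [1] * 7 + [0] * 93
rng = random.Random(0)
sample = balanced_bootstrap(labels, rng)
sample_counts = Counter(labels[i] for i in sample)

# Always predicting "non-defect" is highly accurate yet finds no defects.
majority_accuracy = labels.count(0) / len(labels)

# Hypothetical votes from 10 trees, 3 of which say "defect".
votes = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
default_label = vote_decision(votes)        # plain majority vote
adjusted_label = vote_decision(votes, 0.3)  # lowered, assumed threshold
```

With the plain majority vote the module above is labelled non-defective, while the lowered threshold flags it as defective. That is the trade-off threshold adjustment is meant to control: more defects found, at the cost of more false alarms.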

Original language
English

Scientific fields
Computer science

WORK AFFILIATIONS

Projects:
165-0362980-2002 - Scheduling Procedures in Self-Sustaining Distributed Computer Systems (Martinović, Goran, MZO) (CroRIS)

Institutions:
Faculty of Electrical Engineering, Computer Science and Information Technology Osijek

Profiles:

Bojan Čukić (author)

CROSBI Croatian Scientific Bibliography
