Bibliographic record no. 330563
A Statistical Framework for the Prediction of Fault-Proneness
A Statistical Framework for the Prediction of Fault-Proneness // Advances in Machine Learning Application in Software Engineering / Zhang, Du ; Tsai, Jeffrey (eds.).
Hershey (PA): Idea Group, 2007. pp. 237-264
CROSBI ID: 330563
Title
A Statistical Framework for the Prediction of Fault-Proneness
Authors
Ma, Yan ; Guo, Lan ; Cukic, Bojan
Type, subtype and category of work
Book chapter, scientific
Book
Advances in Machine Learning Application in Software Engineering
Editor(s)
Zhang, Du ; Tsai, Jeffrey
Publisher
Idea Group
City
Hershey (PA)
Year
2007
Page range
237-264
ISBN
1591409411
Keywords
(none)
Abstract
Accurate prediction of fault-prone modules during software development enables effective discovery and identification of defects. Such prediction models are especially valuable for large-scale systems, where verification experts need to focus their attention and resources on problem areas in the system under development. This paper presents a methodology for predicting fault-prone modules using a modified random forests algorithm. Random forests improve classification accuracy by growing an ensemble of classification trees and letting them vote on the classification decision. We applied the methodology to five NASA public-domain defect data sets. These data sets vary in size, but all typically contain a small number of defect samples in the learning set. For instance, in project PC1, only around 7% of the instances are defects. If maximizing overall accuracy is the goal, learning from such data usually results in a biased classifier, i.e., the majority of samples are classified into the non-defect class. To obtain better prediction of fault-proneness, two strategies are investigated: a proper sampling technique in constructing the tree classifiers, and threshold adjustment in determining the winning class. Both are found to be effective in accurately predicting fault-prone modules. In addition, the paper presents a thorough and statistically sound comparison of these methods against ten other classifiers frequently used in the literature.
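The two strategies named in the abstract can be illustrated with a minimal sketch. The record does not give the details of the authors' modified algorithm, so everything below is an assumption made for illustration: the helper names `balanced_bootstrap` and `vote_with_threshold` are hypothetical, the sampling scheme (equal draws per class, with replacement) is one common way to balance a skewed learning set, and the 7-in-100 defect rate simply mirrors the PC1 figure quoted above.

```python
import random


def balanced_bootstrap(labels, rng):
    """Draw an equal number of defect (1) and non-defect (0) indices,
    with replacement, so each tree would see a balanced learning set.
    Hypothetical helper; not taken from the chapter."""
    defect = [i for i, y in enumerate(labels) if y == 1]
    clean = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(defect), len(clean))
    return ([rng.choice(defect) for _ in range(n)]
            + [rng.choice(clean) for _ in range(n)])


def vote_with_threshold(tree_votes, threshold=0.5):
    """Classify as defect (1) when the fraction of trees voting 'defect'
    reaches the threshold. Lowering the threshold below 0.5 favours the
    minority (defect) class. Hypothetical helper."""
    fraction = sum(tree_votes) / len(tree_votes)
    return 1 if fraction >= threshold else 0


# Assumed toy data: 7 defective modules out of 100, as in the PC1 example.
labels = [1] * 7 + [0] * 93
sample = balanced_bootstrap(labels, random.Random(0))

# Assumed votes from ten trees: three of ten vote 'defect'.
votes = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
majority_decision = vote_with_threshold(votes)        # plain majority vote
adjusted_decision = vote_with_threshold(votes, 0.25)  # lowered cut-off
```

With plain majority voting the module is labelled non-defective, while the lowered cut-off flags it as fault-prone. The trade-off is more false alarms in exchange for fewer missed defects.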
Original language
English
Scientific fields
Computing
RELATED TO THIS WORK
Projects:
165-0362980-2002 - Scheduling Procedures in Self-Sustaining Distributed Computer Systems (Martinović, Goran, MZO) (CroRIS)
Institutions:
Fakultet elektrotehnike, računarstva i informacijskih tehnologija Osijek
Profiles:
Bojan Čukić
(author)