Evaluation metrics and dimensional reduction for binary classification algorithms: a case study on bankruptcy prediction

Pérez Pons, María Eugenia; Parra Domínguez, Javier; Hernández González, Guillermo; Herrera Viedma, Enrique; Corchado Rodríguez, Juan Manuel

doi:10.1017/S026988892100014X

Title

Evaluation metrics and dimensional reduction for binary classification algorithms: a case study on bankruptcy prediction

dc.contributor.author	Pérez Pons, María Eugenia
dc.contributor.author	Parra Domínguez, Javier
dc.contributor.author	Hernández González, Guillermo
dc.contributor.author	Herrera Viedma, Enrique
dc.contributor.author	Corchado Rodríguez, Juan Manuel
dc.date.accessioned	2026-01-21T11:15:08Z
dc.date.available	2026-01-21T11:15:08Z
dc.date.issued	2022-01-14
dc.identifier.citation	Pérez-Pons ME, Parra-Dominguez J, Hernández G, Herrera-Viedma E, Corchado JM. Evaluation metrics and dimensional reduction for binary classification algorithms: a case study on bankruptcy prediction. The Knowledge Engineering Review. 2022;37:e1. doi:10.1017/S026988892100014X	es_ES
dc.identifier.issn	0269-8889
dc.identifier.uri	http://hdl.handle.net/10366/169119
dc.description.abstract	[EN]This paper presents a methodology for automating binary classification using the minimum possible number of attributes. In this methodology, the success of the binary prediction does not lie in the accuracy of an algorithm but in the evaluation metrics, which give information about the goodness of fit, an important factor when the dataset is imbalanced. The proposed methodology assesses the possible biases in identifying one algorithm as the best performer when the goodness of fit of an algorithm is judged through evaluation metrics. The dimensionality of the data has been reduced through the cumulative explained variance. Then, the performance of six machine learning classification models has been compared through the Matthews correlation coefficient (MCC), the area under the receiver operating characteristic curve (ROC-AUC), and the area under the precision-recall curve (AUC-PR). The results show graphically and numerically how the choice of evaluation metric affects which outcome of an algorithm is judged optimal. The algorithms with the best performance in terms of evaluation metrics have been random forest and gradient boosting. In the imbalanced datasets, MCC has provided better prediction results than ROC-AUC or AUC-PR. The proposed methodology is adapted to the case of bankruptcy prediction.	es_ES
dc.language.iso	eng	es_ES
dc.publisher	Cambridge University Press	es_ES
dc.rights	Attribution 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/	*
dc.subject	Artificial Intelligence	es_ES
dc.subject	Bankruptcy	es_ES
dc.subject	Accountancy	es_ES
dc.title	Evaluation metrics and dimensional reduction for binary classification algorithms: a case study on bankruptcy prediction	es_ES
dc.type	info:eu-repo/semantics/article	es_ES
dc.relation.publishversion	https://doi.org/10.1017/S026988892100014X	es_ES
dc.identifier.doi	10.1017/S026988892100014X
dc.relation.projectID	RTC-2017-6536-7	es_ES
dc.rights.accessRights	info:eu-repo/semantics/closedAccess	es_ES
dc.identifier.essn	1469-8005
dc.journal.title	The Knowledge Engineering Review	es_ES
dc.volume.number	37	es_ES
dc.type.hasVersion	info:eu-repo/semantics/publishedVersion	es_ES
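
The abstract names two ideas that a small example can make concrete: choosing the number of components by cumulative explained variance, and judging a classifier on imbalanced data by MCC rather than accuracy. The following is a minimal sketch in plain Python, not the authors' implementation, which this record does not include. The function names, the 0.8 variance threshold, the eigenvalues, and the 95/5 class split are all assumptions chosen for illustration.

```python
import math


def components_for_variance(eigenvalues, threshold=0.8):
    """Smallest number of leading components whose cumulative explained
    variance reaches `threshold` (hypothetical helper; threshold is assumed)."""
    ordered = sorted(eigenvalues, reverse=True)
    target = threshold * sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative >= target:
            return k
    return len(ordered)


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for labels in {0, 1}, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 when the denominator
    is zero, a common convention."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Illustrative imbalanced sample: 95 solvent firms (0) and 5 bankrupt firms (1).
y_true = [0] * 95 + [1] * 5

# Classifier A (made up): always predicts "solvent".
pred_a = [0] * 100

# Classifier B (made up): flags 10 solvent firms wrongly, catches 4 of 5 bankruptcies.
pred_b = [1] * 10 + [0] * 85 + [1] * 4 + [0] * 1

if __name__ == "__main__":
    print("components kept:", components_for_variance([4.0, 3.0, 2.0, 1.0]))
    print("A: accuracy", accuracy(y_true, pred_a), "MCC", mcc(y_true, pred_a))
    print("B: accuracy", accuracy(y_true, pred_b), "MCC", mcc(y_true, pred_b))
```

In this made-up sample, classifier A scores higher on accuracy even though it never identifies a bankruptcy, while MCC ranks classifier B above it. This is the kind of metric-dependent ranking the abstract describes.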