Redes bayesianas aplicadas a problemas de credit scoring: una aplicación práctica

Authors:

  1. Beltrán Pascual, Mauricio
  2. Muñoz Martínez, Azahara
  3. Muñoz Alamillos, Ángel
Journal: Cuadernos de economía: Spanish Journal of Economics and Finance

ISSN: 2340-6704, 0210-0266

Year of publication: 2014

Volume: 37

Issue: 104

Pages: 73-86

Type: Article

DOI: 10.1016/J.CESJEF.2013.07.001

Abstract

This article addresses how to build an efficient classifier using Bayesian networks, as applied in data mining, with the aim of achieving greater accuracy than other models used in credit scoring problems. The Bayesian approach, based on probability models, applies decision theory to risk analysis, choosing in each situation the action that maximizes expected utility. Using a sample of real banking data, the authors conclude that these models have superior predictive power compared with the results obtained by other parametric and non-parametric statistical methods.
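The decision-theoretic step described in the abstract can be sketched as follows: given a posterior probability of default produced by a Bayesian network, the lender takes whichever action (grant or deny credit) maximizes expected utility. This is a minimal illustration only; the utility figures and the `decide` helper are hypothetical assumptions, not taken from the paper.

```python
# Illustrative Bayes decision rule for credit scoring.
# Utilities per (action, outcome) pair are assumed values for the sketch:
UTILITY = {
    ("grant", "repays"): 1.0,     # interest earned
    ("grant", "defaults"): -5.0,  # principal lost
    ("deny", "repays"): 0.0,      # missed business
    ("deny", "defaults"): 0.0,    # loss avoided
}

def expected_utility(action, p_default):
    """Expected utility of an action given the posterior default probability."""
    return (UTILITY[(action, "defaults")] * p_default
            + UTILITY[(action, "repays")] * (1.0 - p_default))

def decide(p_default):
    """Return the action that maximizes expected utility."""
    return max(("grant", "deny"), key=lambda a: expected_utility(a, p_default))

print(decide(0.05))  # low default risk -> grant
print(decide(0.40))  # high default risk -> deny
```

With these assumed utilities, the break-even posterior is at p = 1/6: below it granting credit has positive expected utility, above it denial dominates. In the paper, the posterior itself would come from the learned Bayesian network rather than being supplied by hand.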
