https://doi.org/10.1140/epjds/s13688-026-00655-0
Research
A row-type specific hybrid framework for credit risk analysis: loan portfolio based feature selection and unsupervised Bayesian network dependency exploration
Department of Analytics and Decision Science, Indian Institute of Management (IIM), Mumbai, India
Received: 17 July 2025
Accepted: 2 April 2026
Published online: 14 April 2026
Abstract
Credit risk assessment is a critical function in financial analytics, requiring models that can adapt to diverse borrower profiles while providing clear and interpretable insights. Although a range of data-driven techniques has been applied in this domain, many struggle to handle the inherent heterogeneity of financial data across different loan categories such as personal and agricultural loans. This paper introduces Credit Risk Analysis with Bayesian Networks (CRAB-Net), a row-type specific hybrid framework for credit risk modeling. The approach first segments the data by loan type and balances the distribution of risk categories to ensure fair representation. It then identifies the most outcome-relevant attributes through targeted feature selection, focusing on the variables most strongly associated with credit risk differentiation. On this refined set of features, unsupervised Bayesian network learning is applied to uncover conditional dependencies among financial variables without relying on default outcome labels. This design combines supervised relevance filtering with unsupervised dependency discovery, reducing noise and avoiding the misleading patterns that arise when all features are analyzed indiscriminately. The framework revealed that in personal loans, installment-related variables such as installment frequency, overdue status, and repayment structure emerged as central nodes, indicating their dominant role in defining repayment behavior and delinquency risk. In contrast, for agricultural loans, the network structure was shaped primarily by provisioning norms, landholding details, and exposure-related attributes such as sanctioned amount and collateral type, suggesting that borrower risk in this segment is more closely linked to regulatory classification and collateral strength.
Experiments on real-world banking data show that CRAB-Net provides interpretable dependency graphs, supports fair segment-level analysis, and enhances transparency for audit and supervisory compliance under Basel norms by offering clear, data-driven evidence of the risk factors shaping borrower outcomes.
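The four stages described in the abstract (segment by loan type, balance risk categories, select outcome-relevant features, learn dependencies without labels) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the balancing method, the feature-selection criterion, or the structure-learning algorithm, so this sketch assumes random undersampling, mutual information with the risk label, and a Chow-Liu tree (a maximum spanning tree over pairwise mutual information) as a simple stand-in for unsupervised Bayesian network structure learning. All function names and the column names `loan_type` and `risk` are hypothetical, and features are assumed to be discrete.

```python
import math
import random
from collections import Counter, defaultdict
from itertools import combinations


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def segment_by(rows, key):
    """Split records into one list per distinct value of `key`."""
    segments = defaultdict(list)
    for row in rows:
        segments[row[key]].append(row)
    return dict(segments)


def balance(rows, label, seed=0):
    """Assumed balancing step: undersample each class to the smallest class size."""
    rng = random.Random(seed)
    by_class = segment_by(rows, label)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, n_min))
    return balanced


def select_features(rows, features, label, k):
    """Supervised step (assumed criterion): top-k features by MI with the label."""
    labels = [r[label] for r in rows]
    scores = {f: mutual_information([r[f] for r in rows], labels) for f in features}
    return sorted(features, key=lambda f: scores[f], reverse=True)[:k]


def chow_liu_edges(rows, features):
    """Unsupervised step (stand-in): maximum spanning tree over pairwise MI.

    The risk label is never read here, only the selected features.
    """
    cols = {f: [r[f] for r in rows] for f in features}
    weighted = sorted(
        ((mutual_information(cols[a], cols[b]), a, b)
         for a, b in combinations(features, 2)),
        reverse=True,
    )
    parent = {f: f for f in features}

    def find(f):
        # Union-find root lookup with path halving.
        while parent[f] != f:
            parent[f] = parent[parent[f]]
            f = parent[f]
        return f

    edges = []
    for _, a, b in weighted:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # adding this edge does not create a cycle
            parent[root_a] = root_b
            edges.append((a, b))
    return edges


def segment_pipeline(rows, features, loan_type="loan_type", label="risk", k=3):
    """Run balance -> select -> structure learning separately per loan type."""
    result = {}
    for segment, seg_rows in segment_by(rows, loan_type).items():
        balanced = balance(seg_rows, label)
        selected = select_features(balanced, features, label, k)
        result[segment] = chow_liu_edges(balanced, selected)
    return result
```

Calling `segment_pipeline(rows, features)` on a list of dictionaries returns one undirected edge list per loan type, which mirrors the per-segment dependency graphs the abstract describes. A Chow-Liu tree allows each variable at most one parent, so it captures far less structure than a general Bayesian network; it is used here only to keep the sketch dependency-free.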
Key words: Credit risk analysis / Personal loan prediction / Agriculture loan prediction / Bayesian network
Handling Editor: David Garcia
© The Author(s) 2026
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

