Artificial Intelligence in Banking


Professor, Dr. Periklis Gogas 



Anna Agrapetidou, Ph.D. Candidate


Professor, Dr. Theophilos Papadimitriou


Department of Economics



The problem

The health and stability of the banking sector are crucial in modern economies. Failures of systemically important financial institutions, and generalized distress even in less significant banks, can propagate to the whole sector very quickly. If such distress is not addressed swiftly and directly by the regulators (usually the central banks or associated specialized entities), it may lead to widespread, full-blown economic crises and even international financial crises.

The U.S. banking sector

From 2000 to 2018 the total number of banking institutions in the U.S. decreased from 9,904 to 5,406 (a drop of more than 40%). This significant decline was the result of: a) an increased number of bank failures (more than 500 banks went bankrupt), b) a lack of new financial institutions entering the U.S. banking sector and c) a consolidation process through mergers and acquisitions. The financial crisis of 2007 highlighted the systemic effects of a banking crisis, which propagated at both the national level (to other sectors of the U.S. economy) and the international level (to other national economies around the world). Moreover, it raised serious concerns about the regulatory policies in effect and led to significant supervisory and regulatory reforms on an international scale (the Dodd-Frank Act and Basel III). Banking institutions are supervised, and their performance is monitored and evaluated, by regulatory authorities through i) periodic stress testing, ii) the imposition of minimum capital requirements (Basel III), and iii) the implementation of prompt mandatory corrective actions when their financial position deteriorates significantly.

The research

Many researchers, intrigued by the importance and gravity of a bank failure, have attempted to construct forecasting models of banks’ insolvency. These forecasting models can be categorized in two broad groups: a) traditional statistical and econometric methodologies and b) more recently, artificial intelligence and machine learning techniques.

Our research team at Democritus University of Thrace developed a forecasting model of bank failures based on artificial intelligence and, more specifically, the Support Vector Machines (SVM) algorithm. The proposed methodology classifies each banking institution as solvent or insolvent in the next year. Moreover, as we will see, this model can also be used as a novel alternative stress-testing tool that requires minimal resources; it is therefore low cost and easy to run frequently.

The data

Our sample includes more than half the U.S. banks in operation during the 2007-2013 period, for a total of 2379 banks. This sample also includes all 481 banks that failed. The data for each bank come from its publicly available financial statements and are retrieved from the database of the Federal Deposit Insurance Corporation (FDIC).

We use 36 variables (financial statement items and/or financial ratios) for each of the four years prior to the target forecast year, which gives 36 × 4 = 144 input variables per bank. For example, in order to forecast the financial state of a bank in year 2013 we use data for the years 2009 to 2012.
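The arithmetic behind the input space can be sketched as follows. This is a minimal illustration, assuming the 36 variables are simply repeated for each of the four lagged years (36 × 4 = 144 inputs, which matches the count of 144 variables quoted in the methodology). The names `var_01` to `var_36` and both helper functions are hypothetical placeholders, not names from the FDIC database.

```python
def lagged_years(target_year, n_lags=4):
    """Return the years whose data feed the forecast for target_year."""
    return list(range(target_year - n_lags, target_year))


def feature_names(variables, target_year, n_lags=4):
    """Name one input column per (lagged year, variable) pair."""
    return [
        f"{var}_{year}"
        for year in lagged_years(target_year, n_lags)
        for var in variables
    ]


# Hypothetical placeholder names standing in for the 36 statement variables.
variables = [f"var_{i:02d}" for i in range(1, 37)]
columns = feature_names(variables, target_year=2013)

print(lagged_years(2013))   # [2009, 2010, 2011, 2012]
print(len(columns))         # 144
```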

The methodology

From the total of 2379 banks, we used 1100 banks to train and cross-validate the SVM-based forecasting model. The remaining 1279 banks were set aside and used to test the model's ability to generalize to unknown data, a procedure known as out-of-sample forecasting.
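The split described above can be sketched in a few lines. The text does not say how banks were assigned to the two groups, so the random shuffle with a fixed seed used here is an assumption, and `split_sample` and the integer bank identifiers are hypothetical.

```python
import random


def split_sample(bank_ids, n_train, seed=0):
    """Shuffle bank identifiers and split them into training and test sets."""
    shuffled = list(bank_ids)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    return shuffled[:n_train], shuffled[n_train:]


bank_ids = range(2379)  # placeholder identifiers, one per bank in the sample
train_ids, test_ids = split_sample(bank_ids, n_train=1100)

print(len(train_ids), len(test_ids))   # 1100 1279
```

The test banks play no part in training or cross-validation, which is what makes the reported accuracy an out-of-sample figure.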

The proposed model identifies only two of the 144 variables used as the most important for the classification between solvent and insolvent banks. These two variables are:

  a) Tier 1 (core) risk-based capital over total assets (T1CRC) and
  b) Total interest expense over total interest income (TIE).
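Both ratios are simple quotients of financial statement items. The sketch below computes them for one bank, assuming the four inputs are available as plain numbers; the function names and the figures are invented for illustration and do not come from the FDIC data.

```python
def t1crc(tier1_capital, total_assets):
    """Tier 1 (core) risk-based capital over total assets."""
    return tier1_capital / total_assets


def tie(total_interest_expense, total_interest_income):
    """Total interest expense over total interest income."""
    return total_interest_expense / total_interest_income


# Invented figures for a single hypothetical bank (same currency unit throughout).
bank = {
    "tier1_capital": 80.0,
    "total_assets": 1000.0,
    "interest_expense": 12.0,
    "interest_income": 40.0,
}

print(t1crc(bank["tier1_capital"], bank["total_assets"]))        # 0.08
print(tie(bank["interest_expense"], bank["interest_income"]))    # 0.3
```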

The T1CRC is a measure of capital adequacy and an increased ratio is associated with a lower probability of default. An increased ratio means that banks hold more capital, thus they are better equipped to absorb future adverse shocks.

The TIE is a measure of operational efficiency; it represents the interest paid on any type of borrowing over the interest earned on any type of lending. A lower TIE ratio is associated with a lower risk of bank default and indicates less leveraged banks (debt-to-own-funds).

In general, the T1CRC can be treated as a proxy for capital adequacy, and as a result higher values correspond to decreased default risk. The TIE is a proxy for a bank’s efficiency. Capital adequacy is a dominant part of the current regulatory framework (Basel III) and many studies have demonstrated that it is a key measure in assessing banks’ solvency.

The results

Using only these two variables, the overall out-of-sample forecasting accuracy of the model is 99.22%. The forecasting accuracy for the solvent banks is 99.40% and for the insolvent ones is 97.37%. We compare our results to those obtained from the well-established Ohlson’s O-score (Ohlson, 1980), which was considered the state-of-the-art methodology at the time. The overall accuracy of the O-score is 79.05%. For the solvent banks its forecasting accuracy is 77.06% and for the insolvent ones 99.13%. The SVM model’s superior performance in avoiding Type 1 (false positive) errors, i.e., solvent banks misclassified as insolvent, represents an important advance over Ohlson’s O-score.
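The three accuracy figures reported for each model are related through the counts of correct and incorrect forecasts in each class. The sketch below shows that relationship; the function name is hypothetical and the counts are invented for illustration, they are not the study's results.

```python
def class_accuracies(true_solvent, false_insolvent, true_insolvent, false_solvent):
    """Return (overall, solvent, insolvent) accuracy from forecast counts.

    true_solvent:    solvent banks forecast as solvent
    false_insolvent: solvent banks forecast as insolvent (Type 1 error)
    true_insolvent:  insolvent banks forecast as insolvent
    false_solvent:   insolvent banks forecast as solvent (Type 2 error)
    """
    n_solvent = true_solvent + false_insolvent
    n_insolvent = true_insolvent + false_solvent
    overall = (true_solvent + true_insolvent) / (n_solvent + n_insolvent)
    return overall, true_solvent / n_solvent, true_insolvent / n_insolvent


# Invented counts, chosen only to make the arithmetic easy to follow.
overall, solvent_acc, insolvent_acc = class_accuracies(95, 5, 40, 10)
print(f"{overall:.2%} {solvent_acc:.2%} {insolvent_acc:.2%}")   # 90.00% 95.00% 80.00%
```

Because solvent banks usually far outnumber insolvent ones, the overall accuracy is dominated by the solvent-class accuracy, which is why the per-class figures are worth reporting separately.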

Figure 1: Graphical representation of the out-of-sample forecasting: the dashed line represents the decision boundary obtained in the training process of the SVM-linear model. The red and blue dots represent the banks forecasted as insolvent and solvent respectively. The black dots are banks that were misclassified.

Stress-testing tool

This SVM forecasting model, beyond its very high forecasting accuracy, has two more advantages. The first is parsimony: the proposed approach identifies only two variables (out of the 144 used) as the most important for forecasting bank insolvency (T1CRC and TIE). The second advantage is its straightforward graphical interpretation: in the two-dimensional space defined by these two explanatory variables, a linear decision boundary is introduced. This boundary separates the space into two sub-spaces, the solvent sub-space and the insolvent sub-space. Each bank is represented by a point in this two-dimensional data space.
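Classifying a bank against a linear boundary in this two-variable space amounts to checking the sign of a weighted sum. The following is a minimal sketch of that idea: the weights and bias are hypothetical values chosen only so that higher T1CRC and lower TIE point toward solvency, as described above. They are not the coefficients of the trained model.

```python
# Hypothetical boundary parameters; NOT the fitted values from the study.
W_T1CRC = 10.0   # positive: more capital pushes a bank toward the solvent side
W_TIE = -2.0     # negative: a higher expense-to-income ratio pushes toward insolvent
BIAS = 0.0


def decision_value(t1crc, tie):
    """Signed score of a bank relative to the boundary w . x + b = 0."""
    return W_T1CRC * t1crc + W_TIE * tie + BIAS


def classify(t1crc, tie):
    """Place a bank in the solvent or insolvent sub-space by the sign of its score."""
    return "solvent" if decision_value(t1crc, tie) >= 0 else "insolvent"


print(classify(0.10, 0.30))   # solvent
print(classify(0.03, 0.60))   # insolvent
```

The magnitude of the score indicates how far a bank lies from the boundary, so a bank with a small positive score sits closer to the insolvent sub-space than one with a large positive score.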


