Syllabus FinTech analytics - credit risk - 55790
Last update 19-03-2019
HU Credits: 1

Degree/Cycle: 2nd degree (Master)

Responsible Department: Business Administration

Semester: 2nd Semester

Teaching Languages: English

Campus: Mt. Scopus

Course/Module Coordinator: Prof ROGER STEIN

Coordinator Email: steinr@mit.edu

Coordinator Office Hours:

Teaching Staff:
Prof ROGER STEIN

Course/Module description:
Data-driven credit analytics have become increasingly prevalent among both traditional
banks and new lending platform companies. However, machine learning and statistical algorithms are only a
small part of what is involved in building robust risk analytics. This short seminar focuses on the practical
challenges that arise in implementing a variety of data-driven credit models (e.g., bankruptcy and default
models for retail and commercial entities). With a focus on large data sets, we explore a number of data-driven
approaches to modeling the likelihood that credit-risky borrowers will default on their obligations. I will
draw heavily on my experiences building and evaluating some of the most widely used and commercially
successful data-driven credit evaluation tools in the industry. This seminar will tend heavily towards discussions
of practical model implementations and the “frictions” that make these implementations difficult in
real-world settings. We pay special attention to validating discrete-choice models in real-world settings. We
will not focus as heavily on the structure of credit markets or the details of pricing a broad variety of
credit-risky instruments.
We will take the view that an effective, practical credit modeling framework will be rough around the edges
with the odd inconsistency (usually to deal with available data or the lack thereof). This implies that seemingly
incompatible models can each have value in specific contexts, resulting in retention of several models
despite their theoretical inconsistency. Because the focus is applied, we will discuss model validation and
calibration in detail and highlight data issues in estimation and validation. Since credit models for corporate
debt are the best developed, we will deal most extensively with these models. Lectures will focus on
conceptual themes and practical issues, with much of the technical detail underlying these to be found in the
readings.
I will also provide suggested “mini-projects” for those students who are more technically inclined. These
projects serve to provide motivation and, if you do them, you will leave the seminar with some very useful
tools for applying this subject matter in practice.

Course/Module aims:
To expose students to the practical challenges associated with building and testing single-borrower
credit risk models, such as those used by banks, as well as to the types of modeling techniques that can
be used to build them. These “mini-projects” are described using R syntax, though they may be implemented
in any language in which you work (Python, SAS, Matlab, etc.). I will go over a “solution” to at least one of
these during the seminar.

Learning outcomes - On successful completion of this module, students should be able to:
build and test single-borrower credit risk models, such as those used by banks.

Attendance requirements(%):

Teaching arrangement and method of instruction:

Course/Module Content:
Day 1
Introduction to credit risk modeling concepts, the challenge of data analytics and the nature of
FinTech platforms
• How can we add value in developing data-driven analytics?
• The features of successful FinTech platforms
• Data problems and resolutions
• Key components of credit risk – PD, LGD, (EAD), correlation, size
• Differing modeling paradigms
• Diversification
ACPMIP: Chapter 1, pp. 2-16; 19-23; 32-34; 38; 42-43. Chapter 2, pp. 60-62; 72-74.
Supplemental readings:
• Dhar, V. and R. Stein (1997), Seven Methods for Transforming Corporate Data into Business Intelligence, Prentice
Hall, NJ. Chapter 3.
• Dhar, V. and R. M. Stein (2017), “Economic and Business Dimensions on FinTech Platforms and Strategy,”
Communications of the ACM, 60(10), October, pp. 32-35.
Day 2
Introduction to PD model validation
• Validating model power using ROC curves
• Validating model calibration using probability-based measures
ACPMIP: Chapter 7, pp. 361-397.
• Stein, R. M., A. E. Kocagil, J. Bohn and J. Akhavain (2003). “Systematic and Idiosyncratic Risk in Middle-Market
Default Prediction: A Study of the Performance of the RiskCalc and PFM Models.” Moody’s KMV.
For those interested, see if you can write this tool: a function to calculate the ROC AUC for two different subsets of a single data set.
• Definition: subROC <- function(x, split.val, split.on, score, outcome, ...) where
  o x is a data frame
  o split.val is a scalar, factor value, date or string used to divide the data frame
  o split.on is a vector of length nrow(x) of the same type as split.val; records with split.on <= split.val go to one data
    subset while the remainder go to the other
  o score is a numerical vector of length nrow(x) for calculating the ROC AUC
  o outcome is a binary numerical vector of length nrow(x) for calculating the ROC AUC
  o ... additional parameters
• Return: The function should return a list with three elements:
  o A vector of length 2 with the ROC AUC for each data subset
  o A vector of length nrow(x.subset1) giving the indices of x for the records included in x.subset1
  o A vector of length nrow(x.subset2) giving the indices of x for the records included in x.subset2
• Tasks:
  o Implement subROC
  o Describe how you would make subROC more general so that it could take an arbitrary one- or two-variable statistic
    (function) as an input and return the appropriate data
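As a starting point, a minimal sketch of the subROC mini-project might look like the following. It is written in Python rather than R, which the outline permits, and it makes several assumptions that are not in the specification: the data frame argument x is dropped because only the vectors are needed, the returned indices are 0-based, higher scores are taken to mean higher default risk (outcome = 1), and the AUC is computed with the rank-sum (Mann-Whitney) identity. The names roc_auc, sub_roc and stat are illustrative only.

```python
from typing import Callable, Dict, List, Sequence


def roc_auc(score: Sequence[float], outcome: Sequence[int]) -> float:
    """ROC AUC from the rank-sum identity; tied scores share an average rank."""
    n = len(score)
    order = sorted(range(n), key=lambda i: score[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < n and score[order[j + 1]] == score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the tie group
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg_rank
        i = j + 1
    n_pos = sum(1 for y in outcome if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")  # AUC is undefined without both classes
    rank_sum_pos = sum(r for r, y in zip(ranks, outcome) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def sub_roc(split_val, split_on: Sequence, score: Sequence[float],
            outcome: Sequence[int],
            stat: Callable[[List[float], List[int]], float] = roc_auc) -> Dict:
    """Split records on split_on <= split_val and apply stat to each subset."""
    idx1 = [i for i, v in enumerate(split_on) if v <= split_val]
    idx2 = [i for i, v in enumerate(split_on) if not (v <= split_val)]

    def apply(idx: List[int]) -> float:
        return stat([score[i] for i in idx], [outcome[i] for i in idx])

    return {"auc": [apply(idx1), apply(idx2)], "idx1": idx1, "idx2": idx2}


# Made-up illustration: split eight records by year.
score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.3, 0.7]
outcome = [0, 0, 1, 1, 0, 1, 1, 0]
year = [2015, 2015, 2015, 2015, 2016, 2016, 2016, 2016]
result = sub_roc(2015, year, score, outcome)
print(result["auc"])  # [0.75, 0.75]
```

Passing the statistic as the stat argument is one possible answer to the generalization task: any function of (score, outcome) can be substituted for roc_auc without changing the splitting logic.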
VERSION: 10/31/2018
A short introduction to data-driven default modeling
Day 3
Data-driven models for default prediction
• Discrete choice models
• Survival models
Tree-based models
• CART
• RandomForests
ACPMIP: Chapter 4, pp. 183-215, 238-252.
Supplemental readings:
• Dhar, V. and Stein, R. (1997), Seven Methods for Transforming Corporate Data into Business Intelligence, Ch. 10.
• Friedman, Hastie and Tibshirani (2013), Elements of Statistical Learning, Section 9.2, pp. 305-311.
Day 4
Introduction to PD model calibration
• Calibrating to empirical data using calibration curves
• Adjusting for differing baseline default rates
• Mapping between ratings and PDs and back again
ACPMIP: Chapter 4, pp. 215-233.
Supplemental readings:
• Stein, R. M., A. E. Kocagil, J. Bohn and J. Akhavain (2003). “Systematic and Idiosyncratic Risk in Middle-Market
Default Prediction: A Study of the Performance of the RiskCalc and PFM Models.” Moody’s KMV.
For those interested, see if you can write these tools:
• Function to create a calibration curve mapping a variable to a default rate.
• Function to use the calibration curve to map a variable to a default rate (including interpolation).
• Definition: estimateCalibCurve <- function(x, outcome, k, ...) where
  o x is a vector of model scores
  o outcome is a binary numerical vector of length length(x)
  o k is a scalar, denoting the number of “buckets” to use in the mapping
• Return: The function should return a list containing:
  o map: a data frame with k rows and two columns: the cutoff (on the same scale as x) and the mapped PD
    corresponding to the cutoff
  o baseline: a scalar containing the baseline PD for outcome
• Definition: applyCalibCurve <- function(x, map, baseline=NULL, ...) where
  o x is a vector of model scores
  o map is the data frame returned in map by estimateCalibCurve
  o baseline is a scalar, to be used if baseline adjustment is to be applied after mapping
• Return: The function should return a vector of length length(x) containing the mapped PD for each element of x.
• Tasks:
  o Implement estimateCalibCurve
  o Implement applyCalibCurve
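One possible shape for the two calibration functions is sketched below, again in Python rather than R. The choices here are assumptions, not part of the specification: buckets hold near-equal numbers of records, each bucket's "cutoff" is taken to be its mean score, interpolation is linear with flat extrapolation beyond the end buckets, and the baseline adjustment uses a standard Bayes-rule (odds) correction that needs the sample baseline as an extra argument. The names estimate_calib_curve, apply_calib_curve, adjust_baseline, curve_map and sample_baseline are hypothetical.

```python
from bisect import bisect_left
from typing import Dict, List, Optional, Sequence


def estimate_calib_curve(x: Sequence[float], outcome: Sequence[int], k: int) -> Dict:
    """Sort scores, cut them into k near-equal-count buckets, and record
    each bucket's mean score and observed default rate."""
    n = len(x)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(x)")
    order = sorted(range(n), key=lambda i: x[i])
    cutoffs: List[float] = []
    pds: List[float] = []
    for b in range(k):
        members = order[b * n // k:(b + 1) * n // k]
        cutoffs.append(sum(x[i] for i in members) / len(members))
        pds.append(sum(outcome[i] for i in members) / len(members))
    return {"map": {"cutoff": cutoffs, "pd": pds}, "baseline": sum(outcome) / n}


def adjust_baseline(p: float, sample_rate: float, target_rate: float) -> float:
    """Rescale a PD estimated at sample_rate to a population with target_rate."""
    if not 0.0 < sample_rate < 1.0:
        raise ValueError("sample_rate must lie strictly between 0 and 1")
    num = p * target_rate / sample_rate
    den = num + (1.0 - p) * (1.0 - target_rate) / (1.0 - sample_rate)
    return num / den


def apply_calib_curve(x: Sequence[float], curve_map: Dict[str, List[float]],
                      baseline: Optional[float] = None,
                      sample_baseline: Optional[float] = None) -> List[float]:
    """Map scores to PDs by linear interpolation between bucket points."""
    cut, pd_ = curve_map["cutoff"], curve_map["pd"]
    out: List[float] = []
    for s in x:
        if s <= cut[0]:
            p = pd_[0]            # flat extrapolation below the first bucket
        elif s >= cut[-1]:
            p = pd_[-1]           # flat extrapolation above the last bucket
        else:
            j = bisect_left(cut, s)  # guarantees cut[j-1] < s <= cut[j]
            w = (s - cut[j - 1]) / (cut[j] - cut[j - 1])
            p = pd_[j - 1] + w * (pd_[j] - pd_[j - 1])
        out.append(p)
    if baseline is not None:
        if sample_baseline is None:
            raise ValueError("sample_baseline is required for adjustment")
        out = [adjust_baseline(p, sample_baseline, baseline) for p in out]
    return out


# Made-up illustration with eight scores and two buckets.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
defaults = [0, 0, 0, 1, 0, 1, 1, 1]
curve = estimate_calib_curve(scores, defaults, k=2)
mapped = apply_calib_curve([0.0, 0.45, 1.0], curve["map"])
```

With only two buckets the curve is a single line segment between the bucket points; in practice k would be chosen large enough to show the shape of the score-to-PD relationship while keeping enough defaults in each bucket.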
Day 5
Introduction to portfolio and correlation models
• Credit loss distributions, VaR and ES
• Default correlation
• Asset correlation and asset-factor models
• Macro-economic correlation models
Introduction to structured finance
ACPMIP: Chapter 8
Supplemental readings:
• Fernandez, JM, R. M. Stein and A. W. Lo. 2012. “Commercializing biomedical research through securitization techniques,”
Nature Biotechnology, 30(10).
• Das, A. and R. M. Stein. 2013. “Differences in tranching methods: Some results and implications.” Credit Securitizations
and Derivatives. Wiley.

Required Reading:
Bohn, J. R. and R. M. Stein, (2009) Active Credit Portfolio Management in Practice, NY, Wiley.
(ACPMIP).

Additional Reading Material:
1-2 papers per course session may be recommended (see outline).

Course/Module evaluation:
End of year written/oral examination 100 %
Presentation 0 %
Participation in Tutorials 0 %
Project work 0 %
Assignments 0 %
Reports 0 %
Research project 0 %
Quizzes 0 %
Other 0 %

Additional information:
This seminar is a highly compressed version of a full-semester course I give. Much of that course focuses on implementing
the ideas through hands-on data science projects with large data sets. While I will allude to this work from
time to time, much of the technical detail will be omitted. No programming is required for this seminar.
 
Students needing academic accommodations based on a disability should contact the Center for Diagnosis and Support of Students with Learning Disabilities, or the Office for Students with Disabilities, as early as possible, to discuss and coordinate accommodations, based on relevant documentation.
For further information, please visit the site of the Dean of Students Office.