FA Quiz 2 PDF

Week 7

Net Interest Margin (NIM):
- How banks make money: the difference between the interest rate they receive on loans and the rate they pay out on deposits
- Not easy to make money due to credit risk: the people they lend to might not be able to pay back → loan defaults offset the interest revenue generated by loans
- Loan interest rates need to be aligned with default risk and operating costs to ensure a net gain and minimise losses

Lending terms:

Current
- Borrower is up to date with loan payments

Delinquent
- Borrower is behind on loan payments and may incur late fees and penalty interest
- Early stage of default
- Usually measured in days (e.g. 30, 60, 90, 180); the higher the number, the more likely the loan is to end up in default → during this period the bank may begin collection actions or suspend the credit account
- Collection amounts and frequency may be adjusted, e.g. where the borrower has lost their job and is temporarily unemployed

In Default
- Borrower fails to repay the debt as agreed
- Definition of default can vary from loan to loan, such as:
  - Number of payments missed (depends on the loan agreement)
  - Being delinquent for a certain period of time
- Triggers the remainder of the loan balance to become due
- Typically banks treat 180 days delinquent as being in default
- Banks have to write off the debt and treat it as a loss

Characteristics of retail lending:
1. Large number of product loans
2. Small value
3. Require automation to be cost effective due to operating costs
4. Examples of lending products: home mortgages, auto loans and leasing, personal loans and credit cards

Determinants of lending profitability:
1. Interest rate charged on loans
2. Default rate of the loan portfolio
3. To minimise losses due to defaults:
   a. Assess potential borrowers' likelihood to default
   b. Lend to individuals that are better credit risks
   c. Limit the amount of credit provided (if they default, you don't lose so much)
   d.
      Ensure that the interest rate charged matches the average risk of default of the portfolio

Credit assessment: estimate whether an applicant will successfully repay their loan, based on
1. Applicant characteristics
2. Credit bureau information: past payments and borrowing history
3. Application information of other applicants
4. Repayment behaviour of other applicants

Types: both assume that the future will resemble the past

Judgemental
- Based on:
  - Experience of the lender / loan officer
  - Knowledge of the customer
- 5 C's: character, capital (house, car etc.), collateral, capacity (full-time / seasonal / high-paying job etc.), condition
- Manual, not practical for high volume

Statistical
- Based on multivariate correlations between inputs and risk of default
- Can automate

In the past, manual evaluation of credit requests faced these issues: slow approval, error prone and subject to individual biases. Today, it is mainly limited to (1) corporate lending, (2) high-value consumer loans and (3) borderline cases (close to being approved but not quite making it).

Retail Lending Credit Assessment Approach: Scorecards
1. Historical lending data is used to develop scorecard models, which estimate the probability of default based on a customer's profile at the time the applicant applies for the loan
2. Scorecard structure and mechanics
   a. Points are assigned to each attribute and summed to get the credit score
   b. Points are based on factors such as (1) predictive strength of the characteristic and (2) correlation between the characteristics
   c. Compare with a threshold (cut-off) defined by the institution
      i. If the score is at or above the threshold, approve credit, else deny credit
3. Scorecard structure is visible and transparent, which allows an explanation of why an applicant was approved or denied

Scorecard characteristics:
1. Decision support tool for assessing credit risk
2. Consistent with how a credit analyst would evaluate credit: simple and explainable
3.
   Predictive variables represent major information categories
4. Usually between 8 and 15 variables
5. Score reflects the probability that an applicant with any given score will be a good or bad credit risk

Credit bureau: companies that keep records on consumers and sometimes businesses, such as their use or abuse of credit. They typically get their information from creditors such as banks. Banks can also get this data about a customer's loan details from other institutions.

Statistical ML predictive model: Scorecard model parameters

Loss Event Definition
- 30/60/90 days delinquent
- Debt has to be charged off / written off due to being in default
- Borrower is bankrupt
- Claim over $1000
- Fraud over $500
- Negative Net Present Value (NPV)
- Less than x% of the amount owed collected

Variable availability
- Which variables to use in the model, and their selection
- Demographic, behaviour data & bureau score information

Credit scores
- Can get scores from bureaus or third parties that take bureau data and generate credit scores
- e.g. FICO scores range from 300 to 850
- Raw bureau data, e.g. number of credit checks, total amount of credit & delinquency history

Scorecard usage:
- Terms can be adjusted for low scores due to the high chance of default: higher mortgage down payment, higher interest to offset risk, requesting further documentation, having a lower beginning credit limit etc.
- Terms can be adjusted for high scores due to the low chance of default: preferential rates such as lower interest rates, higher credit limits and premium cards etc.

Drivers for using credit scorecards:
1. Can generate and test multiple scorecards based on different input parameters or individual customer segments, e.g. a lower cut-off score depending on job status, such as high-value loans for doctors, OR new customers such as students with no established credit history, who cannot be mapped onto a traditional scorecard that puts a lot of weight on income
2. Easy to understand and use
3.
   Automation / processing efficiency / repeatability: easy to modify algorithm and insensitive to changes in input data
4. Regulatory concerns: fairness and transparency → able to provide reasons for declining credit

Scorecard development strategies: how to create one?
1. Buy a generic or pooled-data scorecard: an option when you do not have the resources or data to create one, but it might not be tuned to your customer segment → not specific to the institution's customers and products
2. Outsource to a credit risk vendor:
   a. Provide data to the vendor and they develop the scorecards
   b. Might not fully integrate business logic, and you might not understand how it was created
3. Develop in-house:
   a. Requires data, internal expertise and software tools

Data Exploration - Lab

histogram - .hist()
- Distribution of numeric variables

boxplot - .boxplot()
- Visual representation of outliers for numerical variables
- Valid and invalid outliers, e.g. salary of a CEO (valid) & negative ages (invalid)

countplot - .countplot()
- Bar charts
- Distribution of categorical variables
- Distribution of events & non-events

Kernel density estimation (kde) - .kdeplot()
- Distribution of events & non-events

Correlation heat maps - .corr() for the correlation, .heatmap() for the heatmap
- Understand which variables may overlap and be redundant

z-score
- Measures how many standard deviations (std) an observation is away from the mean
- Outliers have a z-score of more than 3

Week 9

Data merging:
1. Can use a left join, but duplicate records (from one-to-many relationships) may inflate the number of bad/good events, as ML algorithms assume each row to be a unique observation, thus skewing results and lowering accuracy
2. An approach to prevent duplicate records is to append observations as columns all in one row (acct id 1, acct id 2 etc.) → many columns are added and too many may be empty
3. Feature selection and engineering: selectively add additional columns, e.g. avg/median account amount etc.

Common problems with data:
1. Dirty/noisy: e.g. negative age
2. Inconsistent data: zero means actual zero or missing value
3.
   Incomplete data: missing income data
4. Data integration and data merging problems: currency difference
5. Overlapping data fields: salary vs income

Missing data:

Completely at random
- Unrelated to other data, e.g. applicant didn't see the data

At random
- Reason for the missing value can be inferred from other data, e.g. missing professional data where age < 10

Not at random
- Underlying reason for the missing data, e.g. someone with a very low salary does not want to share it
- Structural reason

Options for dealing with missing values:

Keep
- The fact that a variable is missing can be important information → encode it, e.g. as 'Missing'

Delete
- When too many values are missing
- Drop rows (horizontally missing) or columns (vertically missing)

Replace
- Estimate using imputation procedures, e.g. mean, median, mode etc.

Treatment of outliers:
- Invalid outliers: treat as missing value → keep, delete or replace
- Valid outliers: truncate based on z-scores:
  - Replace all variable values if z-score > 3 std by the mean and
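
The keep / delete / replace options and the z-score rule can be tried out with a small pandas sketch. The data, the column names (`age`, `income`, `job`) and the choice of median imputation are assumptions made up for illustration, not taken from the lab. The sketch only flags z-score outliers and does not apply any truncation rule.

```python
import numpy as np
import pandas as pd

# Hypothetical toy applicants; column names and values are made up for illustration.
df = pd.DataFrame({
    "age": [25, 40, -3, 31, 58],
    "income": [3000.0, np.nan, 4500.0, np.nan, 5200.0],
    "job": ["nurse", None, "driver", "teacher", None],
})

# Invalid outlier (negative age): treat it as a missing value.
df["age"] = df["age"].where(df["age"] >= 0)

# Keep: missingness may itself be informative, so encode it as its own category.
df["job"] = df["job"].fillna("Missing")

# Replace: impute with the median (mean or mode are alternatives).
df["income"] = df["income"].fillna(df["income"].median())

# Delete: drop the rows that still have a missing value (here the invalid age).
cleaned = df.dropna()

# z-score: how many standard deviations an observation lies from the mean.
# Assumption: the absolute value is used so that low outliers are flagged too.
salary = pd.Series([3000.0] * 20 + [100000.0])  # one very large but valid salary
z = (salary - salary.mean()) / salary.std()     # pandas uses the sample std (ddof=1)
is_outlier = z.abs() > 3

print(cleaned)
print(salary[is_outlier])
```

Note that a z-score above 3 can only occur with enough observations, which is why the salary example uses 21 values rather than the 5-row table.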

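
For the Week 9 data merging notes, the one-to-many problem and the feature-engineering fix can be sketched as follows. The tables `customers` and `accounts` and their columns are hypothetical; the point is only that aggregating to one row per customer before the left join keeps each customer as a single observation.

```python
import pandas as pd

# Hypothetical tables: one row per customer, many account rows per customer.
customers = pd.DataFrame({"cust_id": [1, 2, 3], "default": [0, 1, 0]})
accounts = pd.DataFrame({
    "cust_id": [1, 1, 1, 2],
    "acct_id": [10, 11, 12, 20],
    "balance": [100.0, 300.0, 200.0, 50.0],
})

# Naive left join: customer 1 is repeated once per account,
# so their good/bad outcome is counted three times.
naive = customers.merge(accounts, on="cust_id", how="left")

# Feature engineering: summarise the accounts to one row per customer first.
acct_features = accounts.groupby("cust_id").agg(
    n_accounts=("acct_id", "count"),
    avg_balance=("balance", "mean"),
    median_balance=("balance", "median"),
).reset_index()

# validate="one_to_one" makes pandas raise an error if duplicate keys slip in.
merged = customers.merge(acct_features, on="cust_id", how="left",
                         validate="one_to_one")

print(len(naive), len(merged))
```

Customer 3 has no accounts, so the engineered columns are missing for that row and fall under the missing-value options in the notes.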