A small example of credit scoring analysis, based on the data mining course of my former PhD adviser Tomas Aluja.
The project contains two datasets in CSV format (raw data and cleaned data), as well as the R scripts for the analysis:
- Part 1 - Data Processing
- Part 2 - Profiling
- Part 3 - Principal Components Analysis
- Part 4 - Multiple Correspondence Analysis
- Part 5 - Clustering Analysis
- Part 6 - Decision Trees
- Part 7 - Logistic Regression
The raw dataset is in the file "CreditScoring.csv", which contains 4455 rows and 14 columns:
| # | Variable | Description |
|---|-----------|-------------|
| 1 | Status | credit status |
| 2 | Seniority | job seniority (years) |
| 3 | Home | type of home ownership |
| 4 | Time | time of requested loan |
| 5 | Age | client's age |
| 6 | Marital | marital status |
| 7 | Records | existence of records |
| 8 | Job | type of job |
| 9 | Expenses | amount of expenses |
| 10 | Income | amount of income |
| 11 | Assets | amount of assets |
| 12 | Debt | amount of debt |
| 13 | Amount | requested loan amount |
| 14 | Price | price of good |
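
As a quick sanity check before running the scripts, the raw file can be compared against the dimensions and variable names listed above. The following is a minimal sketch in R, not part of the project's own scripts: the helper `check_credit_data` is hypothetical, and it assumes the CSV header uses exactly the variable names shown in the table.

```r
# Variable names as listed in the table above; the actual CSV header is
# ASSUMED to match them exactly (same spelling, same order).
expected_cols <- c("Status", "Seniority", "Home", "Time", "Age", "Marital",
                   "Records", "Job", "Expenses", "Income", "Assets", "Debt",
                   "Amount", "Price")

# Hypothetical helper: returns TRUE when a data frame has the expected
# number of rows and exactly the expected column names, FALSE otherwise.
check_credit_data <- function(df, n_rows = 4455, cols = expected_cols) {
  stopifnot(is.data.frame(df))
  ok_rows  <- nrow(df) == n_rows
  ok_names <- identical(names(df), cols)
  ok_rows && ok_names
}

# Usage, assuming "CreditScoring.csv" sits in the working directory:
# credit <- read.csv("CreditScoring.csv")
# check_credit_data(credit)
```

If the check returns `FALSE`, comparing `dim(credit)` and `names(credit)` with the table is usually enough to spot the mismatch.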