Data Analysis on FICO Scores.
Your FICO has changed!
Data analysis is not about making things complicated. It is about simplifying them and inferring as much information as possible.
A FICO score is a credit score created by the Fair Isaac Corporation (FICO). Lenders use borrowers’ FICO scores along with other details on borrowers’ credit reports to assess credit risk and determine whether to extend credit.
A FICO score is a three-digit number based on the information in your credit reports. It helps lenders determine how likely you are to repay a loan.
This, in turn, affects how much you can borrow, how many months you have to repay, and how much it will cost (the interest rate).
In general, many lenders consider scores of 670 and above to indicate good creditworthiness.
Typically, the higher your score, the lower the risk and the more likely creditors are to lend to you.
There are general score ranges recognized by creditors to help them make lending decisions.
These ranges can also serve as goals for you to achieve.
The scores can be categorized as below:
- Poor → 300 to 579
- Fair → 580 to 669
- Good → 670 to 739
- Very Good → 740 to 799
- Excellent → 800 to 850
Let’s do a basic EDA (exploratory data analysis) on the data we have:
Here, we have two columns: Account Id and FICO Score.
Check for null values.
Remove (drop) duplicate rows.
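The two cleaning steps above can be sketched in a few lines. This is only an illustration in plain Python, not the original notebook code: the sample rows and the helper names `count_nulls` and `drop_duplicates` are made up for this example. If you work in pandas, the same steps are usually done with `isnull()` and `drop_duplicates()` on a DataFrame.

```python
# Hypothetical sample rows: (account_id, fico_score). None marks a missing value.
rows = [
    ("A001", 712),
    ("A002", 805),
    ("A003", None),
    ("A002", 805),  # exact duplicate of an earlier row
    ("A004", 768),
]


def count_nulls(rows):
    """Count rows where the account id or the score is missing."""
    return sum(1 for account_id, score in rows if account_id is None or score is None)


def drop_duplicates(rows):
    """Keep the first occurrence of each row and preserve the original order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


null_count = count_nulls(rows)
deduped_rows = drop_duplicates(rows)

print(f"rows with missing values: {null_count}")
print(f"rows before / after dropping duplicates: {len(rows)} / {len(deduped_rows)}")
```

Note that the sketch only counts the missing values. Whether to drop or fill them depends on your dataset.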
Now, bucketing! But why?
- The FICO scores in our dataset are clustered closely together, so they need to be segregated and categorized before we can draw better conclusions.
- Data binning (or bucketing) groups data into bins (or buckets): it replaces the values that fall within a small interval with a single representative value for that interval. Sometimes binning improves accuracy in predictive models.
After bucketing, we’ll have data that is sorted and grouped by category.
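Here is a minimal sketch of the bucketing step, using the score ranges listed earlier. It assumes that a boundary score belongs to the higher bucket (so 580 counts as Fair and 670 as Good). The function name `fico_bucket` and the sample scores are made up for illustration. In pandas, `pd.cut` with explicit `bins` and `labels` is a common way to do the same thing.

```python
from bisect import bisect_right
from collections import Counter

# Lowest score of each bucket after "Poor" (assumption: a boundary score
# belongs to the higher bucket).
BOUNDARIES = [580, 670, 740, 800]
LABELS = ["Poor", "Fair", "Good", "Very Good", "Excellent"]


def fico_bucket(score):
    """Map a FICO score (300 to 850) to its category label."""
    if not 300 <= score <= 850:
        raise ValueError(f"score outside the FICO range: {score}")
    # bisect_right counts how many boundaries are <= score,
    # which is exactly the index of the matching label.
    return LABELS[bisect_right(BOUNDARIES, score)]


# Hypothetical scores, chosen to sit on and around the boundaries.
scores = [579, 580, 669, 670, 745, 799, 800, 850]
buckets = [fico_bucket(s) for s in scores]
bucket_counts = Counter(buckets)

for score, bucket in zip(scores, buckets):
    print(score, "->", bucket)
```

Once every score carries a label, counting accounts per bucket is a one-liner with `Counter`, and those counts are what the visualization is built from.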
Pie chart Visualization.
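The slices of a pie chart are simply each bucket's share of the total, so it helps to compute those percentages first. The sketch below uses made-up labels, not the article's dataset, and `bucket_shares` is a hypothetical helper name. The resulting values can then be handed to a plotting call such as matplotlib's `pyplot.pie`.

```python
from collections import Counter


def bucket_shares(labels):
    """Return each label's share of the total as a percentage (1 decimal place)."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Hypothetical bucket labels, one per account.
labels = ["Very Good"] * 5 + ["Excellent"] * 3 + ["Good"] * 1 + ["Fair"] * 1
shares = bucket_shares(labels)

# Print the largest slice first.
for label, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{label:<10} {pct:5.1f}%")
```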
From the above analysis and visualization, we can clearly see that:
- The largest group of customers has a “Very Good” FICO score.
- There is also a good number of customers with “Excellent” FICO scores.
- Only about 15% have Fair, Good, or Poor FICO scores.
But this is somewhat different from what happens in the real world.
FICO scores take into account data in five areas to determine creditworthiness:
- payment history,
- current level of indebtedness,
- types of credit used,
- length of credit history, and
- new credit accounts.
- Achieving a high FICO score requires having a mix of credit accounts and maintaining an excellent payment history.
- Borrowers should also show restraint by keeping their credit card balances well below their limits.
- Most consumers have credit scores that fall between 600 and 750 (the Fair to Very Good range).
- In 2020, the average FICO score in the U.S. reached 710 (Good), an increase of seven points from the previous year.
- 67% of Americans have a Good FICO score or better.
- In our dataset, by contrast, 95% have a Good FICO score or better.
That is a mismatch of nearly 30 percentage points (95% versus 67%) from the real-world scenario. So our dataset clearly does not fully reflect the real world.
2. Data Analysis on a Website Traffic dataset:
Info about the columns:
- The majority of the audience (70%) comes from France, and the remaining 30% comes from the United States.
- We can clearly see this in the graph below:
- Out of around 35 thousand clicks:
- The largest number of clicks (around 23K) comes from France.
- Around 12K clicks come from the United States.
3. CTR (click-through rate): the basic, general formula is CTR = Clicks / Impressions.
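As a quick worked example of the formula, here is a small sketch. The function name `click_through_rate` and the numbers are made up for illustration and do not come from the traffic dataset.

```python
def click_through_rate(clicks, impressions):
    """CTR = clicks / impressions, returned as a fraction between 0 and 1."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


# Hypothetical numbers: 50 clicks out of 2,000 impressions.
ctr = click_through_rate(50, 2000)
print(f"CTR: {ctr:.2%}")
```

CTR is usually reported as a percentage, which is why the sketch formats the fraction with `:.2%`.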
On an event basis:
Highest clicks by browser type:
We can state:
On a single day, Chrome mobile had the most click sessions, followed by Chrome desktop and Safari.
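A ranking like this comes from counting click events per browser. Below is a minimal sketch of that idea. The event rows and browser labels are invented for illustration, so the counts are not the ones from the dataset.

```python
from collections import Counter

# Hypothetical events as (browser, event_type) pairs.
events = [
    ("Chrome Mobile", "click"),
    ("Chrome", "click"),
    ("Chrome Mobile", "click"),
    ("Safari", "pageview"),
    ("Safari", "click"),
    ("Chrome Mobile", "pageview"),
    ("Chrome", "click"),
    ("Chrome Mobile", "click"),
]

# Keep only click events, then count them per browser.
clicks_by_browser = Counter(browser for browser, event in events if event == "click")

# most_common() returns (browser, clicks) pairs sorted from most to fewest clicks.
ranking = clicks_by_browser.most_common()
for browser, clicks in ranking:
    print(f"{browser:<14} {clicks}")
```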
Note: All I wanted to show emerging data analysts through this simple article is that data analysis is not as difficult as others say. It is as simple as what we have done here: it comes down to the data and the information you are able to extract from it. That’s it. Stick to the basics, and then the sky is the limit.