IPL Matches Data Analysis
With IPL Season 13 finally under way since Sept. 19, 2020, the cricket mood is on. While watching the first match, the idea of analyzing an IPL dataset struck me, and luckily I found one on Kaggle that contains data for the matches held between 2008 and 2019. So that is the dataset I shall be analyzing. Although the analysis can be improved to a great extent, I hope you like my work.
Data Preparation and Cleaning
We have selected the datasets below for the analysis; they contain data on IPL matches for the last 12 seasons:
Also, for better insights into the analysis, do refer to my Jovian Notebook here.
Our first dataset, called matches.csv, looks like this:
This dataset, as we can see, contains 18 columns including Season, City, Date, Team1, Team2, Toss Winner, Toss Decision, Result, DL Applied, Winner, Winner by Runs, Winner by Wickets, Player of the Match, Venue, Umpire1, Umpire2 and Umpire3.
From this dataset, we are removing the three columns containing the umpire names, as we are not going to use them.
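import pandas as pd
ipl_df = pd.read_csv('matches.csv')  # assumes the Kaggle file sits in the working directory
discard_columns = ['umpire1','umpire2','umpire3']
ipl_df = ipl_df.drop(discard_columns, axis=1)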
We have another dataset called deliveries.csv that contains data for every ball bowled in the last 12 seasons. It is a large dataset with 179078 rows and 21 columns. Let’s have a quick look at it:
Exploratory Analysis and Visualization
Let us first see the number of matches played in different cities in the last 12 seasons.
So, we can see in the bar plot that Mumbai (101) has hosted the most IPL matches, followed by Kolkata (77) and Delhi (74).
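A count like this can be produced with a single `value_counts` call. The snippet below is only a minimal sketch: it assumes the matches frame has a `city` column (as the Kaggle file appears to), and the toy rows are invented purely for illustration, not taken from the real data.

```python
import pandas as pd

# Toy stand-in for matches.csv; these rows are made up for illustration only.
toy_matches = pd.DataFrame(
    {"city": ["Mumbai", "Kolkata", "Mumbai", "Delhi", "Mumbai", "Kolkata"]}
)


def matches_per_city(matches: pd.DataFrame) -> pd.Series:
    """Count the matches hosted by each city, largest count first."""
    return matches["city"].value_counts()


city_counts = matches_per_city(toy_matches)
print(city_counts)
# On the real frame, city_counts.head(10).plot(kind="bar") would draw the bar plot.
```

The same pattern should work for any "how many times does each value occur" question on this dataset, provided the column name matches.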
Now, let’s see which players have scored the most runs in a single innings in the last 12 seasons.
We can see that Chris Gayle has the highest score in a match (175*), followed by Brendon McCullum and AB de Villiers.
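One way to get such a ranking is to add up each batsman's runs per match and sort the totals. This is a hedged sketch rather than the exact notebook code: the column names `match_id`, `batsman` and `batsman_runs` are assumed from the Kaggle deliveries file, and the toy rows are invented.

```python
import pandas as pd

# Toy stand-in for deliveries.csv (one row per ball); values are invented.
toy_deliveries = pd.DataFrame(
    {
        "match_id": [1, 1, 1, 1, 2, 2],
        "batsman": ["A", "A", "B", "A", "A", "B"],
        "batsman_runs": [4, 6, 1, 2, 1, 6],
    }
)


def top_individual_scores(deliveries: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    """Sum each batsman's runs within a match and return the n highest totals."""
    scores = deliveries.groupby(["match_id", "batsman"])["batsman_runs"].sum()
    return scores.sort_values(ascending=False).head(n).reset_index()


top_scores = top_individual_scores(toy_deliveries, n=2)
print(top_scores)
```

Grouping by match also folds in any runs a batsman scored in a super over of that match, which is usually a negligible difference.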
Well, if you are an IPL fan, you must be aware of the Gayle Storm that hit the Chinnaswamy Stadium on April 23, 2013. Gayle scored an unbeaten 175 against Pune Warriors with 17 sixes and 13 fours in just 66 balls, at a strike rate of 265.15. Isn’t that something only the Universe Boss can do?
In the above output, we can see that two players, Chris Gayle and AB de Villiers, appear twice, and they both play for Royal Challengers Bangalore. But here’s a fun fact: despite having world-class players like Gayle, Kohli and ABD, RCB did not win a single IPL title in these 12 seasons.
Now, let’s move to the bowlers and find out who has taken the most wickets in total over these years. Before that, we must note that run outs are not credited to the bowler, so we can discard all such dismissals.
Lasith Malinga, the Sri Lankan fast bowler, has taken the most wickets (170) in the last 12 seasons of the IPL, followed by Amit Mishra (156) and Harbhajan Singh (150).
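The run-out filter described above can be sketched as follows. The column names `bowler` and `dismissal_kind` are assumed from the Kaggle deliveries file, the helper name is hypothetical, and the toy rows are invented for illustration.

```python
import pandas as pd

# Toy stand-in for deliveries.csv; values are invented for illustration.
toy_deliveries = pd.DataFrame(
    {
        "bowler": ["X", "X", "Y", "Y", "X"],
        "dismissal_kind": ["bowled", None, "run out", "caught", "lbw"],
    }
)


def wickets_by_bowler(deliveries: pd.DataFrame, not_credited=("run out",)) -> pd.Series:
    """Count dismissals per bowler, skipping kinds not credited to the bowler."""
    dismissals = deliveries.dropna(subset=["dismissal_kind"])
    credited = dismissals[~dismissals["dismissal_kind"].isin(list(not_credited))]
    return credited["bowler"].value_counts()


wicket_counts = wickets_by_bowler(toy_deliveries)
print(wicket_counts)
```

Only run outs are excluded by default, to mirror the text. Dismissals such as "retired hurt" or "obstructing the field" are not credited to the bowler either, so they could be added to `not_credited` if the data records them under those labels.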
Now, let’s find which team has the best winning ratio (or percentage) over the last 12 seasons of the IPL.
We can see the win percentage of each team. When we look at it closely, we find that Delhi Capitals has the best stats. But when we look at the number of matches played by Delhi Capitals, we find that it is quite low compared with the other teams. This is because Delhi used to play under the name Delhi Daredevils and was later renamed Delhi Capitals.
We can see the same data in a bar plot more easily, so let’s plot it.
In this graph, we can clearly see that Delhi Capitals has the best stats. But since we know that this figure comes from the small number of matches played under that name, we can say that Mumbai Indians have the best winning percentage compared with the other teams.
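The win percentage is simply wins divided by matches played, times 100. The sketch below shows one possible way to compute it and, as an optional extra that is my own assumption rather than part of the original analysis, to merge a renamed franchise before counting. Column names `team1`, `team2` and `winner` are assumed from the Kaggle file; the toy rows and the `RENAMES` mapping are illustrative.

```python
import pandas as pd

# Toy stand-in for matches.csv; results are invented for illustration.
toy_matches = pd.DataFrame(
    {
        "team1": ["Delhi Daredevils", "Mumbai Indians", "Delhi Capitals", "Mumbai Indians"],
        "team2": ["Mumbai Indians", "Chennai Super Kings", "Chennai Super Kings", "Delhi Capitals"],
        "winner": ["Mumbai Indians", "Mumbai Indians", "Delhi Capitals", "Delhi Capitals"],
    }
)

# Hypothetical mapping used to treat the old and new names as one team.
RENAMES = {"Delhi Daredevils": "Delhi Capitals"}


def win_percentage(matches: pd.DataFrame, renames=None) -> pd.Series:
    """Return wins / matches played * 100 for every team, highest first."""
    m = matches[["team1", "team2", "winner"]]
    if renames:
        m = m.replace(renames)
    played = pd.concat([m["team1"], m["team2"]]).value_counts()
    won = m["winner"].value_counts().reindex(played.index, fill_value=0)
    return (won / played * 100).round(2).sort_values(ascending=False)


raw_pct = win_percentage(toy_matches)
merged_pct = win_percentage(toy_matches, RENAMES)
print(raw_pct, merged_pct, sep="\n\n")
```

In the toy data, Delhi Capitals shows 100% when the two names are counted separately and a more modest figure once they are merged, which is the same effect described above.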
Since the IPL is a T20 tournament, teams often pile up a lot of runs even in just 20 overs. So, let’s see which team has scored the most runs in an innings, and against which opponent.
Royal Challengers Bangalore has scored the most runs in one innings, against Pune Warriors India, followed by Kolkata Knight Riders against Kings XI Punjab and then again Royal Challengers Bangalore, against Gujarat Lions.
Well, the match in which RCB scored 263 runs is the same match in which Chris Gayle scored his unbeaten 175.
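Innings totals like these can be obtained by summing the runs of every ball in an innings. A minimal sketch, assuming the deliveries frame has `match_id`, `inning`, `batting_team`, `bowling_team` and `total_runs` columns as in the Kaggle file; the toy rows and scores are invented.

```python
import pandas as pd

# Toy stand-in for deliveries.csv; the scores are invented for illustration.
toy_deliveries = pd.DataFrame(
    {
        "match_id": [1, 1, 1, 1, 2, 2],
        "inning": [1, 1, 2, 2, 1, 1],
        "batting_team": ["RCB", "RCB", "PWI", "PWI", "KKR", "KKR"],
        "bowling_team": ["PWI", "PWI", "RCB", "RCB", "KXIP", "KXIP"],
        "total_runs": [6, 4, 1, 2, 4, 4],
    }
)


def highest_innings_totals(deliveries: pd.DataFrame, n: int = 5) -> pd.DataFrame:
    """Sum total_runs per innings and return the n largest team totals."""
    keys = ["match_id", "inning", "batting_team", "bowling_team"]
    totals = deliveries.groupby(keys, as_index=False)["total_runs"].sum()
    return totals.sort_values("total_runs", ascending=False).head(n)


top_totals = highest_innings_totals(toy_deliveries, n=2)
print(top_totals)
```

Using `total_runs` rather than `batsman_runs` means extras are included, which is what a team total normally counts.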
Now, let’s find the maximum runs scored by a team in each season.
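Since the season lives in the matches file and the runs live in the deliveries file, one way to answer this is to join the two. The sketch below assumes `id` and `season` columns in matches and `match_id`, `inning`, `batting_team` and `total_runs` columns in deliveries (as in the Kaggle files); the toy frames are invented.

```python
import pandas as pd

# Toy stand-ins for matches.csv and deliveries.csv; values are invented.
toy_matches = pd.DataFrame({"id": [1, 2, 3], "season": [2008, 2008, 2009]})
toy_deliveries = pd.DataFrame(
    {
        "match_id": [1, 1, 2, 2, 3, 3],
        "inning": [1, 1, 1, 1, 1, 2],
        "batting_team": ["A", "A", "B", "B", "A", "C"],
        "total_runs": [5, 5, 6, 6, 3, 4],
    }
)


def best_total_per_season(deliveries: pd.DataFrame, matches: pd.DataFrame) -> pd.DataFrame:
    """Return the highest innings total (and the team that made it) per season."""
    keys = ["match_id", "inning", "batting_team"]
    totals = deliveries.groupby(keys, as_index=False)["total_runs"].sum()
    with_season = totals.merge(
        matches[["id", "season"]], left_on="match_id", right_on="id"
    )
    # idxmax gives the row label of the largest total within each season.
    best_rows = with_season.loc[with_season.groupby("season")["total_runs"].idxmax()]
    return best_rows[["season", "batting_team", "total_runs"]].reset_index(drop=True)


season_best = best_total_per_season(toy_deliveries, toy_matches)
print(season_best)
```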
Asking and Answering Questions
Q. Which team has won the most matches across all seasons?
We can see that Mumbai Indians have won the most matches in the last 12 seasons of the IPL, followed by Chennai Super Kings and Kolkata Knight Riders. So, we may say that Mumbai Indians has been the most successful team in the IPL. We can also see that a few teams like Kochi Tuskers Kerala, Delhi Capitals, Pune Warriors, etc. have won very few matches. The reason is that they played only a few seasons under those names.
Q. Which player has been Man of the Match the most times?
Whoa! The Gayle Storm, Chris Gayle, has been Man of the Match the most times, followed by Mr. 360 AB de Villiers, our very own Hitman Rohit Sharma, Thalaiva MS Dhoni and the Reverend David Warner.
But here’s a quick fun fact :
Despite his outstanding performances in the IPL, Chris Gayle went unsold twice in the IPL 2018 auction before being picked up by Kings XI Punjab at his base price of ₹2 crore.
Q. Does winning the toss increase the chances of winning the match?
Out of 756 matches in the last 12 seasons, we can see that the toss-winning team won the match 393 times (about 52%) and lost it 363 times. Well, the difference is not that large.
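A comparison like this boils down to checking, match by match, whether `toss_winner` equals `winner`. The following is a rough sketch under the assumption that those two column names exist in the matches frame; the toy rows are invented. Note that it drops matches without a winner, whereas the counts quoted above add up to all 756 matches, so the numbers from the real data may differ slightly.

```python
import pandas as pd

# Toy stand-in for matches.csv; results are invented for illustration.
toy_matches = pd.DataFrame(
    {
        "toss_winner": ["A", "B", "A", "C"],
        "winner": ["A", "A", "A", "B"],
    }
)


def toss_vs_result(matches: pd.DataFrame) -> dict:
    """Count how often the toss winner also won the match."""
    decided = matches.dropna(subset=["winner"])  # skip no-result matches
    won = int((decided["toss_winner"] == decided["winner"]).sum())
    total = len(decided)
    return {"won": won, "lost": total - won, "share_pct": round(100 * won / total, 2)}


toss_summary = toss_vs_result(toy_matches)
print(toss_summary)

# The share implied by the figures quoted in the text: 393 wins out of 756.
quoted_share = round(100 * 393 / 756, 2)
print(quoted_share)
```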
Q. Who are the Top 5 scoring batsmen?
Now we understand why Virat Kohli is called the Run Machine. He has scored 5434 runs in total, followed by Suresh Raina with a slightly lower 5415 runs and Rohit Sharma with 4914 runs.
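Career totals like these can be sketched as a group-by over the ball-by-ball data. The column names `batsman` and `batsman_runs` are assumed from the Kaggle deliveries file, and the toy rows are invented.

```python
import pandas as pd

# Toy stand-in for deliveries.csv; values are invented for illustration.
toy_deliveries = pd.DataFrame(
    {
        "batsman": ["A", "B", "A", "C", "B", "B"],
        "batsman_runs": [4, 1, 6, 6, 1, 2],
    }
)


def top_run_scorers(deliveries: pd.DataFrame, n: int = 5) -> pd.Series:
    """Total runs per batsman across all matches, keeping the n highest."""
    return deliveries.groupby("batsman")["batsman_runs"].sum().nlargest(n)


top_batsmen = top_run_scorers(toy_deliveries, n=2)
print(top_batsmen)
```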
Q. What is the maximum margin of runs by which a team has won?
We know that a team can win by runs only if it bats first. So, to answer this question, we’ve first separated the teams which have batted first, then we plotted a histogram.
We can see that in around 140 matches, teams have won by a margin of 0–20 runs, whereas only one or two teams have won a match by a margin of 120–140 runs. One such match is RCB vs PWI, which we have already mentioned.
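The histogram buckets can also be reproduced as plain counts, which makes the numbers easy to read off. This is a hedged sketch that assumes a `win_by_runs` column holding 0 when the chasing side won; the bin width of 20 is chosen to match the ranges quoted above, and the toy margins are invented.

```python
import pandas as pd

# Toy stand-in for matches.csv; margins are invented for illustration.
toy_matches = pd.DataFrame({"win_by_runs": [0, 5, 18, 20, 35, 130, 0]})


def run_margin_bins(matches: pd.DataFrame, width: int = 20) -> pd.Series:
    """Count wins by runs in buckets of `width` runs: (0, 20], (20, 40], ..."""
    margins = matches.loc[matches["win_by_runs"] > 0, "win_by_runs"]
    upper = (int(margins.max()) // width + 1) * width
    edges = list(range(0, upper + 1, width))
    # sort=False keeps the buckets in ascending order, empty ones included.
    return pd.cut(margins, bins=edges).value_counts(sort=False)


margin_counts = run_margin_bins(toy_matches)
largest_margin = int(toy_matches["win_by_runs"].max())
print(margin_counts)
print(largest_margin)
```

The single largest margin, which is what the question literally asks for, is just the maximum of the column.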
Q. What is the maximum number of wickets by which a team has won?
Similarly, to answer this question, we have separated the teams which bowled first, then plotted a histogram.
We can see that around 85 matches have been won by 6 wickets, 70 matches by 5 wickets and 80 matches by 7 wickets. We can also see that 10 matches have been won by 10 wickets, meaning the opening batsmen alone were enough to chase down the target.
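Because a wicket margin is a small whole number, the histogram here amounts to counting how often each margin occurs. A minimal sketch, assuming a `win_by_wickets` column that is 0 when the side batting first won; the toy values are invented.

```python
import pandas as pd

# Toy stand-in for matches.csv; margins are invented for illustration.
toy_matches = pd.DataFrame({"win_by_wickets": [0, 6, 6, 5, 10, 7, 0]})


def wicket_margin_counts(matches: pd.DataFrame) -> pd.Series:
    """Count successful chases by the number of wickets in hand."""
    chases = matches.loc[matches["win_by_wickets"] > 0, "win_by_wickets"]
    return chases.value_counts().sort_index()


chase_counts = wicket_margin_counts(toy_matches)
print(chase_counts)
```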
Inferences and Conclusion
These are a few conclusions that I can draw from the above analysis.
- Several players perform very well in these private tournaments, yet sometimes they are not even picked for the playing eleven, or they go unsold at the auction. The reason may be inconsistency, which makes them a burden on their team. Consider Chris Gayle: he is outstanding in some matches, while in others he becomes a burden on the team.
- During the analysis, we found which team can be considered the most successful over the last 12 seasons, which batsman is the highest scorer and which bowler has taken the most wickets.
This analysis can be improved further by adding more visualizations, as the dataset is quite interesting and clean in itself.
The complete analysis can be found on my Jovian.ml account or here. Please give it some claps if you liked my work.
Thanks. You can always catch up with me here.