(Yes, I just made an AC/DC reference)
First off, huge shout-out to LatentView Analytics. Thank you for the job placement offer! It’s given me so much peace of mind, and in turn, freedom to experiment with new stuff.
About a week back, I heard about LatentView’s new event for final- and pre-final-year students - “The Number Thing”. Or, TNT. Perfect experimentation ground.
All the problems were interesting. It was just a matter of picking the right one.
Cricket stat analysis. Crime pattern analysis. Text analysis.
One sports analytics question, and I was sold. That was my choice. Cricket Craze.
I knew it would be a lot of fun. I also expected it to be challenging. What I didn’t expect was a 175,373-row dataset, the biggest I’ve had to work with so far. That made it a whole new challenge for me, and I had to add new tools to my arsenal to tackle it. This was going to be a crazy ride.
I decided to blog about my approach to this problem. Think of it as documentation. Over the next few posts, I’ll break down my approach - from the initial steps, all the way down to the final result (hopefully!).
If you’re reading this and think there’s a better way than the approach I’ve taken, do send your suggestions! Also, suggest resources to help me get started with them. You can write to me at alexmathew003[at]gmail[dot]com.
Check out the “TNT” category on the blog to read the posts. I’ll make regular updates as I make progress.