Apr 08 2014

Over the past few weeks, while shifting my website from one web host to another in an attempt to fight off the DDoS attacks, I started thinking about how big my stats.hockeyanalysis.com database actually is. It was on my mind because of how long it takes to upload the data to a new web host and set up the database again.

So, how many data points do I have in my database? A lot. A data point is any single piece of data, like the Leafs 2008-09 CF%, Jarome Iginla’s 2007-13 (6yr) individual Goals/60, or Jack Johnson’s CF% while playing with Drew Doughty during the 2008-09 season. Each of those is a single data point.

Here is a summary of all the data point totals by table type.

| Database Table Type | Total Records | Data Points/Record | Total Data Points |
|---|---|---|---|
| Individual+OnIce Stats | 595726 | 123 | 73274298 |
| WOWY | 3983667 | 54 | 215118018 |
| “Against You” | 10856454 | 38 | 412545252 |
| Team Data | 660 | 28 | 18480 |
| Total | | | 700956048 |

So yes, there are just over 700 million data points in my database, not including things like player names, player positions, each player’s team, etc. Once I add in all the multi-year data that includes the current season, I estimate there will be over 900 million data points.
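
If you want to check the arithmetic behind that total, here is a minimal Python sketch. It only multiplies the record counts by the data points per record listed in the table and sums the results. The names `tables` and `data_points` are made up for illustration and have nothing to do with how the database itself is built.

```python
# Record counts and data points per record, copied from the summary table.
tables = {
    "Individual+OnIce Stats": (595726, 123),
    "WOWY": (3983667, 54),
    "Against You": (10856454, 38),
    "Team Data": (660, 28),
}


def data_points(records, per_record):
    """Total data points for one table type: records times data points per record."""
    return records * per_record


# Per-table totals, then the grand total across all table types.
totals = {name: data_points(r, p) for name, (r, p) in tables.items()}
grand_total = sum(totals.values())

for name, total in totals.items():
    print(f"{name:<24}{total:>14,}")
print(f"{'Total':<24}{grand_total:>14,}")
```

The "Against You" table accounts for well over half of the total, which is simply because it has by far the most records.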

The majority, though not all (I’d estimate 70-80%), of these data points are accessible to you if you conduct the right searches. Which one of you is going to be the first to count them all?

Now, if I actually uploaded all the data I can generate (specifically WOWY and Against You data for players who have played fewer than 5 minutes with/against each other), the number of data points would rise dramatically, probably to several billion. This is why I don’t upload that data.


