BigQuery can process billions of records in seconds. Add a script to automate the task and load the result into a spreadsheet and, hey presto, you have information that can be presented in tables and charts.
Google announced BigQuery back in May as a way of allowing more or less anyone to access "big data" using basic SQL queries. You upload data to Google Storage and then use SQL to extract the data you are interested in. The API works with REST and JSON, and its key feature is speed: a query against a dataset of 50 million rows returns in a few seconds. Google also provides some sample datasets totalling some 60 billion records, and queries against them still return in around 5 seconds.
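To give a feel for the kind of "basic SQL" involved, here is a minimal sketch in Python. It does not talk to BigQuery at all: an in-memory SQLite database stands in for the hosted dataset, and the table name `page_views`, its columns and the sample rows are all invented for illustration. The point is only the shape of the query, which boils many rows down to a handful of totals.

```python
import sqlite3

# Stand-in for a large hosted table: an in-memory SQLite database.
# The table `page_views` and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (country TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("UK", 120), ("US", 300), ("UK", 80), ("DE", 50), ("US", 100)],
)

# A basic SQL aggregation: collapse many rows into one total per country,
# largest total first (ties broken alphabetically by country).
query = """
    SELECT country, SUM(views) AS total_views
    FROM page_views
    GROUP BY country
    ORDER BY total_views DESC, country ASC
"""
top_countries = conn.execute(query).fetchall()
conn.close()

for country, total in top_countries:
    print(country, total)
```

The same pattern, aggregate first and look at the small result afterwards, is what makes a very large table manageable from a simple client.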
In case you missed the video showing both the BigQuery and Prediction API announcements, you can view it below:
Even though BigQuery is fairly easy to use, it could still be easier and more flexible. What Google has done now is to integrate BigQuery with Google Apps Script and Spreadsheet. A spreadsheet is a natural follow-on to querying data: once a query has reduced the data to a smaller subset, the spreadsheet can be used to process it and present it as simple charts. Put this together with a script and you have an automated way to repeatedly query big data.
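The query-then-spreadsheet workflow can be sketched roughly as follows. This is not Apps Script and not the BigQuery API: SQLite again stands in for the data store, CSV text stands in for the spreadsheet, and the helper `query_to_csv`, the `sales` table and its rows are hypothetical names made up for the example. A scheduled job could call such a helper repeatedly to refresh the sheet.

```python
import csv
import io
import sqlite3


def query_to_csv(conn, sql, out):
    """Run `sql` and write a header row plus the result rows to `out` as CSV.

    Returns the number of data rows written. (Hypothetical helper.)
    """
    cursor = conn.execute(sql)
    writer = csv.writer(out, lineterminator="\n")
    # Column names come from the query itself, so the sheet gets a header.
    writer.writerow([col[0] for col in cursor.description])
    rows = cursor.fetchall()
    writer.writerows(rows)
    return len(rows)


# Stand-in data store with an invented `sales` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("south", 25), ("north", 5)],
)

# Reduce the data with SQL, then hand the small result to the "spreadsheet".
buffer = io.StringIO()
row_count = query_to_csv(
    conn,
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region ORDER BY region",
    buffer,
)
csv_text = buffer.getvalue()
conn.close()

print(csv_text)
```

Writing to a file-like object keeps the helper independent of where the rows end up, whether that is a file on disk, an upload, or an in-memory buffer as here.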
The only downside is that BigQuery is still in a closed beta test, so it is difficult to find out the exact workings of the system. If and when it is opened to the rest of us, it will bring big data to the desktop without the need to invest in clusters or Hadoop expertise. Of course, given that Google have just announced the pricing structure for their Prediction API, BigQuery is unlikely to be free when it is released into the wild.