Google Cloud Products in practice - BigQuery
by Robin Laurens, on Nov 2, 2017 12:18:39 PM
Every month we dig into a Google Cloud Product that makes it possible to capture, process, store and analyze large amounts of data in the Cloud. We don't just describe what these products do, as that is already well explained on the Google website. Instead, we show how each product applies to business challenges by describing how it works in practice. We start today with the product we think is one of the most important and useful: Google BigQuery.
BigQuery is a cloud database that makes it possible to store massive amounts of data and query them in seconds. Where on-premise SQL (Structured Query Language) queries could take hours or sometimes even days to run, BigQuery can return results in seconds using the power of Google's infrastructure.
Running queries with Google BigQuery is simple, thanks to its easy-to-use interface. You need only two things: data and the knowledge to write queries. There is no complicated and expensive on-premise infrastructure to support the queries, and no time wasted on defining and building the right indexes. It's just data, Google and you. You can use Google BigQuery in two ways: through the web user interface or with the bq command-line tool. For practical guides on how to run your first query, click here.
Like all other Google Cloud Products, BigQuery uses a pay-as-you-go billing model. Your consumption is measured and billed at the end of every month, and the costs depend on three factors: storage, queries and direct insertion of data into the product. Google developed an excellent tool to estimate your expenses, which you can find here.
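To make the three cost factors concrete, here is a minimal Python sketch of how a monthly bill could be added up. The class and function names are our own, the billing units (per GB stored, per TB scanned by queries, per GB inserted) are assumptions, and the rates in the usage example are placeholders rather than Google's actual prices. Always take real rates from Google's pricing page or estimation tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices; these are NOT Google's real rates."""
    storage_per_gb_month: float   # price per GB stored for one month
    query_per_tb_scanned: float   # price per TB of data scanned by queries
    insert_per_gb: float          # price per GB inserted directly


def estimate_monthly_cost(rates, stored_gb, scanned_tb, inserted_gb):
    """Add up the three cost factors named in the article."""
    storage = stored_gb * rates.storage_per_gb_month
    queries = scanned_tb * rates.query_per_tb_scanned
    inserts = inserted_gb * rates.insert_per_gb
    return {
        "storage": storage,
        "queries": queries,
        "inserts": inserts,
        "total": storage + queries + inserts,
    }


# Placeholder rates, chosen only to keep the arithmetic easy to follow.
example_rates = Rates(
    storage_per_gb_month=0.5,
    query_per_tb_scanned=4.0,
    insert_per_gb=0.25,
)
bill = estimate_monthly_cost(example_rates, stored_gb=100, scanned_tb=2, inserted_gb=8)
print(bill)
```

With these placeholder numbers, storage dominates the bill. In practice the balance depends on how much data your queries scan each month.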
Benefits: Marketers and analysts
Within the big data process as a whole, BigQuery offers advantages at different stages and for different roles. Marketers and analysts can run ad-hoc queries and get results in minutes or seconds, which helps them better understand online and offline attribution, lead funnels and long-term customer value. The result: better marketing performance.
One of the exceptional benefits of BigQuery is that it runs on your cloud-based data lake, which makes it easy to combine data from many different sources. BigQuery is designed to store data from numerous sources, such as your CRM database and Google Analytics, and it can pull in any CSV or JSON data.
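As a small illustration of preparing such data, the Python sketch below converts CSV text into newline-delimited JSON, the one-object-per-line JSON layout that BigQuery accepts for loading. The function name and sample data are our own, and every value stays a string, so treat this as a starting point rather than a complete loading pipeline.

```python
import csv
import io
import json


def csv_to_ndjson(csv_text):
    """Turn CSV text (with a header row) into newline-delimited JSON."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # One JSON object per line; CSV values are kept as strings.
    return "\n".join(json.dumps(row) for row in reader)


# Hypothetical sample of CRM-style data.
sample = "id,name\n1,Ann\n2,Bob\n"
print(csv_to_ndjson(sample))
```

Each output line is a standalone JSON object whose keys come from the CSV header row.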
Having massive, diverse data sets in a raw format makes it possible for analysts and marketers to find relationships that were previously invisible, which helps to maximize profits and to improve the performance of the business systems in use.
Besides marketers and data analysts, BigQuery also brings advantages for in-house data engineers. Thanks to BigQuery's REST API, it's possible to connect the platform to virtually any programming environment. We hope these benefits become visible in the use cases below as well.
BigQuery use cases
We present here three different use cases to indicate what BigQuery can do for businesses.
HarvardX, Harvard's MOOC course platform, is visited by thousands of users a day, generating more than 1 million clicks every day. These clicks provide lots of data which, when analyzed properly, can give profound insights into the learning processes and interests of people all over the world. To gain these insights, Harvard collaborated with MITx (the Massachusetts Institute of Technology's MOOC course platform), and together they developed an open source prototype for processing edX MOOC data using Google BigQuery, called edx2bigquery. In this case study, you can find out how they did it.
Ocado, the world's largest online-only grocery store, migrated almost all of its business data (100 TB) to Google Cloud Platform and now uses BigQuery to run queries on it. Previously the company used Hadoop and Apache Spark for this, but it ran into drawbacks: both required an elaborate setup and workflow, and they were difficult to scale. Read the case study.
Sandaya is a campsite