Making Machine Learning easily available with SQL in BigQuery
by Tom Wamelink, on May 20, 2021 9:56:03 AM
BigQuery ML (BQML) makes machine learning available directly from Google’s data warehouse using only SQL for coding. Models are built and evaluated in BigQuery and stored in tables where predictions can be made directly on data in the data warehouse. Many model types are supported, including deep neural networks, time series analysis, K-means clustering and linear and logistic regression. Tensorflow models can be imported for scoring or developing new models in BigQuery. There is also an Auto ML function that can find a good model for your case among alternative models.
In other words: BQML brings machine learning to the data warehouse rather than data to a machine learning platform. This increases the speed of developing models significantly. Data does not need to be moved or formatted for Python based ML frameworks and the dependence of less tools reduces complexity of the process.
Use case example: predicting conversion intent
We can predict the conversion of website visitors and use conversion intent for online audiences, for example to optimise advertising spend. My blog focuses especially on the functionality and ‘how to’ in BQML. For a more extensive example of predicting conversion intent you can read the blogs of my colleague Lotte about the same topic.
The following SQL statements will help you find a model to predict conversion intent:
- A first SQL statement appoints the location table to store the trained model, provides model parameters (logistic regression for this case), selects data for training just like any SQL query would (e.g. country, mobile device, operating system, pageviews for this case), appoints the label data (conversion) and starts the training process.
- A next statement using the ML.EVALUATE function starts evaluation of the model, provides the location of the previously stored model and selects data for evaluation again like any query would. Evaluation metrics for the model show up in BigQuery once the query ends.
- After finding a good model the next statement uses the ML.PREDICT function, providing the location of the model and selecting the input data of website visitors we want to predict conversion for directly from the data warehouse.
This use case only needs some basic SQL coding for applying Machine Learning and no knowledge of ML frameworks is required. Data analysts with knowledge of the data warehouse can build ML models themselves using SQL to prepare data, specify model parameters and training and evaluate the model. This makes BQML available to a wider audience in organisations and also reduces developing time.
There is an one-minute video about BQML on YouTube.
Interested in using BQML for your use case? We are happy to show you the abilities and guide you with your first BQML project.
Crystalloids helps companies improve their customer experiences and build marketing technology. Founded in 2006 in the Netherlands, Crystalloids builds crystal-clear solutions that turn customer data into information and knowledge into wisdom. As a leading Google Cloud Partner, Crystalloids combines experience in software development, data science, and marketing, making them one of a kind IT company. Using the Agile approach Crystalloids ensures that use cases show immediate value to their clients and frees their time to focus on decision making and less on programming.