This is how we use Cloud Dataprep at Crystalloids to speed up data wrangling
by Richard Verhoeff, on Jul 13, 2021 3:34:44 PM
To enable Crystalloids’ management to make wise decisions for our company, data needs to be collected and wrangled. Graphical data preparation without coding speeds up the process, and it enhances the data quality, which leads to more meaningful analysis and massive time savings. Google Cloud Dataprep speeds up the wrangling process. In just a few clicks, it provides us with only that data we need. Read our story here.
At Crystalloids, we face the same challenges as our customers when it comes to a data-driven approach to the decision-making process. That’s why we understand our customers’ pain. Our data enters the company through various systems. Combining this data makes it a lot more meaningful. It can help us make forecasts, budgets and make predictions for the future, so we can make the right decisions. Combining data can be hard though. It can be a hassle and is time-consuming. In this article, you will learn how to combine only the data you need and make it ready for analysis.
How we used to handle data-driven decision-making
An integral view of the data coming from the various systems would mean downloading all the reports from the separate solutions and combining them in a spreadsheet. This works, but it’s time-consuming and cumbersome. The last few years our company grew fast. We started to reach the boundaries of what project administration, invoicing and bookkeeping modules can accommodate for us. When it comes to reporting, we have seen a lack of flexibility.
Difficulties to get to the right data to set meaningful KPIs
In our company, we use KPIs to manage our business, but it became difficult to attribute revenue to the KPI’s or to implement the KPI’s correctly in Exact Online. In order to create an interesting KPI, we had to add new classifications and copy and paste ourselves towards the desired outcome. Of course, this wasn’t an ideal situation. We thought of different solutions. Should we automate further with a combination of macros and an app script in Google Sheets? Or instead, should we use a combination of Google Sheets and Python? That would mean there was programming involved. Since time is a big constraint in every company, we asked ourselves: “is there a way to do this faster, preferably without code?”
A more detailed view of the problem
Let me illustrate this with an example. As a consultancy company it is very important to know the occupancy of our people working on projects and in teams to serve our customers. In the standard Exact Online overview, you can see this particular employee has a percentage above 100%. This is caused by an “out of office” service task for this employee. This should be excluded from his occupancy calculation in projects. In Exact there is no way to book these hours. Not through the system, neither is it possible to exclude this from the reports.
So far, we took the data to a Google Sheet and pushed the data into the desired outcome by deleting, merging, classification, aggregation and pivoting. For every quarterly meeting, a new document was created, creating a monthly or weekly report would be too big an effort. So far, we extracted the data manually, but we wondered if there would be a solution to fetch the data exactly as we need it and as quickly as possible, so we could use meaningful data whenever we needed it? There is! It’s called data wrangling.
Data wrangling, what’s that?
If data is incomplete or unreliable, then analyses will be too. This means that the value of any insight will be lost. Data wrangling seeks to remove that risk by ensuring the data is in a reliable state before it is analysed. Data wrangling refers to a variety of processes designed to transform raw data into ready to use data formats. The process depends on the data you want to leverage and the goal you are trying to achieve. A great tool that can assist us with data wrangling is Google Cloud Dataprep.
Google Cloud Dataprep
Google Cloud Dataprep is one of Google’s data services for cleaning and preparing structured and unstructured data for reporting, machine learning and analysis. Dataprep interprets the data transformation through a proprietary inference algorithm. A ranked set of suggestions and patterns will be returned and displayed in a graphical user interface. With a click on the mouse we can apply aggregation, joins, unions, pivot and unpivot, regular expressions, extraction, merge and much more to the data. Let’s see how we can use Google Cloud Dataprep as a solution.
In order to create this overview, three extracts from the Exact Online system are needed.
- The list of all current active employees.
- Overview of worked hours per project.
- Overview of all available hours per month and absence hours per month.
To calculate the % Nett Billable, we will need the following formula:
Accountability = IF(Schedule<Final;Final/Schedule;1-(Out of office)/Schedule)
In the image above you see the Dataprep UI. With only two lines of codes that are generated by the graphical user interface, the first steps to more meaningful data are made. We can easily expand the same file with more data, or repeat the same query. A perfect example of the beauty of automation. We can get to exactly that data we needed in just a few clicks, without any coding!
The image below shows how Dataprep simplifies the data lifecycle and smoothens the way to more meaningful data.
We have seen that combining data to one meaningful dataset is time-consuming. It involves programming or lots of manual actions, and it takes the attention away from what is really important, to analyse the data in such a way that the right business decision can be made on the basis of the right data. Data preparation without coding speeds up the process, and it increases the quality of the data, which leads to more meaningful analysis.
Google Cloud Dataprep makes this process easier. In just a few clicks, it provides us with only that data we need. Are you struggling with your data? At Crystalloids, we have lots of experience with Google Cloud Dataprep. Book an online appointment for a consultation.
Crystalloids helps companies improve their customer experiences and build marketing technology. Founded in 2006 in the Netherlands, Crystalloids builds crystal-clear solutions that turn customer data into information and knowledge into wisdom. As a leading Google Cloud Partner, Crystalloids combines experience in software development, data science, and marketing, making them one of a kind IT company. Using the Agile approach Crystalloids ensures that use cases show immediate value to their clients and frees their time to focus on decision making and less on programming.