Blog: Hans Boetius explains the ins and outs of DataOps
Why every company should incorporate DataOps
Hans Boetius is Lead Consultant at Crystalloids and has focused on data-driven marketing for years. For some time now, he has worked with DataOps, a relatively unknown methodology to help make companies more data-driven. We asked Hans about the ins and outs of DataOps and why he feels companies should incorporate this methodology.
Hans, why should we know all there is to know about DataOps?
Because any company can become data-driven, but most managers have no idea how to reach that point. They want all the decisions in the organisation to be made based on facts, in other words, based on data, and for the required tooling to become available to all employees. DataOps makes this possible through a smart combination of processes, technology and people.
Through a smart combination of processes, technology and people, DataOps makes it possible for all decisions in the organisation to be based on facts.
Are the decisions not already being made based on facts?
It does happen, but mostly by a small group of experts in the organisation. Ideally, however, you’d want everyone to be able to use data to do their work in a smarter way. This is not all that simple if you must access data from various sources, combine the data in a consistent way, and transfer the data to the users, who also must be able to derive meaningful insights from such data. Companies see that tools for self-service data analysis are becoming better and more advanced, but struggle to find ways to let much larger groups of employees work with the data using such tools.
Well, haven’t we been building data warehouses for this for years now?
That’s correct, but as those data warehouses were developed, it became clear that organisations were still divided into silos: a data warehouse team, an IT team, a BI team and different user groups. That structure just doesn’t work well. The teams don’t understand each other that well; they don’t speak the same language. This resulted in a mismatch between what was supplied and the way the data should be used. The resentment between the different teams is also visible at many companies. The business feels that IT works far too slowly. IT, on the other hand, feels that users are making unrealistic demands. The result: far too little value is derived from the data.
So, things are different with DataOps?
Yes, DataOps assumes that everyone works together on one team: users, IT and BI. In that team, you decide together what is important, and you work on it together. This way you also learn from each other: IT gets an understanding of what the business really wants and why, and users gain a better understanding of the complexities of the data and the technical aspects.
Is this collaboration what DataOps defines?
No. It is certainly an important aspect, but DataOps entails much more than that. Another aspect of DataOps is that work is done with modern cloud platforms and tooling that enable you to deliver new functionality much faster. A cloud platform like Amazon Web Services (AWS) offers a range of standard building blocks to access data sources and to build automated processes on top of them. As a result, accessing new data and placing it in a central system is much faster than before. You use the platform’s standard API, alter it a little if needed, and the new data pours in.
Does a cloud platform like AWS offer other benefits?
For sure. A platform like AWS makes it much easier to implement genuine agile development, which is often very difficult with traditional on-premises systems. The cloud obviously also offers scalability and elasticity benefits, which are very useful in a big data environment. You no longer use servers but instead use freely scalable capacity: you simply scale up (or down) if you temporarily need more (or less) computing power or storage space. Moreover, you pay based on what you use, which is nearly always cheaper with a cloud solution. DataOps makes maximum use of the scalability and speed offered by a cloud platform.
What is required of employees for the implementation of DataOps?
First, let me explain that DataOps revolves around three aspects: people, process and technology. People, with their knowledge and expertise, are the linchpin in the implementation of DataOps. At the same time, that is also one of the challenges, because you are looking for skills that are very scarce. A DataOps team requires general IT knowledge and programming knowledge, but also knowledge of R and Python, for example. You will have to invest substantially in training and education or in hiring external specialists. Likewise, the people on the team must have a sufficiently broad knowledge base so they can take over one another’s tasks. This involves much more than mere technical knowledge. The organisation as a whole must grow in how it makes use of data, and the DataOps team plays a crucial role in this respect. The team members are the data evangelists of the company. They must be able to train people, but also to ‘sell’ the use of data, as it were, within the organisation.
You also mentioned process, so, clearly, you will also be working differently with DataOps?
Agile development is inextricably linked to DataOps. Also, as with DevOps, the wall between development and operations disappears. The DataOps team develops, manages and runs the operation. Things are no longer tossed over the wall. It is quite different when you hand over an end product to yourself. This means that people are already thinking ‘how can this be managed more easily?’ during the development stage.
Such a different way of working can naturally only be achieved when all expertise and interests are combined on one team that carries its own responsibility. It is also made possible by the use of modern cloud technology.
The best advice is to start small and to change gradually
Won’t a change like this be rather difficult for many companies?
Of course, and it’ll take time. Combining all expertise on one team, in one fell swoop, constitutes a major change, for the organisation and for the employees alike. Also, what do you do when a particular process component is outsourced to an external team? Therefore, the best advice is to start small and to change gradually. A frequently occurring challenge is the dependency on various internal clients, each with their own interests and priorities. If that can’t be solved by representing the various business units on the DataOps team, you can also opt for several teams, each with its own sub-goal derived from the main company goal.
Why don’t you simply apply DevOps for big data solutions?
DataOps is about obtaining insights from data, and that is the core, which differs from DevOps. The methodology borrows elements from DevOps but contains aspects that are not encountered in DevOps. Take the DataOps team, for example. The team includes roles like Data Analyst and Data Scientist, which don’t exist in DevOps.
Now that you mention the roles, are standard roles defined for DataOps?
Yes. Apart from the two that I just mentioned, there are also the DataOps Engineer and Data Engineer. The Data Engineer is responsible for the technical access and processing of data so that the Data Scientist and Data Analyst can use the data. The DataOps Engineer takes charge of solutions. Those are the four substantive data roles within each DataOps team. Of course, there is also the well-known role of Product Owner, who represents the business for whom the work is being done. And of course, the Scrum Master.
How does the DataOps team work?
Each development, whether a one-off analysis or a structural information product, involves three steps: ingestion, processing and analysis. The required data, if not yet available, must be accessed. The data must be processed and potentially integrated with other data. Then, meaningful insights must be derived from the data. The actions that can be linked to the insights and the structural information products that can provide the desired results for the organisation are then determined in conjunction with the Product Owner.
Each development involves three steps: ingestion, processing and analysis
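To make the three steps concrete, here is a minimal Python sketch of an ingestion, processing and analysis flow. It is an illustration only, not the way Crystalloids or any particular DataOps team builds its pipelines: the sample data, the column names and the function names (ingest, process, analyse, run_pipeline) are all hypothetical, and in-memory CSV text stands in for real data sources.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data standing in for two separate source systems.
ORDERS_CSV = """client_id,amount
1,120.50
2,80.00
1,35.25
3,
"""

CLIENTS_CSV = """client_id,segment
1,loyal
2,new
3,new
"""


def ingest(raw_csv):
    """Ingestion: read rows from a source into a list of dicts."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def process(orders, clients):
    """Processing: drop incomplete rows and integrate orders with client data."""
    segment_by_client = {c["client_id"]: c["segment"] for c in clients}
    cleaned = []
    for order in orders:
        if not order["amount"]:
            continue  # skip orders that have no amount recorded
        cleaned.append({
            "client_id": order["client_id"],
            "segment": segment_by_client.get(order["client_id"], "unknown"),
            "amount": float(order["amount"]),
        })
    return cleaned


def analyse(rows):
    """Analysis: derive a simple insight, here total revenue per segment."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["segment"]] += row["amount"]
    return dict(totals)


def run_pipeline():
    """Run the three steps in order and return the resulting insight."""
    orders = ingest(ORDERS_CSV)
    clients = ingest(CLIENTS_CSV)
    return analyse(process(orders, clients))


if __name__ == "__main__":
    print(run_pipeline())
```

In practice each step would be far richer, and deciding which actions to attach to the resulting insight remains a conversation with the Product Owner rather than something the code decides.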
Why, in your opinion, should every company start DataOps as early as tomorrow?
Simple: to become data-driven. To derive many more insights from all the available data, to be able to use these insights faster, and to convert them into actions that generate money. This is not so easy with the traditional working methods that companies currently use. For instance, look at how much data on clients and client behaviour is currently available in many companies. DataOps will help you distil insights from that data, telling you what a client is doing and, more importantly, will do, so you can improve the interaction with clients, with a positive effect on your revenue. It can also provide information on the future value of clients or client groups, allowing you to allocate the marketing budget much more smartly. Your communication with clients will become much more relevant and more personalised, which also helps your client. Naturally, companies are already doing these things or at least trying to, but it has been a slow and tedious process for years. DataOps takes this to a whole new level at a highly accelerated pace, with a smart combination of processes, people and technology.
Still, one does not hear much about DataOps. Why is that?
An organisation must meet two important requirements if it wants to grow into a data-driven, fact-based organisation with self-service data access, where the value of the data is truly evangelised. Firstly, each employee must feel that data helps them make better decisions. Secondly, confidence in data must have reached the highest management levels. The latter is often missing because the people who are in a position to make these decisions are often not that ‘data-savvy’. That is why you don’t yet see DataOps being adopted to a sufficient extent.
What are we going to do about this?
I think the problem will resolve itself. You won’t make it if the competitors and newcomers keep getting smarter and you are still struggling to report something as mundane as your revenue. Pressure from the market will force you to follow suit at one point or another. In other words, make an early start. Look at marketing departments. Many still go about their business in a very ‘old school’ way. Campaigns are put together based on a gut feeling and are launched in bulk while some competitors are already using machine learning technologies to generate insights from data, enabling them to achieve perfect 1-on-1 communication with clients. In an environment like this, your old school approach won’t cut it for very long.
Suppose I am one of those marketers, and this was my wake-up call, what will I do differently tomorrow?
Then I would advise you, once again, not to start too big, but first to show your organisation that it can earn money by incorporating DataOps. Seeing is believing. First, go and experience that the approach works. Put together a temporary DataOps team that includes the roles I’ve mentioned, and be sure to include a good business owner on the team. Brainstorm with your team about an innovative marketing campaign, for example, and about which data you can and want to use to figure out which clients are most relevant to your campaign. Make clear agreements beforehand about your expectations for the experiment. Make sure that you have access to the required data sources. The rest can easily be done off the beaten path by using a cloud platform like AWS. Then, show everyone the excellent results that were achieved with this approach. Having difficulty making the right decisions or defining a good use case? You can always give me a call or send me an email!