🧮 Establish a Universal Data Model
We recommend a layered Data Warehouse (DWH) approach that gradually structures the data you collect in a uniform way, according to your business needs. The result is a single source of truth for your company, built on shared standards and definitions (e.g. a centrally defined customer lifetime value).
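To make the layering idea concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a real warehouse. The table and view names (stg_orders, core_orders, mart_customer_ltv) and the definition of lifetime value as the sum of completed order amounts are assumptions made for illustration, not a recommended definition.

```python
import sqlite3

# Hypothetical three-layer model: staging -> core -> reporting.
LAYERS = """
-- Staging layer: data as delivered by the source system.
CREATE TABLE stg_orders (
    order_id INTEGER, customer_id INTEGER, amount REAL, status TEXT
);

-- Core layer: cleaning rules are applied once, in one place.
CREATE VIEW core_orders AS
SELECT order_id, customer_id, amount
FROM stg_orders
WHERE status = 'completed' AND amount IS NOT NULL;

-- Reporting layer: the single agreed definition of the metric
-- (assumed here: sum of completed order amounts per customer).
CREATE VIEW mart_customer_ltv AS
SELECT customer_id, SUM(amount) AS lifetime_value
FROM core_orders
GROUP BY customer_id;
"""


def build_warehouse():
    """Create the layers in an in-memory database and load sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(LAYERS)
    conn.executemany(
        "INSERT INTO stg_orders VALUES (?, ?, ?, ?)",
        [
            (1, 10, 20.0, "completed"),
            (2, 10, 30.0, "completed"),
            (3, 11, 15.0, "cancelled"),
            (4, 11, 45.0, "completed"),
        ],
    )
    return conn


def lifetime_values(conn):
    """Read the metric from the reporting layer only."""
    rows = conn.execute(
        "SELECT customer_id, lifetime_value FROM mart_customer_ltv "
        "ORDER BY customer_id"
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    print(lifetime_values(build_warehouse()))
```

Because every report reads from the reporting layer, changing the definition means changing one view rather than every dashboard.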
⚒️ Data Infrastructure Tooling
Stay aware of new data infrastructure tools as your company’s business requirements change. At the same time, be wary of hype and assess whether adopting something new is really necessary. It is often better to stick with solutions you already know (e.g. SQL), since you have already invested resources in them and built up a knowledge base.
In addition, look for the appropriate tool for each task you want to tackle (one task - one tool). Some tools let you define the business logic of your data and visualize it in the same place. While this might seem convenient at first, we recommend keeping the two separate: define business logic in your DWH and use another tool to visualize your reports.
👁 Data Quality & Accessibility
You can ensure high data quality by implementing processes such as validation tests and data profiling. Validation tests can consist of queries run against the raw data before any transformation has taken place; the returned results indicate whether a test passed or failed.
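As a rough sketch of such validation tests, the example below runs a few queries against a raw table and treats a check as passed when it finds no offending rows. It uses Python's built-in sqlite3 module; the raw_orders table, its columns, and the three checks are hypothetical and only serve as an illustration.

```python
import sqlite3

# Hypothetical checks: each query counts rows that violate a rule.
CHECKS = {
    "order_id is never null":
        "SELECT COUNT(*) FROM raw_orders WHERE order_id IS NULL",
    "order_id is unique":
        "SELECT COUNT(*) FROM (SELECT order_id FROM raw_orders "
        "GROUP BY order_id HAVING COUNT(*) > 1)",
    "amount is not negative":
        "SELECT COUNT(*) FROM raw_orders WHERE amount < 0",
}


def run_checks(conn, checks):
    """Run each query; a check passes when it counts zero offending rows."""
    results = {}
    for name, query in checks.items():
        offending = conn.execute(query).fetchone()[0]
        results[name] = offending == 0
    return results


def build_sample_db():
    """Create an in-memory raw table with a duplicate and a missing id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_orders "
        "(order_id INTEGER, customer_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (?, ?, ?)",
        [(1, 10, 25.0), (2, 11, 40.0), (2, 11, 40.0), (None, 12, 5.0)],
    )
    return conn


if __name__ == "__main__":
    conn = build_sample_db()
    for name, passed in run_checks(conn, CHECKS).items():
        print("PASS" if passed else "FAIL", name)
```

In practice you would run checks like these on a schedule or as the first step of a pipeline, and stop the transformation when one of them fails.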
Data profiling is an exploration that helps you better understand the structure, distribution, and limitations of the data you are working with (e.g. on a simple level: What null values do I have, where do I have them and why?).
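A very simple form of profiling can be sketched in plain Python: for each column, count the null values, the distinct values, and the value range. The sample rows and column names below are made up for illustration; a dedicated profiling tool or a DataFrame library would typically do this job on real data.

```python
from collections import Counter


def profile(rows, columns):
    """Summarize each column: nulls, distinct values, range, top value."""
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        counts = Counter(present)
        summary[col] = {
            "rows": len(values),
            "nulls": len(values) - len(present),
            "distinct": len(counts),
            "min": min(present) if present else None,
            "max": max(present) if present else None,
            "most_common": counts.most_common(1)[0][0] if counts else None,
        }
    return summary


# Hypothetical sample data with a few missing values.
SAMPLE_ROWS = [
    {"customer_id": 10, "country": "DE", "amount": 25.0},
    {"customer_id": 11, "country": None, "amount": 40.0},
    {"customer_id": 12, "country": "DE", "amount": None},
    {"customer_id": 13, "country": "FR", "amount": 15.5},
]

if __name__ == "__main__":
    columns = ["customer_id", "country", "amount"]
    for column, stats in profile(SAMPLE_ROWS, columns).items():
        print(column, stats)
```

The null counts per column answer the "what and where" part of the question; the "why" still requires talking to whoever owns the source system.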
Remember: It is far easier to provide good-quality data by applying these steps right from the beginning, that is, when the data is generated at the source. Testing the data and applying fixes later is also possible, but takes more effort.
Good accessibility is similarly important. Among other things, it means providing access to documentation and metadata, e.g. through a data catalogue. Those who work with the data need to know where it comes from, how it is collected, and who is in charge of it (data ownership).
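To illustrate what a catalogue entry might record, here is a minimal in-memory sketch. The CatalogEntry and DataCatalog classes, their fields, and the sample values are hypothetical; real data catalogue tools offer far more (search, lineage, access control), but the core metadata is the same: origin, collection method, and owner.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata for one table (hypothetical fields for illustration)."""
    table: str
    description: str
    source: str             # where the data comes from
    collection_method: str  # how it is collected
    owner: str              # who is in charge of it
    columns: dict = field(default_factory=dict)  # column name -> meaning


class DataCatalog:
    """A tiny in-memory catalogue keyed by table name."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Refuse tables without an owner, so ownership is never implicit.
        if not entry.owner:
            raise ValueError(f"{entry.table}: every table needs an owner")
        self._entries[entry.table] = entry

    def lookup(self, table):
        return self._entries[table]

    def owned_by(self, owner):
        return sorted(
            e.table for e in self._entries.values() if e.owner == owner
        )


if __name__ == "__main__":
    catalog = DataCatalog()
    catalog.register(CatalogEntry(
        table="core_orders",
        description="Completed orders, one row per order",
        source="shop backend",
        collection_method="nightly export",
        owner="sales-analytics",
        columns={"amount": "order value including tax"},
    ))
    print(catalog.lookup("core_orders").owner)
    print(catalog.owned_by("sales-analytics"))
```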