In recent years, the hype surrounding the value of data has grown continuously, and a multitude of concepts and methods have emerged on how companies can become 'data-driven'. From strategic top management to detail-oriented data analysts, attempts are being made to place data at the heart of value creation. The resulting initiatives regularly oscillate between a holistic data strategy (“we need common rules before we start working with data”) and exploratory data analysis (“we have so much data, there must be something valuable in it”). As a result, however, only a few initiatives produce real progress on the way to becoming a data-driven company. This article presents Applied Data Products, an integrative approach to data products with concrete steps that aid in this process.
A core problem on the way to becoming a data-driven company is too much emphasis on one of two poles: Business or technology. While governance, regulation, and use cases are on one side, technology with concepts and techniques like APIs, data mesh, and UML diagrams is on the other. Very few people in a company manage to combine both sides effectively, which makes achieving true alignment a challenge.
In order to better bundle the business- and technology-oriented forces in our customers’ companies, we have combined proven and well-known frameworks into a consulting approach to data products that specifically addresses the connection between the two aforementioned poles. We call our approach Applied Data Products and, true to the motto “Learn first, scale second”, it focuses on creating the greatest possible added value in order to develop an overall strategy for becoming data-driven based on successful individual cases.
Increase Efficiency with Simple Data Products
Let's make it very concrete: If New York's authorities want to track down tax evaders with a holistic data strategy, the project is lengthy, risky, and has a high probability of failing. The key question is: How can tax inspectors be sent to the right companies with the support of data? If you start with the hospitality industry, the use case is almost obvious: Because a lot of waste means a lot of turnover, it should be enough to systematically record waste volumes and compare them with reported turnover. Where the discrepancies are large, a more detailed check for potential tax evasion should be carried out. In fact, this procedure was used successfully a few years ago (see this German article). The recording of waste and its subsequent comparison with submitted turnover data led to a significant increase in detected cases of tax evasion.
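To make the underlying analysis tangible, here is a minimal sketch, assuming that waste volumes and reported turnover are available as two tables with hypothetical column names. It flags companies whose reported turnover looks implausibly low for their waste volume:

```python
import pandas as pd

# Hypothetical input data: waste volumes and reported turnover per company.
waste = pd.DataFrame({
    "company_id": [1, 2, 3, 4],
    "waste_tons": [12.0, 3.5, 20.0, 8.0],
})
turnover = pd.DataFrame({
    "company_id": [1, 2, 3, 4],
    "reported_turnover": [480_000, 150_000, 210_000, 320_000],
})

# Join both sources and relate reported turnover to waste volume.
merged = waste.merge(turnover, on="company_id")
merged["turnover_per_ton"] = merged["reported_turnover"] / merged["waste_tons"]

# Flag companies whose turnover per ton of waste is far below the median:
# candidates for a closer, manual tax inspection.
threshold = 0.5 * merged["turnover_per_ton"].median()
suspicious = merged[merged["turnover_per_ton"] < threshold]
print(suspicious[["company_id", "turnover_per_ton"]])
```

The column names, the threshold, and the rule itself are assumptions for illustration; the point is how little machinery is needed once the data product is clearly framed.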
When introducing Applied Data Products, the first step is to clearly define one or more specific data products following the principle of “quality over quantity”. It is better to have a handful of well-considered data products on the way to becoming a data-driven company than to attempt too much at the same time. Besides defining actual data products, another goal of this step is to learn about your own organization: Which aspects of data-driven action are already working, and in which specific areas is there still a need for action? While it doesn't matter whether a data product is a local Excel sheet or a data warehouse with a connected self-service BI dashboard, learning to think in terms of data is key here.
We have developed the Data Product Canvas (DPC) for the definition of data products, and evaluated its usefulness in various industries and scenarios. Based on the Business Model Canvas, the DPC serves to provide all stakeholders with a quick and comprehensive overview of a data product. It is not intended to be a substitute for specialist or technical documentation. Rather, we position it as a conversation starter. That is, instead of addressing all details clearly and unambiguously, it primarily serves as a basis for discussion, including questions such as:
- Which source systems do we need to integrate? Who is responsible for these systems (and therefore the quality of the data supplied)?
- What is the added value of our product? What processing steps do we need to carry out on the source data?
- What actions should our data product enable, directly and indirectly?
- Where do costs arise in the provision of the product?
- How can success be measured?
The Structure of the DPC
Our DPC is divided into five areas with a total of 12 fields in order to capture all essential information about a data product. At the very top sits a management summary that describes the core idea of the data product in clearly understandable terms, along with the party that owns and is responsible for it.
The main part of the DPC consists of three blocks:
- Sources, Storage & Processing
- Value Proposition
- Discoverability, Distribution & Usage
At its bottom, the canvas gathers auxiliary information that is not at the core of a data product, but nevertheless plays a role in the product’s design and development.
The Contents of the DPC
Sometimes information about a data product cannot be clearly assigned to a field of the DPC. However, since the canvas is a conversation starter, what matters is not that a topic ends up in the right field, but that it is captured and taken into account at all.
From left to right and top to bottom, the DPC’s fields capture the following information.
Data Providers
Organizational and technical data sources are at the beginning of the value chain. Depending on the degree of abstraction, departments or specific systems can be named here. The information in the Data Providers field addresses aspects such as:
- What content does the data product draw from which source systems?
- Is it internal company data or external data?
- Which departments are data suppliers and who is responsible for defining this data?
Data Processing
As soon as data is available, actions are carried out with it. In the simplest case, data is only aggregated, but further preparation and processing steps are usually required as well. The documentation of data processing within the DPC can remain abstract in line with anticipated implementation phases, or become specific by naming tools. The information in the Data Processing field addresses aspects such as:
- What steps are carried out with the data before it is made available?
- What steps are taken for data cleansing?
- How are data quality and redundancy addressed?
- Are summarization steps carried out via ETL or machine learning?
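What such a step looks like in practice depends entirely on the product. As a minimal, purely illustrative sketch with assumed column names and rules, a documented cleansing and aggregation step might translate into the following:

```python
import pandas as pd

# Hypothetical raw source extract with duplicates and missing values.
raw = pd.DataFrame({
    "order_id": [101, 101, 102, 103],
    "customer": ["A", "A", "B", None],
    "turnover": [250.0, 250.0, 99.0, 40.0],
    "month":    ["2024-01", "2024-01", "2024-01", "2024-02"],
})

# Data cleansing: remove redundant records and rows without a customer.
cleaned = raw.drop_duplicates(subset="order_id").dropna(subset=["customer"])

# Aggregation: summarize turnover per customer and month before publishing.
summary = cleaned.groupby(["customer", "month"], as_index=False)["turnover"].sum()
print(summary)
```

At canvas level it is enough to note that deduplication and monthly aggregation happen at all; the concrete tooling can be decided later.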
Frequency and Storage
In order to make data available within the product, it must be stored. Whether and when data is stored long-term or only calculated and distributed ad hoc is the core content of the DPC field Frequency and Storage. There is also the question of how often the (summarized) data changes and is made available to consumers, e.g., monthly, daily, hourly, or in real time.
Value Proposition(s)
At the heart of the DPC is the value proposition, i.e., the actual added value of a data product. The corresponding field thus addresses the central question: What value does (the use of) the data provide for a department or the entire company? The focus here is on the impact or outcomes, not the small-scale outputs associated with a data product.
Distribution and Access
The provision of data and appropriate regulation of access is essential for successful data products. Whether the data is sent weekly by email or made available at any time via APIs, it must be clear to everyone involved who actually controls access to the data product. Can individual data records be released granularly, or does access to a data product mean that all available information can also be viewed? Technical and organizational information thus comes together in the Distribution and Access field.
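Whether access control is implemented via email distribution lists or an API gateway is secondary at the canvas level; what matters is that the decision is made explicit. As a purely illustrative sketch (consumer names and rules are assumptions), record-level versus full-product access could look like this:

```python
# Hypothetical access rules for a data product: some consumers may see the
# whole product, others only the records of their own region.
ACCESS_RULES = {
    "controlling": {"scope": "full"},
    "sales_north": {"scope": "region", "region": "north"},
}

def visible_records(consumer: str, records: list[dict]) -> list[dict]:
    """Return only the records the given consumer is allowed to see."""
    rule = ACCESS_RULES.get(consumer)
    if rule is None:
        return []  # unknown consumers get nothing
    if rule["scope"] == "full":
        return records
    return [r for r in records if r.get("region") == rule["region"]]

records = [
    {"region": "north", "turnover": 120_000},
    {"region": "south", "turnover": 90_000},
]
print(visible_records("sales_north", records))  # only the northern record
```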
Discoverability and Semantics
Related to the topic of Distribution and Access is the aspect of discoverability. How do departments and users know that a data product exists? Is there a data catalog, or are users specifically invited? There is also the question of how existing data is to be interpreted and what its semantics are: what is the definition of “turnover” or “customer”, for example? Who is responsible for documenting the semantics, which are already relevant when accessing data sources?
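Where this documentation lives is up to the organization (data catalog, wiki, or plain files). As an illustrative sketch with assumed names, a minimal catalog entry could combine discoverability with semantic definitions:

```python
# Hypothetical catalog entry for a data product, combining discoverability
# (where to find it, whom to ask) with semantics (what the terms mean).
catalog_entry = {
    "product": "monthly_turnover_report",
    "owner": "Controlling",
    "access_via": "company data catalog",
    "glossary": {
        "turnover": {
            "definition": "Net revenue per customer and calendar month, excluding VAT.",
            "responsible": "Finance",
        },
        "customer": {
            "definition": "Contract partner with at least one billed order.",
            "responsible": "Sales",
        },
    },
}

print(catalog_entry["glossary"]["turnover"]["definition"])
```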
Target Users
In the context of products, it is always important to consider for whom they are being implemented. The target users are those for whom, conceptually, value is to be created. The fact that other people inside or outside a company may also be able to use a data product is not the focus of the DPC’s Target Users field; its main purpose is to define for whom the product is being optimized. This usually results in use cases, which are captured in the following field.
Use Cases
Once Target Users have been defined, the DPC’s Use Cases field documents what is to be done with the data. Will existing use cases be better supported, or will new use cases be made possible in the first place? Is the data product under consideration just an intermediate product within a longer value chain? A clear understanding of which actions take place on the basis of the captured data product helps both producers and consumers to manage expectations.
Cost Structure
At the DPC’s bottom, the Cost Structure field addresses the biggest cost drivers for a data product. Note that the field is not about detailed cost planning, but about gaining clarity as to whether storage, processing, or even the acquisition of external data are among the largest cost blocks.
Success Measurement
As experience has shown that many data products are initially created with a focus on internal departments, a cost structure cannot always be compared with a revenue structure. For this reason, the DPC’s Success Measurement field contains criteria that are intended to make the success of a data product measurable. Success can, for example, be defined in the form of revenue or profit, but also as the number of API calls or a reduction in the number of accesses to an Excel list in SharePoint. It is essential to establish measurable parameters in order to actually be able to learn on the way to becoming a data-driven company.
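Teams that like to version their canvases alongside other project artifacts can also keep the fields described above in a lightweight, machine-readable form. The following sketch mirrors our reading of the DPC’s fields; it is an illustration, not a prescribed format:

```python
from dataclasses import dataclass, field

@dataclass
class DataProductCanvas:
    """Lightweight, illustrative template mirroring the DPC's fields."""
    # Management summary
    core_idea: str
    owner: str
    # Sources, Storage & Processing
    data_providers: list[str] = field(default_factory=list)
    data_processing: list[str] = field(default_factory=list)
    frequency_and_storage: str = ""
    # Value Proposition
    value_propositions: list[str] = field(default_factory=list)
    # Discoverability, Distribution & Usage
    distribution_and_access: str = ""
    discoverability_and_semantics: str = ""
    target_users: list[str] = field(default_factory=list)
    use_cases: list[str] = field(default_factory=list)
    # Auxiliary information
    cost_structure: list[str] = field(default_factory=list)
    success_measurement: list[str] = field(default_factory=list)

# Example filled in with the waste-vs-turnover scenario from above.
canvas = DataProductCanvas(
    core_idea="Compare recorded waste volumes with reported turnover",
    owner="Tax inspection department",
    data_providers=["Waste disposal records", "Tax declarations"],
    value_propositions=["Send inspectors to companies with suspicious discrepancies"],
    success_measurement=["Number of confirmed tax evasion cases per inspection"],
)
print(canvas.core_idea)
```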
Conclusion
A proven approach to enable companies and employees to make sustainable progress towards becoming (parts of) data-driven organizations is the definition of data products, which also requires a change in mentality to think in terms of such products. In general, it doesn't matter whether data products are locally maintained Excel spreadsheets or highly complex machine learning pipelines behind API gateways. What is crucial is that all stakeholders start to understand the value of data for business decisions and translate it into data products. The presented Data Product Canvas (DPC) is just one aspect on the way to Applied Data Products and is supplemented by other tools such as stakeholder maps. The DPC’s structure keeps the focus on the added value of a data product as much as possible. Subsequently, appropriate rules and specifications can be derived from successful data product definitions. Too much abstraction and too many specifications in the early phases usually lead to lower learning effects and significantly slower progress on the path to becoming data-driven.