Understanding and accelerating data usage with data lineage

Data portals need to demonstrate their impact and meet user needs by providing the right data assets to generate reuse. We explore how our customers are using the Opendatasoft data lineage feature to analyze portal performance and continually improve the experience they provide.

VP of Marketing, Opendatasoft

Whether your data portal is internal, aimed at partners or open to all, it is vital that you understand how your data is being used. This understanding lets you demonstrate the impact of your program, convince data owners to share their data, make the case for future funding and resources, and react more quickly to changing user needs.

That’s why Opendatasoft has launched its unique, innovative data lineage feature. Focused on usage, it allows organizations to better understand how their data is used internally and externally, across data ecosystems, while improving the ease and efficiency of data portal management.

Data lineage is vital to running successful data portals at both an operational and strategic level.

By understanding data usage through lineage you can focus your data roadmap and make better and more informed decisions around:

  • Publishing new datasets – Which new datasets should I publish first? Which datasets do my users want me to make available?
  • Updating and managing datasets – Which datasets should I update as a priority? Who is using my data within my ecosystem? How might they be impacted by any changes I make?
  • Sharing and demonstrating the impact of a data portal – How do I know if my data portal is used and is providing value? How can I know if particular pages or datasets depend on my own data?

Opendatasoft’s data lineage feature has been created to uncover relationships between data and how it is used, enabling better operational and strategic decision-making by providing understandable insights to data portal managers. It delivers value at both the dataset level, through data mapping, and at a portal level, providing strategic and operational insights through an intuitive dashboard.

It is vital to understand the journey of individual datasets, from when they are generated or added to your portal through their different uses inside your portal and when reused within other applications. Data lineage mapping provides insights into where datasets are being used (such as within pages, visualizations and even by other portals within the Opendatasoft ecosystem) and who is reusing them internally or externally.

This means that if you need to change a dataset, or even remove it from the portal, you can identify who will be affected, contact the relevant users and keep them informed.
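As an illustration only, the impact check described above can be pictured as a walk through a dependency graph. The sketch below is not Opendatasoft's implementation or API: the `LINEAGE` mapping, the asset names and the `downstream_assets` function are all hypothetical, and simply assume that lineage is stored as "asset, then the assets that directly reuse it".

```python
from collections import deque

# Hypothetical lineage records (assumed structure, not an Opendatasoft format):
# each key is an asset, each value lists the assets that directly reuse it.
LINEAGE = {
    "dataset:bus-stops": ["page:mobility-dashboard", "dataset:stops-enriched"],
    "dataset:stops-enriched": ["chart:stops-by-district", "portal:partner-city"],
    "dataset:air-quality": ["page:environment"],
}


def downstream_assets(lineage, start):
    """Return every asset that directly or indirectly reuses `start`.

    Uses a breadth-first walk, so direct reuses are listed before indirect ones.
    """
    seen = {start}
    ordered = []
    queue = deque([start])
    while queue:
        asset = queue.popleft()
        for reuse in lineage.get(asset, []):
            if reuse not in seen:
                seen.add(reuse)
                ordered.append(reuse)
                queue.append(reuse)
    return ordered


# Everything that would be affected if "dataset:bus-stops" were changed or removed.
affected = downstream_assets(LINEAGE, "dataset:bus-stops")
print(affected)
```

An empty result would suggest the dataset has no recorded reuses, while a non-empty one gives the list of pages, visualizations and portals whose owners should be contacted first.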

Digital Wallonia: Manage data deprecation with intelligence

Digital Wallonia, which supports digitization across the Belgian region of Wallonia, uses data lineage insights to manage data deprecation. If the team is looking to delete a dataset, they check data lineage to see whether it is used and, if so, where and by whom. This helps with daily decision-making and minimizes the risk of breaking existing reuses when datasets are deleted.

Screenshot - Digital Wallonia data lineage use case
This feature helps us verify if a dataset is used before deleting it, identify data sources for regional insights, and track data flow performance through KPI monitoring. It's a great asset for efficient data management and informed decision-making.
Marie-Bénédicte Laridant
Open Data Analyst, Digital Wallonia

UK Power Networks: Uncovering new data reuse insights

In the energy industry, the comprehensive open data portal of UK Power Networks (UKPN) is used by a wide community, ranging from local authorities and developers to consumers. While UKPN could see which of its datasets were most popular, it couldn’t automatically understand what those datasets were being reused for, or who was using them. Now, thanks to data lineage, it has a clearer view of the types of reuse, all while preserving user anonymity. This helps it better plan its strategy and roadmap for releasing new datasets.

UKPN data lineage use case
From a data publisher perspective, one of the problems experienced with open data is understanding what data users are doing with the data. Whilst Opendatasoft facilitates the submission of reuses already, this new data lineage feature provides additional insight into the maps and charts that users have built whilst maintaining user anonymity, that we would not have known about. This adds to the value of open data.
Yiu-Shing Pang
Open Data Manager, UK Power Networks

As well as understanding how individual datasets are being reused, administrators need to be able to demonstrate the overall value and impact of their data portal. That requires a higher-level view of which datasets are most valued, and who the top data consumers are. Administrators also need to be able to manage their portal efficiently, identifying any issues (such as invalid relationships or underused datasets).

Opendatasoft data lineage delivers this overview through an intuitive, interactive dashboard that shows portal administrators where data is being used (internally and externally) and which datasets are most popular, and that flags operational issues around data quality. These insights demonstrate the impact that a portal is having, and any areas where it can be improved. For example, if a dataset is underused, administrators could choose to feature it more visibly on the portal home page to increase engagement, or alternatively deprecate it. The dashboard therefore supports better-informed decision-making around the direction and strategy for your portal.

SNCF: Building new connections and uncovering new insights

French railway operator SNCF publishes an enormous range of data on its portal, from timetables and customer satisfaction metrics to network information such as the location of bridges and level crossings. This data can be reused in apps, websites and data visualizations by a wide variety of individuals and organizations, from mobility players to energy companies and municipalities. SNCF therefore uses the data lineage dashboard to build new connections, encourage collaboration and start conversations around particular data needs with its data ecosystem. It can now identify external stakeholders that are using particular datasets and reach out to them to understand their data needs and how SNCF can best meet them.

Screenshot of SNCF data lineage use case
The use of data lineage allows us to easily visualize the relationships between different stakeholders. In an open data portal context, it is particularly useful in encouraging collaboration with other players in the Opendatasoft ecosystem. It elevates the open data approach to the next level, giving it more meaning and value. Overall, it creates new opportunities for SNCF across its ecosystem thanks to the increased visibility that the data lineage feature provides.
Bertrand Billoud
Head of open data and content platforms, SNCF

OFGL: Identifying dependencies and improving data quality

OFGL is the French government body responsible for collecting, analyzing and sharing information on the finances of local government agencies, from municipalities and departments to entire regions. While it believed it knew who was using its data, data lineage has provided definitive evidence of who is consuming its data, enabling it to better demonstrate the value of its data portal.

Ranking of data consumers using OFGL datasets (snapshot of the data lineage dashboard)
We discovered that local communities, as well as other sector stakeholders, were using our data by integrating them into their own open data portal. At this stage, we had assumed such direct uses, but without concrete knowledge. The lineage statistics page provides easy access to the list of these users. This prompts us to be even more vigilant about the quality of the datasets being used and the regularity of their updates.
Nicolas Laroche
Project Manager, OFGL

Data lineage is an essential feature for demonstrating the impact and ROI of your data portal to all stakeholders, improving data portal maintenance and strengthening your data sharing strategy. To learn more about Opendatasoft’s data lineage feature, watch our new webinar on the subject.
