Understanding and accelerating data usage with data lineage
Data portals need to demonstrate their impact and meet user needs by providing the right data assets to generate reuse. We explore how our customers are using the Opendatasoft data lineage feature to analyze portal performance and continually improve the experience they provide.
Whether your data portal is internal, aimed at partners or open to all, it is vital that you understand how your data is being used. This enables you to demonstrate the impact of your program, helps convince data owners to share their data, helps secure future funding and resources, and lets you react more quickly to changing user needs.
That’s why Opendatasoft has launched its unique, innovative data lineage feature. Focused on usage, it allows organizations to better understand how their data is used internally and externally, across data ecosystems, while improving the ease and efficiency of data portal management.
Data lineage - at the heart of analyzing performance
Data lineage is vital to running successful data portals at both an operational and strategic level.
By understanding data usage through lineage you can focus your data roadmap and make better and more informed decisions around:
- Publishing new datasets – Which new datasets should I publish first? Which datasets do my users want me to make available?
- Updating and managing datasets – Which datasets should I update as a priority? Who is using my data within my ecosystem? How might they be impacted by any changes I make?
- Sharing and demonstrating the impact of a data portal – How do I know if my data portal is used and is providing value? How can I know if particular pages or datasets depend on my own data?
Opendatasoft’s data lineage feature has been created to uncover relationships between data and how it is used, enabling better operational and strategic decision-making by providing understandable insights to data portal managers. It delivers value at both the dataset level, through data mapping, and at a portal level, providing strategic and operational insights through an intuitive dashboard.
Data lineage mapping use cases
It is vital to understand the journey of individual datasets, from when they are generated or added to your portal through their different uses inside your portal and when reused within other applications. Data lineage mapping provides insights into where datasets are being used (such as within pages, visualizations and even by other portals within the Opendatasoft ecosystem) and who is reusing them internally or externally.
This means that if you need to change a dataset, or even remove it from the portal, you can identify who will be affected and contact those users to keep them informed.
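To make the idea of impact analysis concrete, here is a minimal sketch in Python. It is not the Opendatasoft feature or its API: it simply assumes lineage can be represented as a list of (source, consumer) pairs, and every asset name and the `downstream` helper are hypothetical, chosen for illustration only.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges: (source asset, asset that consumes it).
# All names are illustrative and do not come from a real portal.
EDGES = [
    ("dataset:air-quality", "page:environment-dashboard"),
    ("dataset:air-quality", "dataset:air-quality-monthly"),
    ("dataset:air-quality-monthly", "portal:partner-city"),
    ("dataset:bus-stops", "page:mobility-map"),
]


def downstream(edges, start):
    """Return every asset that depends, directly or indirectly, on `start`."""
    consumers = defaultdict(list)
    for source, consumer in edges:
        consumers[source].append(consumer)

    seen = set()
    queue = deque([start])
    # Breadth-first walk over the "is consumed by" relationships.
    while queue:
        node = queue.popleft()
        for consumer in consumers[node]:
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return sorted(seen)


# Which pages, datasets and portals would be touched by a change?
print(downstream(EDGES, "dataset:air-quality"))
```

In this sketch, changing or removing the hypothetical air-quality dataset would surface one page, one derived dataset and one partner portal, which is the list of owners to contact before making the change.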
Digital Wallonia: Manage data deprecation with intelligence
Digital Wallonia, which supports digitization across the Belgian region of Wallonia, uses data lineage insights to manage data deprecation. If the team is looking to delete a dataset, they check data lineage to see whether it is used and, if so, where and by whom. This helps with daily decision-making and enables them to minimize any risk to existing reuses when deleting datasets.
UK Power Networks: Uncovering new data reuse insights
In the energy industry, the comprehensive open data portal of UK Power Networks (UKPN) is used by a wide community of users, ranging from local authorities and developers to consumers. While UKPN was able to see which of its datasets were most popular, it couldn’t automatically understand what those datasets were being reused for, or who was using them. Now, thanks to data lineage, it has a clearer view of the types of reuse, all while preserving user anonymity. This helps it better plan its strategy and roadmap for releasing new datasets.
Data lineage dashboard use cases
As well as understanding how individual datasets are being reused, administrators need to be able to demonstrate the overall value and impact of their data portal. That requires a higher-level view of which datasets are most valued and who the top data consumers are. Administrators also need to be able to manage their portal efficiently, identifying any issues (such as invalid relationships or underused datasets).
Opendatasoft data lineage delivers this overview through an intuitive, interactive dashboard that shows portal administrators where data is being used (internally and externally) and which datasets are most popular, and flags operational issues around data quality. These insights demonstrate the impact that a portal is having and highlight any areas where it can be improved. For example, if a dataset is underused, administrators could choose to feature it more visibly on the portal home page to increase engagement, or alternatively deprecate it. The dashboard therefore aids better, more informed decision-making around the direction and strategy for your portal.
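As a rough illustration of the kind of check behind an "underused dataset" flag, the sketch below assumes you have a simple count of reuses per dataset. The dataset names, the counts and the threshold are all made up for the example, and the `underused` helper is hypothetical rather than part of any Opendatasoft API.

```python
# Hypothetical reuse counts per dataset (illustrative values only).
REUSE_COUNTS = {
    "dataset-a": 1200,
    "dataset-b": 3,
    "dataset-c": 0,
}


def underused(reuse_counts, threshold):
    """Return the datasets whose reuse count is below `threshold`, sorted by name."""
    return sorted(name for name, count in reuse_counts.items() if count < threshold)


# Candidates to feature more prominently, or to consider deprecating.
print(underused(REUSE_COUNTS, threshold=10))
```

The right threshold is a judgment call that depends on the size of the portal and its audience, so it is left as a parameter here.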
SNCF: Building new connections and uncovering new insights
French railway operator SNCF publishes an enormous range of data on its portal, from timetables and customer satisfaction metrics to network information such as the location of bridges and level crossings. This data can be reused in apps, websites and data visualizations by a wide variety of individuals and organizations, from mobility players to energy companies and municipalities. SNCF therefore uses the data lineage dashboard to build new connections, encourage collaboration and start conversations with its data ecosystem around particular data needs. It can now identify external stakeholders that are using particular datasets and reach out to them to understand their data needs and how SNCF can best meet them.
OFGL: Identifying dependencies and improving data quality
OFGL is the French government body responsible for collecting, analyzing and sharing information on the finances of local government agencies, from municipalities and departments to entire regions. While OFGL believed it knew who was using its data, data lineage has provided definitive evidence of who its data consumers are, enabling it to better demonstrate the value of its data portal.
Data lineage is an essential feature to demonstrate the impact and ROI of your data portal to all stakeholders, improve data portal maintenance and strengthen your data sharing strategy. To learn more about Opendatasoft’s data lineage feature, watch our new webinar on the subject.