The importance of data quality in turning information into value

What is data quality and why is it important? We explain why data quality is central to building trust and increasing data use, and the processes and tools required to deliver consistent high-quality data across the organization.

VP of Marketing, Opendatasoft

Organizations are generating and collecting increasing volumes of data – but for this information to be used at scale, it has to be reliable, accurate, and trusted by employees and other stakeholders. Businesses therefore need to focus on data quality if they are to create real value from their information, drive greater usage, and support better decision-making. Otherwise, poor data quality will add to costs, increase inefficiency, prevent collaboration, and undermine security and compliance.

According to Gartner, poor data quality costs organizations an average of $12.9 million every year, holding back data sharing and preventing employees from using data effectively in their working lives. What is data quality and how can it be improved?

High-quality data shares several key attributes:

  • It is accurate, without mistakes or errors
  • It is reasonable, with data containing valid values within acceptable ranges
  • It is timely, with information being up-to-date
  • It is complete, without gaps or missing fields
  • It is consistent, following the same formats and using the same units throughout the dataset, matching terms and definitions used across the organization
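Several of these attributes can be checked automatically. The following is a minimal sketch, assuming a hypothetical sales dataset; the field names, value range, maximum age, and expected unit are illustrative assumptions, not taken from any particular tool. Accuracy is deliberately left out, because confirming that a value is correct usually requires comparing it with a trusted reference source.

```python
from datetime import date, timedelta

# Hypothetical rules for an illustrative sales dataset (assumptions only).
REQUIRED_FIELDS = ("product_id", "units_sold", "unit", "updated_on")
VALID_RANGE = (0, 100_000)     # acceptable range for units_sold
MAX_AGE = timedelta(days=30)   # records older than this count as stale
EXPECTED_UNIT = "each"         # the single unit the dataset should use


def check_record(record, today):
    """Return a list of quality issues found in one record (empty if none)."""
    issues = []

    # Completeness: no gaps or missing fields.
    missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
    if missing:
        issues.append("incomplete: missing " + ", ".join(missing))

    # Reasonableness: values fall within the acceptable range.
    units = record.get("units_sold")
    if isinstance(units, (int, float)):
        if not VALID_RANGE[0] <= units <= VALID_RANGE[1]:
            issues.append("unreasonable: units_sold out of range")

    # Timeliness: the record has been updated recently enough.
    updated = record.get("updated_on")
    if isinstance(updated, date) and today - updated > MAX_AGE:
        issues.append("stale: not updated within the allowed period")

    # Consistency: the same unit is used throughout the dataset.
    unit = record.get("unit")
    if unit not in (None, "") and unit != EXPECTED_UNIT:
        issues.append("inconsistent: unexpected unit")

    return issues
```

Running `check_record` over every row and counting the issues by type gives a simple quality score that can be tracked over time.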

Reliable, high-quality data builds trust with users. This means they are confident about accessing and using it, as they are certain that it meets their requirements. Ensuring that data is trusted is particularly vital when scaling data use, such as through self-service data marketplaces, where data might be shared by another department and there is no existing relationship between the user and the data provider. 

Poor data quality damages businesses in five key ways:

Poor decision-making

Inaccurate data leads to the wrong decisions being made, such as when creating strategies and tactical plans. For example, if sales figures seem to show that a particular product is both profitable and popular, businesses will make or stock more of it. If the figures are inaccurate, then this leads to the wrong supply chain and sales strategy, impacting revenues and profitability. The same applies if customer data is out-of-date, leading to missed opportunities to upsell or cross-sell products or services.

Lower efficiency

Manually fixing issues with low-quality data takes an enormous amount of time within organizations. Often data analysts have to dedicate a significant part of their workload to checking, updating and cross-referencing information before they use or share it. This prevents them from dedicating their time to higher-value activities that benefit the business. Equally, data can be used to automate previously manual processes – something that cannot happen if it is unreliable, out of date or in the wrong format.

Compliance and reputational damage

Under regulations such as the GDPR and CCPA, organizations have a duty to safeguard confidential data and ensure it is accurate and reliable. Poor-quality data, particularly if shared externally with regulators or partners, can damage reputations or lead to legal issues, triggering investigations by regulators and lawsuits from aggrieved consumers.

Inability to create new data services

Many organizations are harnessing the data they produce to create new data services that they can sell to existing and new customers. This will be impossible to achieve if data is unreliable or inaccurate, impacting new revenue streams and undermining existing customer relationships.

Holds back increased data use

Data should be central to how every employee works, helping them to be more productive, efficient and effective. However, if they do not trust or understand particular datasets or believe they are low quality, they simply will not use them. This prevents data democratization, collaboration between departments and digital transformation. Poor data quality is a particular issue if inaccurate information is used to train AI models, resulting in inaccurate outcomes and potential bias within results.

Given the business benefits of data quality, it seems obvious that every organization should be focused on ensuring their information is accurate, reliable and trustworthy. However, achieving data quality is held back by two major challenges:

Growing volumes of data

In an increasingly digital world, the sheer amount of information generated and collected by organizations is growing exponentially. Internal sources, such as business systems, customer databases and Internet of Things (IoT) sensors all produce enormous volumes of data, often created in real-time, while information collected or bought from external partners adds to the problem. Keeping track of all of these data assets, and ensuring they meet quality standards, is a major issue for organizations.

Data is scattered across the organization

These increasing volumes of data are being created and stored across the organization, such as in departmental systems. This makes building a comprehensive inventory of data a major task for data leaders and their teams. Differences between the terms used by different departments to describe data also lead to potential quality and trust issues. For example, the term “customer” might not mean the same thing when used in data generated by the sales or finance departments, leading to confusion and inaccuracies if information is shared or compared.

Improving data quality requires a strong combination of data governance processes, data quality tools, and training. There needs to be a comprehensive, end-to-end framework in place that covers all relevant data and builds trust with users so that they confidently use data in their working lives.

Data governance

Data quality begins with the creation and enforcement of robust data governance processes. Data governance covers how you identify, organize, handle, manage, and use data collected in your organization. It ensures data quality and compliance by setting out a framework for how data is stored, protected and shared, with data stewards appointed to look after individual datasets, and comprehensive monitoring to ensure data is used in accordance with corporate rules. As part of data governance, organizations should create a data dictionary that defines what the terms used to describe data actually mean, providing consistency across the organization.
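To illustrate what a data dictionary can look like in practice, here is a minimal sketch in which each term has one agreed definition and a named steward. The terms, definitions, steward names, and the `undefined_terms` helper are all hypothetical examples, not a prescribed structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermDefinition:
    term: str
    definition: str
    owner: str  # the data steward responsible for this term (hypothetical names)


# Illustrative entries only; a real dictionary would be agreed across departments.
DATA_DICTIONARY = {
    "customer": TermDefinition(
        term="customer",
        definition="An account with at least one paid order.",
        owner="sales-data-steward",
    ),
    "revenue": TermDefinition(
        term="revenue",
        definition="Invoiced amount excluding tax, in the reporting currency.",
        owner="finance-data-steward",
    ),
}


def undefined_terms(columns, dictionary):
    """Return the column names that have no agreed definition yet."""
    return [c for c in columns if c.lower() not in dictionary]
```

Checking a new dataset's column names with `undefined_terms` before it is shared highlights terms that still need an agreed meaning.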

Data governance is particularly important when data is being shared between departments or externally – users need to know what a dataset contains and who the owner is, so they can make contact with them if they have queries. Data governance also needs to cover security and access rights, protecting confidential data from unauthorized access and therefore ensuring compliance.

Data quality tools and metadata

As well as rules that help ensure the quality of data, organizations need to automate data processing to remove manual steps that cause inefficiencies and potential errors. Data quality tools that check and fix data, particularly for common issues (such as formatting mistakes) or processes (such as anonymizing personally identifiable information) need to be an integral part of all data processing flows. Many solutions, such as Opendatasoft, have built-in processors that can be used to improve data quality, maximizing efficiency and accuracy.
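As an illustration of the two kinds of automated step mentioned above, the sketch below normalizes dates to a single format and replaces an email address with a salted hash. It is a generic example using only the Python standard library, not a description of Opendatasoft's processors; the field names and accepted date formats are assumptions. Hashing is strictly pseudonymization rather than full anonymization, so it may not be sufficient on its own for regulatory purposes.

```python
import hashlib
from datetime import datetime

# Assumed input formats; extend the tuple to match the data actually received.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")


def normalize_date(value):
    """Return the date in ISO 8601 form (YYYY-MM-DD), or None if unparseable."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def pseudonymize(value, salt):
    """Replace an identifier with a truncated, salted SHA-256 digest."""
    normalized = value.strip().lower()
    digest = hashlib.sha256((salt + normalized).encode("utf-8"))
    return digest.hexdigest()[:16]


def clean_record(record, salt):
    """Return a copy of the record with a normalized date and hashed email."""
    cleaned = dict(record)
    cleaned["signup_date"] = normalize_date(record["signup_date"])
    cleaned["email"] = pseudonymize(record["email"], salt)
    return cleaned
```

Because the same input and salt always produce the same digest, records belonging to one person can still be linked after cleaning, without exposing the original address.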

Accurate metadata is also central to data quality. Metadata is “data about data”, providing context to what a dataset contains and explaining its provenance. This helps people better understand the dataset and increases interoperability. Metadata improves data quality by providing consistent answers to questions users might have about a dataset.

Training and data culture

While data governance and tools put in place the right processes and frameworks to deliver data quality, ultimately it is vital to build a strong, confident data culture across the organization. This ensures that data is treated as a key business asset by everyone, from data creators and stewards to users. They need to be educated on its benefits, how to harness it, and the importance of using data while safeguarding privacy and compliance.

Data marketplaces enable organizations to share data at scale, creating a central, secure portal that collects all relevant data and makes it available to all users. An intuitive, self-service, e-commerce style experience makes it easy for everyone to discover and reuse data, while ensuring compliance through robust access rights management. Data marketplaces play a key role in ensuring data quality, with in-built processors to check and fix quality issues and drive data governance, strong metadata to describe datasets, and the ability for users to connect directly to data producers with feedback and queries. All of this makes data more available and builds confidence and trust in it, helping spread its use across the organization and beyond.

Want to learn how Opendatasoft can help meet your data quality challenges? Contact us to find out more and book a personalized demo
