How to accelerate the reuse of data thanks to deep search features

Searching for data shouldn’t be the equivalent of looking for a needle in a haystack. Our blog explains why you need natural language search within your data platform if you are to increase usage and drive data democratization.

VP of Marketing, Opendatasoft

Without effective search engines, the web would simply be an enormous mass of disorganized information. People would only be able to visit sites that they already knew about, severely limiting their ability to discover new information. Organizations would find it hard to attract new customers or visitors who had not heard about them before.

Thankfully, today’s web is much more user-friendly. All people need to do is type or speak their query into a search engine, phone or smart assistant to get fast, accurate results. And because search companies such as Google continuously refine their natural language algorithms to better reflect the intent behind a search, results keep improving.

When it comes to internal and external data portals, users have the same need to find relevant datasets or information as when they are searching on the web. If they can’t quickly find what they are looking for, they are likely to give up on their query or look for the information elsewhere. The consequences depend on the type of portal:

  • On public open data portals, poor search undermines trust and stops people coming back in the future.
  • On internal self-service data portals, employees waste valuable time searching, leading to inefficiency. Alternatively, they won’t bother incorporating new, relevant datasets into decision making or their daily working lives, undermining data culture and stopping organizations from becoming data-driven.
  • On data services platforms, users won’t find and benefit from relevant data services, potentially hitting providers’ revenues.

All of this means it is imperative that data portals make it easy for visitors to find information successfully. This means overcoming multiple challenges:

  • Data portals have a wide range of users, particularly open data portals, with differing levels of knowledge about data. For example, an energy company’s portal might have visitors from local government, developers, generators, researchers and the general public. Search has to deliver the right results to all of these audiences by understanding their queries.
  • People use different words to describe datasets. This is particularly true on public open data portals, where citizens might search using completely different words from those that internal data administrators use. Some portals might even include data in multiple languages, further complicating navigation. Just like a search engine, portals need to find the right results, whatever terms (or languages) are used.
  • How data is described, and the metadata used, can be complex and technical. That means non-specialists can find it hard to understand exactly what a dataset means or the information it contains. If they are unsure whether it is what they are looking for, they may well not access, download or reuse it.
  • While data portals clearly contain less information than the web itself, the amount of data available on them keeps growing. For example, Log In to North Carolina, the open data portal of the North Carolina Office of State Budget and Management (OSBM), has datasets that go back to 1969, covering areas as diverse as population (including census data), labor force, education and agriculture, supplied by 20 state departments. On many portals, simply scrolling through the available datasets is too time-consuming for users to attempt.

All of this makes navigation and search a key part of deploying a successful data portal and driving data democratization. Making it easy for users to find information is critical, and that requires natural language search that uses the same techniques as commercial search engines such as Google.

What is Natural Language Search?

As the term implies, natural language search is a search carried out in everyday language, just as if you were having a conversation with someone. The search engine understands the whole sentence and uses this to find the best answer to the query. Algorithms learn from user satisfaction and therefore continually improve the results they provide.

The alternative is keyword-based search. This tries to break the query down into its most important terms, removing connecting words such as “how” and “the”. It then matches these terms against what is in its database or knowledge base. However, it may not find an exact match (or may return thousands of results), and it cannot cope with different terms that refer to the same concept unless it has been specifically programmed to do so.
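
To make this limitation concrete, here is a minimal Python sketch of a keyword-based search. It is an illustration only, not how any particular portal works: the stop-word list, the two-entry catalog and the function names are all hypothetical.

```python
import re

# Hypothetical stop-word list: the "connecting words" a keyword search discards.
STOP_WORDS = {"how", "the", "a", "an", "of", "in", "is", "are", "what", "for", "to", "many"}

# Hypothetical catalog mapping dataset titles to descriptions.
CATALOG = {
    "Electric vehicle charging stations": "Locations of public charging points for electric cars",
    "Annual population estimates": "Number of residents per county, by year",
}


def extract_keywords(query):
    """Lowercase the query, split it into words and drop the stop words."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    return [token for token in tokens if token not in STOP_WORDS]


def keyword_search(query, catalog=CATALOG):
    """Return the titles whose title or description contains every keyword exactly."""
    keywords = extract_keywords(query)
    results = []
    for title, description in catalog.items():
        words = set(re.findall(r"[a-z0-9]+", f"{title} {description}".lower()))
        if keywords and all(keyword in words for keyword in keywords):
            results.append(title)
    return results
```

With this sketch, "How many residents are in the county?" finds the population dataset, but "How many inhabitants are in the county?" finds nothing, because the word "inhabitants" never appears in the catalog even though it refers to the same concept.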

On a data portal, natural language search enables organizations to:

  • Deliver relevant results to users faster, improving efficiency, saving time and matching what users actually intended to find.
  • Improve the user experience, as searchers get seamless access to the data they need.
  • Give users greater confidence that they are getting the right results.

This all makes it more likely people will return to data portals and incorporate datasets in their working and private lives. It therefore boosts data democratization, enabling greater transparency, performance and innovation.

At Opendatasoft we are committed to making it as easy as possible for everyone to benefit from improved access to data and to experience data in their daily lives. That’s why our platform incorporates:

  • Straightforward navigation that enables users to easily move through a data catalog manually to find the datasets they want. You can even build portals with pages and subpages covering specific themes to aid navigation outside of the data catalog itself.
  • The ability to automatically add detailed metadata to help users find specific datasets.
  • In-built filtering that lets users narrow down their searches. For example, French energy company Enedis enables visitors to filter geographically and by theme (such as mobility, energy and operations), with the ability to combine multiple filters.
  • Powerful natural language search capabilities that understand the entire search and look for matches within the metadata of all the available datasets (title, description, keywords, etc.). Importantly, the natural language search engine also gives insight into what users are looking for, including flagging searches that had no matching dataset. This helps plan which new datasets should be published to meet user requirements.
  • Advanced searching using our query language. For technical users, the Opendatasoft query language makes it possible to express complex boolean conditions as a filtering context. Queries can use full-text search, boolean operators or per-field filtering.
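
As a rough illustration of how metadata search, combinable filters and zero-result tracking fit together, the Python sketch below searches across title, description and keywords, applies optional per-field filters, and records queries that return nothing. It is not the Opendatasoft implementation or query language: the `Dataset` and `CatalogSearch` classes, their fields and the plain substring matching (far simpler than natural language search) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    # Hypothetical metadata fields, chosen for illustration.
    title: str
    description: str
    keywords: list
    theme: str
    region: str


@dataclass
class CatalogSearch:
    datasets: list
    # Queries that matched no dataset, kept to guide future publication.
    zero_result_queries: list = field(default_factory=list)

    def search(self, text, **filters):
        """Match every search term against the metadata, then apply per-field filters."""
        terms = text.lower().split()
        hits = []
        for ds in self.datasets:
            haystack = " ".join([ds.title, ds.description, *ds.keywords]).lower()
            matches_text = all(term in haystack for term in terms)
            matches_filters = all(
                getattr(ds, name) == value for name, value in filters.items()
            )
            if matches_text and matches_filters:
                hits.append(ds.title)
        if not hits:
            # Flag the search so administrators can see the unmet demand.
            self.zero_result_queries.append(text)
        return hits
```

Calling `search("charging", theme="mobility")` combines a full-text match with a theme filter, while reviewing `zero_result_queries` afterwards shows which searches found no dataset.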

Users expect finding the right dataset to be as simple as performing a Google search. That’s why the platform supporting your data sharing portal has to deliver seamless and immediate results, using natural language searching if it is to drive usage and data democratization.

 

Want to learn more about our data democratization platform? Contact one of our experts!

 
