Opendatasoft integrates Mistral AI’s LLMs to provide a multi-model AI approach tailored to client needs
To give customers choice when it comes to AI, the Opendatasoft data portal solution now includes Mistral AI's generative AI, alongside its existing deployment of OpenAI's model. As we explain in this blog, this multi-model approach delivers significant advantages for clients, their users, our R&D teams and future innovation.
Artificial intelligence (AI) has the power to transform a huge range of industries and business activities. Since the launch of ChatGPT in late 2022, many organizations have taken the opportunity to integrate AI into their products and services, demonstrating its long-term impact on the digital world. Opendatasoft has a long track record of researching the concrete applications of AI, and has deployed a range of AI-driven features into its solution that have been quickly adopted by clients. To extend its AI innovation, Opendatasoft is now integrating the large language models (LLMs) of European AI leader Mistral AI, giving customers choice in terms of the models they use with their data portal solution.
Integrating AI into the Opendatasoft data portal solution: a three-step multi-model approach
AI integration doesn’t happen in the blink of an eye. An effective strategy requires technology providers to make well-thought-out choices during the development phase in order to provide relevant features that meet customer needs and bring real added value. As part of its AI strategy, Opendatasoft has therefore followed three key steps before deploying AI-driven features:
Choosing the right models
From the beginning of its research into AI, Opendatasoft decided to harness the models and services offered by established, leading-edge AI players. This began with integrating the models created by OpenAI. This strategy allowed Opendatasoft to focus its efforts on developing relevant, customer-focused features, each based on a model that was perfectly suited to its use case, rather than having to create its own models from scratch. The models provided by Mistral AI have also now been integrated into the solution.
Choosing the right hosting locations
Once models have been selected, the second question is where they, and the data they consume, are hosted. Again, Opendatasoft has chosen to rely on existing infrastructure, provided by partners such as OpenAI and now Mistral AI, which hosts its models in Europe. This choice responds to growing demand from clients around model localization and hosting. Where data is held is a major issue, especially for organizations in the European Union, where companies must meet requirements around sovereignty, compatibility and GDPR compliance.
Focusing on model querying methods to maximize the value of AI
By leveraging existing AI models and hosting infrastructure, Opendatasoft has been able to concentrate its resources and expertise on the third step of its AI strategy: optimizing model querying methods. Opendatasoft’s R&D teams have invested significant time and resources in optimizing the way AI models are interrogated. This enables the solution to provide the right context, striking a balance between minimizing the amount of information required by models and ensuring that results are truly relevant. This optimization process transforms AI into a powerful driver of both innovation and improved performance when it comes to using data.
The integration of AI into the Opendatasoft data portal solution: a methodical approach focused on the democratization of data
Understanding AI to better harness it and add value
Developing intelligent, AI-based tools requires a deep understanding of their underlying workings. Based on this, Opendatasoft’s R&D teams start by selecting concrete use cases and then use these to create relevant solutions that support the company’s mission of democratizing access to data. To achieve this, it is crucial that the chosen AI models adapt to the context in which they are being used. In the case of Opendatasoft, this means enriching user queries with contextual elements such as metadata or extracts from data assets themselves. These elements enable AI to be leveraged effectively, delivering valuable insights and turning data into a strategic asset for organizations.
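To make the idea of enriching a query with context more concrete, here is a minimal sketch in Python. It is an illustration only, not Opendatasoft's implementation: the `DatasetContext` class, the `build_prompt` function and the prompt wording are all hypothetical names and choices made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DatasetContext:
    """Hypothetical container for the context attached to a user query."""
    title: str
    description: str
    fields: list[str]
    sample_rows: list[dict] = field(default_factory=list)


def build_prompt(question: str, ctx: DatasetContext, max_rows: int = 3) -> str:
    """Combine the user's question with dataset metadata and a short data extract."""
    lines = [
        "Answer using only the dataset described below.",
        f"Dataset title: {ctx.title}",
        f"Description: {ctx.description}",
        f"Fields: {', '.join(ctx.fields)}",
        "Sample rows:",
    ]
    # Cap the extract so the prompt carries useful context without growing too large.
    for row in ctx.sample_rows[:max_rows]:
        lines.append(f"- {row}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)
```

Capping the number of sample rows reflects the trade-off described earlier: the model receives enough context to ground its answer, while the amount of information sent stays small.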
Identifying differences between AI models to better adapt them
The integration of Mistral AI’s models has led Opendatasoft’s R&D teams to work on a multi-model approach. Although the performance of different models is almost identical, how a model responds does vary according to how questions and prompts are worded. This is where Opendatasoft’s expertise plays an essential role. By questioning the models in ways that match real-world usage, its teams have been able to understand their specific differences and any variations due to particular contexts. This ensures that users get consistent results across different models when interacting with AI.
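One way to picture a multi-model setup is a thin layer that words the same question differently for each model and normalizes the replies. The sketch below assumes this design for illustration; the model names, the templates and the `ask` function are hypothetical, and the backends are plain functions standing in for real provider calls.

```python
from typing import Callable, Dict

# A backend is any function that takes a prompt and returns text.
# Real backends would call a provider's API; this sketch does not.
Backend = Callable[[str], str]

# Hypothetical per-model templates: the same question may need different
# wording for each model to yield comparable answers.
TEMPLATES: Dict[str, str] = {
    "model_a": "Question: {question}\nAnswer briefly.",
    "model_b": "You are a data assistant.\n{question}\nReply in one sentence.",
}


def ask(model: str, question: str, backends: Dict[str, Backend]) -> str:
    """Word the question for the chosen model, call it, and tidy the reply."""
    prompt = TEMPLATES[model].format(question=question)
    raw = backends[model](prompt)
    # Strip surrounding whitespace so replies from different models compare cleanly.
    return raw.strip()
```

Keeping the wording in one table per model makes it easier to adjust a prompt for one model without affecting the others.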
Test, iterate and disseminate knowledge for increasingly efficient models
To ensure reliable AI capabilities, Opendatasoft’s R&D teams apply a fundamental principle: don’t blindly trust AI. They query LLMs by providing them with maximum context to produce results based on real information. For example, to reduce the risk of hallucinations (when a model provides an incorrect answer because it received an inconsistent question), our teams cross-examine the models, teaching them to answer “no” when the answer is unreliable, rather than giving inaccurate information. To facilitate the dissemination of knowledge internally within Opendatasoft and enable accurate tracking of AI learnings, all the results of the hundreds of tests that have been carried out are centralized and shared through internal dashboards. These highlight key metrics such as query performance evolution, response time, and cost estimates.
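As a rough illustration of this testing loop, the sketch below asks a model to reply "no" when the supplied context is insufficient, then scores a set of test cases for accuracy and average response time. Everything here is an assumption made for the example: the prompt wording, the `guarded_prompt` and `evaluate` functions, and the idea of passing the model in as a plain function.

```python
import time
from typing import Callable, Dict, List, Tuple

# (question, context, expected answer); use "no" as the expected answer
# when the context does not contain the information.
Case = Tuple[str, str, str]


def guarded_prompt(question: str, context: str) -> str:
    """Build a prompt that tells the model to abstain rather than guess."""
    return (
        "Use only the context below. If it does not contain the answer, reply 'no'.\n"
        f"Context: {context or '(none)'}\n"
        f"Question: {question}"
    )


def evaluate(model: Callable[[str], str], cases: List[Case]) -> Dict[str, float]:
    """Run every test case and return accuracy and average response time."""
    if not cases:
        return {"accuracy": 0.0, "avg_seconds": 0.0}
    correct = 0
    started = time.perf_counter()
    for question, context, expected in cases:
        answer = model(guarded_prompt(question, context)).strip().lower()
        if answer == expected.strip().lower():
            correct += 1
    elapsed = time.perf_counter() - started
    return {"accuracy": correct / len(cases), "avg_seconds": elapsed / len(cases)}
```

Metrics like these could then feed a shared dashboard, so that changes in accuracy or response time are visible each time a prompt or model changes.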
AI features already available on Opendatasoft data portals
For nearly a year, Opendatasoft has been offering its clients a range of AI-based features. Integrated into their data portals, these capabilities improve performance and optimize the consumption of their data:
- Intelligent search to identify all relevant data assets: Opendatasoft’s portals have integrated semantic search based on a vector model to improve the relevance of search results. In concrete terms, this multilingual search engine allows users to go beyond keyword searches and literal matches, taking into account context as well as the intent of a query, and delivering fast, relevant results. This helps users easily discover all of the data assets that match their needs.
- Similar data recommendations: Inspired by the experience provided by e-commerce websites, Opendatasoft now maximizes the ability of users to discover relevant data by automatically recommending similar datasets, based on their query and results. This proactive approach not only deepens user engagement, but also enriches their consumption experience within the data portal by increasing the average amount of data being accessed. By displaying the most relevant data assets based on the information being viewed, this feature guides users through the navigation and discovery process, providing them with simplified access to a wealth of additional information. By identifying related datasets and offering them intuitively at the right time in the user journey, Opendatasoft encourages a more in-depth and relevant exploration of the resources available on a data portal.
- The creation of data visualizations to bring raw data to life: By integrating an automated, AI-driven data visualization generation feature, Opendatasoft enables users to create and reuse maps, figures and charts in just a few clicks. This is not only an innovative, interactive and educational tool, but it also allows users to familiarize themselves with AI, building trust and confidence while exploiting raw data in a straightforward, self-service way.
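The vector-based search and similar-data recommendations described above both rest on the same idea: represent each dataset and each query as a vector, then rank by similarity. The sketch below shows that idea with cosine similarity. It is a simplified illustration, not the portal's implementation: the vectors are hand-written stand-ins for the output of an embedding model, and the `cosine` and `rank` functions are hypothetical names.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec: List[float], catalog: Dict[str, List[float]],
         top_k: int = 2) -> List[Tuple[str, float]]:
    """Return the top_k catalog entries most similar to the query vector."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors; a real system would obtain them from an embedding model.
catalog = {
    "bike lanes": [0.9, 0.1, 0.0],
    "cycling accidents": [0.8, 0.3, 0.1],
    "school budgets": [0.0, 0.2, 0.9],
}
results = rank([1.0, 0.2, 0.0], catalog)
```

The same ranking function can serve both features: pass the vector of a search query to find matching datasets, or the vector of the dataset being viewed to suggest similar ones.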
Next steps with AI to drive innovation and accelerate data democratization
AI is a powerful and ever-evolving technology which Opendatasoft is committed to harnessing to help clients effectively address their data challenges, now and in the future. Here is an overview of current objectives and plans for AI:
- Integrating client models: Through the multi-model approach, Opendatasoft wants to eventually allow clients to choose from different AI models, including making it easy to integrate their own models. As part of this, the team is considering creating internal tools to assess the quality of generated content and identify the most relevant model for specific needs.
- Helping users embrace AI: Opendatasoft is also working to provide all the necessary resources to help clients and their users take full advantage of existing AI features. In many ways AI is comparable to an e-bike: it offers valuable assistance, but requires users to know how to pedal to get the most out of it.
- Providing a chatbot: Opendatasoft is exploring the integration of a chatbot into its solution, allowing end users to seamlessly interact with all of a portal’s data assets.
When it comes to maximizing the value of data, AI has an extensive range of potential uses, including task automation, data preparation and quality, defining processing chains, metadata extraction, and document summarization, for example. One thing is certain: for Opendatasoft, the use of AI will be an ongoing process, focused on client needs.