Open data inception: listing 2600+ open data portals around the world

Opendatasoft is thrilled to continue its Open Data Weekly series with this comprehensive list (and a map) of 2600+ Open Data portals in the world.

Brand content manager, Opendatasoft

Today, the Opendatasoft team is launching something really special; something for all you open data geeks out there.

When working on building a state-of-the-art data and API solution, we often hear this question: “Where can I find clean and usable data?”

Over time, the idea of creating a unified resource gathering every open data portal in the world started to emerge. Opening it to everyone was at the core of the project’s conception. A couple of Dr. Evil gifs later, we started outlining the Open Data Inception project: a list of 2600+ Open Data portals around the world.

So, after the tasty map of French cheese, the state of Open Data in 2014 and SNCF Open Data, Opendatasoft is thrilled to continue its Open Data Weekly series with this comprehensive list (and a map) of 2600+ Open Data portals in the world.

Our first step was to search for similar projects. We found OpenGeocode and DataPortals by the Open Knowledge Foundation; both were interesting, but not exactly what we envisioned. OpenGeocode was mostly focused on American portals: for example, no French portals were listed. Neither project was easy to understand, either: is there an API? Can I download an entire dataset? How can I select one area of a map?

We also found lists on Quora and StackExchange showing interesting data. However, we still had a problem: neither list was structured, and neither was easy to reuse without clicking through the links every time.

We therefore decided to combine what we found with the Open Data portals we already knew of, adding by hand those that were not yet listed. The Opendatasoft platform allows users to add different data sources to a single dataset. So we added the data we had collected, along with a link to an online table where we could add entries by hand, and kept both permanently synchronized with the main dataset.

When mixing different data sources into one dataset, it is important to find a common thread between them. In our case, we limited ourselves to the name, the organization, the link to the portal, and a location. Other information was difficult to find in most cases, and we wanted a single consistent and useful list. Next, we used simple scripts, written in Clojure, to harmonize the different fields. For example, we capitalized textual fields and converted geographic data into one coordinate system.
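
A minimal sketch of this kind of harmonization, written in Python for illustration rather than Clojure. The original scripts are not shown here, so everything in it is an assumption: the raw field names (`x`, `y`, `crs`, `lat`, `lon`), the idea that some sources used Web Mercator, and the choice of WGS84 degrees as the common coordinate system.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857)


def capitalize_words(text):
    """Trim whitespace and upper-case the first letter of each word."""
    return " ".join(w[:1].upper() + w[1:] for w in text.split())


def web_mercator_to_wgs84(x, y):
    """Convert Web Mercator metres to (lat, lon) in decimal degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lat, lon


def harmonize(record):
    """Reduce one raw record to the four shared fields."""
    if record.get("crs") == "EPSG:3857":  # hypothetical marker for projected sources
        lat, lon = web_mercator_to_wgs84(record["x"], record["y"])
    else:  # assumption: the source already provides WGS84 degrees
        lat, lon = record["lat"], record["lon"]
    return {
        "name": capitalize_words(record["name"]),
        "organization": capitalize_words(record["organization"]),
        "url": record["url"].strip(),
        "location": (round(lat, 6), round(lon, 6)),
    }
```

Keeping the output to four fields mirrors the "common thread" idea: anything a source cannot reliably provide is simply left out.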

When collecting data from multiple sources, you seldom have a clean resource at first: there are typos, missing coordinates, duplicates, and various typologies.

We also wanted to get two things out of our data:

  • A list of all Open Data portals classified by country that people could easily browse through and bookmark.
  • An independent website showing a cool map on which Open Data portals could be geotagged. This would give a good sense of the density of Open Data portals.

On our original list, countries, cities and organizations were all placed on the same level. So we went through and created two columns to add standardized country names (in French and in English).

This almost immediately put our own geopolitical knowledge to the test. Should we classify England, Wales and Northern Ireland in different rows or include them in the United Kingdom? What about the Isle of Man, which is a self-governing British Crown Dependency? To avoid unnecessary debate, we used the United Nations list of sovereign states.
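
As an illustration only, the two standardized country columns could be filled from a lookup table like the one sketched below. The table contents and the function name are hypothetical, and the sketch assumes that sub-national entries roll up to their sovereign state; unknown places return `None` so they can be reviewed by hand.

```python
# Hypothetical lookup: place label found in a source -> (English, French) country name.
COUNTRY_BY_PLACE = {
    "england": ("United Kingdom", "Royaume-Uni"),
    "wales": ("United Kingdom", "Royaume-Uni"),
    "northern ireland": ("United Kingdom", "Royaume-Uni"),
    "cantabria": ("Spain", "Espagne"),
    "paris": ("France", "France"),
}


def standardize_country(place):
    """Return (english, french) country names, or None when the place is unknown."""
    return COUNTRY_BY_PLACE.get(place.strip().lower())
```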

 

Here is a table of all organizations and countries with Open Data portals that we could gather.

Our second task was to clean and fill in geographical coordinates for all of the Open Data portals on the list. Just over 1000 portals were already geotagged, so we added the remaining 600 by hand.

City portals were easy to map, but what about the portal of the United Nations, or nation-wide portals? The former was mapped to the headquarters of its parent organization, and the latter to the capital of the area they cover. For example, if a portal was a civic initiative throughout Spain, we mapped it to Madrid. If the portal was Cantabria’s, we mapped it to Santander.
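
The placement rule can be sketched as a simple fallback chain. This is a hypothetical illustration, not the process we followed by hand: the field names (`scope`, `area`, `organization`) are invented, and the coordinates are approximate.

```python
# Illustrative (lat, lon) pairs; values are approximate.
CAPITALS = {
    "Spain": (40.4168, -3.7038),      # Madrid
    "Cantabria": (43.4623, -3.8100),  # Santander
}
HEADQUARTERS = {
    "United Nations": (40.7489, -73.9680),  # New York
}


def locate(portal):
    """Pick a map position: own coordinates, then headquarters, then capital."""
    if portal.get("location"):
        return portal["location"]
    if portal.get("scope") == "organization":
        return HEADQUARTERS.get(portal["organization"])
    return CAPITALS.get(portal["area"])
```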

The last steps were to remove duplicates and push the new dataset onto our public portal.
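
One plausible way to spot duplicates is to compare portals by a normalized form of their URL. The sketch below assumes that two entries are the same portal when host and path match after ignoring the scheme, a leading `www.`, and a trailing slash; the `url` field name and both function names are hypothetical.

```python
from urllib.parse import urlsplit


def portal_key(url):
    """Build a comparison key: lower-cased host without 'www.' plus the path."""
    url = url.strip()
    if "://" not in url:
        url = "http://" + url  # urlsplit needs a scheme to find the host
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def remove_duplicates(records):
    """Keep the first record seen for each normalized portal URL."""
    seen = set()
    unique = []
    for record in records:
        key = portal_key(record["url"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```

Keeping the first occurrence means that the order in which sources are merged decides which copy survives.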

To put our data on a map in seconds, we uploaded it to our cloud-based data solution. The Opendatasoft platform automatically recognized the coordinates and mapped the Open Data portals. When looking at the map on a global scale, portals are grouped into clusters, allowing visitors to grasp the density of Open Data available in each area.
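
To give an idea of how marker clustering can work, here is a small grid-based sketch. It is not the platform's actual algorithm, which is not described here; it simply assumes that points falling in the same square cell of `cell_deg` degrees are merged into one marker carrying a count.

```python
import math
from collections import defaultdict


def cluster_points(points, cell_deg):
    """Group (lat, lon) points into square grid cells of cell_deg degrees."""
    cells = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        clusters.append({
            "lat": sum(p[0] for p in members) / n,  # centroid of the cell's points
            "lon": sum(p[1] for p in members) / n,
            "count": n,
        })
    # Largest clusters first, so the densest areas are easy to pick out.
    return sorted(clusters, key=lambda c: c["count"], reverse=True)
```

A large cell size corresponds to a zoomed-out view with few, heavy clusters; shrinking it as the visitor zooms in splits clusters back into individual portals.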

We simply customized the basemap and the markers through the admin interface. No code and no hassle were required.

One of the perks of the Opendatasoft solution is its ability to generate widgets that are always connected to your data through APIs. Opendatasoft provides a large open source widget library, which allows for easy and quick construction of effective dashboards.

The widget code we copied and pasted into our HTML page. Easy as ABC.

Our dataset and our map of all the Open Data portals from around the world were ready. The widgets then proved useful to build our website.

We created a responsive website displaying all of the Open Data portals by adding the map widget and the search box widget to effortlessly explore the data. The two widgets are connected: if you search for Paris, New York or the Isle of Man, the map will show you these results instantly.

We built the website from 25 lines of code, taken straight from the tutorial. No JavaScript, Python, or PHP was required, just some simple HTML and CSS. All the behaviour is handled by the widgets.

Want to use the Opendatasoft open source widgets? Check our documentation and our tutorials.

A few things we learned along the way:

  • More than 200 countries have space dedicated to Open Data, whether the portals are official, unofficial, or civic initiatives.
  • The United States boasts an impressive 500 Open Data portals, ranging from citywide portals to those of intergovernmental organizations such as the UN. Our favorite, of course, is that of the City of Durham and Durham County.

Our goal was to build a comprehensive list of all Open Data portals around the world, and our work is by no means done.

Dead URLs, new portals created every day, portals we missed… Since we hope the list will help other people and professionals to get a unified and up-to-date resource, we’d be happy to receive feedback.

Over time, we will also add other sources that are not Open Data per se: data dumps, GitHub repositories, portals…

Did we forget one of your portals? Did you find a dead link? Let us know! We created a form just for that. Or you can reach us on Twitter.

We hope you enjoy the list and the map just as much as we do, and we hope that it will prove useful to you! You’ll find the dataset here.
