Most Interesting Open Data Sources
In this post I want to introduce you to some of the most interesting open data sources available online. I’m hoping it will be informative and not too boring, especially for those who have limited interest in this area but decided to spare a few moments to expand their knowledge.
You still there? Good.
Asking you to fasten your seatbelts would be overkill, so simply sit back, relax and let’s find out more.
Before I present you with an official definition, it is crucial to understand the context and importance of open data sources. Data has the power to revolutionise the way societies are governed and to help identify and predict large-scale trends and behaviours. Open data also presents opportunities for businesses and can improve quality of life by enabling innovative solutions.
So what does ‘open’ really mean in the context of data?
Open data is data that can be freely used, re-used and redistributed by anyone – subject only, at most, to the requirement to attribute and share-alike.
The above description comes from the Open Definition document published by the Open Knowledge Foundation to define openness in relation to data and content.
There are also three principles underpinning the openness of data that you should be aware of:
- Availability and Access: the data must be available as a whole, in a convenient and modifiable form, preferably by downloading over the internet.
- Re-use and Redistribution: the data must be provided under terms that permit re-use and sharing, including intermixing with other data sets.
- Universal Participation: everyone must be able to use, re-use and redistribute. No exclusions or restrictions are allowed, provided the data does not contain information about specific individuals.
Those principles need to be applied consistently. They allow data sets from various sources to work together, which is crucial to openness as a whole.
Right, now that we understand the concept of open data slightly better, it is time to look at some interesting sources available online. Please bear in mind this is only a small selection of the data sources I personally find interesting. The choice is practically unlimited and there’s something for everyone, so keep on browsing folks!
I’ve come up with a number of categories for different open data sources so you know what kind of data you may find in each. I also asked myself a few questions about every source: how attractive is the content, does it come from a reliable source (e.g. government agencies, reputable think-tanks), and how challenging are the problems it captures?
Based on the above I decided on the following categories and interesting open data sources for each of them:
1. General/Academic:
The Upshot by The New York Times. News and analysis about politics and everyday life. Check out the Arts & Entertainment Guide, it’s a great way of exploring the atmosphere of the Big Apple.
2. Content Marketing:
The Moz Blog. The industry’s top wizards, doctors and other experts offer their best advice, research, how-tos and insights – all in the name of helping you level-up your SEO and online marketing skills.
3. Crime:
FBI Crime Statistics. Statistical crime reports and publications describing specific offenses and showing trends to understand crime threats. No records of Special Agent Dale Cooper though!
4. Education:
Education Data by UNICEF. Data related to sustainable development, school completion rates, net attendance rates, literacy rates, and more.
5. Entertainment:
Million Song Dataset. A collection of 28 datasets containing audio features and metadata for a million contemporary popular music tracks. Definitely one of my favourites and that’s not all within this category!
The Numbers. Detailed movie financial analysis, including box office, DVD and Blu-ray sales reports, and release schedules.
6. Environmental & Weather Data:
Environmental Protection Agency. Data items related to more than 540 chemical substances, containing information on human health effects that may result from exposure to various substances in the environment.
National Centers for Environmental Information. Weather records published since 1927, including monthly mean values of pressure, temperature and precipitation, plus station metadata notes documenting observation practices and station configurations.
7. Financial/Economic Data:
World Bank Open Data. Education statistics about everything from finances to service delivery indicators. You can also find the 2017 Global Economic Prospects report released by the World Bank there.
Global Financial Data. With data on over 60,000 companies covering 300 years, Global Financial Data offers a unique source to analyse the twists and turns of the global economy. Those who use it will take the experiences of the past to minimise the uncertainty of the future. Pretty comprehensive view indeed…
8. Government/World:
The CIA World Factbook. Yep, that’s right. The World Factbook provides information on the history, people, government, economy, geography, communications, transportation, military, and transnational issues for 267 world entities.
9. Social:
Facebook Graph. Not just a pretty face: the Graph API is the primary way to get data in and out of Facebook’s social graph. It’s a low-level HTTP-based API used to query data, post new stories, upload photos and perform a variety of other tasks that an app might need to do.
10. Travel:
Search the World. A great place to find statistics, population, weather, webcams, travel information and more for millions of locations worldwide.
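For the more hands-on readers: the Facebook Graph entry in the Social category describes an HTTP-based API, so here is a rough Python sketch of what a read request URL for it might look like. Treat it as an illustration only. The helper name `build_graph_url` is my own invention, the version string and access token are placeholders you would swap for real values, and the snippet only builds the URL without sending any request.

```python
from urllib.parse import urlencode

# Base host of the Graph API; the version segment below is a placeholder.
GRAPH_HOST = "https://graph.facebook.com"


def build_graph_url(node, fields, access_token, version="v19.0"):
    """Build a GET URL for reading selected fields of a Graph API node.

    `build_graph_url` is a hypothetical helper for illustration, not part
    of any official SDK.
    """
    # Fields are passed as one comma-separated query parameter.
    query = urlencode({
        "fields": ",".join(fields),
        "access_token": access_token,
    })
    return f"{GRAPH_HOST}/{version}/{node}?{query}"


# "me" refers to the user the access token belongs to.
url = build_graph_url("me", ["id", "name"], "YOUR_ACCESS_TOKEN")
print(url)
```

To actually fetch data you would pass this URL to an HTTP client of your choice, using a valid access token obtained from Facebook.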
So that was my selection of the most interesting open data sources available online. Hope you’ve found this summary useful.
Please tell me about other open data sources worth checking out. What are the challenges and opportunities ahead for open data? Any thoughts welcome!