A visualisation of language knowledge in Europe

Table of Contents

What is languageknowledge.eu?
A selection of mentions
Why not visualize the data on a map?
Where does the data come from and how are the statistics computed?
How reliable is the data?
Launch text

What is languageknowledge.eu?

Languageknowledge.eu is a visualization of language knowledge in Europe by Jonathan Van Parys, launched to coincide with the 2012 European Day of Languages. The site uses the latest available data from the Europeans and their languages survey by the European Commission.

People like Martin have done such nice things with demographic and migration data. So when the European Commission published a new Eurobarometer survey on languages, I thought it would be fun to do something interactive with language data to visualize language dynamics in Europe: which languages are growing in popularity, which languages are most spoken in each country, which countries are lagging behind in language skills, etc.

For ideas, questions or corrections feel free to email me at jvanparys@gmail.com

Thanks to @pvpbrussels, @odemarne and @madewulf for their feedback as I was working on the site.

I'd also like to thank the European Commission for making the raw survey data accessible and TNS for answering questions on the survey's methodology.

A selection of mentions

Visualising Europe’s Languages · Open Knowledge Foundation blog
How fluent are we? · Low Countries blog
Have a new look at data on language knowledge · European Commission, EU Languages and Language policy
Czechs' English the worst in the EU · aktualne.cz (in Czech)
Slovaks and European languages skills · aktualne.sk (in Slovak)
How many Europeans speak English, or other languages? · Quora
How many people speak English in Poland? · Planet English (in Polish)
Language Knowledge, en un par de clics · Documentación aplicada a la Traducción, Universidad de Granada (in Spanish)
El català, setena llengua amb més nous parlants d'Europa · VilaWeb.cat (in Catalan)
Stop povinnému dabingu v Česku · stopdabingu.cz (Stop Compulsory Dubbing Petition, in Czech)
Come spingere gli italiani a studiare l’inglese? · LINKIESTA.it (in Italian)

Why not visualize the data on a map?

You might be wondering why I didn't just visualize this data on a map, given its obvious geographical dimension. My take is that maps aren't so good for comparing countries: area and position carry strong visual meaning, so unless geographic location or size explains something about the data, a map makes the data harder to read and can send confusing signals about how countries rank.

Plus, lists and horizontal bar charts are particularly useful for direct comparison, especially when people can play with dimensions, as you can here with the age groups and types of knowledge. Resizing and reordering bars makes rankings and changes in position more obvious.

Where does the data come from and how are the statistics computed?

Language knowledge statistics are computed using the Europeans and their languages Eurobarometer survey published in July 2012. By the way, there are many other Eurobarometers out there: it's well worth taking a look.

In early 2012, a little more than 27 thousand people aged 15 and over were polled all over Europe for the Eurobarometer; roughly speaking, about one thousand per country. I grouped people into ages 15 to 34 (Younger), 35 to 54 (Middle) and 55+ (Older) to highlight how language knowledge evolves between generations.

In the survey, every person is asked to state their mother tongue (Native), which other languages they speak well enough to have a conversation (Learned), and how well they speak those learned languages (basic, good or very good). In line with other studies, I consider that a person knows a learned language when they state that their knowledge of it is good or very good. I cross-checked the statistics on languageknowledge.eu with those that appear in the official Eurobarometer report and they are fully consistent.
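The grouping and counting rules described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the site: the Respondent record, its field names and the sample rows are hypothetical, and the sketch counts every respondent equally, whereas the published figures may rely on survey weights.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Respondent:
    # Hypothetical record layout, not the Eurobarometer file format.
    age: int
    native: set[str]
    # Learned language -> self-assessed level: "basic", "good" or "very good".
    learned: dict[str, str] = field(default_factory=dict)


def age_group(age: int) -> str:
    """Map an age to the three groups used on the site."""
    if age < 15:
        raise ValueError("the survey only covers people aged 15 and over")
    if age <= 34:
        return "Younger"
    if age <= 54:
        return "Middle"
    return "Older"


def knows(person: Respondent, language: str) -> bool:
    """Native speakers count, as do learners rating themselves good or very good."""
    if language in person.native:
        return True
    return person.learned.get(language) in {"good", "very good"}


def share_knowing(people: list[Respondent], language: str, group: str | None = None) -> float:
    """Unweighted share of respondents (optionally within one age group) who know a language."""
    pool = [p for p in people if group is None or age_group(p.age) == group]
    if not pool:
        return 0.0
    return sum(knows(p, language) for p in pool) / len(pool)


# Made-up respondents, for illustration only.
sample = [
    Respondent(20, {"Dutch"}, {"English": "very good"}),
    Respondent(30, {"German"}, {"English": "good"}),
    Respondent(40, {"French"}, {"English": "basic"}),
    Respondent(60, {"English"}),
]

print(share_knowing(sample, "English"))
print(share_knowing(sample, "English", group="Younger"))
```

Note that a "basic" self-rating does not count as knowing the language, which is why the 40-year-old in the sample is excluded from the English share.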

You may be surprised to see that certain languages you expected to see aren't present. The reason is that the set of options offered to respondents in the survey only included the 30 or so languages included on this website.

Data on subtitling was retrieved from Wikipedia's article on Dubbing. I included it because, as you will see, the use of subtitling rather than dubbing or voice-over in the media is strongly and positively correlated with knowledge of foreign languages. Subtitling on television may be the cheapest language school there is; that some countries still allow dubbing is just nonsensical. The other determining factor in explaining foreign language knowledge appears to be the size of a country's native languages: the more widespread they are (as native or learned languages), the less likely the country is to be multilingual.
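For readers who want to check this kind of relationship themselves, here is a rough sketch of a Pearson correlation between a subtitling flag and the share of people who know a foreign language. The country labels and all numbers below are placeholders invented for the example, not survey results, and coding subtitling as 1 and dubbing as 0 is my own simplification.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Placeholder values only: (uses subtitles? 1/0, share knowing a foreign language).
toy_countries = {
    "A": (1, 0.8),
    "B": (1, 0.7),
    "C": (1, 0.9),
    "D": (0, 0.3),
    "E": (0, 0.4),
    "F": (0, 0.2),
}

subtitled = [float(flag) for flag, _ in toy_countries.values()]
shares = [share for _, share in toy_countries.values()]

print(f"correlation on toy data: {pearson(subtitled, shares):.2f}")
```

A coefficient close to 1 on real data would support the claim above, though a correlation on its own says nothing about which way the causality runs.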

How reliable is the data?

This is obviously survey data based on self-assessment, so it is only as good as the representativeness of the sample and people's ability to self-assess their knowledge of a language. I've noticed that some small languages may be over-represented in some countries because of sample bias. That said, I've found that results from the Eurobarometer language survey match more in-depth local surveys, are consistent with other European language learning surveys, including the first European survey on language competences, and tend to be correlated with the languages that are compulsory at school in each country.

Launch text

Launching on the 2012 European Day of Languages, languageknowledge.eu is a new website that visualizes language knowledge in Europe based on the latest European Commission survey data, published this summer in the "Europeans and their languages" Eurobarometer.

The interactive website allows visitors to find out which languages are most widely known in Europe, by country and by age group, and to see the split between native speakers and people who learnt the language later in life. Visitors can also pick any language to see in which countries that language is most popular.

Here are a few interesting insights:

  • German, English and Italian are the largest mother tongues in Europe.
  • English, French and German are the largest foreign languages in Europe.
  • Italy, Spain, the Czech Republic and Hungary are the countries where young people speak the least English.
  • Some 41% of young people in Europe speak English, and 21% speak German and French.
  • Zoom in on older people, and those figures are 25%, 23% and 18% respectively.
  • The top foreign languages in Poland, where the bulk of the European Day of Languages festivities are taking place, are English, German and Russian. If you zoom in on older people, the order is reversed to Russian, German and English.
  • Outside of Poland, the countries with the largest shares of Polish speakers are Lithuania, Ireland and Germany.
  • There are three countries in Europe where Russian is known by more than 40% of the population: the three Baltic States.