New statistics for Wikipedia

Erik Zachte, statistics manager of the Wikimedia Foundation, has published corrected numbers. The main problem was web crawlers from the United States of America. These are programs that search the web for new sites and pages; when they hit a page, it looks to the statistics as if a human visitor had requested that page. Erik has now fixed that, thanks!

This has considerable repercussions for the maps I have made based on the statistics (Europe, Arab countries, Arabic). Some items:

  • In general, the United States and English now appear less prominent in the statistics.
  • Previously, it seemed that 21 percent of the viewers of Wikipedia in Hungarian came from the US. This dropped to 0.6 percent.
  • It looked as if, in the Netherlands and Sweden, the Wikipedias in the national language and in English each had about as many viewers. Now that has changed, too: Dutch and Swedish have a share of roughly 55% each, and English roughly 35%.
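
How crawler traffic distorts such shares can be sketched in a few lines of Python. The request counts below are invented for illustration (they are not Erik Zachte's figures), and both the function name `share_by_country` and the record layout are assumptions made for this sketch. The point is only that removing automated requests attributed to one country changes that country's percentage considerably.

```python
from collections import Counter

def share_by_country(requests, include_bots=True):
    """Return each country's percentage share of page requests.

    `requests` is a list of (country, is_bot) tuples; this record layout
    is an assumption for this sketch, not the real log format.
    """
    counts = Counter(
        country for country, is_bot in requests if include_bots or not is_bot
    )
    total = sum(counts.values())
    return {country: 100.0 * n / total for country, n in counts.items()}

# Invented sample for one language edition: 900 human requests from
# Hungary, 100 human requests from the US, and 250 crawler requests
# that appear to come from the US.
sample = (
    [("HU", False)] * 900
    + [("US", False)] * 100
    + [("US", True)] * 250
)

with_bots = share_by_country(sample, include_bots=True)
without_bots = share_by_country(sample, include_bots=False)

print(f"US share with crawlers:    {with_bots['US']:.1f}%")
print(f"US share without crawlers: {without_bots['US']:.1f}%")
```

In this invented sample the US share falls from 28% to 10% once the crawler requests are left out, while the number of human readers stays exactly the same.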

Still, in many cases we see an increase in the share of English, e.g. in Austria from 14.5% to 18.3% in the years 2009/2010.

The main conclusions remain the same: Most people look things up in the Wikipedia edition of their native tongue, and if they don’t find anything, they go to Wikipedia in English. Third languages hardly play a role.

The Wikipedia country and language version map of Europe

This is a map (PDF version) based on the numbers published by Erik Zachte, statistics manager of the Wikimedia Foundation. It answers the question of which language versions of Wikipedia are visited most in a given country.

At first glance there is no obvious pattern in the map. Still, the countries can be divided into four groups:

* English hegemony

* National hegemony

* Dual hegemony

* Mixed situation

In a number of countries, especially in the Balkans, English is the most-read Wikipedia language. In others, a national or official language dominates the field, for example in France and Poland. (In the case of Britain, English and the national language are the same.) The third major group consists of the countries where English and the national language share the hegemonic role (the Netherlands, Norway, Hungary).

The last group, ‘mixed’, contains only a small number of countries, most prominently Belgium and Switzerland, and to a certain degree also Bosnia and Herzegovina. These are ethnically and linguistically mixed countries.

Usually, languages learned at school or languages of neighbouring countries do not play a notable role. An exception is Italian in Albania; other cases, such as Polish in Lithuania, can be explained by ethnic minorities.

The conclusion: Europeans primarily consult the Wikipedia language version in their native language. If that version is weak (and also depending on other factors), they consult the English version.

Loss in numbers, loss in real wiki life?

Recently in the Netherlands: Ronald explains to the Dutch Wikipedians that Wikipedia will die away over the next three years. He backs his fear with hard numbers. After steep growth, the number of edits made in the bigger language versions is slowly declining.

I believe that he put a lot of effort into getting his numbers. But he also said: “I am not talking about content, but about numbers.”

As a historian, I have a different approach. Numbers can give me inspiration before I go to the sources to understand “what has actually happened”, following the example set by Leopold von Ranke, the nestor of the discipline.

Some things seem to be countable, but in fact they are not. You can count how many times the USA, or the Soviet Union, cast a veto in the UN Security Council. But this is not enough to tell which country caused more good or evil. You have to look at what these vetoes were about.

What is an edit in Wikipedia? One vandalism edit and one revert make two edits that contribute nothing to the progress of our encyclopedia. An afternoon spent writing an article and putting it online in a single edit makes only one edit, but it improves the wiki.

When I was a beginner and I saw an error in an article, I corrected it. Later I saw another error and corrected it, too. Nowadays, I read the whole article before saving, so I do the same work in only one edit. The statistics, nevertheless, believe that I am only half as productive as I was before.

Most vandals are young people. The demographic evolution in the rich countries leads to fewer young people. Fewer young people, less vandalism, fewer edits?

Wikipedia was once a project to create an encyclopedia. It is now an existing encyclopedia. Writing articles (and fighting about them) causes more edits than maintaining them.

So, obviously there are fewer edits. I don’t know what that means. Maybe it is no cause for worry at all.

What are ‘foreign helpers’?

Wikimedia Statistics has a site map that presents a list of all language editions of Wikipedia. You can sort them, for example, by “Participation: Editors (5+) per million speakers”. What will be the result?

The language editions at the top of that list share one trait: They have a lot of ‘foreign helpers’. A foreign helper is someone who edits a Wikipedia language edition but is not able to write in that language.

Take Aragonese, quite high on that list, and have a look at Categoría:Usuario an. There is an impressive number of Wikipedians who indicate Aragonese (an) on their user page, but most of them have an-0, an-1 or an-2. They do not speak Aragonese, or only to a limited extent.

Why do they have a user page on Aragonese Wikipedia then? Because they are from other regions of Spain but love Aragonese. Because they are Wikimedia activists who have to communicate with many language editions. Because they take photographs and put them into the articles in question, in all language editions where they exist.

Those people with a user page do not necessarily make five edits a month (= active user), but the trend is obvious: If a lot of the edits are made by foreign helpers, then it seems that there are a lot of editors per million speakers (the linguistic community being very small).
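
The arithmetic behind this effect can be sketched as follows. All figures are invented for illustration, and `editors_per_million` is a hypothetical helper, not something taken from Wikimedia Statistics. It only shows how a handful of helpers weighs heavily when the number of speakers is small.

```python
def editors_per_million(active_editors, speakers):
    """Active editors per million speakers of the language."""
    return active_editors * 1_000_000 / speakers

# Invented figures: a small language with 10,000 speakers and a large
# one with 20,000,000 speakers.
small_native_only = editors_per_million(active_editors=2, speakers=10_000)
small_with_helpers = editors_per_million(active_editors=2 + 8, speakers=10_000)
large_language = editors_per_million(active_editors=1_000, speakers=20_000_000)

print(small_native_only)   # 200.0
print(small_with_helpers)  # 1000.0
print(large_language)      # 50.0
```

In this made-up case, eight foreign helpers lift the small edition from 200 to 1,000 editors per million speakers, far above the large edition with its 1,000 active editors.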

Other language editions of this kind are Saterland Frisian (supported by Dutch Frisians) and Icelandic (supported by other Scandinavians). And, of course, the most prominent is Volapük Wikipedia; Volapük is a constructed language spoken by fewer than thirty people in the world.