Wikipedia discovers plagiarism – as a side effect

Centre for Medical Research, University of Münster (CC-BY-SA, by STBR)

In late May 2011 newspapers reported on another case of plagiarism in Germany. It seems that within three years, the Medizinische Fakultät (Faculty of Medicine) of the University of Münster accepted two PhD theses that are strikingly similar. This was discovered by a Wikipedian, who tells his story in “Wikipedia:Kurier” on German Wikipedia.

According to him, the introductory parts of theses are useful for writing a Wikipedia article: they sum up the state of research on a specific subject. In March 2011 he read a thesis from 2006 about a topic related to the prostate (he doesn't want to name the specific article, for privacy reasons). As the topic was unknown to him, he consulted Wikipedia, and that is how he found a PhD thesis from 2009. Same university, very similar subject. In parts the very same text, even the same page breaks. The same bibliography. The author of the 2006 thesis appears nowhere.

After his discovery, the Wikipedia author went to bed with an uneasy feeling. Nobody had harmed him personally, but would it be right to do nothing? He later asked some friends from different universities, who all said: Incredible, you must report that!

So he sent an e-mail to the Medizinische Fakultät of the University of Münster. Three hours later he got a reply saying that the proper institutions would deal with the case, and two weeks later came a preliminary report and a thank you. On May 20th came another interim report, containing basically what was later reported in the news.

Wikipedia offline (2)

(see part 1)

Okawix is, in theory, easier to install. The browser itself is available for Windows, Linux and Mac, and it lets you import the Wikipedia files directly. As with WikiTaxi, you can choose a Wikipedia language version or one of the other projects. You can also choose whether to install it with or without pictures, which makes a huge difference in data size.

Alas, some downloads take a lot of time and cannot be resumed when interrupted. For example, Wikipedia in German needs 3 GB, and the additional images need 13 GB. The image-free versions can also be downloaded via torrent. Because of the file sizes you might prefer the torrent, but that makes installation more complicated (you must know how to handle torrents, and how to link Okawix to the downloaded file). The data seem to be from September 2010.

In use, Okawix has some drawbacks. Of the three browsers, it seems to be the least stable. It crashes from time to time, especially when you use the search function. Searching in general tends to take some time. In the left sidebar you are offered a list of search results that thematically fit your search, which is quite nice. A very silly and unnecessary flaw is the (too) tiny lettering in the search field.

Okawix shows small Wikipedia pictures in the articles, but some appear very distorted. There are a lot of links that don’t work and are apparently not supposed to work (?). The interface is available in several languages and several skins; you find them only if you happen to right-click. When you click on the help button in the menu, you get a blank documentation sheet.

OKAWIX RATING: installation 8 points, stability 5 points, interface 7 points, flexibility of content 10 points, illustrations 7 points, up-to-date 5 points

Kiwix

As a browser, Kiwix combines the stability of WikiTaxi with the more advanced, button-driven interface of Okawix. Kiwix uses Wikipedia articles in the openZIM file format, which is recommended by the Wikimedia Foundation. You can download them directly from a site, or via torrent. The latter is very welcome because the files are large. For example, German takes 13 GB (text and pictures always come together). The data are from October 2010.

After downloading one of the 21 language versions, you tell the browser where the file is on your hard disk. The first page is the Wikipedia article about “Wikipedia”, which might be funny but is not very helpful. Why not present the main page? A major fun killer is the indexing: after the installation, when you try to use the search function, you are told that the pages must be indexed, and after clicking Yes you see a progress bar. Indexing takes very long, possibly more than half a day, so you may want to run it overnight.

But when everything is ready, you will enjoy your Kiwix, which is a quick and stable browser. Of the three browsers, it comes closest to the look and feel of the original Wikipedia. The pictures are small but good, the links work, and so do the categories. Using a different language version requires going to the menu and opening the file in question.

KIWIX RATING: installation 6 points, stability 8 points, interface 8 points, flexibility of content 4 points, illustrations 9 points, up-to-date 5 points

Conclusion: With its stability, speed and images, Kiwix certainly gives the best browsing experience. But if you need smaller files because your hard disk is small, use Okawix or WikiTaxi. The latter is the only one that provides you with the latest versions, while only Okawix is also available for Linux and Mac.

Wikipedia offline (1)

Many people, not only in the Global South, still have no internet access, or they don’t have it all the time. If you want to consult Wikipedia offline, there are some technical solutions. I have tried out three of them to help a senior citizen: WikiTaxi, Okawix, and Kiwix.

The principle is easy: install a browser and get the Wikipedia page data. In practice, this is not too difficult for a digital native, but it is not easy enough for less internet-savvy people. Given the state of documentation and stability, it is obvious why (two of) the makers still call their versions Beta or zero dot something. The documentation and translation in particular are usually very poor.

WikiTaxi

WikiTaxi is a browser you have to install. Then you download the most recent version (or an older one, if you wish) of your favorite Wikipedia language version or another Wikimedia project, e.g. Wikibooks or Wikiquote. Luckily the most recent one is from the previous month. It is not so easy to find the right version in a long list of similar file names. Then you have to use the tool “WikiTaxi importer” to turn your file into a WikiTaxi file.

When you start it, you have to choose which of your installed Wikipedia versions you want to read. As the interface is in English only, this is something you have to explain to a person who does not understand English well. The browser shows you a random page first, and is easy to use thanks to its search function. You get the Wikipedia articles without pictures; where there was a picture in the original, you see the wikisyntax for it. Mathematical formulas appear in a simplified form. WikiTaxi handles tables quite well.

Notable advantage: While the other two browsers display only pages of the article namespace (plus categories), WikiTaxi also presents the Portals, a thematic approach to articles.

Browsing with WikiTaxi is quick and stable. Categories work in principle, but often fail in practice. If you want to change the language version, you have to quit the browser and start it again.

WIKITAXI RATING: installation 6(/10) points, stability 9 points, interface 6 points, flexibility of content 9 points, illustrations 3 points, up-to-date 9 points

(to be continued)

Wikileaks world heritage?

Today Jimmy Wales appeared in German media with his call to make Wikipedia UNESCO World Heritage. The media were rather reluctant to cheer. FOCUS says that the chances are slim considering the UNESCO rules, so Wikimedia wants UNESCO to introduce a new category for something like Wikipedia. Let’s be honest, asks a dpa reporter, isn’t it also a PR stunt? Wales: “This is too big a word. But no question, it is about showing the public that Wikipedia is not only a website but a cultural phenomenon.”

ntv believes that the thresholds are high. There is no suitable category, and projects cannot nominate themselves, only governments can do that. “Wikipedia wants to try.”

“Between difficult and forlorn” are the chances of Wikipedia, according to Peiner Allgemeine. “Cologne Cathedral, Taj Mahal, Machu Picchu, Wikipedia?” To some people this sounds like a joke. Is it only a PR gag, a response to the falling number of contributors?

Netzwelt.de obviously made a typing error when it wrote that a place had come into existence where information can be exchanged in a unique way, and that this is the rationale for making Wikileaks (!) world heritage.

Wiki hates plagiarism

Silvana Koch-Mehrin, now ex-vice president of the European Parliament, never had the reputation of being the most studious member of her parliamentary group (CC-BY-SA, Muffinmampfer)

In Germany the media report about a wave of politicians losing their PhD degrees and public offices. Anonymous plagiarism fighters use wikis to unravel academic fraud.

It started in February with Guttenplag Wiki, founded by User:PlagDoc to examine the plagiarism a law professor had seen in the PhD thesis of Karl-Theodor zu Guttenberg. On March 1st, Guttenberg resigned as German minister of defense.

User:Goalgetter then wanted to examine some other PhD theses, but User:PlagDoc refused. So User:Goalgetter founded a new wiki, also at wikia.com, called VroniPlag. The object of interest was the thesis of Veronica Saß, the daughter of the former Bavarian prime minister Stoiber. The University of Konstanz has now revoked her degree. By the way, the hint to have a look at Saß’ thesis came from a Konstanz IP address.

The next success of VroniPlag was to bring down the political career of Silvana Koch-Mehrin, vice president of the European Parliament (from the liberal parliamentary group). She decided to remain a member of parliament, but stepped down as vice president and as a member of the highest body of her national party, the liberal FDP.

Another German liberal MEP on VroniPlag's list is Jorgo Chatzimarkakis. Ironically, Chatzimarkakis says on his own web site that he wants to fight for intellectual property and against the stealing of knowledge. Matthias Pröfrock, a state representative for the Christian Democrats in Baden-Württemberg, has chosen not to use his degree pending further examination by his former university, Tübingen.

User:Goalgetter described himself in an interview with Frankfurter Allgemeine as an independent entrepreneur with a lot of conservative clients who might disapprove of his anti-plagiarism actions. Plagiarism hunters linked to universities are also afraid that they might suffer negative consequences should their activism become public. User:Goalgetter said that until February he had nothing against Guttenberg or his Christian Social politics; he considered him fresh and accurate, and was deceived like most others. Then minister Guttenberg was exposed as a fraud but admitted nothing. “We are the physicians of society. We see the flaws in the sick system, we must unravel them to make society become a better one.”

But who will go to the prestigious Grimme Awards for VroniPlag and possibly accept an award? The VroniPlaggers do not agree; some suggested that Wikia collaborator (and Wikipedian) Tim Barthel should attend. User:Goalgetter himself would like the group to go wearing baseball caps and sunglasses.

Drop-out compared

April 2011      main page    tutorial first page    second page    last page
nl (7 steps)      6616063                  22809           3655          746
de (7 steps)     36578909                  10295           3217         1041
en (9 steps)    146254073                  26483          13219         2572
sv (9 steps)      3627675                    505            899          258

These numbers compare page views of the tutorials in different Wikipedias. For example, the tutorial in Dutch Wikipedia has seven steps; its first page is linked from the left sidebar, and this generates quite a lot of clicks. From the first to the second page the Dutch tutorial loses a lot of viewers, and in the end only 746 of 22,809 make it to the last page.

Other tutorials are not linked from the left sidebar, and have considerably fewer views (in comparison to the main page). But it seems that they keep many more readers from the first to the last page. In Swedish (sv) Wikipedia the second page has many more views than the first one, possibly because other pages in that Wikipedia link to that second page (and others). The loss from the first to the second page seems to be the biggest in Dutch Wikipedia.
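To make the comparison easier to check, here is a minimal sketch that turns the April 2011 page views from the table into percentages. The dictionary layout and the names VIEWS, share and retention are my own choices for illustration and not part of any tool; the figures are simply copied from the table.

```python
# Page views for April 2011, copied from the table above.
# Keys and structure are illustrative only.
VIEWS = {
    "nl": {"main": 6616063, "first": 22809, "second": 3655, "last": 746},
    "de": {"main": 36578909, "first": 10295, "second": 3217, "last": 1041},
    "en": {"main": 146254073, "first": 26483, "second": 13219, "last": 2572},
    "sv": {"main": 3627675, "first": 505, "second": 899, "last": 258},
}


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def retention(lang):
    """Percentages describing how many viewers a tutorial attracts and keeps."""
    v = VIEWS[lang]
    return {
        # first tutorial page views relative to main page views
        "main_to_first": share(v["first"], v["main"]),
        # second page views relative to first page views
        "first_to_second": share(v["second"], v["first"]),
        # last page views relative to first page views
        "first_to_last": share(v["last"], v["first"]),
    }


if __name__ == "__main__":
    for lang in VIEWS:
        r = retention(lang)
        print(
            f"{lang}: main->first {r['main_to_first']:.3f}%, "
            f"first->second {r['first_to_second']:.1f}%, "
            f"first->last {r['first_to_last']:.1f}%"
        )
```

Note that these are page views, not individual readers, so a value above 100% (as for the Swedish second page) only means that the later page was reached by other routes as well.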

Lessons to learn:

  • Put a link to your tutorial on your main page and in the left sidebar
  • Take good care of your first tutorial page; if it is confusing, you immediately lose a lot of people
  • Whether you have seven or nine tutorial pages does not seem to matter much

Wikipedia school drop-out

The Snelcursus is a quick guide in Dutch Wikipedia on how to edit Wikipedia. The tool stats.grok.se shows that the first page (welkom) had more than 20,000 views in April 2011. Then follows an instant and enormous drop to 3655 views on the second page, and at the end (tot slot) only 746 viewers remained.

How to explain this? Certainly, many people arrive by clicking the snelcursus link in the left sidebar of every Wikipedia article page, and they are just curious. Maybe some more viewers would continue if the text were improved?