Wikipedia Vandalism Detection 2011
Synopsis
- Task: Given a set of edits on Wikipedia articles, separate the ill-intentioned edits from the well-intentioned edits.
- Input: [data (de, es)] [data (en)]
Task
The definition of vandalism at Wikipedia includes "any addition, removal, or change of content made in a deliberate attempt to compromise the integrity of Wikipedia." Hence, Wikipedia vandalism detection comprises the following classification task:
Given a set of edits on Wikipedia articles, the task is to separate the ill-intentioned edits from the well-intentioned edits.
Input
To develop your approach, we provide a training corpus comprising a set of edits on Wikipedia articles. Each of these edits has been manually annotated as either vandalism or not.
Output
For all edits found in the evaluation corpora, your vandalism detector shall output a file classification.txt formatted as follows:

    OLDREVID NEWREVID C CONF FEATUREVAL1 FEATUREVAL2 FEATUREVAL3 ... FEATUREVALn
    26864258 27932250 V 0.92 0.864726878 5.054816462 0.285489458 ... 0.000000584
    28689695 87188208 R 0.50 0.642019751 3.499755645 0.123675764 ... 0.050561605
    85047080 85047157 V 0.67 0.505090519 9.306061202 0.055005005 ... 0.919051616
    80637222 91249168 R 0.43 0.964561645 1.505164514 0.469614241 ... 0.000000031
    ...

- The column OLDREVID is the edit's old revision ID.
- The column NEWREVID is the edit's new revision ID.
- The column C denotes the edit's class according to your classifier: V denotes vandalism edits and R denotes regular edits.
- The column CONF denotes your classifier's confidence. If your classifier does not return confidence values, simply use 0 and 1 according to the classifier's output.
- The columns FEATUREVAL1 to FEATUREVALn denote the n feature values your classifier has computed for the given edit. The feature values need not be normalized, and they should be unrounded. If you have non-numeric features, include them as well. Please send along a short description of each feature, and make sure that all entries in column FEATUREVALi are based on the same feature implementation.
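
The row layout described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not an official reference writer: the helper names `format_row` and `write_classification` are hypothetical, columns are assumed to be separated by single spaces, the header row in the example is assumed to be illustrative and is not written, and the 0/1 fallback is assumed to mean 1 for vandalism and 0 for regular edits.

```python
def format_row(old_rev_id, new_rev_id, is_vandalism, confidence=None, features=()):
    """Build one line of classification.txt (hypothetical helper)."""
    label = "V" if is_vandalism else "R"
    if confidence is None:
        # Assumption: without a confidence value, use 1 for V and 0 for R.
        confidence = 1 if is_vandalism else 0
    fields = [str(old_rev_id), str(new_rev_id), label, str(confidence)]
    # Feature values are written unrounded; non-numeric features are
    # included via their string form.
    fields.extend(str(value) for value in features)
    for field in fields:
        # Assumption: columns are space-separated, so a field must not
        # contain whitespace itself.
        if not field or any(ch.isspace() for ch in field):
            raise ValueError(f"field not usable as a column: {field!r}")
    return " ".join(fields)


def write_classification(rows, stream):
    """Write one line per edit to an open text stream."""
    for row in rows:
        stream.write(format_row(*row) + "\n")
```

For example, `format_row(26864258, 27932250, True, 0.92, [0.864726878, 5.054816462])` yields a line in the same shape as the first data row of the example above. Note that Python prints very small floats in scientific notation; whether that notation is acceptable is not stated in the task description.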
Evaluation
The performance of your vandalism detector will be measured by the area under its precision-recall curve (PR-AUC). Details about the measures can be found in this paper (Section 1.2).
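
To make the measure concrete, here is a minimal Python sketch of one common way to compute the area under a precision-recall curve, namely the step-wise sum also known as average precision. It is not necessarily the exact procedure used in the official evaluation (interpolation choices can change the value slightly). The function name `pr_auc` is hypothetical, and the scores are assumed to express how likely an edit is to be vandalism, with 1 marking vandalism in the ground truth.

```python
def pr_auc(labels, scores):
    """Step-wise area under the precision-recall curve.

    labels: 1 for vandalism, 0 for regular (ground truth).
    scores: higher means "more likely vandalism" (assumption).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one vandalism edit is required")
    # Rank edits from most to least suspicious.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    area = 0.0
    true_pos = 0
    prev_recall = 0.0
    i = 0
    while i < len(ranked):
        # Edits with identical scores share one threshold, so they are
        # processed together as a single point on the curve.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            true_pos += ranked[j][1]
            j += 1
        precision = true_pos / j
        recall = true_pos / n_pos
        area += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return area
```

A detector that ranks every vandalism edit above every regular edit scores 1.0 under this sketch, while lower values indicate regular edits ranked above vandalism.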
Results
The following table lists the performances achieved by the participating teams:
English Wikipedia vandalism detection performance

PR-AUC | Participant |
---|---|
0.82230 | A.G. West and I. Lee, University of Pennsylvania, USA |
0.42464 | C.-A. Drăguşanu, M. Cufliuc, and A. Iftene, AL.I.Cuza University, Romania |

German Wikipedia vandalism detection performance

PR-AUC | Participant |
---|---|
0.70591 | A.G. West and I. Lee, University of Pennsylvania, USA |
0.18978 | F.G. Aksit, Maastricht University, Netherlands |

Spanish Wikipedia vandalism detection performance

PR-AUC | Participant |
---|---|
0.48938 | A.G. West and I. Lee, University of Pennsylvania, USA |
0.22077 | F.G. Aksit, Maastricht University, Netherlands |

A more detailed analysis of the detection performances can be found in the overview paper accompanying this task.
Related Work
- Wikipedia Vandalism Detection, PAN @ CLEF'10
- B. Thomas Adler, Luca de Alfaro, Santiago M. Mola-Velasco, Paolo Rosso, and Andrew G. West. Wikipedia Vandalism Detection: Combining Natural Language, Metadata, and Reputation Features. In Proceedings of the 12th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing'11), Tokyo, Japan, 2011.
- WikiTrust: A reputation system for Wikipedia authors and content [web, source, API]
- Spatio Temporal Processing on Wikipedia (Stiki) [web, source]
- ClueBot NG [web]