Multi-Author Writing Style Analysis 2025
Synopsis
- Task: Given a document, determine at which positions the author changes.
- Input: Reddit comments, combined into documents [data].
- Output: Where does authorship change on the paragraph level [validator].
- Evaluation: F1 [code].
- Submission: Deployment on TIRA [submit].
Task
tba.
Data [download]
tba.
Evaluation [code]
Submissions are evaluated by the macro-averaged F1-score across all paragraph pairs. The solutions for each dataset are evaluated independently, so each dataset receives its own score.
We provide a script that computes the F1-score from the produced output files [evaluator and tests].
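As a rough illustration of the measure, the following sketch computes a macro-averaged F1-score over binary labels for consecutive paragraph pairs. It is not the official evaluator. The label encoding (1 for an author change, 0 for no change) and the function names f1_for_class and macro_f1 are assumptions made for this example, and the official script may treat edge cases such as absent classes differently.

```python
from typing import Sequence


def f1_for_class(truth: Sequence[int], pred: Sequence[int], positive: int) -> float:
    """F1-score when `positive` is treated as the positive class."""
    tp = sum(1 for t, p in zip(truth, pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(truth, pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(truth, pred) if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero or undefined.
        # Returning 0.0 here is an assumption, not the official behaviour.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(truth: Sequence[int], pred: Sequence[int], labels=(0, 1)) -> float:
    """Unweighted mean of the per-class F1-scores."""
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same length")
    return sum(f1_for_class(truth, pred, c) for c in labels) / len(labels)


# Hypothetical labels for four consecutive paragraph pairs
# (assumed encoding: 1 = author change, 0 = no change).
truth = [0, 1, 0, 1]
pred = [0, 1, 1, 1]
score = macro_f1(truth, pred)
```

Because the macro average weights both classes equally, a system that always predicts "no change" scores 0 on the change class and cannot exceed 0.5 overall.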
Submission
Once you finish tuning your approach on the validation set, your software will be tested on the test set. During the competition, the test sets will not be released publicly. Instead, we ask you to submit your software for evaluation at our site as follows.
We ask you to prepare your software so that it can be executed via command-line calls. The command shall take as input (i) an absolute path to the directory of the test corpora and (ii) an absolute path to an empty output directory:
mySoftware -i INPUT-DIRECTORY -o OUTPUT-DIRECTORY
Within INPUT-DIRECTORY, you will find the set of problem instances (i.e., problem-[id].txt files) for each of the three datasets, respectively. For each problem instance you should produce the solution file solution-problem-[id].json in the respective OUTPUT-DIRECTORY. For instance, you read INPUT-DIRECTORY/problem-12.txt, process it, and write your results to OUTPUT-DIRECTORY/solution-problem-12.json.
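The calling convention above could be implemented roughly as follows. This is a minimal sketch, not a reference implementation, and it rests on several assumptions that this page does not confirm: that paragraphs are separated by line breaks, that the datasets may sit in subdirectories of INPUT-DIRECTORY (mirrored here in the output), and that a solution file holds a JSON object with a "changes" list containing one 0/1 entry per consecutive paragraph pair. The helper predict_changes is a hypothetical placeholder that always predicts "no change".

```python
import argparse
import json
from pathlib import Path


def predict_changes(paragraphs):
    """Hypothetical placeholder model: predicts no author change anywhere."""
    return [0] * max(len(paragraphs) - 1, 0)


def process_directory(input_dir: Path, output_dir: Path) -> int:
    """Write one solution file per problem file; return the number processed."""
    count = 0
    for problem_file in sorted(input_dir.rglob("problem-*.txt")):
        text = problem_file.read_text(encoding="utf-8")
        # Assumption: one paragraph per non-empty line.
        paragraphs = [line for line in text.splitlines() if line.strip()]
        # Mirror any dataset subdirectory inside the output directory.
        target_dir = output_dir / problem_file.parent.relative_to(input_dir)
        target_dir.mkdir(parents=True, exist_ok=True)
        # problem-12.txt -> solution-problem-12.json
        solution_file = target_dir / f"solution-{problem_file.stem}.json"
        # Assumption: the "changes" key is not specified on this page.
        solution = {"changes": predict_changes(paragraphs)}
        solution_file.write_text(json.dumps(solution), encoding="utf-8")
        count += 1
    return count


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(description="Sketch of a submission entry point")
    parser.add_argument("-i", dest="input_dir", required=True, type=Path,
                        help="absolute path to the directory of the test corpora")
    parser.add_argument("-o", dest="output_dir", required=True, type=Path,
                        help="absolute path to an empty output directory")
    args = parser.parse_args(argv)
    return process_directory(args.input_dir, args.output_dir)
```

To run it as mySoftware -i INPUT-DIRECTORY -o OUTPUT-DIRECTORY, the script would additionally call main() under the usual if __name__ == "__main__" guard. The output format should be checked against the official validator before submitting.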
In general, this task follows PAN's software submission strategy described here.
Note: By submitting your software you retain full copyrights. You agree to grant us usage rights only for the PAN competition. We agree not to share your software with a third party or use it for other purposes than the PAN competition.
Related Work
- Style Change Detection, PAN@CLEF'23
- Style Change Detection, PAN@CLEF'22
- Style Change Detection, PAN@CLEF'21
- Style Change Detection, PAN@CLEF'20
- Style Change Detection, PAN@CLEF'19
- Style Change Detection, PAN@CLEF'18
- Style Breach Detection, PAN@CLEF'17
- PAN@CLEF'16 (Clustering by Authorship Within and Across Documents and Author Diarization section)
- J. Cardoso and R. Sousa. Measuring the performance of ordinal classification. International Journal of Pattern Recognition and Artificial Intelligence 25.08, pp. 1173-1195, 2011
- Benno Stein, Nedim Lipka and Peter Prettenhofer. Intrinsic Plagiarism Analysis. In Language Resources and Evaluation, Volume 45, Issue 1, pages 63-82, 2011.
- Efstathios Stamatatos. A Survey of Modern Authorship Attribution Methods. Journal of the American Society for Information Science and Technology, Volume 60, Issue 3, pages 538-556, March 2009.