The Internet and online social networks have amplified information diffusion processes, but at the same time they provide fertile ground for the spread of misinformation, rumors, and hoaxes. The goal of this work is to introduce a simple modeling framework to study these phenomena: following the epidemic approach and motivated by results in the literature, we treat misinformation as an instance of the more general concept of information diffusion, and we propose an adaptation of the classic SIS (Susceptible-Infected-Susceptible) model to the case of misinformation by adding two essential socio-cognitive features: forgetting and competition with fact-checking efforts. First, we focus on how the availability of debunking information may contain the diffusion of misinformation. Our approach allows us to quantitatively gauge the minimal reaction necessary to eradicate a hoax. Second, we simulate the spreading dynamics on networks with two communities of gullible and skeptical users, with different propensities to believe hoaxes and a segregation parameter that represents the sparsity of links between the two communities. Simulations show that segregation plays an important role in the diffusion of misinformation, but its effect can differ depending on the other parameters. Finally, we validate our model on Twitter data (both fake news and debunking), obtaining good results. Our encouraging findings suggest that fact-checking can still be considered useful in fighting misinformation, but also that the structure of the underlying social network strongly shapes how the spreading process evolves; further investigation in this direction is therefore necessary in order to develop new tools and solutions to limit the diffusion of fake news.
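To make the idea of an SIS-style model with forgetting and fact-checking more concrete, the following is a minimal sketch, not the speaker's actual model. It assumes a well-mixed population (no network structure, no communities) and three states: susceptible, believer, and fact-checker. The state names and the parameters `beta_b`, `beta_f`, `p_verify`, and `p_forget` are hypothetical and chosen only for illustration.

```python
def step(s, b, f, beta_b, beta_f, p_verify, p_forget):
    """Advance the population fractions (s, b, f) by one time step.

    s: susceptible, b: believers of the hoax, f: fact-checkers.
    All parameters are illustrative assumptions; keep
    p_verify + p_forget <= 1 so the fractions stay non-negative.
    """
    new_b = s * beta_b * b      # susceptible users who adopt the hoax
    new_f = s * beta_f * f      # susceptible users who adopt the debunking
    verified = p_verify * b     # believers who check the facts
    s_next = s - new_b - new_f + p_forget * (b + f)  # forgetting returns users to s
    b_next = b + new_b - verified - p_forget * b
    f_next = f + new_f + verified - p_forget * f
    return s_next, b_next, f_next


def simulate(steps, b0=0.01, **params):
    """Run the mean-field iteration from a small initial fraction of believers."""
    s, b, f = 1.0 - b0, b0, 0.0
    for _ in range(steps):
        s, b, f = step(s, b, f, **params)
    return s, b, f


if __name__ == "__main__":
    # Strong verification: the believer fraction decays towards zero.
    print(simulate(200, beta_b=0.3, beta_f=0.3, p_verify=0.5, p_forget=0.1))
    # No verification: the hoax stays endemic, as in a plain SIS model.
    print(simulate(200, beta_b=0.9, beta_f=0.1, p_verify=0.0, p_forget=0.1))
```

In this simplified setting the believer fraction dies out whenever `beta_b` is smaller than `p_verify + p_forget`, which is one way to read the "minimal reaction necessary to eradicate a hoax" mentioned in the abstract. The network simulations with gullible and skeptical communities are not reproduced here.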
PAN at CLEF 2019
Shared Tasks
- Bots and Gender Profiling
- Celebrity Profiling
- Cross-Domain Authorship Attribution
- Style Change Detection
Important Dates
- March 15, 2019: Early bird software submission
- April 15, 2019: TIRA evaluation phase opens
- May 11, 2019: TIRA evaluation phase deadline
- May 31, 2019 (extended): Paper submission
- June 07, 2019: Peer review notification
- June 28, 2019: Camera-ready participant papers submission
- tba: Early bird conference registration
- September 09-12, 2019: Conference
The timezone of all deadlines is Anywhere on Earth.
Keynotes
The practice of using opinion manipulation trolls has been a reality since the rise of the Internet and community forums. It has been shown that user opinions about products, companies and politics can be influenced by posts by other users in online forums and social networks. This makes it easy for companies and political parties to gain popularity by paying for "reputation management" to people or companies that post fake opinions from fake profiles in discussion forums and social networks.
A natural question is whether such trolls can be found and exposed automatically. This is hard because there is not enough data to train a classifier. It is possible to obtain some test data, as such trolls are sometimes caught and widely exposed, but one still needs training data. We solve the problem by assuming that a user who is called a troll by several different people is likely to be one, and one who has never been called a troll is unlikely to be such. We compare the profiles of (i) paid trolls vs. (ii) "mentioned" trolls vs. (iii) non-trolls, and we further show that a classifier trained to distinguish (ii) from (iii) also does quite well at telling apart (i) from (iii).
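The labeling assumption described above can be sketched as follows. This is only an illustration under stated assumptions, not the speaker's implementation: the function name `weak_labels`, the input format of (accuser, accused) pairs, and the threshold `min_accusers=2` standing in for "several different people" are all hypothetical.

```python
from collections import defaultdict


def weak_labels(accusations, all_users, min_accusers=2):
    """Derive weak training labels from troll accusations.

    accusations: iterable of (accuser, accused) user-id pairs.
    Returns a dict mapping user -> 1 ("mentioned" troll) or 0 (non-troll).
    Users accused by at least one but fewer than min_accusers distinct
    people are ambiguous and are left out of the training set.
    """
    accusers = defaultdict(set)
    for accuser, accused in accusations:
        if accuser != accused:  # ignore self-accusations
            accusers[accused].add(accuser)

    labels = {}
    for user in all_users:
        n = len(accusers.get(user, ()))  # number of distinct accusers
        if n >= min_accusers:
            labels[user] = 1
        elif n == 0:
            labels[user] = 0
    return labels


if __name__ == "__main__":
    pairs = [("a", "t"), ("b", "t"), ("a", "t"), ("c", "u")]
    print(weak_labels(pairs, ["a", "b", "c", "t", "u", "v"]))
```

Counting distinct accusers rather than raw mentions keeps a single persistent accuser from producing a positive label on their own.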
Program
PAN's program is part of the CLEF conference program.
September 9 | |
Best of Labs at Auditorium | |
13:45-15:00 | An Ensemble Approach to Cross-Domain Authorship Attribution
José Custódio and Ivandré Paraboni |
Labs Presentations at Auditorium | |
15:45-16:00 | Overview of PAN 2019: Bots and Gender Profiling, Celebrity Profiling, Cross-domain Authorship Attribution and Style Change Detection
Walter Daelemans, Mike Kestemont, Enrique Manjavacas, Martin Potthast, Francisco Manuel Rangel Pardo, Paolo Rosso, Günther Specht, Efstathios Stamatatos, Benno Stein, Michael Tschuggnall, Matti Wiegmann and Eva Zangerle |
September 10 | |
13:30-15:00 | Session 1 at A31, Chair: Martin Potthast |
12:00-13:30 | Poster Session during Lunch |
13:30-13:40 | PAN 2019 Welcome
Martin Potthast |
13:40-14:40 | Keynote: Exposing Paid Opinion Manipulation Trolls
Preslav Nakov |
14:40-15:00 | Overview of the Shared Task on Bots and Gender Profiling in Twitter
Francisco Rangel and Paolo Rosso |
15:00-15:30 | Break |
15:30-16:30 | Session 2 at A31, Chair: Francisco Rangel |
15:30-15:35 | Award in Bots and Gender Profiling by The Logic Value |
15:35-15:50 | Using N-grams to detect Bots on Twitter
Juan Pizarro |
15:50-16:10 | Supervised Classification of Twitter Accounts Based on Textual Content of Tweets
Fredrik Johansson |
16:10-16:30 | Overview of the Celebrity Profiling Task
Matti Wiegmann |
16:30-17:30 | Session 3 at A31, Chair: Paolo Rosso |
16:30-17:30 | Keynote: Hoax vs fact checking: understanding and predicting the diffusion of low quality information on communication networks
Giancarlo Ruffo |
18:30-22:00 | Conference Dinner |
September 11 | |
12:00-13:30 | Poster Session during Lunch |
Multi-channel Open-set Cross-domain Authorship Attribution
José Custódio and Ivandré Paraboni |
|
Bot and Gender detection of Twitter accounts using Distortion and LSA
Andrea Bacciu, Massimo La Morgia, Alessandro Mei, Eugenio Nerio Nemmi, Valerio Neri, and Julinda Stefa |
|
FOI Cross-Domain Authorship Attribution for Criminal Investigations
Fredrik Johansson and Tim Isbister |
|
UniNE at PAN-CLEF 2019: Bots and Gender Task
Catherine Ikae, Sukanya Nath, Jacques Savoy |
|
Combined CNN+RNN Bot and Gender Profiling
Rafael Felipe Sandroni Dias and Ivandré Paraboni |
|
Detecting bot accounts on Twitter by measuring message predictability
Piotr Przybyła |
|
Bots and gender profiling using masking techniques
Victor Jimenez-Villar, Javier Sánchez-Junquera, Manuel Montes-y-Gómez, Luis Villaseñor-Pineda, and Simone Paolo Ponzetto |
|
An evolutionary approach to build user representations for profiling of bots and humans in Twitter
Roberto López-Santillán, Luis Carlos González-Gurrola, Manuel Montes-y-Gómez, Graciela Ramírez-Alonso, and Olanda Prieto-Ordaz |
|
Naive-Bayesian Classification for Bot Detection in Twitter
Pablo Gamallo and Sattam Almatarneh |
|
Unsupervised pretraining for text classification using siamese transfer learning
Maximilian Bryan and J. Nathanael Philipp |
|
Author profiling using semantic and syntactic features
György Kovács, Vanda Balogh, Kumar Shridhar, Purvanshi Mehta, and Pedro Alonso |
|
Profiling Twitter users using autogenerated features invariant to data distribution
Tiziano Fagni and Maurizio Tesconi |
|
Bots and Gender Profiling using a Multi-layer Architecture
Régis Goubin, Dorian Lefeuvre, Alaa Alhamzeh, Jelena Mitrovic, Elod Egyed-Zsigmond, and Leopold Ghemmogne Fossi |
|
Bots and Gender Profiling on Twitter using Sociolinguistic Features
Edwin Puertas, Luis Gabriel Moreno-Sandoval, Flor Miriam Plaza-del-Arco, Jorge Andres Alvarado-Valencia, Alexandra Pomares-Quimbaya, and L. Alfonso Ureña-López |
|
Bots and gender profiling with convolutional hierarchical recurrent neural network
Juraj Petrik, Daniela Chuda |
|
Celebrity Profiling on Twitter using Sociolinguistic Features
Luis Gabriel Moreno-Sandoval, Edwin Puertas, Flor Miriam Plaza-del-Arco, Alexandra Pomares-Quimbaya, Jorge Andres Alvarado-Valencia, and L. Alfonso Ureña-López |
|
A Hierarchical Neural Network Approach for Bots and Gender Profiling
Andrea Cimino and Felice dell’Orletta |
|
Bots and Gender Profiling using Character Bigrams
Daniel Yacob Espinosa, Helena Gómez-Adorno, and Grigori Sidorov |
|
15:30-16:30 | Session 4 at A31, Chair: Efstathios Stamatatos |
15:30-15:45 | Overview of the Style Change Detection Task
Eva Zangerle |
15:45-16:00 | Style Change Detection by Threshold Based and Window Merge Clustering Methods
Sukanya Nath |
16:00-16:15 | Twitter User Profiling: Bot and Gender Identification
Dijana Kosmajac and Vlado Keselj |
16:15-16:30 | Twitter feeds profiling with TF-IDF
Juraj Petrik and Daniela Chuda |
16:30-16:50 | Overview of the Cross-domain Authorship Attribution Task
Mike Kestemont |
16:50-17:10 | Cross-Domain Authorship Attribution Combining Instance Based and Profile-Based Features
Andrea Bacciu, Massimo La Morgia, Alessandro Mei, Eugenio Nerio Nemmi, Valerio Neri, Julinda Stefa |
17:10-17:30 | Community discussion |
18:30-20:30 | Civic Reception |
September 12 | |
Best of CLEF for Industry at Auditorium | |
14:00-14:30 | Shared Tasks for Industry: Experiment Platforms and Author Profiling
Martin Potthast and Francisco Rangel |