Research paper by Informatics student accepted for OpenSym 2021 conference

[2021] A research paper ‘Extracting and Visualizing User Engagement on Wikipedia Talk Pages’ authored by Carlin MacKenzie, BSc Computer Science student at the School of Informatics, alongside Dr John R. Hott from the University of Virginia School of Engineering and Applied Science, has been accepted for the OpenSym 2021 conference.

OpenSym 2021

The paper explores the outputs of ‘Namespace Database’, an open-source extraction and visualisation tool developed by the duo as a way of parsing the vast scale of Wikipedia data.

Namespace Database automates the creation of a database of Wikipedia edits and the visualisation of the resulting database, and it points to potential research directions. Carlin found that there has historically been little community consensus on how to carry out this process, even though the datasets are publicly available.

In the extract taken for this paper, Namespace Database generated an 84GB database of edits and the first visualisations of how different user groups edit differently on Wikipedia talk pages. The authors decided to focus their research on article talk pages as they believed these would have the most interesting edits for assessing how users behave towards one another. As an example of a future research direction, the results were used to investigate the feasibility of classifying blocked users from this dataset. This approach was unsuccessful, and the authors concluded that either additional data or a more advanced model would be needed to classify them.

The paper was accepted for inclusion at the 17th International Symposium on Open Collaboration (OpenSym 2021), a conference on open collaboration research and practice, including open source, open data, open science, open education, wikis and related social media, Wikipedia, and IT-driven open innovation research.

OpenSym brings together the different strands of open collaboration research and practice, seeking to create synergies and inspire new collaborations between people from computer science, information science, social science, humanities, and everyone interested in understanding open collaboration and how it is changing the world. With attendees who come from outside the Wiki community but are still aligned with and interested in research into Wikipedia, the OpenSym conference brings Carlin and John’s paper to a new and broader audience.


I am very proud of the determination it took to publish this paper, and I'm extremely grateful for the support I've received from my supervisors abroad. There were moments that nearly stopped this project in its tracks, like when a hardware failure lost us weeks of database creation time, but our persistence has allowed me to be published as a first author.

Carlin MacKenzie
BSc Computer Science student at the School of Informatics

 OpenSym 2021 Teaser


Related links

OpenSym 2021 conference

GitHub - Namespace Database