Outputs

Our students have produced various outputs beyond what is required to be awarded their degrees.

Publications

This section lists our students' publications by year. As an exception to normal citation practice, the student's name is given first and in bold:

University of Edinburgh Research Explorer

Edinburgh Research Archive (Doctoral Thesis)

Agostina CALABRESE, Leonardo Neves, Neil Shah, Maarten W. Bos, Björn Ross, Mirella Lapata, Francesco Barbieri. Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster. In ACL 2024.

Ronald CARDENAS, Matthias Galle, Shay B. Cohen. On the Trade-off between Redundancy and Local Coherence in Summarization. In JAIR 2024.

Gautier DAGAN, Gabriel Synnaeve, Baptiste Roziere. Getting the most out of your tokenizer for pre-training and domain adaptation. In ICML 2024.

Verna DANKERS, Ivan Titov. Generalisation First, Memorisation Second? Memorisation Localisation for Natural Language Classification Tasks. In ACL Findings 2024.

Lauren FLETCHER, Jennifer Culbertson, Hugh Rabagliati. Communicative efficiency and social biases modulate language learning in autistic and allistic individuals. In EVOLANG 2024.

Shangmin GUO, Samuel Garcin, James Doran, Christopher G Lucas, Stefano V Albrecht. DRED: Zero-Shot Transfer in Reinforcement Learning via Data-Regularised Environment Design. In ICML 2024.

Shangmin GUO, Tianlin Liu, Leonardo Bianco, Daniele Calandriello, Quentin Berthet, Felipe Llinares, Jessica Hoffmann, Lucas Dixon, Michal Valko, Mathieu Blondel. Decoding-time Realignment of Language Models. In ICML 2024.

Wenyu HUANG, André Melo, Jeff Z. Pan. A Large-scale Offer Alignment Model for Partitioning Filtering and Matching Product Offers. In SIGIR 2024.

Amr KELEG, Walid Magdy, Sharon Goldwater. Estimating the Level of Dialectness Predicts Interannotator Agreement in Multi-dialect Arabic Datasets. In ACL 2024.

Amr KELEG, Muhammad Abdul-Mageed, AbdelRahim Elmadany, Chiyu Zhang, Injy Hamed, Walid Magdy, Houda Bouamor, Nizar Habash. NADI 2024: The Fifth Nuanced Arabic Dialect Identification Shared Task. In ArabicNLP 2024.

Matthias LINDEMANN, Alexander Koller, Ivan Titov. SIP: Injecting a Structural Inductive Bias into a Seq2Seq Model by Simulation. In ACL 2024.

Matthias LINDEMANN, Guillem Ramírez, Alexandra Birch, Ivan Titov. Cache & Distil: Optimising API Calls to Large Language Models. In ACL Findings 2024.

Oli Danyi LIU, Hao Tang, Naomi Feldman, Sharon Goldwater. A predictive learning model can simulate temporal dynamics and context effects found in neural representations of continuous speech. In CogSci 2024.

Oli Danyi LIU, Mukhtar Mohamed, Hao Tang, Sharon Goldwater. Orthogonality and isotropy of speaker and phonetic information in self-supervised speech representations. In Interspeech 2024.

Nicolas NAVARRE, Can Konuk, Neil Bramley, Salvador Mascarenhas. Functional Rule Inference from Causal Selection Explanations. In CogSci 2024.

Nicolas NAVARRE, Can Konuk, Salvador Mascarenhas. Effects of causal structure and evidential impact on probabilistic reasoning. In CogSci 2024.

Piotr NAWROT, Adrian Łańcucki, Marcin Chochowski, David Tarjan, Edoardo M. Ponti. Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference. In ICML 2024.

Alice ROSS, Martin Corley, Catherine Lai. Is there an uncanny valley for speech? Investigating listeners’ evaluations of realistic TTS voices. In ISCA Speech Prosody 2024.

Ariadna SANCHEZ, Alice ROSS, Nina Markl. Beyond the binary: Limitations and possibilities of gender-related speech technology research. In IEEE SLT 2024.

Atli Thor SIGURGEIRSSON, Simon King. Controllable Speaking Styles Using A Large Language Model. In ICASSP 2024.

Atli Thor SIGURGEIRSSON, Eddie L. UNGLESS. Just Because We Camp, Doesn't Mean We Should: The Ethics of Modelling Queer Voices. In Interspeech 2024.

Atli Thor SIGURGEIRSSON, Himanshu Maurya. A Human-in-the-Loop Approach to Improving Cross-Text Prosody Transfer. In Interspeech 2024.

Sydelle de SOUZA, Francis Mollica, Jennifer Culbertson. What can L1 speakers tell us about killing hope? A Novel Behavioral Measure for Identifying Collocations. In CogSci 2024.

Sydelle de SOUZA, Mattia Opper. Starting Small, After All? Curriculum Learning with Child-Directed Speech. In CogSci 2024.

Siqi SUN, Korin Richmond. Learning Pronunciation from Other Accents via Pronunciation Knowledge Transfer. In Interspeech 2024.

Mengyu WANG, Tiejun Ma. MANA-Net: Mitigating Aggregated Sentiment Homogenization with News Weighting for Enhanced Market Prediction. In CIKM 2024.

Yu ZHAO, Yuanbin Qu, Konrad Staniszewski, Szymon Tworkowski, Wei Liu, Piotr Miłoś, Yuxiang Wu, Pasquale Minervini. Analysing The Impact of Sequence Composition on Language Model Pre-Training. In ACL 2024.

Zheng ZHAO, Pinzhen Chen, Shun Shao. CHER at KSAA-CAD 2024: Compressing Words and Definitions into the Same Space for Arabic Reverse Dictionary. In ArabicNLP 2024.

Zheng ZHAO, Yifu Qiu, Yftah Ziser, Anna Korhonen, Edoardo Ponti, Shay Cohen. Are Large Language Models Temporally Grounded? In NAACL 2024.

Zheng ZHAO, Manuj Malik, Marcio Fonseca, Shrisha Rao, Shay Cohen. CivilSum: A Dataset for Abstractive Summarization of Indian Court Decisions. In SIGIR 2024.

Laurie BURCHELL, Alexandra Birch, Nikolay Bogoychev, and Kenneth Heafield. An Open Dataset and Model for Language Identification. In ACL 2024.

Sandrine CHAUSSON, Ameer Saadat-Yazdi, Xue Li, Jeff Z Pan, Vaishak Belle, Nadin Kökciyan, Björn Ross. A Web-based Tool for Detecting Argument Validity and Novelty. In AAMAS 2024.

Jie CHI, Debasmita Bhattacharya, Julia Hirschberg, Peter Bell. Capturing Formality in Speech Across Domains and Languages. In Interspeech 2024.

Jie CHI, Brian Lu, Jason Eisner, Peter Bell, Preethi Jyothi, Ahmed M. Ali. Unsupervised Code-switched Text Generation from Parallel Text. In Interspeech 2024.

Jie CHI, Ahmed Ali, Shammur Chowdhury, Lucas Ondel, Matthew Wiesner, Ondrej Klejch, Kenton Murray, Amir Hussein, Injy Hamed, Lea-Marie Lam-Yee-Mui, Barah Fazili, Brian Yan, Electra Wallington, Brian Lu, Debasmita Bhattacharya, Dorsa Zeinali, Oumnia Chellah, Danielle Cartagenes, Peter Bell, Nizar Habash, Preethi Jyothi, Sunayana Sitaram, Ravi Mamindlapalli, Shinji Watanabe, Jan Trmal, Najim Dehak, Sanjeev Khudanpur. Multilingual and Code-Switching Speech Recognition. Final report of Eighth Frederick Jelinek Memorial Summer Workshop.

Verna DANKERS, Christopher G. Lucas. Non-Compositionality in Sentiment: New Data and Analyses. In ACL Findings 2024.

Verna DANKERS, Ivan Titov, Dieuwke Hupkes. Memorisation Cartography: Mapping out the Memorisation-Generalisation Continuum in Neural Machine Translation. In EMNLP 2023.

Stephanie DROOP, Neil Bramley. Extending counterfactual reasoning models to capture unconstrained social explanations. In ICML 2024 Workshop on Counterfactuals in Minds and Machines.

Stephanie DROOP, Max Taylor-Davies, Chris Lucas. Selective imitation on the basis of reward function similarity. In CogSci 2023.

Lauren FLETCHER, Jennifer Culbertson & Hugh Rabagliati. How communicative efficiency and social biases shape language in autistic and allistic learners. In CogSci 2023.

Balint GYEVNAR, Nick FERGUSON, Burkhard Schafer. Bridging the Transparency Gap: What Can Explainable AI Learn From the AI Act? In ECAI 2023.

Balint GYEVNAR, Cheng Wang, Christopher G. Lucas, Shay B. Cohen and Stefano V. Albrecht. Causal Explanations for Sequential Decision-Making in Multi-Agent Systems. In AAMAS 2024.

Tom HOSKING, Tom Sherborne, Mirella Lapata. Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing. In TACL 2024.

Tom HOSKING, Hao Tang, Mirella Lapata. Attributable and Scalable Opinion Summarization. In ACL 2024.

Wenyu HUANG, Mirella Lapata, Pavlos Vougiouklis, Nikos Papasarantopoulos, Jeff Z. Pan. Retrieval Augmented Generation with Rich Answer Encoding. In IJCNLP-AACL 2023.

Parag JAIN, Mirella Lapata. Conversational Semantic Parsing using Dynamic Context Graph. In EMNLP 2023.

Amr KELEG, Walid Magdy. Arabic Dialect Identification under Scrutiny: Limitations of Single-label Classification. In ArabicNLP 2023.

Amr KELEG, Sharon Goldwater, Walid Magdy. ALDi: Quantifying the Arabic Level of Dialectness of Text. In EMNLP 2023.

Amr KELEG, Walid Magdy. DLAMA: A Framework for Curating Culturally Diverse Facts for Probing the Knowledge of Pretrained Language Models. In ACL Findings 2023.

Matthias LINDEMANN, Alexander Koller, Ivan Titov. Compositional Generalization without Trees using Multiset Tagging and Latent Permutations. In ACL 2023.

Matthias LINDEMANN, Alexander Koller, Ivan Titov. Compositional Generalisation with Structured Reordering and Fertility Layers. In EACL 2023.

Danyang LIU and Frank Keller. Detecting and Grounding Important Characters in Visual Stories. In AAAI 2023.

Oli Danyi LIU, Hao Tang, Sharon Goldwater. Self-supervised Predictive Coding Models Encode Speaker and Phonetic Information in Orthogonal Subspaces. In Interspeech 2023.

Nina MARKL, Electra Wallington, Ondrej Klejch, Thomas Reitmaier, Gavin Bailey, Jennifer Pearson, Matt Jones, Simon Robinson, Peter Bell. Automatic Transcription and (De)Standardisation. In ELRA/ISCA SIGUL 2023.

Nina MARKL and Catherine Lai. Everyone has an Accent. In Interspeech 2023.

Nina MARKL, Ramon Sanabria, Nikolay Bogoychev, Andrea Carmantini, Ondrej Klejch, Peter Bell. The Edinburgh International Accents of English Corpus: Towards the Democratization of English ASR. In ICASSP 2023.

Nina MARKL. "I can’t see myself ever living any[w]ere else": Variation in (HW) in Edinburgh English. In Language Variation and Change 2023.

Nina MARKL, Thomas Reitmaier, Electra Wallington, Ondřej Klejch, Léa-Marie Lam-Yee-Mui, Jennifer Pearson, Matt Jones, Peter Bell, Simon Robinson. Situating Automatic Speech Recognition Development within Communities of Under-heard Language Speakers. In CHI 2023.

Nicole MENG-SCHNEIDER, Rabia Yasa Kostas, Kami Vaniea and Maria K Wolters. Multi-User Smart Speakers – A Narrative Review of Concerns and Problematic Interactions. In CHI 2023.

Nikita MOGHE, Tom Sherborne, Mark Steedman, Alexandra Birch. Extrinsic Evaluation of Machine Translation Metrics. In ACL 2023.

Nikita MOGHE, Evgeniia Razumovskaia, Liane Guillou, Ivan Vulić, Anna Korhonen, Alexandra Birch. MULTI3NLU++: A Multilingual, Multi-Intent, Multi-Domain Dataset for Natural Language Understanding in Task-Oriented Dialogue. In ACL Findings 2023.

Piotr NAWROT. nanoT5: A PyTorch Framework for Pre-training and Fine-tuning T5-style Models with Limited Resources. In EMNLP 2023 Workshop NLP-OSS.

Piotr NAWROT, Jan Chorowski, Adrian Łańcucki, Edoardo M. Ponti. Efficient Transformers with Dynamic Token Pooling. In ACL 2023.

Piotr NAWROT, Jean Kaddour, Oscar Key, Pasquale Minervini, Matt J. Kusner. No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models. In NeurIPS 2023.

Nicholas SANDERS, Korin Richmond. Recovering Discrete Prosody Inputs via Invert-Classify. In ISCA 2023 Speech Synthesis Workshop.

Atli SIGURGEIRSSON, Simon King. Do Prosody Transfer Models Transfer Prosody? In ICASSP 2023.

Atli SIGURGEIRSSON, Simon King. Using a large language model to control speaking style for expressive TTS. In ISCA 2023 Speech Synthesis Workshop.

Siqi SUN, Korin Richmond, Hao Tang. Improving Seq2Seq TTS Frontends With Transcribed Speech Audio. In IEEE/ACM Transactions on Audio, Speech, and Language Processing 2023.

Eddie UNGLESS, Charlotte Bird (joint first authors) and Atoosa Kasirzadeh. Typology of Risks of Generative Text-to-Image Models. In AAAI/ACM Conference on AI, Ethics, and Society 2023.

Eddie UNGLESS, Seraphina Goldfarb-Tarrant, Esma Balkir and Su Lin Blodgett. This Prompt is Measuring <MASK>: Evaluating Bias Evaluation in Language Models. In ACL Findings 2023.

Eddie UNGLESS, Bjorn Ross and Anne Lauscher. Stereotypes and Smut: The (Mis)representation of Non-cisgender Identities by Text-to-Image Models. In ACL Findings 2023.

Eddie UNGLESS, Bjorn Ross and Vaishak Belle. Potential Pitfalls With Automatic Sentiment Analysis: The Example of Queerphobic Bias. In Social Science Computer Review 2023.

Ivan VEGNER, Souza, S., Doumas, L., & Mollica, F. What can MINERVA2 tell us about killing hope? Investigating L2 Collocational Processing with a Memory Model. In CogSci 2023.

Irene E. WINTHER, Yevgen Matusevych, Martin J. Pickering. Can learning explain cognate effects in bilingual comprehension and production?. In Bilingualism through the Prism of Psycholinguistics: In honour of Albert Costa John 2023.

Zheng ZHAO, Yftah Ziser, Bonnie Webber, and Shay Cohen. A Joint Matrix Factorization Analysis of Multilingual Representations. In ACL Findings 2023.

Zheng ZHAO, Ashok Urlana, Pinzhen Chen, Shay Cohen, Manish Shrivastava, Barry Haddow. PMIndiaSum: Multilingual and Cross-lingual Headline Summarization for Languages in India. In ACL Findings 2023.

Anil BATRA, Shreyank N Gowda, Frank Keller, Laura Sevilla-Lara. A Closer Look at Temporal Ordering in the Segmentation of Instructional Videos. In BMVC 2022.

Laurie BURCHELL, Alexandra Birch, and Kenneth Heafield. Exploring diversity in back translation for low-resource machine translation. In ACL 2022 Workshop DeepLo.

Agostina CALABRESE, Björn Ross, Mirella Lapata. Explainable Abuse Detection as Intent Classification and Slot Filling. In TACL 2022.

Georgia-Ann CARTER, Mante S. Nieuwland. Predicting Definite and Indefinite Referents during Discourse Comprehension: Evidence from Event‐Related Potentials. In Cognitive Science 2022.

Sandrine CHAUSSON, Ameer Saadat-Yazdi, Xue Li, Vaishak Belle, Björn Ross, Jeff Z Pan, Nadin Kökciyan. KEViN: A knowledge enhanced validity and novelty classifier for arguments. In COLING 2022 Workshop ArgMining.

Jie CHI, Peter Bell. Improving code-switched ASR with linguistic information. In COLING 2022.

Verna DANKERS, Ivan Titov. Recursive Neural Networks with Bottlenecks Diagnose (Non-)Compositionality. In ACL Findings 2022.

Verna DANKERS, Chris Lucas, Ivan Titov. Can Transformer be too compositional? Analysing idiom processing in neural machine translation. In ACL 2022.

Verna DANKERS, Elia Bruni, Dieuwke Hupkes. The paradox of the compositionality of natural language: a neural machine translation case study. In ACL 2022.

Stephanie DROOP, Neil Bramley. Inferring epistemic intention in simulated physical micro-worlds. In CogSci 2022.

Nick FERGUSON, Liane Guillou, Kwabena Nuamah, Alan Bundy. Investigating the use of Paraphrase Generation for Question Reformulation in the FRANK QA system. In IJCLR 2022 Workshop on Human-Like Computing.

Shangmin GUO, Yi Ren, Kory Mathewson, Simon Kirby, Stefano V Albrecht, Kenny Smith. Expressivity of Emergent Language is a Trade-off between Contextual Complexity and Unpredictability. In ICLR 2022.

Shangmin GUO, Yi Ren, Danica J Sutherland. Better Supervisory Signals by Observing Learning Paths. In ICLR 2022.

Shangmin GUO, I H Ahmed, C Brewitt, I Carlucho, F Christianos, M Dunion, E Fosong, S Garcin, B Gyevnar, T McInroe, G Papoudakis, A Rahman, L Schäfer, M Tamborski, G Vecchio, C Wang, S V Albrecht. Deep reinforcement learning for multi-agent interaction. In AI Communications 2022.

Balint GYEVNAR, Gautier Dagan, Coleman Haley, Shangmin Guo, and Francis Mollica. Communicative Efficiency or Iconic Learning: Do communicative and acquisition pressures interact to shape colour-naming systems? In Entropy 2022.

Balint GYEVNAR, Massimiliano Tamborski, Cheng Wang, Christopher G. Lucas, Shay B. Cohen, and Stefano V. Albrecht. A Human-Centric Method for Generating Causal Explanations in Natural Language for Autonomous Vehicle Motion Planning. In IJCAI 2022 Workshop on Artificial Intelligence for Autonomous Driving.

Balint GYEVNAR. Cars that Explain: Building Trust in Autonomous Vehicles through Explanations and Conversations. In IEEE ITSS 2022 Shape the Future of ITS Competition.

Tom HOSKING, Hao Tang, Mirella Lapata. Hierarchical Sketch Induction for Paraphrase Generation. In ACL 2022.

Amr KELEG, Walid Magdy. SMASH at Qur’an QA 2022: Creating Better Faithful Data Splits for Low-resourced Question Answering Scenarios. In LREC 2022 Workshop OSACT.

Amr KELEG, Matthias LINDEMANN, Danyang LIU, Wanqiu LONG, Bonnie Webber. Automatically Discarding Straplines to Improve Data Quality for Abstractive News Summarization. In ACL 2022 Workshop NLPPower.

Faheem KIREFU, Laurie BURCHELL, Vivek Iyer, Pinzhen Chen. The University of Edinburgh’s Submission to the WMT22 Code-Mixing Shared Task (MixMT). In WMT 2022.

Nina MARKL, Thomas Reitmaier, Electra Wallington, Ondřej Klejch, Léa-Marie Lam-Yee-Mui, Jennifer Pearson, Matt Jones, Peter Bell, Simon Robinson. Situating Automatic Speech Recognition Development within Communities of Under-heard Language Speakers. In CHI 2022.

Nina MARKL. (Commercial) Automatic Speech Recognition as a Tool in Sociolinguistic Research. In University of Pennsylvania Working Papers in Linguistics 2022.

Nina MARKL, Claire Cowie, Lauren Hall-Lew, Zuzana Elliott, Anita Klingler, Stephen Joseph McNulty. Imagining the city in lockdown: Place in the COVID-19 self-recordings of the Lothian Diary Project. In Frontiers in Artificial Intelligence 2022.

Nina MARKL. Language variation and algorithmic bias: understanding algorithmic bias in British English automatic speech recognition. In FAccT 2022.

Nina MARKL. Mind the data gap(s): Investigating power in speech and language datasets. In ACL 2022 Workshop LTEDI.

Nina MARKL, Stephen Joseph McNulty. Language technology practitioners as language managers: arbitrating data bias and predictive bias in ASR. In LREC 2022.

Nikita MOGHE, Chantal Amrhein and Liane Guillou. ACES: Translation Accuracy Challenge Sets for Evaluating Machine Translation Metrics. In WMT 2022.

Rimvydas RUBAVICIUS, Alex Lascarides. Interactive symbol grounding with complex referential expressions. In NAACL 2022.

Fatemeh TARIGHAT, Walid Magdy, Martin Corley. Understanding Fillers May Facilitate Automatic Sarcasm Comprehension: A Structural Analysis of Twitter Data and a Participant Study. In SemDial 2022.

Eddie UNGLESS, Amy Rafferty, Hrichika Nag, Bjorn Ross. A Robust Bias Mitigation Procedure Based on the Stereotype Content Model. In EMNLP 2022 Workshop NLP+CSS.

Dan WELLS, Hao Tang, Korin Richmond. Phonetic Analysis of Self-supervised Representations of English Speech. In Interspeech 2022.

Dan WELLS, Aidan Pine, Nathan Brinklow, Patrick Littell, Korin Richmond. Requirements and Motivations of Low-Resource Speech Synthesis for Language Revitalization. In ACL 2022.

Irene WINTHER, Stefan L. Frank, Xavier Hinaut, Edith Kaan, Yung Han Khoe, Lin Chen, Yevgen Matusevych. Bilingual sentence processing: When models meet experiments. In CogSci 2022.

Zheng ZHAO, Yftah Ziser, and Shay Cohen. Understanding Domain Learning in Language Models Through Subpopulation Analysis. In ACL 2022 Workshop BlackboxNLP.

Laurie BURCHELL, Pinzhen Chen, Jindrich Helcl, Ulrich Germann, Nikolay Bogoychev, Antonio Valerio Miceli Barone, Jonas Waldendorf, Alexandra Birch, Kenneth Heafield. The University of Edinburgh’s English-German and English-Hausa Submissions to the WMT21 News Translation Task. In WMT 2021.

Agostina CALABRESE, Michele Bevilacqua, Björn Ross, Rocco Tripodi, Roberto Navigli. AAA: Fair Evaluation for Abuse Detection Systems Wanted. In Web Science 2021.

Henry CONKLIN, Bailin Wang, Kenny Smith and Ivan Titov. Meta-Learning to Compositionally Generalize. In ACL 2021.

Verna DANKERS, Anna Langedijk, Kate McCurdy, Adina Williams, and Dieuwke Hupkes. Generalising to German Plural Noun Classes, from the Perspective of a Recurrent Neural Network. In COLING 2021.

Radina DOBREVA, Frank Keller. Investigating Negation in Pre-trained Vision-and-language Models. In EMNLP 2021 Workshop BlackboxNLP.

Tom HOSKING, Mirella Lapata. Factorising Meaning and Form for Intent-preserving Paraphrasing. In ACL 2021.

Nina MARKL, Lauren Hall-Lew, Claire Cowie, Stephen Joseph McNulty, Shan-Jan Sarah Liu, Catherine Lai, Clare Llewellyn, Beatrice Alex, Nini Fang, Zuzana Elliott, and Anita Klingler. The Lothian Diary Project: Investigating the Impact of the Covid-19 Pandemic on Edinburgh and Lothian Residents. In Journal of Open Humanities Data 2021.

Nina MARKL, Catherine Lai. Context-sensitive evaluation of automatic speech recognition: considering user experience & language variation. In EACL 2021 Workshop HCINLP.

Nicole MENG, Kholoud Althobaiti, Kami Vaniea. I Don’t Need an Expert! Making URL Phishing Features Human Comprehensible. In CHI 2021.

Nicole MENG, Dilara Keküllüoğlu, Kami Vaniea. Owning and Sharing: Privacy Perceptions of Smart Speaker Users. In ACM Hum.-Comput. Interact. 5, CSCW1, Article 45 (April 2021).

Nikita MOGHE, Mark Steedman and Alexandra Birch. Cross-lingual Intermediate Fine-tuning improves Dialogue State Tracking. In EMNLP 2021.

Dan WELLS, Pilar Oplustil-Gallegos, Simon King. The CSTR entry to the Blizzard Challenge 2021. In Blizzard Challenge 2021 Workshop.

Dan WELLS, Korin Richmond. Cross-lingual Transfer of Phonological Features for Low-Resource Speech Synthesis. In ISCA 2021 Speech Synthesis Workshop.

Irene WINTHER, Yevgen Matusevych, Martin J. Pickering. Cumulative frequency can explain cognate facilitation in language models. In CogSci 2021.

Zheng ZHAO, Bonnie Webber. Revisiting Shallow Discourse Parsing in the PDTB-3: Handling Intra-sentential Implicits. In EMNLP 2021 Workshop CODI.

Laurie BURCHELL, Jie CHI, Tom HOSKING, Nina MARKL, Bonnie Webber. Querent Intent in Multi-Sentence Questions. In COLING 2020 Workshop LAW.

Shangmin GUO, Yi Ren, Agnieszka Słowik, Kory Mathewson. Inductive Bias and Language Expressivity in Emergent Communication. In NeurIPS 2020 Workshop on Emergent Communication.

Wanqiu LONG, Bonnie Webber, Deyi Xiong. TED-CDB: A Large-Scale Chinese Discourse Relation Dataset on TED Talks. In EMNLP 2020.

Nikita MOGHE, Christian Hardmeier and Rachel Bawden. The University of Edinburgh-Uppsala University’s Submission to the WMT 2020 Chat Translation Task. In WMT 2020.

Other Outputs

Besides publications, our students have engaged in a variety of other activities reflecting the aims of the CDT:

Laurie BURCHELL

Poster – Exploring diversity in back translation for low-resource machine translation, ILCC Poster Session, Informatics Forum, July 2022

Talk - CDT NLP Industry Day, February 2021

Talk – research presentation to funder community (+100), online, December 2020

CDT NLP Ambassador – WomenInAI Open Day, online, October 2020

Agostina CALABRESE

Workshop - Co-organiser of 2nd Workshop on Novel Evaluation Approaches for Text Classification Systems (NEATCLasS), co-located with ICWSM, June 2023

Hate Speech Detection - Slack workspace connecting researchers from academia and industry, independently from their institution, 2022

Activity Host - introducing children to computer science at Informatics Circle/zoom, February 2022, December 2021, July 2021, June 2021

Microblogger – volunteer for the ACL 2021 conference, tweeting about papers in both English and Italian, August 2021

Poster - How to evaluate abuse detection systems, 2nd ELLIS NLP workshop, July 2021

Ronald CARDENAS

Poster – On the Trade-off between Redundancy and Local Coherence in Summarization, ILCC Poster Session, Informatics Forum, July 2022

Co-organiser - 'Edinburgh Informatics Circle', with Dr S Cohen and a fellow PhD student, introducing basic concepts of computer science to children (5-16 years), 2020 & 2021

Poster - Unsupervised Extractive Summarization by Human Memory Simulation, the European Laboratory for Learning and Intelligent Systems (ELLIS-NLP 2021), February 2021

Poster - Unsupervised Extractive Summarization by Human Memory Simulation, the Advanced Language Processing Winter School (ALPS 2021), January 2021

Poster - A Grounded Unsupervised Universal Part-of-Speech Tagger for Low-Resource Languages, The 71st Language at Edinburgh Lunch, SoI, February 2020

Seminar Talk – Universal Morphological Analysis using Reinforcement Learning, Charles University, Prague, February 2020

Seminar Talk – Morphological Process Transduction: Towards Interpretable Multi-lingual Morphological Analysis, University of Malta, November 2019

Iona CARSLAW

Poster - LLMs and Reflexive binding: do LLMs encode native speakers judgements? Göttingen University – Göttingen Summer School, August 2024

Talk - LLMs and Reflexive Binding. Aberdeen University - NESS (North East Syntax Seminar), May 2024

Posters – Member of Language Lunch committee, organising two poster presentations per academic year on language related research from the postgrad community

Georgia-Ann CARTER

Talk - Leveraging context for perceptual prediction using word embeddings. Multi-word expressions reading group/NLP seminars at University of Sheffield, online, August 2023

Poster - Leveraging context for perceptual prediction using word embeddings. CogSci, online, July 2023

Mentor - Cowrie Scholarship Foundation (for disadvantaged Black British university students) 2022/23

Poster - Effects of global discourse coherence on local contextual predictions. Architectures and Mechanisms of Language Processing 28, York, September 2022

Organiser - 75th Language Lunch, online, September 2022

Open Research Facilitator - Open Science and Git Workshop. Language and Interaction Network, Edinburgh, September 2022

Poster - Discourse coherence modulates use of predictive processing during sentence comprehension. Language and Interaction Network, Edinburgh, September 2022

Poster - Effects of global discourse coherence on local contextual predictions: @ Edinburgh Open Research Conference, May 2022; @ the Architectures and Mechanisms of Language Processing September 2022; @ CogSci July 2021

I’m a Scientist - Q&A with school students, online, General Science Red Zone, April 2021

Jie CHI

Talk - Language modelling for code-switching, CDT NLP Industry Day, March 2023

Henry CONKLIN

Talk - Inductive Biases & Compositional Generalization, Centre for Language Evolution Seminar Series, May 2021

Poster & Talk: Modelling Positional Compositional Structure. Conference of the Cognitive Science Society 2020

Tutor - Centre for Language Evolution: delivering a workshop on Language Evolution in primary schools, Autumn 2019

Gautier DAGAN

Poster - Grounding Physical Common sense reasoning with Vision, ILCC Poster Session, Informatics Forum, July 2022

Verna DANKERS

Talk - Memorisation in translation and classification: the what and the where, University of Tilburg, Netherlands, February 2024

Talk - Memorisation meets non-compositionality, Stuttgart University, May 2024

I’m a Scientist - Q&A with school students, online, since December 2020, on topics including ChatGPT and the future of AI

Talk - Memorisation maps for neural machine translation, Annual Conference of the UKRI CDT in Speech and Language Technologies (SLT) and their Applications, June 2023

Talk - Idiom processing in Transformer, a translation case study, CardiffNLP seminar, March 2023

Scotland STEM Ambassador – weekly volunteer at the DataKirk in Edinburgh (currently online) to improve data literacy of children and adults on variety of data topics. Since September 2021 - ongoing

CDT NLP Ambassador - UoE PG Virtual Open Day, November 2022

Talk – Compositional Generalisation in Machine Translation, UCL Hopper Colloquium – Spotlight competition, October 2020

Artemis DELIGIANNI

Poster – Misogyny detection online: Problems with Psychological Validity, Joint SLT/NLP Conference, UoE, June 2024

Poster - Misogyny detection online: Problems with psychological validity, Bridging Psychology with AI Workshop, UoE, July 2024

Poster – Towards a psychologically valid dataset for misogyny detection, The Social Data Science Hub Poster Session, UoE, June 2024

Stephanie DROOP

Talk - Extending Counterfactual Reasoning Models to Capture Unconstrained Social Explanations. MathPsych / ICCM / EMPG 2023, University of Amsterdam

Poster - Extending Counterfactual Reasoning Models to Capture Unconstrained Social Explanations. 45th Annual Meeting of the Cognitive Science Society, 2023

Talk - Sam Gershman’s ‘What makes us smart: The computational logic of human cognition’. Presented the book and led an online discussion in the book group on the Machine Learning Street Talk podcast’s Discord server, March 2023

Poster - Extending Counterfactual Reasoning Models to Capture Unconstrained Social Explanations. Counterfactuals in Minds and Machines, workshop at ICML 2023, Honolulu

Talk - Inferring epistemic intention in simulated physical microworlds. 44th Annual Meeting of the Cognitive Science Society, Toronto, July 2022

Talk - Inferring goals in simulated micro worlds, Turing Institute, May 2022

Talk - Inferring goals in simulated micro worlds, CDT NLP Industry Day (online), February 2022

Xiaotang DU

Talk - Problem of knowledge conflict/methods for generating counterfactual data with knowledge conflicts. Edinburgh NLP Meeting, School of Informatics, April 2023

Poster - Faithful and attributable generation. UKRI CDT in SLT & NLP joint conference, June 2024

Lauren FLETCHER

Talk - Communicative efficiency and social biases modulate language learning in autistic and allistic individuals, International Conference on the Evolution of Language (Evolang) 2024, Madison, WI, USA, May 2024

Poster - How communicative efficiency and social biases shape language in autistic and allistic learners. CogSci 2023, Sydney

Shangmin GUO

Poster - Cultural evolution in the age of generative AI, Using Artificial Neural Networks for Studying Human Language Learning and Processing Workshop, June 2024

Poster - Expressivity of Emergent Language is a Trade-off between Contextual Complexity and Unpredictability, 10th International Conference on Learning Representations, April 2022

Poster - Better Supervisory Signals by Observing Learning Paths, 10th International Conference on Learning Representations, April 2022

Talk – How do we learn algorithms at University of Cambridge? online, publicly accessible, February 2021

Poster - Inductive Bias and Language Expressivity in Emergent Communication, 4th NeurIPS Workshop on Emergent Communication, December 2020

Balint GYEVNAR

Talk - Building Trustworthy Human-Centric Autonomous Systems Via Explanations. AAMAS 2024 Doctoral Consortium, May 2024

Talk - Towards Trustworthy Autonomous Systems via Conversations and Explanations. AAAI 2024 Doctoral Consortium, February 2024

Talk - Trustworthy Autonomous Systems Through Social Explainable AI. ECAI 2023 Doctoral Consortium, October 2023

Talk - Causal Explanations for Sequential Decision-Making in Multi-Agent Systems. IJCAI 2023 Workshop on Explainable Artificial Intelligence, August 2023

Talk - Causal Social Explanations for Stochastic Sequential Multi-Agent Decision-Making, 5th International Workshop on EXplainable and TRAnsparent AI and Multi-Agent Systems (EXTRAAMAS 2023), May 2023

Talk - Aligning Explainable AI and the Law: The European Perspective. 5th International Workshop on EXplainable and TRAnsparent AI and Multi-Agent Systems (EXTRAAMAS 2023), May 2023

Poster - Communicative Efficiency or Iconic Learning: Do developmental and communicative pressures interact to shape colour-naming systems? ILCC Poster Session, University of Edinburgh, July 2022

Spotlight Talk - Human-Centric Explanations in Natural Language for Autonomous Vehicle Motion Planning and Prediction, 2nd IJCAI Workshop on Artificial Intelligence for Autonomous Driving, 2022

Talk - Human-Centric Explanations for Autonomous Vehicles, Turing Institute, London, May 2022

Tom HOSKING

Talk - CDT NLP Industry Day, Online, February 2021

Project Supervisor - Nuffield Future Researchers: Modelling Variations in English between Domains, July 2020

Wenyu HUANG

Poster - Prompting Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts, Annual Conference of the UKRI CDT in Speech and Language Technologies (SLT) and their Applications, June 2023

Talk - Retrieval Augmented Generation in the Era of Large Language Model, Huawei-Edinburgh Joint Lab Workshop, December 2023

Parag JAIN

Talk - Conversational Semantic Parsing, Online International Conference on Advances in Physical, Mathematical and Computational Sciences, India, 2022

Poster - Conversational Semantic Parsing using Dynamic Context Graphs, EMNLP, Dec 2023

Poster - Integrating Large Language Models with Graph-based Reasoning for Conversational Question Answering, UKRI CDT in NLP and SLT joint conference, June 2024

Anna KAPRON-KING

Poster - An Experimental Investigation of the Unidirectionality of Grammaticalization - Annual Conference of the UKRI CDT in Speech and Language Technologies (SLT) and their Applications, July 2023; & AMLAP, September 2023

Talk - How communication enables directionality in language change – 3 Minute Thesis Competition (heat-PPLS), March 2023

Talk/Poster - People prefer using body part nouns to denote spatial relations to using spatial prepositions to refer to body parts, ILCC – July 2022

Amr KELEG

Talk – Distinguishing between the Varieties of Arabic: Dialect Identification is neither Solved nor the Solution, online, July 2024

Poster - ALDi: Quantifying the Arabic Level of Dialectness of Text, The Social Data Science Hub Poster Session, June 2024

Publicity Chair - ArabicNLP 2023 conference

Talk - DLAMA: A Framework for Curating Culturally Diverse Facts for Probing the Knowledge of Pretrained Language Models, ACL online July 2023

Talk - DLAMA: A Framework for Curating Culturally Diverse Facts for Probing the Knowledge of Pretrained Language Models, SLT CDT Annual Conference, June 2023

Poster - Quantifying Arabic Level of Dialectness of Text, SLT CDT Annual Conference, June 2023

Oghentekevwe KWAKPOVWE

Talk - Semantic Chaining Under the Information Bottleneck Principle, Internal and External Pressures Shaping Language workshop, ESSLLI, August 2023

CDT NLP Ambassador - UoE PG Virtual Open Day, November 2022

Matthias LINDEMANN

Talk – FSTs can help language models generalize better, foundational NLP course, University of Amsterdam, May 2024

Talk - Structural Inductive Biases for Better Systematic Generalization with Sequence-to-Sequence Models, Computational Linguistics Seminar, NLP Reading Group, McGill University, June 2024; and University of Amsterdam, April 2024

Workshop - Co-organiser of EACL 2023’s Student Research Workshop, a non-competitive platform for less experienced students from all backgrounds to publish their work, get feedback on thesis proposals, receive tips on paper writing, and network.

Oli Danyi LIU

Talk - Analyzing self-supervised representations of speech: encoding structures of speaker information and phonetic context. At University of Pompeu Fabra and Barcelona Supercomputing Centre, June 2024.

Ella MARKHAM

Organiser - Hoppers International Women’s Day Event 2023, School of Informatics, March 2023

Nina MARKL

Talk - Sociolinguistic variation and automatic speech recognition, University of York

Talk - Predictive Bias in English language Automatic Speech Recognition: the role of speech datasets, Linguistics and English Language Postgraduate Conference, University of Edinburgh (online), June 2021

Talk - (HW) in Edinburgh: Variation and Change in a complex consonantal variable, 13th UK Language Variation and Change (UKLVC), September 2021, online (Glasgow)

Talk - Using commercial automatic speech recognition in sociolinguistic research, 49th New Ways of Analyzing Variation (NWAV), October 2021, online (Austin)

Talk - "Hey Siri, why don't you understand me?" Speech and Language Technologies, Language Variation & Algorithmic Bias: Language in Context Seminar Series, University of Edinburgh (online), October 2021; Glasgow University Laboratory Phonetics seminar (online), November 2021; Lancaster Phonetics Research Group (online), November 2021; invited talk at the Department of English at the University of Central Punjab, November 2021

Nicole MENG-SCHNEIDER

Poster & Talk - Multi-User Smart Speakers - A Narrative Review of Concerns and Problematic Interactions, Pre-Chi Event, April 2023

Talk - The Privacy Implications of Smart Voice Assistants, CDT NLP Industry Day, March 2023

Panellist - Discussion on Gender Minorities in Informatics, Hoppers International Women’s Day, March 2023

Poster – Owning and Sharing: Privacy Perceptions of Smart Speaker Users. Proc. ACM Hum.-Comput. Interact., 2021

Lead organiser – Hopper’s International Women’s Day, Informatics Forum, March 2020

Workshop - How to Make a Poster Workshop, Hopper's, February 2020

Poster - Is this URL safe to click on? Supporting Users’ Comprehension of Phishing Features, Kholoud Althobaiti, Nicole Meng, Kami Vaniea, at SICSA DemoFest 2019, Dynamic Earth, Edinburgh, November 2019

Atli Thor SIGURGEIRSSON

Blog – started personal blog https://atlithor.notion.site/Atli-Thor-Sigurgeirsson-85c380e8c3694e2ca38878ee1281e641

Nikita MOGHE

Talk (on-line) - Taking Stock of Segment Level Evaluation in Machine Translation - IST-Unbabel Seminars (Feb 2023); NLPWithFriends (April 2023)

Talk - Intermediate Fine-tuning improves NLU for Dialogue Task - Alan Turing Institute (in-person), May 2022; Huawei (online), February 2022

Talk – The Great Intent Detective – monthly PolyAI Seminar series, online August 2021; Aveni, online November 2021

Talk - CDT NLP Industry Day, February 2021

I’m a Scientist – Stay at Home – Coding Edition: Q&A with school students, online, May-July 2020

Helper – assisted in organising Hopper’s International Women’s Day, Informatics Forum, March 2020

Nicolas NAVARRE

Poster – Explanations in the Form of Causal Selection Judgements Assist with Abduction of Complex Causal Structures. ComCo October 2023, Language Lunch November 2023.

Talk - Effects of causal structure and evidential impact on probabilistic reasoning. In International Conference on Thinking, June 2024

Poster - Effects of causal structure and evidential impact on probabilistic reasoning. In CogSci, July 2024

Talk - Functional Rule Inference from Causal Selection Explanation. In International Conference on Thinking, June 2024 and CogSci, July 2024

Piotr NAWROT

Poster - Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference, 2024 International Conference on Machine Learning, July 2024

‘x’ - https://x.com/p_nawrot/status/1768645461689168365

Poster - nanoT5: A PyTorch Framework for Pre-training and Fine-tuning T5-style Models with Limited Resources. 3rd Workshop for NLP-OSS at EMNLP, December 2023

Poster - Efficient Transformers with Dynamic Token Pooling, The 61st Annual Meeting of the Association for Computational Linguistics, Toronto, July 2023

Talk - Efficient Transformers with Dynamic Token Pooling, University of Cambridge, 2023

Rimvydas RUBAVICIUS

Tutor - Introductory course on Robotics and Human-robot interaction (including NLP), summer holiday online teaching programme for students from underprivileged backgrounds, organised by Macau Edinburgh Exchange Tour (MEET – charity #: SCO48402), August 2020

Alice ROSS

Presentation/Workshops – presented research work at LITHME, a European Cooperation in Science and Technology (COST) initiative, May 2024

Ariadna SANCHEZ

Talk – To listen or not to listen? A quasi-experimental study on reading-only versus reading-while-listening for incidental vocabulary learning in L2 Spanish learners at different levels of lexical proficiency. Co-presenter at British Association of Applied Linguistics Conference 2024, University of Essex, September 2024

Poster - Self-supervised models for dysarthric speech: Understanding representations through visual analysis and probing. UK and Ireland Speech Workshop 2024, University of Cambridge, July 2024; and CDT in NLP/SLT Joint Conference 2024, University of Edinburgh, June 2024

Media - Fem ús de les IA, però amb consciència (‘Let us make use of AI, but with conscience’), Metadata (Catalan digital newspaper), https://www.metadata.cat/opinio/4741/fem-us-ia-consciencia-ariadna-sanchez-yitg, July 2024

ED&I Reading Group - co-lead of monthly reading group in Informatics research about ED&I topics, 2024

CDT Bookclub - co-lead of monthly bookclub focussed on books with an ED&I component, 2024

Co-organiser - Young IT Girls, organising educational activities in schools and centres around Catalonia encouraging STEAM careers to young girls

Lecture - introductory lecture to Text-to-Speech technologies and lab support for the online edition of the Postgraduate Course in Deep Learning by the Universitat Politecnica de Catalunya (Barcelona)

Nicholas SANDERS

Poster - Towards Personification in Controllable Text to Speech, Annual Conference of the UKRI CDT in Speech and Language Technologies (SLT) and their Applications, June 2023

Atli SIGURGEIRSSON

Poster - Using a large language model to control speaking style for expressive TTS, UK Speech, June 2023

Talk - Entity-based sentiment analysis toolkit for audience research, CDT NLP Industry Day, February 2022

Talk - Text-to-speech, to Postgrad NLP class at University of Iceland, October 2021

Poster - Talrómur: A large Icelandic TTS corpus, at NoDaLiDa 2021/Reykjavik/zoom, June 2021

Talk/Tutorial - an introduction to Applied Machine Learning for students on IoTSSC course at University of Edinburgh, and how to use ideas and methods from the tutorial in their final assignment, February 2021

Sydelle de SOUZA

Poster - From Memory to Analogy: Exploring Semantic Compositionality in Language Processing, Analogy 2024, July 2024

Talk - Evolution and world interactions: Constraints & Compositionality, Santa Fe Institute, NM, USA

Poster - Starting Small, After All? Curriculum Learning with Child-Directed Speech, CogSci 2024

Siqi SUN

Poster - Learning Pronunciation from Other Accents via Pronunciation Knowledge Transfer, Interspeech 2024

Fatemeh TARIGHAT

Poster - Differences in Human and Machine Interpretation of Non-literal Meanings: The Case of Fillers, PPLS PhD Poster Forum, 8 June 2023; and CDT SLT Conference, University of Sheffield, June 2023

Talk - Differences in Human and Machine Interpretation of Non-literal Meanings: The Case of Fillers, ‘CDTalks’, School of Informatics, June 2023

Poster - Tracing Sarcastic Meanings of Fillers in Tweets: A Structural Analysis of Data and a Pilot Study, ILCC, July 2022

Poster - Understanding Fillers May Facilitate Automatic Sarcasm Comprehension: A Structural Analysis of Twitter Data and a Participant Study, SemDial August 2022, online; The 1st Language and Interaction Network (LINk), Edinburgh, October 2022.

Talk - Tracing Sarcastic Meanings of Fillers in Tweets: A Structural Analysis of Data and a Pilot Study, Psycholinguistics Coffee, School of PPLS, online, June 2022; & Alan Turing Institute Networking event, July 2022

Eddie UNGLESS

Interview - Don’t ask DALL-E to Draw Trans People, Queer-in-AI, July 2023

Poster - Stereotypes and Smut: The (Mis)representation of Non-cisgender Identities by Text-to-Image Models, WOAH poster session, ACL 2023

Poster - This Prompt is Measuring <MASK>: Evaluating Bias Evaluation in Language Models, TrustNLP poster session, ACL 2023

Talk - Stereotypes and Smut: The (Mis)representation of Non-cisgender Identities by Text-to-Image Models, Queer in AI workshop at ACL 2023

Blog - Monthly blog on the topic of bias in AI: https://mxeddie.github.io/

Activity Organiser/Host - introducing children (7-14) to computer science at Informatics Circle/zoom. Also, delivered an activity on machine translation, 2022

Talk - Social bias and reducing harm caused by NLP technologies, CDT Industry Day, February 2022; Lloyds Banking NLP working group, September 2022

Poster/Video/Summary - Queerphobic bias in sentiment analysis tools, WiNLP 2021 co-located with EMNLP, November 2021

Pride Picnic – organised for CDT in NLP LGBTQ+ students and their allies, as part of Induction/Welcome Week activities. September 2022 & 2021.

Talk/Activity - What are Binary Numbers? - O – Informatics Circle/Zoom, a student-led session giving short presentations to children (7-14) about topics in computer science. Also ran an activity on machine translation, May & July 2021

Presentation - to peers at CDT in NLP Pizzer Club, drawing on professional experience pitching digital products to international companies. April 2021

Ivan VEGNER

Talk - Human-like in Every Way? Existential Risks from Agent AI, UCL CDT in Foundational AI, June 2023

Mengyu WANG

Talk - Enhancing Market Prediction through Ranking News for Financial Influence, Economics of Financial Technology Conference, June 2024

Talk - MANA-Net: A Market Attention-weighted News Aggregation Network for Stock Price Prediction. ICAIF'22 Workshop on NLP and Network Analysis in Financial Applications, November 2022

Dan WELLS

Poster - Phonetic Analysis of Self-supervised Representations of English Speech, UK Speech, September 2022

Talk – Text-to-speech for underserved languages, CDT NLP Industry Day, February 2021

Irene WINTHER

Posters - Cognate processing and the role of learning. Annual Conference of the UKRI CDT in Speech and Language Technologies (SLT) and their Applications; & International Symposium on Bilingualism, Sydney, June 2023

Talk - Is cognate processing affected by the language of instruction? Hartsuiker L, Ghent University, Belgium, December 2022

Talk/Workshop - Computational Modelling in Psycholinguistics, 1st Language and Interaction Network, October 2022

Poster - Word Frequency Effects in Bilingual Language Models, 1st Language and Interaction Network, October 2022

Talk - Word frequency effects in bilingual language models, symposium on Bilingual Sentence Processing: when Models Meet Experiments, CogSci2022, Toronto, July 2022

Workshops - PPLS Inclusive Teaching Training Workshop and Equality, Diversity and Inclusion workshop, January 2022

Talk - Cumulative frequency can explain cognate facilitation in language models, Language and Cognition Research Group, Cardiff University, (online), April 2022

Poster - Cumulative frequency can explain cognate facilitation in language models. CogSci July 2021

Talk - Cognate facilitation in computational language models, at Psycholinguistics Coffee (Online), University of Edinburgh, April 2021

Zheng ZHAO

Talk - Revisiting Shallow Discourse Parsing in the PDTB-3: Handling Intra-sentential Implicits, 2nd Workshop on Computational Approaches to Discourse, EMNLP, November 2021

Talk - What are Binary Numbers? - O – Informatics Circle/Zoom, a student-led session giving short presentations to children aged 7-14 about topics in computer science, May & July 2021

Poster - Reducing Quantity Hallucinations in Abstractive Summarization, ELLIS NLP Workshop, online, February 2021

Poster - A Joint Matrix Factorization Analysis of Multilingual Representations, EMNLP, December 2023

Yu ZHAO

Poster – Analysing the Impact of Sequence Composition on Language Model Pre-Training, ACL 2024, Bangkok

Giulio ZHOU

Poster - Semantics and Sentiment: Cross-lingual Variations in Emoji Use. 77th Language Lunch @ UoE/School of Informatics, April 2023

Agostina CALABRESE

Worked on the development of a new evaluation metric for abuse detection systems in collaboration with the Sapienza NLP academic group, Sept 2020-Feb 2021 - paper now published.

Co-organised 8th Workshop on Online Abuse and Harms (WOAH) co-located with NAACL 2024

Ronald CARDENAS

Automatic Science Journalism – collaboration between Yufang Hou, IBM Ireland, UK; and Bingsheng Yao and Dakuo Wang, Rensselaer Polytechnic Institute, USA

Controlled generation using library DISCO - collaborated with Breaking Bad team at NAVER Labs Europe

Sandrine CHAUSSON

Project exploring a large dataset of Twitter data collected from 2019 to 2022 and focused on the 2020 US presidential elections using Computational Social Science and NLP methods. Collaboration with Prof. Marion Fourcade & Prof. David Harding, Department of Sociology, University of California, Berkeley.

Developing a dashboard for the analysis of social media data, particularly for the study of emergent narratives and audience interactions. Collaboration with the Alan Turing Institute and the Defence Science and Technology Laboratory (Dstl).

Jie CHI

Worked on multilingual and code-switching speech recognition in collaboration with other researchers in JSALT22, hosted by Johns Hopkins University.

Verna DANKERS

One of the leads on the GenBench initiative (https://genbench.org/), promoting non-i.i.d. evaluation in NLP; included organising a workshop at EMNLP 2023 & 2024; and third author on an article published in Nature Machine Intelligence in 2023.

Shangmin GUO

The Decoding-time Realignment of Language Models paper - a collaboration during internship with researchers from Google DeepMind, and University of Basel, 2024

Collaborated with BMW (China) to introduce boosting into their running systems - 2021

Inductive Bias and Language Expressivity in Emergent Communication, 4th NeurIPS Workshop on Emergent Communication, 12 December 2020 - published in collaboration with researchers from industry and PhD students from other universities: K Mathewson, Research Scientist, DeepMind; A Słowik and Y Ren, PhD students from University of Cambridge and University of British Columbia respectively.

LinkedIn Learning: Translation of open courses on general programming algorithms and delivery to audience who speak Chinese. Dec 20 – ongoing.

Balint GYEVNAR

“Bridging the Transparency Gap: What Can Explainable AI Learn From the AI Act?” - in collaboration with the Edinburgh Law School represented by Prof. Burkhard Schafer.

Faheem KIREFU

Faheem Kirefu, Vivek Iyer, Patrick Chen and Laurie Burchell: WMT22 – Code-mixed shared task carried out within the MT Informatics Group; the submission ranked among the top entries on the leaderboard. 2022.

Amr KELEG

Participated as one of the organisers of ‘NADI 2024: The Fifth Nuanced Arabic Dialect Identification Shared Task’

Nina MARKL

Ongoing collaborations throughout 2022 with: Lothian Diaries Project (UoE); UnMute Project (UoE and Swansea University); Edinburgh Accents of English Corpus (UoE, ILCC-funded small grant project with Ramon Sanabria, Nik Bogoychev, Peter Bell)

Liu, Shan-Jan Sarah, Lauren Hall-Lew, Stephen McNulty, Nina Markl, Catherine Lai, Beatrice Alex, Clare Llewellyn, & Karri Gillespie-Smith. 2021. Lockdown in the Lothians: Insights from the Lothian Diary Project. Executive Summary & Parliamentary Briefing. Sent to Members of Scottish Parliament on 6 October 2021.

The Lothian Diary Project: Investigating the Impact of the Covid-19 Pandemic on Edinburgh and Lothian Residents - an interdisciplinary research project involving University of Edinburgh researchers from the School of Philosophy, Psychology and Language Sciences, the School of Informatics and the School of Social and Political Science.

Nikita MOGHE

Chantal Amrhein (University of Zurich) May 2022 - 2023, work related to ACES dataset, paper published

Evgeniia Razumovskaia, Ivan Vulić, Anna Korhonen (University of Cambridge) November 2021 to March 2023, work related to the Multi3NLU++ dataset, paper published.

Piotr NAWROT

“Efficient Transformers with Dynamic Token Pooling” - collaboration with researchers from Nvidia (Adrian Łańcucki) and University of Wrocław (Jan Chorowski, Adrian Łańcucki)

“No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models” - collaboration with researchers from UCL (Jean Kaddour, Oscar Key, Matt J. Kusner)

Ariadna SANCHEZ

“Beyond the binary: Limitations and possibilities of gender-related speech technology research” – collaboration with Nina Markl, a former CDT in NLP student whilst at The University of Essex.

Atli SIGURGEIRSSON

Worked with prior research group at Reykjavik University to publish the paper Talrómur: A Large Icelandic TTS Corpus. Accepted at NoDaLiDa 2021 (1-2 June 2021). Atli Thor Sigurgeirsson, Þorsteinn Daði Gunnarsson, Gunnar Thor Örnólfsson et al.

Worked on a project, funded by the government of Iceland, to expand resources for language technology in Iceland.

Eddie UNGLESS

Working with supervisor and the Biascan team, a startup founded by Creative Informatics Resident Entrepreneur, Barbara Melville, to develop a prototype for an AI driven implicit bias detection system for job advertisements. The prototype is nearing the final stage of development. 2022

Joined the Cohere for AI community, which fosters collaboration on AI projects, particularly between underrepresented communities in AI and for those with diverse academic backgrounds. 2022

A collaboration with a fellow member of the CDT, currently in progress, stemmed from presenting work on reducing harm caused by NLP technologies. Also invited to give a lightning talk at the Informatics' Industrial Advisory Board.

Mengyu WANG

Collaboration with industry partner abrdn on a project focused on developing methods for automatically generating financial analysis using LLMs.

Collaboration with Weixian Waylon Li, Carsten Maple, Tiejun Ma: SynthRank: Synthetic Data Generation of Individual’s Financial Transactions Through Learning to Ranking. AAAI 2024 Workshop on AI in Finance for Social Impact, 2024

Dan WELLS

Collaboration with the National Research Council Canada (NRC) on the Speech Generation for Indigenous Language Education (SGILE) project, starting summer 2022. Contributed to successful grant proposal for funding through NRC's Small Teams Initiative.

Zheng ZHAO

Co-authored a paper with another PhD student at UoE: Jointly Fishing for Word Embeddings and Definitions, 16th International Workshop on Semantic Evaluation (SemEval-2022).

Georgia-Ann CARTER

Runner-up team member in Open Research Award category of UoE's Good Research Practice Awards, 2022

Verna DANKERS

Outstanding reviewer award, ACL 2023

Best paper award, received during the 25th Conference on Computational Natural Language Learning, November 2021

Gautier DAGAN

Winner of the University of Edinburgh AI Hackathon Workshop on Generative Modeling - Submitting resulting work as short paper to COLING 2025

Shangmin GUO

Funding Award from EPSRC (165K GPU hours)

Spotlight Paper Award, ICML 2024

Best Reviewer Award, ICML 2024

Selected as one of the top Reviewers at NeurIPS 2023

Balint GYEVNAR

Recipient of the UKRI Trustworthy Autonomous Systems Hub Early Career Research Award under the Knowledge Transfer Track, 2023

Third place in “Shape the Future of ITS” Competition, IEEE Intelligent Transportation Systems Society (ITSS), 2022

Selected essay in the AI100 Early Career Essay Competition by Stanford University, 2023

Runner-up best paper in IJCAI 2022 Workshop on Artificial Intelligence for Autonomous Driving.

Amr KELEG

Outstanding paper award at ACL 2024

Matthias LINDEMANN

Outstanding Paper Award at ACL 2023

Oli Danyi LIU

Computational Modelling Prize in Perception & Action, CogSci 2024

Nikita MOGHE

Outstanding Paper Award at ACL 2023

Dan WELLS

Best Special Theme Paper, ACL 2022 theme track on Language Diversity: from Low-Resource to Endangered Languages

IBM Machine Learning Practical prize to recognise and reward the best group projects among 83 submissions. First prize: Subword Modelling in Machine Translation and Automatic Speech Recognition for Diverse Languages by Steven Cassady, Dimitris Papaliouras, Dan Wells. 4 May 2020

Radina DOBREVA

Dataset created for the paper Investigating Negation in Pre-trained Vision-and-language Models available online. 2021

Xiaotang DU

Dataset – NQ-Swap (https://huggingface.co/datasets/pminervini/NQ-Swap)

Dataset – MMLU Redux (https://huggingface.co/datasets/edinburgh-dawg/mmlu-redux)

Leaderboard – Hallucinations (https://huggingface.co/spaces/hallucinations-leaderboard/leaderboard)

Balint GYEVNAR

Dataset - HEADD: Human Explanations for Autonomous Driving Decisions. https://datashare.ed.ac.uk/handle/10283/8714

Shangmin GUO

Algorithm proposed, Online AI Feedback, was implemented and incorporated by HuggingFace in their Transformer Reinforcement Learning library. https://huggingface.co/docs/trl/main/en/online_dpo_trainer.

Wenyu HUANG

Dataset: LTGen for evaluating LLMs’ ability in conversational question answering with long-tail entities, 2023

Matthias LINDEMANN

Code and data available: web page extracts annotated to identify whether they are summaries or straplines, a consequence of work done for a paper on straplines. 2022

Nina MARKL

The Edinburgh International Accents of English Corpus

Nikita MOGHE

Translation Accuracy Challenge Sets (ACES), 2023

Multilingual, Multi-intent, Multidomain dataset for task-oriented dialogue (Multi3NLU++), 2023

Piotr NAWROT

Authored and open-sourced (GitHub - https://github.com/PiotrNawrot/nanoT5) the nanoT5 repository for pre-training and fine-tuning T5-style models in PyTorch, 2023

Siqi SUN

An open-source code and data repository for Multi-accent (EDI, GAM, RPX) Sequence-to-Sequence Text-to-Speech linguistic frontend: https://doi.org/10.5281/zenodo.12775334

Eddie UNGLESS

Developed an improved prototype for a system to detect implicit bias in job listings

Created a data set of sentences containing different queer identities intended to detect bias in sentiment analysis tools. This data set has also been used by a CDT NLP peer for their “Individual Project” coursework training data, as it contains thousands of non-toxic sentences about queer people. 2021

Mengyu WANG

Development of a financial experiment platform to conveniently deal with data and run experiments

Laurie BURCHELL

Code-Switching Workshop Participant at Public Sector (Remote; June – August 2021)

Agostina CALABRESE

Intern at Meta (USA; August – December 2022)

Research Intern at Snap Inc. (USA; August – December 2023)

Ronald CARDENAS

Researcher at NAVER Labs Europe (France; September 2022 – March 202)

Sandrine CHAUSSON

Researcher at the University of California (USA; August – December 2022)

Jie CHI

Intern at Apple MLR (Copenhagen, Denmark; March – July 2024)

Gautier DAGAN

Research Intern at Meta AI (Paris, France; August – November 2023)

Verna DANKERS

Researcher at Microsoft Research (Seattle, USA; June – September 2024)

Research Scientist Intern at Meta AI (Paris, France; June – September 2022)

Shangmin GUO

Student Researcher at Google DeepMind (Paris, France; September – December 2023)

Tom HOSKING

Research Intern at Cohere (UK; May – September 2023)

Wenyu HUANG

Research Intern at Huawei Technologies Research & Development (UK) Limited (Edinburgh, UK; May – September 2022)

Amr KELEG

NLP Research Intern at Aveni Ltd (Edinburgh, UK; May – August 2022)

Matthias LINDEMANN

Student Researcher at Google DeepMind (London, UK; July – October 2024)

Nikita MOGHE

Research Intern at Microsoft Semantic Machines (Seattle, USA; June – September 2023)

ML Intern at PolyAI (Remote; June – September 2021)

Piotr NAWROT

Research Intern at Cohere (Poland; August – November 2024)

Deep Learning and Algorithms Intern at Nvidia (Cambridge, UK; May – September 2023)

Rimvydas RUBAVICIUS

Researcher at UKRI TAS node on Governance and Regulation (Edinburgh, UK; December 2023 – April 2024)

Irene WINTHER

Researcher at Ghent University (Ghent, Belgium; August – December 2022)

Zheng ZHAO

Applied Scientist Intern at Amazon (Cambridge, UK; June – November 2023)

CDT NLP Book Club - Emily Gaughan, Ariadna Sanchez

CDT NLP TALKS - Nickil Maveli, Alice Ross, Yi Wang. [Laurie Burchell, Agostina Calabrese, Georgia-Ann Carter, Henry Conklin, Verna Dankers, Amr Keleg, Nikita Moghe, Nicholas Sanders]

EDI Reading Group CDT NLP - Artemis Deligianni, Ariadna Sanchez

CDT NLP Writing Retreat - Verna Dankers, Sydelle de Souza, Aida Tarighat, Ivan Vegner

CDT NLP Reading Group – Artemis Deligianni, [Anil Batra/Lauren Fletcher, Tom Hosking/Irene Winther]

CDT NLP Firbush - Gautier Dagan, Nick Ferguson, Oghentekevwe Kwakpovwe, Yi Wang. [Agostina Calabrese, Georgia-Ann Carter, Sandrine Chausson, Radina Dobreva, Anna Kapron-King, Aida Tarighat]

CDT NLP Pride Picnic - Laurie Burchell, Coleman Haley, Eddie Ungless

CDT NLP Recruitment/Mentor Ambassadors - Anil Batra, Laurie Burchell, Agostina Calabrese, Georgia-Ann Carter, Henry Conklin, Gautier Dagan, Verna Dankers, Stephanie Droop, Balint Gyevnar, Coleman Haley, Tom Hosking, Anna Kapron-King, Amr Keleg, Oghentekevwe Kwakpovwe, Matthias Lindemann, Oli Liu, Ella Markham, Nickil Maveli, Nicolas Navarre, Argyrios Papoudakis, Alice Ross, Rimvydas Rubavicius, Ariadna Sanchez, Nicholas Sanders, Rohit Saxena, Emelie Van De Vreken, Yi Wang, Dan Wells, Yu Zhao.

CDT NLP Social Space - Rimvydas Rubavicius

CDT NLP Student Reps – 2024/25: Gautier Dagan, Christina Du/Ella Markham, Nic Navarre/Yu Zhao. [2023/24: Zheng Zhao/Siqi Sun, Nick Ferguson/Coleman Haley, Nickil Maveli/Argyrios Papoudakis, Artemis Deligianni/Ariadna Sanchez; 2022/23: Tom Hosking, Danyang Liu/Wanqiu Long, Anna Kapron-King/Mengyu Wang, Sydelle De Souza/Ivan Vegner; 2021/22: Jie Chi/Dan Wells, Lauren Fletcher/Matthias Lindemann, Sandrine Chausson/Oli Liu; 2020/21: Laurie Burchell/Georgia Carter, Verna Dankers/Atli Sigurgeirsson; 2019/20: Nicole Meng/Rimvydas Rubavicius]

Edinburgh Language Lunch, CDT Leads/Committee Members – Iona Carslaw, Sydelle de Souza, Emily Gaughan, Ella Markham [Georgia Carter, Jie Chi, Nina Markl]

Edinburgh Lectures in Language Evolution, Organiser and CDT Lead – Sydelle de Souza [Henry Conklin]

Psycholinguistics Coffee, Organiser and CDT Lead - Irene Winther

CDT NLP Research Update Coffee - Laurie Burchell

CDT NLP Promotional Items Champion - Nick Ferguson

ILCC Gathertown Friday Social, CDT Lead - Rimvydas Rubavicius

CDT NLP/ILCC Quiz - Rimvydas Rubavicius

This article was published on 2025-02-24
