LLM-Based AI Companion for Multi-Agent Collaboration

This project will develop methodology for interactive AI agents with two forms of interface: (i) a text-based input, and (ii) an action-driven interface, where the user's actions prompt communication.

Deadline for Application & Eligibility

The deadline to submit your first-stage application form is 18 February 2026, 23:59.

You must follow the CDT MLSystems application process as described on the CDT MLSystems webpages.

Please contact the project's PI before submitting your application to check suitability and interest.

Eligibility: this project is open for applications to both Home and International students.

Co-funding Company

Tencent

Tencent Games was established in 2003. We are a leading global platform for game development, operations and publishing, and the largest online game community in China. Tencent Games has developed and operated over 140 games. We provide cross-platform interactive entertainment experiences for more than 800 million users in over 200 countries and regions around the world. Honor of Kings, PUBG MOBILE, and League of Legends are some of our most popular titles around the world. Meanwhile, we actively promote the development of the esports industry, work with global partners to build an open, collaborative and symbiotic industrial ecology, and create high-quality digital life experiences for players.

https://www.tencentgames.com/

Supervisory team

University of Edinburgh PI: Amos Storkey - a.storkey@ed.ac.uk (School of Informatics)
Personal website: https://homepages.inf.ed.ac.uk/amos/ 

Company supervisor: Ruidong Wang - ruidwang@global.tencent.com 

Abstract

In multi-agent computer games, there are three aspects of multi-agent play that make for interesting interaction with AI agents.

(a) communication between human players: how an LLM can enhance inter-human collaboration, e.g. by communicating intent or actions to other players,

(b) an in-game companion: aiding and unifying players in understanding the game in its early stages or improving their prospects, given the joint communications from multiple players, and

(c) team communication: providing direction to multiple actors in a large game with multiple team agents, such as e-sports or social interaction games, where one can instruct a team of actors to join in a particular endeavour.

This project will develop methodology for interactive AI agents with two forms of interface: (i) a text-based input, and (ii) an action-driven interface, where the user's actions prompt communication. In all these settings we can potentially leverage knowledge of the game internals to make this feasible.

Project Background

Tencent UK plans to further deepen its partnership with the University of Edinburgh in 2026 by launching two co-funded PhD studentships, aiming to support Tencent Games’ long-term strategy in AI-driven game development, data intelligence, and next-generation interactive technologies. Through this programme, the Tencent Games Data & AI team will participate in the selection and supervision of doctoral candidates, aligning academic research with real-world industry challenges and innovation needs.

Project Aims

This project will:

(a) create research game environments that are excellent benchmarks for testing agent collaboration capability, building on existing research,

(b) develop AI methods that enhance collaboration within teams and enable continued play when a team member drops out,

(c) provide multi-agent methods that deliver coherent advice and/or direction to members of a team, and

(d) advance multi-agent methodology on collaboration by improving collaboration coherence.

The primary research aims are methodological: we wish to develop the methods and understanding that would enable the issues discussed to be tackled effectively.

Expected Outcome and Impact

Target outcomes (papers):

  • Multi-agent Evaluation Environments and Benchmarks (Month 9)
  • Paper: Multi-agent LLM Planning for Games (Month 15)
  • Paper: Mitigating Forgetting in Online Adaptation for Multi-Agent settings (Month 24)
  • Paper: Efficient online adaptation (Month 36)
  • Thesis/Paper: An LLM-Based AI Companion for Multi-Agent Collaboration (Month 48)

Demonstrator: Practical demonstrator tooling showing methods in practice in an environment

Code: open source codebase for implementing the methodology

Video: Demonstration video of approach

Data and Methodology

The approach in this project can be broken down into the following tasks:

Task 1: Evaluation. Building and evaluating multi-agent collaboration tools requires evaluation environments that allow us to test such capability in an automated way. We will leverage and build on existing environments we are already developing for this purpose. A key issue for interactive multi-agent human play is the challenge of testing the benefit of an approach automatically, without testing within actual play, which is prohibitive for early-stage development. We will use offline methods, leveraging human play examples, and considering the counterfactual benefit of other actions that the players might have taken in a given setting.
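As a rough illustration of this offline, counterfactual evaluation idea, the Python sketch below compares a companion's suggested action against the action a human actually took in logged play, scored under a learned value model. The Transition structure and the value_model and suggest_action callables are hypothetical placeholders introduced for illustration, not part of any existing codebase.

# A minimal sketch of offline, counterfactual evaluation of a suggestion policy
# against logged human play. All names (Transition, value_model, suggest_action)
# are hypothetical placeholders.

from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Transition:
    state: dict                        # observed game state for one logged step
    human_action: str                  # action the human player actually took
    candidate_actions: Sequence[str]   # alternatives available at this step


def counterfactual_gain(
    log: List[Transition],
    value_model: Callable[[dict, str], float],   # learned estimate of action value
    suggest_action: Callable[[dict], str],        # the companion policy being evaluated
) -> float:
    """Average estimated value of the suggested action minus the logged action.

    A positive score suggests, under the value model's assumptions, that the
    companion's advice would have improved on the players' own choices.
    """
    gains = []
    for step in log:
        suggested = suggest_action(step.state)
        if suggested not in step.candidate_actions:
            continue  # suggestion not executable at this step; skip it
        gains.append(
            value_model(step.state, suggested) - value_model(step.state, step.human_action)
        )
    return sum(gains) / max(len(gains), 1)

The key design point this sketch highlights is that evaluation quality hinges on the value model: any such offline score is only as trustworthy as the model of action value learned from human play.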

Task 2: Prompted intent and instruction. We will leverage existing and new approaches in LLM planning to build a model of achievement success via a Vision-State-Language model. We will also develop an in-memory plan of real/artificial player intent from instruction. This allows the mapping of language statements to individual agent plans. We will then do in-context specialisation that allows continuous adaptation as things unfold. In a computer game context, we can benefit from knowledge of the underlying game engine structure to make this feasible.
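The sketch below illustrates one possible shape for that mapping: a free-text team instruction, together with a game-state summary, is turned into per-agent plans via a single LLM call. The llm_complete callable, the prompt wording and the JSON plan schema are assumptions made for illustration only; any text-in/text-out LLM backend could stand in for them.

# A minimal sketch of turning a free-text team instruction into per-agent plans.
# The llm_complete callable and the JSON plan schema are illustrative assumptions.

import json
from typing import Callable, Dict, List

PLAN_PROMPT = """You are a team coordinator in a multi-agent game.
Game state: {state}
Team instruction: "{instruction}"
Return JSON mapping each agent id in {agents} to a short ordered list of actions."""


def instruction_to_plans(
    instruction: str,
    game_state: str,
    agent_ids: List[str],
    llm_complete: Callable[[str], str],   # text-in / text-out LLM call
) -> Dict[str, List[str]]:
    prompt = PLAN_PROMPT.format(state=game_state, instruction=instruction, agents=agent_ids)
    raw = llm_complete(prompt)
    try:
        plans = json.loads(raw)
    except json.JSONDecodeError:
        return {agent: [] for agent in agent_ids}  # fall back to empty plans
    # Keep only plans for known agents so downstream control stays well-defined.
    return {agent: list(plans.get(agent, [])) for agent in agent_ids}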

Task 3: Continuous adaptation. Building on the underlying demonstrators above, more detailed research questions will be considered around how the LLM can continuously adapt to the ongoing play, style and setting to provide improved aid or direction to players. This is an open research question, touching on many of the big questions in continuous adaptation in the multi-agent context, including the issue of forgetting.
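One simple way to think about limiting forgetting during in-context adaptation is sketched below: a bounded window of recent play events is combined with a small pinned set of earlier, high-importance episodes that are never evicted, and both are fed back into the companion's next prompt. The class name, buffer sizes and importance scoring are illustrative assumptions, not a proposed final method.

# A minimal sketch of bounded in-context adaptation memory. Sizes and the
# importance scoring rule are illustrative assumptions.

from collections import deque
from typing import Deque, List, Tuple


class AdaptationMemory:
    def __init__(self, recent_size: int = 32, pinned_size: int = 8):
        self.recent: Deque[str] = deque(maxlen=recent_size)  # rolling recent events
        self.pinned: List[Tuple[float, str]] = []             # (importance, event)
        self.pinned_size = pinned_size

    def add(self, event: str, importance: float = 0.0) -> None:
        self.recent.append(event)
        # Retain the most important earlier events so adaptation to the current
        # play style does not fully displace what was observed earlier on.
        self.pinned.append((importance, event))
        self.pinned.sort(key=lambda pair: pair[0], reverse=True)
        self.pinned = self.pinned[: self.pinned_size]

    def context(self) -> str:
        """Context string to prepend to the companion's next LLM prompt."""
        pinned_events = [event for _, event in self.pinned]
        return "\n".join(pinned_events + list(self.recent))

The trade-off this sketch makes explicit is between recency and retention: a larger pinned set preserves more of the earlier session at the cost of prompt length and, potentially, responsiveness to the current style of play.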

Task 4: Efficient Implementation. We will consolidate the results of the previous work to provide practical examples of the earlier research in realistic game settings. We will deal with the practical aspects and issues that must be handled within a computer game, especially real-time compute efficiency, agent individualisation and sensible defaults.

Task 5: Realistic Demonstrators. Building on the efficient implementations from Task 4, this work will lead to interesting demonstrators. We will interact with Tencent to provide demonstrators for specific game environments.

Timescale and Expected Outputs/activities (over 4 years)

  1. Year 1: Evaluation suite and text prompted intent and instruction
  2. Year 2: Continuous adaptation
  3. Years 2 and 3: Efficient implementation
  4. Years 3 and 4: Practical demonstrators

Student Requirements

  • A good Bachelor’s degree (First Class Honours or international equivalent) or Master’s degree in a relevant subject
  • Relevant research experience is a plus
  • See the CDT MLSystems requirements for candidates for more details