Explore the CDT Programme's structure and activities

The CDT in Machine Learning Systems is a four-year PhD with Integrated Study, meaning that students attend credit-bearing courses in addition to leading their PhD research project across the four years of the programme. The programme has a number of distinctive features, notably: the tight coupling of theory, practice and industry relevance; an emphasis on engagement with external users, providing an understanding of real practical settings; a strong cohort model, including student-directed and peer-to-peer learning; and the acquisition of the essential transferable skills needed to conduct research and to develop as a successful and competitive professional.

Each student has a main research project, but these are generally collaborative with others. There are also mini-projects, hackathons and company engagement events. Most students take the opportunity of a paid internship in a company over the course of the PhD.

CDT Programme Overview

Year   | Credit-bearing courses | Non-credit-bearing training and activities | Internship | PhD research
Year 1 | ✓                      | ✓                                          |            | ✓
Year 2 | ✓                      | ✓                                          |            | ✓
Year 3 |                        | ✓                                          | ✓          | ✓
Year 4 |                        | ✓                                          |            | ✓

Training and Activities

The training programme combines credit-bearing taught courses and projects, which make up a total of 180 credits to be obtained across the first two years of the PhD, and a range of non-credit-bearing training and activities, specific to the CDT or delivered jointly with other local CDTs. Credit-bearing courses provide students with the necessary background; the course selection is flexible and can be tailored to each PhD student's needs.
Non-credit-bearing training and activities include a full programme of generic skills and competencies training covering the following areas: Public Engagement, Research Communication, Responsible Research and Innovation, EDI and Wellbeing, Career Development and Planning, and Entrepreneurship. The CDT organises a number of activities, cohort-based or cross-cohort, involving industry partners, such as mini-projects, hackathons, BonsApps, company and entrepreneurship days, and seminars.

Students have the flexibility to attend further training according to their own needs and objectives. They also have many opportunities to build their support and professional networks through regular cohort meetings and PGR community events within the CDT, the School of Informatics and beyond.

Internship

All students do an internship with a company (or an agreed alternative) as an explicit part of their PhD programme. Internships typically last between 3 and 6 months and take place in year 3. The internship can be replaced with other forms of engagement with external partners, bodies or stakeholders: a research visit to another university or lab, a placement with a public sector organisation or charity, or participation in an existing exchange programme. The CDT's industry partners are all potential hosts for internships, but students are free to find a host of their choice.

Research Project

The Individual Research Project forms the core of the activities during the four years of the PhD and bears most of the PhD programme credits (540 credits). Throughout the full programme, students engage in PhD research, working with their first and second supervisors. Students meet at least weekly with their supervisors, and at least monthly with their research group and company partners.
The research project is defined during the first year by the student and the supervisors, and is the subject of the final PhD thesis, due for submission by the end of the fourth year. Research projects within the CDT remit should include aspects of both machine learning and systems (the balance between the two can vary).

Examples of PhD projects within the CDT remit

Novel Learning Approaches and Multi-Agent Learning for Large Neural Models

The current paradigm for learning networks of every sort is variants of gradient-based learning. However, such learning is highly inefficient: each gradient step erases significant information learnt in the previous step. Furthermore, such learning processes cope poorly with distributed data and distributed learners. In this project we look beyond current slow gradient methods to new learning approaches that have better theoretical properties than gradient methods, and consider informational transactions between agents that give each agent a much improved ability to optimise for its task. This is particularly valuable for edge-device learning.

ServerlessLLM

Large Language Models (LLMs) demand substantial GPU resources for online platforms, prompting service providers to investigate cost-effective serverless inference architectures. Dynamic workloads are then efficiently consolidated onto a shared GPU infrastructure, but this can increase latency due to the frequent loading and unloading of models from storage. Our ServerlessLLM is a low-latency serverless inference system tailored for LLMs. It has an LLM checkpoint store and the first live migration algorithm for LLM inference.

Testing Safety of Perception AI on Hardware Accelerators

Autonomous vehicles (AVs) are expected in the near future, yet concerns about their safety remain to be addressed. This project focuses on assessing the safety of perception AI tasks within AVs. Perception AI is responsible for the detection of vehicles, pedestrians, lanes, traffic lights, etc.
Such tasks use deep learning, require enormous processing power and rely on hardware accelerators such as GPUs and FPGAs. Real-time failures can occur due to incorrect implementation on the hardware accelerators, leading to timing uncertainty, unsafe memory accesses and incorrect data parallelism (see Figure). GPU-related bugs are one of the five categories of real faults in deep learning tasks such as object detection.

Progression from one year to the next (years 1 to 4) is assessed annually, based on the results of (1) the credit-bearing courses, (2) the PhD research review and (3) training attendance.

This article was published on 2024-12-03