A four-year integrated training programme for the next generation of NLP practitioners

Natural language processing is transforming the way humans communicate with each other and with machines. We have witnessed the rapid evolution of a wide range of systems that translate text, recognise or produce speech, answer questions, retrieve documents or facts, respond to commands, summarise articles, and simplify texts for children or non-native speakers. The rapid proliferation of online news, social media and scientific articles has created an exploding demand for systems that enable people to derive critical insights from massive streams of data in many languages.

Our four-year integrated training programme gives students a solid foundation in the challenge of working with language in a computational setting and its relevance to critical engineering, scientific and ethical problems in our modern world. It also offers training in the key software engineering and machine learning skills necessary to solve these problems. The programme aims to have a transformative effect on the students we train, and on the field as a whole, by developing future leaders and producing cutting-edge research in both methodology and applications.

Related links: CDT in NLP Handbook v6 (requires School of Informatics login); Information, training and support for postgraduate research at the University; EdHelp

PhD with Integrated Study

The programme attracts students from a diverse range of backgrounds and disciplines, including computer science, artificial intelligence, maths and statistics, engineering, linguistics, cognitive science, and psychology. Such an interdisciplinary cohort requires a training approach that is more flexible than the standard three-year PhD, which is why this programme takes the form of a four-year PhD with integrated training. It interleaves training at the level of a master's degree (180 credits of courses and project work) with PhD research.
The advantages of this structure are:

- By mixing courses and PhD work, students gradually progress from classroom teaching to independent research. At the same time, research will inform their learning experience from the first day, and they can immediately apply skills learned in the classroom to their PhD project.
- Students can take the courses that are relevant to their research when they need them, rather than having to anticipate all their training needs in advance and front-load all their courses in year 1.
- The degree structure allows for maximum flexibility to accommodate a cohort of students with a wide range of backgrounds. Students who have a lot of prior NLP training, for example, would be expected to do a research-heavy first year (followed by advanced courses informed by their PhD project), while students with less relevant backgrounds can take a larger number of foundational courses upfront.
- While all students select an individual set of courses, there are also shared components that everyone takes, which together with a programme of staff- and student-led events will promote cohort formation.

Courses

Overview

The structure of the PhD with integrated studies requires all students to study taught courses whilst concurrently completing the research elements required by the traditional PhD programme.
We designed the programme to be maximally flexible in the way in which credits are accumulated. However, all students must successfully complete a total of 180 taught credits (with at least 150 credits at level 11) over the first three years, in addition to the equivalent of three years of PhD research, spread over the four years of the programme.

A path through the programme is given in the table below. This is an example only and other paths are possible:

Year 1: Doing Research in NLP [20 pts]; Foundational courses [40 pts]; Group project [20 pts]; Individual project [40 pts]; PhD research
Year 2: Specialist courses [30 pts]; Case Studies in AI Ethics [10 pts]; PhD research
Year 3: Specialist courses [20 pts]; PhD research
Year 4: PhD research

Year 1

Students are routed into courses following a Training Needs Analysis on entry. For example, students with a strong computer science or maths background will take more linguistics-based courses, while students with a strong linguistics or cognitive science background will take more programming and machine learning-based courses. There are foundational and specialist courses listed in the Degree Programme Table. In year 1, the emphasis is on foundational courses (though specialist courses can also be taken).

In addition, you will take between 20 and 60 credits of foundational courses, and between 10 and 60 credits of specialist courses. The list of admissible foundational and specialist courses is given in the Degree Programme Table. In addition, you can take up to 20 credits of courses from any discipline (schedules A to Q, T and W).

When choosing courses in year 1, please bear in mind:

- You need to take 180 credits overall across the first three years of study.
- Out of these 180 credits, 150 credits need to be at level 11 (Masters degree level) or higher; therefore, you can only take a maximum of 30 credits at level 9 or 10.
- You need to leave at least 10 credits for year 2 (as year 2 contains an obligatory 10-credit course).
Hence you can take between 110 and 170 credits in year 1, but you should normally restrict yourself to 120 credits of courses; otherwise your course load will be too heavy.
- Try to balance courses equally across the two semesters (typically 60 credits per semester; bear in mind that Doing Research in NLP spans both semesters).
- You also need to leave time to work on your PhD; you will submit a PhD research proposal at the end of year 1, which is formally assessed.

Obligatory courses in Year 1

The following courses are obligatory for all first-year students:

- Group Project in Advanced NLP: Students will form interdisciplinary teams to tackle a directed research problem assigned by a team of CDT supervisors. In your group project, you can directly apply the skills you learn in your foundational courses and in the Doing Research in NLP course (see below). As all CDT students in a given year take part, the group project will build the cohort and will also train you in project management and teamwork skills. The project topics will be defined in consultation with our partners, who may also contribute resources.
- Individual Project in Advanced NLP: In addition to the group project, each student will also select a supervisor and define a short individual research project, which may be stand-alone or serve as the basis for a subsequent PhD project. We expect you to work with different supervisors on your individual and group projects in order to experience different working styles and broaden your methodological skillset. Some of the individual projects will be conducted with our partners.
- Doing Research in NLP: Designed to complement the first-year projects, this course will align with project milestones and teach skills that you can immediately put into practice. In addition to technical skills in NLP at the level required for PhD work, it will teach presentation, communication and writing skills.
Project and time management, as well as NLP-specific aspects of Responsible Research & Innovation, will also be covered.

Years 2-4

In these years, students take a decreasing number of courses and focus more on their PhD research. Year 2 includes the course Case Studies in AI Ethics, which all students take. In addition, they are expected to take specialist courses that complement their PhD research. There are no obligatory courses in year 3, but students can take more specialist courses if they haven't taken all their required credits yet. Students should be focussing fully on their PhD research in year 4.

Additional Training

You will have access to the training courses that the University's Institute for Academic Development runs for PhD students. Topics include Research Planning and Management, Communication and Impact, Personal Effectiveness, and Public Engagement/Outreach.

Related links: Degree Programme Table; Institute for Academic Development

Cohort-based Training

Cohort-based doctoral training differs from a standard PhD in that you will take part in cohort-wide training modules as part of a more structured programme, rather than training in specific research-based skills as an individual or as part of a small research group on a traditional PhD. CDT students are trained in cohorts of varying sizes.

The UKRI CDT in Natural Language Processing training programme has several advantages over a traditional PhD programme. The collaborative nature of a CDT means there will also be emphasis on multi-disciplinary and inter-disciplinary knowledge, training and research tailored to address the skills needed at doctoral level.

As a CDT NLP student, you receive high-quality training in practical skills as well as acquiring academic knowledge and confidence for your future career, whether in academia or industry.

You will enjoy a supportive environment with plenty of opportunities for collaboration with both academic and non-academic partners to provide you with a diversity of expertise.
The CDT NLP also encourages you to make links with industry to develop real-world relevant skills.

As a student on our CDT, you will be expected to participate and contribute fully as a member of your cohort, and the programme, in order to enhance the shared training and development of all your peers.

Cohort Collaboration

In addition to your research and academic deliverables, you will take full advantage of the cohort and collaborative nature of the broader CDT NLP programme, including:

- outreach and public engagement
- industry/public sector liaison
- event participation and co-ordination
- development of your 'soft skills set'.

Facilities

We are hosted by the School of Informatics, which has an international reputation for excellence in research and teaching. The School includes over 100 faculty, 200 postdocs and 350 PhD students. Aside from NLP, it is internationally renowned for its contributions to machine learning, cognitive science, robotics, and databases. The School is housed in the award-winning Informatics Forum, a building designed to foster innovation and to facilitate the interaction between a diverse set of research groups through a unique open design around a central atrium.

CDT NLP student cohorts may be located in the Forum, Wilkie or the Bayes Centre, a purpose-built £40M multi-disciplinary hub that co-locates researchers in informatics and mathematics with R&D staff from industrial partners.

The School of Philosophy, Psychology and Language Sciences is physically located in the adjacent Dugald Stewart Building, which together with the Informatics Forum and the Bayes Centre covers a city block in central Edinburgh’s technology corridor.

As a PGR student, much of your work will be independent and self-directed. However, the Student Disability and Learning Support Service can still offer you support for your studies. Additionally, the University has developed a plan for supporting the use of British Sign Language in University activities.
Related links: School of Informatics; Bayes Centre; School of Philosophy, Psychology & Language Sciences; Disability and Learning Support Centre; British Sign Language

Compute and Lab Resources

Our students will have access to a large GPU (graphics processing unit) cluster and to a terabyte storage array, both dedicated to NLP research. Furthermore, they have access to the Edinburgh Compute and Data Facility (ECDF), a central University resource that maintains a cluster of over 4,000 compute cores and a large high-performance storage facility, as well as to the Edinburgh International Data Facility (EIDF). A number of our industry partners provide in-kind support to the CDT in the form of compute credits, GPU hardware, and access to proprietary datasets.

Some of the PhD projects conducted under the auspices of the CDT involve lab-based experiments that investigate human language and speech processing. Students have access to our state-of-the-art experimental facility comprising sound studios, an anechoic chamber, an eye-tracking lab with three high-resolution trackers, and a suite of experimental booths for perception experiments. The Bayes Centre also includes a dedicated virtual/augmented reality lab combined with motion capture and eye-tracking.

Related links: Edinburgh Compute and Data Facility; Edinburgh International Data Facility

Responsible Research and Innovation

Throughout the four-year programme, the CDT includes a strong focus on Responsible Research and Innovation (RRI) considerations.

This includes a compulsory course in Year 2, Case Studies in AI Ethics, and is supported by additional resources such as expert seminar speakers, workshops and industry collaboration.

RRI seeks to promote creativity and opportunities for science and innovation that are socially desirable and undertaken in the public interest.
The aim of RRI is to strengthen research and innovation projects, making them more open, transparent, diverse, inclusive and adaptive to changes.

As recipients of public funding for research, our students have a responsibility to ensure that their research is aligned with the principles of RRI, creating value for society in an ethical and responsible way.

Public Engagement and Outreach

In addition to their research being published in prestigious journals and conferences, the students will also engage directly with the broader community through outreach activities, either initiated by their School or themselves.

Public engagement activities include:

- Participating in festivals
- Working with museums / galleries / science centres and other cultural venues
- Creating opportunities for the public to inform the research questions being tackled
- Researchers and public working together to inform policy
- Presenting to the public (e.g. public lectures or talks)
- Involving the public as researchers (e.g. web-based experiments)
- Engaging with young people to inspire them about research (e.g. workshops in schools)
- Contributing to new media enabled discussion forums.

Equality, Diversity and Inclusivity

We value diversity and inclusiveness and believe that maximising the contribution of every individual enables us all.

Whilst welcoming and supporting freedom of thought and expression, we also seek to embed a culture where all students and staff are treated with respect and feel safe and fulfilled within our community.

We seek your support, as a CDT student, in ensuring that equality, diversity and inclusion are championed throughout the programme. The CDT's EDI Champion, Dr Bjorn Ross, and CDT Coordinator, Sally Galloway, are key points of contact should you wish to get involved with current School ED&I initiatives, propose new initiatives or raise any concerns. The University has a zero-tolerance stance towards any form of bullying and harassment.
The Respect at Edinburgh web hub has been created to bring together information and guidance on the Dignity & Respect policy, which is available through Respect at Edinburgh.

CDT NLP students are requested to undertake the online course on Unconscious Bias to improve awareness.

Statistical monitoring of the CDT NLP student diversity profile is carried out by the CDT team annually. Specifically, we measure the following progress indicators in the CDT population: balance of gender, age, disability, and race. Through EDI surveys, we collect information on student satisfaction with the working environment, overall School culture, CDT support structure, and the interaction with their supervisors and peers.

Based on this data, our annual review identifies the CDT's success in achieving its EDI objectives and its contribution to promoting EDI in the wider School, University and research community. These reviews recommend additional actions and result in the refinement of the CDT's EDI strategy.

Related links: Be a Respect Champion; School of Informatics’ Equality and Diversity; School of Informatics’ Athena SWAN Award; Women in Computing

This article was published on 2025-02-24