Towards Explainable and Robust Statistical AI: A Symbolic Approach

http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/R021317/1

Data science provides many opportunities to improve private and public life, and it has enjoyed significant investment in the UK, EU and elsewhere. Discovering patterns and structures in large troves of data in an automated manner - that is, machine learning - is a core component of data science. Machine learning currently drives applications in computational biology, natural language processing and robotics. However, this highly positive impact is coupled with a significant challenge: when can we convincingly deploy these methods in our workplaces? For example: (a) how can we elicit intuitive and reasonable responses from these methods? (b) would these responses be amenable to suggestions/preferences/constraints from non-expert users? (c) do these methods come with worst-case guarantees? Such questions are clearly vital for appreciating their benefits in human-machine collectives.

This project is broadly positioned in the context of establishing a general computational framework to aid explainable and robust machine learning. The framework unifies probabilistic graphical models, which form the statistical basis for many machine learning methods, and relational logic, the language of classes, objects and composition. It allows us to effectively codify complex domain knowledge for large, uncertain data.

Concretely, the project aims to learn a model that best summarises the observed data in a completely automated fashion, thereby accounting for both observable and hidden factors in that data. To provide guarantees, two distinct algorithms are considered: (a) an algorithm that learns simple models with exact computations; (b) an algorithm that learns complex models but rests on approximations with certificates. To evaluate the explainable, interactive nature of the learned models, the project considers the application of dialogue management with spatial primitives (e.g., "turn south after the supermarket"). We will study the scalability of these algorithms, and then evaluate the closeness of the learned models to actual suggestions from humans.

Computationally efficient and explainable algorithms will significantly expand the range of applications to which the probabilistic machine learning framework can be applied in society, and contribute to the "democratisation of data".
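As a purely illustrative aside, the following minimal Python sketch shows one common way probabilistic graphical models and relational logic can be combined: a weighted relational rule is grounded over a small domain into a log-linear model (in the style of Markov logic), and an exact marginal is computed by enumerating all possible worlds, matching the "simple models with exact computations" regime in (a). The predicates near/turn, the weight 1.5, the domain and the query are assumptions made here for illustration; they are not taken from the project itself.

    import itertools
    import math

    # Hypothetical weighted relational rule: near(X) => turn(X), weight 1.5.
    # Grounding it over a two-object domain yields a small log-linear model;
    # the predicates, domain and weight are illustrative assumptions only.
    OBJECTS = ["a", "b"]
    WEIGHT = 1.5
    VARIABLES = [f"near({o})" for o in OBJECTS] + [f"turn({o})" for o in OBJECTS]

    def world_weight(world):
        """Unnormalised weight of a truth assignment: each satisfied
        grounding of near(X) => turn(X) contributes a factor exp(WEIGHT)."""
        weight = 1.0
        for o in OBJECTS:
            if (not world[f"near({o})"]) or world[f"turn({o})"]:
                weight *= math.exp(WEIGHT)
        return weight

    def exact_marginal(query):
        """Exact marginal P(query = True), computed by enumerating all
        2^n worlds - tractable only in the 'simple models' regime."""
        numerator = 0.0
        partition = 0.0
        for bits in itertools.product([False, True], repeat=len(VARIABLES)):
            world = dict(zip(VARIABLES, bits))
            w = world_weight(world)
            partition += w
            if world[query]:
                numerator += w
        return numerator / partition

    print(exact_marginal("turn(a)"))  # probability the agent turns at object 'a'

For larger relational models this enumeration becomes infeasible, which is where the second route - complex models learned via approximations with certificates - would take over.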