Safety-critical systems increasingly incorporate autonomous decision-making, bringing Artificial Intelligence (AI) techniques into real-life applications with a very concrete impact on people’s lives. With safety a major concern, problems of opacity, bias and risk are pressing, and creating Trustworthy AI (TAI) is thus of paramount importance.
Advances in AI design still struggle to offer technical implementations driven by conceptual knowledge and qualitative approaches. This project aims to address these limitations by developing design criteria for TAI based on philosophical analyses of transparency, bias and risk, combined with their formalization and technical implementation for a range of platforms, including both supervised and unsupervised learning. We argue that this can be achieved through the explicit formulation of epistemic and normative principles for TAI, their development into formal design procedures and their translation into computational implementations.
A first objective of this project is to formulate an epistemological and normative analysis of how bias and risk undermine TAI, not only with respect to the reliability of such systems but also to their social acceptance. Accordingly, we will analyse the Meaningful Human Control (MHC) requirement for more transparent AI systems operating in safety-critical and ethically sensitive domains.
A second objective is to define a comprehensive formal ontology for autonomous decision systems, including a taxonomy of biases and risks and their mutual relations. Our task is to offer a systematic characterization of bias types, to make them amenable to formal and automatic identification, and to define the risks involved in the construction and use of possibly biased complex AI systems.
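To fix ideas, a minimal, purely illustrative fragment of such a taxonomy can be written as Description Logic TBox axioms; every concept and role name used here (e.g. SamplingBias, induces) is an assumption introduced for illustration, not part of the project's actual ontology.

\[
\begin{aligned}
&\mathsf{SamplingBias} \sqsubseteq \mathsf{DataBias}, \qquad \mathsf{DataBias} \sqsubseteq \mathsf{Bias}, \qquad \mathsf{AlgorithmicBias} \sqsubseteq \mathsf{Bias},\\
&\mathsf{Bias} \sqsubseteq \exists\,\mathsf{induces}.\mathsf{Risk}, \qquad \mathsf{UnacceptableRisk} \equiv \mathsf{Risk} \sqcap \exists\,\mathsf{affects}.\mathsf{SafetyCriticalDomain}.
\end{aligned}
\]

Encoded in a machine-readable format such as OWL, axioms of this kind would allow standard reasoners to classify a documented bias and to retrieve the risks it induces automatically.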
A third objective is to design (sub-)symbolic formal models to reason about safe TAI, and to produce associated verification tools. We will articulate principles of opacity, bias and risk in terms of cognitive representation, through extensions of Description Logics, and of inferential uncertainty modelling, through proof-theoretical and semantic approaches to trust, in forms amenable to formal verification.
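As an example of the kind of property such models should make checkable, one standard formalization of outcome bias is approximate demographic parity; the protected attribute \(A\), the groups \(a, b\) and the tolerance \(\epsilon\) are placeholders, and this is only one of many candidate formalizations rather than a commitment of the project:

\[
\bigl|\Pr(\hat{Y}=1 \mid A=a) \;-\; \Pr(\hat{Y}=1 \mid A=b)\bigr| \;\le\; \epsilon \qquad \text{for all protected groups } a, b.
\]

Once a bias notion is stated in this form, it can serve as a specification against which a trained model is formally or statistically verified.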
Finally, a fourth objective is to develop a novel computational framework for the explanation capabilities of TAI systems, aimed at mitigating the opacity of Machine Learning (ML) models in terms of the hierarchical structure and compositional properties of middle-level features.
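As a rough illustration of what an explanation in terms of middle-level features can look like for an image classifier, the sketch below scores superpixels by the prediction drop caused by masking them out; the classifier interface predict_proba, the choice of superpixels as middle-level features and the mean-colour masking strategy are all assumptions made here for illustration, not the project's method.

```python
# Minimal sketch: perturbation-based relevance of middle-level features
# (superpixels) for an image classifier. Assumes an HxWxC float image and a
# hypothetical predict_proba(batch) -> (n, n_classes) probability function.
import numpy as np
from skimage.segmentation import slic

def middle_level_relevance(image, predict_proba, target_class, n_segments=50):
    """Score each superpixel by how much masking it lowers the target score."""
    segments = slic(image, n_segments=n_segments, start_label=0)
    baseline = predict_proba(image[np.newaxis])[0, target_class]
    scores = {}
    for seg_id in np.unique(segments):
        masked = image.copy()
        # Replace one superpixel with the image's mean colour.
        masked[segments == seg_id] = image.mean(axis=(0, 1))
        perturbed = predict_proba(masked[np.newaxis])[0, target_class]
        scores[int(seg_id)] = float(baseline - perturbed)  # larger drop = more relevant
    return segments, scores
```

Grouping high-scoring segments into larger parts would be one way to recover the hierarchical, compositional structure of explanations that this objective refers to.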
Overall, this project will advance the state of the art on TAI by: developing an epistemically and ethically guided analysis of opacity, bias and risk; investigating the integration of logical symbolic systems with currently applied statistical techniques; and supporting the verification and implementation of less opaque and more trustworthy systems.