Leveraging out-of-distribution testing to build robust machine learning systems
Machine learning (ML) systems that learn directly from data offer powerful new capabilities and perform remarkably well on a variety of tasks. However, their inherent nondeterminism presents a fundamental challenge to reliability, which becomes a pressing problem as these systems are increasingly used in production. To tackle some of these challenges, we put forward the following thesis: out-of-distribution testing is essential for producing robust machine learning systems.
This thesis studies and formalises properties that an ideal machine learning system should satisfy. We propose a set of automated techniques to uncover instances where existing ML systems violate these properties, and we suggest ways to mitigate the broader classes of errors uncovered in the course of our testing.
In this dissertation, we test the robustness of machine learning models to changes in domain, distribution and target population. We uncover tens of thousands of errors in a broad suite of machine learning models from leading software providers such as Google, OpenAI, Amazon, Microsoft Azure, and IBM. In particular, we look at models dealing with text, image and audio inputs. We then propose a variety of approaches to mitigate the exposed weaknesses.
This dissertation serves as a reminder of the importance of thoroughly testing these models before deployment, as they can cause societal, economic and reputational damage. They could even cause fatalities, as they are increasingly deployed in safety-critical fields. In summary, we encourage thorough testing of machine learning models before they are deployed in production.
Speaker’s profile

Sai Sathiesh Rajan is a PhD student at the Singapore University of Technology and Design. His research concerns the testing of AI/ML systems, with a particular focus on robustness and fairness. He also holds master’s and bachelor’s degrees from Georgia Tech.