NEWS 20240802

Synthetic data for fairer AI systems

The creation and use of synthetic data is a known and important method to mitigate bias and unfairness in AI systems. On the one hand, synthetic data can be used to augment existing data sets and reduce or remove existing biases, for example with respect to class distributions. On the other hand, synthetic data generation can be used to create deliberately polarized data that introduces bias, which can then be used to determine whether existing AI solutions are sensitive to such a degree of bias. The main activity of Work Package 7 of the Aequitas project is to create a data synthesizer to repair and mitigate bias in data.
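To give a flavour of the augmentation idea, the minimal sketch below (Python, using the open-source scikit-learn and imbalanced-learn libraries; it is not the Aequitas data synthesizer) rebalances a skewed class distribution by generating synthetic minority-class samples with SMOTE:

```python
# Minimal sketch (not the Aequitas synthesizer): rebalancing a skewed
# class distribution with synthetic minority-class samples via SMOTE.
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Toy data set with a roughly 9:1 class imbalance, standing in for real, biased data.
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.9, 0.1], random_state=0
)
print("before:", Counter(y))  # roughly 900 vs. 100 samples

# SMOTE interpolates between minority-class neighbours to create synthetic
# samples until both classes are equally represented.
X_balanced, y_balanced = SMOTE(random_state=0).fit_resample(X, y)
print("after:", Counter(y_balanced))  # both classes equally sized
```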

Action Plan

Bias exists in many different forms and can originate from various sources. Often, bias functions as a mental shortcut or heuristic, which consciously or unconsciously affects how we think or feel about something. Biases may, however, lead to a structurally unequal or unfair attitude towards (or treatment of) one thing, person, or group over another. Over 180 cognitive biases have been identified, and they are embedded in our preferences, our decision-making, our education, our organizational structures, and our societal constructs. Consequently, bias can find its way into the data that we collect.

Reflecting on when such biases are relevant, and whether they are harmful, is an important societal discussion. From a technical perspective, developers of AI systems need to take action to prevent harmful bias from ending up in their final product, and to ensure that the AI system produces outputs that are fair to all its users.

Many forms of bias have been shown to negatively affect applications and decision-making. For instance, during data collection, well-known biases like sampling bias (e.g., self-selection) can occur, but lesser-known biases like measurement bias, chronology bias (better known in the AI domain as data drift), confirmation bias, or compliance bias can also find their way into the data. Such biases affect how data sets are created, what data ends up being collected, and how representative the data is of all prospective users.

Non-representative data can pose problems when we use it to draw general conclusions for a large group of people. A well-known example is that clinical studies of heart failure have historically been based on populations of predominantly men. Consequently, women are often under-diagnosed for heart failure, and are more likely than men to die within 6 months of discharge. Another example where the data were not representative of the actual population of users is an AI-based app that claimed to detect Alzheimer’s disease from speech quite successfully. Unfortunately, it turned out to only work well for users with the same regional speech accent as the region where the app was developed.

Examples of bias during the model (or algorithm) development stage are overfitting and underfitting. Overfitting occurs when models generalize using features that are not relevant to the task at hand, for instance when models learn to connect the absence or presence of a pathology to the absence or presence of a side marker (L/R) on an X-ray image. Underfitting happens when models undergeneralize because they are unable to capture the structure in the data during model training. Latent bias occurs when a model incorrectly correlates concepts, and these relations do not apply more generally. Modeling choices can also introduce bias when proxies are used because the outcome of interest is not available, for instance when parameters like diet, sleep patterns, and living circumstances are taken as proxies for health. An algorithm using these parameters to determine insurance eligibility may end up disadvantaging vulnerable groups.
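The side-marker example can be illustrated with a small, hypothetical toy experiment (a minimal sketch with made-up data, not data from the project's use cases): a model trained on data in which a spurious "marker" feature happens to coincide with the label looks excellent on its training set, but degrades as soon as that coincidence disappears.

```python
# Minimal sketch of overfitting to a spurious feature, loosely analogous to
# a model keying on an X-ray side marker instead of the pathology itself.
# Hypothetical toy data, not the Aequitas use-case data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_split(n, spurious_correlated):
    y = rng.integers(0, 2, size=n)
    signal = y + rng.normal(scale=2.0, size=n)            # weak, genuine signal
    marker = y if spurious_correlated else rng.integers(0, 2, size=n)
    return np.column_stack([signal, marker]), y

X_train, y_train = make_split(500, spurious_correlated=True)   # marker equals label
X_test, y_test = make_split(500, spurious_correlated=False)    # marker is random

model = LogisticRegression().fit(X_train, y_train)
print("train accuracy:", model.score(X_train, y_train))  # near 1.0 (relies on the marker)
print("test accuracy:", model.score(X_test, y_test))     # drops toward chance level
```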

Even during the deployment phase of an AI model, bias can occur. For instance, it has been shown that automation of procedures or decisions can lead users to become over-reliant on decision support, especially under high cognitive load, such as in hospital settings.

These examples show that it is important to focus on bias during all development stages of an AI algorithm. Synthetic data can support both repairing and mitigating bias in data.

A tour around WP7

Work Package 7 is one of the eight work packages of the Aequitas project. Philips manages the work package to ensure that the two main components (the data synthesizer and data from the project’s use cases) are realized and available to the other partners in the consortium.

Philips is a well-known and trusted healthtech company; worldwide, many hospitals use, for instance, our MR and CT scanners, Ultrasound devices, or Image Guided Therapy technology. With the increased uptake of AI in recent years, the bias and fairness of our AI solutions have become an important topic for Philips, as they directly impact our purpose to create better care for more people. Philips has therefore also provided one of the six use cases that serve as carriers for the solutions that the Aequitas consortium develops. This use case is about potential bias in the context of ECGs.

Given the historic non-representativeness of typical clinical data on heart failure, ECG is an important area in which to make sure that all patients are treated fairly. This means that the chances that particular groups of patients are over- or underdiagnosed should be minimal. To make sure that AI solutions for ECG can be trained on fair and balanced data sets, the use case motivates creating a data synthesizer to generate synthetic ECGs. These synthetic ECG data will be made available to the research community as part of the Aequitas project deliverables.
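As an indication of what generating synthetic ECG signals can look like in practice (a minimal sketch using the open-source NeuroKit2 library; this is not the WP7 synthesizer and it is not based on any use-case data), traces can be simulated for a controlled range of heart rates, which makes it straightforward to compose a deliberately balanced toy data set:

```python
# Minimal sketch (illustrative only, not the WP7 ECG synthesizer): generating
# synthetic ECG signals with the open-source NeuroKit2 library, varying the
# heart rate to build a small, deliberately balanced toy data set.
import neurokit2 as nk

synthetic_ecgs = []
for heart_rate in (55, 70, 85, 100):
    signal = nk.ecg_simulate(
        duration=10,          # seconds of signal
        sampling_rate=500,    # samples per second
        heart_rate=heart_rate,
        noise=0.05,           # mild additive noise
        random_state=42,
    )
    synthetic_ecgs.append((heart_rate, signal))

print(f"generated {len(synthetic_ecgs)} synthetic ECG traces")
```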

In addition to the Philips use case, Work Package 7 also ensures that the data belonging to the five additional use cases are available for all consortium partners to work on. These data sets can be used to test the implementations of existing bias metrics, but also to validate the results of new metrics that other partners in this project are developing. While not all data sets can be made publicly available during the project, their metadata are published as part of the Aequitas project deliverables. Because the context of the use cases is sometimes very specific, Philips is cooperating with the University of Bologna to create a synthesizer for image data, an area in which these academic partners are particularly strong.

All of these use cases bring data into the consortium that we need to train and test our methods and techniques.

Conclusion

To summarize, synthetic data is an important component of the Aequitas project because it plays a part in all stages of the development of an AI solution: from the pre-processing stage, where data quality can be improved with synthetic data, to the model training stage, when it becomes clear that a model generates unfair outputs, to the validation stage, where (preexisting) models need to be assessed for their sensitivity to biased data input. In this way, Work Package 7 has a foundational role in the project, enabling the other work packages to develop and test their methods and solutions.

About the authors

Paul Lemmens is a senior scientist at Philips Innovation Engineering working on a range of topics related to Responsible AI. His colleagues Sri Andari Husen, Maarten Rietbergen, and Arlette van Wissen are senior scientists as well. Sri is an expert on synthetic ECG, Maarten supports the project with architecture and the integration of the synthesizer into the main Aequitas service, and Arlette is an opinion leader on Responsible AI.
