The Fourth of the Six Key Challenges
The issue of what constitutes unfair bias will inevitably be context-specific and will involve many wider factors, such as local culture and societal attitudes.
This white paper explores the broad classes of bias, looking at how and where bias is likely to enter. These classes can serve as a first step toward identifying the key questions AI developers should consider.
The key classes of bias identified in this paper are:
Development and training bias.
Sample/data bias.
Outcome bias.
4.4.1 Development & Training Bias
The strategic or business intent should be properly and fairly reflected in the AI model development. The developer plays the key role of translating the goals, objectives, and strategic intent of the business into a set of attributes that will then be used for AI training and inference. The choice of which attributes to include is a major source of possible bias.
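As a concrete illustration of attribute-selection bias, the following minimal Python sketch screens a candidate attribute for acting as a proxy for a protected attribute before it is admitted into training. The function, column names, and data are illustrative assumptions, not part of this paper.

```python
import pandas as pd

def proxy_score(df: pd.DataFrame, attribute: str, protected: str) -> float:
    """How well `attribute` predicts `protected`: the weighted purity of the
    protected attribute within each value of the candidate attribute.
    A score near 1.0 means the attribute effectively reveals group membership."""
    purities = df.groupby(attribute)[protected].apply(
        lambda g: g.value_counts(normalize=True).max())
    sizes = df.groupby(attribute)[protected].count()
    return float((purities * sizes).sum() / sizes.sum())

# Illustrative data: "postcode" is a near-perfect proxy for "group".
df = pd.DataFrame({
    "postcode": ["A", "A", "B", "B", "A", "B"],
    "income":   [30, 32, 55, 58, 31, 60],
    "group":    ["x", "x", "y", "y", "x", "y"],
})
print(proxy_score(df, "postcode", "group"))  # 1.0 -> likely proxy; review before use
```

A high score does not prove unfairness, but it flags attributes whose inclusion can silently reintroduce the bias that excluding the protected attribute was meant to prevent.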
4.4.2 Sample/Data Bias
Bias can enter the data set as a result of the distribution of the sample data or of the character of the samples themselves.
Bias due to the data itself has been the focus of attention in the debate so far. There is also a risk that bias may relate to data access: for example, certain private data might be excluded from selected processes even though, had it been accessible to the system, it could have helped remove bias. For recommendations on how to protect the integrity of a data set, see section 5.1.
By extension, being able to get access to certain classes of data (for example, private data) could be the only way to verify if an AI is unbiased.
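As a concrete illustration of the distribution point above, the following minimal Python sketch compares how each group is represented in a training sample against a reference population and flags material gaps. The group labels, reference shares, and tolerance are illustrative assumptions.

```python
from collections import Counter

def representation_gaps(sample_groups, reference_shares, tolerance=0.05):
    """Return groups whose observed share in the sample deviates from the
    reference population share by more than the tolerance."""
    counts = Counter(sample_groups)
    total = sum(counts.values())
    gaps = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total
        if abs(observed - expected) > tolerance:
            gaps[group] = {"observed": observed, "expected": expected}
    return gaps

sample = ["a"] * 80 + ["b"] * 15 + ["c"] * 5
reference = {"a": 0.60, "b": 0.30, "c": 0.10}
print(representation_gaps(sample, reference))
# 'a' is over-sampled (0.80 vs 0.60) and 'b' under-sampled (0.15 vs 0.30)
```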
4.4.3 Outcome Bias
Two individuals with similar characteristics, as measured by the metrics defined for a particular task, should get a similar outcome. However, even though an outcome bias is, at its root, linked to an implementation, training, or data-sample bias, a completely outcome-bias-free state is not achievable in practice, because AI systems are never perfectly accurate. If we want to ensure all groups and individuals are treated the same way, then algorithm developers must ensure that the probabilities of a false positive and of a false negative are equal across groups.
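The equal-error-rate condition above can be checked directly. The following minimal Python sketch computes the false-positive and false-negative rates per group; the labels, predictions, and group names are illustrative assumptions.

```python
def error_rates_by_group(y_true, y_pred, groups):
    """Per-group false-positive rate (FPR) and false-negative rate (FNR).
    Equal treatment in the sense above requires both to match across groups."""
    rates = {}
    for g in set(groups):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        fp = sum(1 for i in idx if y_pred[i] == 1 and y_true[i] == 0)
        fn = sum(1 for i in idx if y_pred[i] == 0 and y_true[i] == 1)
        negatives = sum(1 for i in idx if y_true[i] == 0)
        positives = sum(1 for i in idx if y_true[i] == 1)
        rates[g] = {"fpr": fp / negatives if negatives else 0.0,
                    "fnr": fn / positives if positives else 0.0}
    return rates

y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
groups = ["m", "m", "m", "m", "f", "f", "f", "f"]
print(error_rates_by_group(y_true, y_pred, groups))
# Group "m" gets fpr = fnr = 0.5 while "f" gets 0.0 -> unequal treatment
```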
4.4.4 Other Biases
In federated learning, there are many possible causes of bias:
Bias introduced by the sampling of parties and how they are queried (e.g. network availability may bias how each party contributes).
A model may be trained on a smaller, specific set of data from strongly heterogeneous sources (e.g. due to the geo-location of the different parties).
The fusion algorithm, depending on how it weights the contributions from different parties, may further amplify or introduce bias.
The complexity lies in the integration of models trained on heterogeneous data, as the sketch below illustrates.
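To make the weighting point concrete, here is a minimal Python sketch of a FedAvg-style fusion step: a weighted average of party updates. The party updates, sample counts, and the sample-count weighting scheme are illustrative assumptions, not a prescription from this paper.

```python
import numpy as np

def fuse(updates, weights):
    """Weighted average of party model updates (a common FedAvg-style fusion).
    Weighting by sample count lets data-rich parties dominate the result,
    which can amplify bias against smaller parties."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                       # normalize contribution weights
    return np.average(np.stack(updates), axis=0, weights=w)

party_updates = [np.array([1.0, 0.0]),   # large party: 10,000 samples
                 np.array([0.0, 1.0])]   # small party: 500 samples

print(fuse(party_updates, [10_000, 500]))  # ~[0.95 0.05]: large party dominates
print(fuse(party_updates, [1, 1]))         # [0.5 0.5]: equal weighting
```

Choosing the weights is itself a fairness decision: neither sample-count nor equal weighting is neutral once the parties' data are heterogeneous.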
Seeking Assurance
Does the system treat all groups and individuals the same way?
Providing Detailed Information
Has the system integrator considered possible bias in the collection of data?
Has the technology developer considered possible bias in the implementation of the algorithm?
Has the system user considered possible bias in the interpretation and use of the AI-based recommendations?