How to spot it and how to stop it

Algorithms are deciding more about our lives than ever before. They make friend suggestions and curate much of the content we view, from TV recommendations to search results. But beyond this, artificial intelligence (AI) algorithms are increasingly being used in sensitive areas, including hiring, security, healthcare and the criminal justice system. And it’s in these fields where algorithm bias can have a genuine impact on people’s lives.

Awareness is growing alongside this impact. Beyond the General Data Protection Regulation (GDPR), the European Commission has set out Ethics Guidelines for Trustworthy AI.

Mainstream news articles, books and even rappers now discuss the data collected on people. Not long ago, 'algorithm' was a word only those in certain technical roles would recognise. Now, thanks to Facebook and Google, it is a term known by many.

This confluence of impact and awareness is building to an imperative for any business that has data and automated decisions.

How AI bias happens

Those involved in building automated or AI decision systems don’t intentionally set out to create biased outcomes. Human-created bias is often already present in the data, and AI systems are only as good as the data fed into their algorithms.

Sometimes bias will come from the choice of fields included in the algorithm. An example is using 'age' as an input to identify a negative health outcome. Other times, the bias is less obvious. For example, a specific feature like 'age' might be excluded from the data, but the model may be able to infer age from other features, such as 'history of cataracts'.
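One way to spot this kind of proxy is to check how strongly a remaining feature tracks the excluded one. The sketch below uses entirely hypothetical data to show the idea: if 'history of cataracts' correlates strongly with age, the model can effectively 'see' age even after the age column is dropped.

```python
# Minimal sketch (hypothetical data): check whether an excluded
# attribute ('age') can be inferred from a remaining feature
# ('history of cataracts') via their correlation.

def correlation(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

ages = [72, 68, 75, 34, 29, 41, 80, 25]
cataracts = [1, 1, 1, 0, 0, 0, 1, 0]  # 1 = history of cataracts

r = correlation(ages, cataracts)
print(f"correlation(age, cataracts) = {r:.2f}")
# A strong correlation means the feature acts as a proxy for age.
```

In practice you would run this kind of check for every candidate feature against every protected attribute you hold data (or proxies) for.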

The embedding of imperfections and prejudices into algorithms can have the effect of amplifying existing biases in society. If we’re not aware of it, this can ultimately lead to unequal outcomes in AI decision-making.

Data as a source of bias

When assessing bias, consider using a data sample that is a fair representation of the population impacted by the model. Evaluating your outcomes compared to bias in your current population may provide a more pragmatic benchmark than comparisons made to a broader population. Be mindful this approach may work in some situations but not all.

Also, consider that you might not have captured data on all the items creating potential bias in your data. The best way forward is to use what you have and seek appropriate proxies for the other potentials. Perhaps you’re concerned about bias on the socio-economic spectrum, for example, but you haven’t captured enough information on this. A good starting point is to leverage the Socio-Economic Indexes for Areas (SEIFA) developed by the Australian Bureau of Statistics.

So, you now have a representative sample plus model inputs. You also have some actuals or proxies for potential protected variables – factors like gender, age and race – that are likely to produce biased outcomes. The next step is checking for bias.

Five tests for detecting bias

Each of these five popular tests for detecting potential bias has a slightly different perspective. Choose whichever is appropriate to the type of bias you most need to avoid.

Let’s say you have a model predicting fraudulent applications and you’re concerned about bias by race. You might choose the Predictive Parity test to guard against bias in situations where fraud is identified. Or, you might select the Equal Opportunity test to check if there is bias in your negative outcomes.

Test 1 - Group Fairness

Assesses whether the positive outcome is predicted in the same proportion across levels of the protected attribute (like gender, age or race).

EXAMPLE: in a model predicting whether a person earns over $100,000, does the model predict the same proportion of men and women reaching this income?

The test compares these proportions and indicates the level of income inequality bias in the outcome.
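The comparison itself is simple. This sketch, using hypothetical prediction data, compares the positive-prediction rate for two groups:

```python
# Minimal sketch (hypothetical data): group fairness compares the
# rate of positive predictions across groups.

def positive_rate(predictions):
    return sum(predictions) / len(predictions)

# 1 = model predicts income over $100,000
pred_men = [1, 0, 1, 1, 0, 1, 0, 1]
pred_women = [1, 0, 0, 1, 0, 0, 0, 1]

rate_men = positive_rate(pred_men)      # 5 of 8
rate_women = positive_rate(pred_women)  # 3 of 8
print(f"men: {rate_men:.2f}, women: {rate_women:.2f}")
# A large gap between the two rates indicates group-level bias.
```

How big a gap counts as bias is a judgment call; some practitioners borrow the 'four-fifths rule' (the lower rate should be at least 80% of the higher) as a rough threshold.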

Test 2 - Predictive Parity

Assesses correct positive predictions.

EXAMPLE: in a model predicting whether prospects will take up a marketing offer, is there bias in the positive prediction of which prospects will take up the offer?

The test looks at whether one state or another has a higher proportion of correct predictions. If so, the model is biased towards that state and potentially biased against another state.
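In other words, Predictive Parity compares precision per group: of the prospects predicted to take up the offer, what share actually did? A sketch with hypothetical data:

```python
# Minimal sketch (hypothetical data): predictive parity compares the
# share of positive predictions that turn out to be correct (precision)
# for each group.

def precision(preds, actuals):
    true_pos = sum(1 for p, a in zip(preds, actuals) if p == 1 and a == 1)
    pred_pos = sum(preds)
    return true_pos / pred_pos

preds_a = [1, 1, 0, 1, 0, 1]   # group A: 1 = predicted to take up offer
actual_a = [1, 1, 0, 0, 0, 1]  # group A: 1 = actually took it up
preds_b = [1, 0, 1, 1, 0, 0]   # group B
actual_b = [1, 0, 0, 1, 1, 0]

prec_a = precision(preds_a, actual_a)  # 3 of 4 positives correct
prec_b = precision(preds_b, actual_b)  # 2 of 3 positives correct
print(f"group A: {prec_a:.2f}, group B: {prec_b:.2f}")
```

If one group's precision is materially higher, the model's positive predictions are more trustworthy for that group, which is the bias this test detects.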

Test 3 - Predictive Equality

Assesses correctly predicted negatives (the reverse of Predictive Parity).

EXAMPLE: in a model predicting whether prospects will take up a marketing offer, is there bias in the predictions of which prospects won’t take up the offer? 

The test compares the proportion of correctly predicted negatives to assess whether it is a correct prediction that these prospects won't take up the marketing offer.
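Following the description above, this sketch (hypothetical data again) compares the share of negative predictions that are correct per group:

```python
# Minimal sketch (hypothetical data): compare the share of negative
# predictions that turn out to be correct, for each group.

def negative_predictive_value(preds, actuals):
    true_neg = sum(1 for p, a in zip(preds, actuals) if p == 0 and a == 0)
    pred_neg = sum(1 for p in preds if p == 0)
    return true_neg / pred_neg

preds_a = [0, 0, 1, 0, 1, 0]   # group A: 0 = predicted won't take up
actual_a = [0, 0, 1, 1, 1, 0]  # one negative prediction was wrong
preds_b = [0, 1, 0, 0, 1, 0]   # group B
actual_b = [0, 1, 0, 0, 0, 0]  # all negative predictions were right

npv_a = negative_predictive_value(preds_a, actual_a)  # 3 of 4
npv_b = negative_predictive_value(preds_b, actual_b)  # 4 of 4
print(f"group A: {npv_a:.2f}, group B: {npv_b:.2f}")
```

Here group A's negative predictions are less reliable than group B's, so prospects in group A are more likely to be wrongly written off.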

Test 4 - Equal Opportunity

Assesses how often there is an incorrect prediction of a negative outcome.

EXAMPLE: in a model predicting whether prospects will take up a marketing offer, how often is a negative outcome incorrectly predicted? 

This test assesses how often a prospect is predicted not to take up the offer, when in fact, they do take it up.
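That rate — actual takers the model missed — is the false-negative rate, compared per group. A sketch with hypothetical data:

```python
# Minimal sketch (hypothetical data): equal opportunity compares the
# false-negative rate (actual positives the model missed) per group.

def false_negative_rate(preds, actuals):
    missed = sum(1 for p, a in zip(preds, actuals) if p == 0 and a == 1)
    actual_pos = sum(actuals)
    return missed / actual_pos

preds_a = [1, 0, 1, 0, 1]   # group A: 1 = predicted to take up offer
actual_a = [1, 1, 1, 0, 1]  # one real taker missed
preds_b = [1, 0, 0, 0, 1]   # group B
actual_b = [1, 1, 1, 0, 1]  # two real takers missed

fnr_a = false_negative_rate(preds_a, actual_a)  # 1 of 4
fnr_b = false_negative_rate(preds_b, actual_b)  # 2 of 4
print(f"group A: {fnr_a:.2f}, group B: {fnr_b:.2f}")
```

The higher false-negative rate for group B means its members are more often denied an offer they would have taken up — the bias in negative outcomes this test is designed to catch.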

Test 5 - Well Calibration

Assesses predictions across the entire distribution of probabilities.

EXAMPLE: in a model predicting that 10% of consumers have the chance to buy a new product, is this a correct prediction? And is the ratio the same across different age groups?

This test assesses whether the model's predicted probabilities match the observed outcomes, and whether there are variations by group, such as by age. If so, it may indicate the model is biased towards or against an age group.
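A calibration check compares the predicted probability against the rate actually observed in each group. In this hypothetical sketch the model predicts a 10% purchase probability for everyone, and we check whether both age groups really buy at that rate:

```python
# Minimal sketch (hypothetical data): well-calibration checks whether
# predicted probabilities match observed rates, within each group.

def observed_rate(outcomes):
    return sum(outcomes) / len(outcomes)

predicted_prob = 0.10  # model predicts 10% purchase chance for everyone

young = [0] * 18 + [1] * 2  # 2 of 20 younger consumers bought
older = [0] * 16 + [1] * 4  # 4 of 20 older consumers bought

rate_young = observed_rate(young)  # 0.10 – matches the prediction
rate_older = observed_rate(older)  # 0.20 – model under-predicts here
print(f"young: {rate_young:.2f}, older: {rate_older:.2f}")
```

A fuller check would repeat this across bins of the predicted probability (e.g. 0–10%, 10–20%, and so on) rather than a single prediction level.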

How to deal with bias

So, you’ve found the right test and discovered some bias? Now what to do about it?

Unfortunately, there’s no easy answer. Some believe that the solution is to remove the bias from the training data and build fair models. But this simplistic view has several weaknesses. Firstly, it could be dangerous to remove bias when developing a model for use on multiple populations. If these populations varied by age distribution, for example, an unbiased model could work differently for each population. If one is significantly different from the next, this might even create bias. 

Secondly, this solution lacks a robust future-proofing approach. It doesn’t account for situations where it is preferable to create an AI model containing bias and then adjust for the bias elsewhere in the system.

In a modelled outcome, the impact of the bias might vary across the range of outcome probabilities. Managing that variation calls for a sophisticated approach. Technical progress towards this is underway, but researchers are not yet there. In the future, it will be standard to have algorithms and automated processes and decisions that automatically create fair outcomes.

Bias and fairness are not the same

If you remove the bias from the AI model, does that mean it becomes fair? Not necessarily.

The bias in your model might be what ensures a fair outcome. For example, a facial recognition model can produce low accuracy in recognising non-white features if it is built on a sample from a country like Australia. Introducing bias is how the model gets better at recognising these features, which is a fairer result for all.

Of course, fairness has many dimensions. Let's look at the example of a credit score model that is biased by income. With this model, lower-income home loan applicants may be less likely to be approved for credit. Some would believe this outcome is fair because it denies credit to those who can't afford it. But others view fairness as equal access to credit for all.

Achieving a balance between fairness and bias isn't always easy, but there are ways forward. Learn more about where to start and how to test for bias by contacting Equifax. Through our tools and technology, we're helping businesses find ways to resolve bias and deliver the best and fairest outcomes.
