Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) 🔍
“Don't think this COMPAS is pointing the right way...”
Hi there EInsighters, and welcome to this month’s issue on Correctional Offender Management Profiling for Alternative Sanctions (COMPAS).
I’m your curator, Idil, and I’ll be taking you on a journey through one of the most talked-about tech scandals in the world, while also drawing on the EInsight of our selected EI Experts.
So let’s dive right in - shall we?
A company called Northpointe Inc. (this was well before it merged with Courtview Justice Solutions Inc. and Constellation Justice Systems Inc. back in early 2017 to form equivant) created a case management and decision support tool by the name of Correctional Offender Management Profiling for Alternative Sanctions (COMPAS). The tool has become a court favourite in the U.S. and, despite sustained criticism, continues to be used in at least 46 states across the country. So what does it do exactly? It assesses the likelihood of a defendant reoffending in the future. Considering U.S. history and the diversity of its population, it will come as no surprise to any of you that the tool wasn't exactly fair to everyone. And that, EInsighters, is exactly what ProPublica’s 2016 report revealed - here’s just one of the statistics from the team’s findings: “The violent recidivism analysis showed that even when controlling for prior crimes, future recidivism, age, and gender, black defendants were 77 percent more likely to be assigned higher risk scores than white defendants.”
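For the technically curious among you, here’s roughly what a “controlling for other factors” comparison looks like in code. This is only a sketch on an assumed dataset: the DataFrame `df` and its column names are my own illustration, not ProPublica’s data or methodology.

```python
# A minimal, illustrative sketch of a "controlling for covariates" analysis -
# not ProPublica's actual code or data. Assumes a hypothetical DataFrame `df`
# with columns: high_risk (0/1), race, priors_count, reoffended (0/1), age, sex.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def high_risk_odds_ratios(df: pd.DataFrame) -> pd.Series:
    """Fit a logistic regression of the tool's risk label on race while
    holding prior crimes, future recidivism, age and sex fixed, and return
    the odds ratio for each term."""
    model = smf.logit(
        "high_risk ~ C(race) + priors_count + reoffended + age + C(sex)",
        data=df,
    ).fit(disp=False)
    # exp(coefficient) is an odds ratio relative to the reference category;
    # an odds ratio of ~1.77 on the race term is the kind of number that gets
    # reported as "77 percent more likely to be assigned a higher risk score".
    return np.exp(model.params)
```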
But could it have been avoided?
Well, I’ve asked just that to our EI Experts and this is what they had to say...
When it comes down to business, Anna argues that companies like Northpointe end up in the news for all the wrong reasons because of two main issues: (1) a lack of education, and (2) a lack of accountability in the ethical sense.
“Data scientists learn a lot about the technical aspects when they embark on their learning journey - however, most courses don’t take into account the nuances of ethics. Data scientists don’t necessarily need to become ethics experts, but they need to at least be aware of the ethical implications of their work. Even if it’s not technically ‘their job to do the ethical evaluation’, companies need people within them that can carry out these ethical evaluations - people who will question where the datasets come from, whether they were made for other purposes, and so on. There needs to be someone within the company that can speak out and say, ‘yes, these datasets are obviously biased, but let’s work with them and try to make them less so, and improve the results so the results themselves won’t be biased’.”
Another thing Anna mentioned was this potential lack of accountability -
“Most companies have clear distinctions when it comes to responsibility and accountability. The salesperson knows they’re going to be held accountable for the sales side, whereas a technical person will be expected to handle the technical aspects of the product. But when these roles are so clearly defined and there is no specific person accountable for ethics, whose role is it to focus on it? This is where I think the new AI regulations will start to provide a helpful guideline for companies (at least for those located within the European Union); the regulations will define products by the level of risk they carry and set clear obligations for companies. Had the COMPAS tool been deployed under these regulations, I'd argue it would fall under ‘High Risk’, which would mean appropriate human oversight measures would need to be put in place to minimise risk.”
My question is, from a business perspective, is this actually a viable product?
“The product itself isn’t a terrible one. If the team had had that ethical awareness - if they had understood the implications that this tool would have on people once it was deployed, and they had worked on the datasets and the models to make sure that they were efficient whilst also being fair - the product would not have gathered the criticism it did. When we talk about fairness and justice in AI, we say that products need to prevent bias - that all the data should be of quality and representative, but that there should be an evaluation of the bias on that data. That was completely missing but it need not be the case going forward.”
Flavio, meanwhile, sees it first and foremost as a failure of method -
“This is a classic example of misuse of Machine Learning that resulted from a shallow understanding of the theory, methods and techniques embedded into modern tools available to build and use machine learning models. In particular, we’re referring to the deficient differentiation between retrospective identification of patterns in historical data and inference of causal relations to perform predictions on future data, which in turn leads to biased decisions grounded on misinterpreted correlations.”
Now, if like me you’re feeling a bit lost here - don’t fret!
Flavio’s got us covered with a hypothetical that helps clarify things further -
Meet George 👨‍💻👋. George is hired as a data scientist by the Secretary of Education to analyse data related to young citizens - more specifically, to find patterns that could help identify children with a high likelihood of experiencing difficulty in learning mathematics. During his analysis, George finds a high correlation between shoe size and math skills, and concludes that small feet mean a future need for additional math tutoring. What did George miss? He didn’t consider that age might be the relevant attribute here; although the co-occurrence of shoe size and math skills had been observed previously as a pattern in historical data, “it would be a mistake to infer statistical dependence between these two features based on this observation and build a decision rule based on it”. When you really think about it, shoe size and math ability are conditionally independent given age - older kids tend to be better at math and to have larger feet than their younger counterparts.
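To make Flavio’s point concrete, here’s a tiny simulation with entirely made-up numbers (a hypothetical age / shoe size / math score dataset, nothing more): the overall correlation looks impressive, but it vanishes once the common cause - age - is held fixed.

```python
# Hypothetical illustration of George's mistake: two variables that are
# strongly correlated overall can be conditionally independent once you
# control for a common cause - here, age. Synthetic data, illustration only.
import numpy as np

rng = np.random.default_rng(0)

age = rng.uniform(6, 12, size=5_000)                      # years
shoe_size = 0.9 * age + rng.normal(0, 0.8, size=5_000)    # grows with age
math_score = 8.0 * age + rng.normal(0, 6.0, size=5_000)   # improves with age

# The marginal correlation looks impressive...
print("corr(shoe_size, math_score) =",
      round(np.corrcoef(shoe_size, math_score)[0, 1], 2))

# ...but within a narrow age band (age held roughly constant) it disappears.
band = (age > 8.9) & (age < 9.1)
print("corr within age ~9         =",
      round(np.corrcoef(shoe_size[band], math_score[band])[0, 1], 2))
```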
So I guess the real question is - should we get rid of historical patterns altogether?
“Historical patterns featuring co-occurrences of ethnicity and criminal recidivism may be observed and can be valuable information to identify the roots of social issues that need to be solved; however, such observed patterns provide no support to infer conditional dependence between these attributes, as should be obvious to a social scientist or an anthropologist. A more valuable interpretation of such observed patterns is as clues to look for the root causes that explain the co-occurrence (and render the attributes conditionally independent), since those root causes would be far better candidates for social intervention than deciding to hold offenders in prison for longer based on the colour of their skin.”
So far this misuse of Machine Learning seems to have had catastrophic effects, and yet the tool is still commonly used in courts across the U.S. But why?
“Some believe that a benefit of using this tool is that it aids in overcoming the societal prejudices of human decision makers and shifts accountability to a more 'objective' actor. This is problematic: referring to the COMPAS tool as objective ignores the social systems and institutions through which these risk assessment algorithms are produced. Predictions are often made based on insufficient or biased information. The algorithm considers individual factors like employment status, age, location, sex, and family history. What this often leads to is a situation where two people charged with the same crime may have drastically different bail or sentencing results based on factors beyond their control. This results in inaccurate forecasts which are detrimental to people of colour.”
I recall there was a popular Wisconsin Supreme Court case back in 2016 which hinted at that - State v Loomis, wasn’t it?
“That’s right. Following that case, independent testing of the COMPAS risk assessment revealed that offenders of colour were more likely to receive higher risk ratings than Caucasian offenders. The Loomis case also pointed to the possibility that the COMPAS tool violates one’s due process rights. What’s even more problematic is its lack of transparency; individuals and courts are unable to assess how the risk scores are arrived at or how the elements are evaluated because it’s a trade secret. So we are left to wonder - how then can fairness be achieved?”
It seems quite peculiar that judges continue to utilise it despite all the criticism. I mean, think about it - promoting a fair and just legal system for all is pretty much Law 101 regardless of the jurisdiction you’re practising in.
So are the judges to be blamed?
“The COMPAS tool is not adequately understood by the judges who make use of it. Because the tool has received a lot of support, judges are inclined to believe it is trustworthy, and some may over-rely on it. Bear in mind that the sentencing process itself is quite complex and subject to various time pressures, which makes judges all the more prone to over-rely on the tool. Therefore, it can be argued that judges are unknowingly promoting racial injustice.”
Among the ways forward, Abigail suggested that: (i) the tool should be evaluated, monitored and updated as frequently as possible to reflect shifting societal norms and to ensure that data is accurate across ethnic and gender groups; (ii) emphasis be placed on transparency with regard to datasets; and (iii) adequate training and retraining be provided for all relevant parties that make use of the COMPAS tool.
“In a bid to advance the criminal justice system, we should not be carried away, but rather we must carefully define the role artificial intelligence plays. Artificial intelligence should support human decision-making in the criminal justice system, not replace it.”
When looking at the COMPAS case, fairness and justice seem to be the key actors with little to no screen time despite the tool’s influence over sentencing decisions.
ProPublica’s report on the topic addresses racial discrimination - according to Tomas, this form of ethical bias can occur on several levels, but two are of particular importance here: (1) bias in the data, and (2) bias in the AI system.
“Bias is inevitable in a dataset. The question is what is acceptable bias and what is not. From an ethical point of view, one important issue is that there is no discrimination based on race or gender. Data bias is the hardest one to get a grip on here. After all, training data is taken from historical data. And it is easy to argue that because of the structures in [American] society and the systemic bias in the American justice system, some form of historical bias has already crept into the data. In this sense, using the software seems to perpetuate and even reinforce discriminatory factors in society. In the COMPAS case, some factors were corrected for, but apparently not enough to obtain a form of fairness and justice. Simply put - data crap in, crap out.
What often gets less attention, however, is the ethics linked to the algorithm itself. Within an algorithm, choices are made. For instance, the choice of statistical method, of the factors taken into account and of the accuracy target can have a major impact on the false positive and false negative results. In this case, racial differences in false positive and false negative rates remain large.”
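For anyone wondering what “checking the algorithm’s errors per group” looks like in practice, here’s a bare-bones sketch. The DataFrame and column names are assumptions of mine for illustration only - not equivant’s data, and not how COMPAS itself works.

```python
# A minimal sketch of the per-group error-rate check described above,
# assuming a hypothetical DataFrame `df` with columns:
#   group      - e.g. the defendant's race
#   reoffended - 1 if the person actually reoffended, else 0
#   high_risk  - 1 if the tool flagged them as high risk, else 0
import pandas as pd

def error_rates_by_group(df: pd.DataFrame) -> pd.DataFrame:
    """False positive / false negative rates per group."""
    rows = []
    for group, g in df.groupby("group"):
        negatives = g[g["reoffended"] == 0]
        positives = g[g["reoffended"] == 1]
        rows.append({
            "group": group,
            # flagged high risk but did not reoffend
            "false_positive_rate": (negatives["high_risk"] == 1).mean(),
            # flagged low risk but did reoffend
            "false_negative_rate": (positives["high_risk"] == 0).mean(),
            "n": len(g),
        })
    return pd.DataFrame(rows)

# Large gaps between groups' false positive / false negative rates are the
# kind of disparity ProPublica reported for COMPAS.
```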
What COMPAS promised was a more objective system to support decisions - but to what extent is that true?
Tomas argues there are still several questions left unanswered -
“How do judges interact with the software? How is their judgement influenced? Do they feel compelled to follow the advice or do they also dare to go against it? Is its use not 'determinative' as the Wisconsin court ruled? Or does a biased system lead to a more unjust system?
If software is introduced into the legal system, the question inevitably arises as to whether the use of software encourages a more just legal system. There are different aspects playing a role here. For example, is there sufficient transparency to safeguard the rights of the defendant - can the judgment be challenged? Does an algorithm still take sufficient account of the specificity of certain cases? Is the probability of recidivism assessed more accurately than before? At least the latter does not seem to be the case here. What is clear is that there is still a lot of work to be done to strengthen legal support systems so that they truly lead to a more just and fair justice system.”
So what should your business do to avoid a similar tech scandal?
Make sure to continuously monitor, evaluate, and update the tool in line with evolving societal norms
Provide adequate training to the relevant individuals, whether that's the data scientists on your team or the end users you're designing for
Ensure human agency and transparency are at the forefront
Or even better, check out our latest offering: the EI ETHICS BOARD.
Let us help you innovate with ethics and mitigate risk.
P.S. We have a webinar coming up on 15 November | 6 pm GMT that will provide you with a blueprint for implementing an AI Ethics Strategy within your company. Register now to receive the webinar recording and a workable PDF!
A massive thank you to our incredible EI Experts for their contribution in the making of this month's issue: Anna Danés, Dr. Flavio S. Correa da Silva, Abigail Ichoku, and Dr. Tomas Folens.
That's all for now, EInsighters! See you next month...
Liked this month's issue? Share it with a friend!
Have a tech scandal you want us to cover? Let me know!