DENVER, Aug 15 — As the saying goes, don’t be fooled by appearances. And yet, artificial intelligence seems to have fallen into that trap. According to an American study from the University of Colorado, some AI-based tools being used in programmes for treating patients in the field of mental health could be relying on biased information.

Stereotypes die hard. And according to researchers at the University of Colorado Boulder, algorithms have picked up these clichés too. A study led by Theodora Chaspari, associate professor of computer science, reveals a worrying reality: artificial intelligence (AI) tools used to screen for mental health issues can be biased with respect to patients’ gender and ethnicity. This discovery raises crucial questions about the fairness and effectiveness of mental health technologies.

Published in the journal Frontiers in Digital Health, the study, entitled “Deconstructing demographic bias in speech-based machine learning models for digital health”, demonstrates that algorithms meant to screen for mental health problems such as anxiety and depression can make assumptions based on patients’ gender and ethnicity. “If AI isn’t trained well, or doesn’t include enough representative data, it can propagate these human or societal biases,” said Chaspari.

After running people’s audio samples through a set of machine learning algorithms, the researchers discovered several flaws that are potentially dangerous for patients. According to the results, the algorithms were more likely to underdiagnose women at risk of depression than men at risk.
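
One common way to quantify this kind of underdiagnosis is to compare, group by group, the share of genuinely at-risk people that a screening model fails to flag (the false-negative rate). The sketch below illustrates the idea only: the function name, the record layout and the toy data are assumptions made up for this example, and it is not the method or the data used in the study.

```python
from collections import defaultdict


def false_negative_rate_by_group(records):
    """Return, per group, the fraction of at-risk people the model missed.

    `records` is an iterable of (group, truly_at_risk, flagged_by_model)
    tuples. This layout is hypothetical and chosen only for illustration.
    """
    missed = defaultdict(int)
    at_risk = defaultdict(int)
    for group, truly_at_risk, flagged in records:
        if truly_at_risk:
            at_risk[group] += 1
            if not flagged:
                missed[group] += 1
    # Only groups with at least one at-risk person appear in the result.
    return {group: missed[group] / at_risk[group] for group in at_risk}


# Invented toy data, NOT taken from the study.
toy_records = [
    ("women", True, False),
    ("women", True, False),
    ("women", True, True),
    ("women", False, False),
    ("men", True, True),
    ("men", True, True),
    ("men", True, False),
    ("men", False, False),
]

rates = false_negative_rate_by_group(toy_records)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.2f} of at-risk cases missed")
```

A markedly higher rate for one group than for another would suggest that the model underdiagnoses that group.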

In addition to discriminating on the basis of gender, AI can also misjudge patients’ speech. According to the researchers, people suffering from anxiety tend to speak at a higher pitch and with more agitation, while showing signs of shortness of breath. In contrast, people with signs of depression are more likely to speak softly and in a monotone.

To put these hypotheses to the test, the researchers analysed the participants’ behaviours as they gave a short speech in front of a group of people they didn’t know. Another group of men and women talked to one another in a clinical-like context.

While in the first group, people of Latin American origin reported being more nervous than the white or Black participants, the AI did not detect this. In the second group, the algorithms assigned the same level of depression risk to both men and women, yet the latter actually had more symptoms. “If we think that an algorithm actually underestimates depression for a specific group, this is something we need to inform clinicians about,” stressed Theodora Chaspari. — ETX Studio