Artificial intelligence struggles to understand "no", and that is a problem, especially in the medical field

However fast they are advancing, artificial intelligences – even the most refined – still have several problems. Some are all too human, such as the famous hallucinations they are prone to, or the biases they often inherit from their human "trainers". Others are more exotic: unlike a human being, for example, artificial intelligence seems to struggle a great deal to understand the word "no", or rather, the negation of something. The phenomenon has recently been described by a team of researchers at the Massachusetts Institute of Technology, and it could create many problems above all for applications of artificial intelligence in the medical field.

For their study, the researchers built a dataset of almost 80,000 pairs of images in which an object is present in one image and absent in the other, each accompanied by a caption describing the presence or absence of the object in question. This tool, christened NegBench, was then used to test several of the most popular vision-language models, that is, artificial intelligence models capable of understanding and analyzing both text and images. Among them were 10 versions of OpenAI's CLIP model and a recent Apple model called AIMv2.

Two of the CLIP models used had also been previously trained specifically for the interpretation of medical imaging. In the first of their tests, the researchers asked the AI models to identify images that contained a certain object and did not contain another – for example, images showing tables but not chairs – and the difficulty language models have with negation emerged immediately: on average, the tested models achieved 80 percent precision in recognizing the objects present in an image, but only 65 percent when asked to identify images based on the objects that were not present.
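To give a concrete idea of what such a test looks like in practice, here is a minimal sketch of a negated retrieval query run against OpenAI's publicly available CLIP model through the Hugging Face transformers library. It is not the researchers' actual NegBench code: the query and the image file names are illustrative placeholders.

```python
# Sketch of the first test: retrieve the image matching a query that contains a negation.
# The model checkpoint is OpenAI's public CLIP; the file names are placeholders only.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

query = "a room with a table but no chairs"  # negated query
image_files = ["table_only.jpg", "table_and_chairs.jpg", "empty_room.jpg"]  # placeholders
images = [Image.open(f) for f in image_files]

inputs = processor(text=[query], images=images, return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**inputs)

# logits_per_text has shape (1, n_images): one similarity score per candidate image.
# A model that truly understood "no chairs" would rank "table_only.jpg" first;
# in the MIT tests, accuracy on such negated queries dropped to about 65 percent.
scores = out.logits_per_text.squeeze(0)
for idx in scores.argsort(descending=True).tolist():
    print(f"{scores[idx].item():.2f}  {image_files[idx]}")
```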

The AIs that do not understand "no"

In a second experiment, the researchers then tested the two models trained for the recognition of medical images, asking them to choose, from a list of two possible answers, the right caption to describe the content of a radiograph – answers referring not only to features visible in the image, but also to those not visible, such as the presence, or absence, of signs attributable to pneumonia. In the cases involving a negation, the better of the two models reached just 40 percent accuracy, despite this being a trivial task for a flesh-and-blood doctor.
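The multiple-choice setting can be sketched in the same way: given one radiograph and two candidate captions, one of which hinges on a negation, the model simply picks the caption it scores higher. Again, this is only an illustration; the generic CLIP checkpoint stands in for the medically fine-tuned models used in the study, and the image path is a placeholder.

```python
# Sketch of the second test: pick the correct caption for a chest radiograph
# from two candidates, one of which hinges on a negation.
# Generic OpenAI CLIP checkpoint used for illustration; image path is a placeholder.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

xray = Image.open("chest_xray.png")  # placeholder path
candidates = [
    "a chest X-ray showing signs of pneumonia",
    "a chest X-ray with no signs of pneumonia",
]

inputs = processor(text=candidates, images=xray, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # shape (1, 2): one score per caption

choice = logits.argmax(dim=-1).item()
print("model picks:", candidates[choice])
```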

According to the MIT researchers, the problem stems from the learning architecture used to train these artificial intelligences: the so-called transformers, developed by Google researchers in 2017 and used today by almost all AI systems that process natural language. As Karin Verspoor, an expert at the Royal Melbourne Institute of Technology, explained in the pages of New Scientist, these are models designed to recognize the specific meaning of terms in relation to the context in which they appear. The fact that negations such as "no" and "not" are largely independent of context, and can recur at many different points of a sentence, means that these models have difficulty interpreting their meaning and end up ignoring them more often than they should.
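One way to see the effect Verspoor describes is to compare the text embeddings that a CLIP-style model produces for a sentence and its negated counterpart: the two vectors typically come out almost identical, which is why the negation carries so little weight when the model matches text to images. The snippet below is an illustrative check with OpenAI's public CLIP text encoder, not part of the MIT study.

```python
# Illustration of why negation gets lost: the text embeddings of an affirmative
# sentence and its negated counterpart come out very similar, so downstream
# matching barely distinguishes them. Uses the text tower of OpenAI's public CLIP.
import torch
from transformers import CLIPModel, CLIPTokenizer

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

sentences = ["a chest X-ray showing pneumonia",
             "a chest X-ray showing no pneumonia"]
tokens = tokenizer(sentences, return_tensors="pt", padding=True)
with torch.no_grad():
    emb = model.get_text_features(**tokens)

emb = emb / emb.norm(dim=-1, keepdim=True)  # normalize to unit length
cosine = (emb[0] * emb[1]).sum().item()     # cosine similarity of the pair
print(f"cosine similarity: {cosine:.3f}")   # typically very high for such pairs
```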

These difficulties could have serious consequences, especially in a field such as medicine, where what is absent from an image – a tumor, say, or a fracture – is often as important as what is present. By specifically training the vision-language models to understand negations, the MIT researchers managed to improve accuracy by 10 percent in the first of the two experiments and by 30 percent in the second. A patch, though, that does not solve the problem at its root: to do that, it will be necessary to change the very learning architectures with which the AIs are trained, and that is not an easy or quick goal to achieve.