Artificial Intelligence and Man

INTERVIEW with Paola Tubaro. Health, food, work: there is hardly a domain in which Artificial Intelligence (AI) is not presented as the great revolution of the future. Behind the fantasy of a new and unprecedented form of automatic intelligence, however, lie behind-the-scenes power games between actors, against a backdrop of hidden labour. Researcher Paola Tubaro spoke to us in depth about these shadow workers, without whom no "Artificial Intelligence" would be able to function.

The original article was published in French.

There is real excitement around AI at the moment. How do you explain it?

Paola Tubaro: The excitement around AI begins with a semantic shift that has been under way for some years now. Before, we used to speak about big data and algorithms; now it is AI that is on everyone's lips. But it really is the same thing! This recent craze for AI hides a technology that is not so recent: machine learning (apprentissage automatique in French). This technique allows algorithms to analyse data, detect regularities and identify correlations, for example between people's skills and the degrees they have obtained. It is from these correlations that they deduce the rules to apply: if a person has attended a certain school, then that person should be recruited. It is this technology that generates so many fantasies around a new form of "intelligence".
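
To make this concrete, here is a minimal sketch of what "deducing a rule from correlations" can look like in practice, using the scikit-learn library and purely invented data (the feature names and numbers are illustrative, not from the interview): the programmer never writes the hiring rule; the algorithm extracts it from past decisions.

```python
# Minimal sketch: the algorithm infers an "if ... then ..." hiring rule
# from past decisions instead of a programmer writing it by hand.
# All data below is invented for illustration.
from sklearn.tree import DecisionTreeClassifier, export_text

# Each candidate: [attended_school_X (0/1), years_of_experience]
candidates = [[1, 2], [1, 5], [0, 3], [0, 6], [1, 1], [0, 2]]
hired = [1, 1, 0, 1, 1, 0]  # past hiring decisions

model = DecisionTreeClassifier(max_depth=2).fit(candidates, hired)
print(export_text(model, feature_names=["school_X", "experience"]))
# The printed tree is a set of "if ... then ..." rules the machine deduced
# a posteriori from the data -- including whatever patterns, fair or not,
# the past decisions contained.
```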

For companies using AI, acknowledging that they rely on workers tends to be very embarrassing: as a user, we think we are talking to a machine, not a human listening to us from the other side of the world!

What makes this technology intelligent?

P.T: Machine learning is associated with a form of intelligence because the algorithms learn by themselves, like a child who, after seeing two dogs, will be able to recognise the third and all subsequent dogs, having integrated the characteristics of dogs beyond their individual differences. In this respect, machine learning differs from the classic operation of algorithms, which systematically associate an "if" condition with a "then" decision. The decision taken by a machine learning algorithm is the result of a posteriori observations, not of a rule set a priori. In the 1980s, this technology was limited by the computational capabilities of the computers of the time and by the lack of available data. Today, these limitations have been overcome, which raises new questions and problems. By relying on large amounts of data, machine learning reproduces the discriminations embedded in that information. This is the first problem.
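
A toy illustration of this first problem, again with invented data and scikit-learn (the group names and scores are hypothetical): when past decisions systematically favoured one group, the learned model reproduces that discrimination, even for candidates whose other characteristics are identical.

```python
# Toy illustration (invented data): biased training decisions produce
# a biased learned rule.
from sklearn.linear_model import LogisticRegression

# Features: [belongs_to_group_A (0/1), test_score]
# Historical labels depend only on group membership -- a biased record.
X = [[1, 0.9], [1, 0.7], [1, 0.4], [0, 0.9], [0, 0.7], [0, 0.4]]
y = [1, 1, 1, 0, 0, 0]

clf = LogisticRegression().fit(X, y)

# Two candidates with identical scores but different groups:
print(clf.predict([[1, 0.8], [0, 0.8]]))
# Expected output: [1 0] -- group membership, not the score, drives the
# decision, because that is what the historical data encoded.
```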

On this issue of the risk of biased algorithms, companies are setting up ethics departments... What do you think?

P.T: These departments are in reality above all a means of self-regulation, and therefore of escaping external regulation that would likely be tougher on these companies. The recent dismissal of researcher Timnit Gebru, who was working on AI ethics at Google, shows how limited these commitments to ethics are. The problem is that these companies, especially the largest ones, manage to escape democratic control. This brings me to the second problem with machine learning algorithms, which rely on increasingly massive amounts of data: to function, they require many shadow workers. This is the hidden, laborious side of AI.

How can we explain our belief in 100% automated algorithms, and therefore our ignorance of the human arsenal they require to function?

P.T: We must first distinguish the different types of workers behind these AIs. There are those, far upstream, who annotate images to enable autonomous vehicles to recognise signs and find their way around a city, or who record their voices to train voice assistants, for example. Further downstream, there are other workers who deal in real time with requests that AIs cannot resolve: a voice command issued by a user somewhere in the world that is not explicit enough, for example.

With this distinction in place, we can identify two main causes of the "invisibilisation" of these workers. On the one hand, there is a reputational issue for the companies that market these AIs. They have every interest in perpetuating, in the consumer's imagination, the idea of totally automated algorithms, synonymous with their technological superpower. Conversely, acknowledging that they rely on workers to provide these services would be very embarrassing, especially for voice assistants like Alexa or Siri: as users, we think we are talking to a machine, not to a human listening to us from the other side of the world! On this point, the terms of use of these services are not very clear. They mention that interactions with the tool are used to improve the service, but how this happens is not made explicit. On the other hand, there is a marketing strategy. Companies selling data to these AI producers offer a complete package. They do not specify how the data used to train the algorithms were obtained, how much the workers were paid, or under what conditions they processed them.

Some people who work in AI are not even aware that what they do is work!

Isn't this commercial argument also linked to economic motives?

P.T: Hiding the work of these people is also a way to optimise costs. Even though these companies have the work done in low-wage countries, the sheer volume of data they need means it ends up being expensive. Denying that the various tasks human beings perform on data constitute work makes it possible to sidestep labour law and the costs it entails.

Finally, what has changed between the proletarians of the 19th-century factories and this new precarious class of AI workers?

P.T: Where platform workers such as bicycle couriers have a place to gather in real life, in the street and in front of restaurants, AI workers are often isolated. They have no opportunity to meet and organise solidarity among themselves. We spoke to some of them who didn't even know that what they were doing was work! Quite often, their only interactions are with the platform that employs them, a system, never with human representatives of their final customers. They suffer from isolation, loss of meaning and difficulty judging their own work. The psycho-social risks associated with this atomisation of workers can be severe for those who do this work as their main activity rather than as a side-line. Faced with this observation, a few years ago Amazon Mechanical Turk opened a space for workers to talk about the difficulties linked to their work, to share information, and so on. An initiative of this sort is very rare, though. More often, any form of collaboration and organisation between workers, whether a forum, a YouTube channel or anything else, is censored by these platforms.

To protect users' personal data, there is a need to supervise and protect AI workers.

With the successive lockdowns, we have seen teleworking become widespread. Should we fear that this movement towards precariousness and the automation of work will affect all socio-professional categories, including managers?

P.T: In reality, it is not that simple. Firstly, it depends on the sector. In the medical domain, for example, having an X-ray of a French patient analysed by an Indian doctor registered on a micro-work platform poses problems for the protection of personal data. Thanks to the GDPR (General Data Protection Regulation), these practices are not really widespread in France, though they are more common in other countries. Secondly, there is the question of the quality of the work delivered. In the case of voice assistants, for example, the micro-workers' country of residence and knowledge of the local culture are closely scrutinised. Some micro-work tasks are thus only available in certain countries. Sometimes it is even necessary to pass tests to prove your level of language proficiency and, therefore, your suitability as an AI worker. So, of course, questions of cost optimisation play a role, but they are not the only factor. Workers' linguistic, cultural and institutional backgrounds matter a great deal, and are sometimes an obstacle to the use of micro-work.

In one of your recent articles, you argue that in order to "protect user privacy, you have to protect the people working in data". Why?

P.T: AI technologies are built on intrusions into personal data, and workers are doubly affected. If we take the example of voice assistants: upstream, micro-workers record their voices, data recognised as personal under the GDPR, to train the AI. They are then listened to. Downstream, other workers in turn listen to the voices of users to step in when the AIs fail. And yet, we met platform workers who had never signed a confidentiality agreement covering the conversations they heard. This is worrying both for them, since they bear the responsibility, and for the users, whose conversations can be listened to by anyone, outside any legal framework. Some computer science research promises to limit these risks, but to my knowledge this work is not yet mature enough. Especially since we could add a third risk factor: the multiplication of personal data flowing between AI workers scattered around the world, which increases the possibilities of hacking and therefore of breaches of confidentiality. To protect the personal data of users and workers alike, we must start by supervising and protecting those workers.

Thanks to Samuel Roumeau, who helped finalise this interview.

________

A sociologist and economist, Paola Tubaro is a research director at the CNRS, in the Computer Science Research Laboratory. She conducts interdisciplinary research aimed at illuminating complex socio-economic phenomena using data science, multi-agent computer simulation and social network analysis.

________

On the same subject: 

> Interview with Thomas Berns: "Govern by numbers to rule better" 

> Interview with Eric Guichard: "More than a resource, digital technology is a violent form of exploitation"

> "Going digital as a choice for society" ("Faire du numérique un choix de société")