Hurray! I was invited to write a contribution to the volume AI Ethics in Practice last year. And the book is hot hot hot off the press. Double hurray!!
In my chapter, AI and Language Interpreting: The Ethics of Breaking Down Language Barriers*, I reflect on the ethical considerations of using AI-powered solutions vs. actual people for cross-language communication in real time. Excellent publication timing, really, with Google recently announcing that real-time speech translation will be added as a feature to Google Meet. If God Google gets their way, language barriers shall be no more, then.
I examine the use case of conference interpreters, a sub-population close to my heart: I have gone to great lengths to qualify as such a specimen myself. I have skin in the game, and I have felt the shift from the admiring “I could never do what you do, listen to one language and speak in another at the same time…” to “Really? That’s still a thing? Huh, I thought AI would have taken over that job by now?”
What was that cracking sound, you ask? Oh, never mind that, just my little human heart and ego breaking a little.
I’m planning on expanding on the ethics of bridging the language divide by different means here in the future. But before I get to that, would you most kindly indulge me and grant me a round of venting?
Trigger warning: Rant ahead.
I wrote about some of the possible – and I believe likely – reasons why AI seems to pop up everywhere you turn. One of them being that AI is such a vast umbrella term, covering wildly different technologies and underlying approaches. While I remain convinced that this is an issue with seriously underestimated consequences, in a series of recent discussions in various contexts I also got the impression that AI is increasingly used as shorthand for generative AI, and in many cases as a synonym for ChatGPT.
Having just binged the audiobook version of Empire of AI by Karen Hao, I’m taking more and more issue with ChatGPT, and really with all the Silicon Valley tech giants, from OpenAI to Google to Meta and beyond, and with their respective large language models or LLMs, in whatever release iteration they may be right now. But I’m having trouble articulating those issues. At least that is the impression that stayed with me after those discussions about generative AI: that I hadn’t been able to rationally and concisely make a case for the fundamental ethics-related issues that come with using these models, however benign the intention behind the use.
Convenient, when articulating a message with precision is sort of the essence of what you do for a living. Please respect my privacy in this difficult time while I crawl back under my beloved rock of shame for a seething hot second to take a chill pill (read “inhale a bar of Mahoney milk chocolate.” I really wish they were my official sponsors).
But the issues with the AI empires’ LLMs are massive. And pressing. Precisely because of how those models came to be and what is therefore baked into their DNA, ideologically and data-wise. So before returning in a future post to which ethics arguments are worth considering when debating whether a data-based-system solution should or should not be used to overcome language barriers (ha! See how using a different label for what is systematically called ‘AI’ in current discourse dramatically shifts the conversation? Credit for the term goes to Peter Kirchschläger), this is my attempt at summarizing Karen Hao’s points about what makes the empires of AI and their products so dangerous. For the real deal, and if you’re looking for a fab book to read/listen to, do opt for Hao’s magnum opus! 17 stars!
A non-exhaustive list of epistemologically sound reasons to freak out over the pervasive use of the products brought to us by the empires of AI:
The framing of generative AI as the preparatory phase to the inevitable arrival of Artificial General Intelligence:
Artificial General Intelligence, or AGI, is the sentient, science-fiction kind of AI very much hyped by the Silicon Valley tech industry. Feeding the controversial debate over whether or not AGI is about to descend upon us and take over is not innocuous (check out the predictions by the AI Futures Project in their AI 2027 scenario. However, I also find Shannon Vallor’s analysis of the question in The AI Mirror highly valuable; it is much less ‘optimistic’ with regard to AI reaching the singularity, and Vallor grounds the threats related to AI in more realistic terms, I believe). But portraying AGI as inescapable while single-handedly and unilaterally assuming stewardship of humanity’s best interests is at best hubris.
The (re-)shaping of the research and academic landscape around AI, or follow the money:
The technologies with the highest potential for returning revenue are the ones receiving the most funding. No surprises here. In an AI-empire-dominated world, this means that the connectionist attempts at artificially recreating the human brain in the form of deep neural networks effectively crowded out other approaches, leaving the black boxes to take it all. Side effects include hallucinations and the massive amplification of existing power dynamics and biases. But who has the time to read the package insert? Better ask an LLM to summarize it for you…
On top of that, the steep price of the main ingredient needed to create generative AI models – unimaginably large amounts of data – and the exorbitant cost of training deep neural networks, driven by the computing power and energy resources required, leave academic institutions without a chance to compete. So talent (the people) and research (the work done by the people) pool in the industry, creating a de facto private market monopoly with very little regulatory oversight, if any.
The capitalization of personal data and the rise of surveillance capitalism:
The transformation of the data generated by the users of digital products and services into a commodity, and the influence technology companies gain and exploit to maximize their profits, is what Shoshana Zuboff calls surveillance capitalism. This way of monetizing our behavioral patterns and manipulating them for maximal financial gain is not just a major privacy issue but a threat to democratic institutions and open societies. So with the money, the talent and the work, influence and power also pool in private market pockets, still without meaningful oversight.
Where there is hype, there is cult:
In the case of AI, and more specifically AGI, look no further than effective altruism (EA) to check the cult box. EA essentially preaches that it is better to make lots of money through morally slightly (or not so slightly?) questionable means and then create a better world via philanthropy than to try to create a better world in a morally sound but low-paying (read “loser”) job. The crux is, of course, that the ones making the – in this industry – insane amounts of money are the same people who get to decide what “a better world” means. And that group of people is very… let’s say non-diverse. *Cough* Silicon Valley Tech Bro Club *Cough*. Which brings us back to the bros. And their hubris.
The environmental impact of generative AI:
There is the building of the infrastructure upon which generative AI technologies rely, there is the training of the models, and there is the running of the infrastructure supporting the models. Building the hardware already requires expensive and sometimes rare resources. Think of all the raw materials that go into building just one computer. Then scale that up by orders of magnitude to the massive number of computer chips and lithium batteries needed, the fiber optic cables, the servers, the data centers. And then this infrastructure needs to be run and maintained. Data centers famously require drinking water, for example. For cooling purposes. Do you know who else requires drinking water? For, uhm, drinking purposes? People do. I’ll let you take a very wild guess as to who wins the purchasing-power race. Yep, probably not those thirsty people in already strained communities. They are likely working “loser jobs” anyway, so what do they know…
The societal impact of generative AI:
Because of the massive amounts of data needed to train an LLM, the data that effectively went into the training of ChatGPT and its siblings/rival-tribe cousins was gathered from web crawlers, amongst other sources. In other words, the quality of the data used is abysmal, to put it politely, containing descriptions and depictions of the worst forms of violence and abuse. The training process being so voracious in terms of data, however, the choice was made – by the stewards of the well-being of humanity, let’s not forget – to ditch input data quality controls upstream and replace them with downstream output control mechanisms. This gave rise to an entire industry of highly precarious platform work. People in the Global South, for example in Kenya, are tasked with annotating data and are essentially exploited as human garbage filters. People who, without a reliable contract, for very little money and even less support of any kind, sift through the worst of what the internet has to offer and flag it in order to sanitize the output the LLMs produce.
The empires’ new clothes:
Karen Hao’s argument that the tech companies competing for dominance in the generative AI market are effectively replicating the behavioral patterns of the bad old empires is difficult to dismiss. Just like the empires of the past, they go wherever the raw materials for their purposes are and take whatever they need with little or no regard for local populations and ecosystems, exploiting people and land for their own gain. Extractivist tactics are not a side effect. They are a strategy. False promises abound. Resistance is met with a cash sledgehammer fueled by pockets so deep that governments and local authorities can only scramble to provide even the slightest of defenses. The accumulated injustices of the legacy of colonialism meet the excrescence of data colonialism.
The data hungry hungry generative AI caterpillar:
It is the unfathomable quantity of data brought about by the digital age that made the undisputedly impressive advances in deep learning possible. However, the overarching goal of creating AGI “for the good of humanity” (see point one on this list) was used to justify the ruthless appropriation and use of data – text, images, photos, videos, voice recordings. Protected by copyright and intellectual property laws or not. Paywalled or password-protected or not. Without consent from or compensation for those who created what would simply be used as LLM fodder. Artists. Authors. Content creators of all sorts. Private individuals sharing snippets of their lives. All of us. All is fair in love and scaling, it seems.
All this is not to say that the technology that gave rise to these products is inherently bad and that every possible use case is harmful.
This is also not to claim that I possess enough knowledge of the programming or the algorithms or the computer science or software engineering that make these technological advances possible to understand and explain in depth how these massively impressive creatures of code work. I strive to deepen my understanding, but honestly, it’s a struggle.
It is to say, though, that if an AI-powered solution for whatever possible use case is run on one of the models brought to us deceptively cheaply by what Hao deems the empires of AI, then I think it is time to question whether such a solution is fundamentally compatible with responsible stewardship towards others and the environment. Whether there is any scenario in which using those models is not problematic. While this is to a certain extent an issue of individual, personal choices, as with other phenomena of such broad societal relevance, it is crucially an issue of sorely missing regulatory oversight. It is a deeply political matter.
Again, there are a myriad of uses and use cases and levels of intricacy and abstraction that are so far removed from my understanding that for many aspects I fail to even begin to know what I don’t know.
But I also think that it is not the degree of computer-science understanding of the technology that should be the determining factor in whether or not somebody gets to be part of this conversation. I’ll therefore close by paraphrasing the words of Tricia Griffin, Co-President of the Association of AI Ethicists, an organization I recently got to join as a supportive member, in response to the – lightly edited – question:
“Bro, do you even code??”
No, I don’t code. I personally haven’t ventured beyond statistical data analysis in R. I am familiar with the painful process of script writing and debugging, and with the joys of model convergence. That’s about it. But if your code has such a profound impact on society, then society gets to weigh in on that discussion. If, on top of that, you increasingly rely on the empires of AI’s problematic products to write that code for you, society gets to weigh in on that discussion.
Ethicists get to weigh in on the discussion, even if they identify as non-coding.
So I will keep weighing in.
With all due respect, bro.
*The publication is not open access, but if you have an institutional affiliation or can access the publication via a university library, you may still be able to read it at no cost to you.