Every Shift Is a Work of Art

Introduction

The current rise of AI technologies is directly dependent upon the labour of collecting, curating, and annotating enormous amounts of data. No AI algorithm could be trained without the 'human' labour required for the creation of accurate data sets of unprecedented dimensions.[1] The major players in the AI industry tend to outsource such work in order to reduce costs, eschew responsibility, and sustain the marketable illusion of a disembodied artificial intelligence free of human biases. Mistakenly described as low-skilled, AI data work is often overlooked and obfuscated from mainstream public discussions of artificial intelligence.

In what follows, I will be addressing a specific instantiation of AI data work, namely audio transcription for AI projects, as a performative process of spectatorship[2] and contend that it engenders a so far overlooked form of AI art. While acknowledging the importance of academic research in humanities and social sciences that unpacks the larger social and political implications of data work associated with the rise of AI technologies, I will give here a situated, playful account based on my personal experience as a transcriber for AI projects between March 2023 and March 2025 in a multinational company specialized in business process outsourcing. I do not aim to formulate a truth that holds for all transcription contexts and all transcribers, but rather to perform a situated fabulation (see Marks 2024, 164-193 and Manning 2020, 115-144) based on the contingencies of my own experience.

Taking my cue from the theory of the fold developed by Laura Marks (2024), read in conjunction with insights drawn from the philosophy of individuation in the versions proposed by Gilbert Simondon (1989, 2013), Gilles Deleuze (1994), Bernard Stiegler (1998, 2009) and Yuk Hui (2016, 2019, 2021), I will engage with the complex socio-political, ethical, but also aesthetic folds that the process of transcription is an integral part of. I propose that the audio recordings encountered by transcribers are not merely data to be exploited, but singularities (for singularities see below) that, under certain circumstances, can precipitate processes of individuation (i.e. processes of spectatorship) which result in infrathin (Manning 2020) imaginary works of art.

The main idea in the following pages is that beyond the labour of transcription, with its quantifiable targets and strict rules (algorithms) that robotize labour, there is a background work of imagination which, despite one’s best intentions to focus on the productive task at hand, weaves out surreal, fragmentary, contradictory compositions (audio, visual, haptic, affective) that remain singular experiences impossible to exteriorize. Inspired by Octavian Nemescu’s work on imaginary music (Nemescu 2015, Anghel 2020), selected examples of avant-garde art and literature that problematize the distinction between everyday life and the work of art, as well as a striking passage from a novel by Herta Müller, I advocate for valuing these opaque (Glissant 1990, 203-209) experiences as artworks in their own right, despite the fact (or rather exactly due to the fact) that they cannot easily gain an exchange value by being exteriorized, reified, and commodified: they cannot enter the art market, the art institutions, the art world, etc. In doing so, I engage in performing a fabulation (rather than unveiling a truth), namely identifying folds inherent in the transcription labour that are incompossible with the accepted truths about AI data work, starting to unfold them differently, and hence contributing to creating new folds, infrathin vectors towards imagining different futures that do not as yet exist (cf. Marks 2024, 164-193).

More concretely, this essay operates with two intertwined threads:

First, it is an interrogation regarding art in the context of AI technologies. It adds to Joanna Zylinska’s account of the relationship between AI and art (Zylinska 2020), a new form of AI art that unpacks one step further the experience of often invisible AI data workers, who constitute one of the basic layers of the AI production chain. The problematic of AI art, from this perspective, is not simply that of art produced with the help of AI algorithms, but rather concerns aesthetic experiences that emerge in the embodied encounter between 'humans' and AI algorithms, including encounters with the production chains that ground current forms of AI. In Zylinska’s terms, this would be a form of art for Another Intelligence, rather than simply Artificial Intelligence art (137-143). While Zylinska addresses the hidden labour of workers on Amazon’s Mechanical Turk platform (110-115), I add here a consideration of AI data work in BPO companies,[3] and, importantly, attention to aspects of the experience of AI data work that cannot be neatly exteriorized, communicated or exhibited.

While working as a transcriber, I was struck from the very beginning by the exhilarating (if, paradoxically, at the very same time excruciatingly boring) piece of electronic literature that I was witnessing during my shifts. I was listening every day to a sequence of audio recordings that seemed to form an extraordinary cut-up literary composition of uncreative writing, made up of fragments of daily life, with interesting aspects of musique concrète (from background noises to glitches in the recording devices). So the question that emerged was: would it be possible to approach the work of transcription as an immense process of spectatorship? This is how I started to pay attention to what was happening in 'my' imagination while transcribing, and I was surprised to realize that fragile, often surreal, imaginary textures (audio, visual, haptic, affective) were coagulating for brief moments and then dissipating somewhere at the borders of consciousness. I am attempting in these pages to attend to these imaginary experiences, to unfold them, to frame them as works of art, and to interrogate their conditions of possibility as well as their consequences.

Second, I will engage with existing academic discourses regarding AI data work. While critical texts in social sciences and humanities fulfill the important task of highlighting the exploitative dynamics of AI data work, they nonetheless often tend to paint an oversimplified figure of the worker as a naive victim of exploitation, a figure that, despite the good intentions behind it, misrepresents workers and tends to be disempowering. Maybe not unrelated to this first aspect, a glaring shortcoming of academic research in the field is the fact that workers are present only as objects of research. I am not aware of any major academic research concerned with AI data work in which AI data workers would have a significant say in the design and implementation of the research project. When AI data workers are allowed to talk in their own name, this happens only in a frame pre-set and controlled by the researchers (for example as respondents to interviews). It can be argued that such methodology mirrors earlier colonialist practices in cultural anthropology (as one example among many), where the other is (often benevolently) framed as an object of research, but is not given the power to speak in their own name (except in a strict frame set up by the researchers). One further shortcoming of the existing literature in the field, related to the previous two points, is that numerous publications are not open-access and hence are inaccessible for the majority of AI data workers, transforming otherwise well-meaning research into blatant examples of exploitation. The present text engages with these problems by proposing an open-access, situated account of AI transcription, written from the perspective of a worker, an account that complexifies the existing narrative and challenges the cliché image of the worker. It does so by turning towards the infrathin imaginary experiences that emerge in the process of transcription, experiences that do not leave any obvious exterior traces, and are usually discarded and overlooked both by the AI production apparatus (including transcribers themselves) and by academic research.

Referring to the imaginary artworks and their effects as infrathin, I am building upon Erin Manning’s understanding of this term in For a Pragmatics of the Useless (2020). The origin of the infrathin is in Marcel Duchamp’s posthumously published hand-written notes (Duchamp 1999)—the French word used in the notes is inframince, infra-mince or infra mince (1-46). According to Duchamp, one cannot define the infrathin, one can only give examples (de Duve 1991, 160). So, here is a series of such examples, in English translation, as they appear in Marjorie Perloff’s Infrathin: An Experiment in Micropoetics:

The warmth of a seat (which has just been left) is infrathin.
Sliding doors of the Metro—the people who pass through at the very last moment/infrathin.
Velvet trousers—their whistling sound (in walking) by brushing of the 2 legs is an infrathin separation signaled by sound.
When the tobacco smoke smells also of the mouth which exhales it, the 2 orders marry by infrathin.
The infrathin separation between the detonation noise of a gun (very close) and the apparition of the bullet hole in the target.
Infrathin (adjective) not a name—never make of it a noun.
In time, the same object is not the same after a one-second interval.
[…]
The difference (dimensional) between two objects in a series (made from the same mold) is an infrathin one when the maximum (?) of precision is attained.
(Perloff 2021, Introduction)

Manning engages with Duchamp’s notes, as well as the readings offered by Thierry de Duve (1991) and Marjorie Perloff (2002), to propose an understanding of the infrathin as “the potentiation of a relational field that includes what cannot quite be articulated but is nonetheless felt […] the thisness, the haecceity of an experience that cannot be reduced to the sum of its parts” (Manning 2020, 16). This is the sense in which I will be using the term in the following pages: an infinitesimally small difference that arises from a field of relations, a difference that is felt, that affects embodied experience, but that cannot be easily grasped and articulated in representation. The flickering imaginary artworks that emerge in the process of transcription at the periphery of consciousness (and often beyond its limits) are infrathin modulations of the embodied experience of transcribers, and their potential effects are infrathin glitches of the work-entertainment system that grounds them.

Marjorie Perloff argues for the infrathin as an experimental methodology for approaching complex poetic texts with attention to their subtle, barely perceptible (if at all) nuances (2021). The question that imposes itself here is: to what extent could a transcriber employ such a methodology with respect to the immense electronic literature piece constituted by the stream of audio recordings that they are witnessing every day? Given the labour context, it is out of the question for transcribers to perform the type of extremely close reading with attention to prosody, sound, form, semantics, etymology, contexts, etc. that grounds Perloff’s inspiring insights into texts that otherwise seem entirely obscure, but an adaptation of this methodology seems to be possible (hybridized with Laura Marks’ methodology of enfolding-unfolding aesthetics[4]): allowing one’s attention to dwell on the apparently insignificant details of the audio recordings (semantic, sensorial, formal, etc.), inviting imagination to draw improbable connections between them, setting thus in motion an imaginary artwork (affective as much as emotional and rational), and attending to its unpredictable unfolding.

Strictly speaking, it is impossible to directly write about such infrathin modulations and glitches, it is impossible to articulate them, one can only gravitate around them, letting them be guessed between the lines, beyond what it is possible to represent in words or images.

Starting in the Middle (of the Panopticon)

One always has to start in the middle. There is no stable ground that could offer a beginning, there is no absolute end that could orient (Deleuze 1994, 272-276, 297-304). Origins emerge 'retrospectively' in the contingent processes that they ground (Derrida 1967, 1995), ends are endlessly deferred (Nancy 1997), altered[5] and multiplied—an expanding labyrinth that is actualized by the very errancy that seeks to grasp its always imminent, yet infinitely distant, solution (Grama 2008). One speaks from the middle, writes in the middle, lives in the middle, dies in the middle: in the fold (Marks 2024).

I am listening to a long and intricate stream of voices recorded in the midst of daily life. Life caught unawares: no theatrical conventions, no mise-en-scène, no scripted narrative. Digital devices ready-to-hand, voice recognition functions on, hundreds of thousands of users are searching for orienting clues (from fetishized commodities to traffic information, to sex, to the words of gods, to the meaning of dreams or of illness symptoms, etc.) in an absurd, discordant labyrinth whose only exit infinitely recedes. The more one believes in the vectors that orient one’s world, the more lost one is against the textures of conflicting interwoven threads that constitute the convoluted context of contemporary embodied experience. The promised solution (oneself, the absolute ground, the world in its reality)—the solution that could give a stable sense, the solution that could afford orientation—remains always à-venir (cf. Derrida 1995), always imminent but never here, always awaiting after the next step, after the next turn in the labyrinth (Grama 2008). Living voices, dying voices: scared, happy, annoyed, angry, aroused, excited, restless, and everything in between, everything that words cannot even begin to capture. Voices that have the future before them and voices that have the future behind them. I am listening five days a week, hours in a row, to this incredible piece of performative electronic literature, a randomized cut-up collage of epic proportions, an immense texture of modulated breath. Respiră, respiră adânc. Reia.[6]

I am working as a transcriber for AI projects in Eastern Europe in a multinational company that is contracted by The Client, a major player in the AI industry. Audio transcription, one of the basic data annotation jobs that ground the current AI boom, involves listening to the audio recordings provided by The Client and either typing them out from scratch or correcting an AI-generated transcription. Transcribers know that the data they produce will be utilized in training AI models, but are kept in the dark regarding any specific details. In a setting that is not at all dissimilar to that of John Searle’s Chinese Room thought experiment (Searle 1980), we transcribe according to a given set of rules, but are impeded from understanding the implications of our own actions.

Searle’s (in)famous thought experiment, designed to disprove the claim that AI algorithms could understand language, asks readers to imagine a person, who does not recognize Chinese writing, locked in a room. The person receives two batches of Chinese script and a set of rules for establishing correlations between the characters in these two batches based solely on their formal aspects. Searle claims that, given an input in Chinese characters, along with some further instructions on how to correlate it with the symbols in the first two batches, the person in the room could produce an output that for those outside the room would look like a valid use of Chinese language. Searle’s point is that manipulating linguistic symbols according to a given set of rules (an algorithm), as AI does in his account, can produce the illusion of meaningful conversation, without for that matter implying a genuine understanding of language (just as the person in the room produces correct answers in Chinese without understanding Chinese, simply by following the rules). Arguably, Searle’s thought experiment is inadequate for thinking about the limits of AI, as it misrepresents the operation of language, the individuation of embodied thinking in relation to technology, as well as the functioning of AI algorithms.[7] Nonetheless, it does offer, inadvertently, a fitting image for thinking about the condition of AI data workers and especially about the work of transcribers. Guided by a similar ideology oblivious to the complexities of language, embodied subjectivity and its interrelation with technology, we are asked by The Client to produce desired outputs for a given set of inputs, while remaining oblivious with respect to the contexts in which our work takes place. Meaning belongs to The Client, and it is a carefully guarded secret.[8]

Who is The Client? The name of The Client cannot be named, the ways of The Client cannot be revealed. The Client is probably best understood as a contemporary caricatural divinity, a Deus absconditus sheltering its panoptic data-driven business with apparently benevolent laws and liberal economic violence. As transcribers, we perform our daily tasks behind closed doors in strictly monitored 'secure labs'. The primary function of such secure labs is to provide a safe space for working with potentially sensitive data, and at the same time to safeguard corporate secrets regarding the data and the methodologies and tools used for its manipulation. From the perspective of a worker, though, this also implies alienation from the meaning of one’s actions, the secure lab (including the laws that govern it) being a contemporary material instantiation of the enclosed room from Searle’s thought experiment: it impedes workers from knowing anything other than what is strictly necessary in order to produce the desired outputs from the given inputs. Moreover, it is illegal for workers to reveal anything regarding what is going on in the secure lab, anything about the audio recordings that they are listening to or about the environment that they are working in. Again, as in Searle’s thought experiment, those outside of the enclosed room have no means of knowing what is happening inside, except that, whatever it might be, it produces the desired outputs.

According to the Law that The Client in their mercy offered us together with a (decent) monthly check, the first and foremost commandment of transcribers is: Thou shalt not disclose. I am not disclosing anything by saying this out loud.[9] Working with potentially sensitive personal data and proprietary methodologies for manipulating such data comes with the requirement of ensuring that nothing leaks out of the secure labs where the transcription takes place, these impenetrable temples of the contemporary panopticon. And maybe it is right that it should be so? I will try not to comment, at least not for now. I will obey, hoping to continue to receive my paycheck and to avoid any cumbersome legal proceedings.

But what could one reveal, anyway? It is already common knowledge that the recorded audio data comes from user voice interaction with digital devices. A series of articles published in The Guardian in 2019, for example, offers a glimpse of the situation as it appeared at that time in the wake of a series of whistleblower revelations. It turns out that, unsurprisingly, Apple’s Siri, Amazon’s Alexa, and Google Assistant all send a portion of their voice recordings to transcription labs in order to analyze performance and improve capabilities. While all three companies have policies to mitigate the possibility of revealing sensitive data, these policies nonetheless often fail. Transcribers do have access on a regular basis to a vast array of personal data and sensitive information provided (unknowingly?) by the users (Hern 2019a, 2019b, Paul 2019).[10] So, to start with, you already know for a fact that everything that begins with the keywords that trigger Apple’s Siri, Amazon’s Alexa, and Google Assistant will potentially be heard by a 'human' transcriber. You also know, from personal experience and from transcribers’ accounts as presented in the articles of Hern and Paul, that these devices are at times triggered accidentally. Consequently, you do know that everything that you say in the presence of a device that has applications such as Google Assistant, Siri or Alexa will potentially end up being heard by a transcriber, whether you actually intend to use the respective device or not.[11] Now, as long as you are not naive enough to believe that those applications and devices that were mentioned in the press are the only ones that benefit from such practices, you have a good idea of what we are listening to as transcribers (there is no need for further revelations): life caught unawares.

'Life caught unawares' is a phrase coined by cinema pioneer Dziga Vertov, who advocated that the revolutionary Kino-Eye (the movie camera) should capture life as it is, without any staging and artifice (Vertov 1984). The Kino-Eye, in Vertov’s revolutionary view, afforded a new way of seeing life, one that was supposed to recursively contribute to modifying life itself by playing a role in building a new communist society and new ways of living. And, arguably, the movie camera, along with a host of other recording technologies, did participate in a re-distribution of the sensible (to use Jacques Rancière’s term)[12] that became integral to life in the twentieth century in many parts of the world, albeit in ways essentially different from those envisioned by Vertov and often reactionary rather than revolutionary. The quasi-ubiquitous digital recording at work today (audio, video, and otherwise)[13] performs a new fold in this re-distribution of the sensible, being directed towards the abstraction of the living body as data (Fuller 2005, Haggerty and Ericson 2000) and participating in a logic of enclosure that often captures such data as private property (of The Client) and exploits it for financial and political gains (Choi 2019). As Wai Kit Choi (2019) contends, the rise of digital media cannot be explained only by user preference; one also has to account for surveillance capitalism, that is, the state and corporate push for thorough societal digitization as a strategy to maintain power and generate profit, rendering the use of digital media unavoidable. The problem with the digital tracking of our activities (voice recordings among many other practices) is not only the loss of privacy but, more than that, the unethical commercialization and use of the resulting data, including for targeted behavioural modifications (swaying shopping decisions, swaying political decisions, etc.). We are faced with an updated version of life caught unawares, datafied, and recursively used to modify life itself, not in order to experiment with new ways of living, as Vertov would have wanted, but simply to gain economic and political advantages by reinforcing the existing distribution of the sensible (what Choi calls surveillance capitalism).

Breath recorded by The Client, owned by The Client, guarded by The Client, and transformed into units of data that deconstruct the subject (Haggerty and Ericson 2000) and that could be seen as an integral part of a more general logic of 'extraction of humanness' (Morreale et al. 2023). But, importantly, units of data that at the very same time reaffirm the 'human subject' in its narcissistic projection as a predictable consumer on the market and obedient citizen of the nation-state (arguably the primary meaning of the 'human'). The logic of the panopticon, that of constructing subjectivity through surveillance (Foucault 1995), is not simply replaced by digital surveillant assemblages (Haggerty and Ericson 2000) that deconstruct the intensive life of the 'human subject' into packets of data, but rather modulated and enhanced by them. Surveillant assemblages and capitalist extraction machines are not simply a violence against a pre-defined 'human' subjectivity, but rather an integral part of the folds that make us who we (never quite) are (cf. Marks 2024), an integral part of the panoptic logic that produces the 'human' subject. And the panoptic machine, as theorized by Foucault, relies on the invisible centers from which everything is potentially visible. The secure lab of AI transcription is one contemporary version (far from being the only one) of such a non-place of power, and of course, it is essential for the panoptic logic that whatever happens inside the secure lab, regardless of how underwhelming and innocuous it might be, remains undisclosed.

I cannot reveal anything about the secure lab, nor about the audio recordings, but you already know enough in order to be able to glitch this logic of enclosure through an exercise of imagination. What am I transcribing? Everything that you can imagine a user saying towards a device with voice recognition capabilities, or more exactly in the presence of such a device, given the fact that the recording can be unintentionally triggered. You know it because of articles in the press, such as the ones mentioned above, but you also know it because you have agreed to it in the Terms and Conditions of your apps. Given the fact that you can be recorded just as well by your own devices as by the devices of other users, it is only a small logical step to realize, without me having to reveal anything, that, because of the general uptake of digital technologies in our societies, you could be recorded at any moment, intentionally or not, by the devices owned by your neighbour, the policeman who arrested you, the doctor’s office, the company that provides your medical supplies, your lawyer, the phone company, the supermarket, the tech support company, etc. Nothing new.

When it comes down to it, a transcriber could only 'disclose what everybody knows': that people live and die, that they have sex, that they pray and that they watch porn, that they fight, that they oppress each other and protect each other, that they try to learn and cannot help forgetting, that they play and work, that they get sick and sometimes recover, that they want fast cars, effective medicine, fancy clothes and trendy music, that they love and hate their technological prostheses, that they want new phones and better computers, that they cannot survive without help and care even if they are rarely ready to acknowledge it. And, of course, that all of this happens in the proximity of the digital devices that we have agreed to be recorded by. And record they do, so I can transcribe, so they can better record.

AI Data Work

In order to get a general understanding of the work of audio transcription, let us paint a succinct picture of the larger contemporary labour context that it is part of: AI data work. I will turn in the following pages to recent studies of AI data work and subsequently, in the next chapter, I will introduce a personal account of my experience as a transcriber in order to complexify and further problematize some of their claims.

AI technology tends to be advertised as a form of rational disembodied thinking free from the biases of situated 'human' intelligence and capable of operating independently of the contingencies associated with 'human' thought and action. Yet, behind the curtains hides the immense labour of data workers at various levels in the AI production pipeline (cf. Tubaro et al. 2020), labour interconnected with the exorbitant environmental impacts of the infrastructures required for many of the popular AI models (cf. Muldoon et al. 2024b). The current AI boom is dependent on the embodied work of collecting, curating, and annotating immense data sets, work performed under conditions that are often deeply problematic (Muldoon et al. 2024a, 2024b). Because of the costs involved in this labour-intensive process, the major AI companies outsource it either through crowd-work platforms or through partner companies (Muldoon et al. 2024a, Miceli and Posada 2022) and, in parallel, often rely on data work performed by unwitting labourers, i.e. individuals who are unaware that the activities they perform online are used to train AI (Morreale et al. 2024). A growing number of academic publications (Muldoon et al. 2024a, Morreale et al. 2024, Tubaro et al. 2020 among others) and journalistic reports (Williams et al. 2022, Perrigo 2023, Rowe 2023 among others) insist that these outsourcing practices, in their current configuration, result in unethical exploitation of cheap labour, some researchers cogently arguing that this constitutes a continuation and extension of previous colonialist practices (see for ex. Muldoon et al. 2024b).

Recent accounts of work conditions in the AI industry paint a grim picture at odds with the fantasies of disembodied intelligence that suffuse the commercial image of AI. Workers in Kenya were reportedly paid between $1.32 and $2 per hour after tax by an intermediary company for labeling textual descriptions of sexual abuse, hate speech, and violence for an OpenAI tool meant to detect “toxic content” (Perrigo 2023, Bartholomew 2023). People in Venezuela, Colombia, and the Philippines are working around the clock on digital platforms, performing micro-tasks for the AI industry, and barely managing to make a living, with the time spent waiting on the platform for tasks to pop up, as well as the extra research needed to perform them, remaining unpaid (Rowe 2023). Vocational school students in China were forced into low-paid “internships” in which they performed repetitive AI data work for tech giants, without any benefit for their professional skills, with the government stepping in with new regulations trying to eradicate such practices (Zhou and Chen 2023). Workers in Brazil found themselves in shady working conditions, underpaid or not paid at all for transcribing jobs that ultimately benefitted TikTok (although through a chain of intermediary companies) (Ribeiro 2021). And the examples could continue at length, with all major tech companies (The Client) being at some point involved in revelations of this sort.

Despite the complexity of particular cases and the differences between them, in the end there is a rather simple reason why the AI industry is inherently based on exploiting workers (and it is unlikely that the issue will disappear in the near future): maximizing profits by lowering production costs—which is one of the main drivers of what Miceli and Posada call the data-production dispositif, namely: “the network of discourses, work practices, hierarchies, subjects, and artifacts comprised in ML [machine learning] data work and the power/knowledge relationships that are established and naturalized among them” (Miceli and Posada, 2022, 2). Under the capitalist logic of accumulation that subtends the data-production dispositif and that is at the same time reinforced by it, it is only natural that labour intensive AI data work is outsourced under conditions that profit The Client and exploit the workers.

From the perspective of the workers, there are significant differences between three major forms of AI data work: unwitting labourers remain unpaid and often unaware that they are performing free labour; digital platform workers are usually paid per micro-task completed and are considered independent contractors (which means they have no work security or social benefits and no ways of contesting the sometimes abusive work practices); BPO companies have a more traditional hierarchical managerial structure, and their workers are employed on short or long-term contracts, are based in one or more offices (sometimes with some variations allowing for remote work), and are protected by national and international labour legislation.[14] In general terms, considering the existing research, those working on crowd-work platforms tend to be in a more precarious position in comparison with the employees of BPOs. Nonetheless, as a quick glance through the academic literature cited in this chapter makes clear, BPOs are also often involved in an unethical exploitation of their workers.

From the perspective of The Client, AI data work performed by BPOs tends to be more costly in comparison with crowd-work platforms, but it does offer more reliability, the promise of higher quality results (Miceli and Posada 2022), and a more secure framework for dealing with sensitive data (Muldoon et al. 2024a). Many BPOs, though not all, are located in the Global South, in a drive to draw profits from cheap labour (a trend publicly promoted as offering work opportunities to people from disadvantaged communities[15]).

Working for BPOs, working on crowd-work digital platforms or being an unwitting labourer can all be considered forms of ghost work, where ghost work is defined as the hidden human labour inherent to mobile phone apps, websites, and artificial intelligence systems (Gray and Suri 2019, 7).[16] Rendering AI data work invisible is interlinked with the failure to acknowledge the broad-reaching ethical, political, and societal stakes of such work. Both the organization of the AI data work pipeline and the practical actions of each worker are sites in which social values and biases are unavoidably introduced in the datasets (Paullada et al. 2020, Miceli and Posada 2022) with far-reaching consequences for how AI systems operate and their larger societal effects. AI can never perform objective reasoning independent of the material, socio-political, economic, historical etc. conditions of its functioning (cf. Pasquinelli 2019b, 2023), and one of the many ways in which societal biases find their way into the 'reasoning' of AI is through AI data work (cf. Pasquinelli 2019a).

As one example among many others, in the case of voice recognition, AI is often better at recognizing some accents and dialects at the expense of others (not to mention that there are many spoken languages that are entirely missing from the datasets). This is, at least partially, a consequence of the types of speech that are represented in the datasets (i.e. social groups with lower access to digital devices tend to be underrepresented) and of the skills of the transcribers, who are more likely to provide correct transcriptions for the dialects that they are familiar with and to misunderstand or ignore altogether unfamiliar dialects. As long as the transcription work (and AI data work more generally) is mistakenly considered low-skilled (disregarding for example the fact that understanding underrepresented dialects is a high-skill task), and remains undervalued and obfuscated from public debate, these biases are not likely to be addressed and our AI tools will be increasingly discriminatory. Rendering AI data work more visible is a necessary step if we are to gain a better understanding of the costs of AI, costs that at this point are still largely hidden from users and hence bypass to a great degree public scrutiny and its potential political implications.

AI data work is not simply a transient phenomenon, but a structural component of the AI production process and it is unlikely to vanish in the near future (Gray and Suri 2019, Tubaro et al. 2020). As Hamid Ekbia and Bonnie Nardi argue, the coupling of capitalism and computation in its current instantiation is driven by heteromation, i.e. by “the extraction of economic value from low-cost or free labor in computer-mediated networks” (2017, 1), and hence human labour is intrinsic to the working of this system, not a temporary incidental inconvenience. This means that in order to build more ethical AI tools we need a profound re-thinking of labour conditions, and not just short-term measures (Tubaro et al. 2020). In this context, Muldoon et al. propose a convincing list of practical steps that could be taken to improve work conditions in the AI industry (Muldoon et al. 2024b). It is worth reiterating here some of their main points.

According to the authors, the first step for improving labour conditions consists of building worker power by joining existing worker unions and creating new ones that take into account the specificities of AI data work. In the case of AI data work, in order to be effective, unions would have to forge international alliances (to avoid The Client responding to union requests by simply switching the AI work to a different location) and bring together the interests of workers in key positions (who have more leverage against the company) with those in positions with high turnover (that companies are ready to dismiss in big numbers at any sign of discontent).[17] Besides unionization, a more ethical power equilibrium between workers and investors could be achieved through schemes of co-determination (in which worker representatives are involved in director-level decisions and in day-to-day discussions about the operation of the company), and employee ownership schemes (in which workers have significant financial stakes in the company). An even further step in this direction would be the creation of worker cooperatives, in which workers jointly own and manage the business.

Another crucial point for improving the condition of AI data workers would be to hold companies accountable, through consumer and social movement-led pressure, for work conditions throughout the supply chains. On the global market, consumer trends can be very influential and the public image of a company is essential for business, hence the importance of such pressure in triggering change. Of course, in order for such pressure to be effective, it has to be coupled with better legislation and regulation at national and international levels for holding companies responsible for their production chains. The main idea would be to point out that the responsibility for work conditions should be taken in great part by those who set the rules of the game. Intermediary companies, by no means innocent, are nonetheless in an impossible position once the contractual conditions imposed by The Client do not meet minimum standards. An intermediary company can refuse a deal, but cannot refuse all deals; this would simply mean losing business and jobs, which does not help anyone.

While these options are far from being straightforward and unproblematic, they do offer valuable hints towards ways of creating more ethical work conditions and distributions of profit. However, in order for major change to happen, in order to overturn the logic of exploitation that powers contemporary AI technologies, arguably we need to go much further, experimenting with alternative political and economic models that disrupt the current logic of global capitalism based on colonialist histories and practices of extraction (cf. Muldoon et al. 2024b). Some possible reforms in this direction are proposed by Ekbia and Nardi in their argument for moving beyond heteromation, i.e. beyond the contemporary coupling of capitalism and computation based on inequitable exploitation of human labour (Ekbia and Nardi 2017, 187-209). In short, Ekbia and Nardi advocate for

[…] a real utopia that modifies the socioeconomic system in three ways: (1) it recovers control of subsistence through digitally mediated food production, local fabrication, and computer-mediated means of distribution; (2) it implements a social income; and (3) it moderates work arrangements to reduce work time and offer the option of labor in the electronic cottage at fair wages (Ekbia and Nardi 2017, 209).

This is not the place to discuss the pertinence of such proposals, but simply to point out that the problem of improving work conditions in AI data work does lead, for Muldoon et al. (2024b) as well as for Ekbia and Nardi (2017), to the necessity of engaging with the basic logic of the current versions of computational capitalism. While their proposals are geared towards provoking observable material changes, I am more interested here in engaging with computational capitalism as it coagulates in the unobservable lived experience of the workers. Hence, the present essay will attempt to make a subtle, oblique contribution to this discussion by turning towards the process of imagination that transforms the work of transcription into fragile, ephemeral, and uncommunicable works of art that elude usefulness and capitalist economic value.

In a political and economic paradigm based on extraction, “nature” is framed as standing-reserve (Heidegger 1977), as a set of resources waiting to be extracted and exploited, and “human nature” is not an exception (cf. Morreale et al. 2023 on the extraction of humanness in the AI industry). Singular products of imagination, that is, those products of imagination that remain impossible to exteriorize and represent, are often discarded and excreted in this worldview when they cannot be reified and appropriated in the capitalist market (i.e. when they cannot become products on the art market, in the entertainment industry, etc.). Valuing such imaginary experiences as artworks in their own right on the one hand introduces infrathin (dis)orienting glitches in the context of the work-entertainment system that they are inscribed in, and on the other hand offers the opportunity of rethinking the image of the AI data worker beyond the oversimplified cliché of a naive, exploited, powerless figure.

Challenging the Image of the AI Data Worker

I start here by acknowledging that I will be speaking strictly from my situated perspective, which cannot be generalized to all transcription workers: I am employed on a long-term contract by a BPO company located in Sofia, Bulgaria and find the transcription job to be relatively easy, stress-free and decently paid, if somewhat unstable.[18] Work conditions vary widely across the AI industry and different workers have fundamentally different experiences. All these experiences, with their insurmountable contradictions, should be valued in their own right.

While existing academic research on AI data work fulfills the important task of highlighting the exploitative dynamics that subtend the AI industry, it often risks homogenizing and oversimplifying the image of the worker, contouring a naive, innocent figure, inherently good, ruthlessly exploited by a capitalist market in which it is forced to participate. Such clichés are an integral part of the exploitative logic, rather than effective means of fighting it.[19] I believe it is important to challenge this trope, complexifying the narrative of transcription work with situated accounts that do not quite fit this standard image.

In more theoretical terms, using the vocabulary proposed by Gilbert Simondon, challenging the trope of the naive, exploited worker proposes an impulse towards rejecting the logic of inter-individuality in favor of trans-individual relations. Simondon argues that inter-individual rapports are established between apparently well-defined individuals, as a result of the projective images that they create with respect to each other, while trans-individual relations put the individual into question and hence open up new directions of becoming (Simondon 2013, 167, 273). In other words, inter-individual rapports can reduce the individual to a rigid predefined image, while trans-individual relations open up processes of individuation (psychic and collective) by decentering the individual with respect to its presupposed image and position in the collective. In our case, inter-individual rapports, as benevolent as they might be, reduce transcribers to the caricature of an innocent, naive, oppressed figure, hence fixing them in a specific socio-political position that safeguards against unforeseen changes in thinking or in politics (i.e. safeguards against new un-teleological processes of individuation). Building trans-individual relations, where both the individual and the collective are involved in the risky and uncomfortable process of becoming towards the unknown (unknown because individuation is not driven by a predefined telos) would mean in this case bypassing this cliché trope and accepting that the otherness of the other cannot be reduced to a cohesive classifiable representation.[20]

This is not to dismiss the contention that the current AI boom is driven by a logic of exploitation of cheap labour (among other forms of exploitation), but to suggest that transcription workers do not fit any single pre-established idealized image. In my experience, transcribers are far from simply being naive victims of exploitation; quite the contrary, most often they are highly skilled individuals with very diverse backgrounds, able to work across multiple languages, and very aware of the advantages and the downsides of the job. Many of my fellow transcribers have extensive experience of living, studying, and working outside of their native country. Unsurprisingly, the work of transcription is perceived differently by different people, and occupies vastly different roles in their lives: for some it is close to a dream job, for others it is a boring nightmare; for some it is stressful and demanding, for others it is a welcome relaxation in comparison to other jobs that they have performed; for some it is a temporary fix for financial difficulties, others hope that it is a first step towards a career in multinational companies. I don’t think that I have met any transcriber who would fit the category of “innocent victim”. In fact, most of us are anything but innocent, deeply involved in the oppressive dynamics of contemporary capitalism, participating just as much as everyone else in the very system that exploits us and many others.

An oversimplified dualistic logic that opposes fundamentally good, oppressed, naive workers (and well-meaning theoreticians) to the 'evil' of big corporations is an integral part of the problem rather than the path towards possible solutions. The situation is much more complex than that and it is imperative to go beyond the embarrassingly weak narrative of 'good guys' versus 'bad guys', and to critically examine our position (workers and theoreticians alike) as both exploiters and exploited in the system.

Overall, in my experience (subjective, situated, and contingent, without any claims to an objective generality) the exploitative dynamics described in the existing literature that I have addressed above are broadly correctly identified, but the work conditions that I have faced as a transcriber were better than the ones that I encountered in academia. In comparison to my time as PhD researcher and sessional teaching staff at The University of Melbourne (hereafter UniMelb), I find transcription to be (most of the time) an undemanding job and quite stress-free, relatively well paid (it is definitely much easier for me to make ends meet financially than it was at UniMelb), but not very stable. Yes, there are absurd targets to meet, and sometimes it is challenging to do so, but it is not quite as bad when compared for example to the words/hour targets set by UniMelb for correcting student papers. Overtime is paid as overtime (unlike correcting papers and a host of other teaching and research tasks at UniMelb), and when I go home after work I can actually rest, while at UniMelb that is when one is supposed to actually catch up with one’s own research. The salary system is inequitable,[21] but as far as I know, it is at least within legal limits, which is a significant improvement compared to UniMelb, which was forced to offer back-payments of $72 million to staff after 'unlawful' conduct across an entire decade, between 2014 and 2024 (Chwasta 2024). Also, I found the workspace provided by the BPO to be much more suited for the tasks that I was required to perform than the workspaces offered by UniMelb to PhD researchers (and at least I was not expected to reapply for a desk every 6 months). And the list could continue. These are not just trivial, minor aspects that remain inconsequential, but rather the threads that add up to form the complex mesh of the lived labour context.
So, all in all, from the perspective of the remuneration and the work conditions, it is not too bad working for The Client through the intermediary BPO that is my direct employer. The Client and the BPO are not monsters, just companies with impressive political and economic power, obsessed with keeping their ways secret and whose main aim is making profits as big as possible. As a consequence, they tend to exploit their employees to the maximum degree that is allowed by the local economic, legal, and socio-political context. Nothing new, just like The University of Melbourne.

For me, one of the hardest parts of the transcription job is not being overly tired or severely underpaid, as much of the existing literature tends to suggest, but boredom,[22] the almost unbearable boredom of repeating the same simple task over and over again, hundreds of times each day. Regardless of how benign the content of the audio is, one comes to have a sort of repulsion towards it. I transcribe the same things again, and again, and again, and again, and again, performing a set of simple steps that reduces thinking to a series of repeatable operations.[23] Yet, in weariness and boredom, embodied thinking glitches and betrays its supposed algorithmic smoothness. I should be attentively listening to the audio recordings and faithfully transcribing them, but I am the astonished spectator of an absolutely mind-blowing piece of performative electronic literature (an extremely boring one, to be sure, but no less fascinating for that matter). Imagination subtly intrudes in the work process and creates surrealist compositions that combine the fragments that I am listening to in improbable broken narrative lines and poetic images. Porn and prayers, death and entertainment, come together in scenes worthy of a Bosch Last Judgment with postmodern tints. Some of the voices gain faces from my past, and I am surprised to realize that the dogs barking in the background of one audio recording are placed by my imagination in a familiar countryside yard. The voice of an elderly person struggling to complete some task that remains undefined brings in front of my eyes an old lady lacking physical force, paralyzed from the waist down, seen from the back, failing to open the window from which she intends to jump: a scene from Michael Haneke’s movie Amour (2012) misremembered and distorted by my imagination.
I quickly discard these fantasies, but the laboured breath of one speaker invokes a snoring that deeply annoyed me as a child, while someone trying to learn Romanian as a second language brings back scenes from my undergraduate years in Bucharest. The sound of rain momentarily sends me back into a cozy kitchen, face pressed against the window, looking at the patterns that the raindrops create in the puddles outside. Children playing in the snow evoke a weird combination of childhood memories and the first episode of Krzysztof Kieślowski’s Dekalog (1989). And all these images mesh together, leak into each other, contaminate each other, creating an affective and emotional texture that is impossible to capture in writing because it eschews conscious representation. Meanwhile, I type robotically, I execute my algorithm as best I can. For all intents and purposes, I am just working earnestly, focussed on the task at hand, reaching my targets for accuracy and throughput. I wonder what is going on in the imagination of the other transcribers.

If we suppose that everyone is just doing their job, would we not be seeing the other merely as a mechanism executing an algorithm? Yet what happens behind the seemingly robotic surface remains essentially inaccessible, beyond direct perception and beyond rational deduction. For better or for worse, neither I nor The Client can access the imagination of the transcribers. Even if a sort of projection of imagination were possible with the help of future technologies, that would still not solve the problem, because the imaginary experiences at stake here are not comprised solely of fragments which can be recognized (visually or otherwise), but rather of images that never quite manage to coagulate, of budding feelings that never quite come to be, minuscule nuances that are lost in recognition, and importantly of all the unacknowledged and unconscious relations between these infrathin modulations of a non-teleologic interplay of sense. Even if it were possible to represent all this, to exteriorize it on a screen as images or data, the very fact of representation would betray its semi-conscious and unconscious layers, whose function is dependent exactly on not being captured in representation.

This is not to see imagination as some sort of inner world that is sheltered from the outside and from technology. The singularity of 'the interior' is shaped from the outside, it is the result of intercalated processes of individuation (Simondon 2013), of countless processes of enfolding and unfolding that precede the individual (Marks 2024). The very affects that constitute the core of interiority are always already collective, negotiated in the intimacy-of-the-common (Combes 2013)—where the intimacy-of-the-common, beyond Combes’ usage of the concept, should be understood as the 'human collective', but also, beyond that, as the environment in which processes of psychic and collective individuation are situated, including the technological environment. Our most interior and hidden truth is a product of exterior relations, always already technological (Stiegler 1998), and arguably under the contemporary condition our imagination is already modulated and modified by AI predictive algorithms (Hui 2016). Nevertheless, while being a product of exterior relations that create interiority as a result, imagination remains at the same time a way of modulating the outside from the inaccessible singularity of the 'interior' (i.e. impossible to exteriorize) phenomenon: it is a vector that (dis)orients the very material conditions (physical, but also historical) that constitute its ground. As Sara Ahmed argues, phenomenological experience is always already oriented by the material historical conditions of its being, but it is from within phenomenological experience that these orientations are constructed and deconstructed (Ahmed 2006).

Imagination is often an overlooked aspect of AI data work, except when it results in mental health problems for the workers.[24] We tend to take imagination into account as an integral part of the labour process only at the point when it explodes in pathologies that cannot be ignored, but otherwise it remains entirely neglected and seems to be discarded as irrelevant by workers, employers, journalists, and academics alike. Yet, it just so happens that the transcription work takes place in a setup that is conducive to an overdrive of imagination.[25]

Again, we are working in a so-called secure lab whose main function is supposed to be that of stopping the confidential information, including the content of the audio recordings, from spilling into the profane outside world from which it is mined. Leaving aside the mildly amusing fact that secure lab is too fancy a term for a few cameras, a couple of beeping doors that do not close properly anyway, and some intrusive algorithms meant to check the time that workers spend on their computers, this drive towards secrecy does create a dramatic frame of sorts for the work that we are performing.[26] No personal digital devices are allowed in the lab, nothing that could have a recording function, no paper and no writing tools. What a beautiful incentive to remember! How can one forget without pen and paper? Moreover, as transcribers, we know close to nothing about the provenance of the data that we transcribe, about the ways in which it is manipulated before reaching us, or about the ways in which it will be used after it passes through the transcription lab. Likewise, we are allowed no insights regarding the larger societal, political, ethical implications of our work. In short, as I mentioned before, the secure lab with its walls, its surveillance and its regulations alienates workers from the meaning of their actions. Workers are in the position of robots that receive an input and are supposed to produce an output by strictly following a set of rules (by executing an algorithm). The prolonged execution of monotonous tasks, the lack of meaning, the thick boredom of insipid repetition, the many interdictions that are enforced (more or less successfully) in the secure lab, constitute a perfect fertile ground for the intrusion of imagination.

Maybe this prevailing boredom also has something to do with living in a culture industry predicated on overwhelming consumers with catchy experiences. What is one supposed to do with a contemporary hyperactive attention in search of ever-new stimuli (cf. Hayles 2007) when faced every day, for hours in a row, with little secrets, middle class and tame, all the same, again and again? At the risk of disappointing you, even the worst tragedies, even the most brutal of punchlines, will never rise to the emotional intensity of the little lives described in well-written American tear-jerker novels. Under the contemporary dynamics of the culture industry, that each of us is an integral part of, everyday life and death appear by and large as dull and unremarkable, even at their most shocking. One cannot help it: brutal boredom impregnates audio transcription, an unusual intrusion in the dynamics of the culture industry that inadvertently invites the imagination to take over. Even when we do not pay much attention to its results, it does work in the background.

The secure lab, for all its purportedly hi-tech security, is leaky: our embodied memories modulated by imagination, fallible as they may be, let the mined data spill back into the world, not as concrete information as such (that would be illegal) but as infrathin disturbances of who we are. Even if I do not remember exactly much of what I transcribe, it does haunt me and stirs my imagination: broken fragments of life caught unawares, inflated and distorted by my imagination, become an integral part of the lived memory that drives my being and becoming.

The main argument of this essay is that this work of imagination that operates in the background of transcription labour should be valued as a work of art in its own right, and that would mean also valuing transcribers not as robots (be they innocent and naive) that fit a predetermined image in the system, but as others to be respected for their absolutely inaccessible opacity.

On the other hand, I am not that willing to give you, the reader, by default, the same assumption of being more than a robot. Are you more than an algorithm (materialized in a 'human' body or otherwise) searching this string of characters for useful data that could be exploited for your benefit (as an academic citation for example)?

Instead of reCAPTCHA[27]

Please verify that you are not a robot.

I cannot reveal any information about the contents of the enormous piece of electronic literature that I am witnessing. But, if you are not (fully) a robot, then there are ways of glitching this legal requirement. Imagine.

Audio data is gathered from user interaction with their devices (cf. Hern 2019a, 2019b, Paul 2019). Since I am not allowed to give any hint about which particular devices and applications I have heard recordings from, you can imagine that it is the case with all digital devices with the capacity to capture voice recordings. Since I am not allowed to give any hint about the kind of recorded contents that get to be transcribed, you can imagine that everything that one says in the presence of a recording device could eventually be listened to by a transcriber. Since I am not allowed to tell you anything about the format of the recordings, you are free to imagine them in whichever format, from extremely short fragments to hours-long audio files.

Yes, the results of this exercise of imagination will be strictly speaking a falsification with respect to the concrete string of audio data that I transcribe, but in many ways it is a falsification that reveals more about the truth than the exact details (on falsification cf. Deleuze 1985, 171-172). For one thing, if The Client is unhappy with the false supposition that everything is recorded, it is up to them to specify what exactly they are recording. And, of course, you would be a fool to believe what The Client tells you, except when transcribers and other workers in the AI production chain are also allowed and encouraged to uncover their part of the story. But beyond this, remember Foucault’s (1995) insight that in the Panopticon what really matters is neither the exact moment at which there is a guard in the central tower actually surveilling the prisoners, nor the actual limited direction of the guard’s gaze, but the fact that at any moment there could be someone surveilling, so that surveillance is internalized as a potentiality that shapes the subject from inside. For the panoptic logic, it is enough that anything you say in the presence of a digital device with audio recording capabilities could eventually get to a transcription lab (although probably just a small percentage actually does).

So here are the indications for creating an imaginary piece of audio poetry that will exist only in your imagination, and that will help us share a (falsified) common ground with respect to the contents of the audio files:

Step 1.

Try to remember all the digital devices with audio recording capabilities that you have encountered during the last week. Do not stop at the obvious smartphones.

Step 2.

Try to remember some of the things that you or the people around you said in the presence of those devices.

Step 3.

Imagine what your parents, your children, your friends, your neighbours, and your colleagues say in the presence of such devices. (Don’t be shy. Imagine what they are whispering under their blankets and what they are screaming in the public square, imagine what they are saying on the hospital bed, in the pub, or in the classroom.)

Step 4.

Walk on the street and watch the people around you. Imagine their voices captured by their digital devices. Imagine them being happy and imagine them being sad. Imagine them being confused and angry. Imagine them crying. Imagine them masturbating and having sex. Imagine them being excited and imagine them being scared. Imagine them suffering and dying. Imagine them in their tedious daily routines. Imagine the fragments of breath that their digital recording devices are capturing from all of this.

Step 5.

Forget everything.

Now you know. Such are the threads that compose the mind-blowing piece of electronic literature that I am privileged to guiltily witness. Yes, the discomfort, anger or boredom etc. that you feel when faced with this exercise are an integral part of your piece of imaginary poetry, just as they are an integral part of the imaginary compositions that emerge during audio transcription.

When one listens for hours in a row each day, month after month, to such recordings, though, the resulting texture becomes rather homogeneous (with the occasional unexpected punchline that itself barely rises above the background boredom, if at all); the individual threads become blurred and lose their contours, and what remains is what everybody already knows: life, suffering, sex, work, entertainment, death.

Art History / Media Archaeology Exercise

To further deepen our shared (false) understanding of the piece of electronic literature constituted by the audio recordings, we can turn to a short art history / media archaeology exercise in two steps.

a. Avant-garde literature that eschews the figure of the human author as an origin of sense (cf. Barthes 1977) and insists on the incoherent coherence of language that coagulates at the moment of reading can be invoked as a precursor of the electronic literature piece encountered by transcribers. I am referring to experiments that play with the structures of language and their broken relations to the logic of experience, denying the oversimplified account of writing as communication between predefined subjects. If we agree with Trevor Paglen’s suggestion that promoting AI chatbots as having a form of intelligence akin to that of sentient beings is a magic trick of sorts based on the common-sense presupposition that linguistic sense is the result of the intentional actions of a thinking being that expresses itself (Paglen 2024), then avant-garde experiments that problematize the figure of the author are fitting pharmaka for this large-scale societal delusion. By extension, engaging with the audio recordings used for training AI through the lens of experimental forms of modernist writing glitches the very logic that subtends their use in the AI industry: it problematizes the illusion that an AI-generated answer is the result of a form of intelligence akin to that of living beings.[28]

So, as a first part in this attempt to further understand the literary texture created by the audio recordings, I will ask you to revisit, as one possible example among others, Gertrude Stein’s Tender Buttons. Pay attention to the nonteleological work of imagination that is required from the reader in order to imbue such texts with sense. It is not a question of reaching, as a telos of the reading process, a predefined sense communicated by the author but rather of erring along the vectors proposed by imagination (modulated by a host of other factors) at the encounter with the texts, never being able to capture a final definitive meaning (not least, because sense, in this case, is not reduced to rational meaning).[29]

Here is a short fragment from Stein’s text, to whet your appetite:

OBJECTS
Within, within the cut and slender joint alone, with sudden equals and no more than three, two in the centre make two one side.
If the elbow is long and it is filled so then the best example is all together.
The kind of show is made by squeezing.
(Stein 1914, “Objects”)

The incoherence of the text at first glance becomes a complex mesh of heterogeneous senses once we take the time to attend (maybe with boredom) to the impulses that it offers, and to the infrathin modulations that it operates. This is not the place to start unfolding the particular senses that emerge for me at the encounter with these lines; my claim is simply that they offer a complex, ever-shifting interplay of sense as long as the reader performs an exercise of imagination.

Stein’s text is far from being a lone example in the context of the avant-gardes in terms of art that requires the spectator to engage in a non-trivial effort of imagination in such a way that the process of spectatorship becomes integral to the works themselves. Thierry de Duve, for example, suggests that one key vector in Duchamp’s passage from painting to the readymade is the realization that it is the onlooker who finishes any painting (de Duve 1991), and by extension any work of art. Other obvious examples in which the spectator is in the position of an active participant through an effort of imagination include the Merz series of Kurt Schwitters, the dadaist cut-ups of Tristan Tzara’s poems, or the later cut-ups used by William Burroughs in Naked Lunch and other novels. All these works (just random examples among many others) present good approximations for the process of spectatorship that can emerge in audio transcription when we are faced with heterogeneous texts that tend to combine into an interwoven texture rich with fragmented flickering images that only sometimes manage to establish themselves as conscious phenomena. Moreover, such works bear a certain formal similarity with the cut-up piece of electronic literature constituted by the stream of audio recordings that transcribers are listening to.

b. One of the most common digital devices that will record your voice (in order to automatically transcribe it, translate it, etc.) is, obviously, the smartphone. Given the fact that smartphones are involved in so much of our daily routine, it can be hard to grasp the incredible wealth of the information that they record: it is just a shopping list, it is just a reminder for a meeting that I will have next week, it is just a quick message to my boss, it is just a routine conversation with my friend. So many aspects that we are ready to forget and discard as lacking any value. Yet, the patina of time tends to put things in a different light, and such insignificant sherds become valuable insights into daily patterns that will soon have passed away. This becomes especially obvious if we turn towards past recording technologies and attend to the bits and pieces of life that they captured.

For the second step of this exercise, I will invite you to listen to a recording of the audio messages left on Charlotte Moorman’s phone answering machine from the 24th of November to the 6th of December 1971, one of the (many) hidden gems that can be found on UbuWeb. On the one hand, this will put us in a better position for reflecting on the value of the everyday, and on the other hand it will give you a good idea of the audio textures that transcribers are listening to. Not that it will provide a true account of the audio recordings that we are transcribing (after all, this tape was recorded more than half a century ago, well before phones were to be 'smart' and AI was to be able to recognize voice patterns), but rather that there is an eerie similarity at play (another vector of falsification). If you want to go one step further in understanding the experience of transcribers, I challenge you to try to transcribe everything that you hear on the tape.

Accents intermingle, broken stories are guessed in-between the lines, characters are created. If you are familiar with some of the names, a whole new level of intertextuality comes into play. One’s imagination evokes a confusing texture of half-forgotten art history references, from the Annual Avantgarde Art Festival of New York to Nam June Paik’s video art installations, Moorman’s (in)famous performance of the TV cello, discussions around authorship in the chance-generated music of John Cage, Symphonie pour un homme seul, John and Yoko, The Kitchen etc. You can stop at any point and explore in more detail this intertextual mesh. You will find out (with surprise?) that others did it too (Allen 2013). At the same time, improbable fragments of everyday life are glitching your art histories. Invitations to Thanksgiving dinners and plans for meeting friends are intimately interwoven on the audio tape with snippets that evoke dimly remembered theoretical debates around media art, music, and conceptual interventions.

Besides what is actually said, besides the words that come together and make rational sense, there are also all the noises that escape description. The sounds of a technology now mostly extinct, with its intriguing hisses and beeps, the breathing of the people calling, the asperities in their voices, their hesitations, the occasional background sounds.

Such sounds too are an integral part of the daily routine of transcribers and create an impossible-to-describe background for everything else that happens, a musical piece of sorts, a strange version of musique concrète as it were. The imaginary work of art that emerges while transcribing is dependent not only upon what people say but also upon the illegible sounds and the acoustic background. From breathing in and out, to the tone of the voice, to the whisper or the scream that carries the word, but also the coughing, spitting, grunting, snoring, panting, and everything that I don’t have the words to describe, and everything that I cannot consciously remember because I don’t have the words to describe. From the TV noises, to the car engines, the rhythmic clicks and clacks of who knows what devices, the computer games, the shopping centers, the hospitals, the fights, the unhappy parents, the crying children, the bots, the doors opening and closing, the phones ringing, the wind, the sheep, the cows, the dogs, the chickens, the songbirds, the religious services, the teachers talking (probably to no one since the kids are whispering to me), the school break’s pandemonium, the football matches, the traffic… It’s all in there. Nothing new.

Even when it comes to the spoken words, the emerging artwork is not only about the broken narrative lines that my brain unwillingly creates between (sometimes not that) heterogeneous points in this interwoven pattern of voices but more about the poetry of the forgettable, boring, uninteresting everyday utterances, so fragmented that they elude any narrative. Just words, patterns of words, modulations of breathing with their fragmented contexts.

Listening to the phone answering machine gives you a fairly accurate idea about all these aspects that are intrinsic to transcription.

While it is illegal to share any specific details about the string of audio recordings that transcribers are listening to, and, strictly speaking, it is impossible to describe in any satisfactory way the imaginary artworks that emerge in the labour process, nonetheless it is possible to gain a complex (embodied) understanding of all this by following the simple exercise of imagination proposed above, along with this short incursion through art history / media archaeology. By now you have a clear and accurate, if falsified, impression of the imaginary artworks that are the (absent) subject of this text.

Singularities

During transcription, despite one’s best intentions to focus on the productive task at hand, imagination tends to intrude, weaving out surreal, fragmentary, contradictory landscapes, artworks in their own right.

There is a machinic part to these works, namely the sequence of audio recordings that transcribers are listening to, along with the digital work environments and their trans-continental material implementation. The audio recordings, both individually, and taken together, constitute singularities that, under certain conditions, act as seeds for spectatorial processes—in a way, the recordings have the same function as a score in a musical or choreographic work, they offer vectors for the artworks that are to be performed. There is also a 'human' part to the works, the experience of the embodied spectator (i.e. the transcriber) with folded layers of personal and collective histories, conscious and unconscious, ontogenetic and phenomenological, etc. The artwork emerges at the intersection between the two as a very fragile product of imagination that does not get to be materialized or exteriorized in any way.

But things quickly get more complicated once we acknowledge that the 'human' is always already technological and vice versa. Technology is an integral part of who we are, how we think, and what we imagine (Stiegler 1998, Hui 2016, 2019), and 'human' intentionality is always, to some degree, part of the functioning of the technological complex, even when it ceases to be the main driving factor (Simondon 1989). The intimate process of imagination is already modulated by technology (Hui 2023a), while technology is modulated in turn by the products of imagination. The imaginary artworks that emerge in the process of transcription are as such neither strictly technological products, nor internal to the 'human' subject, but fantasmas that appear somewhere in between 'human' and technology, for beings that in the very process of imagination transform and problematize their own limits, definitions, and vectors of becoming—i.e. for beings that are engaged in processes of individuation that are at once psychic, collective, technological and vital, in Simondon’s understanding of these terms.

Referring to the audio recordings as singularities I am gesturing towards Simondon’s contention that every process of individuation starts around a seed that creates a mediation between problematics of different orders (Simondon 2013, 78-82)—where individuation refers to the processes through which individuals emerge out of pre-individual fields.[30] In our case, on the one hand there is the socio-technological surveillant assemblage, with its dynamics of power, that the recordings are an integral part of, and on the other hand there is the personal embodied experience of the transcriber—where the personal has to be understood as a fold emerging in the intimacy-of-the-common (Combes 2013), already impregnated by technology and politics. The audio recordings afford surprising points of mediation between these two problematics (i.e. act as singularities), glitching the logic of usefulness that drives the system that they are inscribed in and potentially provoking further processes of individuation—potentially provoking a renegotiation of who we are and how we interact with our associated milieu (i.e. our 'natural' environment inflected by societal, political, economic etc. dynamics). The imaginary artworks that emerge in the process of transcription are an integral part of such processes of individuation.

At the same time, by using the term singularities in this context I invoke Laura Marks’ definition of singularity as an intensive tip point of folding, that can become the starting point of entire processes of unfolding and enfolding anew. For Marks, the cosmos is an infinitely folded and internally differentiating surface (Marks 2024, 243), and living individuals are apparently bounded centers of experience that instantiate a situated perspective from which they partially unfold the infinite, i.e. are affected by the cosmos, perceive it, think it etc. (Marks 2024, 6-7, 81). At the same time, the experience of individuals is integral to the cosmos and its dynamics; the cosmos is composed of experience, and not simply reflected in experience (Marks 2024, 7). All individuals are constituted from histories of relations that they enfold in their own being, and, at the limit, each individual can be said to indistinctly enfold the entire cosmos as its internal condition of being and becoming (Marks 2024, 11). In this view, the embodied experience of the thinking subject (always in strict relation to its associated milieu) is the result of ongoing processes of enfolding and unfolding that take place on a multiplicity of levels, and recursively participates in the infinite fold that the cosmos (that grounds its existence) is, introducing its infrathin mark. The current socio-political, ideological, and economic context of computational capitalism, with its many competing actors (including, for our interest here, the big players in the AI industry), constitutes itself a 'dominant fold' (Marks 2024, 28-32, 174) in this system of relations that shapes our experience from within. In this context, Marks proposes that singularities are intensive tips of folding that afford the possibility of interfering with the dominant order by unfolding differently (Marks 2024, 55-61), i.e. unfolding that which is supposed to remain hidden (for example, the hidden labour that AI relies on, the environmental impact of AI technology, the potential effect of AI as a catalyst for the homogenization associated with globalization, the experiences of those who are silenced and deprived of representation, etc.) or, on the contrary, unfolding differently by respecting what Glissant calls 'the right to opacity' (Glissant 1990), abstaining in certain situations, for ethical considerations, from unfolding at all costs, and acknowledging the inherent limitations of all unfolding inasmuch as it is necessarily performed from a situated and contingent perspective (for example, respecting the otherness of the other in its own terms).

Starting from Marks’ insights, I propose that turning towards the imaginary artworks that emerge in the process of transcription (occasioned by the audio recordings understood as singularities) can constitute a small, infrathin impulse towards valuing folds in our environments ('natural', historic, economic etc.) that escape the possibility of being captured in faithful representations and reified as data. These imaginary artworks act as vectors that could (subtly) glitch the current enfolding-unfolding dynamics that ground living experience in digital cultures subtended by global capitalism.

Choose Your Character

It just so happens that my work day starts as a computer game of sorts. The shopping mall adjacent to the office building where the secure lab is located opens its doors one hour earlier than the beginning of my shift, and I often spend some time there. The shopping mall at that time in the morning is a somewhat strange place. The cleaning personnel washes the floors and empties the bins, the shop assistants arrive one by one to prepare their stores for opening, an occasional alarm goes off, big boxes of merchandise are carried in and out of the shops, and as the minutes tick by more and more people take the escalators to descend to the underground floor where a grocery shop is already open. All in all a rather calm and relatively quiet atmosphere (at least compared with the main library on the UniMelb campus), so not too bad for a bit of reading, or writing, or thinking, or whatever it is that one has to do to plant the seeds of senses which are to grow and wither in the next 8 hours. At some point, the generic shopping mall music starts playing in the background, usually on a rather low volume at first, impregnating one’s thoughts with the bleak rhythmic anthems of cheery overconsumption. Coincidentally, once in a while, there is this short striking sound motif that seems very familiar to me. I have the impression that I have encountered it before at the beginning of Petra Szeman’s video work Trajectories (2017). I’m not entirely sure that I’m not mistaken, it could very well be that I’m misrecognizing the sounds from Szeman’s work, but what is certain is that (misrecognition or not) in such moments Trajectories tends to echo in my imagination.

Szeman’s video is a meditation on embodied identity in the context of contemporary digital technologies, interconnected cultural spheres, and unprecedented possibilities for traveling and living beyond the confines of one’s own culture. Instead of the widespread uncritical reaction to globalization that tends to simply fetishize the local as the ground of a revealed truth (revealed by Nature, by History, or by Culture) that orients, Szeman’s work focuses on strategies of (dis)orientation that acknowledge the local not as a truth but as a problematic field to be negotiated with care, and embodied identity not as a stable label, but as an ongoing process of individuation that in its movement hybridizes both the living subject and its environment. The first scene of Trajectories simulates a retro computer game in which the player has to choose a character in order to start their quest. There are two options, both fictionalized representations of the artist, both suggesting such (dis)orienting processes of individuation, being caught between divergent cultures, geographies, and technologies. Subsequently, the chosen character is placed into the game world in a short animated sequence reminiscent of a wide range of computer games, except that in the case of Trajectories, at this moment in the video, the game world is constituted by Google Maps in Street View mode.[31] It is at this point, as the character takes its place in the game world, as it takes its position in Street View ready to begin the quest, that this striking sound pattern, that now I am (mis)recognizing from time to time in the shopping mall at the beginning of my work day, is played (around min 01:10 in the video).

So, the sound played in the shopping mall invites me to choose my character for the day and to reconsider the representations that I embody. What character will I play today? Will I be listening as if from a safe distance, surveilling the stream of audio data impartially, performing professionally the job that I am paid for? Will I recognize in the audio recordings the past faces that I wore, listening to fragments from my pasts that never happened? Will I be the bored spy who hears all your dirty little secrets? Will I be the condescending, educated, privileged spectator raging inside with ethical and moral indignation towards the content of some of the recordings and towards the job that I am performing? Will I be upset by the illiteracy and lack of basic education that transpires through so many of the recordings? Will I just let this interwoven pattern of breath and the meanings that it carries wash over me, fighting to keep at bay my resentment?

A sense of disorientation prevails, being torn between accepting what I come to hear and the indignation towards it, between accepting as they are these patterns of breath carrying frightful meanings, this interwoven arrangement of 'human' life in its atrocious, uncontrollable, violent, beautiful, self-destructive absurdity and the desire to advocate for one’s ethical positions. I cannot simply uphold my ethical indignation, and yet I cannot let go of it. Remaining committed to a set of stable ethical values would mean failing to question one’s orientation by sublating it into a truth of nature or of culture. And that would be a quintessential colonialist gesture, even when, or especially when, performed towards one’s 'own' culture. It would mean claiming to accept radical difference but at the same time holding to laws (written or not) that remain unquestioned in their erasing of difference. It cannot be a question of letting go of the ethical imperatives either. That would fall into supposing an unavoidable anarchic telos, rather than staying with the trouble (cf. Haraway 2016), that is, rather than acknowledging that the end in its absoluteness has not yet come, that the end is a-venir, always after the next step, and that the labyrinth of life is the careful pharmako-logical differentiation and deferral (cf. Stiegler 2018, 188-270) of this end to come (and of its absolute truth). Being alive implies an ethical responsibility (and response-ability). I have to renegotiate my ethical system of values, refusing any stabilization, with each audio recording that I am listening to. The punchlines (from homophobia to misogyny, from violence towards children to extremist politics) bring a certain acuteness to this crisis of disorientation, but the crisis subsists well beyond these particularly striking moments. 
The same ethical crisis is there in the much more numerous voices in search of the usual products from the supermarket, in search of spare parts for their cars and kitchen utensils, in search of the correct answer to the math test, etc. The interwoven pattern of voices that washes over me and inscribes its traces as modulations of who I am attests to a society in the grip of accelerated capitalism, deep environmental crisis, failing education systems, rising ultranationalisms and problematic fetishization of the local, a society longing for divine interventions, money, medication and technologies that would absolve from past, present and future sins. And at the center of it all seems to be an unwavering belief in who one is, in one’s identity and its absolute truth (natural and cultural), to the detriment of hybridization and othering.

I cannot choose my character. I am heading to the 'secure lab' disoriented, and a bit anxious, as always.

Boredom

The work is boring, excruciatingly boring. Even the outrageous punchlines are boring. I know by now what you whisper and scream towards your devices. The same river of breath again, and again, and again. Alte voci aceeași piesă, Alte guri aceeași gamă. After the first few minutes, I already start to lose attention. I am too accustomed to your porn preferences and to your favorite religious hymns, I can quickly pass over your obnoxious belief in your exceptionality as individuals and communities, I am not surprised anymore that you are sexually aroused by your mother-tongue and father-land, I know that each of you believes they would have had a better life if it had not been for the others who stole what by right (or might) should have been yours. I am not that touched by your appointment for the next colonoscopy, and neither by (the very poetic) diagnostics that doctors recite into who knows what device. Don’t worry, I do my job well: none of your personal data leaves the secure lab, except inasmuch as it lives, forgotten, in me and flashes through each of my everyday actions. I don’t think twice when I transcribe the scary symptoms some of you get from cancer treatments, pregnancies gone wrong, or spinal problems. I fail even in that basic ethical responsibility of simply listening, paying attention, and making a space in myself where the meaning carried by these patterns of breath could coagulate.

I am bored and I seek relief from it. When I’m confident enough of reaching my targets I tend to listen to music in the background or to put on a second screen some slow TV footage of spectacular 'nature' (nature porn).[32] The (not so) weird thing that happens is that the audio, the pattern of interwoven breaths, changes its meaning and the way in which it affects me. Hiroshi Yoshimura’s Green, 9 Postcards or Wetlands, infuse the audio recordings with a bitter-sweet nostalgia; the recorded voices seem like a few fragile pieces of the puzzle of time passing whose forgetting is for a very brief moment delayed. Philip Glass’s themes from Koyaanisqatsi, Powaqqatsi, and Naqoyqatsi tend to make everything a bit more politically involved (maybe because they bring back unconsciously the movies), phantomatic voices caught in a technological society accelerating towards environmental disaster and collapse, victims and perpetrators of unbearable crimes, and yet everything is aestheticized to a bearable, even enjoyable level. Sachiko M and Ryuichi Sakamoto’s snow, silence, partially sunny blends in with the sounds of the audio to create an illusion of sorts that destabilizes the recordings in unpredictable and always new ways, making them become poignant pieces of sound art in which the unruly sequences of frequencies take precedence over the meaning of what is said. It is by far the most intriguing result that I obtain, but it is hard to transcribe with this background sound: it requires too much attention and either the transcription process or the sound pieces that are created have to suffer. Light pop-rock with an upbeat rhythm is often a good choice for improving productivity by keeping alert and in a good mood, ready to transcribe rapidly, but it doesn’t bring much in terms of the spectatorial process. Various pieces that have strong personal significance tend to strengthen the input of personal imagery in the imaginary artworks. 
'Nature porn' videos help with fighting boredom and resonate perfectly not only with the more traditional pornography that is relatively abundant in the audio recordings but also with the position of the transcriber folded in front of the computer (on folded bodies in front of the screen cf. Michele White 2006), dreaming of being out in an imaginary nature that can never be experienced in the spectacular and overdramatized way that is presented on the screen. Watching 'nature porn' while transcribing underlines the pathetic, self-contradictory vectors of desire that pass through the transcription process. Background music or not, 'nature porn' or not, boredom eventually engulfs you.

What is the probability that the AI algorithm that I help train will feel bored with its task? This is not a trivial question. It touches exactly on the potentiality of a sentiendum (cf. Deleuze 1994) that comes from outside of the space of possibility defined by the actual reality that one encounters. Boredom is not of the same order as the data, and neither is it of the same order as the rules that govern the manipulation of data. It does not fall in the statistical distribution of possible solutions to the given set of inputs. The utterance 'I am bored' could be part of the possible outcomes, but that is very different. My boredom is not defined by the fact that I say 'I am bored,' but by the fact that boredom modulates the entire dataset that I am asked to deal with, it glitches the instructions that I have to follow, and it changes who I am both with respect to the task at hand and more generally with respect to the associated milieu that I become together with.

The temptation is to explain phenomena such as boredom by simply postulating an immensely enlarged data set that I, as a 'human', have to process and that can produce boredom as a more or less legitimate actual outcome. In other words, from this perspective, if the computer had 'more neurons', the capacity of dealing with significantly larger data sets, if it were composed of more networks with competing interests (as the diverse biological systems of the 'human' body that can lead towards incongruous goals) then the computer could get bored (maybe as a sort of non-critical error, maybe as one possible 'legitimate' outcome of the processes that its 'body' instantiates). But this is to entirely miss the point. Boredom is not simply an ('inner') outcome of an algorithm but at the same time its recursive ground: not an outcome that becomes an input, but an outcome that becomes a ground upon which everything else happens. I’m claiming neither that digital algorithms will never get bored, nor that they will (although it is pretty clear that for the time being they don’t). The point is simply that in order for this to happen the algorithm will have to stop being an algorithm, at least to some extent. That it will have to function as an individual-associated milieu couple, in which some of the outputs change the very ground upon which the algorithm functions, along with the algorithm itself, and that such changes come from outside of the space of possibility defined by the representation of the associated milieu that the individual algorithm has, and consequently that the relation between inputs and outputs that defines the algorithm is glitched, broken and open to ongoing mutations.

In this boredom that seeps through all my pores (cf. Lu Xun’s short story Revenge I in Hsun 1974), that suffocates everything in the secure lab, broken breath and broken sherds of sense are inscribed upon me only half consciously. Singularities that precipitate, in their infrathin way, miniature fissures and crises that sometimes grow into full-blown (dis)orienting processes of individuation.

Parerga

One is in a strange disorienting position when transcribing, in a continuous ethical crisis. The transcribers are faced with an ethical imperative of paying attention (on attention as ethical imperative cf. Blanchot 1993), yet they are diverted by intolerable boredom; the same boredom, paradoxically, that is a fertile ground for processes of imagination that produce convoluted compositions which require themselves attention, albeit in a very different modality; transcribers are asked, for ethical reasons, to forget all that they are listening to, and yet they are remembering too much; at the very same time, they are unable to remember enough, unable to continue to bear witness to the absurd fragments of life that they encounter; they are balancing their own ethical system of values against the responsibility of accepting the otherness of the other. It is upon this background ethical crisis that the imaginary artworks emerge, and, as such, they are dependent upon the limits of the 'secure lab', upon the material frames that condition the work environment and that directly participate in setting up this ethical problematic that runs through the audio transcription labour. The allegedly impermeable walls of the 'secure lab', with their societal and economic reinforcement, frame the stream of recorded voices, and this frame offers the condition of possibility for the imaginary artworks that transcribers create.

The most immediate function of the walls of the secure lab is to legitimize the intrusion of my ear, as a transcriber, in a space otherwise legally protected as private. It would be impossible in an art context, for example, to experience these raw fragments of life caught unawares. Any artist (or scientist for that matter) willing to work with recordings of 'human' subjects that are unaware of being recorded (or at best vaguely aware) would most probably face insurmountable legal barriers. There are only two categories of actors that eschew such legal requirements: big tech companies (The Client), and the authoritarian state (numerous whistleblower cases in the past decade attest to this). For obvious reasons, we are primarily interested here in the first of these cases, although they are in fact deeply connected.

We are selling our privacy and intimacy in exchange for the experience of using spectacular new digital devices. This transaction is given an appearance of legality by the fact that it is vaguely stipulated in the Terms and Conditions that we have to agree with in order to use a digital application. There are at least three factors that make this legal aspect nothing more than an appearance: 1. it is actually impossible to read in totality the Terms and Conditions that we have to agree with, and the legal jargon used in such documents makes them inaccessible to anyone without legal training; 2. there are huge societal and economic pressures for using some of these applications, and the individual does not have a fair chance to refuse the Terms and Conditions (for example UniMelb uses Gmail for the emails of its employees and students, which forces everyone to accept Google’s Terms and Conditions, regardless of whether they agree with them or not); 3. there is a stark lack of education for the general public on the actual implications of agreeing with the Terms and Conditions of popular online applications, which makes the transaction not unlike the colonialist practice of exchanging glass beads for gold and ivory. Of course, the point is not that glass beads are intrinsically less valuable than gold, but rather that they are so in the context of the global economy (dominated by the colonizers) in which indigenous populations are forced to participate. In the same way, there is nothing inherently wrong with giving away personal data in order to use Facebook, Instagram, Google Search, TikTok, etc., but such a transaction is certainly disempowering in the context of the current global capitalist markets and geopolitical situations in which data is the new gold.

Yet, it would be wrong to conclude that this lack of awareness is the result of malevolent actors secretly manipulating an innocent, naive, good public. No, time and again we have been warned that our assumed privacy is breached both in the name of corporate interests and in the name of national interests (and the two are deeply connected). Our lack of collective action proves that we are just fine with it, that we choose (not a 'free choice' by any means, but a choice nonetheless) not to know and not to act. I am not arguing that this state of affairs is either good or bad; I am simply remaining aware of the fact that such breaches of privacy have been flagged again and again. None of us can claim to be surprised in any way that our devices are recording us in the name of a higher order (i.e. economic interests, national security interests, etc.), an order that is there to protect us from the evil Other by any means, including by breaching the basic rights that it purports to safeguard. Nothing new. Each and every one of us, by not confronting our employers and political leaders on this subject, by continuing to use, more or less willingly, applications that record personal data for private commercial interests, is condoning and supporting the workings of this system. There cannot be any pretension of innocence. We, The Clients of The Client, are one of the integral parts of this mechanism that empowers and oppresses us at the very same time.

The function of the wall of the secure lab in this mechanism is to institute a logic of the private versus the public that closely resembles that of the panopticon: on one side of the wall everything is potentially visible (in virtue of apparently legal Terms and Conditions), everything is public, but only to the public that has access to the secure lab (i.e. The Client and its partners); on the other side of the wall, inside the secure lab, the utmost secrecy governs, all operations are protected from the view of the public by strict laws. From the outside of the secure lab, it is not only that you cannot see what happens inside the lab, but just as importantly that you cannot see outside in the same way that those that are inside can. The two worst things that could happen to a panoptic system, the two things that could almost instantly collapse its logic are:[34] on the one hand, rendering the walls of the central tower transparent so that the mechanism of surveillance becomes visible, so that the prisoners can see when, how, from what situated perspective they are seen; and on the other hand, allowing the prisoners to see each other, which is a risky and possibly disastrous democratization of the benefits of surveillance, but one that could also potentially develop into a politics of care. From this perspective, it seems to me that often the very insistence on the protection of privacy can be a move towards its reification and exploitation for economic and political gains. I believe much more in introducing care, tolerance, and acknowledgement of interdependence as basic education principles at the expense of freedom, independence, and auto-determination, than in the laws of privacy (necessary as they are) which continuously prove to reinforce rather than challenge the panoptic mechanism. 
It is not that personal freedom and the strictly interrelated personal privacy are not important, on the contrary, they are essential, but they are by-products of systems of care and not goals that can be reached in themselves. Secure labs are a poignant example: they are instituted (and legally required) exactly in order to protect privacy and sensitive data, yet they function as breaches of privacy in the name of corporate interests.

As a transcriber, I am a (very) small (easily replaceable) cog at an interesting juncture in this panoptic machine. Inside the walls of the secure lab I am allowed to (in fact requested to) listen to (often personal) fragments extracted from everyday life; at the same time I am low enough in the hierarchy of the company that my position is exactly at the point of passage between that which has to remain visible, and that which has to be hidden at all costs. The walls of the secure lab make legally possible my (guilty) position as a spectator with respect to sherds extracted from your private lives, and stop me from feeding back outside the lab any of that information, but at the same time, they keep me entirely ignorant with respect to the workings of the system that I am part of, and, importantly, in the 'secure lab' I am myself the subject of strict panoptic surveillance. As a transcriber I am always watched, all my actions are quantified, I know next to nothing about the data pipeline that I am an integral part of, and, of course, I am kept in the dark regarding its larger political, economic, environmental consequences. It goes without saying that I have very little agency, and I am expected to act rather like an obedient robot, producing the expected outputs from the given inputs, at the required rates. There is very little room for anything that could count as meaningful political or social action when one is caught in this double bind.

As a consequence, the walls of the lab shift the problematic of ethical responsibility and response-ability. I am somehow removed from the intentions of the voices that I hear, and my response-ability with respect to them comes to be very problematic. The same walls of the secure lab that make my intrusion possible protect the people that I hear against my infringements and protect me from their possible reactions. (If some of the people that I have listened to knew that I know what they were whispering into their devices, I would probably not be safe. And if I weren’t constrained by the law to be a grave for everything that I hear in the secure lab, they probably wouldn’t be safe either.) For better and for worse, the walls of the secure lab limit my possibilities for action, my response-ability. And this also means that I have to either suspend or continuously challenge any system of ethical values that I could be grounded in. What ends up happening is that I listen to the audio recordings in the same way that I read, say, Journal du voleur (Genet 1949), as a framed piece of literature that acts on a different level than that of everyday life. Not that there is no ethical responsibility involved, but rather that the ethical responsibility comes to be one of listening and paying attention, of attending to the breath that carries whatever immoral and unethical stance. Life in a frame.

The ongoing contemporary art trope of challenging the boundaries between 'art' and 'life' remains hopelessly naive if it is merely an attempt to remove the frame, to live-art or art-live without the constraint of the parerga that make the ergon possible. It is a very powerful paradigm though when it attempts to multiply the frames, to play the framing functions against themselves in a mise-en-abîme that confounds art and life and challenges their distinction not because art becomes unframed, but because it engages with the processes of framing (biological, political, social, etc.) that 'real life' in its 'naturalness' relies upon.

Going back briefly to our art history / media archaeology exercise, we can find numerous examples of modern and contemporary artworks that enact, in different ways, a playful re-framing of the frames that drive 'human' life. Remember Fountain (Marcel Duchamp, 1917) and its problematization of societal expectations with respect to aesthetic standards, accepted art practices, and definitions of authorship. Remember Kitch’s Last Meal (Carolee Schneemann, 1976) and the questioning of the limits that separate public life from private life, body from discourse, but also its engagement with the frames that guide the relationship between 'humans' and companion species, especially in relation to death. Remember Riddles of the Sphinx (Laura Mulvey and Peter Wollen, 1977) and the way it challenges the framing of gender roles in their intertwining with cinematographic paradigms, and more generally with scopic regimes. Remember 7000 Oaks (Joseph Beuys, 1982) and the reframing of social dynamics as the material support of sculptural practices. Remember Dialogue with President Ceaușescu (Ion Grigorescu, 1978) and the daring challenge towards the frames constituted by the dominant relations of power during the communist dictatorship in Romania. Remember Dan Perjovschi’s “Romania” tattoo and its subsequent removal (1993, 2003) with the uncomfortable questions it raises regarding the function of nationality as a frame in the contemporary art world. Remember The Mirror of Faith (ULTRAFUTURO, 2004-2018) and the problematization of the relationship between genetic code and cultural practice in their mutual framing and de-framing of each other. 
Remember Garlic=Rich Air (Shu Lea Cheang, 2002-2003) and its engagement with the frames of internet technologies as they act with respect to local communities.[35] In all these cases, chosen almost randomly, and in many others, challenging the traditional frames of the artwork does not mean pretending to confuse art with an unframed vitality, but rather re-framing the frames—political, societal, aesthetic, biological, environmental, discursive, etc.—that shape our lives (the complex embodied performances that our lives are). Following a by now well-established modernist tradition, these works question both the frame that is framed and the frame that is framing—which in the end are just moments of the same larger process of individuation. Not the orientation of an unframed art simply indistinguishable from its outsides, but the (dis)orientation of intercalated frames that challenge each other in an absurd dis-harmonic interplay.

If we miss the function of the frame or dismiss it, we risk instantiating an over-encompassing colonialist drive towards an unquestioned Good strictly interrelated with usefulness, be it moral usefulness. When we do take into account the function of the frame, we plunge into an uncomfortable (dis)orientation that problematizes the moral and ethical vectors that drive everyday life. In the frame, I can pay attention to the voice that might or might not be involved in an unethical act,[36] I can attend to it and to the imaginary narrative that it suggests, without taking the stance of a self-righteous judge. Yet the frame does not render the spectator innocent. It simply opens a dangerous, yet valuable, process of (dis)orientation.

So, beyond the repetitive robotic work, what happens is that one tends to attend to the interwoven patterns of breath and the meanings that they carry as life caught in a frame.[37] The intrusion of imagination in the work process, invited inadvertently by the work environment created inside the walls of the secure lab, re-frames the labour of transcription and the frames (the walls of the secure lab) that make it possible, affording the emergence of improbable imaginary artworks that require a different ethical positioning, one based on witnessing and accepting otherness, but also on hybridization and falsification. This is never simply a new orienting ethical position but rather an ongoing crisis and (dis)orientation of ethical responsibility and response-ability. Which also means that one has to distort the frame that one’s embodied being is with respect to the world, to make space in oneself, through the work of attention and imagination, for the work of the artwork to take place.

Every Shift is a Work of Art

The title of this essay, Every Shift is a Work of Art, is borrowed from Herta Müller’s acclaimed novel The Hunger Angel. The protagonist of the novel, a young Transylvanian German arrested and relocated in the years after World War II to a forced labour camp in the Soviet Union, spends the idle minutes of his work breaks imagining surreal scenes in which elements of a distant past, distorted by memory, intrude in the settings of the excruciating work environment of the camp. The 'work of art' is here an imaginary individual experience that glitches 'the real', yet that is not shared as such with anyone (Müller 2012, 206).

I am using in this essay Müller’s phrase[38] in order to think about the labour performed by audio transcribers for AI projects. As explained above, the current wave of AI technologies is subtended by the labour of annotating colossal amounts of data. In the case of AI that recognizes spoken language this task implies the tedious (and, paradoxically, at the very same time exhilarating) work of transcribing millions of audio recordings gathered from the daily use of digital devices around the world. I am not implying that the work of audio transcription is in any way comparable with forced labour; I am interested rather in the suggestion that its logic can be glitched through an act of imagination, through a work of art that remains a singular embodied experience impossible to fully communicate.

My art historical point of reference for thinking about the products of imagination as artworks is the imaginary music developed by composer Octavian Nemescu starting in the mid-1970s. For Nemescu, imaginary music is a musical genre characterized by the fact that it is performed by the spectator/practitioner in their imagination, without any material exteriorization (Nemescu 2015). It is based on musical scores that combine traditional musical notation, extramusical elements (images, texts, drawings, non-musical symbols, etc.), and detailed guiding explanations provided by the authors (Anghel 2020). In other words, imaginary music is a guided exercise of imagination that results in uncommunicable, individual, 'inner' musical experiences. It is possible to interpret Nemescu’s imaginary music from a multitude of perspectives, from being a dissident art form that eschewed the constraints imposed on artistic expression by the communist dictatorship in Romania at the time, to being a reaction towards existing avant-garde musical trends predicated on formal explorations (cf. Anghel 2020, 3-4). Nemescu’s own writing on the subject insists that the main aspect of imaginary music is its anti-spectacular character:[39] as opposed to the society of the spectacle where communication supposes exteriorization (the more spectacular the better) and a distinction between those who emit the message and manipulate and those who receive the message and submit to manipulation, imaginary music is an intimate ritual in which the practitioner communicates with the “unseen”, with the “transcendental” and with its “higher self” (Nemescu 2015, 4-5), and that can create communion between individuals as equal participants in this performance that touches on the borders of the sacred.

While the theoretical framework that I’m operating in is significantly different from the one that Nemescu proposes, I find the preoccupation with anti-spectacular (or non-spectacular) forms of art, that do not rely on exteriorization, to be highly relevant not only for the contemporary art context, but more than that, for navigating life in contemporary digital cultures. In a culture industry where work and entertainment are dominated by a logic of quantification, the undetectable 'inner' ritual becomes an infrathin aesthetic form of resistance. Just as Nemescu proposes to understand imaginary music as a musical genre, like symphonic music, electroacoustic music, opera, or jazz (these are some of Nemescu’s examples), I contend that the imaginary artworks that emerge for transcribers should be considered as a valid form of AI art, as important as any other.

Joanna Zylinska, in AI Art: Machine Vision and Warped Dreams (2020), makes a cogent argument that AI art is not primarily about the creativity of computers, and neither about 'fun' spectacular products of new technologies (also cf. Kelber and Trojanowska, 2019), but rather about the fraught relations between current AI paradigms with their technological and philosophical underpinnings and the forms of embodied being (individual and collective) that emerge in this context. AI art forms are modalities of thinking about, modulating, and often glitching these relations.[40] Some of the forms of AI art that Zylinska discusses include generative art that uses neural networks from a critical perspective, art that deploys AI in conceptual ways, or AI used in relation to computational photography. I propose here that the imaginary artworks that emerge for transcribers should be seen as another art form, different from the ones touched upon by Zylinska, to be added to the considerations of AI art.

The imaginary artworks experienced by transcribers constitute an embodied problematization of performance in the context of AI algorithms. In digital cultures, performance is primarily understood in a mathematical-technological meaning. In mathematics to perform (an operation) is to solve, and more than that, in technological applications to perform means to solve optimally. A high-performance algorithm, for example, is an algorithm that optimally organizes the intake of input and the production of output (Morrison et al. 2019). And the performance (the capacity to solve optimally) of algorithms, or lack thereof (because of glitches, because of insufficient computing power, because of inherent biases, etc.),[41] is intrinsic to every aspect of networked experience, and mediates all embodied life in digital cultures (whether it is consciously acknowledged or not). In their introduction to a special issue of The Drama Review on the theme Algorithms and Performance, Elise Morrison, Tavia Nyong’o, and Joseph Roach (2019) propose to address the anxiety concerning the potential standardizing and normalizing effect of algorithms when uncritically used and produced, by hybridizing this technological meaning of performance with the meaning that the term has in performance studies, namely that of situated embodied experience. In this sense a critical approach to algorithms (AI algorithms included) entails working through the tensions between these two meanings of performance: the ambiguity of an embodied presence on the one hand and the optimal solving on the other, while acknowledging at the same time that embodied presence is itself dependent on the technological mediation.

The imaginary artworks are placed at an intensive point of intersection in these incongruent dynamics of performance: the role of transcribers is to produce the data necessary for optimizing AI, being an active if rarely acknowledged part of the performance of AI algorithms; the robotization of transcription work, with its quantitative targets, is an instantiation of subsuming the messiness of embodied experience to the technological logic of performance, aiming at standardization and normalization; yet, the (unavoidable) biases of transcribers, as well as their mistakes, that is the failures in their performance, introduce errors in the alleged objective rationality of AI impacting its performance (not incidentally, but necessarily); the imaginary artworks are also results of such failures in the performance (in a technological sense) of transcribers, attending to these artworks and allowing them to grow and develop is an infrathin performance (in the sense that performance has in performance studies) that hybridizes the position of the transcriber and glitches the logic of optimization.

Claiming that the work of imagination can constitute valid artworks does not mean dismissing the importance of the collective experience in art, but reconsidering its limits, framing the frames of collective experience and glitching them in this process. The 'inner' artwork produced in imaginary music, for example, remains collective inasmuch as imagination happens in the intimacy-of-the-common, even if it produces singular 'interior' experiences, but also inasmuch as the score introduces in the public space the invitation to an experience that defies the possibility of being censored by imposed ideological limits. From this second perspective, what is collectively shared, and hence is politically powerful, is the potentiality of following the instructions and of performing imaginary pieces of music that remain individual experiences beyond the control of political authority, and that, according to Nemescu, create a communion between individuals as equal participants in this 'inner' ritual. Another, more subtle face of the collective character of imaginary music is the fact that the world of the spectator, modulated by the experience of the piece that they perform, becomes itself a modulation in the individuation that drives the collective and that subtends all processes of psychic individuation through which individuals become who they (never quite) are.

Returning to audio transcription, there is a sense of political empowerment in recognizing the potentiality of imagination to subtly and imperceptibly distort the rules of the game, to perform infrathin glitches in the contemporary politico-economic system as experienced by (highly skilled) low-level workers. This political connotation is not of the order of catharsis, and neither of the order of the revolution (on catharsis vs revolution cf. Boal 2008). It does resonate to a certain extent with the stoic insistence on one’s control over the way in which they are affected by the world (Marcus Aurelius, Epictetus), except that this 'one' is no longer free and independent but a knot of interdependencies in the first place—a knot of interdependencies with a certain degree of indeterminacy that makes possible this subversive partial power of affecting how the world affects you. But this political power is infrathin, not impeding or changing directly the way the system works, not openly fighting against it, but only engendering for a flickering moment an experience that is of a different order and that introduces a potentially significant, if infinitely small, glitch in the system. The lack of usefulness of such glitches is what makes them valuable, what gives them sense as infrathin (dis)orientations of labour that operate outside of the domain that can be directly grasped by the panoptic system.

If Yuk Hui is correct that the danger of contemporary devices that perform tertiary protentions could be an automatization of transcendental imagination (Hui 2016), then the process of spectatorship advocated here is a glitch that throws imagination into the unknown, the impractical and the lack of usefulness—a gesture pertaining to what Erin Manning calls a pragmatics of the useless (Manning 2020). Spectatorship as a process that disorients who one is by throwing the experience of being alive into absurdity (absurd: out of tune, disharmonious)[42] for a brief moment, glitching automatized imagination, throwing the faculties of thought in a disharmonious interplay upon which the world itself emerges as a confusing plane of immanence in a continuous process of enfolding and unfolding.

But in order for this absurd (dis)orienting of the world to happen, one has to make space in oneself, in spite of oneself, through the work of attention and imagination, for the work of the artwork to take place. One has to distort the frame that one’s embodied experience is with respect to the world. The problematic that hence opens up is that of what Zylinska—using a terminology proposed by Toke Lykkeberg in the context of a critical engagement with Katja Novitskova’s work If Only You Could See What I’ve Seen with Your Eyes—calls art for Another Intelligence (Zylinska 2020, 137-144). Zylinska playfully proposes to read AI not as a form of artificial intelligence that supposedly mirrors 'human' sensibility and thinking reinforcing it, but as another intelligence that coagulates in-between 'humans' and technology. Consequently, Zylinska argues:

The efficacy of art that engages with AI lies perhaps first and foremost in its ability to redraft the conceptual and discursive boundaries of human perception, human value and human cultural practice, while drawing us as its human recipients into the recognition of our becoming (with) machines. (Zylinska 2020, 141-142)

Resonating with Zylinska’s argument, the imaginary artworks that emerge in the process of transcription are a form of art for Another Intelligence, a form of art that subtly destabilizes the intertwined becoming of the couple living individual / associated milieu (environment) as it happens in the context of AI data work and, more generally, in digital cultures permeated by AI algorithms. The prerequisite for such a destabilization to take place though, is a (dis)orientation of the subject in its strict interconnection with technology, a (dis)orientation of who one is and how one acts in the world.

Sharing Imaginary Artworks

There is a sense in which art always bears on an interplay of the public and the private, and hence one could say that art is in a privileged position for glitching the dynamic of public and private instantiated by the walls of the secure lab. Even if aesthetics is not about the beautiful anymore, but rather about sensibility itself (cf. Deleuze 1994, Rancière 2004), Kant’s argument for the (ungrounded) pretension to universality intrinsic to the aesthetic experience remains relevant (Kant 2007, 45-51). In different words, we could say that an impulse of sharing is inherent in aesthesis. In Rancière’s terms, the sensible becomes sensible only as politically shared and distributed. The aesthetic experience, while not pertaining to the subject, as Kant believed, remains a situated singular juncture, yet one that emerges in and recursively influences a system of relations in processes of individuation.

Aesthetic experience, in its singularity, emerges in the intimacy-of-the-common, and despite its situated aspects that remain impossible to exteriorize, it inherently attempts to (and pretends to) bridge across and become, if not universal, at least collective. We have to pay attention though: this singularity that tends towards the collective is not the object, the artwork, and neither a judgement about the artwork, but the dynamic system of relations that is instantiated in processes of spectatorship. In other words, what I advocate for and pretend to share as a spectator, is not the particular work of art that I am encountering, and neither a judgement provoked by the work, but the world as modulated by this encounter. It is not necessary that the artwork itself is public, and I do not expect that it provokes a universal response—i.e. the beautiful in Kant—, but I do expect that the world that I live in, modulated by the encounter with the artwork, is the world that we all share. Although I am rationally aware of the impossibility of my claim, of the multiplicity of worlds that we inhabit. Nonetheless, I live as if my world, modulated by the private encounter with the artwork, is the world. I can’t help doing so.

It is tempting in the contemporary western episteme to discard that part of experience that is not directly shared between subjects, to reduce the 'reality' of the world to what is common and invariant. But that would mean remaining anchored in the modern world picture and its framing of nature as standing-reserve, as a resource to be exploited and that comes to be only inasmuch as it is exploitable. Fighting against this world picture and its catastrophic contemporary consequences implies valuing experiences that remain singular and cannot be fully communicated between 'subjects': fantasies impossible to exteriorize, phantasms, or non-spectacular imaginary artworks are just as much a part of the universe in its reality as atoms or genes. The singular experience is still constitutively collective inasmuch as it emerges from processes of individuation, but it is a shared relational space that coagulates in an impossible to share present.

In the case of transcription work, this is rather obvious and easy to grasp: the transcriber is part of a well determined collective, working in a specific micro-political environment (itself part of larger macro-political dynamics)[43], follows strict rules and aims at producing outputs that meet certain predetermined criteria. This entire process is a collective one, and if we go deeper we would have to add the larger environmental and technological factors that are an integral part of this collectivity (and of any collectivity more generally). As in many other instances of contemporary labour, all the actions of transcribers have to be visible and quantifiable (from the perspective of The Client), denying as much as possible any potential singular experience and its consequences. There are rules for everything, rules that one is supposed to follow in the same way as a computational algorithm would do: as little personal input as possible, as much homogenization as possible. Of course, it doesn’t quite work. Language tends to have this stubborn ambiguity and iterability despite the (rather hilarious) attempts of The Client to stabilize it. At least in part, the ambiguity and iterability of language have to do with the ever-shifting unconscious ground upon which sense coagulates, and this ground, even if shaped in the intimacy-of-the-common, is constituted each time by the singularity of experience. Given an audio recording, we are not hearing the same thing, not experiencing the same thing, and not understanding the same thing. In the majority of cases, the differences are negligible, so the output of our transcriptions will be the same. But in others, the ambiguity comes through. Beyond this measurable effect of ambiguity though, there are also the individual imaginary experiences provoked by the work of transcription, which are entirely neglected in the production process. They are excretions that the system is not interested in and does not even acknowledge. 
As transcribers, for the most part, we are dismissing them too, most of the time before they even gain any conscious contour. The only context in which we are forced to come back to these excretions of the work process and engage with them is when they provoke surprising unpleasant symptoms in our lives, when it becomes necessary to unpack them with the help of mental health professionals.

When transcription is subtly transformed from straightforward capitalist production into a process of spectatorship, it operates with these excreted experiences, attending to them in the singularity of an uncommunicable process of individuation that is shaped in this strictly regulated public context. For the panopticon nothing is different. I am just as productive as before, I try to follow the rules with the utmost attention (although sometimes I fail to do so). And yet, an infrathin difference is performed that disorients (infinitesimally) the transcription work and its context, even if it does so in a way that is not immediately shareable. Unavoidably the audio recordings raise memories, bring back fragments of the past filtered through boredom and haste (meeting the targets, finishing the tasks as quickly as possible so I can gain some time on the side for reading and writing). Unavoidably the audio recordings contour improbable surrealist narratives and poetic fragments in a cut-up technique that would make Tzara envious, superimposing contents that would probably make Genet or Burroughs blush (if not of discomfort towards some of the punchlines, then of anger towards the infuriating blandness of it all). And these memories, that combine the data with the imaginary, and modulate the imaginary in its turn with the experiences generated in this process, tend to be a fertile ground for improbable imaginary futures, that flicker for a moment, often not even conscious, and sink back into non-being. It is these barely acknowledged futures, closed in the coffin of embodied thinking, that act as seeds for the becoming of the individual. Everything that I do is partially conditioned by these flickering dying futures that I cannot even properly convey in language.

There is a pretension to universality to the experience of the world as modulated by these infrathin excretions of labour, a pretension to share the future(s) that they seed, and yet, just as with Kant’s beautiful, this pretension is rationally unfounded and will never be realized. The world, the experience of life is singular, even if shaped in the collective. One lives and dies alone, although one cannot live except as a metastable system of relations, except as a consequence of systems of care, that is, as the result of relations of exteriority in the intimacy-of-the-common.

There is also another aspect of sharing involved in the imaginary artworks that result from the labour of transcription. When it comes down to it, despite the intentions of The Client and the arguably unethical system of exploitation that the process of audio transcription is an integral part of, there might be an infrathin (maybe fantasmatic) positive ethical aspect in all this. Attending (or at least attempting to), in boredom, to The Other’s breath, to the (never bare) life that flows and dies, and to the senses that it carries in its modulations. We live and die alone (despite the fact that this singular life is itself nothing but a knot of relations), and we need a witness to that loneliness. We have invented deities and countless conspiracy theories just in the hope that someone shares the un-shareable present, be it the enemy, be it the one who we are fighting. We desperately need a witness in order to give some appearance of sense to the singularity of one’s life. The panopticon is not only a system of oppression through surveillance, it is also a system that gives sense by oppressing through surveillance, be it the sense of a fight against being surveilled. Laura Marks astutely observes that the experiences of other beings are for the individuated being virtual (Marks 20-23). There is a sense in which each one of us longs to have other individuals actualize in their embodied being this virtuality (for them) that is actual for us as the present experience of the world. And each of us longs to actualize as our present the virtuality that the world of The Other is, this absolutely unreachable reality that is not simply an object in one’s world, and neither another subject like oneself (cf. Blanchot 1993). This longing for communion that drives our lives remains always unfulfilled, although there are different ways of failing to be in common, and some are (relatively) better than others.

This longing for communion is nothing but what, in Derrida’s reading of Freud, appears as the search for Gradiva’s step: a quest for living the other’s present (Derrida 1995, 97-99). You will never step Gradiva’s step, you will never live my present, nor die my death. And yet, one can get close. Following Blanchot (1993), attending to the other in boredom and weariness is the path towards this communion that is never to happen. And that is because in boredom and weariness one can attend to the other as an absolute Other, as an absolutely unknowable world that cannot be known, nor used, nor saved, but whose actual traces can be witnessed and attended to with care. The world of the other will remain virtual to me, and I should allow it to remain so, I should never pretend to know it, to unveil it (cf. Glissant 1990 on the right to opacity), but I can allow the actual traces of this virtuality, the traces of the other, to be inscribed in me, against me, becoming in their actuality (dis)orienting vectors on the virtual plane that drives my being and becoming (i.e. a poetics of relation, an ongoing infrathin creolization, a constant becoming other).

The labour of audio transcription, in those moments when it becomes a process of spectatorship and to the extent that it does so, creates an infrathin opening for such a performance of attention. One attends, in boredom and weariness, with haste and guilt, to the digital recording of the other’s breath at its most intimate—in relation to prostheses that became an integral part of their embodied being. Attending to everyday life caught unawares, to that which often escapes representation in more traditional art media, attending to the intimate moments that are usually censored out in institutionalized art, attending to the shameful weak moments that we hide and that we cannot help but hope are witnessed, attending to marginalized voices that are rarely heard on their own terms in the public space (except for well-defined circumstances in which they are reified: the victim of the natural disaster, the object of the justice system, the object of sociological studies etc.), attending to this historically situated relation between users and their devices that is so banal and widespread now and yet so volatile (and that is likely to soon change and disappear). In a way, for me at least, the fact that someone might listen to what I am saying towards the microphone of my devices is comforting, as much as it is scary. To the transcribers who might hear my voice: Thank you for listening to me. Thank you for attending to the breath that I exhale, and to the meanings that it tries to carry.

Life passes in its unbridgeable singularity. It is nothing but this infinitely thin passing that cannot be grasped and cannot be shared. Neither can it be owned. A singularity shaped in the intimacy-of-the-common, that longs to be witnessed by the collective, and yet that eludes any form of communication. Life leaves traces, but the trace never quite leads to the present of inscription, to the moment when tracing and trace, experience and its mark were not yet split in this unbridgeable duality. One cannot re-live Gradiva’s step. Hence the an-archival impulse of attending to the trace (of attending to the traces of life) trying to grasp the living present that inscribed it, while at the same time trying to destroy the trace, trying to reach, beyond the inscription that obscures it the same originality of the living present. Yet, life gains its presence only as that which is lost in leaving its mark. It is the separation, the genesis of the caesura, between the uncommunicable present and the trace that it inscribes that defines both life and its trace. The present is present only as continuously re-veiled and revealed by the traces that it produces. And the other way around, the trace is a trace only as re-covered by an uncommunicable present that disappears in the very act of inscribing its traces. The present, living, experience is the consequence of the emergence of the trace that veils it, while, at the same time, the trace is nothing but an inscription of the present that it grounds. The trace is of the order of the intimacy-of-the-common that is the condition of possibility of the living individual, while being at the same time a mark that veils life in the process of revealing it.

This does not erase the ethical problems of surveillance and does not redeem either The Client or the transcriber, nor The Client of The Client in their double position as aggressors and victims. But it does raise a series of stringent questions: how to learn to love and accept all these voices that are so different and so contradictory? How to learn to attend to them, and to let them inscribe their traces in who one is? And, on the other hand, how to stop them from doing so?

Conclusion

The purpose of this essay was to reconsider the labour of audio transcription for AI projects from a perspective hitherto absent (to my knowledge) from academic literature: from the situated point of view of a transcriber. While it is impossible to generalize the transcription context that I am responding to, nonetheless such partial, situated accounts are necessary in order to overcome an oversimplified generalized figure of the AI data worker as a naive, innocent victim of exploitation. Writing from the perspective of the worker, writing as an audio transcriber for AI projects, is itself a gesture that complexifies the figure of the worker.

I understand theoretical writing as an embodied performance in which the resulting text is a trace and a vector of processes of individuation rather than being a stable rational construction or an ultimate telos. Consequently, the process through which the text emerges is just as important as the text itself and should be seen as an integral part of this project.

I have attempted an experiential (and experimental) writing methodology that would allow the embodied context of audio transcription to seep into the text not only as a subject under discussion but also at more basic affective, emotional, and unconscious levels. Almost the entire first draft of this essay was written while I was working as a transcriber: mornings before work, evenings after work, weekends, or during work hours when I had reached my targets and had some extra time on my hands. My contract was terminated at the beginning of March 2025 because The Client discontinued the collaboration for Romanian language transcription with the BPO that employed me. I was lucky to have some idle days between finishing the last transcriptions and the date my contract actually ended, and I took the opportunity to finalize the first draft in this time. Subsequently, I have worked my way through the text correcting any errors, completing notes and references, and making other changes that I found necessary in preparing it for publication, but I have tried to stay true to that first complete version realized during the period I worked as a transcriber. For better and for worse, the writing style, the tone of the voice, and the arguments that this essay makes are all imbued with this context of transcription labour and should be read accordingly, with an eye to all that is flowing between the lines.

Ultimately, the subject of this essay, imaginary artworks that only exist for a brief moment 'in the mind' of the transcriber, can never be clearly captured in words or addressed directly, it can only be guessed between the lines as we turn and return around it, as we are avoiding it. So, as a closure of sorts, instead of retracing our way through the preceding chapters, I am proposing yet a new way of avoiding the subject: a low-fidelity audio-visual work, in the spirit of small file media.[44]

The work was made in the last days of my contract as a transcriber when I already knew that my encounter with this impressive piece of electronic literature had reached its conclusion. The audio is based on the recording of a small private performance, without spectators, in which I have tried to remember, in a given time frame, as many fragments of the transcribed audio recordings as I could, as closely as I could. Subsequently, I have split the resulting audio into four parts of approximately equal lengths and superimposed them, with two of the parts being reversed, creating a barely intelligible texture of voices. The next step involved compressing the audio file, and converting it a few times between different formats and different audio codecs, until the words became entirely unintelligible. So, the audio is based on a flagrant trespassing of the walls of the secure lab: I am divulging information that should remain secret (even if most of what I have managed to remember is entirely innocuous), except that this information is distorted to the point of noise (through a series of simple, amateurish steps). The video is based on images of my blood seen under a microscope, distorted by glitches resulting from file compression. The idea was to reduce the entire audio-video work, in the spirit promoted by the Small File Media Festival, to a size of less than 1 MB per minute.

Data Leaks (2025):

(https://vimeo.com/1085625427)

This abstract, haptic (cf. Marks 2002), audio-video work touches on aspects of audio transcription that I cannot write about. On the one hand, it points to the physical experience of the body of the transcribers, the blood flowing through their veins, that which is negated by the robotization of work[45] and that cannot be captured in systems of representation—life at an organic, unconscious yet very real and present level. On the other hand, it uses glitches resulting from information compression (half in joke and half seriously) in a politically charged manner, to convey on a perceptual and affective level something that it is forbidden to communicate.

Despite this brief explanation, I am not trying to say anything with this work, there is no stable meaning that as an 'author' I want to transmit to the spectators. I rather see the work as offering an opportunity for making sense, an opportunity for a proliferation of senses that depend as much on the work itself as they do on the embodied processes of spectatorship in which they are actualized. And this is valid for the entirety of this text.

In the same way that I argue for valuing the imaginary artworks that emerge in the labour of transcription, I am mindful of the processes of imagination inherent in reading theoretical texts. I hope that this essay, the above audio-video work included, rather than orienting the reader in the problematic of AI data work, will offer fertile (dis)orienting seeds for such processes of imagination, for dynamics of sense, that could grow into larger intertwined processes of psychic, collective and technological individuation.

Notes

[1] Following Muldoon et al. 2024a, I refer to this type of labour as AI data work.

[2] Probably a better designation in this case would be specta(c)torship. The term has its origins in the politically involved forms of theatre developed by Augusto Boal, where the line between spectators and actors is blurred, spectators are invited to act on the stage along with the actors and become spect-actors (Boal 2008). In our case the parenthetical c of specta(c)torship would indicate a politically charged interplay of passivity and activity involved in audio transcription when addressed from the perspective proposed here. For the sake of readability, I will be using spectatorship rather than specta(c)torship, but I would ask the reader to keep in mind this political nuance emerging from a paradoxical intertwining of activity and passivity.

[3] For a short explanation of the difference between AI data work on crowd-work platforms (such as Amazon’s Mechanical Turk) and AI data work in BPOs (companies specialized in business process outsourcing) see below. For a thorough discussion of this topic see Muldoon et al. 2024a.

[4] See Marks 2024, 88-89.

[5] “And what you thought you came for / Is only a shell, a husk of meaning / From which the purpose breaks only when it is fulfilled / If at all. Either you had no purpose / Or the purpose is beyond the end you figured / And is altered in fulfilment.” From T.S. Eliot, “Little Gidding” (Eliot, 2009).

[6] Fragment from a text by Ana Ioniță, from the experimental theatre piece Pinocchio’s Dream, director Alexandru Pamfile, Club The Ark, Bucureşti, 2010. Approximate English translation: Breathe, breathe deeply. Resume (/Repeat).

[7] For a cogent critique of Searle’s argument see Hui (2023b).

[8] On the common practice in AI data work of preventing workers from understanding the context of their labour cf. Miceli and Posada 2022, 30-31.

[9] The widespread use of non-disclosure agreements is far from being a secret. Cf. for example Bartholomew and Perrigo 2023.

[10] Yes, there are strict procedures in place that, as far as I can see as a transcriber, make any major data leaks unlikely, if we understand 'major data leaks' strictly quantitatively—i.e. it is not probable that the personal data of hundreds of thousands of users will be leaked at the same time in one major event, at least not by any individual transcriber. Nonetheless, the personal information of users is frequently exposed, and for some users this happens repeatedly, so that transcribers occasionally have a fairly good idea of someone’s love life, family situation, or health situation for example. Because of the measures that are in place, it is often hard to actually identify such people whose voices one hears repeatedly, but sometimes this is possible, and indeed easy, when a name, an address, a phone number etc. is mentioned.

[11] The companies involved claim that only a very small percentage of the recordings are heard by 'human' transcribers (Hern 2019a, 2019b, Paul 2019). Given the lack of independent mechanisms for verifying this information, the silence imposed upon transcribers, and the history that each of these companies has of pushing legal and ethical limits for the sake of profit, such claims cannot be taken at face value. Moreover, from an ethical perspective, the fact that only a small percentage of users will have their sensitive data exposed is not actually reassuring.

[12] Le partage du sensible (the distribution of the sensible, or the partition of the sensible) refers to the implicit rules that govern the sensible order (what we perceive, how we perceive, how we make sense of what we perceive, etc.). In simple terms, in every society the self-evident facts of perception are dependent upon an underlying logic of inclusion and exclusion regarding what and how can be seen (heard, touched, felt etc.) and what has to remain hidden, who can see and from what position, who has to remain invisible and silent, etc. (Rancière 2004).
Rancière’s argument regarding the 'mechanical arts', including cinema, is that they are inscribed in an aesthetic revolution that subverted the distinctions between high and low subject matter in the arts (making it possible for the representation of the anonymous individuals to take center stage), questioned the privilege of speech over visibility, dismantled the hierarchy of arts, and insisted on the immanence of meaning in things themselves (Rancière 2004, 31-34). Such aesthetic revolution, that took place with modernity, had widespread consequences (not always positive) on the political and societal level because it disturbed the established distribution of the sensible.

[13] If it took incredible commitment, skilfulness and inventiveness for Vertov’s film crew to capture life unawares (a virtuosic performance thematized in Man With A Movie Camera for example), if it took 'heavy brain work' and 'heavy muscle work' to develop and deploy pioneering devices for recording sound on film outside of the film studio in order to capture the life of the workers in the region of Donbas for Enthusiasm (see Vertov 1984, 106-112), today we carry technology with such recording capacities in our hands, in our pockets or strapped around our wrists, always ready-to-hand. The recording instruments that we are using are so integrated in our daily lives that the action of recording remains invisible in the seamless functioning of our devices. In fact, as Bruno Bachimont argues, recording is not a choice in digital societies, it is a prerequisite inherent to all digital communication. The choice is rather what records should be kept and what records should be deleted (Bachimont 2018).

[14] For an account of the different types of BPOs and digital platforms used in the AI industry, and their respective advantages for The Client, see Muldoon et al. 2024a.

[15] While it is true that certain BPOs offer employment opportunities for disadvantaged communities, their exploitative work practices contradict their philanthropic public image. In fact, given the profit driven data-production dispositif, it is an open question if truly community friendly and empowering BPOs could exist.

[16] Gray and Suri restrict the meaning of ghost work to labour performed on crowd-work platforms, here I use the term more broadly to also encompass BPOs and the unrecognized work of unwitting labourers. While work conditions differ in these three cases, nonetheless they are all forms of hidden labour, invisible for the users of AI technologies.

[17] Also see in this sense Williams et al. 2022.

[18] Not long after writing the draft for this text my contract was actually terminated, confirming the sense of instability that I am referring to here. I have completed the first draft of the essay while still being employed as a transcriber—which is an important methodological point of this project, namely the attempt of writing about transcription while working as a transcriber, letting the conditions of work transpire in the text not only as subjects of discussion, but also as the unformulable affective background that suffuses the writing process. Consequently I have kept throughout the text the usage of the present tense when speaking about my employment, even if the revisions of the text were completed after my employment ended.

[19] This claim is supported by theoretical work in other fields. As one example among others, there are arguments in black feminism studies that the cliché image of the black woman as a naive, oppressed figure is not conducive to empowering strategies (cf. Horáková 2017, 34).

[20] Cf. Blanchot (1993, 49-58) on the absolute otherness of the Other; Cf. Glissant (1990, 203-209) on opacity.

[21] The work contracts make it illegal to address the problem in the public space (the salaries are confidential), but the unethical biases of the system are clear even if we limit ourselves to what everybody already knows. As in other jobs that require foreign languages (for example many customer service jobs), workers’ salaries depend on the language in which they work. From the perspective of the companies this is justified in terms of available resources and demand. So, for example, working in Romanian language tends to pay less than working in German, French, Dutch, Swedish, Norwegian, etc. The common knowledge, that for (unethical) legal reasons I can neither confirm nor deny, is that the lowest earnings are for those who work in the local language of the country where the company is located (because those positions are easy to fill) and for those who work in English (everyone is expected to speak English, so it is not considered a skill anymore)—which is supported for example by Paul Victor Ribeiro’s article on the exploitation of AI data workers in Brazil (2021), mentioning that those working in Portuguese were hired on smaller wages than those working, for instance, in Italian. What this amounts to, is that for the same job people from rich countries (or who speak the respective languages at close to native level) are paid more than others, simply because it would not be possible to fill those positions at the same salaries as those offered for workers from countries with lower incomes. Besides the personal frustration of many transcribers in face of this inequitable distribution of revenues, it is not hard to see how on a more general plane such unethical payment practices, that seem to be ubiquitous in the field, exacerbate economic, social and political inequalities.

[22] The other major problem being the lack of sense resulting from the alienation of workers, which I mentioned above and which I address in more detail elsewhere under the topic of the robotization of AI data work.

[23] It can be argued that this reduction of thinking to a simple algorithm is necessary in order to be able to claim that thinking is de jure nothing more than an algorithm.

[24] As described for example, in relation with text data, in an article for Time by Billy Perrigo 2023, and in a follow-up interview with Perrigo conducted by Jem Bartholomew, 2023.

[25] To be clear, I am not saying that working as a transcriber allows one to live in some sort of fairy tale world of imagination that transcends the material work conditions. Such views of imagination as being a positive and unproblematic escape from 'the real' are naive at best and need to be discarded if we are to understand what is at stake here.

[26] For more on the framing function of the 'secure lab' please see below Chapter 10 Parerga.

[27] On reCAPTCHA as exploitation of unwitting labour see Morreale et al. 2024.

[28] My contention is not that AI does not produce a new form of intelligence, I think it does, but rather that the bodies that think (the embodied conditions of thinking) are not simply the computers or networks of computers, but a complex folded structure (cf. Marks 2024) that includes 'human beings' and environmental factors (cf. Simondon 1989). Importantly, the same argument goes for 'human' intelligence: there is no 'human' body that thinks, the bodies that think (the embodied conditions of thinking) are not predefined individuals, but complex systems in processes of individuation, with thick technological, political, and environmental layers.

[29] Cf. in this respect Marjorie Perloff’s (2002) connections between Stein’s Tender Buttons and Marcel Duchamp’s conceptual poetics that problematize the figure of the author as the transmitter of meaning and force the spectator/reader “to stop looking (or reading) and to think through what the art work is doing” (94). In other words, the works make sense only through an intensive labour of spectatorship, a guided exercise of imagination that cannot find any stable orienting result.

[30] It is essential to keep in mind Simondon’s insight that processes of individuation never simply produce individuals as independent beings, but rather drive the being and becoming of the couple individual-associated milieu (i.e. individual-environment). In other words, any individual is and becomes only in strict interdependence with its associated milieu, and the problematic of the individual is, strictly speaking, that of its interrelations with its environment (Simondon 2013, 63).

[31] For the purpose of the present argument, going further into reading Szeman’s work would be too long a detour. Suffice it to say though that over the past few years I have returned to this work again and again, to the point that it is now an integral part of my imagination. I warmly invite the reader to visit the work on their own: https://www.petraszeman.com/videos.html.

[32] Of course, none of this is officially allowed, but one gets to know the cracks in the walls of the secure lab.

[33] Jacques Derrida argues that every ergon (oeuvre, work of art) is dependent on a par-ergon, on a frame of sorts, a marginal element that, without pertaining to the internal logic of the work, creates the condition of possibility of the work (Derrida 1987).

[34] Note that in a panoptic system the removal of the guards (big companies and mechanisms of national states) from their central tower would not actually accomplish anything as long as the system stays in place.

[35] These are all fairly well known art historical examples, I will not describe the works here. If you are unfamiliar with any of them, a simple online search will suffice to find the necessary information.

[36] Language is tricky and as a transcriber it is often impossible to tell without further context if what you are witnessing is an abuse, a joke, a fiction or a well-intentioned search for information.

[37] At the limit, transcription can even feel sometimes like a minimalist choreography, my hands dancing their strict robotic dance to the modulations of your voice that I recognize as phonemes of a particular language and that I transcribe.

[38] In Müller’s novel, “every shift is a work of art” is a recurrent motif that occurs in a few different contexts having different meanings. For the purpose of this essay, the aspect that I am interested in is the one mentioned above, which refers to the products of imagination as works of art.

[39] Irinel Anghel argues that a better designation for this practice would be non-spectacular rather than anti-spectacular (Anghel 2020, 3-4).

[40] This approach to AI art fights the current imbalance that exists within the established relationship between art and AI, where research leans heavily toward AI and its further development, while art is relegated to being a means of dissemination, public presentation, and commercialization of new technologies (cf. Kelber and Trojanowska, 2019).

[41] The problem of failing algorithms in the case of Artificial Intelligence—i.e. algorithms that are not performing optimally, that perform erroneously, or that fail to perform altogether—is not merely incidental, it is an indelible inherent problem of all AI and disregarding it (which tends to often be the case) has widespread societal, political and environmental consequences (cf. Pasquinelli 2019a).

[42] According to one possible etymology of absurdity the root 'absurdus' originally means (among others) 'out of tune', 'discordant.' See for example the etymology accepted by the Oxford English Dictionary (https://www.oed.com/view/Entry/792?redirectedFrom=absurd#eid, accessed 08.04.2025).

[43] What with Miceli and Posada (2022) we could call a data-production dispositif.

[44] Cf. Small File Media Festival, https://smallfile.ca/.

[45] A blatant example of the disavowal of the body in contemporary work environments, including audio transcription but not only, is that the midday lunch break is not included in the 8 hours of work that are paid by the company. The fact that my blood has to carry nutrients to my cells in order for me to be able to transcribe and that those nutrients usually have to be obtained through eating, is considered by The Client and the BPO merely as a waste of time that consequently will not be remunerated. Like a robot, you are valued for the outputs per unit of time, and any periodical 'maintenance' that you need is a liability.

Works Cited

Anghel, Irinel, ed. 2020. Redescoperind Muzica Imaginară. București: Asociația Jumătatea Plină.

Ahmed, Sara. 2006. Queer Phenomenology: Orientations, Objects, Others. Durham: Duke University Press.

Allen, Greg. 2013. “The Annotated Charlotte Moorman Answering Machine Tapes.” Blog post on greg.org, October 16. https://greg.org/archive/2013/10/16/the-annotated-charlotte-moorman-answering-machine-tapes.html

Aurelius, Marcus. 2006. Meditations. Translated by Martin Hammond. London: Penguin Classics.

Bachimont, Bruno. 2018. “Between Formats and Data: When Communication Becomes Recording.” In Towards a Philosophy of Digital Media, edited by Alberto Romele and Enrico Terrone. Palgrave Macmillan.

Barthes, Roland. 1977. “The Death of the Author.” In Image. Music. Text. Translated by Stephen Heath. London: Fontana Press.

Bartholomew, Jem and Billy Perrigo. 2023. “Q&A: Uncovering the Labor Exploitation that Powers AI.” Columbia Journalism Review, August 29. https://www.cjr.org/tow_center/qa-uncovering-the-labor-exploitation-that-powers-ai.php.

Blanchot, Maurice. 1993. The Infinite Conversation. Translated by Susan Hanson. Minneapolis: University of Minnesota Press.

Boal, Augusto. 2008. Theatre of the Oppressed. Translated by Charles A. McBride, Maria-Odilia Leal McBride, and Emily Fryer. London: Pluto Press.

Chwasta, Madi. 2024. “University of Melbourne to repay $72 million to staff after 'unlawful' conduct across a decade.” ABC News, December 9. https://www.abc.net.au/news/2024-12-09/university-of-melbourne-underpayment-fair-work-ombudsman/104701012#

Choi, Wai Kit. 2019. “Power and Money: Explaining the Rise of Digital Media through Surveillance Capitalism.” In Interfacing Ourselves: Living in the Digital Age. Edited by Cristina Bodinger-deUriarte. New York: Routledge.

Combes, Muriel. 2013. Gilbert Simondon and the Philosophy of the Transindividual. Translated by Thomas LaMarre. Cambridge, MA: The MIT Press.

de Duve, Thierry. 1991. Pictorial Nominalism: On Marcel Duchamp’s Passage from Painting to the Readymade. Translated by Dana Polan and Thierry de Duve. Minneapolis: University of Minnesota Press.

Derrida, Jacques. 1967. De la Grammatologie. Paris: Les Éditions de Minuit.

Derrida, Jacques. 1987. The Truth in Painting. Translated by Geoff Bennington and Ian McLeod. Chicago: The University of Chicago Press.

Derrida, Jacques. 1995. Archive Fever: A Freudian Impression. Translated by Eric Prenowitz. Chicago: The University of Chicago Press.

Deleuze, Gilles. 1994. Difference and Repetition. Translated by Paul Patton. New York: Columbia University Press.

Deleuze, Gilles. 1985. Cinéma 2: L’image-temps. Paris: Les Éditions de Minuit.

Duchamp, Marcel. 1999. Notes. Paris: Flammarion.

Ekbia, Hamid R., and Bonnie Nardi. 2017. Heteromation, and Other Stories of Computing and Capitalism. Cambridge, MA: The MIT Press.

Eliot, T. S. 2009. Collected Poems 1909-1962. London: Faber and Faber. EPUB.

Enthusiasm: The Symphony of Donbas. 1931. Director Dziga Vertov, cinematography B. Cejtlin, editor Elizaveta Svilova. Ukrainfilm.

Epictetus. 2008. Discourses and Selected Writings. Translated by Robert Dobbin. London: Penguin Books.

Foucault, Michel. 1995. Discipline and Punish: The Birth of the Prison. Translated by Alan Sheridan. New York: Vintage Books.

Fuller, Matthew. 2005. Media Ecologies: Materialist Desire in Art and Technoculture. Cambridge, MA: The MIT Press.

Genet, Jean. 1949. Journal du voleur. Paris: Gallimard.

Gray, Mary L., and Siddharth Suri. 2019. Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass. New York: Houghton Mifflin Harcourt.

Glissant, Édouard. 1990. Poétique de la Relation. Paris: Gallimard.

Grama, Sebastian. 2008. Note pentru o fenomenologie a eranței. București: Editura Universității din București.

Haggerty, Kevin D. and Richard V. Ericson. 2000. “The Surveillant Assemblage.” British Journal of Sociology, Vol. 51, No. 4: 605-622.

Haraway, Donna J. 2016. Staying with the Trouble: Making Kin in the Chthulucene. Durham: Duke University Press.

Hayles, N. Katherine. 2007. “Hyper and Deep Attention: The Generational Divide in Cognitive Modes.” Profession, 187-199.

Heidegger, Martin. 1977. “The Question Concerning Technology.” In The Question Concerning Technology and Other Essays, translated by William Lovitt, 3-35. New York: Garland Publishing.

Hern, Alex. 2019a. “Amazon Staff Listen to Customers' Alexa Recordings, Report Says.” The Guardian, April 11. https://www.theguardian.com/technology/2019/apr/11/amazon-staff-listen-to-customers-alexa-recordings-report-says.

Hern, Alex. 2019b. “Apple Contractors 'Regularly Hear Confidential Details' on Siri Recordings.” The Guardian, July 26. https://www.theguardian.com/technology/2019/jul/26/apple-contractors-regularly-hear-confidential-details-on-siri-recordings.

Horáková, Martina. 2017. Inscribing Difference and Resistance: Indigenous Women’s Personal Non-Fiction and Life Writing in Australia and North America. Brno: Filozofická Fakulta Masarykova Univerzita.

Hsun, Lu. 1974. “Revenge.” In Wild Grass. Peking: Foreign Languages Press.

Hui, Yuk. 2016. On the Existence of Digital Objects. Minneapolis: University of Minnesota Press.

Hui, Yuk. 2019. Recursivity and Contingency. London: Rowman & Littlefield International.

Hui, Yuk. 2021. Art and Cosmotechnics. Minneapolis: University of Minnesota Press.

Hui, Yuk. 2023a. “Imagination and the Infinite: A Critique of Artificial Imagination.” Balkan Journal of Philosophy, Vol. 15, No. 1: 5-12.

Hui, Yuk. 2023b. “ChatGPT, or the Eschatology of Machines.” e-flux, 137. https://www.e-flux.com/journal/137/544816/chatgpt-or-the-eschatology-of-machines/.

Kant, Immanuel. 2007. Critique of Judgement. Translated by James Creed Meredith. Oxford: Oxford University Press.

Kleber, Pia and Tamara Trojanowska. 2019. “Performing the Digital and AI: In Conversation with Antje Budde and David Rokeby.” The Drama Review, Vol. 63, No. 4 (Winter 2019): 99-112.

Man with a Movie Camera. 1929. Director Dziga Vertov, cinematography Mikhail Kaufman, editors Dziga Vertov and Elizaveta Svilova. VUFKU, Dovzhenko Film Studios.

Manning, Erin. 2020. For a Pragmatics of the Useless. Durham: Duke University Press.

Marks, Laura U. 2002. Touch: Sensuous Theory and Multisensory Media. Minneapolis: University of Minnesota Press.

Marks, Laura U. 2024. The Fold: From Your Body to the Cosmos. Durham: Duke University Press.

Miceli, Milagros, and Julian Posada. 2022. “The Data-Production Dispositif.” arXiv:2205.11963v1. arXiv.

Morreale, Fabio, Elham Bahmanteymouri, Brent Burmester, Andrew Chen, and Michelle Thorp. 2023. “The Unwitting Labourer: Extracting Humanness in AI Training.” AI & Society, Vol. 39, No. 5: 2389-2399.

Morrison, Elise, Tavia Nyong’o, and Joseph Roach. 2019. “Algorithms and Performance: An Introduction.” The Drama Review, Vol. 63, No. 4 (Winter 2019): 8-13.

Muldoon, James, Callum Cant, Boxi Wu, and Mark Graham. 2024a. “A Typology of Artificial Intelligence Data Work.” Big Data & Society, Vol. 11, No. 1 (January–March): 1-13.

Muldoon, James, Mark Graham and Callum Cant. 2024b. Feeding the Machine: The Hidden Human Labour Powering AI. Edinburgh: Canongate. EPUB.

Müller, Herta. 2012. The Hunger Angel. Translated by Philip Boehm. London: Portobello Books.

Nancy, Jean-Luc. 1997. The Sense of the World. Translated by Jeffrey S. Librett. Minneapolis: University of Minnesota Press.

Nemescu, Octavian. 2015. “Muzica imaginară.” Revista Muzica, Vol. 26, No. 3-4: 3-29.

Paglen, Trevor. 2024. “Society of the Psyop, Part 2: AI, Mind Control, and Magic.” e-flux, 148. https://www.e-flux.com/journal/148/631017/society-of-the-psyop-part-2-ai-mind-control-and-magic/.

Pasquinelli, Matteo. 2019a. “How a Machine Learns and Fails - A Grammar of Error for Artificial Intelligence.” Spheres. Journal for Digital Cultures, No. 5. https://spheres-journal.org/contribution/how-a-machine-learns-and-fails-a-grammar-of-error-for-artificial-intelligence/.

Pasquinelli, Matteo. 2019b. “Three Thousand Years of Algorithmic Rituals: The Emergence of AI from the Computation of Space.” e-flux, 101. https://www.e-flux.com/journal/101/273221/three-thousand-years-of-algorithmic-rituals-the-emergence-of-ai-from-the-computation-of-space/.

Pasquinelli, Matteo. 2023. The Eye of the Master: A Social History of Artificial Intelligence. London: Verso.

Paul, Kari. 2019. “Google workers can listen to what people say to its AI home devices.” The Guardian, July 11. https://www.theguardian.com/technology/2019/jul/11/google-home-assistant-listen-recordings-users-privacy.

Paullada, Amandalynne, Inioluwa Deborah Raji, Emily M. Bender, Emily Denton, and Alex Hanna. 2020. “Data and its (Dis)Contents: A Survey of Dataset Development and Use in Machine Learning Research.” arXiv:2012.05345v1. arXiv.

Perrigo, Billy. 2023. “OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic.” TIME, January 18, 2023. Accessed 09.11.2024. https://time.com/6247678/openai-chatgpt-kenya-workers/.

Perloff, Marjorie. 2002. “The Conceptual Poetics of Marcel Duchamp.” In 21st-Century Modernism: The “New” Poetics, 78–114. Oxford: Wiley-Blackwell.

Perloff, Marjorie. 2021. Infrathin: An Experiment in Micropoetics. Chicago: The University of Chicago Press.

Pinocchio’s Dream. 2010. Experimental theatre piece, director Alexandru Pamfile, at Club The Ark, București.

Rancière, Jacques. 2004. The Politics of Aesthetics: The Distribution of the Sensible. Translated by Gabriel Rockhill. New York/London: Continuum.

Ribeiro, Paulo Victor. 2021. “Brazilian Workers Paid Equivalent of 70 Cents an Hour To Transcribe TikToks.” The Intercept, October 2, 2021. Accessed 09.11.2024. https://theintercept.com/2021/10/02/tiktok-bytedance-transcription-brazil/.

Rowe, Niamh. 2023. “Millions of Workers Are Training AI Models for Pennies.” WIRED, October 16, 2023. Accessed 09.11.2024. https://www.wired.com/story/millions-of-workers-are-training-ai-models-for-pennies/.

Searle, John R. 1980. “Minds, Brains, and Programs.” The Behavioral and Brain Sciences, Vol. 3, No. 3: 417-424.

Simondon, Gilbert. 1989. Du mode d’existence des objets techniques. Paris: Aubier.

Simondon, Gilbert. 2013. L’individuation à la lumière des notions de forme et d’information. Grenoble: Éditions Jérôme Millon.

Stein, Gertrude. 1914. Tender Buttons. New York: Claire Marie. eBook edition by Project Gutenberg, 2005.

Stiegler, Bernard. 1998. Technics and Time 1: The Fault of Epimetheus. Translated by Richard Beardsworth and George Collins. Stanford, CA: Stanford University Press.

Stiegler, Bernard. 2009. Technics and Time, 2: Disorientation. Translated by Stephen Barker. Stanford, CA: Stanford University Press.

Stiegler, Bernard. 2018. The Neganthropocene. Edited and translated by Daniel Ross. London: Open Humanities Press.

Szeman, Petra. 2017. “Trajectories.” Video artwork on Szeman's website: https://www.petraszeman.com/videos.html

Tubaro, Paola, Antonio A. Casilli, and Marion Coville. 2020. “The Trainer, the Verifier, the Imitator: Three Ways in Which Human Platform Workers Support Artificial Intelligence.” Big Data & Society, Vol. 7, No. 1.

Vertov, Dziga. 1984. Kino-Eye: The Writings of Dziga Vertov. Edited by Annette Michelson, translated by Kevin O’Brien. Berkeley: University of California Press.

White, Michele. 2006. The Body and the Screen: Theories of Internet Spectatorship. Cambridge, MA: The MIT Press.

Williams, Adrienne, Milagros Miceli and Timnit Gebru. 2022. “The Exploited Labor Behind Artificial Intelligence.” Noēma, October 13, 2022. Accessed 09.11.2022. https://www.noemamag.com/the-exploited-labor-behind-artificial-intelligence/.

Zhou, Viola and Caiwei Chen. 2023. “China’s AI boom depends on an army of exploited student interns.” Rest of the World, September 14, 2023. Accessed 09.11.2024. https://restofworld.org/2023/china-ai-student-labor/.

Zylinska, Joanna. 2020. AI Art: Machine Visions and Warped Dreams. Open Humanities Press.