
Data Leaks (2025)
Data Leaks is a low-fidelity audio-visual work in the spirit of small file media. (I am borrowing the idea of small file media from the Small File Media Festival: https://smallfile.ca/)
The work is grounded in my personal experience of being employed as a transcriber for AI projects in a business process outsourcing company between 2023 and 2025. Audio transcription is a widespread form of AI data work necessary for the training of AI algorithms that operate with natural language. As a transcriber, I listened every day, for hours on end, to audio recordings captured from user interaction with digital devices, with many users seemingly unaware of being recorded. The audio fragments were provided by The Client (a major tech company whose name cannot be revealed because of nondisclosure agreements).
The transcription work, in the form in which I experienced it, took place exclusively in “secure labs” with strict rules that prohibited transcribers from revealing any information about the work processes or the transcribed audio recordings.
This imperative of secrecy, rather than protecting the privacy of the users, obscures from public view the ways in which sensitive personal information is exploited by giant tech corporations. At the same time, it alienates AI data workers, preventing them from knowing and understanding the context of their work: the data pipelines that they are an integral part of, as well as the larger societal, ethical, or environmental consequences of their labour. Workers are treated as robots that are fed inputs (the audio recordings) and are supposed to produce the desired outputs (transcriptions of the recordings) at certain rates, with a measurable accuracy, while being cut off from any “unnecessary” information. (For a more detailed discussion of this problem, see my article “On Being a Robot: Aisthesis and Sense in Audio Transcription for AI Projects” in Journal of Science and Technology of the Arts, Vol. 17, No. 2, 2025, pp. 77-93. Open access at: https://revistas.ucp.pt/index.php/jsta/article/view/17898).
At the same time, the algorithmic combination of myriad voice recordings of everyday life caught unawares creates a mind-blowing piece of experimental electronic literature, with AI data workers as its only spectators. This “artwork”, inadvertently created by The Client, could easily rank among the most intriguing of the genre, despite the fact that it is not the conscious creation of an artist, it is not intended for public display, and it is highly problematic from an ethical perspective.
Data Leaks was made in the last days of my contract as a transcriber, when I already knew that my encounter with this impressive piece of electronic literature had reached its conclusion. Engaging with the imperative of secrecy, the audio track is based on the recording of a small, private performance, without spectators, in which I tried to remember as closely as I could, and to reproduce within a given time frame, as many fragments of the transcribed audio files as possible. I cut up the resulting recording, superimposed the parts, compressed the audio file, and converted it between different formats and different audio codecs until the words became entirely unintelligible. The video is based on images of my blood seen under a microscope, captured the very same day and distorted by glitches resulting from file compression. I reduced the entire audio-video work, in the spirit of small file media, to a file size of less than 1 MB per minute, interrogating the extent to which file compression and low resolution can act as politically charged vectors in contexts permeated by imposed secrecy: personal information is leaked and in plain view (images of my blood, my memories of recordings captured from users of digital devices by a big tech company), yet distorted to the point of noise; the leaked data is rich in affective charge, yet hidden from direct access and understanding.
(Data Leaks is part of a more extensive experimental essay reflecting on my experience as an audio transcriber for AI projects, which can be found at: https://mihaibacaran.com/writing/every_shift_is_a_work_of_art.html)