
Data Leaks (2025)
Data Leaks is a low-fidelity audio-visual work, in the spirit of small file media. (I am borrowing the idea of small file media from the Small File Media Festival.)
Data Leaks is grounded in my personal experience of working as a transcriber for AI projects in a business process outsourcing company between 2023 and 2025. Audio transcription is a widespread form of AI data work necessary for the training of AI algorithms that operate with natural language. As a transcriber, I listened every day, for hours on end, to audio recordings captured from users of digital devices, some of whom seemed unaware of being recorded. The audio fragments were provided by The Client (a major tech company whose name cannot be revealed because of nondisclosure agreements). The transcription work, in the form in which I experienced it, takes place exclusively in 'secure labs' with strict rules that forbid transcribers from revealing any information about the work processes or the audio recordings transcribed. This imperative of secrecy, rather than protecting the privacy of the users, obscures from public view the ways in which private and sensitive user information is exploited by giant tech companies.
This audio-video work was made in the last days of my contract as a transcriber, when I already knew that my encounter with this impressive piece of electronic literature (the intertwined fragments of daily lives caught unawares) had reached its conclusion. Engaging with the imperative of secrecy, the audio track is based on the recording of a small, private performance, without spectators, in which I tried to remember as closely as I could, and to reproduce within a given time frame, as many fragments of the transcribed audio files as possible. I cut up the resulting recording, superimposed the parts, compressed the audio file, and converted it between different formats and audio codecs until the words became entirely unintelligible. The video is based on images of my blood seen under a microscope, captured the very same day and distorted by glitches resulting from file compression. I have reduced the entire audio-video work, in the spirit of small file media, to a size of less than 1 MB per minute, interrogating the extent to which file compression and low resolution can act as politically charged vectors in contexts permeated by imposed secrecy. Personal information is leaked and in plain view (images of my blood, my memories of recordings captured from users of digital devices by a big tech company) yet distorted to the point of noise: leaked data rich in affective charge, yet hidden from direct access and understanding.
Data Leaks is part of a more extensive experimental essay reflecting on my experience as an audio transcriber for AI projects, which can be found here.