Musical Lunch (2022)
Project Summary
Musical Lunch was an approximately 4-minute performance for the Small Data is Beautiful: Analytics, Art and Narrative virtual symposium hosted by the University of Melbourne on 19 February 2022.
An invitation to perform immediately before the lunch break prompted me to design a participatory work that framed the act of having lunch as an act of creating data: small in significance, yet emblematic of the everyday surveillance we experience over our bodies and actions. Through the act of assembling and consuming a meat baguette, the work referenced the social media mukbang phenomenon, while inviting the audience to participate in communal musicking-lunching using the browser-based interface.
Mukbang is a type of online audiovisual broadcast in which the host consumes food, often in large quantities, while interacting with the audience. It is often live-streamed through platforms such as TikTok, Twitch and YouTube. Emerging from the solo-eating culture of South Korea, mukbang provides a remote social experience around eating (Kim 2021).
Aside from the comic value of making music out of a banal everyday routine, in a kind of Fluxus ethos, I was interested in the idea of women’s bodies being used as a repository or mouthpiece for other people’s interests and prejudices. Around the time of the performance, a debate was raging in the media and the community over activist Grace Tame’s refusal to smile at Prime Minister Scott Morrison, because "survival of abuse culture is dependent on submissive smiles" (Chrysanthos 2022). His wife, Jenny Morrison, gave an interview criticising Tame’s lack of manners. A recording of this interview was sampled for the performance, so that I could use my own female body as a site of protest against the pressure to "be polite".
The artefacts of the performance were collated into Leftovers from Lunch, part of a collection of essays and thought pieces investigating the conceptual, artistic and computational qualities of small data, published by Grattan Street Press (Fensham 2023).
System Overview
Musical Lunch utilises Google MediaPipe’s holistic AI tracking ("MediaPipe: Live ML anywhere" 2020) to track the face and hands. A web page was chosen as the platform to host the AI and user interface so that the audience could participate without downloading any applications or other dependencies. The entire ML pipeline, user interface and audio synthesis were implemented in JavaScript, HTML and CSS and hosted on GitHub. Tone.js (Mann) was used as the library to implement web audio.
The open-source code repository can be accessed here and the webpage that the audience accessed is available here.
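As an illustration of how such a browser-based pipeline can be wired together, the sketch below shows a typical MediaPipe Holistic setup in JavaScript; the element names and options are illustrative assumptions rather than code taken from the repository.

```js
// Minimal sketch of a browser-side tracking loop (illustrative, not verbatim
// from the Musical Lunch repository).
import { Holistic } from "@mediapipe/holistic";
import { Camera } from "@mediapipe/camera_utils";

const videoElement = document.querySelector("video");

const holistic = new Holistic({
  // Load the model files from the MediaPipe CDN.
  locateFile: (file) => `https://cdn.jsdelivr.net/npm/@mediapipe/holistic/${file}`,
});
holistic.setOptions({ modelComplexity: 1, refineFaceLandmarks: true });

// Each webcam frame is sent to the model; the results contain face and hand
// landmarks that can then be mapped to sound and visuals.
holistic.onResults((results) => {
  const { faceLandmarks, leftHandLandmarks, rightHandLandmarks } = results;
  // ...map landmarks to Tone.js events and canvas drawing here...
});

const camera = new Camera(videoElement, {
  onFrame: async () => { await holistic.send({ image: videoElement }); },
  width: 640,
  height: 480,
});
camera.start();
```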
Web Audio is a relatively new browser API, with the first draft specification appearing in 2011 and progressively adopted by most browsers (Aldunate). The specification is still in flux, with compatibility issues across different browsers and devices. However, the ability to synthesise audio in the browser presents an exciting opportunity to create interactive sound works that are free and easily accessible.
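One practical consequence of this instability is browser autoplay policy: the audio context may only start after a user gesture. A minimal Tone.js sketch, assuming Tone.js is loaded on the page and a hypothetical start button:

```js
// Browsers will not start the Web Audio context without a user gesture, so the
// page waits for a click or tap before any synthesis begins.
document.querySelector("#start-button").addEventListener("click", async () => {
  await Tone.start(); // resumes the underlying AudioContext
  console.log("audio is ready");
});
```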
For my own performance I used the same web interface provided to the participants, sharing my screen and sound with the audience through the video-conferencing app Zoom. Each participant accessed the interface on their own computer, with all processing done locally in their browser.
Systems diagram for Musical Lunch
Sound and Interactive Design
A sampler-based digital instrument provided by Tone.js, which sounded like a piano, was used. The digital piano sound was chosen for its banal, commonplace nature, in keeping with the everyday aesthetics of making lunch. The instrument was controlled by both hands. While the hands were in the lower half of the screen, any movement triggered an arpeggio based on three alternating chords: Dm, B♭ and Dm6.
Alternate arpeggio sequences triggered by hands in lower half of screen in JavaScript code and equivalent symbolic notation
Akin to playing a piano, movement towards the left played lower notes and movement towards the right played higher notes. By using both hands, two layers of arpeggios could be played. Chord changes could be made by reversing the direction of the right palm. Related chords were chosen so that there would be minimal dissonance even when different participants had different chords active. A long baguette was chosen as the ideal lunch for my performance, since handling it involves large horizontal movements of both hands, which in turn play the arpeggios.
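As a rough, hypothetical sketch of one way to realise this mapping (the chord voicings, function names and scaling are my own illustrative choices, not the repository's code), the horizontal position of a detected hand can index into the notes of the currently active chord:

```js
// Hypothetical sketch of the lower-half hand mapping: hand x-position (0..1,
// left to right) selects a note from the active chord's arpeggio.
const chords = {
  Dm:  ["D3", "F3", "A3", "D4", "F4", "A4", "D5"],
  Bb:  ["Bb2", "D3", "F3", "Bb3", "D4", "F4", "Bb4"],
  Dm6: ["D3", "F3", "A3", "B3", "D4", "F4", "A4"],
};
let activeChord = "Dm";

const sampler = new Tone.Sampler({
  urls: { C4: "C4.mp3" }, // piano sample
  baseUrl: "https://tonejs.github.io/audio/salamander/",
}).toDestination();

// Called whenever a hand landmark is detected in the lower half of the screen.
function playArpeggioNote(handX, handSpeed) {
  const notes = chords[activeChord];
  const index = Math.min(Math.floor(handX * notes.length), notes.length - 1); // left = low, right = high
  const velocity = Math.min(1, handSpeed * 5); // faster movement = louder
  sampler.triggerAttackRelease(notes[index], "8n", undefined, velocity);
}
```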
Conversely, any movement of the hands in the upper half of the screen triggered dissonant cluster chords. As with the arpeggios, movement towards the left triggered lower notes and movement towards the right, higher notes. This was designed to provide a contrasting section that could be triggered by different performative acts, such as shaking drink bottles. All note velocities were mapped to the velocity of the hands, providing perception-based control.
Dissonant chords triggered by hands in upper half of screen in JavaScript code and equivalent symbolic notation
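A comparable hypothetical sketch for the upper half of the screen, in which a small cluster of adjacent semitones is transposed by hand position and scaled by hand velocity (again, the synth choice and ranges are illustrative assumptions):

```js
// Hypothetical sketch of the upper-half mapping: hand movement triggers a
// dissonant cluster of adjacent semitones, transposed by hand x-position.
const clusterSynth = new Tone.PolySynth(Tone.Synth).toDestination();

function playCluster(handX, handSpeed) {
  // Base pitch slides from roughly C3 (left) up to C5 (right).
  const baseMidi = 48 + Math.floor(handX * 24);
  const cluster = [0, 1, 2, 3].map((i) =>
    Tone.Frequency(baseMidi + i, "midi").toNote()
  );
  const velocity = Math.min(1, handSpeed * 5); // hand velocity -> note velocity
  clusterSynth.triggerAttackRelease(cluster, "16n", undefined, velocity);
}
```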
Screenshot from Musical Lunch performance with open mouth
A third layer of sound was mapped to the openness of the mouth. In this case, a sample of Jenny Morrison’s voice saying "have manners, be polite" was looped and played back whenever the mouth was open. The amplitude of the voice, as well as of an accompanying distorted static sound, increased with the openness of the mouth. In addition, a red visual outline was drawn around the lips, whose thickness increased as my mouth opened wide to bite into the meat baguette, becoming an obscene red orifice.
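The mouth mapping could be sketched along the following lines, assuming an openness value between 0 and 1 derived from the distance between upper- and lower-lip landmarks; the sample filename and the white-noise stand-in for the distorted static are hypothetical:

```js
// Hypothetical sketch of the mouth mapping: a looping voice sample and a noise
// layer whose loudness follows how far the mouth is open (0 = closed, 1 = wide).
const voice = new Tone.Player({ url: "be-polite.mp3", loop: true }).toDestination();
const noise = new Tone.Noise("white").toDestination();

function updateMouth(openness) {
  if (openness > 0.05) {
    if (voice.state !== "started") voice.start();
    if (noise.state !== "started") noise.start();
    // Map openness to gain in decibels: fully open ≈ 0 dB, nearly closed ≈ -40 dB.
    const db = -40 + openness * 40;
    voice.volume.value = db;
    noise.volume.value = db - 12; // keep the static layer quieter than the voice
  } else {
    voice.stop();
    noise.stop();
  }
}
```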
MediaPipe’s holistic tracking often failed to recognise the hands when the face region was not visible; this “glitchiness” was deemed acceptable in the context of a performance in which the processes of the technology were meant to be transparent and subject to scrutiny. The performance also played with “tricking” the AI computer vision, for example by placing a napkin in front of my mouth.
MediaPipe computer vision tries to infer where my mouth is
Evaluation
Audience feedback for the performance was positive and led to the development of further related work. It was difficult to evaluate the level of audience participation in the work, as I did not have control over the Zoom conference and was therefore unable to allow multiple participants to screen-share. The chat during the Zoom conference indicated that some of the audience did access the webpage to participate; however, they either did not share their audio or their audio was subject to automatic noise cancellation. Future versions could implement a more comprehensive system design that allows participants to share their audio outcomes, which would contribute greatly to the sense of collective musicking and connection.
Significance to Research
Despite the very short, informal nature of this work, it solidified my interest in working with collective participatory networks not only for the aesthetics of sound but as a space for shared reflection on cultural and socio-political issues. The experience, while attempting to create connection, for me amplified the sense of ‘alone togetherness’ (Jagoda 2016) characteristic of much networked sociality. This contradiction inherent in technologically mediated social connection, together with the aesthetics of the ordinary, was further explored in Scrape Elegy.
Credits
Creative concept, coding, sound design and performance: Monica Lim
Video documentation: Patrick Hartono