A hand gesture recognition tool to modify and manipulate sound
for
Sound, Space and Interaction
✌︎︎
This project for Sound, Space and Interaction creates an audio environment where we can control four different nature sounds through hand gestures inspired by the series Jujutsu Kaisen. The system uses Python's MediaPipe library for real-time gesture recognition and PlugData for audio synthesis, creating a purely auditory interaction that requires no visual interface. A camera recognizes specific hand movements and turns them into elemental sounds: water, thunder, wind, and fire. Each gesture has a name: "Infinity" for water, "Malevolent" for thunder, "Maho" for wind, and "Fuga" for fire. The entire experience works through sound alone, with no need to look at a screen.
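As an illustration of what the recognition loop might look like, here is a minimal MediaPipe sketch. The gesture names come from the project, but the finger-counting rule that maps a hand pose to a name is purely hypothetical; the actual classification logic is not described in this write-up.

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def classify_gesture(hand_landmarks):
    """Map a detected hand pose to one of the four named gestures.
    Hypothetical rule: count extended fingers. The real project's rules differ.
    """
    tips = [8, 12, 16, 20]   # index, middle, ring, pinky fingertips
    pips = [6, 10, 14, 18]   # the joint below each fingertip
    extended = sum(
        hand_landmarks.landmark[t].y < hand_landmarks.landmark[p].y
        for t, p in zip(tips, pips)
    )
    return {4: "Infinity", 3: "Malevolent", 2: "Maho", 1: "Fuga"}.get(extended)

cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            gesture = classify_gesture(results.multi_hand_landmarks[0])
            if gesture:
                print(gesture)  # forward this to the audio patch
cap.release()
```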
The interaction works by making sounds louder or quieter based on how long you hold each gesture. The longer you maintain a gesture, the more prominent that elemental sound becomes. There's also a suspenseful background sound that builds up over time, warning you to switch to a different gesture before everything stops and resets. When you first use the system, only the sounds you create with gestures can be heard, but once you've introduced a sound, it continues playing softly in the background until you activate it again.
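A rough sketch of that interaction logic, assuming a classifier like the one above reports the detected gesture once per camera frame. The constants HOLD_RAMP_SECONDS, RESET_AFTER_SECONDS, and BACKGROUND_LEVEL are illustrative values, not figures from the project.

```python
import time

HOLD_RAMP_SECONDS = 5.0     # assumed: seconds for a held gesture to reach full volume
RESET_AFTER_SECONDS = 20.0  # assumed: how long one gesture can be held before everything resets
BACKGROUND_LEVEL = 0.1      # assumed: soft level for sounds that were already introduced

class GestureAudioState:
    """Tracks per-sound volumes, the memory of introduced sounds,
    and the suspense build-up that pushes you to switch gestures."""

    def __init__(self, sounds=("water", "thunder", "wind", "fire")):
        self.levels = {s: 0.0 for s in sounds}
        self.introduced = set()
        self.current = None
        self.hold_start = 0.0

    def update(self, sound):
        """Call once per frame with the sound of the detected gesture (or None)."""
        now = time.time()
        if sound != self.current:
            # Switching away: an introduced sound falls back to a soft background level.
            if self.current in self.introduced:
                self.levels[self.current] = BACKGROUND_LEVEL
            self.current, self.hold_start = sound, now
        suspense = 0.0
        if sound is not None:
            held = now - self.hold_start
            # The longer the gesture is held, the more prominent its sound becomes.
            self.levels[sound] = min(1.0, held / HOLD_RAMP_SECONDS)
            self.introduced.add(sound)
            # Suspense grows while you stay on one gesture; at the limit, reset everything.
            suspense = held / RESET_AFTER_SECONDS
            if suspense >= 1.0:
                self.levels = {s: 0.0 for s in self.levels}
                self.introduced.clear()
                self.current, suspense = None, 0.0
        return self.levels, suspense
```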
The technical side combines gesture recognition with adaptive audio processing. The system remembers how long and how often you use each gesture, then uses that information to control volume levels, with some randomness to keep things natural. For example, if you've used the wind gesture a lot, it might play at around 50% volume with random variations. The nature sounds were created with synthesis techniques in PlugData, and musical chords add extra depth that changes with your gestures.
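The write-up does not give the exact mapping from usage statistics to volume, so the weighting below is an assumption; it only illustrates the idea of combining hold time and activation count with a little randomness.

```python
import random

def usage_based_volume(hold_seconds, activation_count, max_hold=30.0, max_count=20):
    """Illustrative mapping from usage statistics to a playback volume.
    The weights and caps are assumptions, not the project's actual values."""
    duration_term = min(hold_seconds / max_hold, 1.0)     # how long the gesture has been used
    frequency_term = min(activation_count / max_count, 1.0)  # how often it has been activated
    base = 0.5 * duration_term + 0.5 * frequency_term
    jitter = random.uniform(-0.1, 0.1)   # small random variation to keep it natural
    return max(0.0, min(1.0, base + jitter))

# e.g. a heavily used wind gesture might land around 0.5, plus or minus the jitter
print(usage_based_volume(hold_seconds=15.0, activation_count=10))
```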
This project shows how physical movements can control abstract sounds in ways that feel natural and intuitive. Since there is no visual feedback, all communication happens through audio cues that guide users and create engaging interactions. The setup is simple, just a laptop, a camera, and speakers, yet it creates rich, immersive soundscapes that respond naturally to human gestures.
✌︎︎
Sketches of the hand gestures made by the amazing Dewi
✌︎︎
< Here you can find the proposal and final report on our project >
✌︎︎