The findings are interesting, but the study design is lacking. Only a single device is used (to be fair, a commonly used one) and, as far as I can tell, a single person both recorded the keystrokes and was assessed. I don't think it did a good job of simulating what it would take to train a model on someone's typing from audio recorded over a medium such as Zoom, given realistic variables like audio quality, being on or off mute, connection quality issues, mic sensitivity, etc. That said, it does expose a theoretical attack vector, and I think that's important to identify and recognize.
I'd assume this only works with non-normalized stereo audio. Just switch to mono and normalize the audio, and then you can't really tell which key was pressed, or whether you're talking at the PC or from the living room.
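For anyone wondering what "mono and normalize" would actually do to the signal, here's a rough sketch in plain Python. The function names and the sample values are made up for illustration, and this is not what the paper or any conferencing app does. It only shows that averaging the channels drops left/right differences and that peak normalization drops overall loudness. It leaves the spectral character of each click alone.

```python
def downmix_to_mono(left, right):
    """Average two channels sample by sample, discarding left/right differences."""
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    return [(l + r) / 2.0 for l, r in zip(left, right)]


def peak_normalize(samples, target_peak=1.0):
    """Scale samples so the loudest one has magnitude target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


# Hypothetical samples: the same click recorded close to the mic and far from it,
# louder on the left channel than on the right in both cases.
near = peak_normalize(downmix_to_mono([0.8, -0.4], [0.2, -0.1]))
far = peak_normalize(downmix_to_mono([0.08, -0.04], [0.02, -0.01]))
```

After both steps `near` and `far` come out essentially identical (up to floating-point rounding), so the level and panning cues are gone. Whether that is enough to stop the attack is a separate question.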
There was previous (German?) research that was able to do this from just well-recorded sound.
HRTF etc. wasn't required.
https://www.newscientist.com/article/dn7996-keyboard-sounds-reveal-their-words/ (Paywalled, apologies. It's US research; I couldn't find the German one.)