submitted 1 week ago by yogthos@lemmy.ml to c/opensource@lemmy.ml
[-] django@discuss.tchncs.de 3 points 1 week ago

There is nothing special going on. This whole project is just a bunch of Python libraries glued together into a CLI tool. It uses the SpeechRecognition package to connect to the Google speech recognition API: https://github.com/microsoft/markitdown/blob/main/src/markitdown/_markitdown.py#L691

Pretty uninteresting and a bit disappointing. Pandoc is a lot more interesting.

[-] utopiah@lemmy.ml 1 points 1 week ago

Thanks for the clarification. I checked the code you linked and noticed recognize_google, which seems to come from https://github.com/Uberi/speech_recognition and in turn seems to rely on https://github.com/Uberi/speech_recognition/blob/master/speech_recognition/recognizers/google.py. So basically they are using an API and sending all the audio data to Google's servers?

[-] django@discuss.tchncs.de 1 points 1 week ago

Yes, this is how I read it as well. The library supports using a local model, but they decided to just send the audio data to Google.
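
To make the local-versus-remote distinction concrete, here is a rough sketch of how a caller could default to an offline engine and require an explicit opt-in before any audio leaves the machine. This is not markitdown's code: `transcribe`, `transcribe_file`, and the `allow_remote` flag are hypothetical names made up for illustration. The recognizer methods are the ones SpeechRecognition documents, as far as I know; `recognize_sphinx` needs the pocketsphinx package, and `recognize_whisper` needs openai-whisper and only exists in newer releases.

```python
from typing import Any

# Engines that run on the local machine (assumption: the matching optional
# dependency, pocketsphinx or openai-whisper, is installed).
LOCAL_ENGINES = {"sphinx": "recognize_sphinx", "whisper": "recognize_whisper"}
# Engines that upload the audio to a third-party service.
REMOTE_ENGINES = {"google": "recognize_google"}


def transcribe(recognizer: Any, audio: Any, engine: str = "sphinx",
               allow_remote: bool = False) -> str:
    """Hypothetical helper: pick a recognizer method, refusing remote
    engines unless the caller explicitly opts in."""
    if engine in LOCAL_ENGINES:
        method_name = LOCAL_ENGINES[engine]
    elif engine in REMOTE_ENGINES:
        if not allow_remote:
            raise PermissionError(
                f"engine {engine!r} sends audio to a remote service; "
                "pass allow_remote=True to permit that"
            )
        method_name = REMOTE_ENGINES[engine]
    else:
        raise ValueError(f"unknown engine: {engine!r}")
    # Look the method up by name so the helper works with any object that
    # exposes the same method names as speech_recognition.Recognizer.
    return getattr(recognizer, method_name)(audio)


def transcribe_file(path: str, engine: str = "sphinx",
                    allow_remote: bool = False) -> str:
    """Hypothetical wrapper that reads a WAV/AIFF/FLAC file and transcribes it."""
    # Third-party import kept local to this function:
    # pip install SpeechRecognition
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    return transcribe(recognizer, audio, engine, allow_remote)
```

With something like this, `transcribe_file("talk.wav")` would stay offline, and sending data to Google would take a deliberate `transcribe_file("talk.wav", engine="google", allow_remote=True)`.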

[-] utopiah@lemmy.ml 3 points 1 week ago

That might raise a GDPR-related issue. I don't think people using such a library assume that they need connectivity, or that their data would be sent to a third party.

this post was submitted on 16 Dec 2024
126 points (96.3% liked)
