this post was submitted on 27 Dec 2023
322 points (99.1% liked)
Technology
Sorry if this is a noob question, but...how?
DNS will tell you the server name and address, which would just be some server owned by the company. Nothing weird there unless they have the chutzpah to name it something telling. They could even bypass DNS entirely with hardcoded IP addresses.
Timing wouldn't be a great indicator either if they aggregate requests.
They could slide anything nefarious in with daily software update checks or whatever other phone-homing they normally do, and without deep packet inspection or reverse engineering the software, it would be very difficult to tell.
I don't think Wireshark can do deep packet inspection, can it? Assuming the client is using SSL and verifying certs, maybe even using cert pinning?
Size would be a big indicator if they're sending full voice recordings, but not if they're doing voice recognition locally and only sending transcripts, metadata, or keywords.
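To put rough numbers on the size argument, here is a back-of-envelope sketch. Every figure in it (16 kHz 16-bit mono capture, a 16 kbit/s speech codec, 150 words per minute at about 6 bytes per word) is an assumption picked for illustration, not a measurement of any real TV.

```python
# Back-of-envelope: bytes per hour of continuous speech for three
# hypothetical upload strategies. All default values are assumptions.

SECONDS_PER_HOUR = 3600


def raw_pcm_bytes_per_hour(sample_rate_hz=16_000, bytes_per_sample=2, channels=1):
    """Uncompressed audio: samples/second * bytes/sample * channels * seconds."""
    return sample_rate_hz * bytes_per_sample * channels * SECONDS_PER_HOUR


def compressed_bytes_per_hour(bitrate_bps=16_000):
    """Audio through a low-bitrate speech codec (bitrate is an assumed value)."""
    return bitrate_bps // 8 * SECONDS_PER_HOUR


def transcript_bytes_per_hour(words_per_minute=150, bytes_per_word=6):
    """Plain-text transcript produced by on-device speech recognition."""
    return words_per_minute * bytes_per_word * 60


if __name__ == "__main__":
    rows = [
        ("raw PCM", raw_pcm_bytes_per_hour()),
        ("compressed audio", compressed_bytes_per_hour()),
        ("text transcript", transcript_bytes_per_hour()),
    ]
    for label, n_bytes in rows:
        print(f"{label:>18}: {n_bytes / 1e6:9.3f} MB/hour")
```

Under those assumptions the transcript is three orders of magnitude smaller than raw audio, which is why it could hide inside ordinary telemetry.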
I've never actually done this kind of work in earnest, and my experience with Wireshark is at least a decade out of date. I'm just approaching this from the perspective of "if I were a corporate shitbag, how would I implement my shitbaggery?"
The answer is: it wouldn’t. You’re right on the money, you couldn’t do anything other than speculate.
Just spitballing here, but you might be able to correlate the amount of data sent with how much real-life activity there was. Say, keep things silent around the TV for a week, then play recorded speech near it for a week, and see if that changes the frequency or size of the data being sent back home. Then do this for random 1/2/3 day periods. If offline speech-to-text is as crap as I've heard, then the increased data transfer should stick out pretty clearly.
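If someone did collect per-day byte counts for quiet and speech periods, one way to check whether the difference is more than noise would be a simple permutation test. This is only a sketch: the function name and the idea of using daily totals are my assumptions, and a small p-value would still only show correlation, not what is inside the packets.

```python
import random
from statistics import mean


def permutation_p_value(quiet, speech, n_iter=10_000, seed=0):
    """One-sided permutation test on daily upload totals.

    Returns (observed difference in means, p-value), where the p-value
    estimates how often a random relabeling of days gives a difference
    at least as large as the one actually observed.
    """
    rng = random.Random(seed)
    observed = mean(speech) - mean(quiet)
    pooled = list(quiet) + list(speech)
    k = len(speech)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        # First k days play the role of "speech", the rest "quiet".
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_iter + 1)


if __name__ == "__main__":
    # Made-up numbers (KB uploaded per day), purely for illustration.
    quiet_days = [100, 110, 95, 105, 102, 98, 101]
    speech_days = [300, 310, 295, 305, 302, 298, 301]
    print(permutation_p_value(quiet_days, speech_days))
```

Randomizing which days are quiet and which are noisy, as suggested above, is what makes the relabeling in the test meaningful.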
That’s a completely unhinged level of effort for what would still ultimately boil down to speculation lmao. Smart TVs phone home frequently, semi-randomly, with varying amounts of data, whether they're used regularly or off for months at a time, whether you’re walking and talking around them or away on vacation for two weeks. If despite all that you tried to control the environment around it, you’d somehow need to… ensure absolute silence in the room that it’s in for DAYS at a time? Unless you live in the middle of the woods that’s not very likely, and even then, all it would be is guessing lmao
Oh entirely, but it's the best I could come up with without disassembly. (And I'm fairly sure I've done worse debugging a prod environment)
First, someone would be able to prove that communication is happening. Second, if the keys are stored locally, and the original packets saved, the encryption can be reverse engineered.
Encryption prevents man in the middle attacks. If you have one of the ends, you can usually get the data. If you have the device that's doing the encryption of the data, and you have the encrypted data, you can decode the data. It's just a matter of getting through obfuscation at that point.
The reason this hasn't been done yet is that it's not happening yet. CMG was lying in their advertising.
Try it out. Set up dnsmasq and connect your phone to the network. You'll see a ton of requests initially, which gives you some idea of what apps/services/accounts are on the phone. Let the phone go to sleep, and watch what is sending requests in the background. Many services use very specific host names which indicate what is being processed.
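For anyone who wants to tally what shows up, here is a rough sketch that counts looked-up hostnames from dnsmasq's query log. It assumes dnsmasq is running with `log-queries` enabled and that lines look like `... dnsmasq[123]: query[A] example.com from 192.168.1.23`; the exact log format can differ between versions and setups, and the sample lines and hostnames below are made up.

```python
import re
from collections import Counter

# Matches the "query[TYPE] hostname from client-ip" part of a log line.
QUERY_RE = re.compile(
    r"query\[(?P<qtype>[\w-]+)\] (?P<host>\S+) from (?P<client>\S+)"
)


def count_queries(lines, client=None):
    """Count DNS lookups per hostname, optionally for one client IP only."""
    counts = Counter()
    for line in lines:
        m = QUERY_RE.search(line)
        if m is None:
            continue  # replies, cache hits, DHCP lines, etc.
        if client is not None and m["client"] != client:
            continue
        counts[m["host"]] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Dec 27 10:15:01 dnsmasq[123]: query[A] telemetry.example.com from 192.168.1.23",
        "Dec 27 10:15:01 dnsmasq[123]: forwarded telemetry.example.com to 9.9.9.9",
        "Dec 27 10:15:09 dnsmasq[123]: query[AAAA] updates.example.net from 192.168.1.23",
        "Dec 27 10:16:30 dnsmasq[123]: query[A] telemetry.example.com from 192.168.1.23",
        "Dec 27 10:17:02 dnsmasq[123]: query[A] mail.example.org from 192.168.1.40",
    ]
    for host, n in count_queries(sample, client="192.168.1.23").most_common():
        print(f"{n:4d}  {host}")
```

This only sees lookups that go through your resolver. A device using hardcoded IPs or its own encrypted DNS would not show up here.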
On the TV, it would be similar. You walk into the room and it starts sending packets? You say something unrelated to its trigger word yet Wireshark shows activity? Suspicious. If you can get a certificate onto the TV you can use mitmproxy to view the HTTPS traffic, but that's probably kinda difficult.
I do not use smart TVs but I have been doing stuff like the above for a while. If they are recording and storing stuff, some engineer eventually figures it out; it's not an NSA backdoor.
I'm not saying they are/aren't, I do not know, it just seems very unlikely, especially given smartphone ubiquity. What is known to be actually occurring is a complete violation of consumer privacy for marketing purposes, but OP's form of spying is so far unsubstantiated.
Now, can that TV be hacked and used by your neighbor to spy on you? Or can your government access your mic/camera? That's an entirely different question and field of expertise.
In this case, it would be pretty hard. We have wiretap laws, which would mean you have to tell the user you're doing this. Even though no one reads the ToS, someone does, and it would be news if someone was doing this.
Even then, it would be a hard enough problem that companies would think twice about it for a few reasons. Number one, processing all the audio in your home 24/7 is going to be rather difficult/expensive, so you'd have to go with something like keyword-triggered processing, the way that your phone listens for "hey google/siri" or Amazon listens for "Alexa." It works kinda like game clip sharing: they are always listening and recording for a short time frame* but they only send the data somewhere if they hear the trigger phrase. That's not easy in itself. They've spent a ton of time tuning the algorithm so that it correctly hears the trigger phrase without a ton of false positives, with varying degrees of success. And keep in mind these are the companies best suited to it, and they still struggle sometimes with even that. The ad companies would have to listen for dozens/hundreds/thousands of triggers...
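The "always recording a short window, only sending on a trigger" idea can be sketched as a ring buffer. This is a toy model under loud assumptions: the `TriggerBuffer` class is hypothetical, chunks are plain strings standing in for audio frames, and the substring match stands in for a real wake-word model, which is the genuinely hard part described above.

```python
from collections import deque


class TriggerBuffer:
    """Keep only the last few chunks; hand them off when a trigger is heard."""

    def __init__(self, max_chunks, triggers):
        # deque with maxlen silently drops the oldest chunk when full.
        self.buffer = deque(maxlen=max_chunks)
        self.triggers = {t.lower() for t in triggers}
        self.sent = []  # stand-in for "uploaded to a server"

    def _heard_trigger(self, chunk):
        # Placeholder detector: a real device would run an acoustic model.
        return any(t in chunk.lower() for t in self.triggers)

    def feed(self, chunk):
        """Add one chunk; return True if it caused the buffer to be sent."""
        self.buffer.append(chunk)
        if self._heard_trigger(chunk):
            self.sent.append(list(self.buffer))
            self.buffer.clear()
            return True
        return False


if __name__ == "__main__":
    buf = TriggerBuffer(max_chunks=3, triggers=["alexa"])
    for chunk in ["a", "b", "c", "d", "hey Alexa play music"]:
        buf.feed(chunk)
    print(buf.sent)
```

Everything that scrolls out of the buffer without a trigger is simply gone, which is also why scaling from one trigger phrase to thousands of ad keywords changes the problem so much.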
And then you get to the data retention policies. Google is an ad company, Apple is not. One of the reasons that Apple can tout privacy as a feature is simply that they don't need the data, so they don't collect nearly as much, and they save even less. They get the bonus of not dealing with law enforcement and all that.
So, assuming they solve that, solve some big issues with the laws of the land and physics, now we're to the point where they have to think about network traffic. Which is going to be trivially easy for nerds to figure out and circumvent, so they would have to have their own ad-hoc network which comes with another 137 or so difficulties.