Well, you chose to put that thing in your house
GOOGLE ASSISTANT devices are recording users and sending the data for transcription, sometimes without a wake-word, according to a new report.
Virtual assistant devices are supposed to listen for the “OK Google” or “Hey Google” commands (or ‘Alexa’, ‘Hey Siri’, ‘Hey Cortana’ or ‘You keep out of this, Bixby’), wiping every fraction of a second if they haven’t heard it.
It was established recently that Amazon’s Alexa recordings are sent for transcription, in order to improve the language skills of the AI that powers it.
Now it would seem that Google is doing exactly the same, and sometimes when it hasn't been asked.
A Belgian whistleblower told VRT News about the issue, and from the content of two sets of recordings VRT was able to identify the speakers' home addresses; neither set contained an "OK Google" command at any point.
The whistleblower, who went to VRT after hearing the allegations surrounding Alexa, confirmed that he worked for a sub-contractor, paid by Google, to transcribe and annotate recordings, including presumptions of the speaker’s age and other demographics.
The report suggests that the whistleblower works at one of what is likely to be hundreds of locations worldwide, all doing the same work: eavesdropping on everything from our bank details to the noises we make in the bedroom. Just… yuck.
What’s particularly notable about this is the fact that nowhere in Google’s T&Cs does it mention anything about recordings being listened to by another human being. That’s the sort of thing you’d think it’d want to make clear.
Google says it only transcribes about 0.2 per cent of the total number of recordings it receives (because that’s alright then) and that it uses the data to improve voice recognition (as is Amazon’s defence too).
“We partner with language experts around the world to improve speech technology by transcribing a small set of queries – this work is critical to developing technology that powers products like the Google Assistant,” it said in a statement given to INQ.
“Language experts only review around 0.2 per cent of all audio snippets, and these snippets are not associated with user accounts as part of the review process.”
The whistleblower says that those doing the work were given no guidance as to what to do if they heard someone in danger or undertaking something illegal, adding that there were regular arguments between couples, make-up sex noises and, in some cases, concern over the welfare of the voice at the other end.
Rather than confront this, Google has suggested it's more interested in finding the whistleblower for breaching its security policies than in righting the wrong.
“We just learned that one of these reviewers has violated our data security policies by leaking confidential Dutch audio data,” it said. “Our Security and Privacy Response teams have been activated on this issue, are investigating, and we will take action. We are conducting a full review of our safeguards in this space to prevent misconduct like this from happening again.”
Although Google says that data is anonymised before it is sent, this first-hand account suggests that it’s not hard, in some cases, to work out exactly who is talking and what they’re up to.
We’d quite like an explanation of that too, please Google. μ