Are online transcription services secure and confidential?

Hello, Habr! Here is a translation of the article “Are Online Transcription Services Safe and Private?” by Matthew Hughes.

Transcription was once a manual, tedious process. Doctors, journalists, and many other professionals recorded their notes and conversations on a voice recorder, then sat down at a computer to type them up.

In 2020, a number of services can turn your audio recordings into text. However, one question remains: are they safe? After all, you may be uploading recordings of sensitive conversations and private voicemail.

Let's look at these services and at how you can protect your information.

Illustration by Yangard. This image is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

How audio transcription services work


Audio transcription services generally fall into three categories. The first is fully automated and uses existing AI and machine learning models to process the conversation. The second, and the most expensive, has people do the entire job. The third combines computer processing with human work.

You are probably most familiar with the first category. Voice transcription services, such as those from Google, Apple, and Otter.ai, convert the analog waves created by your voice into a digital representation. That representation is then divided into small segments (sometimes a thousandth of a second long) and compared with known "phonemes", the basic sound elements of a language.

The algorithms then consider each segment in the context of the surrounding phonemes and pass them through statistical and AI models, which ultimately produce text. Because these transcription services are fully automated, they are usually the cheapest. However, accuracy is not always up to par, especially when extracting text from a noisy recording or one with several speakers.
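The digitizing and segmenting steps described above can be sketched in a few lines of Python. This is only an illustration, not how any particular service works: the 16 kHz sample rate, the 25 ms frame length, the 10 ms step, and the function names `digitize` and `split_into_frames` are all assumptions chosen for the example, and a pure tone stands in for a real voice. The phoneme matching and the statistical models are left out entirely.

```python
import math

SAMPLE_RATE = 16_000  # samples per second (an assumption, not from the article)


def digitize(duration_s, freq_hz=220.0, sample_rate=SAMPLE_RATE):
    """Sample a pure tone and quantize it to 16-bit integers.

    The sine wave is a stand-in for the analog signal from a microphone.
    """
    n = int(duration_s * sample_rate)
    return [
        round(32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n)
    ]


def split_into_frames(samples, frame_ms=25, hop_ms=10, sample_rate=SAMPLE_RATE):
    """Cut the samples into short, overlapping segments (frames).

    frame_ms and hop_ms are illustrative defaults, not values from the article.
    """
    frame_len = sample_rate * frame_ms // 1000
    hop_len = sample_rate * hop_ms // 1000
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop_len)
    ]


samples = digitize(1.0)              # one second of "audio"
frames = split_into_frames(samples)  # short segments for later analysis
print(len(samples), len(frames), len(frames[0]))
```

With these assumed settings, one second of audio becomes 16,000 samples and 98 overlapping frames of 400 samples each. A real recognizer would go on to extract features from every frame and feed them to its models.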

Human transcription includes dedicated platforms, such as Rev, that connect clients to a pool of pre-approved transcribers. You can also hire a freelancer through a marketplace such as Upwork or Fiverr.

Finally, there is a mixture of the two. To speed up the transcription process, some sites let the AI do the preliminary work, and then a person reviews the output and corrects any errors.

Transcription services behaving badly


In recent years, many transcription services have been at the center of breaches and scandals.

Perhaps the oldest (and perhaps the most egregious) case was SpinVox, which in the 2000s offered a service that turned voicemail into SMS messages. At the time it was considered a technological breakthrough. The company quickly attracted positive press coverage, customers, and extensive funding.

What was the problem? Without the customers' knowledge, their voice messages were processed by people working in offices in Pakistan, Mauritius, and South Africa. One company insider claimed that only 2% of voicemail was handled by machines, while the rest was handled by approximately 10,000 hired workers.

When staff at SpinVox's Pakistani office were not paid, they began sending messages directly to customers in protest. In the end, the truth came out. SpinVox lost most of its value, and what remained of the company was sold to Nuance, one of the largest voice recognition providers in the world.

More recently, cybersecurity journalist Brian Krebs discovered a serious breach at MEDantex, a Kansas-based provider of voice transcription services for healthcare providers. The leaked data, some of it dating back to 2007, contained confidential medical records. Their contents could be downloaded from an unsecured portal as Microsoft Word files.

Even fully automated transcription services are not necessarily safe. You may sign up for a computerized service, yet the company can still use human contractors for quality control.

In 2019, the Belgian news site VRT NWS discovered that Google contractors were listening to conversations between people and their Google Home smart assistants. One of the contractors even gave VRT NWS access to recordings, many of which were deeply sensitive and, in some cases, sexually intimate.

Amazon, Apple, and Microsoft have also used contractors. In other words, someone may have listened to the voice recordings made by your virtual assistant.

The real question: are online transcription services safe?


The answer to this question is a bit complicated.

By now, the transcription market has largely matured, and the most blatantly bad players have been weeded out.

However, when you entrust your data (in this case, private conversations) to a third party, you expect it to be adequately protected, whether the service is automated or relies on human transcribers.

In either case, ask yourself two questions: do you trust this service, and how sensitive are your conversations?

If you are considering a transcription service, you should always do some research. Does the company have a good reputation? Is it well established? Has it had any breaches in the past? Is there a privacy policy that clearly spells out how your data will be processed and protected?

As mentioned earlier, AI-based services often rely on employees and third-party contractors for quality checks. Although these checks cover only a fraction of all orders, there is always a chance that someone will listen to your recordings.

In many cases, this is not a deal-breaker. However, if your conversation is deeply private or commercially sensitive, consider opening a text editor and transcribing it yourself.
