Apple working on how to read back iMessages in the sender's voice

A new Apple patent application describes converting an iMessage into a voice note played back in a voice that the user has created with samples of the sender's voice.

Apple users can already send audio recordings in iMessage or have Siri read text messages back to them, but the patent describes a way to have the device read the text message in the sender's voice instead of Siri, using a voice file.

"The voice model is provided to a second electronic device," according to the patent. "In some examples, a message is received from a respective user of a second electronic device."

Meaning, when someone sends an iMessage, they can choose to attach a voice file, which would be stored on the device. If this happens, the receiver will be prompted to decide if they want to receive both the message and the voice recording.

"In response to receiving the message, a voice model of the respective user is received," the patent reads. "Based on the voice model, an audio output corresponding to the received message is provided."

According to the patent, the iPhone in question would then build a Siri-like profile for the sender's voice, and then simulated it when reading that message and any future messages they receive from that sender. The voice simulation model could also be sent on its own so a person's contacts can download it before messages.

Watch the Latest from AppleInsider TV

It would offer more personalization when friends and family text each other, instead of hearing Siri's voice when it reads messages. Couples could also hear messages in a more personal way, such as hearing "I love you" in their partner's voice.

The patent's inventors are Qiong Hi, Jiangchuan Li, and David A. Winarsky. Winarsky is Apple's director of text-to-speech technology, while Li is a senior Siri software engineer for machine learning at Apple, and Hu formerly worked on Siri at the company.

As usual, with patents, it won't necessarily become a reality, but it's possible given Apple's recent work with artificial intelligence and voices. For example, with iOS 11, Apple switched Siri's voice from relying on recordings from voice actors to a text-to-speech model using machine learning.

In 2020, Apple acquired a company called Voysis that worked on improving natural language processing in virtual assistants. They used WaveNet technology, which was introduced by Google's DeepMind program in 2016.

WaveNets are "deep generative models of raw audio waveforms" that can be used to generate speech that mimics any human voice.

Apple has also started using artificial intelligence to narrate specific audiobook genres instead of using humans. So the patent is entirely in the realm of possibility of having an Apple device eventually learn to read messages in a person's voice.

18 Comments

DAalseth 7 Years · 3220 comments

About 2 years ago

A bit torn. Sounds like it might be cute. But also sounds like it could be creepy beyond all reason. You know,, depending on the circumstances.

2 Likes · 0 Dislikes

maltz 14 Years · 530 comments

About 2 years ago

Patenting does not necessarily mean "working on". I can't imagine anyone would want or care about this.

1 Like · 0 Dislikes

bloggerblog 17 Years · 2550 comments

About 2 years ago

This is so possible today. A friend of mine asked ChatGPT a question, he took the answer and uploaded a few samples of his voice to a different AI and then had it read the answer from ChatGPT with his own voice, and let me tell you... it was dead on, with all the quirks and nuances in his voice. Crazy stuff... So this patent is 100% plausible

1 Like · 0 Dislikes

foregoneconclusion 13 Years · 2939 comments

About 2 years ago

DAalseth said:

A bit torn. Sounds like it might be cute. But also sounds like it could be creepy beyond all reason. You know,, depending on the circumstances.

I vote for "creepy".

3 Likes · 0 Dislikes

mike1 11 Years · 3465 comments

About 2 years ago

Not sure. Might be OK for close contacts like my wife or kids. In reality, those would likely be the only people I would have a voice sample from anyway.