My TechDecisions

Google AI’s Translatotron Can Translate A Speaker’s Voice — With the Same Characteristics

Google AI Translatotron, a new direct speech-to-speech translation tool, is still in development but could soon change the way we communicate across languages.

May 29, 2019 Adam Forziati

Google AI recently announced Translatotron, an experimental new direct speech-to-speech translation tool that Google says is capable of “faster inference speed, naturally avoiding compounding errors between recognition and translation… [and retaining] the voice of the original speaker after translation…”

Google AI Translatotron “is based on a sequence-to-sequence network which takes source spectrograms as input and generates spectrograms of the translated content in the target language,” the development team says.
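To make "spectrograms as input" concrete, here is a minimal sketch of how a waveform can be turned into a magnitude spectrogram. It is only an illustration: the function name `magnitude_spectrogram`, the frame length, the hop size, and the 440 Hz test tone are all assumptions, not details from Google's announcement, and Translatotron's actual preprocessing may differ.

```python
import numpy as np

def magnitude_spectrogram(signal, frame_len=400, hop=160):
    """Return a (num_frames, frame_len // 2 + 1) array of STFT magnitudes.

    frame_len and hop are illustrative values (25 ms and 10 ms at 16 kHz),
    not settings taken from the Translatotron announcement.
    """
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(num_frames)
    ])
    # One real FFT per frame gives the frequency content over time.
    return np.abs(np.fft.rfft(frames, axis=1))

# Hypothetical input: one second of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

spec = magnitude_spectrogram(tone)
print(spec.shape)  # (time frames, frequency bins)
```

Each row is one short slice of time and each column is a frequency band, so a model that reads and writes arrays like this can work on speech without an intermediate text representation.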

How Google AI Translatotron Works: Simplified

[Figure: a visual recreation, taken from Google AI’s announcement, of how the technology works.]

Preserving the Sound of the Original Speaker

“By incorporating a speaker encoder network, Translatotron is also able to retain the original speaker’s vocal characteristics in the translated speech, which makes the translated speech sound more natural and less jarring… The speaker encoder is pretrained on the speaker verification task, learning to encode speaker characteristics from a short example utterance. Conditioning the spectrogram decoder on this encoding makes it possible to synthesize speech with similar speaker characteristics, even though the content is in a different language.”
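One common way to condition a decoder on a speaker encoding is to append the same fixed-size speaker vector to every time step of the decoder's input. The sketch below shows only that step, with random numbers standing in for real model outputs. The function `condition_on_speaker` and the array sizes are hypothetical, and this is not presented as Google's implementation.

```python
import numpy as np

def condition_on_speaker(frames, speaker_embedding):
    """Append one speaker embedding to every time step of `frames`.

    frames:            (time, features) array of encoded source speech
    speaker_embedding: (embedding_dim,) vector from a speaker encoder
    """
    # Repeat the single speaker vector once per time step.
    tiled = np.broadcast_to(
        speaker_embedding, (frames.shape[0], speaker_embedding.shape[0])
    )
    return np.concatenate([frames, tiled], axis=1)

# Random stand-ins for real model outputs (sizes are illustrative only).
rng = np.random.default_rng(0)
encoded_frames = rng.normal(size=(50, 8))   # 50 time steps, 8 features
speaker_embedding = rng.normal(size=4)      # one vector per speaker

conditioned = condition_on_speaker(encoded_frames, speaker_embedding)
print(conditioned.shape)  # (50, 12)
```

Because the speaker vector is identical at every step, the decoder sees the same voice description throughout the utterance while the content features change over time.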

What does this mean in practice? Let’s listen to find out.

The audio clips below, taken from the Google AI announcement, show the Google AI Translatotron transferring the original Spanish speaker’s voice into a translation in English.

Spanish Source: 

https://mytechdecisions.com/wp-content/uploads/2019/05/10148907792880119076.wav

Reference Translation in English:

https://mytechdecisions.com/wp-content/uploads/2019/05/10148907792880119076-1.wav

Google AI Translatotron Translation in Original Speaker’s Voice:

https://mytechdecisions.com/wp-content/uploads/2019/05/10148907792880119076-2.wav

What this Means for Collaboration

Google AI claims Translatotron is possibly the first end-to-end model that can directly translate speech in one language into similar-sounding speech in a different language.

If the technology is developed further, it could break down the language barrier in a faster, more seamless way for teams working across cultural or international borders. It could also speed up communication with clients and reduce the cost of translation services.

Tagged With: Artificial Intelligence, Collaboration

© 2021 Emerald X, LLC. All rights reserved.