Microsoft Translator API
Written by Sue Gee   
Saturday, 02 April 2016

Microsoft has released a new version of its Translator API. This provides developers with the same speech-to-speech facilities as those used in the Skype Translator and in the iOS and Android Microsoft Translator apps.


In the blog post announcing the availability of the new Microsoft Translator API, Microsoft describes it as:

the first end-to-end speech translation solution optimized for real-life conversations (vs. simple human to machine commands) available on the market. 

It also explains how the service works using AI technologies, such as deep neural networks for speech recognition and text translation, and outlines the following four stages for performing speech translation:

  1. Automatic Speech Recognition (ASR) — A deep neural network trained on thousands of hours of audio analyzes incoming speech. This model is trained on human-to-human interactions rather than human-to-machine commands, producing speech recognition that is optimized for normal conversations.

  2. TrueText — A Microsoft Research innovation, TrueText takes the literal recognized text and transforms it to more closely reflect user intent. It achieves this by removing speech disfluencies, such as "um"s and "ah"s, as well as stutters and repetitions. The text is also made more readable and translatable by adding sentence breaks, proper punctuation and capitalization (see the picture below and the rough code sketch after this list).

  3. Translation — The text is translated into any of the 50+ languages supported by Microsoft Translator. The eight speech languages have been further optimized for conversations by training deep neural network-powered language models on millions of words of conversational data.

  4. Text to Speech — If the target language is one of the eighteen speech languages supported, the text is converted into speech output using speech synthesis. This stage is omitted in speech-to-text translation scenarios such as video subtitling. 
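To make the TrueText stage more concrete, here is a minimal sketch in Python of the kind of clean-up it performs. This is not Microsoft's implementation (TrueText is a trained model); the function name and regular expressions below are purely illustrative and simply strip filler words and repetitions, then restore basic punctuation and capitalization, as described in step 2.

    import re

    # Illustrative only: TrueText is a trained model inside Microsoft Translator;
    # this toy version merely mimics the transformations described above.
    FILLERS = re.compile(r"\b(?:um+|uh+|ah+|er+|hmm+)\b", re.IGNORECASE)
    REPEATS = re.compile(r"\b(\w+)(?:\s+\1\b)+", re.IGNORECASE)

    def truetext_like_cleanup(raw: str) -> str:
        """Remove disfluencies and tidy the text so it is easier to read and translate."""
        text = FILLERS.sub("", raw)                   # drop "um"s, "ah"s and similar fillers
        text = REPEATS.sub(r"\1", text)               # collapse stutters such as "I I I"
        text = re.sub(r"\s{2,}", " ", text).strip()   # normalise whitespace
        if text and not text.endswith((".", "?", "!")):
            text += "."                               # add a sentence-final stop
        return text[:1].upper() + text[1:]            # restore capitalization

    print(truetext_like_cleanup("um so I I think we should uh meet er tomorrow"))
    # -> "So I think we should meet tomorrow."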

[Diagram: the four stages of Microsoft Translator speech translation (click to enlarge)]

Microsoft Translator covers two types of API use and integration:

1) Speech-to-speech translation is available for English, French, German, Italian, Portuguese, Spanish, Chinese Mandarin and Arabic.

2) Speech-to-text translation, for scenarios such as webcasts or BI analysis, allows developers to translate speech in any of these eight conversation languages into any of the 50+ supported text languages, as sketched below.
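The following is a minimal sketch of how a client might decide which of the two modes a request falls into. The language codes are assumptions (the article names the eight conversation languages but not their code strings), TEXT_LANGS is only a truncated stand-in for the full 50+ text languages, and none of these names come from the actual API.

    # Assumed codes for the eight conversation (speech) languages named above;
    # TEXT_LANGS is a truncated stand-in for the 50+ supported text languages.
    SPEECH_LANGS = {"en", "fr", "de", "it", "pt", "es", "zh-CHS", "ar"}
    TEXT_LANGS = SPEECH_LANGS | {"ja", "ko", "ru", "nl", "pl", "tr"}

    def choose_mode(source: str, target: str, want_audio: bool) -> str:
        """Classify a request as speech-to-speech or speech-to-text per the rules above."""
        if source not in SPEECH_LANGS:
            raise ValueError("speech input is only available for the eight conversation languages")
        if want_audio:
            if target not in SPEECH_LANGS:
                raise ValueError("speech-to-speech needs a speech-capable target language")
            return "speech-to-speech"
        if target not in TEXT_LANGS:
            raise ValueError("target is not a supported text language")
        return "speech-to-text"   # e.g. subtitling a webcast or feeding BI analysis

    print(choose_mode("en", "fr", want_audio=True))    # speech-to-speech
    print(choose_mode("en", "ja", want_audio=False))   # speech-to-text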

A two-hour free trial is available. It provides 7,200 transactions, where a transaction is equivalent to one second of audio input, which is the same allowance as the free monthly tier. Beyond this, subscriptions are available:

 

[Table: Microsoft Translator subscription pricing]
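As a quick sanity check on those numbers, the two-hour free trial and the 7,200-transaction allowance are the same amount expressed in different units, since one transaction corresponds to one second of audio:

    # One transaction = one second of audio input, so two hours of audio
    # comes to exactly the 7,200 free transactions quoted above.
    free_audio_hours = 2
    free_transactions = free_audio_hours * 60 * 60
    print(free_transactions)  # 7200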

The prospect of being able to communicate without language barriers is becoming ever more of a reality, and the more we use the service the better it will become. Ironically, there's an error in the sample Microsoft uses in its artwork above: Gurdeep is the object of the final sentence in the English but becomes the subject in the French. This sort of error will quickly be corrected by machine learning as more data becomes available.



 
