With the world getting more and more connected, the need for multilingualism increases every day. Getting in touch with someone of a completely different nationality, culture and language is no problem at all, but we fall short in maintaining that contact if the participants share no common language to communicate in. Luckily, there have been some great improvements in machine translation, which offer us the opportunity to translate any word, sentence or text into a different language without having any knowledge of that language ourselves.
One of these machine translation systems was announced in April 2006 by Franz Och, one of Google’s main research scientists. He called it an exciting new project; it would later turn out to be Google Translate, which nowadays translates over 140 billion words every day. Since then, Google Translate has gone through some great developments, which will be pointed out in this article.
The beginning
Starting from scratch, Google Translate launched in April 2006 with a phrase-based, statistical machine translation system. At the time, this was rather innovative. Most state-of-the-art machine translation systems were either rule-based or example-based. The rules and examples on which those translations were based had to be written by linguists, drawing on their studies of the syntax of the source and target language, which was an expensive task in time and effort. Instead, Google used Statistical Machine Translation (SMT), meaning that its system generated translations based on statistical models. The computer had access to billions of words of text in both the source and target language. By analysing all these texts, parameters could be derived for the statistical models. Using this method, Google Translate did not implement any linguistic advice on grammatical rules or on rare words and exceptions of a language. In fact, Google Translate has no clue how a language is constructed. It simply gathers all the information it needs from thousands of documents, including United Nations and European Parliament transcripts, to build its models.
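To give an idea of how such parameters can be derived purely from parallel text, here is a minimal sketch in Python. It estimates crude word translation probabilities from raw co-occurrence counts over a tiny made-up corpus; real SMT systems use far larger corpora and proper alignment models (such as the IBM models trained with EM), so this is only an illustration of the principle, not Google’s actual pipeline.

```python
from collections import defaultdict

# Tiny made-up parallel corpus of (source, target) sentence pairs; in reality
# the models are estimated from billions of words, e.g. UN and European
# Parliament transcripts.
corpus = [
    ("het huis", "the house"),
    ("het boek", "the book"),
    ("een boek", "a book"),
]

# Crude shortcut: estimate p(target_word | source_word) from raw co-occurrence
# counts. Real SMT uses alignment models trained with EM instead.
cooc = defaultdict(lambda: defaultdict(int))
src_count = defaultdict(int)
for src, tgt in corpus:
    for s in src.split():
        for t in tgt.split():
            cooc[s][t] += 1
        src_count[s] += len(tgt.split())

def translation_prob(t, s):
    """Very rough estimate of the translation probability p(t | s)."""
    return cooc[s][t] / src_count[s]

print(translation_prob("book", "boek"))   # relatively high
print(translation_prob("house", "boek"))  # zero in this toy corpus
```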
Furthermore, Google used phrase-based SMT, in contrast to the previously used word-based translation. In phrase-based translation, sentences or texts are divided into sequences of words, called blocks, instead of into individual words. These blocks do not necessarily correspond to linguistic phrases, but are found using the statistical methods mentioned before. By translating sequences of words rather than translating every word of a text independently, the quality of the translation increases.
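The sketch below illustrates the idea of translating blocks rather than single words. The phrase table and the greedy left-to-right matching are hypothetical simplifications; a real phrase-based decoder scores many competing segmentations and reorderings against its statistical models.

```python
# A minimal sketch of phrase-based translation with a hypothetical phrase table.
# We greedily match the longest known block from left to right; real decoders
# weigh many alternative segmentations and word orders.
phrase_table = {
    ("ik", "heb"): ["i", "have"],
    ("een",): ["a"],
    ("rood", "huis"): ["red", "house"],
    ("huis",): ["house"],
}

def translate(words):
    out, i = [], 0
    while i < len(words):
        # Try the longest block starting at position i first.
        for length in range(len(words) - i, 0, -1):
            block = tuple(words[i:i + length])
            if block in phrase_table:
                out.extend(phrase_table[block])
                i += length
                break
        else:
            out.append(words[i])  # unknown word: copy it through unchanged
            i += 1
    return out

print(translate("ik heb een rood huis".split()))
# ['i', 'have', 'a', 'red', 'house']
```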
Translation quality is mostly evaluated using the bilingual evaluation understudy (BLEU) metric. BLEU compares machine translations to a professional human translation, where the aim is to get as close to the human translation as possible. The output of BLEU is a number between 0 and 1, where 1 indicates a candidate text that is identical to the human reference translation. Since a text can always be translated in multiple ways, for example using synonyms, a translation does not need a BLEU score of 1 to be perfectly correct.
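As an illustration, the following Python sketch computes a simplified BLEU score from clipped n-gram precisions and a brevity penalty. The official metric uses up to 4-grams, multiple reference translations and a few further refinements, so treat this as a rough approximation of how the number comes about.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=2):
    """Simplified BLEU: clipped n-gram precisions up to max_n, combined with a
    brevity penalty. The official metric uses 4-grams and multiple references."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = Counter(ngrams(candidate, n)), Counter(ngrams(reference, n))
        overlap = sum(min(count, ref[g]) for g, count in cand.items())
        precisions.append(overlap / max(len(ngrams(candidate, n)), 1))
    if min(precisions) == 0:
        return 0.0
    # Brevity penalty: punish candidates that are shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

reference = "the cat is on the mat".split()
print(bleu("the cat is on the mat".split(), reference))  # 1.0: identical to the reference
print(bleu("the cat sat on a mat".split(), reference))   # somewhere between 0 and 1
```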
The findings of Google’s Brain Team
After the launch of Google Translate as a phrase-based SMT system in 2006, the innovations and developments of Google’s Brain Team and Translate Team did not stagnate. Google Translate broadened its coverage to 103 languages, and great improvements were made in the speech and image recognition capabilities of the system. As stated by Google Brain Team member Mike Schuster, with neural networks reshaping many fields, Google was convinced it could raise the translation quality further, “but doing so would mean rethinking the technology behind Google Translate”. And so they did.
In September 2016, ten years after the first announcement of Google Translate, Google’s Research Blog revealed one of its latest improvements: the Google Neural Machine Translation system (GNMT). GNMT is Google’s form of Neural Machine Translation (NMT), a form of machine translation that uses neural networks and deep learning. NMT systems mostly consist of three components: an encoder, a decoder and an attention network. The system encodes an input text as a list of vectors representing the words. The encoder reads the words one by one, using the context of each word to find its meaning and to produce an abstract representation of it. After this, the decoder starts generating the translation of the text by looking for the best matching representations in the target language. The attention network aligns input words to output words, producing a weighted distribution over the encoded vectors that indicates which of them are most relevant for the next translation step. In both the encoder and the decoder layers, Recurrent Neural Networks (RNNs) are used to directly learn the mapping between input and output text. NMT treats the whole input text as one translation unit, whereas the previously used phrase-based machine translation translates the separate blocks of a sentence independently.
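A toy numerical sketch of the attention step may help. The vectors below are random stand-ins for the encoder outputs and the decoder state (in GNMT they come from deep recurrent layers, not random numbers); the point is only to show how attention turns scores into a weighted distribution over input positions and into a context vector for the decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder output: one vector per input word (5 words, 8 dimensions).
encoder_states = rng.normal(size=(5, 8))

# Hypothetical decoder state while it is generating the next output word.
decoder_state = rng.normal(size=(8,))

# Attention: score each encoded word against the decoder state, then turn the
# scores into a weighted distribution over input positions via a softmax.
scores = encoder_states @ decoder_state
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# Context vector: weighted sum of the encoder vectors, telling the decoder
# which input words matter most for this output word.
context = weights @ encoder_states

print(weights.round(3))  # attention distribution over the 5 input positions
print(context.shape)     # (8,)
```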
The RNNs that Google uses in GNMT are Long Short-Term Memory (LSTM) RNNs. These are deep-learning RNNs that perform better in a network of multiple layers than traditional RNNs do. The LSTM RNNs in GNMT contain eight layers in both the encoder and the decoder network. Furthermore, the memory of LSTM RNNs enables them to learn tasks that require remembering events from millions of discrete time steps earlier. Google Translate uses the analysis of numerous text documents to learn about the syntax and grammatical rules of a language. It is therefore important that the analysis of all previously processed text is somehow stored in the memory of the system, ready to be used when needed. Hence, LSTM RNNs generate much more accurate translations than traditional RNNs would.
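For a concrete picture of such a deep recurrent stack, here is a minimal sketch using PyTorch (chosen purely for illustration; it is not what Google used). It builds an eight-layer LSTM encoder over embedded tokens; GNMT’s real encoder additionally uses residual connections and a bidirectional bottom layer, which are left out here.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, for illustration only.
vocab_size, embed_dim, hidden_dim = 32000, 512, 512

embedding = nn.Embedding(vocab_size, embed_dim)
# A plain stack of 8 LSTM layers, echoing the depth of the GNMT encoder.
encoder = nn.LSTM(input_size=embed_dim, hidden_size=hidden_dim,
                  num_layers=8, batch_first=True)

tokens = torch.randint(0, vocab_size, (1, 12))    # one sentence of 12 token ids
outputs, (h_n, c_n) = encoder(embedding(tokens))  # run the deep LSTM stack
print(outputs.shape)  # (1, 12, 512): one encoded vector per input position
```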
Although NMT systems seemed to be a great step forward in machine translation, their translations initially turned out to be less accurate than those of the previously used phrase-based translation. The three greatest weaknesses of NMT were its slower training and inference speed, its difficulty with rare words, and its tendency to miss parts of the input text. With GNMT, Google tried to come up with solutions for these weaknesses, as described in its technical report Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. For example, the issue of recognising rare words is solved using sub-word units: words are divided into so-called wordpieces. In some languages, it is possible to combine numerous nouns into one single word. In Dutch, for example, the word ‘autoventieldopjesverzamelaarsvereniging’ is a perfectly correct, though not widely used, word. By using sub-words, such a long and rare word can be divided into its constituent nouns, which makes it a lot easier to translate it into something close to the human translation ‘club of collectors of valve caps of cars’ in English. Overall, GNMT made great improvements compared to the previous phrase-based SMT. According to Google’s technical report, translation errors are reduced by an average of 60%.
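The sketch below shows one simple way to split such a word into sub-word units: greedy longest-match segmentation against a small, entirely hypothetical vocabulary. Real wordpiece vocabularies are learned from data and the segmentation algorithm is more involved, but the effect on the Dutch example is the same.

```python
# Hypothetical wordpiece vocabulary; real vocabularies are learned from data.
vocab = {"auto", "ventiel", "dopjes", "verzamelaars", "vereniging",
         "ver", "dop", "jes", "en"}

def wordpieces(word):
    """Greedy longest-match segmentation of a word into vocabulary pieces."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest match first
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])         # fall back to a single character
            i += 1
    return pieces

print(wordpieces("autoventieldopjesverzamelaarsvereniging"))
# ['auto', 'ventiel', 'dopjes', 'verzamelaars', 'vereniging']
```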
Zero-Shot Translation
The first released version of the GNMT system supported nine languages, including English, Chinese and Spanish. Although this covers about 35% of all input, the next challenge for Google was to extend GNMT so that it could translate from and to all 103 languages supported by Google Translate. Normally, an NMT network is trained for a single language pair, which means that a separate system would be needed for every pair of languages. However, Google would not be Google if it did not try to do things differently. In November 2016, two months after the first announcement of GNMT, Google revealed its findings on a multilingual network, unsurprisingly called Multilingual GNMT.
Instead of changing the original architecture of the GNMT system, Google “just added” a token at the beginning of each input text. This token specifies the language into which the input text has to be translated. For example, “<2en>” denotes English as the target language and “<2ko>” Korean. But besides covering many more languages in one single system, the Multilingual GNMT system revealed something else rather interesting. As described in the technical report Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation, the system is able to translate between language pairs it has never explicitly seen before. The multilingual system shares its parameters across the translation of several different language pairs, which makes it possible to transfer the “translation knowledge” the system has built up between those pairs. For example, let the system be trained to translate Japanese texts from and to English, and Korean texts from and to English. The shared parameters and “translation knowledge” give the system the ability to translate Japanese texts from and to Korean, even though it was never specifically taught to make such translations.
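In code, the trick amounts to nothing more than prepending an artificial target-language token to the input before it is fed to the model, as in this minimal sketch (the function name is made up; the token format follows the examples above).

```python
# Sketch of the multilingual trick: the model itself is unchanged, the input
# simply gets an artificial target-language token prepended to it.
def add_target_token(text: str, target_lang: str) -> str:
    return f"<2{target_lang}> {text}"

print(add_target_token("How are you?", "ko"))   # "<2ko> How are you?"
print(add_target_token("Hoe gaat het?", "en"))  # "<2en> Hoe gaat het?"
```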
When taking a closer look at these zero-shot translations, another question arises. In the system, the encoder converts an input text into an abstract representation, after which the decoder converts this into a translation in the target language. When the input contains three sentences with the same meaning but in different languages, does the encoder convert these three sentences to the same abstract representation? And does that mean the system is learning some common representation for texts with the same meaning, in which the source language no longer matters? In his research blog post announcing Zero-Shot Translation, Mike Schuster says that this phenomenon is interpreted by Google “as a sign of existence of an interlingua in the network”. In this sense, an interlingua is considered a commonality between multiple languages.
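One way to probe this idea, sketched below with invented numbers, is to compare the encoder’s sentence representations using a similarity measure such as cosine similarity: if an interlingua exists, sentences with the same meaning should end up close together regardless of their language. The vectors here are purely illustrative; Google’s report investigates this by visualising the network’s internal representations.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors (1.0 means identical direction)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Invented stand-ins for encoder representations of the same sentence in
# English, Japanese and Korean, plus one sentence with a different meaning.
rep_en = np.array([0.90, 0.10, 0.40])
rep_ja = np.array([0.80, 0.20, 0.50])
rep_other_meaning = np.array([-0.20, 0.90, 0.10])

print(cosine(rep_en, rep_ja))             # high: same meaning, different language
print(cosine(rep_en, rep_other_meaning))  # low: different meaning
```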
Over the past ten years, Google has taken some giant steps towards developing a universal translation machine with the ability to translate any text to and from any language perfectly. However, do not part with professional translators and your own grammatical knowledge just yet. Even an advanced system like GNMT can make errors, and the smallest mistranslation or dropped word can cause major misunderstandings when dealing with highly detailed international documents. Luckily, this only encourages Google all the more to keep improving its exciting project, Google Translate.
This article was written by Marleen Schumacher