In this article, we introduce the current state of the art in portable translation devices: small machines that can help, for example, English-speaking tourists converse ‘in real time’ with Russian speakers (or the other way round).
Imagine that you, an English speaker, have to visit Moscow or Saint Petersburg. Wouldn’t it be nice if you could simply ask local people for help in their own language, Russian?
Wouldn’t it be nice if they could hear your questions in Russian as you speak into your smartphone, tablet, or laptop? And wouldn’t it be nice if they could answer in turn, with their replies translated back to you in the same way, so that a real conversation becomes possible?
The ‘real-time’ translator has featured abundantly in sci-fi literature and films, for example as the device that lets a space traveler converse with the inhabitants of another world in their native language without knowing anything of it.
This may come as a surprise to some of our readers, but such devices, albeit in a primitive form, already exist.
Basically, a universal translator is a conceptual device that could instantaneously translate any language into any other, even without prior knowledge of it.
How such a device could extrapolate and interpret a totally ‘exotic’ language is still, as of 2020, a mystery, so there is no indication that one could be produced soon.
Strictly speaking, there are no truly ‘exotic’ languages on Earth: all of them are believed to have been recorded and to be known.
Translation and deciphering have much in common. An exotic language would appear to us as a kind of ‘crypto-code’, plainly incomprehensible.
How many of the following sentences are you actually able to understand?
“Здравствуйте, я пришел с миром!”
“Hallo, ek kom in vrede!”
“مرحبا ، لقد جئت بسلام!”
“Kaixo, bakean nator!”
“你好,我平安!”
“Bonjour, je viens en paix!”
“Helló, nyugodtan jövök!”
“こんにちは、私は安心して来ます!”
Each of these is the English sentence “Hello, I come in peace!” translated, respectively, into Russian, Afrikaans, Arabic, Basque, Chinese, French, Hungarian, and Japanese.
These are just a few of the hundreds of major languages in the world. In fact, more than 6,500 languages are spoken worldwide.
However, many of these languages (about 2,000) are very rare languages or dialects spoken by a small population, usually a tiny ethnic group.
For example, the Kaixana and Taushiro languages are each reported to have only one remaining speaker in the entire world.
This underlines the incredible challenges that must be overcome to build a universal translator able to convert one of these exotic languages into any other language, say English.
The inability to understand each other’s language is often a cause of war and conflict. In totally foreign territory, the initial communication with unknown people speaking unknown languages is crucial.
The universal translator must
Now, with the development of Artificial Intelligence and especially deep learning, we are getting closer to such a device.
Miniaturization, increases in processor power, and research in AI already allow the creation of ‘primitive’ types of universal translators.
Automatic universal translators are not mere toys. Imagine what would happen if we met aliens from outer space: what would they want from us? How would we know how to deal with them without a universal translator?
The problem features in many sci-fi movies in which the very survival of humankind hinges on this universal translation problem: friends or enemies?
The short story “To Serve Man” by Damon Knight, adapted by the famous TV series “The Twilight Zone”, features this translation problem in a very dramatic way.
An alien race claiming to help humanity leaves behind a book. An army translator works out that its title is “To Serve Man” and concludes that the race has nothing but friendly and compassionate goals. Then comes the plot twist: the translator understands, much too late, that the book is in fact a cookbook.
Now let us look at how automatic translation works behind the scenes. Basically, there are currently two main techniques: statistical translation and the newer neural network translation.
The main techniques for Machine Translation (MT) are:
A very basic way of translating is the ‘dictionary’, or word-by-word, approach. Such a translator is relatively straightforward to implement.
Of course, one word may have multiple senses depending on the context. The story “To Serve Man” shows how dangerous such an approach can be.
Here we present a few results of such a primitive approach:
English | I | COME | IN | PEACE | Resulting meaning |
Hebrew | אני | לבוא | ב | שָׁלוֹם | “Please come in peace” |
Greek | Εγώ | Έλα | σε | ειρήνη | “I come in peace” |
Russian | я | приходить | в | мир | “I come into the world” |
With this approach, Hebrew speakers would be asked to “come in peace” by foreigners who are themselves arriving on their land (which may be taken as a hostile act of invasion), and Russian speakers would be told that the foreigners are “coming into the world” (which suggests birth), a statement that may also be read as hostile.
Imagine a total foreigner arriving in your own house and telling you:
-“Please, come in peace!”
or
-“I come into the world!”
This is enough to show the limits of word-for-word translation. On more complex sentences, this approach usually fails dramatically to produce an adequate translation.
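The word-by-word approach described above can be sketched in a few lines of Python. This is only a minimal illustration: the lexicon is copied from the Russian row of the table, and the names `LEXICON_EN_RU` and `translate_word_by_word` are made up for this example.

```python
# Toy English -> Russian lexicon, copied from the Russian row of the table.
# Each word has exactly one entry, which is precisely the weakness of the method.
LEXICON_EN_RU = {
    "i": "я",
    "come": "приходить",
    "in": "в",
    "peace": "мир",
}


def translate_word_by_word(sentence, lexicon):
    """Replace each word with its single dictionary entry, ignoring context."""
    words = sentence.lower().strip("!?. ").split()
    # Unknown words are kept unchanged so the gap stays visible in the output.
    return " ".join(lexicon.get(word, word) for word in words)


print(translate_word_by_word("I come in peace!", LEXICON_EN_RU))
```

Because the lexicon knows nothing about grammar or context, the verb stays in the infinitive and “мир” can be read as “world” just as well as “peace”, which is how the mistranslation in the table arises.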
Yet rule-based translation, transfer rules, and hand-built lexicons are widely used in commercial machine translation systems, which therefore rely on the work of expert translators.
Here is a very brief overview of the historical evolution of machine translation techniques.
Period | Scientists/organizations involved | Technique |
1949 | Warren Weaver | Initiates the idea of using machines for natural language translation tasks |
1950-1980 | | MT done mostly through rule-based machine translation (RBMT) |
1966 | US Gov | ALPAC report |
1970 | Systran Corp | Creation of the SYSTRAN system |
1980-2000 | | Development of statistical machine translation (SMT) |
End of the 80s | Makoto Nagao | Development of example-based machine translation |
1990 | | Development of strong statistical parsers for language translation |
1990 | IBM | Candide project. Develops fundamental aspects of SMT. |
2000+ | | Development of neural machine translation (NMT) |
1997 | Ñeco & Forcada | First mention of using encoders/decoders for MT |
2003 | Bengio et al. | First language model based on neural networks |
2013 | Kalchbrenner & Blunsom | Birth of neural machine translation. Development of a full end-to-end neural translation model using CNNs and RNNs |
2014 | Sutskever et al.; Cho et al. | Seq2seq MT using RNNs and LSTMs. Addresses the “long-distance reordering” problem |
2014 | Bahdanau, Cho & Bengio | Introduction of the “attention” mechanism |
Statistical MT uses statistics to find correlations in large bilingual text corpora.
The basic idea of statistical MT is to compute probabilities/distributions p(A|B), where B is a sentence or word in the “source” language and A is its translation in the “target” language.
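As a rough illustration of how p(A|B) can be used, a translator may score several candidate translations A for one source sentence B and keep the most probable one. The sketch below assumes the common decomposition given by Bayes’ rule, in which p(A|B) is proportional to p(B|A) · p(A). All the numbers are invented for the example, and the names `translation_model`, `language_model`, and `best_translation` are hypothetical.

```python
# Invented toy scores for three English candidates A of one source sentence B.
# translation_model[a] stands for p(B|A): how well candidate A explains the source.
translation_model = {
    "I come in peace": 0.40,
    "I come into the world": 0.45,
    "I peace come in": 0.40,
}
# language_model[a] stands for p(A): how fluent candidate A is in English.
language_model = {
    "I come in peace": 0.030,
    "I come into the world": 0.010,
    "I peace come in": 0.0001,
}


def best_translation(candidates):
    """Return the candidate A that maximizes p(B|A) * p(A)."""
    # By Bayes' rule, p(A|B) is proportional to p(B|A) * p(A), so the
    # candidate with the largest product also has the largest p(A|B).
    return max(candidates, key=lambda a: translation_model[a] * language_model[a])


print(best_translation(list(translation_model)))
```

In this toy setting the fluency term is what rules out the scrambled candidate, even though its translation score ties with that of the correct one.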
Statistical MT is not simple, and although its ideas go back to the ’50s, it is still an area of active research. Markov models are also typically used in the process.
In statistical translation, sentences are represented by n-grams, that is, sequences of n consecutive words. This is a fundamental concept in computational linguistics.
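To make the notion concrete, here is a minimal sketch of n-gram extraction. The helper name `ngrams` is chosen for this example only, and the tokenization is a plain whitespace split, which real systems would replace with a proper tokenizer.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens, in sentence order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "i come in peace".split()
print(ngrams(tokens, 2))  # bigrams
print(ngrams(tokens, 3))  # trigrams
```

A sentence shorter than n simply yields an empty list.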
The difficulties to overcome are tremendous. While statistical MT may give very good results for languages that are similar or belong to the same group, it may fail for languages that are very different in nature.
The following picture illustrates how a typical sentence-based statistical translator works, by finding the pair of sentences with the maximal probability.
Statistical MT will work much better for English-German translation than for English-Japanese translation.
A good example of a statistical MT system is Moses. Its website contains most of the fundamental resources and information about statistical MT.
The neural network approach, and especially the deep learning approach, is relatively new to machine translation.
Among neural architectures, Recurrent Neural Networks (RNNs) are well suited because they can process a sentence as a time series of conditional events, much as a Markov model would.
The first MT models using RNNs were in fact made of two separate RNNs: an encoder and a decoder.
Usually, the encoder compresses the source (i.e. the input sentence) into a fixed-length vector, and the decoder then generates the output from that vector, predicting one word at a time. This of course may create issues when the sentence is long, since everything must fit into the same fixed-size vector.
This problem has been more or less solved by the “attention” mechanism, which weights the parts of the input text by relevance so that the translator can focus on them more than on the rest of the text.
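The weighting idea can be sketched as follows. This is a simplified dot-product variant chosen for brevity, not necessarily the scoring function used in any particular system, and the vectors are tiny made-up values. The names `softmax`, `attend`, `query`, and `encoder_states` are assumptions of this example.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    highest = max(scores)  # subtracted for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, encoder_states):
    """Weight each encoder state by its relevance to the decoder query."""
    # Relevance score: dot product between the query and each encoder state.
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    # Context vector: weighted average of the encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * state[d] for w, state in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return weights, context


# One made-up 2-dimensional vector per input word.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]  # current decoder state, also made up
weights, context = attend(query, encoder_states)
print(weights)
print(context)
```

Instead of a single fixed-length summary of the whole sentence, the decoder receives a fresh context vector at every step, dominated by the input words with the largest weights.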
Attentional encoder-decoder networks are therefore currently considered the state of the art in NMT.
RNNs perform remarkably well on MT tasks, which explains why most commercial organizations offering language translation have moved to such techniques and gradually abandoned statistical machine translation.
Some organizations, such as Yandex Translate, use a hybrid approach combining NMT and SMT.
Next, we present the devices that can currently be bought commercially and used as a basic universal translator.
This category features small computers that mostly act as portable dictionaries and often use rule-based translation.
ECTACO offers a wide range of electronic dictionaries through its PARTNER products (and instant voice translators through its iTRAVL products).
Their products are often aimed at professional, skilled translators and focus more on aiding human translation (for example, international interpreters on duty).
Some of their pocket dictionaries include rarely spoken languages such as Yiddish.
The Franklin TWE-118 5-Language European Translator offers no more than 210,000 total translations.
Casio offers many portable electronic dictionaries. Each model is loaded with a given pair of languages.
Atlas provides specialized electronic translators (engineering, medicine, etc.), especially for the Arabic language.
Here are ‘real-time’ conversational translators, which often use statistical machine translation or neural network translation. They can be seen as a primitive form of the universal translator.
These devices are able to convert your own voice ‘instantaneously’ into spoken sentences in a foreign language.
Pocket Talk has 74 languages built into its memory. It uses AI for translation and relies entirely on its own data, which means it works fully even offline.
The translator comes with its own embedded SIM card (and therefore a data plan). It can translate 105 languages but always needs to be online, since it gets its translations from a dozen well-known translation engines.
This pocket translator is considered to be a real bargain. It needs to be online to work.
It is one of the fastest translators on the market (0.3 seconds/translation). It needs to be online to work.
It has very good AI and features a huge screen.
Languages are translated almost in ‘real time’, and the device works as a hands-free kit.
It is mostly used for English to Chinese translation and works online or offline.
We are of course extremely far from being able to build a universal translator. Current devices can translate at most the hundred or so main languages on Earth; they cannot understand the rest of the roughly 6,400 languages spoken on our planet.
Some languages are spoken only in specific regions, along the Amazon river or in Peru for example. This is where a universal translator could show its full strength.
Finally, existing translators would be unable to ‘decode’ or ‘decipher’ an unknown language. In conclusion, nothing beats learning a foreign language yourself when you want to translate something.