Embark on a linguistic journey with Ioannis Papachimonas to explore ‘How computers translate human language’. In this listening activity, we unravel the complexities of machine translation and consider whether the universal translator of science fiction could exist in real life. Join us as we navigate the challenges of rule-based and statistical machine translation, bridging the gap between artificial intelligence and the beauty of human language.
Watch the video and listen
Video transcript
(00:00) How is it that so many
intergalactic species in movies and TV just happen to speak perfect English? The short answer is that no one
wants to watch a starship crew spend years compiling an alien dictionary. But to keep things consistent, the creators of Star Trek
and other science-fiction worlds have introduced the concept
of a universal translator,
(00:26) a portable device that can instantly
translate between any languages. So is a universal translator
possible in real life? We already have many programs
that claim to do just that, taking a word, sentence,
or entire book in one language and translating it into almost any other, whether it’s modern English
or Ancient Sanskrit.
(00:49) And if translation were just a matter
of looking up words in a dictionary, these programs would run circles
around humans. The reality, however,
is a bit more complicated. A rule-based translation program
uses a lexical database, which includes all the words
you’d find in a dictionary and all grammatical forms they can take,
(01:10) and a set of rules to recognize the basic
linguistic elements in the input language. For a seemingly simple sentence like,
“The children eat the muffins,” the program first parses its syntax,
or grammatical structure, by identifying the children
as the subject, and the rest of the sentence
as the predicate
(01:29) consisting of a verb “eat,” and a direct object “the muffins.” It then needs to recognize
English morphology, or how the language can be broken down
into its smallest meaningful units, such as the word muffin and the suffix “s,”
used to indicate plural. Finally, it needs to understand
the semantics,
(01:49) what the different parts of the sentence
actually mean. To translate this sentence properly, the program would refer to a different set
of vocabulary and rules for each element of the target language. But this is where it gets tricky. The syntax of some languages
allows words to be arranged in any order,
(02:07) while in others, doing so could make
the muffin eat the child. Morphology can also pose a problem. Slovene distinguishes between
two children and three or more using a dual suffix absent
in many other languages, while Russian’s lack of definite articles
might leave you wondering whether the children are eating
some particular muffins,
(02:30) or just eat muffins in general. Finally, even when the semantics
are technically correct, the program might miss their finer points, such as whether the children
“mangiano” the muffins, or “divorano” them. Another method is
statistical machine translation, which analyzes a database
of books, articles, and documents
(02:51) that have already
been translated by humans. By finding matches between source
and translated text that are unlikely to occur by chance, the program can identify corresponding
phrases and patterns, and use them for future translations. However, the quality
of this type of translation depends on the size
of the initial database
(03:14) and the availability of samples
for certain languages or styles of writing. The difficulty that computers have
with the exceptions, irregularities and shades of meaning
that seem to come instinctively to humans has led some researchers to believe
that our understanding of language is a unique product
of our biological brain structure.
(03:35) In fact, one of the most famous
fictional universal translators, the Babel fish from
“The Hitchhiker’s Guide to the Galaxy”, is not a machine at all
but a small creature that translates the brain waves
and nerve signals of sentient species through a form of telepathy. For now, learning a language
the old-fashioned way
(03:57) will still give you better results than
any currently available computer program. But this is no easy task, and the sheer number
of languages in the world, as well as the increasing interaction
between the people who speak them, will only continue to spur greater
advances in automatic translation. Perhaps by the time we encounter
intergalactic life forms,
(04:18) we’ll be able to communicate with them
through a tiny gizmo, or we might have to start compiling
that dictionary, after all.
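For readers curious about what the rule-based approach in the transcript might look like in practice, here is a minimal Python sketch. It is not part of the original video. It handles only one sentence pattern (article, noun, verb, article, noun) and translates it from English into Italian. The tiny lexicon, the table of Italian forms, and the function names analyze, parse, and translate are all assumptions made up for illustration; a real system would need a far larger lexical database and many more rules.

```python
# Toy rule-based translator (English -> Italian) for one sentence pattern.
# Every table here is an illustrative assumption, not a real lexical database.

# English lemma -> (part of speech, Italian lemma)
LEXICON = {
    "the": ("DET", None),
    "child": ("NOUN", "bambino"),
    "muffin": ("NOUN", "muffin"),
    "eat": ("VERB", "mangiare"),
}
IRREGULAR_PLURALS = {"children": "child"}

# Italian surface forms keyed by (lemma, number); nouns carry their
# definite article because the only supported English article is "the".
ITALIAN_FORMS = {
    ("bambino", "sg"): "il bambino",
    ("bambino", "pl"): "i bambini",
    ("muffin", "sg"): "il muffin",
    ("muffin", "pl"): "i muffin",
    ("mangiare", "sg"): "mangia",
    ("mangiare", "pl"): "mangiano",
}


def analyze(word):
    """Morphology: reduce a word to (lemma, part of speech, number)."""
    word = word.lower()
    if word in IRREGULAR_PLURALS:
        lemma, number = IRREGULAR_PLURALS[word], "pl"
    elif word in LEXICON:
        lemma, number = word, "sg"
    elif word.endswith("s") and LEXICON.get(word[:-1], ("", None))[0] == "NOUN":
        lemma, number = word[:-1], "pl"  # suffix "s" marks a plural noun
    else:
        raise ValueError(f"unknown word: {word!r}")
    return lemma, LEXICON[lemma][0], number


def parse(sentence):
    """Syntax: identify the subject, verb, and direct object."""
    analyses = [analyze(token) for token in sentence.rstrip(".").split()]
    pattern = [pos for _, pos, _ in analyses]
    if pattern != ["DET", "NOUN", "VERB", "DET", "NOUN"]:
        raise ValueError(f"unsupported sentence structure: {pattern}")
    _, subj, verb, _, obj = analyses
    return {
        "subject": (subj[0], subj[2]),
        "verb": verb[0],
        "object": (obj[0], obj[2]),
    }


def translate(sentence):
    """Generate Italian using target-language vocabulary and agreement."""
    tree = parse(sentence)
    subj_lemma, subj_number = tree["subject"]
    obj_lemma, obj_number = tree["object"]
    parts = [
        ITALIAN_FORMS[(LEXICON[subj_lemma][1], subj_number)],
        # The Italian verb agrees in number with the subject.
        ITALIAN_FORMS[(LEXICON[tree["verb"]][1], subj_number)],
        ITALIAN_FORMS[(LEXICON[obj_lemma][1], obj_number)],
    ]
    text = " ".join(parts)
    return text[0].upper() + text[1:] + "."


print(translate("The children eat the muffins."))
```

Even this toy shows the limits the transcript describes: it always chooses "mangiano" and has no way to decide when "divorano" would be the better word, and any sentence outside its single pattern is rejected.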
Test your listening skills with this quiz
As we conclude our exploration of ‘How computers translate human language’ with Ioannis Papachimonas, we find ourselves at the crossroads of technology and the intricacies of language. The journey through syntax, morphology, and semantics reminds us of the unique cognitive abilities underlying our understanding of language. Despite current limitations, the quest to overcome linguistic barriers through automatic translation promises a bright future. Happy listening!