Frieda Steurs
Language Technology: What Is Next?
Keynote Speaker
Abstract
Experts have long dreamt of computers that could fully understand and decode language. However, human (natural) language has proved to be one of the most difficult problems a computer can tackle.
In the 1950s, a lot of research attempted to capture language in universal rewriting rules so that computational analysis would become possible. The famous ALPAC (Automatic Language Processing Advisory Committee) report, published in 1966, brought the optimism about computational linguistics to a standstill: it condemned all attempts to build working machine translation (MT) software as a complete failure. As a result, much of the research in this area lost its funding. Later on, in the 1970s and 1980s, other views on how to analyze language with computers emerged and proved more successful. Gradually, dedicated MT applications for limited domains, with restricted vocabularies and a small number of language pairs, began to work well. At the same time, Moore's law, the observation that the number of transistors in an integrated circuit (IC) doubles about every two years, signalled a revolution in the processing power and storage capacity of computers. As this power grew, more complex software applications became feasible.
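As a back-of-the-envelope illustration of that growth, a minimal sketch in Python (the starting count, loosely modelled on an early-1970s microprocessor, is only an illustrative assumption):

    # Moore's law: transistor counts double roughly every two years.
    def transistors(start_count, start_year, year):
        # Projected count under a doubling every two years.
        return start_count * 2 ** ((year - start_year) / 2)

    # One decade of doubling yields a factor of 2**5 = 32.
    print(transistors(2_300, 1971, 1981))  # -> 73600.0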
In this lecture, we will use the history of MT to illustrate the growing computational power and the development of new algorithms. The first MT systems were rule-based; after that, statistical MT became the standard; and nowadays neural machine translation is the new reality.
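To make the contrast between these paradigms concrete, here is a minimal sketch of the oldest one, rule-based MT: a hand-written lexicon plus a substitution rule. The toy English-to-Dutch lexicon is invented for illustration; real rule-based systems added morphology, syntax and transfer rules, statistical MT replaced such hand-written rules with probabilities estimated from parallel corpora, and neural MT replaces both with a learned network.

    # Toy rule-based MT sketch (hypothetical English -> Dutch lexicon).
    LEXICON = {"the": "de", "cat": "kat", "sleeps": "slaapt"}

    def translate(sentence):
        # Word-for-word substitution: the core idea behind early
        # restricted-domain, restricted-vocabulary MT systems.
        return " ".join(LEXICON.get(w, w) for w in sentence.lower().split())

    print(translate("The cat sleeps"))  # -> "de kat slaapt"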
We will also go deeper into the effects of using Artificial Intelligence (AI) in linguistic applications. As an example, we will take traditional lexicography, in which dictionaries are compiled manually, versus the more modern corpus-based methods, which are now giving rise to new AI-driven algorithms. In order to find new ways to use AI in traditional lexicography, the Dutch Language Institute (Leiden) organized a Lorentz workshop in November 2019. During a very intensive week, 55 international scholars from different domains (linguists, computer scientists, AI experts, terminologists, etc.) brainstormed about new ways to build and update academic dictionaries.
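As a minimal sketch of what "corpus-based" means in practice (the one-line corpus below is a placeholder; real lexicographic work draws on corpora of many millions of words):

    from collections import Counter

    # Placeholder corpus standing in for a large text collection.
    corpus = "the cat sleeps and the dog sleeps while the cat purrs"
    tokens = corpus.split()

    # Frequency list: candidate headwords ranked by corpus evidence,
    # replacing manual excerpting on paper slips.
    print(Counter(tokens).most_common(3))
    # Bigram counts hint at collocations worth recording in an entry.
    print(Counter(zip(tokens, tokens[1:])).most_common(2))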
Finally, we will also briefly touch upon the world of the language industry. The language industry is booming: freelancers, language service providers, translation agencies, everyone can profit from the new trends in this field. Domain-specific knowledge is an important aspect: life sciences, legal domains, e-commerce and online retail, governance and politics, finance, IT, electronics, automotive, … are major topics with a lot of specific information and communication to be dealt with. Language technology can help to create texts (writing tools) and to assist in translating these materials. New technology is evolving at a very fast pace: machine translation and post-editing, video translation and subtitling, and many other new techniques and workflows are waiting for skilled language professionals. At the other end of the spectrum, transcreation is a new trend, and it involves other skills: localisation and creative writing.