Word-level machine translation for bag-of-words text analysis

Cheap, fast, and surprisingly good

By A. Maurits van der Veen in machine translation computational text analysis methods

September 28, 2023

Abstract

The quality of automated machine translation is rapidly approaching that of professional human translation. However, the best methods remain costly in terms of money, computational resources, and/or time, particularly when applied to large volumes of text. In contrast, word-level translation is both free and fast: it simply maps each word in a source language deterministically to a word in the target language.

Initial work on generating good word-level machine translation was done with STAIR lab students during the 2019-2020 academic year. Over the course of the next few years, we gradually improved and expanded the method.

This paper demonstrates that high-quality word-level translation dictionaries can be generated cheaply and easily, and that they produce translations that can be used reliably as inputs into some of the most common automated text analysis methods.
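
To make the idea concrete, here is a minimal sketch of word-level translation feeding a bag-of-words representation. It is not the paper's implementation: the toy Dutch-to-English dictionary, the function names, the whitespace tokenization, and the choice to leave unknown words untranslated are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy Dutch-to-English dictionary (illustration only);
# a real word-level dictionary would cover the full source vocabulary.
TOY_DICTIONARY = {
    "de": "the",
    "kat": "cat",
    "zit": "sits",
    "op": "on",
    "mat": "mat",
}


def translate_words(text, dictionary):
    """Translate token by token with a fixed one-to-one lookup.

    Assumption: tokens missing from the dictionary are passed through unchanged.
    """
    tokens = text.lower().split()  # naive whitespace tokenization
    return [dictionary.get(token, token) for token in tokens]


def bag_of_words(tokens):
    """Count token frequencies; word order is discarded."""
    return Counter(tokens)


translated = translate_words("De kat zit op de mat", TOY_DICTIONARY)
counts = bag_of_words(translated)
print(translated)
print(counts)
```

Because a bag-of-words model ignores word order, the grammatical roughness of a word-by-word translation does not affect the counts, which is why this kind of translation can plausibly serve as input to such methods.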


Categories:
machine translation computational text analysis methods