How to Deliver Quality Machine Translation for Games

Previous articles discussed what Machine Translation (MT) is. This article looks at how we work MT into the game localization production pipeline.

Interest in machine translation generally centers on cost savings, faster time to market, and the chance to explore new languages or territories. The limiting factor in all of these is, and should always be, the quality of the translated output. As the audience for games has grown in both volume and sophistication, the demand for better, more accurate translation has increased. Poor translation hurts a game’s reception, with a concomitant drop in sales. It is therefore more important than ever for developers and publishers to integrate MT into their workflow in the right way, aware of its ins and outs and respecting some basic rules for creating quality content.

MT System-Agnostic

Our MT process is system-agnostic: we determine the best MT systems for each project on a case-by-case basis. This allows us to provide the faster turnaround times, cost savings, and data security that our partners demand. We have honed this practice over the last few years of implementation, and we consider it a continuous evolution in which quality is always the goal.

Maintaining Quality with Machine Translation

Our baseline guide for quality begins with the existence of a previous title’s content. An accepted translation is a solid foundation for future projects, because it has already gone through a rigorous process. Ideal projects include game sequels, MMOs (which tend to have iterated content), online games (which tend to update their content continuously), and games similar to a previously translated title. Because the industry values building upon IP over standalone content, there is ample opportunity to create strong collections of Translation Memory (TM), which are reservoirs of validated content. If the IP’s TM is usable, it saves a great deal of time and effort on future projects based on the original game.
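To make the idea of reusing validated content concrete, here is a minimal sketch of a fuzzy lookup against a Translation Memory. It is an illustration only, not a description of any particular CAT tool: the `TRANSLATION_MEMORY` entries, the `best_tm_match` helper, and the 0.75 threshold are all assumptions chosen for the example, and the similarity measure is the generic one from Python's standard `difflib` module.

```python
from difflib import SequenceMatcher

# Hypothetical TM: validated source -> target pairs from an earlier title.
TRANSLATION_MEMORY = {
    "Press any button to start": "Appuyez sur un bouton pour commencer",
    "Your inventory is full": "Votre inventaire est plein",
}


def best_tm_match(source, memory, threshold=0.75):
    """Return (tm_source, tm_target, score) for the closest TM entry,
    or None when no entry reaches the (assumed) similarity threshold."""
    best = None
    for tm_source, tm_target in memory.items():
        # ratio() is 1.0 for identical strings and falls toward 0.0
        # as the two strings share less text.
        score = SequenceMatcher(None, source.lower(), tm_source.lower()).ratio()
        if best is None or score > best[2]:
            best = (tm_source, tm_target, score)
    if best is not None and best[2] >= threshold:
        return best
    return None


if __name__ == "__main__":
    print(best_tm_match("Your inventory is full!", TRANSLATION_MEMORY))
    print(best_tm_match("Quit game", TRANSLATION_MEMORY))
```

Segments with a close match can be served from the TM, while segments without one are the natural candidates for MT and post-editing.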

Types of Game Localization Workflows

There are three typical scenarios:

Stock: A stock (generic) MT engine is used, and PTW’s game expert linguists post-edit its output. This workflow is used when not enough TMs are available and production needs to begin right away. Even so, a feasibility test is always performed at the beginning of a project to confirm that the increase in productivity does not come at the cost of quality.

Semi-customized: The stock engine is complemented with a glossary, and some fine-tuning rules may be added. Like the stock workflow, this is used when not enough TMs are available and production needs to begin right away. It can also be the answer when a feasibility test with the stock engine alone gave borderline results. The client’s terminology and vocabulary are always respected.

Fully customized: The third possibility is to build a fully customized engine. This can happen before the project starts or, for games with no previous usable data, midway through the project, once standard translation and a review of a third of the content have been completed.

It should be noted that PTW does not favor any engine provider over others; the needs of each project reveal which engine best suits the translation. Currently, for some content types or purposes, GPT-3 is being explored as a real-time translation solution where Neural Machine Translation (NMT) engines may be too slow or expensive. This is an area of ongoing research in the MT community.

The final step is always Machine Translation Post-Editing (MTPE) and final review. The more parallel data is used to customize an engine, the better the MT output will generally be. Less post-editing effort for our game expert linguists means they can handle more volume in less time.

During this process, consideration is given to the effort required to customize the translation engine. Certain language pairs, such as English to French, translate better than others; German to Korean, for example, would most probably require English as a pivot language.
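Pivoting simply means chaining two translation steps through an intermediate language. The sketch below shows the shape of that chain under stated assumptions: `pivot_translate` is a hypothetical helper, and the two dictionary-backed functions are toy stand-ins for real German-to-English and English-to-Korean engines, not calls to any actual MT service.

```python
from typing import Callable, Tuple

# A translator is modeled here as any function from text to text.
Translator = Callable[[str], str]


def pivot_translate(
    text: str,
    source_to_pivot: Translator,
    pivot_to_target: Translator,
) -> Tuple[str, str]:
    """Translate via a pivot language and return (pivot_text, target_text).

    Keeping the intermediate text makes it possible to check whether an
    error was introduced in the first or the second step.
    """
    pivot_text = source_to_pivot(text)
    return pivot_text, pivot_to_target(pivot_text)


# Toy lookup tables standing in for real engines (illustrative data only).
GERMAN_TO_ENGLISH = {"Spiel speichern": "Save game"}
ENGLISH_TO_KOREAN = {"Save game": "게임 저장"}


def german_to_english(text: str) -> str:
    return GERMAN_TO_ENGLISH[text]


def english_to_korean(text: str) -> str:
    return ENGLISH_TO_KOREAN[text]


if __name__ == "__main__":
    print(pivot_translate("Spiel speichern", german_to_english, english_to_korean))
```

Because any error in the first step is carried into the second, pivoted pairs usually deserve extra attention during post-editing.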

Evaluation of the efficacy of a given engine is a critical component of determining whether to add it to the bench of tools.

Pilot or Feasibility Test

The pilot testing phase provides a rigorous framework for making that determination. Training the engine begins with data preparation: the curated TM and existing glossaries are fed into it, which leads to the final MT system application. This MT application is then evaluated with automatic metric scores and human assessment.

The type of human assessment can vary from A/B testing to rating. If time is short or resources are lacking, but the automatic metric scores are encouraging enough, we can move directly to a very heavy post-editing task. With this strategy, we not only make progress on the actual project but also collect valuable feedback for the MTPE consultancy (namely the edit distance score). The goal is the engine’s eventual deployment into a CAT (Computer-Aided Translation) tool.
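An edit distance score measures how much a post-editor had to change the raw MT output. The following is a minimal sketch using the classic Levenshtein distance over words, normalized by the length of the post-edited text. The exact formula used in production is not specified in this article, so the function names and the normalization choice here are assumptions.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn sequence a into sequence b."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete item_a
                    current[j - 1] + 1,      # insert item_b
                    previous[j - 1] + cost,  # substitute or keep
                )
            )
        previous = current
    return previous[-1]


def post_edit_distance(mt_output, post_edited):
    """Word-level edit distance divided by the post-edited length.

    0.0 means the linguist left the MT output untouched; higher values
    mean heavier editing (the value can exceed 1.0 for very long MT output).
    """
    mt_tokens = mt_output.split()
    pe_tokens = post_edited.split()
    if not pe_tokens:
        return 0.0 if not mt_tokens else 1.0
    return levenshtein(mt_tokens, pe_tokens) / len(pe_tokens)


if __name__ == "__main__":
    print(post_edit_distance("Press the button to begin",
                             "Press any button to start"))
```

Collected per segment over a heavy post-editing task, scores like this show where the engine is already usable and where it still needs customization.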

To be clear, these processes are not the final arbiter. Comparisons between MT tools are also a valuable component of the evaluation. Results from several engines are assessed to discover the best engine for a given project. The output is then given a post-editing test to compare the post-editor’s productivity and consider the level of usability of the machine-translated text.
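Once several engines have been scored on the same test set, the comparison itself is a small aggregation step. This sketch assumes each engine has a list of per-segment post-editing effort scores where lower means less editing; the `rank_engines` helper and the engine names in the usage example are hypothetical, and the numbers are illustrative rather than measured results.

```python
from statistics import mean


def rank_engines(scores_by_engine):
    """Return (engine, mean_score) pairs sorted from least to most
    post-editing effort. Engines with no scores are skipped."""
    averages = {
        engine: mean(scores)
        for engine, scores in scores_by_engine.items()
        if scores
    }
    return sorted(averages.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Illustrative per-segment effort scores for two hypothetical engines.
    pilot_scores = {
        "engine_a": [0.40, 0.20, 0.30],
        "engine_b": [0.10, 0.15, 0.05],
    }
    for engine, average in rank_engines(pilot_scores):
        print(engine, round(average, 3))
```

A ranking like this is only one input: human ratings and the post-editors' own feedback still weigh in on the final choice.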

Final Evaluation and Integration

Finally, gauging the data on quality may involve rating fluency and accuracy, annotating the main error types, and garnering human feedback from linguists (professional translators/post-editors specialized in the given domain or genre) for the given language.

None of this work happens in a vacuum, or outside of the developer’s and publisher’s awareness. Indeed, the process is very much a collaboration between the parties involved. Our partners provide any relevant previous content and their quality expectations, and PTW recommends the MT solution best suited to the project, if any. This collaboration also includes the engine customization and quality testing that will make the solution fit the needs of the project. The developer also shares their workflow so that PTW can engineer the best integration, which minimizes any slowdowns that might otherwise occur.

Human Post-Editing

Linguists at PTW are trained in post-editing the output of Neural MT engines. To reduce the time and effort of post-editing, a process called “APE” (Automatic Post-Editing) is applied. APE consists of rules for gender control, redaction, moderation, and other post-processing techniques, applied to the target MT output before the files are sent to our linguists. These rules are customized to the project requirements when style guides or glossaries are available. We can also assist in creating and developing relevant glossaries and style guides for use with semi- or fully customized MT engines.
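Rule-based post-processing of this kind can be pictured as a short chain of text substitutions run on the MT output. The sketch below is an assumption-laden illustration, not the actual APE rule set: `GLOSSARY_FIXES`, `BLOCKED_WORDS`, the e-mail pattern, and the `apply_ape_rules` helper are all hypothetical, and it covers only glossary enforcement, moderation, and redaction.

```python
import re

# Hypothetical project rules (illustrative only).
GLOSSARY_FIXES = {"health points": "HP"}  # non-preferred term -> glossary term
BLOCKED_WORDS = {"darn"}                  # words to mask for moderation
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def apply_ape_rules(text):
    """Apply glossary, moderation, and redaction rules to MT output."""
    # 1. Glossary enforcement: replace non-preferred terms.
    for wrong, preferred in GLOSSARY_FIXES.items():
        text = re.sub(rf"\b{re.escape(wrong)}\b", preferred, text,
                      flags=re.IGNORECASE)
    # 2. Moderation: mask blocked words with asterisks of the same length.
    for word in BLOCKED_WORDS:
        text = re.sub(rf"\b{re.escape(word)}\b", "*" * len(word), text,
                      flags=re.IGNORECASE)
    # 3. Redaction: hide anything that looks like an e-mail address.
    text = EMAIL_PATTERN.sub("[REDACTED]", text)
    # 4. Clean-up: collapse runs of whitespace into a single space.
    return re.sub(r"\s{2,}", " ", text).strip()


if __name__ == "__main__":
    print(apply_ape_rules("You lost 5 health points,  darn!"))
```

In practice the rule tables would be derived from the project's glossary and style guide, so that the linguists start from output that already follows them.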

Our post-editors are trained to provide structured feedback that keeps improving the MT engine, and they are involved in both the set-up and the closure of the project. Game specialists are not always available as post-editors, so it is advisable that your translators have extensive experience both in the register and in the domain or genre. In-game dialogue, for example, may require colloquial speech, a skill often found among audiovisual translators who have worked on TV series or films; and a racing game is, of course, different from an arcade game. As with MT systems, the best human resource must be selected for your MTPE project.

Continuous Improvement

It should be clear at this point that we understand the Machine Translation process to be a marathon rather than a sprint. It is a process that builds upon itself, so constant refinement, and more importantly the time needed to confirm those refinements, is a necessary part of the methodology that produces the best results over time. MT should not be considered an easy solution, as its output needs continuous validation.

We consider the effort and time this process takes to be well worth the investment, as it allows us to provide the highest quality output for our partners while reducing time to market.
