One of the things I’ve become acutely aware of while working on Slatona, and that I never thought about much before, is how often I (and presumably most people) tend to view language and communication as something easy, or rather, as something whose moving pieces we are fully conscious of. We speak, think, and communicate using language all day, almost without effort. We have a thought, and words just seem to appear as we open our mouths. But there’s a lot going on under the surface. Remember that it took most of us about 18 years to become proficient in just a small sliver of our native tongue, and even then “proficient” is a stretch. Even as I write this, I’m often thinking: “Is it ‘I’ before ‘E’ except after ‘C’, or is that adage totally useless? Does a comma go here, or should this be a new sentence? There’s a better word for that, but I can’t think of it right now.” So at least for me, but probably for you too, “proficient” is a stretch, even if we’re only being judged on a sliver of the language.
This post is specifically about why translators often get word usage wrong. It’s hard to give clear examples because, though I run into it often, it’s typically in situations that are specific to the conversation I’m having and who I’m having it with. Otherwise the examples are contrived and jokey, like:
Did you see her duck?
Sure, it can obviously mean multiple things, but it’s not likely something anyone would say, at least not in those exact terms. For a more realistic, but still contrived, example: if you’re a marine biologist talking to someone you know, and you use the word “seal”, the other person will immediately know that you mean the animal, so much so that the other meanings of the word likely won’t even cross their mind. Likewise, if you were a mechanic or a plumber, they would immediately pick up that you were referring to a gasket (unless, of course, you had recently said something that overrode that default. Plumbers do sometimes talk about seals meaning the animal, just typically less often than marine biologists).
We might not think about it, or notice it, but our brains are hyper-aware of context in every conversation we have. You’ll see it if you watch out for it. If someone joins a conversation while someone else is midway through telling a story, the speaker will often slow down and fill in contextual details to catch that person up. Or someone else may interject with something like “Steve is a teacher at the local school” for the newcomer’s benefit, if that piece of information fills in needed context. In every conversation we have, and with every phrase we say, our brains are working out whether everyone there has the context needed to make sense of what we’re about to say, and if not, rewording things on the fly.
So why do ML models struggle so much, when we’re able to do it almost effortlessly? Well, there’s the obvious reason: some context, maybe even most of it, was established days, weeks, or years before the conversation. Your spouse likely doesn’t have to keep reminding you of their occupation, hometown, relationship with their family, or personal interests. Your brain can accurately decipher the sounds they make with their mouth based on a long history of context. Obviously ML translation models don’t have that information (it would be worrying if they did), but even when all the context needed is included in a given text, they still struggle.
The main reason comes down to how ML models work, and how they’re trained. ML models can’t translate arbitrarily long pieces of text. There’s no model that I know of that can translate the first chapter of Harry Potter in one go. That’s not to say you can’t translate that text, of course you can, but behind the scenes it’s being broken up into sections that are translated separately. ML models are often trained on sentence pairs, one in the source language and one in the target language, and a typical limit might be 512 tokens, which equates to roughly 400 words. That’s the most that particular model can translate at once, so any text longer than that needs to be broken up, often at the sentence level. These sentences, for performance reasons, are often batched together, so a model may process 10 sentences in a batch, which could all be from the same document you want translated, or could be 10 different sentences from 10 different people. So the ML model not only lacks all the context you have about the text, it doesn’t even get the context of the previous sentence or the following sentence.
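To make the splitting and batching a bit more concrete, here’s a rough sketch in Python. It’s only an illustration under my own assumptions: the sentence splitter is deliberately naive, translate_sentence is a hypothetical stand-in rather than a real model call, and the batch size of 3 is arbitrary. It isn’t how any particular translation service is implemented.

```python
import re

def split_sentences(text):
    """Naive splitter (an assumption for illustration): break after ., ! or ?
    followed by whitespace. Real systems use more careful segmentation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def make_batches(documents, batch_size):
    """Flatten every document into (doc_id, sentence) pairs, then group the
    pairs into fixed-size batches. Sentences from different documents can
    end up side by side in the same batch."""
    queue = [
        (doc_id, sentence)
        for doc_id, text in documents.items()
        for sentence in split_sentences(text)
    ]
    return [queue[i:i + batch_size] for i in range(0, len(queue), batch_size)]

def translate_sentence(sentence):
    """Hypothetical stand-in for a model call. Note what it receives:
    one sentence, with no neighbours and no idea who wrote it."""
    return f"<translated: {sentence}>"

documents = {
    "alice": "I work at the aquarium. The seal is broken.",
    "bob": "The sink leaks. Did you check the seal?",
}

batches = make_batches(documents, batch_size=3)
for batch in batches:
    for doc_id, sentence in batch:
        print(doc_id, translate_sentence(sentence))
```

By the time “The seal is broken.” reaches translate_sentence, the aquarium sentence that would have settled which “seal” is meant is no longer attached to it.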
Each sentence is left to fend for itself. If it doesn’t contain all the context needed to pick the right words, the model will use whatever makes the most sense within that sentence. It does a pretty good job, but even if it gets 19 out of 20 words right, which is a high batting average, it won’t take long for you to hit the 20th word, where it fails.
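Here’s a quick back-of-the-envelope version of that arithmetic in Python. The 19-out-of-20 figure is just the illustrative number from above, not a measurement, and I’m assuming each word is right or wrong independently of the others, which real translation errors certainly aren’t. The function names are mine.

```python
def prob_no_errors(per_word_accuracy, n_words):
    """Chance that every one of n_words comes out right, assuming each word
    is independently correct with probability per_word_accuracy."""
    return per_word_accuracy ** n_words

def expected_words_until_first_error(per_word_accuracy):
    """Average number of words you get through up to and including the first
    wrong one (the mean of a geometric distribution)."""
    return 1.0 / (1.0 - per_word_accuracy)

accuracy = 19 / 20  # illustrative figure from the text, not measured

for n_words in (10, 20, 50, 100):
    print(n_words, "words:", round(prob_no_errors(accuracy, n_words), 3))

print("words until first error:", round(expected_words_until_first_error(accuracy), 1))
```

Under those assumptions, a 20-word message comes through with no wrong words only a little over a third of the time, and on average the first mistake shows up around word 20.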
Sometimes these errors are small. Sometimes the person on the other end can fill in the gaps if there’s an obvious word that makes more sense given the context. But again, even if that’s true 90% of the time, it doesn’t take long until a word gets translated wrong in a way that’s logical given the text, but not at all what you said. A typo like “noe”, when you meant to write “now”, may get corrected and translated as “not”, which completely flips the meaning. Or a word with a nuanced, idiomatic meaning in your language may end up meaning something completely different, depending on how the translator decided to translate it.
Slatona aims to solve this problem. Translating is often like playing the game of telephone. You pass a message to the translator, and they pass it on to the person you’re communicating with, but you usually have no chance to double-check that your intent is being represented accurately, and just about every conversation worth having is worth having your meaning conveyed accurately. Slatona gives you the ability to take a peek into the translation and get insight into what it’s really saying, as if you yourself spoke that language. It lets you send a message to someone knowing that it says what you want it to say, instead of hoping the other person will fill in the gaps correctly.