How Does Machine Translation Work? Part 3: Final Words

If you haven’t read Part 1 or Part 2, start there.

So now we have a number (or a number-like thing, a matrix) that represents a word and its relationship to other words. This was our sticking point in the beginning: how do we plot a word or a sentence? I’m going to skip over getting from words to sentences; that’s just more math and numbers.

Now that we have a number, we can plot things on our graph. First we need a bunch of coordinates to draw our line through, so we gather English/Spanish sentence pairs like That is a dog./Eso es un perro., along with a whole bunch of others. In reality, ML models are trained on thousands or millions of paired sentences, because the actual equations are much more complex and multi-layered, so lots of data is needed to draw an accurate line.
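To make the "draw a line through the coordinates" idea concrete, here is a minimal sketch in Python. It assumes each sentence has already been squeezed down to a single made-up number, which is a huge simplification of the matrices described above, and it uses an ordinary least-squares fit as a stand-in for "drawing the line". The numbers in pairs and the function name fit_line are invented for illustration.

```python
# Toy illustration only: every sentence is represented by ONE made-up number.
# Real translation models work with large matrices, not single values.
pairs = [
    (1.0, 3.1),  # hypothetical numbers for "That is a dog." / "Eso es un perro."
    (2.0, 4.9),  # the remaining pairs are equally made up
    (3.0, 7.2),
    (4.0, 8.8),
]


def fit_line(points):
    """Least-squares fit of y = slope * x + intercept through the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # How x and y vary together, divided by how much x varies on its own.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(pairs)
print(f"spanish ~= {slope:.2f} * english + {intercept:.2f}")
```

With only four points the line is a rough guess; adding more pairs is what pins it down, which is the same reason real models need so much data.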

Back to our simplified version: we start by plotting our sentence pairs, then draw a line through the middle of them. Next we take a new English sentence, find its matrices, and follow them up to the line to get the Spanish word matrices. Replace those with the Spanish words and we have a translated sentence. Obviously, in reality it’s many, many, many different equations, much more complex than plotting a couple of coordinates and drawing a line through them.
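The lookup step can be sketched the same way. This is a toy, not how a real translator is built: it assumes every word has a single made-up number, that a line has already been drawn (the SLOPE and INTERCEPT values are simply picked by hand here), and that "replace those with the Spanish words" means choosing the Spanish word whose number is closest. All names and values are hypothetical.

```python
# Made-up one-number codes for a few words; real models use matrices.
english_codes = {"that": 1.0, "is": 2.0, "a": 3.0, "dog": 4.0}
spanish_codes = {"eso": 3.0, "es": 5.0, "un": 7.0, "perro": 9.0}

# Assumption: a line was already fitted through the training pairs, giving
# spanish = SLOPE * english + INTERCEPT. These values are chosen by hand.
SLOPE, INTERCEPT = 2.0, 1.0


def translate_word(word):
    """Follow the English number up to the line, then pick the nearest Spanish word."""
    x = english_codes[word]
    y = SLOPE * x + INTERCEPT
    return min(spanish_codes, key=lambda w: abs(spanish_codes[w] - y))


def translate(sentence):
    """Translate word by word; real systems look at the whole sentence at once."""
    return " ".join(translate_word(w) for w in sentence.lower().split())


print(translate("That is a dog"))
```

Going word by word only works here because the example sentence has the same word order in both languages, which is part of why the real thing needs so many more equations.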

So there you have it: a very simplified, layman’s explanation of machine translation, and of machine learning/AI in general. There’s so much more to it, but at least this should give you some idea of what’s happening when you type a sentence into a translator and it takes a moment before coming back with a translation: it’s doing complicated matrix stuff. It still might seem like computer magic, and really, it is, but at least you know a little about how that magic trick works.