How Does Machine Translation Work? Part 1: Explain Like I'm 5

You might be curious about how machine translation works, but not so curious that you want to learn all the math and technical bits, so I’ll try to give a basic, layman’s explanation. (Of course, that means glossing over lots of things and oversimplifying many others.)

We’ll start with a very small piece of math: X,Y coordinates. You probably plotted some of them at school. Your teacher probably used fancy math terms like “Cartesian plane” and “axis”. What we care about is that these coordinates can be plotted, and that they have a relationship. Each one has a specific place. (4, 1) is four units to the right of center and one unit up. If you plot enough of these points, and they all come from the same source (or more accurately, the same function), you can do something quite interesting. Take the coordinate (9, ?): we can figure out what the ? should be. First we take all the points we already plotted and draw a line through them (or as close to them as we can, if they don’t line up perfectly). Then we go nine units to the right. We don’t know how far up to go, because there was just a ? there, so we go up until we hit the line we drew. Whatever number on the Y axis is level with that point is our Y value (or at least a pretty good estimate of it).

[Illustration: graph of a Cartesian plane (X,Y plane)]
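If you’d like to see the line-drawing trick as code, here is a minimal Python sketch. The points are made up for illustration (only (4, 1) comes from the text above), and the function names fit_line and estimate_y are my own. It uses the standard “least squares” recipe for finding the line that passes as close as possible to all the points, which is one common way to do it, not the only one.

```python
def fit_line(points):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # How much x and y move together, and how much x moves on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_y(points, x):
    """Go over to x, then up (or down) until we hit the fitted line."""
    slope, intercept = fit_line(points)
    return slope * x + intercept


# Made-up example points; they happen to sit exactly on one line.
points = [(2, 0), (4, 1), (6, 2), (8, 3)]

# Fill in the "?" in (9, ?).
print(estimate_y(points, 9))  # prints 3.5
```

If the points didn’t line up perfectly, the same code would still work. It would just give the line that comes closest to all of them, and the answer would be an estimate rather than an exact value.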

This is, in a very simplified way, how machine translation, and machine learning (ML) in general, works. If I have 100 pairs of English and Spanish sentences, I can plot them and draw a line through those points. I can then take a new English sentence, follow it up until I hit the line, and whatever Spanish sentence is there is the translation. Your first question might be, “How do you plot a sentence on a graph?” That’s a very good question. It turns out you can’t, at least not directly. In part 2 I’ll discuss how words are turned into numbers, and it’s not as boring as that may sound. It involves The Matrix!

Part 2 is next.