Machine Translation
Why translation is hard
Translating isn't swapping each word for its dictionary match. Word order flips, one word becomes several (or none), and meaning depends on context. "The red car" becomes "la voiture rouge" in French — the adjective moves after the noun. A good translator has to reorder, not just substitute.
Rule-based: Early systems used dictionaries and hand-written grammar rules, which were brittle and endless to maintain.
Statistical: Later systems learned phrase alignments and probabilities from millions of translated sentence pairs.
Neural: One network reads the source and generates the target, learning alignment on its own. This is today's standard.
From word-swap to alignment
A literal word-by-word swap of "The red car" keeps the English word order and produces broken French, with "rouge" placed before "voiture". An alignment instead maps each target word to the right source word, including the reordering that attention learns automatically.
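To make the contrast concrete, here is a minimal sketch of both steps on the "The red car" example. The tiny `LEXICON`, the hand-written alignment `[0, 2, 1]`, and the function names are assumptions made up for this illustration. Note that the dictionary hard-codes "la" for "the"; choosing the right article really depends on the gender of the noun, which a plain word swap cannot know.

```python
# Toy English-to-French dictionary (hypothetical, for illustration only).
LEXICON = {"the": "la", "red": "rouge", "car": "voiture"}


def word_swap(source_words):
    """Replace each source word in place, keeping the source word order."""
    return [LEXICON[w] for w in source_words]


def translate_with_alignment(source_words, alignment):
    """Build the target by following an alignment.

    alignment[j] is the index of the source word that target position j
    is translated from, so the reordering is encoded in the index order.
    """
    return [LEXICON[source_words[i]] for i in alignment]


source = ["the", "red", "car"]
literal = word_swap(source)                            # keeps English order
aligned = translate_with_alignment(source, [0, 2, 1])  # noun before adjective

print(" ".join(literal))  # la rouge voiture
print(" ".join(aligned))  # la voiture rouge
```

Here the alignment is written by hand. The point of the later sections is that a neural model learns an equivalent mapping from data.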
Modern MT, in brief
Neural machine translation (NMT) is a sequence-to-sequence (seq2seq) model: an encoder reads the source, and a decoder writes the target. At every output word, attention lets the decoder weigh all the source words and focus on the ones that matter. This is a soft version of the alignment shown above, learned from data rather than written by hand. Today's translators are Transformers doing this at scale.
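The weighting step can be sketched in a few lines of plain Python. This is a simplified illustration that uses scaled dot-product scores followed by a softmax. The vectors in `keys` and `queries` are invented so that the result is easy to read; in a real model they are learned, and a Transformer runs many such attention heads in parallel.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Score one decoder query against every source key, then normalize."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


# Hypothetical encoder outputs, one vector per source word.
source = ["the", "red", "car"]
keys = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]

# Hypothetical decoder states, one vector per target word being generated.
target = ["la", "voiture", "rouge"]
queries = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]

alignment = []
for word, query in zip(target, queries):
    weights = attention_weights(query, keys)
    best = max(range(len(weights)), key=weights.__getitem__)
    alignment.append(best)
    print(word, [round(w, 2) for w in weights], "->", source[best])

print(alignment)  # [0, 2, 1]
```

Each row of weights is a soft alignment: every source word gets some weight, and the largest weight marks the source word the decoder focuses on. Taking the largest weight per target word recovers the reordering, with "voiture" pointing at "car" and "rouge" pointing at "red".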
Word-by-word swap:
- Ignores word order
- Can't handle one-to-many words
- No context or agreement
Neural MT with attention:
- Reorders fluently
- Uses full-sentence context
- Learns alignment without being told