
Machine Translation

Machine Translation (MT) automatically translates text or speech between languages. It is a foundational NLP task that has evolved from rule-based systems through statistical methods to neural approaches that achieve near-human quality for many language pairs. The challenge extends well beyond word-for-word substitution: languages differ in word order, express concepts differently, use gendered forms, embed cultural context, and contain ambiguous words that require context to translate correctly.

Rule-based MT relied on hand-crafted linguistic rules and dictionaries but could not cope with the full complexity of natural language. Statistical MT learned translation probabilities from parallel corpora but struggled with long-range dependencies.

Neural Machine Translation (NMT), particularly transformer-based approaches, transformed the field by learning continuous representations that capture semantic similarity across languages. Sequence-to-sequence architectures with attention encode source sentences into representations that guide generation in the target language. Multilingual models such as mBART and NLLB handle translation between many languages with a single model, and zero-shot translation between low-resource language pairs leverages knowledge transferred from higher-resource pairs.

Evaluation commonly uses BLEU scores, which compare n-gram overlap between system output and human reference translations. Applications include Google Translate, professional translation assistance, cross-lingual search, and document localization.
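To make the statistical-MT idea concrete, here is a minimal sketch of how translation probabilities can be learned from a parallel corpus via expectation-maximization, in the style of IBM Model 1. The three-sentence English-French corpus, the variable names, and the iteration count are illustrative assumptions, not from this article:

```python
from collections import defaultdict

# Toy parallel corpus (English -> French); purely illustrative data.
corpus = [
    ("the house".split(), "la maison".split()),
    ("the book".split(), "le livre".split()),
    ("a house".split(), "une maison".split()),
]

# Initialize t(f|e) uniformly over the French vocabulary.
f_vocab = {f for _, fs in corpus for f in fs}
t = defaultdict(lambda: 1.0 / len(f_vocab))

for _ in range(10):  # EM iterations
    count = defaultdict(float)   # expected counts for (f, e)
    total = defaultdict(float)   # normalizer per source word e
    # E-step: spread each target word's count over candidate source words,
    # proportional to the current translation probabilities.
    for es, fs in corpus:
        for f in fs:
            norm = sum(t[(f, e)] for e in es)
            for e in es:
                c = t[(f, e)] / norm
                count[(f, e)] += c
                total[e] += c
    # M-step: re-estimate t(f|e) from the expected counts.
    t = defaultdict(lambda: 1.0 / len(f_vocab),
                    {(f, e): c / total[e] for (f, e), c in count.items()})

print(f"t(maison|house) = {t[('maison', 'house')]:.3f}")  # dominates after EM
```

Because "house" co-occurs with "maison" in two sentence pairs while "la" and "une" are explained away by "the" and "a", the probability mass concentrates on the correct word pair, despite the model never seeing word alignments.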
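The attention mechanism mentioned above can be sketched in a few lines: at each decoding step, a query from the decoder is compared against the encoder's representations of every source position, and the resulting softmax weights pick out which source words inform the next target word. The sizes and the deterministic toy vectors below are assumptions for illustration:

```python
import numpy as np

def attend(q, K, V):
    """Scaled dot-product attention: one decoder query q attends
    over encoder states (rows of K), returning a weighted mix of V."""
    scores = (K @ q) / np.sqrt(K.shape[-1])  # similarity to each source position
    w = np.exp(scores - scores.max())
    w /= w.sum()                             # softmax -> attention distribution
    return w @ V, w

# Toy encoder output: 5 source positions, dimension 8 (made-up sizes).
K = 3.0 * np.eye(5, 8)             # mutually orthogonal "encoder states"
V = np.arange(40.0).reshape(5, 8)  # values carried forward to the decoder
q = K[2].copy()                    # a decoder query aligned with position 2

context, w = attend(q, K, V)
print(np.round(w, 3))  # attention mass concentrates on source position 2
```

In a real transformer this happens in parallel for every query, across multiple heads and layers, with K, V, and q produced by learned projections rather than fixed vectors.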
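BLEU itself is simple enough to sketch: it combines clipped n-gram precisions (each candidate n-gram is credited at most as often as it appears in the reference) with a brevity penalty for overly short output. This is a simplified single-reference, unsmoothed version; production evaluation would normally use a standard implementation such as sacrebleu:

```python
import math
from collections import Counter

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Sentence-level BLEU: geometric mean of modified n-gram
    precisions times a brevity penalty (simplified, no smoothing)."""
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        # Clipped counts: credit each n-gram at most its reference frequency.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = max(sum(cand_ngrams.values()), 1)
        if overlap == 0:
            return 0.0  # unsmoothed BLEU collapses when any precision is zero
        log_prec += math.log(overlap / total) / max_n
    # Brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(log_prec)

ref = "the cat is on the mat"
print(bleu("the cat is on the mat", ref))  # perfect match -> 1.0
print(bleu("the cat is on a mat", ref))    # partial overlap -> between 0 and 1
```

BLEU correlates only loosely with human judgments of fluency and adequacy, which is why it is typically reported at the corpus level over many sentences rather than per sentence.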