Meet the Chuvash language in Yandex.Translate: how we solve the main problem of machine translation

Russia's regions are more than borders on a map. Each has its own cultural traditions, and many have their own languages. Machine translation could help preserve these languages and put them to use, for example for publishing articles on Wikipedia. But what if there is not enough data to train the model?


Today we will describe our approach using the example of Chuvash, which we have added to Yandex.Translate. According to the latest census, more than a million people consider it their native language.




But first, a little background.





, . . , . , , 250 . . , , .




. : , , , , , , — . . , . , «»: gümüş — өө — ө — үі — үү — gümüş — kumush — ӗӗ. , , , «» . SMT, / — .



, . . . , back-translation. ?




, , , «» . tagged back-translation: «» — (pun intended) - .
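Tagged back-translation, named above, is usually described as follows: monolingual sentences in the target language are machine-translated back into the source language, and each synthetic source sentence is marked with a reserved token so the model can tell it apart from genuine parallel data. The sketch below illustrates only that data-preparation step. The tag string `<BT>`, the function names, and the toy reverse translator are all assumptions made for illustration; this is not the pipeline used in Yandex.Translate.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical reserved token; a real system would also add it to the vocabulary.
BT_TAG = "<BT>"


def build_training_corpus(
    parallel: Iterable[Tuple[str, str]],
    monolingual_target: Iterable[str],
    reverse_translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Combine genuine (source, target) pairs with tagged synthetic pairs."""
    # Genuine parallel data is kept unchanged.
    corpus = [(src, tgt) for src, tgt in parallel]
    for tgt in monolingual_target:
        # The synthetic source comes from a target-to-source model.
        synthetic_src = reverse_translate(tgt)
        # The tag marks the source side as machine-generated.
        corpus.append((f"{BT_TAG} {synthetic_src}", tgt))
    return corpus


def toy_reverse_translate(sentence: str) -> str:
    """Stand-in for a real target-to-source model (assumption)."""
    return sentence.upper()


corpus = build_training_corpus(
    parallel=[("source sentence", "target sentence")],
    monolingual_target=["monolingual target sentence"],
    reverse_translate=toy_reverse_translate,
)
```

At inference time the tag is simply not added, so the model translates real input the way it learned to from the genuine parallel pairs.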



Source: https://habr.com/ru/post/undefined/

