Renat Bikmatov, Sergey Nozhenko
In an article published on the Kazakhstan Government portal on April 12, President Nursultan Nazarbayev said that the Kazakh language would soon be switched to the Latin alphabet.
Events are expected to move extremely quickly: everything necessary for implementing the idea must be ready by the end of 2017; in 2018 there are plans to develop a unified standard of the new Kazakh alphabet, to start training specialists in the Latin alphabet and to begin translating all school books; and by 2025, all business documentation, newspapers, magazines and books must be converted.
One of the tasks that language experts already face is planning and preparing the process. They must assess all currently available content in Cyrillic characters and set priorities. What exactly will need to be translated besides school books, documents, books and newspapers? What are the criteria for selecting the things that should be translated first, and what can be postponed? How will the exact deadlines for each group be determined?
Such rapid changes involve not only numerous technical and linguistic complexities but also a daunting budget (about $300 million). Moreover, these funds will not be easy to obtain and use, since participation in government tenders is very unattractive because of price dumping. There are also difficulties connected with the way processes are designed within the translation industry.
An important challenge at the early stage is assessing the scope of work in all areas. It will depend very heavily on the method chosen to convert Kazakh script into Latin. Will it be plain transliteration by hard-and-fast rules that can be automated, or will a new spelling of the words be specified in some more complicated way? In a bad scenario, word lists and glossaries have to be rewritten from scratch; worse still is the case where the grammar rules that regulate word-form generation in general, and declension endings in particular, are changed.
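If the conversion does turn out to be plain rule-based transliteration, it can be automated quite easily. The Python sketch below illustrates the idea only. The mapping table is purely hypothetical, since the official letter correspondences have not been published, and the names SAMPLE_TABLE and transliterate are ours.

```python
# Hypothetical Cyrillic-to-Latin pairs, for illustration only.
# The official Kazakh Latin alphabet may map these letters differently.
SAMPLE_TABLE = {
    "а": "a", "е": "e", "к": "k", "қ": "q", "л": "l",
    "м": "m", "п": "p", "т": "t", "ш": "sh",
}


def transliterate(text, table=SAMPLE_TABLE):
    """Replace each known Cyrillic letter by its Latin counterpart.

    Characters missing from the table (digits, spaces, punctuation,
    unmapped letters) are copied unchanged.
    """
    out = []
    for ch in text:
        lower = ch.lower()
        latin = table.get(lower)
        if latin is None:
            out.append(ch)
        elif ch != lower:
            # Uppercase source letter: capitalize the Latin replacement.
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return "".join(out)


print(transliterate("Қала"))         # Qala
print(transliterate("мектеп 2017"))  # mektep 2017
```

A table-driven approach like this only works if every Cyrillic letter always converts the same way; context-dependent spelling rules would require a dictionary or a rule engine on top of it.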
At this point, introducing the Latin alphabet does not eliminate the use of Russian in Kazakhstan as an official language, and for some time Cyrillic and Latin will be used in parallel. This means that we will have to wait until international standards bodies approve the appropriate encoding and CAT developers implement support for what is effectively a new language: “Kazakh (Latin)”.
Conversion of all existing linguistic databases (translation memories and glossaries) and basic instructions (style guides) is a separate task. After all, if this isn't done in advance, it is easy to forget about script conversion when old translations are reused from a TM.
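For translation memories stored in the TMX exchange format, such a batch conversion could look roughly like the Python sketch below. It is a sketch under assumptions: the function name convert_tmx is ours, the conversion routine is passed in as a parameter because the real rules are not yet known, the sample letter mapping is hypothetical, and we assume the converted segments would be tagged "kk-Latn".

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def convert_tmx(tmx_text, convert, source_lang="kk", target_lang="kk-Latn"):
    """Apply `convert` to every segment of the given language in a TMX string."""
    root = ET.fromstring(tmx_text)
    for tuv in root.iter("tuv"):
        if tuv.get(XML_LANG) != source_lang:
            continue
        for seg in tuv.iter("seg"):
            if seg.text:
                seg.text = convert(seg.text)
            # Text that follows inline tags is translatable too;
            # the content of the inline tags themselves is left alone.
            for child in seg:
                if child.tail:
                    child.tail = convert(child.tail)
        tuv.set(XML_LANG, target_lang)
    return ET.tostring(root, encoding="unicode")


SAMPLE_TMX = (
    '<tmx version="1.4"><body><tu>'
    '<tuv xml:lang="en"><seg>city</seg></tuv>'
    '<tuv xml:lang="kk"><seg>қала</seg></tuv>'
    '</tu></body></tmx>'
)

# Hypothetical three-letter mapping, just enough for the sample segment.
sample_map = str.maketrans({"қ": "q", "а": "a", "л": "l"})
converted = convert_tmx(SAMPLE_TMX, lambda s: s.translate(sample_map))
```

Glossaries and style guides would need the same treatment, and the old Cyrillic TM should be kept for as long as both scripts are in use.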
This will be followed by adaptation of content management systems (CMS), translation tools and, most importantly, all best practices in automatic translation quality control, beginning with automatic spell check tools.
Sorting is also an important task. All previously sorted indexes and indicators must be re-sorted according to the order of letters in the new alphabet.
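Re-sorting is not just a matter of calling the default sort, because default string comparison goes by code point and places letters with diacritics after "z". A minimal Python sketch follows. The letter order below is a hypothetical Turkish-like order chosen for illustration, not the official one, and the sample words are illustrative spellings.

```python
import unicodedata

# Hypothetical alphabet order, for illustration only.
HYPOTHETICAL_ORDER = unicodedata.normalize(
    "NFC", "aäbcçdefgğhıijklmnñoöprsştuüvyz"
)
RANK = {ch: i for i, ch in enumerate(HYPOTHETICAL_ORDER)}


def sort_key(word):
    """Map a word to a list of letter ranks in the chosen alphabet order.

    Characters outside the alphabet share one rank after the last letter.
    """
    normalized = unicodedata.normalize("NFC", word.lower())
    return [RANK.get(ch, len(RANK)) for ch in normalized]


words = ["zat", "ana", "äke", "bala"]
by_alphabet = sorted(words, key=sort_key)
print(by_alphabet)  # ['ana', 'äke', 'bala', 'zat']
```

In production, locale-aware collation from a library such as ICU would be preferable, but only once the locale data for the new alphabet exists.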
Finally, the existence of specific letters in the modern alphabet implies the use of a large number of diacritics: marks above, below or within letters. But who will code the spell checking program for the new language, and how quickly can it be done?
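Diacritics matter to tools because Unicode can represent the same accented letter either as a single precomposed character or as a base letter followed by a combining mark. A spell checker or TM lookup that compares strings without normalizing them first will treat the two forms as different words. The Python sketch below shows the effect; the helper names and the sample spelling are ours and purely illustrative.

```python
import unicodedata


def canonical(text):
    """Return the text in NFC form (precomposed where possible)."""
    return unicodedata.normalize("NFC", text)


def same_word(a, b):
    """Compare two strings after normalizing both to NFC."""
    return canonical(a) == canonical(b)


def strip_marks(text):
    """Remove combining marks, e.g. for a fuzzy, accent-insensitive search."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


precomposed = "\u00e4ke"  # "ä" stored as one code point
combining = "a\u0308ke"   # "a" followed by a combining diaeresis

print(precomposed == combining)           # False: the raw strings differ
print(same_word(precomposed, combining))  # True after normalization
print(strip_marks(precomposed))           # ake
```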
If you look more closely, from a technical point of view, without details about the structure of the Kazakh script, it is difficult to judge what processes may still be necessary. Provided that the Kazakh alphabet contains 28 letters, as stated, does this mean that some non-spacing letters with diacritical marks will appear beside them? Or will some Cyrillic letters correspond to combinations of several Latin ones? In the latter case, another problem will be added to the simple substitution of some symbols for others during conversion of the texts: abbreviations. For example, if a capital Cyrillic letter is replaced with a pair of Latin ones, then what should we do when it is part of an abbreviation? Do we just replace it with the first Latin letter only? Or with two capital Latin letters? With a first capital and second lower case letter? However, even that is not the hardest thing. It will be worse if it turns out like the Bosnian alphabet: transliteration from Cyrillic to Latin is made partially by lexical rules. This means that some combinations of letters are converted differently, depending on the context, in each particular case.
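The three options for abbreviations can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the mapping (including the digraph "sh"), the sample abbreviation, the style names and the function convert_word. A word is treated as an abbreviation simply when it is longer than one letter and entirely uppercase.

```python
# Hypothetical mapping with one digraph, for illustration only.
SAMPLE_TABLE = {"ш": "sh", "қ": "q", "а": "a", "р": "r"}


def convert_word(word, table=SAMPLE_TABLE, abbreviation_style="upper"):
    """Convert one word, handling capital letters that map to digraphs.

    abbreviation_style applies to all-uppercase words only:
      "first" keeps only the first Latin letter, capitalized (Ш -> S)
      "upper" capitalizes both letters of the digraph (Ш -> SH)
      "title" capitalizes the first letter only (Ш -> Sh)
    """
    is_abbreviation = len(word) > 1 and word.isupper()
    out = []
    for ch in word:
        latin = table.get(ch.lower())
        if latin is None:
            out.append(ch)  # unmapped character, copied unchanged
        elif ch.islower():
            out.append(latin)
        elif not is_abbreviation or abbreviation_style == "title":
            out.append(latin.capitalize())
        elif abbreviation_style == "first":
            out.append(latin[0].upper())
        else:
            out.append(latin.upper())
    return "".join(out)


print(convert_word("ШҚ", abbreviation_style="first"))  # SQ
print(convert_word("ШҚ", abbreviation_style="upper"))  # SHQ
print(convert_word("ШҚ", abbreviation_style="title"))  # ShQ
print(convert_word("Шар"))                             # Shar
```

Whichever option the linguists choose, it has to be applied consistently, because an automatic check cannot reliably tell an abbreviation from an ordinary word typed in capitals.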
By the way, from our point of view, the Uzbeks solved this problem with the alphabet in the most reasonable manner. They didn't introduce any additional characters in their new Latin script, eliminating even those that had been there in prewar times. As a result, there are fewer characters than in English, plus one back quote and several letters composed of a pair of Latin ones. Thanks to that, nearly all character sets and keyboards can be used for the script.
Due to all the above-mentioned features, it is essential to get all the necessary information from Kazakh linguists as early as possible in order to readjust all translation systems.
This task still has too many unknowns, but specialists of Logrus Global LLC continue to monitor the situation. We will cover the questions that can be clarified in our subsequent publications.
* The first Kazakh Cyrillic alphabet was introduced in the 19th century by Ibrayem (Ybrai) Altynsarin. Orthodox missionaries had developed the Kazakh “missionary” alphabet in Cyrillic characters, using specific ligatures and diacritical marks, for religious publications and “Russian native” schools. This alphabet was used marginally until 1917.
In the 1920s, Soviet Turkologists developed the Uniform Turkic Alphabet (Janalif; in Kazakh, Жаңаліп) based on Latin script. It was used officially, with some modifications specific to the Kazakh language, from 1929 to 1940, when it was replaced by the Cyrillic alphabet.
The modern Kazakh Cyrillic alphabet is used in Kazakhstan and Mongolia. This alphabet, which was developed by S. A. Amanzholov and adopted in 1940, contains 42 letters, 33 of which are from the Russian alphabet and 9 of which are specific to the Kazakh language: Ә, Ғ, Қ, Ң, Ө, Ұ (Ӯӯ was used instead of the last one until 1957), Ү, Һ, І. Originally, the Kazakh letters were placed after the Russian alphabet, but later each of them was moved to follow the Russian letter with a similar sound.
The future version of the alphabet is as close as possible to the Turkish alphabet. The number of letters in it is significantly reduced: there are 28. Specialists promise that any Kazakh word and any Kazakh text in this version of Latin script will be absolutely clear and readable.