Encyclopedia of Linguistics

Sample Entry: Language


The South Slavic languages--Serbo-Croatian, Slovene, Bulgarian, and Macedonian--descend from Slavic dialects that were brought to the sub-Alpine and Balkan regions of southeastern Europe ca. 500 CE by waves of westward migration along and across the Danube, Drava, and Sava River systems. In their new territory, the South Slavs encountered and undoubtedly mixed with Latin-speaking peoples, probably descendants of older Indo-European-speaking peoples, such as the Illyrians and Thracians.

The exact relationships among the dialects at the time of settlement are uncertain, but we do know that there existed at that time no nascent Slovene, Serbo-Croatian, Macedonian, or Bulgarian dialects. Rather, these formed over the subsequent millennium. The South Slavic group may now be defined by its geographical discontinuity from the remainder of the Slavic-speaking world. To the north, Slovene is bounded by Friulian and Italian in Italy, by German in Austria, and by Hungarian in Hungary. Croatian and Serbian are also bounded by Hungarian as well as Romanian (Romania). Bulgarian is bounded by Romanian, and each is separated from Ukrainian by the Black Sea.

Within the South Slavic branch two sub-groups are distinguished: Western South Slavic (constituted by Slovene and Serbo-Croatian) and Eastern South Slavic (constituted by Macedonian and Bulgarian). The languages are also divided along cultural and religious lines: Slovene and Croatian are spoken predominantly by Catholics, while Serbian, Macedonian, and Bulgarian are spoken mostly by Eastern Orthodox Christians. These divisions have determined the choice of alphabet, Latin being chosen in Catholic areas, and Cyrillic (a modified variety of the Greek alphabet) in Eastern Orthodox areas. Bosnia, which has been religiously and ethnically mixed and includes also a significant Muslim population, has vacillated among different alphabets. Since the disintegration of Yugoslavia, the standard Bosnian of Muslims is written in the Latin alphabet, whereas the Bosnian Serbs use Cyrillic.

As with most Indo-European languages, the South Slavic group is characterized by many grammatical endings, with nouns and verbs changing form depending upon their position in the sentence, or their function as subjects or objects, singulars or plurals. Slovene and Serbo-Croatian go with the rest of the Slavic-speaking world in having preserved most of these endings in nouns, but verbs have become somewhat simplified. Macedonian and Bulgarian show the opposite pattern: simplified nouns but more complicated verbs.


Serbo-Croatian is spoken by approximately 16 million people. It is the state language of the Republic of Croatia (where it is called Croatian, hrvatski jezik), Bosnia and Herzegovina (where it is called Bosnian, bosanski jezik), and Yugoslavia (where it is called Serbian, srpski jezik); minority speakers are found also in Italy, Hungary, Austria, Romania, Bulgaria, and Macedonia.

The Serbo-Croatian standard was formed in the 19th century as a compromise among Serbs and Croats, whose major dialect divisions and corresponding divergent literary traditions, particularly in the Croatian case, had fostered disunity. The Čakavian and Kajkavian dialects, both spoken in Croatian ethnic territories, and both having developed into sophisticated literary vehicles during the Renaissance and Reformation, respectively, were abandoned as models for the standard language in favor of the Štokavian dialect, spoken in Croatia and all of Serbia as well as Bosnia and Herzegovina and Montenegro. In Serbia, the new Štokavian-based standard replaced the artificial Slaveno-Serbian literary language, which was based largely on Old Church Slavic. The compromise, which was engineered by intellectuals around the Croat Ljudevit Gaj and the Serb Vuk Karadžić, was codified in the Literary Agreement of 1850. The standard had two varieties, the Croatian (or Western) written in a modified Latin alphabet, and the Serbian (or Eastern) in a modified Cyrillic. This standard persisted officially as the language of the Croats, Serbs, and (Bosnian and Sandžakian) Muslims, as well as the de facto lingua franca of Yugoslavia, until the disintegration of the state in 1991. Since then, separate Croatian, Serbian, and Bosnian state languages (the last using the same alphabet as Croatian and having a relatively higher number of Turkish and other Islamic cultural borrowings) have been cultivated, each continuing from its inherited Štokavian-based precursor; all three standard languages remain almost completely mutually intelligible. (For this reason "Serbo-Croatian" persists as a linguistically valid term, referring to the speech territory and the common base of the separate languages collectively. However, it is no longer considered an acceptable term by most lay speakers or by the governments of the successor states.) Other regional movements, including notably a Montenegrin one, suggest the possibility of forming further standard languages in the future.

The Serbo-Croatian speech territory is characterized by three distinct dialect areas, Čakavian, Kajkavian, and Štokavian, each labeled by both professionals and the laity with its word meaning "what" (ča, kaj, and što, respectively). A transitional zone called the Torlak group displays features of both Štokavian and neighboring Macedonian and is thus arguably within the scope of the Balkan Sprachbund, an area of linguistic convergence among distantly related or even unrelated languages due to long-term contact, which includes also Albanian, Aromanian, Greek, Romanian, Romany, and, to some extent, Turkish.

Generally speaking, linguists' attention has been drawn to Serbo-Croatian (as well as Slovene) especially for its phonological (sound-pattern) features, high degree of dialect variation, and preservation of key archaisms that aid in the reconstruction of Proto-Slavic, the prehistoric language thought to have been spoken by all Slavs before 500 CE. Standard Slovene and Serbo-Croatian, as reflected in many of their dialects, contrast long and short vowels, and, along with stress, have rising and falling tones (similar to Chinese), such as Slovene brá:t(i) 'to read' (long low pitch), brà:t 'to go read' (long high pitch), bràt 'brother' (short high pitch). Other features are of interest, particularly word and sentence structure; for example, Serbo-Croatian has begun to simplify its nouns--as has occurred more radically in Macedonian and Bulgarian--by reducing the number of grammatical endings ("cases"), especially in the plural.


Structurally, Slovene is closest to Serbo-Croatian and is spoken by approximately 2 million people, largely in the Republic of Slovenia, where it is the primary official language (alongside regionally official Italian and Hungarian). It is spoken also by significant minorities in neighboring Italy, Austria, and Hungary.

Modern standard Slovene, which began its development with the religious translations of the Protestant Primus Truber (Primož Trubar in Slovene) in the mid-16th century, was established in largely its current form toward the end of the 19th century. It is based on the urban speech of the capital, Ljubljana, and the surrounding central dialects, although it also has features selected from its highly variegated dialects. It is written in a modified variety of the Latin alphabet, similar to Croatian.

With its relatively small speech territory, Slovene has seven dialect bases and greater internal differentiation than any other South Slavic language. Speakers from the most extreme dialects (e.g., Rezija, Prekmurje) generally cannot be understood by standard speakers. Slovene preserves archaic features that have been lost in Serbo-Croatian. For example, it distinguishes not just singular and plural, but also dual number (pogovarjava se "we two are conversing"); it forms the future tense with an auxiliary verb and a participle (bom sedela "I shall sit"); and it preserves a special "supine" form of the verb that signals intention (kupovat bom šel "I shall go to shop"). In contrast to Serbo-Croatian, Slovene has a relatively significant number of borrowings from German (e.g., farba "color" from Farbe), Italian (fant "boy" from fante), and Friulian (križ "cross" from a 7th-century Friulian form krože).


Bulgarian is spoken by approximately 9 million people, predominantly in the Republic of Bulgaria, where it is the primary state language, as well as by minority speakers in Yugoslavia and Macedonia. Structurally, Bulgarian is closest in type to Macedonian.

Modern Bulgarian dates to the 17th century and developed substantially into its current form in the mid-19th century. It is based on the Tarnovo dialect of northeastern Bulgaria, but with elements from various dialect areas. Medieval varieties of Bulgarian served as the primary examples of Slavic writing, with prominent writing centers located in Preslav and Tarnovo. Modern Bulgarian is written in a modified variety of Cyrillic.


Macedonian is spoken by approximately 2 million people, primarily in the Republic of Macedonia. Significant groups of Macedonian speakers are also found in northern Greece, western Bulgaria, Serbia, and in some villages in Albania.

Although Macedonian was codified as a standard language as recently as 1944, the beginnings of the contemporary language may be traced to the middle of the 19th century. Macedonian is written in a modified variety of the Cyrillic alphabet. The language of the Macedonian speech territory can be traced back organically to the speech that gave rise to the first Slavic written language in the ninth century CE, known today as Old Church Slavic.

Linguists have tended to concentrate on the structure of Macedonian and Bulgarian words and their relationship to syntax and meaning, as well as on the interaction of the languages with others in the Balkan linguistic convergence area (or Sprachbund). For the period between the 10th and 12th centuries, known as the canonical period of Old Church Slavic, the textual evidence of Proto-Macedonian and Bulgarian is important as the earliest body of attestations of Slavic in general. For this reason Indo-Europeanists have made substantial use of older Macedo-Bulgarian material.

Because of their participation in the convergence area, Macedonian and Bulgarian display features not found elsewhere in the Slavic-speaking world. For example, the category of definiteness is marked by the presence (vs. absence) of an article following the first member of a noun phrase, such as Macedonian Ja vidov zhenata, "I saw the [a certain] woman" vs. Vidov zhena, "I saw a woman"; Serbo-Croatian makes no such distinction, having only Vidjela sam ženu, "I saw the/a woman." A distinction expressed by choices among alternative verb forms is made between witnessed and non-witnessed events, for example, Bulgarian Toj napisa pismoto, "he wrote the letter [I know so because I saw him do it]" vs. Toj napisal pismoto, "he wrote the letter [so it is said; I did not see him do it]"; Serbo-Croatian makes no such distinction, having only Napisao je pismo, "He wrote the/a letter." The inherited infinitive has been lost and replaced by a subordinate clause, as in Bulgarian Iskam da otida na mač, "I want to go [literally "that I go"] to a game" vs. Serbo-Croatian Hoću ići na utakmicu, "I want to go to a game." The origin of such convergence features is much debated: they may be a continuation of structures from languages that have disappeared (substratum languages)--Illyrian and Thracian--or a result of language contact itself and diffusion of linguistic features. It is certainly possible that both explanations hold together.

The South Slavic languages present a picture of great diversity among the Slavic languages, and, as they are located at a crossroads of European languages and cultures, have been affected by contacts with numerous languages. The volatile political fortunes of the region promise to push the development of the languages, especially the newly differentiated Bosnian, Croatian, and Serbian, towards ever greater diversity.


Further Reading

