Braille Patterns
Braille is an international writing system for the blind. It was invented in 1821 in Paris by Louis Braille, who was himself blind, and is used worldwide today. Each Braille cell consists of six or eight raised dots arranged in two vertical columns of three or four dots each.
Unicode blocks | Braille Patterns |
Alternate names | — |
Timeframe | 1821 to present |
Regions | East Asian |
Type | alphabet |
Direction | left to right |
Status | living |
Number of speakers | — |
Languages | International |
Main sources | Daniels, P. 1996. “Shorthand: Braille” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 807-820. |
Secondary sources | — |
Proposal | — |
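The dot arrangement described above maps directly onto code points in the Braille Patterns block. The following is a minimal sketch, assuming the standard Unicode layout in which U+2800 is the blank cell and dots 1 through 8 correspond to bits 0 through 7 of the offset; the helper name `braille_cell` is hypothetical and chosen only for illustration.

```python
import unicodedata

BRAILLE_BASE = 0x2800  # U+2800 BRAILLE PATTERN BLANK (no dots raised)


def braille_cell(dots):
    """Return the Braille Patterns character for the given raised dots.

    `dots` is an iterable of dot numbers from 1 to 8. Dots 1-3 form the
    left column and dots 4-6 the right column of a six-dot cell; dots 7
    and 8 add a fourth row for eight-dot cells.
    """
    mask = 0
    for dot in dots:
        if not 1 <= dot <= 8:
            raise ValueError(f"dot number out of range: {dot}")
        mask |= 1 << (dot - 1)  # dot n sets bit n-1 of the offset
    return chr(BRAILLE_BASE + mask)


if __name__ == "__main__":
    for dots in ([1], [1, 2], [1, 2, 3, 4, 5, 6]):
        cell = braille_cell(dots)
        print(dots, f"U+{ord(cell):04X}", unicodedata.name(cell))
```

Because the offset is a plain bitmask, the 256 characters of the block cover every combination of eight dots, and the six-dot cells are the first 64 of them.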
Buginese
The Buginese script, also known as Lontara, is used to write the Buginese (Bugis) language on the island of Sulawesi in Indonesia, primarily in southwest Sulawesi, but the language is also spoken in other areas. The script is a descendant of Brahmi, and may be related to Javanese. It shows some affinity to the Tagalog script. Buginese has been in use since the 14C. It has also been used to write the Makasar, Bima, and Mandar languages.
Unicode blocks | Buginese |
Alternate names | Lontara |
Timeframe | 14C to present |
Regions | East Asian |
Type | abugida |
Direction | left to right |
Status | living |
Number of speakers | 4 million |
Languages | Buginese (Bugis), Makasar, Bima, Mandar |
Main sources | Matthes, B. F. 1875. Boeginesche Spraakkunst. Den Haag: Martinus Nijhoff. |
Secondary sources | Sirk, Ü. 1983. The Buginese language. Moscow: Nauka. (Languages of Asia and Africa.) |
Proposal | http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2633r.pdf |
Buhid
Buhid is a living minority script used in Mindoro in the Philippines to write the Buhid language. Buhid is a Brahmi-derived script, distantly related to the South Indian scripts. Buhid is closely related to the Hanunóo and Tagbanwa scripts of the Philippines. All three scripts are related to Tagalog, but may not be directly descended from it. The ancestor of these Philippine scripts (including Tagalog) may have been transported to the Philippines via palaeographic scripts of western Java between the 10C and 14C CE.
Unicode blocks | Buhid |
Alternate names | Mangyan |
Timeframe | pre-19C to present |
Regions | East Asian |
Type | abugida |
Direction | left to right |
Status | living |
Number of speakers | 8000 |
Languages | Buhid |
Main sources | Kuipers, J., and R. McDermott. 1996. "Insular Southeast Asian Scripts" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 474-484. |
Secondary sources | Santos, Hector. 1994. The Living Scripts. Los Angeles: Sushi Dog Graphics. (Ancient Philippine scripts series, 2). |
Proposal | http://std.dkuug.dk/jtc1/sc2/wg2/docs/n1933.pdf |
Byzantine Musical Symbols
Byzantine Musical Symbols are musical symbols used to write the religious music and hymns of the Christian Orthodox Church and some folk music manuscripts. These symbols first appeared in the 7C to 8C CE. In 1881, the Orthodox Patriarchy Musical Committee established the New Analytical Byzantine Musical Notation System, which is used today. Most of the manuscripts are in Greek, although a few are in Russian, Bulgarian, Romanian, and Arabic.
Unicode blocks | Byzantine Musical Symbols |
Alternate names | — |
Timeframe | 7C or 8C to present |
Regions | East Asian |
Type | symbols |
Direction | left to right |
Status | historical |
Number of speakers | 0 |
Languages | — |
Main sources | Hellenic Organization for Standardization (ELOT). 1997. The Greek Byzantine Musical Notation System. Athens. (=ELOT 1373) |
Secondary sources | — |
Proposal | — |
CJK
"CJK" refers to the unified Han characters used to write the Chinese, Japanese, and Korean languages. Technically, CJK can also be used for the Vietnamese language, since early Vietnamese writing systems were based on Han. There are several blocks of CJK characters, including multiple blocks of unified ideographs, and blocks of compatibility ideographs, symbols and punctuation marks, strokes, and radicals. A description of the CJK blocks and the unification principles appears in section 12.1 of The Unicode Standard, and a history of the encoding appears in Appendix E of the Standard.
Unicode blocks | CJK Compatibility Forms, CJK Compatibility Ideographs, CJK Compatibility Ideographs Supplement, CJK Radicals Supplement, CJK Strokes, CJK Symbols and Punctuation, CJK Unified Ideographs, CJK Unified Ideographs Extension A, CJK Unified Ideographs Extension B, CJK Unified Ideographs Extension C, CJK Unified Ideographs Extension D |
Alternate names | — |
Timeframe | — |
Regions | East Asian |
Type | logosyllabary |
Direction | variable |
Status | living |
Number of speakers | 1.3 billion |
Languages | Chinese, Japanese, Korean, Vietnamese |
Main sources | Mair, V. 1996. "Modern Chinese Writing" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 200-208; Smith, J. 1996. "Japanese Writing" in Daniels & Bright, pp. 209-217; King, R. 1996. "Korean Writing" in Daniels & Bright, pp. 218-227. |
Secondary sources | Lunde, Ken. 2009. CJKV Information Processing. 2nd ed. Beijing, Cambridge, MA: O’Reilly. |
Proposal | — |
Carian
Carian is a partly undeciphered script, which has some relationship to the Greek alphabet. It is used to write the Carian language. Carian dates to the first millennium BCE. A few inscriptions have been found in Caria, in the western portion of present-day Turkey, and a fragmentary bilingual has also been found in Athens. However, the bulk of extant texts have been found in Egypt, and were left by Carian mercenaries. In 1996 an extensive Carian-Greek bilingual was discovered at Kaunos in southwestern Turkey. The bilingual clearly demonstrated that Carian was a member of the Anatolian branch of Indo-European, though details of the language still remain unclear.
Unicode blocks | Carian |
Alternate names | — |
Timeframe | x-7C to -3C |
Regions | East Asian |
Type | alphabet |
Direction | variable |
Status | historical |
Number of speakers | 0 |
Languages | Carian |
Main sources | Melchert, H. C. 2004. "Carian" in The Cambridge Encyclopedia of the World’s Ancient Languages, ed. Roger Woodard. Cambridge: Cambridge University Press, pp. 609-613. |
Secondary sources | Swiggers, P., and W. Jenniges. 1996. “The Anatolian Alphabets” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 281-287. |
Proposal | http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3020.pdf |
Chakma
The Chakma script is used to write the Chakma language, which is spoken in Bangladesh and India, and is being adapted for the Tanchangya language in Bangladesh. The script has been used for liturgical purposes, but also is used in teaching materials today. The Chakma language is also written in the Latin and Bengali scripts.
Unicode blocks | Chakma |
Alternate names | Ojhopath, Ajha path |
Timeframe | ? to present |
Regions | East Asian |
Type | abugida |
Direction | left to right |
Status | living |
Number of speakers | 560000 |
Languages | Chakma, Tanchangya |
Main sources | Khisa, Bhagadatta. 2001. Cāṅmā pattham pāt = Chakma primer. Rāṅamāṭi: Tribal Cultural Institute (TCI). |
Secondary sources | Cāṅmā, Cirajyoti, and Maṅgal Cāṅgmā. 1982. Cāṅmār āg pudhi (Chakma primer). Rāṅamāṭi: Cāṅmābhāṣā Prakāśanā Pariṣad. |
Proposal | http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3645.pdf |
Cham
The Cham script, also known as Akhar Thrah, is a Brahmi-derived script used to write the Cham language, an Austronesian language. There are two main varieties of the Cham language: Western Cham, which is spoken in Cambodia (and to a lesser extent in Vietnam and Thailand), and Eastern Cham, spoken in Vietnam. Speakers of the former tend to use the Arabic script while some speakers of the latter still use the Cham script.
Unicode blocks | Cham |
Alternate names | Akhar Thrah |
Timeframe | 1000C to present |
Regions | East Asian |
Type | abugida |
Direction | left to right |
Status | living |
Number of speakers | 290000 |
Languages | Western Cham, Eastern Cham |
Main sources | Aymonier, Étienne, and Antoine Cabaton. 1906. Dictionnaire cam-Français. Paris. |
Secondary sources | Kono Rokuro, Chino Eiichi, and Nishida Tatsuo. 2001. The Sanseido Encyclopaedia of Linguistics. Volume 7: Scripts and Writing Systems of the World (Gengogaku dai ziten (bekkan) sekai mozi ziten). Tokyo: Sanseido Press. |
Proposal | http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3120.pdf |
Cherokee
The Cherokee script is used to write the indigenous Cherokee language. Cherokee is the native tongue of about 20,000 people, though most speakers today use it as a second language. The script was invented by Sequoyah, a Cherokee silversmith, in the early 19C, between 1815 and 1821. Sequoyah devised a system of numbers for Cherokee, but today Latin numbers are used instead. The script is still taught today.
Unicode blocks | Cherokee |
Alternate names | — |
Timeframe | early 19C to present |
Regions | East Asian |
Type | syllabary |
Direction | left to right |
Status | living |
Number of speakers | 20000 |
Languages | Cherokee |
Main sources | Scancarelli, J. 1996. “Cherokee Writing” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 587-592. |
Secondary sources | — |
Proposal | — |
Combining Diacritical Marks
Diacritical marks are ancillary marks that are added to a base character, and can be used to indicate how the character is to be pronounced or stressed. The Combining Diacritical Marks block includes characters intended for general use with any script. Diacritical marks that are specific to a particular script are encoded with that script. The Combining Diacritical Marks Supplement comprises a set of lesser-used combining diacritical marks.
Unicode blocks | Combining Diacritical Marks, Combining Diacritical Marks Supplement |
Alternate names | — |
Timeframe | — |
Regions | East Asian |
Type | symbols |
Direction | — |
Status | living |
Number of speakers | — |
Languages | — |
Main sources | The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, p. 234 (Section 7.9). |
Secondary sources | — |
Proposal | — |
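How a general-purpose combining mark attaches to a base character can be illustrated with Python's standard `unicodedata` module. This is a minimal sketch, assuming the block ranges U+0300 to U+036F (Combining Diacritical Marks) and U+1DC0 to U+1DFF (Combining Diacritical Marks Supplement); the helper names `in_combining_block` and `compose` are hypothetical.

```python
import unicodedata

# Block ranges, assumed from the Unicode block list.
COMBINING_DIACRITICAL_MARKS = range(0x0300, 0x0370)
COMBINING_DIACRITICAL_MARKS_SUPPLEMENT = range(0x1DC0, 0x1E00)


def in_combining_block(ch):
    """True if `ch` belongs to one of the two general-use combining blocks."""
    code = ord(ch)
    return (code in COMBINING_DIACRITICAL_MARKS
            or code in COMBINING_DIACRITICAL_MARKS_SUPPLEMENT)


def compose(base, mark):
    """Append a combining mark to a base character and normalize to NFC.

    NFC yields a single precomposed character when one exists; otherwise
    the base character and the mark stay as a two-code-point sequence.
    """
    if unicodedata.combining(mark) == 0:
        raise ValueError("mark is not a combining character")
    return unicodedata.normalize("NFC", base + mark)


if __name__ == "__main__":
    acute = "\u0301"  # COMBINING ACUTE ACCENT
    for base in ("e", "q"):
        result = compose(base, acute)
        print(base, "->", [f"U+{ord(c):04X}" for c in result])
```

The mark always follows its base character in the stored text, which is why `compose` concatenates `base + mark` before normalizing.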
Combining Diacritical Marks for Symbols
The Combining Diacritical Marks for Symbols are, in general, applied to mathematical or technical symbols, and serve to extend the set of such symbols. This block also includes a number of compatibility enclosing marks, which can enclose the base character in different ways.
Unicode blocks | Combining Diacritical Marks for Symbols |
Alternate names | — |
Timeframe | — |
Regions | East Asian |
Type | symbols |
Direction | — |
Status | living |
Number of speakers | — |
Languages | — |
Main sources | The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 234-235 (Section 7.9). |
Secondary sources | — |
Proposal | — |
Combining Half Marks
The Combining Half Marks set consists of a number of combining mark pieces that can be used to visually encode certain combining marks that extend over multiple base letterforms. They are included to facilitate the support of such marks in legacy implementations. However, double diacritics, such as U+0360 and U+0361, are to be preferred. The block also includes macron marks that are recommended for representing a style of supralineation in Coptic.
Unicode blocks | Combining Half Marks |
Alternate names | — |
Timeframe | — |
Regions | East Asian |
Type | symbols |
Direction | — |
Status | living |
Number of speakers | — |
Languages | — |
Main sources | The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, p. 235 (Section 7.9). |
Secondary sources | — |
Proposal | — |
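The difference between the preferred double diacritics and the legacy half marks shows up in how a tie over two letters is encoded. The sketch below assumes the ligature tie as the example, using U+0361 for the double diacritic and U+FE20 and U+FE21 for the left and right halves; the function names are hypothetical.

```python
import unicodedata

DOUBLE_INVERTED_BREVE = "\u0361"  # preferred: one mark spanning two letters
LIGATURE_LEFT_HALF = "\ufe20"     # legacy: left half, placed on the first letter
LIGATURE_RIGHT_HALF = "\ufe21"    # legacy: right half, placed on the second letter


def tie_with_double_diacritic(first, second):
    """Preferred encoding: the double diacritic sits between the two letters."""
    return first + DOUBLE_INVERTED_BREVE + second


def tie_with_half_marks(first, second):
    """Legacy encoding: each letter carries one half of the tie."""
    return first + LIGATURE_LEFT_HALF + second + LIGATURE_RIGHT_HALF


if __name__ == "__main__":
    for text in (tie_with_double_diacritic("t", "s"),
                 tie_with_half_marks("t", "s")):
        print([unicodedata.name(c) for c in text])
```

The two sequences are distinct code point strings and are not made equivalent by normalization, so a search or comparison that should match both forms has to handle each one explicitly.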
Common Indic Number Forms
The Common Indic Number Forms are characters that are used to represent fractional values in various scripts of North India, Pakistan and Nepal. These signs were used to write currency, weight, measure, time, and other units. They have been used since the 16C and are still employed today in a limited capacity.
Unicode blocks | Common Indic Number Forms |
Alternate names | — |
Timeframe | 16C to present |
Regions | East Asian |
Type | numeric |
Direction | left to right |
Status | living |
Number of speakers | — |
Languages | Gujarati, Gurmukhi, Devanagari, Bhojpuri, Magahi, Awadhi, Maithili, Urdu, Hindi, Marwari, Punjabi |
Main sources | The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 486-487 (Section 15.3). |
Secondary sources | — |
Proposal | http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3367.pdf |
Control Pictures
Control pictures are conventional representations of nongraphic characters for use when it is necessary to show the position of a control code within a data stream. Three characters are included in this block to visibly represent ASCII space.
Unicode blocks | Control Pictures |
Alternate names | — |
Timeframe | various |
Regions | East Asian |
Type | symbols |
Direction | — |
Status | living |
Number of speakers | — |
Languages | — |
Main sources | The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, p. 495 (Section 15.6). |
Secondary sources | — |
Proposal | — |
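Showing the position of control codes in a data stream can be sketched as a simple substitution. The example assumes the block's layout in which the pictures for the C0 controls U+0000 to U+001F sit at U+2400 plus the control code, DELETE maps to U+2421, and U+2420 is one of the three visible representations of space; the function name `show_controls` and its `mark_space` parameter are hypothetical.

```python
CONTROL_PICTURES_BASE = 0x2400  # U+2400 SYMBOL FOR NULL
SYMBOL_FOR_SPACE = "\u2420"
SYMBOL_FOR_DELETE = "\u2421"


def show_controls(text, mark_space=False):
    """Replace control characters in `text` with their Control Pictures.

    C0 controls (U+0000 to U+001F) map to U+2400 plus their code, and
    DELETE (U+007F) maps to U+2421. Spaces are replaced by U+2420 only
    when `mark_space` is true. All other characters pass through.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x20:
            out.append(chr(CONTROL_PICTURES_BASE + code))
        elif code == 0x7F:
            out.append(SYMBOL_FOR_DELETE)
        elif ch == " " and mark_space:
            out.append(SYMBOL_FOR_SPACE)
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(show_controls("name\tvalue\r\n", mark_space=True))
```

The substitution is for display only: the pictures are ordinary graphic characters and carry none of the behavior of the control codes they depict.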
Coptic
The Coptic script represents the final stage in the development of the Egyptian writing system and was used for writing the Coptic language. Coptic was based on the Greek uncial alphabets but several letters were added that were unique to Coptic. Although the language died out in the 14C, it is still maintained as a liturgical language by Coptic Christians. Before Unicode 4.1, Coptic was considered a stylistic variant of Greek, so 14 Coptic characters appear in the "Greek and Coptic" block. Coptic is now considered to be disunified from Greek, so one can use the 14 Coptic characters in the "Greek and Coptic" block as well as additional letters that appear in the new "Coptic" block (which also includes characters for Old Coptic and Nubian).
Unicode blocks | Coptic, Greek and Coptic |
Alternate names | — |
Timeframe | 4C to present |
Regions | East Asian |
Type | alphabet |
Direction | left to right |
Status | liturgical |
Number of speakers | 0 |
Languages | Coptic, Nubian, Old Coptic |
Main sources | Ritner, R. 1996. "The Coptic Alphabet" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 287-290. |
Secondary sources | — |
Proposal | http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2744.pdf |
Counting Rod Numerals
Chinese counting-rod numerals were used to represent and manipulate numbers in pre-modern East Asian mathematical texts. The rods consisted of a set of small sticks that were several centimeters in length; these were arranged in patterns on a gridded counting board. The glyph shapes represent the conventions of the Song dynasty (960 - 1279), when traditional Chinese mathematics was at its height. The symbols go back to the Warring States Period in China, ca. 4C or 5C BCE.
Unicode blocks | Counting Rod Numerals |
Alternate names | — |
Timeframe | x-4C or -5C to present |
Regions | East Asian |
Type | numeric |
Direction | variable |
Status | historical |
Number of speakers | 0 |
Languages | — |
Main sources | Pettersson, J. S. 1996. “Numerical Notation” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 795-806. |
Secondary sources | — |
Proposal | — |
Cuneiform
Sumero-Akkadian Cuneiform is a logosyllabary that was used from the end of the third millennium BCE until the 1C CE. It spread beyond Mesopotamia to Elam, Assyria, eastern Syria, southern Anatolia, and Egypt, and was used for many languages besides Sumerian and Akkadian. The script developed from Proto-Cuneiform and Early Dynastic cuneiform. Cuneiform is one of the world's oldest writing systems.
Unicode blocks | Cuneiform, Cuneiform Numbers and Punctuation |
Alternate names | — |
Timeframe | x-2350 to 1C |
Regions | East Asian |
Type | logosyllabary |
Direction | left to right |
Status | historical |
Number of speakers | 0 |
Languages | Sumerian, Akkadian (Babylonian, Assyrian), Elamite, Hittite, Hurrian, Luvian, Eblaite, Urartian |
Main sources | Cooper, J. 1996. “Sumerian and Akkadian” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 37-57. |
Secondary sources | Gragg, G. 1996. “Other languages” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 58-72. |
Proposal | http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2786.pdf |
Currency Symbols
The Currency Symbols block includes customary symbols used to indicate certain currencies in general text. The signs may vary in shape and are often used for more than one currency. Not all currencies are represented by a currency symbol; some use sequences of multiple letters, and the abbreviations for currencies can vary by language. Some contemporary or historic currency symbols that are not in the Currency Symbols block may be found in other blocks.
Unicode blocks | Currency Symbols |
Alternate names | — |
Timeframe | various |
Regions | East Asian |
Type | symbols |
Direction | — |
Status | living |
Number of speakers | — |
Languages | — |
Main sources | The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 478-480 (Section 15.1). |
Secondary sources | — |
Proposal | — |
Cypriot Syllabary
The Cypriot syllabary is a historical script used to write the Cypriot dialect of Greek and a non-Indo-European language, "Eteo-Cypriot." The script was used from the middle of the 11C to the 3C BCE and appears to be descended from one of the Cypro-Minoan scripts of Cyprus. The Cypriot syllabary shares some orthographic conventions with Linear B, but the script is written right to left.
Unicode blocks | Cypriot Syllabary |
Alternate names | — |
Timeframe | x-11C to -3C |
Regions | East Asian |
Type | syllabary |
Direction | right to left |
Status | historical |
Number of speakers | 0 |
Languages | Ancient Greek, "Eteo-Cypriot" |
Main sources | Woodard, Roger. 2004. "Greek dialects" in The Cambridge Encyclopedia of the World’s Ancient Languages, ed. Roger Woodard. Cambridge: Cambridge University Press, pp. 650-672. |
Secondary sources | Bennett, E. 1996. "Aegean Scripts" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 125-133. |
Proposal | http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2378.pdf |
Cyrillic
The Cyrillic script has traditionally been used for writing various Slavic languages, including Russian. The script dates to the 9C or 10C CE, and is named after St. Cyril, a Byzantine missionary. It is one of several scripts that were ultimately derived from the Greek script. Cyrillic has been extended to write non-Slavic languages, particularly the minority languages of Russia and surrounding countries. The Cyrillic Extended-A block is made up of Old Church Slavonic superscripted combining letters, while Cyrillic Extended-B comprises various historic characters, such as those used in Old Cyrillic and Old Abkhasian. The Cyrillic Supplement block is made up of additional letters needed to write various non-Slavic languages.
Unicode blocks | Cyrillic, Cyrillic Extended-A, Cyrillic Extended-B, Cyrillic Supplement |
Alternate names | — |
Timeframe | 9C or 10C to present |
Regions | East Asian |
Type | alphabet |
Direction | left to right |
Status | living |
Number of speakers | 629 million |
Languages | Abkhazian, Abaza, Adyghe, Assyrian Neo-Aramaic, Southern Altai, Avaric, Azerbaijani, Bashkir, Belarusian, Bulgarian, Buriat, Russia Buriat, Chechen, Mari, Shor, Chukot, Crimean Turkish, Chuvash, Dargwa, Dungan, Evenki, Nanai, Ingush, Kara-Kalpak, Kabardian, Khanty, Khakas, Kazakh, Komi-Permyak, Komi-Zyrian, Koryak, Karachay-Balkar, Karelian, Kurdish, Kumyk, Komi, Kirghiz, Lak, Lezghian, Moksha, Macedonian, Mongolian, Mansi, Erzya, Nogai, Ossetic, Romany, Russian, Yakut, Serbian, Tabassaran, Tajik, Turkmen, Tatar, Muslim Tat, Tuvinian, Udihe, Udmurt, Ukrainian, Uzbek, Kalmyk, Nenets, Gagauz, Romanian, Northern Sami, Selkup, Uighur |
Main sources | Cubberley, P. 1996. "The Slavic Alphabets" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 346-355. |
Secondary sources | Comrie, B. 1996. "Adaptations of the Cyrillic Alphabet" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 700-726. |
Proposal | — |