Aegean Numbers

The Aegean Numbers are derived from the non-Greek Linear A script. This numerical system is used in Linear B and may be the numerical system for the Cypriot syllabary.

Unicode blocks Aegean Numbers
Alternate names
Timeframe x-14C to 3C
Regions European
Type numeric
Direction left to right
Status historical
Number of speakers 0
Languages
Main sources Ventris, Michael, and Chadwick, John. Documents in Mycenaean Greek. 2nd ed. Cambridge: Cambridge University Press.
Secondary sources Pettersson, J. S. 1996. “Numerical Notation: Linear A and B” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 795-806.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2378.pdf

Alchemical Symbols

Alchemical symbols were originally used by Greek, Syriac, and Egyptian writers around the 5C-6C CE. These symbols were used and expanded upon by European alchemists, natural philosophers, chemists, and apothecaries. Alchemical symbols continue to be used extensively today in scholarly literature, in creative works, in New Age texts, and in the gaming and graphics industries.

Unicode blocks Alchemical Symbols
Alternate names
Timeframe x5C to present
Regions European
Type symbols
Direction left to right
Status historical
Number of speakers 0
Languages
Main sources Lüdy-Tenger, Fritz. 1973. Alchemistische und chemische Zeichen. Würzburg: JAL-reprint.
Secondary sources Schneider, Wolfgang. 1962. Lexicon alchemistisch-pharmazeutischer Symbole. Weinheim/Bergstr.: Verlag Chemie.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3584.pdf

Alphabetic Presentation Forms

The Alphabetic Presentation Forms block includes Latin and Armenian ligatures, as well as variant Hebrew letters and marks and several Hebrew precomposed forms. All of the vocalized letters of the Yiddish alphabet are included in this block. The Latin ligatures derive from older character encodings. The Armenian ligatures in this block are found in handwriting and traditional fonts that mimic manuscript ligatures.

Unicode blocks Alphabetic Presentation Forms
Alternate names
Timeframe
Regions European
Type alphabet
Direction left to right, right to left
Status living
Number of speakers
Languages Hebrew, Yiddish, Armenian, and Latin script-based languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0, defined by: The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 213, 225, 242 (Sections 7.1, 7.6, 8.1).
Secondary sources
Proposal
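
The compatibility status of the Latin and Armenian ligatures described above can be illustrated with Unicode normalization. The following is a minimal sketch, assuming Python's standard unicodedata module; the helper name expand_presentation_forms is hypothetical and not part of any standard.

```python
import unicodedata


def expand_presentation_forms(text: str) -> str:
    """Replace compatibility ligatures with plain letter sequences (NFKC).

    Hypothetical helper for illustration only.
    """
    return unicodedata.normalize("NFKC", text)


latin = "\ufb01sh"   # U+FB01 LATIN SMALL LIGATURE FI followed by "sh"
armenian = "\ufb13"  # U+FB13 ARMENIAN SMALL LIGATURE MEN NOW

print(expand_presentation_forms(latin))
print([f"U+{ord(c):04X}" for c in expand_presentation_forms(armenian)])
```

NFKC also folds many other compatibility characters, so a sketch like this suits searching and comparison better than text that must keep its original appearance.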

Ancient Greek Musical Notation

The ancient Greek people devised a system of musical notation to be used for vocal and instrumental melody. The Ancient Greek Musical Notation block contains many of these symbols, which are based on Greek letters and were used in a variety of texts, including plays and hymns. The symbols still appear in the modern publication of these texts and in studies of ancient music. The notation appeared more fully developed in documents in the 3C BCE, so it likely was devised in the 4C or, more likely, 5C BCE.

Unicode blocks Ancient Greek Musical Notation
Alternate names
Timeframe x-5C or -4C to 4C
Regions European
Type symbols
Direction left to right
Status historical
Number of speakers 0
Languages
Main sources West, M. L. 1992. Ancient Greek Music. Oxford and New York: Oxford University Press.
Secondary sources Barker, A. 1996. “Music” in Oxford Classical Dictionary, 3rd ed. Oxford: Oxford University Press, pp. 1003-1012.
Proposal http://std.dkuug.dk/jtc1/sc2/WG2/docs/n2547.pdf

Ancient Greek Numbers

Ancient Greek numbers were often represented by letters of the Greek alphabet. The ancient Greeks also devised extensions to this usage; these extensions appear in the Ancient Greek Numbers block. The block includes the Greek zero sign (which appeared in historical astronomical texts); various symbols found in papyri to represent fractions, weights, and measures; and acrophonic numerals. Acrophonic numerals use the initial letter of the word for the number: for example, “H”, the first letter of 'hekaton' (the word for '100'), is the acrophonic numeral for '100'. The acrophonic numerals appeared from at least the 5C BCE down to 100 BCE. The zero sign appeared in papyri dated to 100 CE, and has shown up as late as the 13C in manuscripts.

Unicode blocks Ancient Greek Numbers
Alternate names
Timeframe various
Regions European
Type numeric
Direction left to right
Status historical
Number of speakers 0
Languages
Main sources Heath, T. and G.J. Toomer. 1996. “Numbers, Greek” in Oxford Classical Dictionary, ed. S. Hornblower & A. Spawforth, 3rd ed. Oxford; New York: Oxford University Press, pp. 1052-1053.
Secondary sources
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2612/n2612-3.pdf, http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2612/n2612-2.pdf, http://www.raymondm.co.uk/prog/GreekZeroSign.pdf
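
The acrophonic principle described above can be sketched as a small additive converter. Only the value of Η (100) comes from this entry; the other sign values are the commonly cited Attic ones and are included as assumptions, and the function name acrophonic_to_int is hypothetical. The sketch uses ordinary Greek capital letters and does not cover the compound signs encoded in the Ancient Greek Numbers block.

```python
# Attic acrophonic sign values, keyed by Greek capital letter.
# Only U+0397 (Eta) = 100 is stated in the entry; the rest are assumptions.
ACROPHONIC_VALUES = {
    "\u0399": 1,      # Iota, a single stroke
    "\u03a0": 5,      # Pi, from "pente"
    "\u0394": 10,     # Delta, from "deka"
    "\u0397": 100,    # Eta, from "hekaton"
    "\u03a7": 1000,   # Chi, from "khilioi"
    "\u039c": 10000,  # Mu, from "myrioi"
}


def acrophonic_to_int(numeral: str) -> int:
    """Add up the values of the signs; the notation is purely additive."""
    total = 0
    for sign in numeral:
        if sign not in ACROPHONIC_VALUES:
            raise ValueError(f"not an acrophonic sign: {sign!r}")
        total += ACROPHONIC_VALUES[sign]
    return total


# Chi, Eta, Eta, Delta, Delta, Pi, Iota, Iota: 1000 + 200 + 20 + 5 + 2
print(acrophonic_to_int("\u03a7\u0397\u0397\u0394\u0394\u03a0\u0399\u0399"))  # 1227
```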

Ancient Symbols

The Ancient Symbols set contains miscellaneous historical symbols, including ancient Roman symbols for weights and measures. These are typically derived from ancient epigraphic, papyrological, or manuscript traditions.

Unicode blocks Ancient Symbols
Alternate names
Timeframe various
Regions European
Type symbols
Direction
Status historical
Number of speakers 0
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0, defined by: The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, p. 507 (Section 15.8).
Secondary sources
Proposal

Arabic

The Arabic script is a cursive writing system that is used for the Arabic language, and has been extended to write many other languages today, such as Persian and Urdu. In the past, Arabic was used to write languages, such as Turkish and Ingush, that today are written with another script: Turkish, for example, is now written with the Latin script. The Arabic Supplement block includes characters needed for several African languages, as well as a number of additional characters needed to write the Khowar, Torwali and Burushaski languages, which are spoken mainly in Pakistan. The Arabic Presentation Forms blocks A and B include characters that were added because they appeared in earlier standards or implementations; in general, it is preferable not to use characters in these two blocks.

Unicode blocks Arabic, Arabic Presentation Forms-A, Arabic Presentation Forms-B, Arabic Supplement, Arabic Extended-A
Alternate names
Timeframe 6C to present
Regions European
Type abjad
Direction right to left
Status living
Number of speakers 676 million
Languages Arabic, Azerbaijani, Baluchi, Beja, Balti, Western Cham, Dogri, Persian, Gbaya, Hausa, Kashmiri, Kurdish, Kirghiz, Lahnda, Parsi-Dari, Pashto, Sindhi, Comorian, Tajik, Turkmen, Uighur, Urdu, Uzbek, Zaza, Indonesian, Ingush, Kazakh, Malay, Punjabi, Somali, Susu, Turkish, Wolof, Coptic, Fulfulde, Songhoy
Main sources Bauer, T. 1996. “Arabic Writing” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 559-564.
Secondary sources Kaye, A. 1996. “Adaptations of Arabic Script” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 743-762.
Proposal
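
The advice above against using the two Presentation Forms blocks can be put into practice by mapping such characters back to the basic Arabic block. This is a minimal sketch, assuming Python's standard unicodedata module; the helper names are hypothetical, and the block ranges (U+FB50 to U+FDFF and U+FE70 to U+FEFF) are taken from the Unicode code charts rather than from this entry.

```python
import unicodedata

# Code point ranges of Arabic Presentation Forms-A and -B (assumed from the code charts).
PRESENTATION_RANGES = [(0xFB50, 0xFDFF), (0xFE70, 0xFEFF)]


def uses_presentation_forms(text: str) -> bool:
    """Report whether any character falls in either Presentation Forms block."""
    return any(lo <= ord(c) <= hi for c in text for lo, hi in PRESENTATION_RANGES)


def to_basic_arabic(text: str) -> str:
    """Map presentation forms to basic Arabic letters via NFKC normalization."""
    return unicodedata.normalize("NFKC", text)


alef_isolated = "\ufe8d"  # U+FE8D ARABIC LETTER ALEF ISOLATED FORM
lam_alef = "\ufefb"       # U+FEFB ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM

print(uses_presentation_forms(alef_isolated))
print([f"U+{ord(c):04X}" for c in to_basic_arabic(lam_alef)])
```

Because NFKC also rewrites other compatibility characters, an application that needs to touch only Arabic text might prefer to normalize just the characters that the range check flags.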

Arabic Mathematical Alphabetic Symbols

The Arabic Mathematical Alphabetic Symbols block is made up of a widely used version of the Arabic alphabet that appears in manuscripts, books on epigraphy, manuals, and traditional print editions. The symbols appear in the standards and conventions adopted for languages written in Arabic-based scripts, such as Arabic or Persian. The majority of handbooks in mathematics used in the Middle East, Libya, and Algeria are typeset according to these conventions. The order of the characters in this block reflects the old Semitic order (a, b, j, d, ...), which is different from the order typically found in dictionaries today (a, b, t, th, ...).

Unicode blocks Arabic Mathematical Alphabetic Symbols
Alternate names Arabic Mathematical Alphabet Symbols
Timeframe 15C? to present
Regions European
Type symbols
Direction right to left
Status living
Number of speakers
Languages Arabic, Persian
Main sources Union of the Arab Scientific Linguistic Groupings. 1987. Scientific Symbols and Method of their Use in Arabic Language. Amman, Jordan. (in Arabic)
Secondary sources
Proposal http://std.dkuug.dk/jtc1/sc2/WG2/docs/n3799.pdf; http://std.dkuug.dk/jtc1/sc2/WG2/docs/n3085-1.pdf

Armenian

Used primarily for writing the Armenian language, the Armenian script was devised about 406 CE by Mesrop Maštoc‘ to make Christian scriptural and liturgical texts accessible to the Armenian people. The script has been used to write the official literary dialects of Eastern Armenian and Western Armenian, as well as Classical (or Grabar) Armenian and Middle Armenian.

Unicode blocks Armenian
Alternate names
Timeframe 406 to present
Regions European
Type alphabet
Direction left to right
Status living
Number of speakers 6.3 million
Languages Armenian, Classical Armenian, Middle Armenian
Main sources Sanjian, A. 1996. "The Armenian Alphabet" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 356-363.
Secondary sources
Proposal

Arrows

Arrows are symbols that can be used for various purposes: to suggest directional relations, to show logical derivation or implication (as in mathematics), and to represent the cursor control keys on computers. The main Arrows block (code points 2190-21FF) includes a wide variety of arrow shapes. The Supplemental Arrows-A and -B blocks contain a large collection of arrow symbols that supplement the main set in the Arrows block. Many of the arrows in the Miscellaneous Symbols and Arrows block are included to ensure the availability of left-right symmetric pairs of less common arrows, which are needed for bidirectional layout of mathematical text. As symbols, arrows have been used throughout history; arrows (as weapons) appear already in the Paleolithic cave paintings of Lascaux, France.

Unicode blocks Arrows, Supplemental Arrows-A, Supplemental Arrows-B, Miscellaneous Symbols and Arrows
Alternate names
Timeframe various
Regions European
Type symbols
Direction
Status living
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0, defined by: The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 492-493 (Section 15.4).
Secondary sources
Proposal
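
The four arrow blocks listed above can be told apart by code point range. The following is a minimal sketch: the range 2190-21FF for the Arrows block comes from this entry, while the other three ranges are assumptions taken from the Unicode code charts, and the function name arrow_block is hypothetical.

```python
from typing import Optional

# (first code point, last code point, block name)
ARROW_BLOCKS = [
    (0x2190, 0x21FF, "Arrows"),
    (0x27F0, 0x27FF, "Supplemental Arrows-A"),             # assumed range
    (0x2900, 0x297F, "Supplemental Arrows-B"),             # assumed range
    (0x2B00, 0x2BFF, "Miscellaneous Symbols and Arrows"),  # assumed range
]


def arrow_block(char: str) -> Optional[str]:
    """Return the name of the arrow block containing char, or None."""
    code_point = ord(char)
    for first, last, name in ARROW_BLOCKS:
        if first <= code_point <= last:
            return name
    return None


for sample in ["\u2192", "\u27f6", "\u2934", "\u2b05", "A"]:
    print(f"U+{ord(sample):04X}", arrow_block(sample))
```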

Avestan

Avestan is the language of the Zoroastrian scriptures. The Avestan script dates to the 5C or 6C CE. It was used to write religious texts from the Avesta (written in the Avestan language) and to write texts in Pazand (a writing system for the Middle Persian language, written in the Avestan script). The Avestan script was derived from the Book Pahlavi script. It is used in ritual and other hieratic contexts in Zoroastrian communities today. Zoroastrians number approximately 140,000 members in India, Iran and North America.

Unicode blocks Avestan
Alternate names Pazend, Pazand
Timeframe 5C or 6C to present
Regions European
Type alphabet
Direction right to left
Status liturgical
Number of speakers 0
Languages Avestan
Main sources Hale, Mark. 2004. "Avestan" in The Cambridge Encyclopedia of the World’s Ancient Languages, ed. Roger Woodard. Cambridge: Cambridge University Press, pp. 742-763.
Secondary sources Skjaervo, P.O. 1996. "Aramaic Scripts for Iranian Languages" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 515-535.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3197.pdf

Balinese

The Balinese script, a descendant of Brahmi, is used to write the Balinese language, which is spoken on Bali and the neighboring island of Lombok in Indonesia. It is also used to write the Sasak language, Sanskrit, and Kawi (Old Javanese). The Balinese script is still taught in the schools on Bali today, but the Latin alphabet predominates. Many religious and literary works are written in the Balinese script, and the script appears on signs in Bali.

Unicode blocks Balinese
Alternate names aksara Bali, Carakan
Timeframe 11C to present
Regions European
Type abugida
Direction left to right
Status living
Number of speakers 5 million
Languages Balinese, Sasak, Kawi (Old Javanese), Sanskrit
Main sources Kuipers, J., and R. McDermott. 1996. "Insular Southeast Asian Scripts" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 474-484.
Secondary sources Nakanishi, Akira. 1980. Writing Systems of the World. Rutland, Vermont; Tokyo: Charles Tuttle, p. 80.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2908.pdf

Bamum

King Njoya created the Bamum script in 1896, at the age of 25, for the Bamum language, which is spoken in present-day Cameroon. The script went through several phases, moving from a logosyllabary to the syllabary of the present-day version. Old Bamum manuscripts, which number several hundred, use characters contained in the Bamum Supplement block, which contains 596 characters. The modern script uses the 88 characters in the Bamum block. The script is being taught today, although there is only one living fluent user. The script engenders tremendous enthusiasm and pride among the user community.

Unicode blocks Bamum, Bamum Supplement
Alternate names A-ka-u-ku script, Shu-mom
Timeframe 1896 to present
Regions European
Type syllabary, logosyllabary
Direction left to right
Status living
Number of speakers 215,000
Languages Bamum
Main sources Daniels, P. 1996. “The Invention of Writing: The Bamum script” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 579-586.
Secondary sources Dugast, J., and M. D. W. Jeffreys. 1950. L’écriture des bamum: sa naissance, son evolution, sa valeur phonétique, son utilisation. Mémoires de l’Institut Français d’Afrique Noire, Centre du Cameroun.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3522.pdf; http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3597.pdf

Basic Latin

The Basic Latin set contains the 26 basic letter pairs in the Latin alphabet that make up US-ASCII, a character encoding first published in 1963. Only a small number of languages can be written entirely with this basic set of 26 uppercase and 26 lowercase Latin letters. ASCII still appears widely on computers and cell phones today. For more information on the languages using Latin and other Latin-related blocks, see the entry "Latin."

Unicode blocks Basic Latin
Alternate names Roman alphabet
Timeframe x-7C to present
Regions European
Type alphabet
Direction left to right
Status living
Number of speakers 329 million
Languages English, Latin, Swahili, Hawaiian
Main sources Knight, S. 1996. "The Roman Alphabet" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 312-332.
Secondary sources
Proposal

Batak

The Batak script is used to write five dialects of the Batak language in Sumatra, Indonesia: Karo, Mandailing, Pakpak, Simalungun, and Toba. The script is taught in schools, though not as the practical way to write the language, which today uses the Latin alphabet. The script can be found in public signage. Like many other scripts of Southeast Asia, this script derives ultimately from Brahmi.

Unicode blocks Batak
Alternate names si-sia-sia, surat na sampulu sia
Timeframe ca. 14C to present
Regions European
Type abugida
Direction left to right
Status living
Number of speakers 6.1 million
Languages Batak
Main sources Kuipers, J., and R. McDermott. 1996. "Insular Southeast Asian Scripts" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 474-484.
Secondary sources Nakanishi, Akira. 1980. Writing Systems of the World. Rutland, Vermont; Tokyo: Charles Tuttle, p. 81.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3320.pdf

Bengali

The Bengali (Bangla) script is used to write the Bengali language in the West Bengal state of India and in Bangladesh. The script is also used for Assamese in Assam, and to write a number of other minority languages. This script, like others in South Asia, derives from Brahmi, and is closely related to Devanagari. In West Bengal and Bangladesh, the preferred name for the script is "Bangla." In the Indian state of Assam, the preferred name for the script is "Asamiya" or "Assamese."

Unicode blocks Bengali
Alternate names Bangla
Timeframe 11C to present
Regions European
Type abugida
Direction left to right
Status living
Number of speakers 181 million
Languages Assamese, Bengali, Bishnupriya Manipuri, Chakma, Daphla, Garo, Hallam, Khasi, Kok Borok, Lushai, Mizo, Munda, Mundari, Naga, Rangpuri (Kamta), Rian, Riang (Myanmar), Santali, Sylheti, Manipuri
Main sources Bagchi, T. 1996. “Bengali Writing” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 399-403.
Secondary sources
Proposal

Block Elements

The Block Elements are a set of graphic compatibility characters created for shading. They are designed to fill a portion of a display cell or to fill a display cell with some defined degree of shading. The Block Elements characters were used in graphic displays in terminals or in terminal modes when bit-mapped graphics weren't available. The Unicode Standard does not encourage use of this model of character-cell-based graphics. The Block Elements characters derive from earlier standards and terminal graphic sets, and appeared in various MS-DOS codepages, including the IBM PC CP 437, which dates at least to the 1980s.

Unicode blocks Block Elements
Alternate names
Timeframe 1980s? to present
Regions European
Type symbols
Direction
Status living
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0, defined by: The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 498-499 (Section 15.7).
Secondary sources
Proposal
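
The shading role of the Block Elements described above can be illustrated with a text-mode bar. This is a minimal sketch that uses only two characters from the block, U+2588 FULL BLOCK and U+2591 LIGHT SHADE; the function name shaded_bar and its parameters are hypothetical.

```python
FULL_BLOCK = "\u2588"   # fills the whole display cell
LIGHT_SHADE = "\u2591"  # fills the cell with light shading


def shaded_bar(fraction: float, width: int = 10) -> str:
    """Render fraction (clamped to 0..1) as a bar that is width cells wide."""
    fraction = min(max(fraction, 0.0), 1.0)
    filled = round(fraction * width)
    return FULL_BLOCK * filled + LIGHT_SHADE * (width - filled)


for value in (0.0, 0.3, 0.8, 1.0):
    print(shaded_bar(value), value)
```

Each character occupies one display cell, so the output lines up only in a monospaced font, which is the character-cell model this entry describes.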

Bopomofo

The Bopomofo block is made up of a set of characters used to annotate or teach the phonetics of Chinese, primarily the standard Mandarin language. These characters appear in dictionaries and teaching materials, but do not appear in the actual writing of Chinese text. The name “Bopomofo” comes from the first four letters of the system, though the formal Chinese names are Zhuyin-Zimu ('phonetic alphabet') and Zhuyin-Fuhao ('phonetic symbols'). The Bopomofo characters were devised in the period following the 1911 revolution, as part of a populist literacy campaign. Bopomofo are part alphabet and part syllabary. Note that in the People’s Republic of China, the function of the Bopomofo has been largely taken over by the Pinyin romanization system. The Extended Bopomofo block includes additional phonetic characters used to represent sounds of Chinese dialects other than Mandarin and a few characters needed to represent an old orthography devised for the Ge and Hmu languages.

Unicode blocks Bopomofo, Bopomofo Extended
Alternate names Zhuyin-Zimu, Zhuyin-Fuhao
Timeframe 1911 to present
Regions European
Type alphabet/syllabary
Direction variable
Status living
Number of speakers 894 million (potential users)
Languages Chinese, Hmu, Ge, Taiwanese, and other aboriginal languages of Taiwan
Main sources Mair, V. 1996. "Modern Chinese Writing" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 200-208.
Secondary sources
Proposal

Box Drawing

The Box Drawing symbols are a collection of graphic compatibility characters meant for drawing boxes of various shapes and line widths for user interface components used in character-cell-based graphic systems. The Unicode Standard does not encourage use of this model of character-cell-based graphics. The Box Drawing characters derive from earlier standards and terminal graphic sets, and appeared in various MS-DOS codepages, including the IBM PC CP 437, which dates at least to the 1980s.

Unicode blocks Box Drawing
Alternate names
Timeframe 1980s? to present
Regions European
Type symbols
Direction
Status living
Number of speakers
Languages
Main sources The Unicode Consortium. 2011. The Unicode Standard, Version 6.0, defined by: The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 498-499 (Section 15.7).
Secondary sources
Proposal
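
The use of Box Drawing characters for character-cell user interface components, as described above, can be sketched by framing a line of text. This is a minimal illustration using six light-line characters from the block; the function name draw_box is hypothetical.

```python
TOP_LEFT, TOP_RIGHT = "\u250c", "\u2510"        # light down-and-right, down-and-left
BOTTOM_LEFT, BOTTOM_RIGHT = "\u2514", "\u2518"  # light up-and-right, up-and-left
HORIZONTAL, VERTICAL = "\u2500", "\u2502"       # light horizontal, light vertical


def draw_box(text: str) -> str:
    """Frame a single line of text with one cell of padding on each side."""
    inner_width = len(text) + 2
    top = TOP_LEFT + HORIZONTAL * inner_width + TOP_RIGHT
    middle = f"{VERTICAL} {text} {VERTICAL}"
    bottom = BOTTOM_LEFT + HORIZONTAL * inner_width + BOTTOM_RIGHT
    return "\n".join([top, middle, bottom])


print(draw_box("Box Drawing"))
```

The sketch assumes every character of the text occupies exactly one display cell; wide or combining characters would need a width calculation that is not shown here.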

Brahmi

The Brahmi script is a “parent script”, ancestor of many Asian scripts, particularly those from India (including Devanagari, Tamil, etc.). The oldest datable records in this script are the rock and pillar inscriptions of the Mauryan emperor Aśoka throughout India, which date to the 3C BCE. It was used into the late first millennium CE.

Unicode blocks Brahmi
Alternate names
Timeframe x-3C to 1000 CE
Regions European
Type abugida
Direction left to right
Status historical
Number of speakers 0
Languages Inscriptional Prakrits (including Inscriptional Pali), Pali, Ardhamagadhi, Maharashtri (including Jaina-Maharashtri), other literary Prakrits, Shauraseni and Jaina-Shauraseni, Magadhi, Sanskrit, Tamil
Main sources Salomon, R. 1996. “Brahmi and Kharosthi” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 373-383.
Secondary sources Jamison, S. 2004. "Sanskrit" and "Middle Indic" in The Cambridge Encyclopedia of the World’s Ancient Languages, ed. Roger Woodard. Cambridge: Cambridge University Press, pp. 673-699, 700-716.
Proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3490.pdf; http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3491.pdf