Sharada
The Sharada script is an important historical Brahmi-based script used for Kashmiri, Sanskrit, and a number of other languages of northern South Asia. It was the main inscriptional and literary script of Kashmir from 8C to 20C, and a small group of Kashmiri Pandits still uses it in rituals and to write horoscopes. Historically, the script was used across northern India, Pakistan, and Afghanistan.
Unicode blocks | Sharada |
Alternate names | Sharda, Sarada |
Timeframe | 8C to 20C |
Regions | South Asian |
Type | abugida |
Direction | left to right |
Status | liturgical |
Number of speakers | — |
Languages | Kashmiri, Sanskrit |
Main sources | Kaul Deambi, Bhushan Kumar. 2008. Sarada and Takari Alphabets: Origin and Development. New Delhi: Indira Gandhi National Centre for the Arts. |
Secondary sources | Grierson, G. 1919. The Linguistic Survey of India. Volume VIII. Indo-Aryan Family. North-Western Group. Part II. Dardic or Pisacha Languages (Including Kashmiri). Calcutta: Office of the Superintendent of Government Printing, India. |
Proposal | http://std.dkuug.dk/JTC1/SC2/WG2/docs/n3595.pdf |
Shavian
The Shavian script, also known as the Shaw script, is used for the phonetic spelling of English and contains 40 letters. Playwright George Bernard Shaw directed in his will that the Public Trustee in Britain search for and publish an alphabet for English with 40 (or fewer) letters. This request from Shaw was an attempt to address the idiosyncrasies of English orthography. The script that was selected was devised by Kingsley Read, but has not met with widespread acceptance. A version of Shaw's play Androcles and the Lion: An Old Fable Renovated was published containing English spelling and Shavian, and is generally accepted as the normative version of the script.
Unicode blocks | Shavian |
Alternate names | Shaw's Alphabet, Proposed British Alphabet, Shaw script |
Timeframe | 1958 |
Regions | South Asian |
Type | alphabet |
Direction | left to right |
Status | artificial |
Number of speakers | — |
Languages | English |
Main sources | Crystal, David. 1997. The Cambridge Encyclopedia of Language. 2nd ed. Cambridge, New York: Cambridge University Press, p. 216. |
Secondary sources | Shaw, George Bernard. 1962. Androcles and the Lion: An Old Fable Renovated, by Bernard Shaw, with a Parallel Text in Shaw’s Alphabet to Be Read in Conjunction Showing Its Economies in Writing and Reading. Harmondsworth: Penguin Books. |
Proposal | — |
Sinhala
The Sinhala script, also called Sinhalese, is used to write the Sinhala language (the majority language in Sri Lanka), Tamil, and the liturgical languages Pali and Sanskrit. The script descends from Brahmi and resembles the scripts of South Asia.
Unicode blocks | Sinhala |
Alternate names | — |
Timeframe | x-3C (or -2C) to present |
Regions | South Asian |
Type | abugida |
Direction | left to right |
Status | living |
Number of speakers | 19.2 million |
Languages | Sinhala, Tamil, Pali and Sanskrit |
Main sources | Gair, J. 1996. “Sinhala Writing” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 408-412. |
Secondary sources | — |
Proposal | — |
Small Form Variants
The Small Form Variants block contains small variants of ASCII punctuation marks, including a small ampersand, a small percent sign, a small question mark, and a small comma. They were encoded in the Unicode Standard as compatibility characters from the Chinese standard CNS 11643.
Unicode blocks | Small Form Variants |
Alternate names | — |
Timeframe | — |
Regions | South Asian |
Type | — |
Direction | — |
Status | — |
Number of speakers | — |
Languages | — |
Main sources | The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, p. 201 (Section 6.2). |
Secondary sources | CNS 11643-1992: Zhongwen biaozhun jiaohuanma (Chinese standard interchange code). Taipei: 1992. |
Proposal | — |
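The compatibility status of the Small Form Variants can be observed through compatibility normalization. The following is a minimal sketch in Python using the standard `unicodedata` module. The helper name `fold_small_forms` is an assumption made for this illustration, and the four sample characters are the ones named in the description of the block.

```python
import unicodedata

# The four characters named in the block description, given as escapes:
# SMALL AMPERSAND, SMALL PERCENT SIGN, SMALL QUESTION MARK, SMALL COMMA.
SAMPLES = ["\uFE60", "\uFE6A", "\uFE56", "\uFE50"]


def fold_small_forms(text: str) -> str:
    """Replace compatibility characters with their ordinary equivalents (NFKC).

    Hypothetical helper for illustration; NFKC also folds many other
    compatibility characters, not only those in this block.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    for ch in SAMPLES:
        # decomposition() shows the <small> compatibility mapping, if any.
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
              f"{unicodedata.decomposition(ch)} -> {fold_small_forms(ch)!r}")
```

Because NFKC discards the distinction between the small form and the ordinary character, it is suited to searching and comparison rather than to round-tripping text that came from CNS 11643.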
Sora Sompeng
The Sora Sompeng script is used to write the Sora language in the Orissa-Andhra border area of India. The script was developed in 1936 by Mangei Gomango, based on a vision of 24 letters that he received. The script is used today in religious contexts and appears in a variety of published materials.
Unicode blocks | Sora Sompeng |
Alternate names | — |
Timeframe | 1936 to present |
Regions | South Asian |
Type | abugida |
Direction | left to right |
Status | living |
Number of speakers | 310000 |
Languages | Sora |
Main sources | Zide, N. 1996. “Scripts for the Munda Languages: Sorang Sompeng” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 612-614. |
Secondary sources | — |
Proposal | http://std.dkuug.dk/JTC1/SC2/WG2/docs/n3647.pdf |
Spacing Modifier Letters
The Spacing Modifier Letters block is primarily made up of a set of phonetic modifiers used to indicate that the pronunciation of an adjacent letter is different in some way, or to mark stress or tone. In some cases, the character may itself represent a sound. The block includes many characters required for the International Phonetic Alphabet, and a number of Uralic Phonetic Alphabet modifiers. Spacing clones of diacritics, specified in some corporate standards, are also included.
Unicode blocks | Spacing Modifier Letters |
Alternate names | — |
Timeframe | various |
Regions | South Asian |
Type | alphabet |
Direction | — |
Status | living |
Number of speakers | — |
Languages | — |
Main sources | The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 228-229 (Section 7.8). |
Secondary sources | — |
Proposal | — |
Sundanese
The Sundanese script is used to write the Sundanese language, which is spoken in West Java, Indonesia. The script is a descendant of Brahmi, and hence is related to many other scripts of South Asia and Southeast Asia that are derived from Brahmi. Today Sundanese is primarily written using the Latin script, but the Sundanese script is taught in schools and appears on signage. Old Sundanese (Sunda Kuna) dates from 14C to 18C, and is handled by the characters in the Sundanese and the Sundanese Supplement blocks. Modern Sundanese has been in use since 17C. The current form of the script was made official in 1996.
Unicode blocks | Sundanese, Sundanese Supplement |
Alternate names | aksara Sunda, Sunda Baku |
Timeframe | 14C to present |
Regions | South Asian |
Type | abugida |
Direction | left to right |
Status | living |
Number of speakers | 34 million |
Languages | Sundanese, Sanskrit, Old Sundanese |
Main sources | Baidillah, Idin, Cucu Komara, and Deuis Fitni. [2002] Ngalagena: Panglengkep Pangajaran Aksara Sunda pikeun Murid Sakola Dasar/Dikdas 9 Taun. [Bandung]: CV Walatra. |
Secondary sources | — |
Proposal | http://std.dkuug.dk/JTC1/SC2/WG2/docs/n3022.pdf; http://std.dkuug.dk/JTC1/SC2/WG2/docs/n3666.pdf |
Superscripts and Subscripts
The Superscripts and Subscripts block includes letters and digits that are positioned above or below the baseline in typographical layout. Where the raised or lowered characters do not belong to plain text, superscripts and subscripts should usually be handled with style or mark-up rather than with characters from this block. The exception is when the superscript or subscript letters are part of a specialized phonetic alphabet, such as the Uralic Phonetic Alphabet. Several of the characters in this block derive from other standards or vendor code pages, and are considered compatibility characters.
Unicode blocks | Superscripts and Subscripts |
Alternate names | — |
Timeframe | various |
Regions | South Asian |
Type | — |
Direction | — |
Status | living |
Number of speakers | — |
Languages | — |
Main sources | The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 488-489 (Section 15.3). |
Secondary sources | — |
Proposal | — |
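The recommendation to prefer mark-up over the Superscripts and Subscripts characters can be illustrated with a short conversion. The following is a sketch in Python, assuming HTML `<sup>` and `<sub>` as the target mark-up; the function name `to_markup` and the choice of HTML are assumptions for this example. It relies only on the compatibility decompositions exposed by the standard `unicodedata` module and is limited to the block's range, U+2070 to U+209F.

```python
import unicodedata

# Compatibility decomposition tags mapped to (assumed) HTML element names.
TAGS = {"<super>": "sup", "<sub>": "sub"}

# Code point range of the Superscripts and Subscripts block.
BLOCK_START, BLOCK_END = 0x2070, 0x209F


def to_markup(text: str) -> str:
    """Rewrite characters from the block as a base character wrapped in mark-up.

    Hypothetical helper: characters outside the block, and unassigned code
    points inside it, are passed through unchanged.
    """
    out = []
    for ch in text:
        parts = unicodedata.decomposition(ch).split()
        in_block = BLOCK_START <= ord(ch) <= BLOCK_END
        if in_block and parts and parts[0] in TAGS:
            # Remaining fields are hexadecimal code points of the base text.
            base = "".join(chr(int(cp, 16)) for cp in parts[1:])
            tag = TAGS[parts[0]]
            out.append(f"<{tag}>{base}</{tag}>")
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(to_markup("H\u2082O"))  # subscript two inside a formula
    print(to_markup("x\u2074"))   # superscript four as an exponent
```

A conversion like this would not be appropriate for phonetic text such as the Uralic Phonetic Alphabet, where the characters are part of the plain text; a real tool would need some way to leave such text untouched.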
Syloti Nagri
The Syloti Nagri script is used for writing the Sylheti language, an Indo-European language spoken in the Barak Valley region of northeast Bangladesh and southeast Assam in India. The script is derived from Brahmi. It has traditionally been dated to 14C, but may be dated to 16C or 18C.
Unicode blocks | Syloti Nagri |
Alternate names | Jalalavad |
Timeframe | 14C? to present |
Regions | South Asian |
Type | abugida |
Direction | left to right |
Status | living |
Number of speakers | 10.3 million |
Languages | Sylheti |
Main sources | Bhuiya, M.A. 2000. Jalalavadi Nagri: a unique script & literature of Sylheti Bangla. Badarpur, Assam, India: National Publishers. |
Secondary sources | Qadir, Dr. S.M. Ghulam. 1999. Sileti Nagri Lipi - Bhasha O Sahitya (The Sylheti Nagri script - language and literature). PhD thesis, Bangla Academy, Dhaka. |
Proposal | http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2591.pdf; http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2592.pdf |
Syriac
The Syriac script is used for writing a number of modern languages and dialects, including literary usages, Neo-Aramaic dialects, Garshuni (Arabic written in the Syriac script), Christian Palestinian Aramaic, and historically for writing Armenian, Persian, and other languages. The earliest datable Syriac writing dates from 6 CE. Syriac is also the active liturgical language for several communities in the Middle East (Syrian Orthodox, Assyrian, Maronite, Syrian Catholic, and Chaldaean) and southeast India (Syro-Malabar and Syro-Malankara).
Unicode blocks | Syriac |
Alternate names | — |
Timeframe | 6C to present |
Regions | South Asian |
Type | abjad |
Direction | right to left |
Status | living |
Number of speakers | 501000 |
Languages | Syriac (Assyrian Neo-Aramaic and Chaldean Neo-Aramaic), Arabic (including "Garshuni"), Turoyo, Armenian, Christian Palestinian Aramaic, Persian, Malayalam, Sogdian, Ottoman Turkish |
Main sources | Daniels, P. 1996. “Aramaic Scripts for Aramaic Languages” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 499-510. |
Secondary sources | — |
Proposal | — |
Tagalog
The Tagalog script was used to write Tagalog, Bisaya, Ilocano, and other languages in the Philippines. Accounts written by Spanish missionaries in the mid-1500s mention the Tagalog script. However, the script fell out of common usage by the mid-1700s. The modern Tagalog language, also known as Filipino, is today written in the Latin script. The Tagalog script is a Brahmi-derived script, distantly related to the South Indian scripts. It is closely related to the Buhid, Hanunóo, and Tagbanwa scripts of the Philippines, though it may not be their direct parent. The ancestor of all four Philippine scripts may have been transported to the Philippines via palaeographic scripts of western Java between 10C and 14C CE.
Unicode blocks | Tagalog |
Alternate names | Baybayin |
Timeframe | 16C to mid-18C |
Regions | South Asian |
Type | abugida |
Direction | left to right |
Status | historical |
Number of speakers | 0 |
Languages | Tagalog (Filipino), Bisaya, Ilocano and other languages |
Main sources | Kuipers, J., and R. McDermott. 1996. "Insular Southeast Asian Scripts" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 474-484. |
Secondary sources | Santos, Hector. 1994. The Living Scripts. Los Angeles: Sushi Dog Graphics. (Ancient Philippine scripts series, 2). |
Proposal | http://std.dkuug.dk/jtc1/sc2/wg2/docs/n1933.pdf |
Tagbanwa
Tagbanwa is a living script used to write the Tagbanwa language (also known as Apurahuanoin) in Palawan, the Philippines. Tagbanwa is a Brahmi-derived script, distantly related to the South Indian scripts. It is closely related to the Hanunóo and Buhid scripts of the Philippines. All three scripts are related to Tagalog, but may not be directly descended from it. The ancestor of these Philippine scripts (including Tagalog) may have been transported to the Philippines via palaeographic scripts of western Java between 10C and 14C CE.
Unicode blocks | Tagbanwa |
Alternate names | Bisaya |
Timeframe | pre-19C to present |
Regions | South Asian |
Type | abugida |
Direction | left to right |
Status | living |
Number of speakers | 10000 |
Languages | Tagbanwa |
Main sources | Kuipers, J., and R. McDermott. 1996. "Insular Southeast Asian Scripts" in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 474-484. |
Secondary sources | Santos, Hector. 1994. The Living Scripts. Los Angeles: Sushi Dog Graphics. (Ancient Philippine scripts series, 2). |
Proposal | http://std.dkuug.dk/jtc1/sc2/wg2/docs/n1933.pdf |
Tai Le
The Tai Le script is used to write the Tai Le language (also known as Tai Nüa, Dehong Dai, Tai Mau, Tai Kong, and Chinese Shan), spoken primarily in south central Yunnan, China. The script derives from Old Dehong Dai, whose history goes back some 700-800 years. The present form of the script dates to ca. 1954, when a systematic representation of the tones was introduced with the use of combining diacritics. The script was revised again in 1988.
Unicode blocks | Tai Le |
Alternate names | Tai Nüa, Dehong Dai |
Timeframe | ca. 1954 to present |
Regions | South Asian |
Type | alphabet |
Direction | left to right |
Status | living |
Number of speakers | 647400 |
Languages | Tai Le |
Main sources | Coulmas, Florian. 1996. The Blackwell Encyclopedia of Writing Systems. Oxford, Cambridge: Blackwell, pp. 118-119. |
Secondary sources | — |
Proposal | http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2672.pdf |
Tai Tham
The Tai Tham script, sometimes called Lanna, Old Tai Lue, or Old Xishuangbanna Dai, is a descendant of the Brahmi and Old Mon scripts. It is used for the Kam Mu'ang (Northern Thai), Tai Lue, and Khün languages. It is also used for religious purposes to write Lao Tham (Old Lao), and is found in old manuscripts in temples in Northern Thailand.
Unicode blocks | Tai Tham |
Alternate names | Lanna, Old Xishuangbanna Dai, Tham, Yuan |
Timeframe | 13C to present |
Regions | South Asian |
Type | abugida |
Direction | left to right |
Status | living |
Number of speakers | 100000 |
Languages | Kam Mu'ang, Tai Lue, Khün, Lao Tham |
Main sources | Peltier, Anatole-Roger. 1996. Lanna Reader. Chiang Mai: Wat Tha Kradas. |
Secondary sources | — |
Proposal | http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3207.pdf |
Tai Viet
The Tai Viet script is used to write three Tai languages spoken primarily in northwestern Vietnam, northern Laos, and central Thailand: Tai Dam (also known as Black Tai or Tai Noir), Tai Dón (White Tai or Tai Blanc), and Thai Song (Lao Song or Lao Song Dam). The traditional form of the script varies greatly from community to community. There has been an attempt to establish a standard for the Tai script, called the Unified Alphabet. The script is used today by the Tai people in Vietnam.
Unicode blocks | Tai Viet |
Alternate names | Viet Thai, Tay Viet |
Timeframe | 16C? to present |
Regions | South Asian |
Type | abugida |
Direction | left to right |
Status | living |
Number of speakers | 1.3 million |
Languages | Tai Dam, Tai Dón, and Thai Song |
Main sources | Cầm Trọng. 2005. “Thai Scripts in Vietnam” in Workshop on the Preservation and Digitization of Tai Scripts. Hanoi, Vietnam. |
Secondary sources | Baccam Don, Baccam Faluang, Baccam Hung, and Dorothy Fippinger. 1989. Tai Dam – English, English – Tai Dam Vocabulary Book. Summer Institute of Linguistics. |
Proposal | http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3220.pdf |
Tai Xuan Jing Symbols
The Tai Xuan Jing symbols include sets of monogram, digram and tetragram signs. These symbols appeared in China in a text called Tai Xuan Jing (literally, “the exceedingly arcane classic”), composed in 2 BCE by Yang Xiong (53 BCE-18 CE). The text is known in the West by several titles, including The Alternative I Ching and The Elemental Changes. The work is still published today.
Unicode blocks | Tai Xuan Jing Symbols |
Alternate names | — |
Timeframe | x-2C to present |
Regions | South Asian |
Type | symbols |
Direction | variable |
Status | historical |
Number of speakers | 0 |
Languages | Chinese |
Main sources | The Unicode Consortium. 2011. The Unicode Standard, Version 6.0. Mountain View, CA: The Unicode Consortium, pp. 506-507 (Section 15.8). |
Secondary sources | — |
Proposal | — |
Takri
The Takri script is used to write a variety of languages in the western regions of the Himalayas, in present-day Jammu and Kashmir, Himachal Pradesh, Punjab, and Uttarakhand. It was used primarily from 17C to 20C. Takri is derived from the Sharada family of Brahmi scripts. There are reports of efforts to revive Takri for writing languages such as Dogri, Kishtwari, and Kulvi. A number of regional varieties of the script exist.
Unicode blocks | Takri |
Alternate names | Takari, Takkari, Tankri |
Timeframe | 17C to 20C |
Regions | South Asian |
Type | abugida |
Direction | left to right |
Status | historical |
Number of speakers | 0 |
Languages | Bhattiyali, Chambeali, Dogri, Gaddi, Gahri, Jaunsari, Kangri, Kinnauri, Kishtwari, Kulvi, Mahasu, Mandeali, Sirmauri |
Main sources | Kaul Deambi, Bhushan Kumar. 2008. Śāradā and Ṭākarī Alphabets: Origin and Development. New Delhi: Indira Gandhi National Centre for the Arts. |
Secondary sources | — |
Proposal | http://std.dkuug.dk/JTC1/SC2/WG2/docs/n3758.pdf |
Tamil
The Tamil script descends from the South Indian branch of Brahmi. It is used to write the Tamil language of the Tamil Nadu state in south India and surrounding states, as well as for minority languages such as Badaga, Irula, Paniya, and Saurashtra. Tamil is also spoken in Sri Lanka, Singapore, and parts of Malaysia.
Unicode blocks | Tamil |
Alternate names | — |
Timeframe | 6C or 7C? to present |
Regions | South Asian |
Type | abugida |
Direction | left to right |
Status | living |
Number of speakers | 66.5 million |
Languages | Tamil, Badaga, Irula, Paniya and Saurashtra |
Main sources | Steever, S. 1996. “Tamil Writing” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 426-430. |
Secondary sources | — |
Proposal | — |
Telugu
The Telugu script is used to write the Telugu language, spoken in the south central Indian state of Andhra Pradesh and nearby states. It is also used to write minority languages such as Gondi and Lambadi. It became a distinct script in 13C CE. The Telugu script shares a common ancestor with the Kannada script.
Unicode blocks | Telugu |
Alternate names | — |
Timeframe | 13C to present |
Regions | South Asian |
Type | abugida |
Direction | left to right |
Status | living |
Number of speakers | 69.7 million |
Languages | Telugu, Gondi and Lambadi |
Main sources | Bright, W. 1996. “Kannada and Telugu Writing” in The World’s Writing Systems, ed. Peter T. Daniels & William Bright. New York; Oxford: Oxford University Press, pp. 413-419. |
Secondary sources | — |
Proposal | — |
Thaana
The Thaana (or Taana, Tāna) script is used to write the modern Dhivehi (Divehi) language of the Republic of Maldives. Although Thaana has borrowed many of its glyphs from Arabic and shares a number of features with Arabic writing, Thaana is a true alphabet because the writing of vowels is mandatory. Thaana also derives some of its letters from an earlier script that was used on the Maldives, Dhives Akuru. Thaana was developed in the 18C and largely replaced Dhives Akuru at that time.
Unicode blocks | Thaana |
Alternate names | Taana, Tāna |
Timeframe | 18C to present |
Regions | South Asian |
Type | alphabet |
Direction | right to left |
Status | living |
Number of speakers | 371000 |
Languages | Dhivehi (Maldivian) |
Main sources | Geiger, Wilhelm. 1996. Maldivian Linguistic Studies. New Delhi: Asian Educational Services. |
Secondary sources | Maniku, Hassan Ahmed. 1990. Say It in Maldivian (Dhivehi), [by] H. A. Maniku [and] J. B. Disanayaka. Colombo: Lake House Investments. |
Proposal | — |