Hindi is the most widely spoken language in India and the primary tongue of 41 percent of its people. English is a subsidiary official language. Hindi and its sister language Urdu are Indo-European languages. The two languages came into wide use in the 17th and 18th centuries. They are essentially the same, except that Urdu uses more words of Farsi (Persian) and Arabic origin while Hindi uses more words from Sanskrit (the ancient Brahman language). Both Hindi and Urdu have elements in common with other Indo-European languages such as English and French.

By one count there are 325 classified languages (including 15 official languages and 18 major languages, many derived from Sanskrit), over 500 minor languages and 1,653 dialects. Most of these languages are part of the Indo-European family of languages. The majority of Indians speak the language of their ethnic group as their first language and learn Hindi (or Urdu) and English in school. Government business is conducted in 15 languages, using at least a half dozen different scripts.

India is home to some of the most widely spoken languages in the world: Hindi is spoken by 500 million people, Bengali by 250 million (100 million in India and 150 million in Bangladesh), Telugu by 100 million, Punjabi by 95 million, and Tamil and Marathi by 90 million each. The states are largely divided up on the basis of linguistic group: Kashmiri is spoken in Kashmir, Marathi in Maharashtra, Tamil in Tamil Nadu, and, less obviously, Hindi in Uttar Pradesh, Malayalam in Kerala, and Telugu in Andhra Pradesh.

Most widely spoken languages in India: Hindi: 41 percent; Bengali: 8.1 percent; Telugu: 7.2 percent; Marathi: 7 percent; Tamil: 5.9 percent; Urdu: 5 percent; Gujarati: 4.5 percent; Kannada: 3.7 percent; Malayalam: 3.2 percent; Oriya: 3.2 percent; Punjabi: 2.8 percent; Assamese: 1.3 percent; Maithili: 1.2 percent; other: 5.9 percent.

The 15 official languages are: Hindi, Bengali, Telugu, Marathi, Tamil, Urdu, Gujarati, Malayalam, Kannada, Oriya, Punjabi, Assamese, Kashmiri, Sindhi, and Sanskrit. Hindustani is a popular variant of Hindi/Urdu spoken widely throughout northern India but is not an official language. [Source: CIA World Factbook, Library of Congress]

About 45 percent of Indians speak Urdu or Hindi. Urdu is also the national language of Pakistan. Only around three to five percent of the population is truly fluent in both English and an Indian language. But English speakers include nearly all the educated elite and people who come in contact with tourists, although knowledge of English varies widely, from fluency to knowledge of just a few words. While English is relegated to the status of subsidiary official language, it is the most important language for national, political, and commercial communication.

Diversity of Languages in India

The constitution of India recognizes 15 official languages (most countries have just one). There are 35 Indian languages spoken by more than a million people. Many of these languages have their own scripts and are as different from one another as English is from Chinese. Countries with the most languages: 1) Papua New Guinea (832); 2) Indonesia (731); 3) Nigeria (515); 4) India (400); 5) Mexico (300); 6) Cameroon (300); 7) Australia (300); 8) Brazil (234).

People in India speak between 300 and 3,000 languages and up to 22,000 dialects, depending on who is doing the counting. The total number of languages and dialects varies by source and counting method, and many Indians speak more than one language. The Indian census lists 114 languages (22 of which are spoken by one million or more persons) that are further categorized into 216 dialects or “mother tongues” spoken by 10,000 or more speakers. An estimated 850 languages are in daily use, and the Indian Government claims there are more than 1,600 dialects. Dialects that belong to a particular language are not always mutually comprehensible. [Source: Library of Congress, 2005 *]

India's ethnic, linguistic, and regional complexity sets it apart from other nations. To gain even a superficial understanding of the relationships governing the huge number of ethnic, linguistic, and regional groups, the country should be visualized not as a nation-state but as the seat of a major world civilization on the scale of Europe. The population is not only immense but also has been highly varied throughout recorded history; its systems of values have always encouraged diversity. The linguistic requirements of numerous former empires, an independent nation, and modern communication are superimposed on a heterogeneous sociocultural base. [Source: Library of Congress, 1995 *]

Almost 8 percent of the population belongs to social groups recognized by the government as Scheduled Tribes, with social structures somewhat different from the mainstream of society. Powerful trends of "regionalism" — both in the sense of an increasing attachment to the states as opposed to the central government, and in the sense of movements for separation from the present states or greater autonomy for regions within them — threaten the current distribution of power and delineation of political divisions of territory. *

Sir George Grierson's twelve-volume Linguistic Survey of India, published between 1903 and 1923, identified 179 languages and 544 dialects. The 1921 census listed 188 languages and forty-nine dialects. The 1961 census listed 184 "mother tongues," including those with fewer than 10,000 speakers. This census also gave a list of all the names of mother tongues provided by the respondents themselves; the list totals 1,652 names. The 1981 census, the last census to tabulate languages, reported 112 mother tongues with more than 10,000 speakers and almost 1 million people speaking other languages. The encyclopedic People of India series, published by the government's Anthropological Survey of India in the 1980s and early 1990s, identified seventy-five "major languages" within a total of 325 languages used in Indian households. In the early 1990s, there were thirty-two languages with 1 million or more speakers. *

Language Groups in India

The languages of South Asia fall into four language groups: 1) Indo-Aryan, or Indic: a branch of the Indo-European family, dominant in Pakistan, northern India and Bangladesh, which includes Hindi and its many variants, Punjabi, Sinali, Urdu and Bengali; 2) Dravidian: found mostly in southern India and northeastern Sri Lanka, with pockets elsewhere in South Asia, including Tamil and Malayalam; 3) Tibeto-Burman: found in the Himalayan region and far eastern India; and 4) Austroasiatic (Austric or Munda): spoken mostly by tribal groups in Assam, northeast India and Bangladesh.

The overwhelming majority of Indians speak Indo-Aryan or Dravidian languages. The majority of the languages spoken in the north are Indo-European languages derived from Sanskrit. Tamil and other languages spoken in the south are Dravidian languages, which are not related to Sanskrit or to other members of the Indo-European family of languages. Some linguists identify other language groups such as Dardic, including Kashmiri. Another family, Andamanese, is spoken by at most a few hundred among the indigenous tribal peoples in the Andaman Islands, and has no agreed-upon connections with families outside them.

Burushaski is an unusual language spoken in a mountainous region of northwestern Kashmir by only around 40,000 people. It is of interest to linguists and archaeologists because it is unrelated to any other language in the area or the world.

The four major families are as different in their form and construction as are, for example, the Indo-European and Semitic families. A variety of scripts are employed in writing the different languages. Furthermore, most of the more widely used Indian languages exist in a number of different forms or dialects influenced by complex geographic and social patterns. [Source: Library of Congress, 1995 *]

About 80 percent of all Indians speak one of the Indo-Aryan group of languages. Persian and the languages of Afghanistan are close relatives, belonging, like the Indo-Aryan languages, to the Indo-Iranian branch of the Indo-European family. Brought into India from the northwest during the second millennium B.C., the Indo-Aryan tongues spread throughout the north, gradually displacing the earlier languages of the area.*

Despite the extensive linguistic diversity in India, many scholars treat South Asia as a single linguistic area because the various language families share a number of features not found together outside South Asia. Languages entering South Asia were "Indianized." Scholars cite the presence of retroflex consonants, characteristic structures in verb formations, and a significant amount of vocabulary in Sanskrit with Dravidian or Austroasiatic origin as indications of mutual borrowing, influences, and counter-influences. Retroflex consonants, for example, which are formed with the tongue curled back to the hard palate, appear to have been incorporated into Sanskrit and other Indo-Aryan languages through the medium of borrowed Dravidian words.*

Sanskrit, Prakrits and the History of Indo-Aryan Languages of India

Modern linguistic knowledge of the process of assimilation of Indo-Aryan languages comes through the Sanskrit language employed in the sacred literature known as the Vedas. Over a period of centuries, Indo-Aryan languages came to predominate in the northern and central portions of South Asia. [Source: Library of Congress *]

Sanskrit is the ancient language of India and the sacred language of Hinduism. The Asian cousin of Latin and Greek, it is ideal for chanting as it is full of sounds that resonate in a special way. Traditionally it was taboo for any caste other than Brahmans (India’s highest caste) to learn Sanskrit, "the language of the gods." The Hindu epic “Ramayana” described a lower caste man who had molten metal poured in his ear after he listened to Sanskrit scriptures reserved for upper class Brahmans.

As Indo-Aryan speakers spread across northern and central India, their languages experienced constant change and development. By about 500 B.C., Prakrits, or "common" forms of speech, were widespread throughout the north. By about the same time, the "sacred," "polished," or "pure" tongue — Sanskrit — used in religious rites had also developed along independent lines, changing significantly from the form used in the Vedas. However, its use in ritual settings encouraged the retention of archaic forms lost in the Prakrits. Concerns for the purity and correctness of Sanskrit gave rise to an elaborate science of grammar and phonetics and an alphabetical system seen by some scholars as superior to the Roman system. By the fourth century B.C., these trends had culminated in the work of Panini, whose Sanskrit grammar, the Ashtadhyayi (Eight Chapters), set the basic form of Sanskrit for subsequent generations. Panini's work is often compared to Euclid's as an intellectual feat of systematization.*

The Prakrits continued to evolve through everyday use. One of these dialects was Pali, which was spoken in the western portion of peninsular India. Pali became the language of Theravada Buddhism; eventually it came to be identified exclusively with religious contexts. By around A.D. 500, the Prakrits had changed further into Apabhramshas, or the "decayed" speech; it is from these dialects that the contemporary Indo-Aryan languages of South Asia developed. The rudiments of modern Indo-Aryan vernaculars were in place by about A.D. 1000 to 1300.*

It would be misleading, however, to call Sanskrit a dead language because for many centuries huge numbers of works in all genres and on all subjects continued to be written in Sanskrit. Original works are still written in it, although in much smaller numbers than formerly. Many students still learn Sanskrit as a second or third language, classical music concerts regularly feature Sanskrit vocal compositions, and there are even television programs conducted entirely in Sanskrit.*

Dravidian and the History of Non-Indo-Aryan Languages of India

Around 18 percent of the Indian populace (about 200 million people) speak Dravidian languages. Most Dravidian speakers reside in South India, where Indo-Aryan influence was less extensive than in the north. Only a few isolated groups of Dravidian speakers, such as the Gonds in Madhya Pradesh and Orissa, and the Kurukhs in Madhya Pradesh and Bihar, remain in the north as representatives of the Dravidian speakers who presumably once dominated much more of South Asia. Other significant populations of Dravidian speakers are the Brahuis in Pakistan and Tamils in Sri Lanka. [Source: Library of Congress *]

The oldest documented Dravidian language is Tamil, with a substantial body of literature, particularly the Cankam poetry, going back to the first century A.D. Kannada and Telugu developed extensive bodies of literature after the sixth century, while Malayalam split from Tamil as a literary language by the twelfth century. In spite of the profound influence of the Sanskrit language and Sanskritic culture on the Dravidian languages, a strong consciousness of the distinctness of Dravidian languages from Sanskrit remained. All four major Dravidian languages had consciously differentiated styles varying in the amount of Sanskrit they contained. In the twentieth century, as part of an anti-Brahman movement in Tamil Nadu, a strong movement arose to "purify" Tamil of its Sanskrit elements, with mixed success. The other three Dravidian languages were not much affected by this trend.*

Sino-Tibetan and Austroasiatic Languages in India

There are smaller groups, mostly tribal peoples, who speak Sino-Tibetan and Austroasiatic languages. Sino-Tibetan speakers live along the Himalayan fringe from Jammu and Kashmir to eastern Assam. They comprise about 1.3 percent, or 12 million, of India's 1995 population. The Austroasiatic languages, composed of the Munda tongues and others thought to be related to them, are spoken by groups of tribal peoples from West Bengal through Bihar and Orissa and into Madhya Pradesh. These groups make up approximately 0.7 percent (about 6.5 million people) of the population. [Source: Library of Congress]

Sino-Tibetan languages predominate in China and mainland Southeast Asia. They are broken into three main subfamilies: 1) Tibeto-Burman, 2) Tai and 3) Sinitic, including many of the languages spoken in China. One distinctive feature of Sino-Tibetan languages is that most words consist of a single syllable. Multi-syllable words are as unthinkable to Tibetans and Chinese as words with only consonants are to English speakers. Sino-Tibetan languages are tonal, which means that the meaning of a word can change with the tone or pitch in which it is spoken.

Vietnamese and Cambodian are Austroasiatic languages. Enclaves of people who speak Austroasiatic languages are also found in Malaysia, Laos, Thailand, Myanmar, and India. There are about 90 million speakers of Austroasiatic languages in the world today. They are also called Munda or Mon-Khmer languages. Although the family may have originated in China, very few people in China speak these languages today (a small enclave near the Myanmar border).

Austroasiatic languages are characterized by an abundance of vowels. In contrast to English, which only has around a dozen vowel sounds, Austroasiatic languages have around 40 or so, including ones that are nasal, non-nasal, long, extra-short, creaky, breathy, normal, high-tongue, low-tongue, medium-high tongue, medium-low tongue, front tongue, back tongue, middle tongue and various combinations of these sounds.

Names in India

Some Indians have only one name. Traditionally, Indians had a given name (first name) and an honorific but no surname (last name). Both men and women put the initial of their father’s name before their own given name. Married women were known by their given name plus their husband’s given name. These days many Indians have European-style names with the given name first and the family name second. Some Indians use their caste or village or region as their last name. Indians rarely call each other by their names. Relatives are often referred to by the equivalent of father, mother, son, daughter even if they are not those things. Family friends are often called "auntie" or "uncle" as a sign of friendship and respect. Older people are sometimes called "father," "mother," "grandfather," or "grandmother," even if they are not blood relatives. Strangers are often greeted with the equivalent of “brother” or “sister.” Even a wife in a traditional family does not call her husband by his name but calls him “the father of so and so.” In other families, husbands and wives often address each other by their first names.

In formal situations or with people they don’t know very well, Indians generally use "Mr.," "Mrs.," "Miss," "Sir," or "Madam," or use titles such as "Dr." Sri is the Indian equivalent of Mr. “Pandit” is an honorific term that means teacher. “Ustad” is the Muslim equivalent of Pandit. Indians sometimes greet male foreigners with the honorific term “Sahib” (“master,” pronounced “saab,” like the car). With women, Indians sometimes add “ji” to the end of the woman's name. Muslims refer to each other using the terms “bin” for a man and “binti” for a woman, followed by the father’s given name.

Family Names

Family names in India often reveal the language, religion, caste and home state of an individual. Male Muslims have names like Muhammed, Ali, Khan and Hussein. Female Muslims have Jan or Begum at the end of their names. Christians are often quickly recognizable by their Biblical names like Paul, Thomas, Andrew or Jacob. If a person has the last name of Singh more than likely he or she is a Sikh or at least from the Punjab.

Mukherjee, Chatterjee and Banerjee are common upper caste Bengali names. Bose and Ghose are common lower caste Bengali names. Krishnamachari, Srinivasan and Padmanabhan are common Tamil names. Last names that end with “kar” or “dey,” like Ramaday and Gavaskar, are typical of Maharashtra. Kumar, Mehta, Gupta, Sharma and Malhotra are names found throughout India. Fernandez is a common last name in the Goa area and a vestige of the Portuguese era. [Source: Gitanjali Kolanad, “Culture Shock: India”]

Common Hindu first names include Vijay, Gopal, Rajendra and Prakash. Many people are named after Hindu gods such as Krishna or Lakshmi. English nicknames are common. Some people have used their English nicknames for so long they don’t respond to their given names.


Text Sources: New York Times, Washington Post, Los Angeles Times, Times of London, Lonely Planet Guides, Library of Congress, Ministry of Tourism, Government of India, Compton’s Encyclopedia, The Guardian, National Geographic, Smithsonian magazine, The New Yorker, Time, Newsweek, Reuters, AP, AFP, Wall Street Journal, The Atlantic Monthly, The Economist, Foreign Policy, Wikipedia, BBC, CNN, and various books, websites and other publications.

Last updated June 2015
