Number of Words in the English Language: 1,013,913

The number of words in the English language is 1,013,913. This is the Global Language Monitor’s estimate as of January 1, 2012.

The English Language passed the Million Word threshold on June 10, 2009 at 10:22 a.m. (GMT). The Millionth Word was the controversial ‘Web 2.0’. Currently a new word is created every 98 minutes, or about 14.7 words per day.
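The two rates quoted above are the same figure in different units. As a quick check, here is a minimal sketch that uses only the 98-minute interval stated in the text; the function name is illustrative and not part of any GLM tool.

```python
MINUTES_PER_DAY = 24 * 60

def words_per_day(minutes_per_word: float) -> float:
    """Convert the interval between new words (in minutes) to a daily rate."""
    return MINUTES_PER_DAY / minutes_per_word

# 98 minutes per new word is the interval quoted in the text.
print(round(words_per_day(98), 1))  # 14.7
```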

Google Validates GLM’s No. of Words in English Prediction

GLM/Google vs OED and Webster’s 3rd

Though GLM’s analysis was the subject of much controversy at the time, the recent Google/Harvard study puts the current number of words in the English language at 1,022,000. The above graphic is from AAAS/Science, as reported on NPR. At the time, the New York Times article on the historic threshold famously quoted several dissenting linguists as claiming that “even Google could not come up with” such a methodology. Unbeknownst to them, Google was doing precisely that.

According to GLM, the number of words in the English language now stands at 1,013,913. The difference between the two figures is 8,087 words, or about 0.8% of the Google/Harvard total, a small gap for two separately derived estimates.
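The size of the gap can be checked from the two totals quoted above. This is a minimal sketch; using the Google/Harvard figure as the baseline is an assumption, since the text does not say which denominator it has in mind.

```python
GLM_ESTIMATE = 1_013_913      # GLM figure quoted above
GOOGLE_HARVARD = 1_022_000    # Google/Harvard figure quoted above

def percent_difference(a: float, b: float) -> float:
    """Absolute difference between a and b, as a percentage of b."""
    return abs(a - b) / b * 100

print(GOOGLE_HARVARD - GLM_ESTIMATE)                               # 8087
print(round(percent_difference(GLM_ESTIMATE, GOOGLE_HARVARD), 2))  # 0.79
```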

Google’s number, which is based on counting the words in the 15,000,000 English-language books it has scanned into the ‘Google Corpus,’ mirrors GLM’s analysis. GLM’s number is based on its algorithmic methodologies, an explanation of which is available from its site.

Frequently Asked Questions About the Global Language Monitor

Q. What is the Global Language Monitor?

A. The Global Language Monitor documents, analyzes, and tracks the latest trends in word usage and word choices and their impact on the various aspects of culture, with a particular emphasis upon Global English. GLM, an internet media analytics company, was founded six years ago in Silicon Valley. It is a direct descendant of yourDictionary.com, the premier multi-language dictionary site with some 230 languages. YDC had very deep academic roots, with some two dozen of the world’s top linguists on its Academic Council of Experts. The Global Language Monitor is one of the first companies to focus exclusively on English as the first true global language, and on its impact on various aspects of culture, such as politics, the arts, entertainment, science, technology, and the like. The leading global media have come to rely upon GLM’s analysis and analytical techniques. The Global Language Monitor is based in Austin, Texas. Paul JJ Payack is the founding president of both companies.

Q. Who is Paul JJ Payack?

A. Paul JJ Payack is the president and Chief Word Analyst of the Global Language Monitor. Payack was born in Morristown, New Jersey, and grew up in neighboring Boonton. (His twin brother, Peter, is a poet, professor and the first ‘Poet Populist’ of Cambridge, Massachusetts.) Payack earned a scholarship to Bucknell University, where he studied psychology and philosophy, took a year off to write his first book, A Ripple in Entropy, and transferred to Harvard University, from which he graduated with a bachelor of arts, concentrating in comparative literature; he subsequently earned a CAGS. After an early stint in academia, Payack spent his career with a number of America’s most innovative technology companies, including such pioneers as Digital Equipment Corporation (DEC), Apollo Computer, Network Systems Corporation, Intelliguard Software, and Legato Systems. He was subsequently a senior executive for three Fortune 500 companies (including Unisys, D&B, and companies that were absorbed by SUN, EMC and HP) as well as a number of Silicon Valley start-ups, spin-outs and spin-downs.

Payack has served as an adjunct lecturer for the University of Massachusetts for some three years, and has spoken at the Federal Reserve Bank (NY), Hughes Electronics, The University of Texas (Arlington), and many other organizations and educational institutions. Payack is a frequent media commentator on technology, words, and language to such organizations as CNN, NPR, the BBC, Reuters, the New York Times, the Sunday Times (London), and the People’s Daily (Beijing).

Payack’s penultimate book, A Million Words and Counting, was published under Kensington’s Citadel imprint in New York in 2008; the quality paperback edition has just been released. (His latest book is an analysis of the healthcare crisis in the US.)

For more extensive background information, check out LinkedIn.

Q. So you are not a linguist?

A. I am most definitely not a linguist and have never claimed to be one.  Over the years my titles have included (in order):  Assistant Director of Admissions, Technical Writer, Engineer, Marketing Manager, Corporate Director,  v.p., C.M.O., SVP, C.E.O., founder, co-founder, principal and now ‘Chief Word Analyst’.

Q. What is a ‘Chief Word Analyst’?

A. The New York Times, in 2006, was the first to mention our PQI technology, in an article about The Power of Words, which used our technology to see if the NY real estate market was heading toward a collapse. In the article, Stephanie Rosenbloom described me as a ‘word analyst’. I thought that was an apt description and have used the phrase as my title ever since.

GLM’s motto is ‘Where Technology Intersects With the Word’ and that is precisely what we do: we apply statistical techniques, numerical analysis and the latest in computer technology to the analysis of the Internet, blogosphere, print and electronic media, and now so-called social media. The Global Language Monitor’s expertise is in applying these techniques to global English in its various manifestations.

Q. Linguists frequently spar with you in the media.

A. Linguistics is classified as a subfield of Anthropology.  There are many subdivisions within the field and subdivisions within the various categories.  So expertise in one of these areas is quite narrow.  It’s analogous to being an engineer:  chemical, industrial, electrical, computer, audio, and the like.  So when you hear from a linguist, it helps to understand their particular field of expertise.

For the most part, linguists are neither technologists nor media analysts, and as such they are but one constituency. Media analysts, technologists, and scholars in general not only encourage our work but also incorporate it into scores of peer-reviewed research papers, textbooks and so forth. The global media seek out our analysis in ever-increasing numbers.

Q. We read that in an interview you once reversed Barack Obama’s name?

A.  True.  We’ve also been cited for typos, Word-clock malfunctions, mathematical errors, and so forth.  All true.

One of the many wonders of the Internet is that every mistake you make will be remembered indefinitely (and magnified, if at all possible). And then there is the near-endless replication of hearsay, invective, or worse. I find it reassuring that anyone looking beyond the dozens of competing narratives swirling about one’s person has good old-fashioned ‘primary sources’ readily available at the click of a key.

Q. Why was there such controversy about the Million Word March?

A. Linguists believe that there is no way to count words, since the nature of what a word is, itself, is in dispute. Hence you cannot count what you cannot define. Moreover, even attempting to take a measure of the language is to be condemned.

Q. Don’t unabridged dictionaries have all or most of the words in the language, according to a rigid set of criteria? Can’t you just count them?

A.  Apparently not without great difficulty.  We, too, are mystified by this.

Q. Google and Harvard University recently launched the Google Books Ngram Viewer. They also calculated the number of words in the English language. How does that compare to the number that you obtained from the Global Language Monitor’s algorithm-based analysis?

A. Though GLM’s analysis was the subject of much controversy at the time, the recent Google/Harvard study puts the current number of words in the English language at 1,022,000.

Google Validates GLM’s No. of Words in English Prediction

GLM/Google vs OED and Webster’s 3rd

The above graphic is from AAAS/Science, as reported on NPR. At the time, the New York Times article on the historic threshold famously quoted several dissenting linguists as claiming that “even Google could not come up with” such a methodology. Unbeknownst to them, Google was doing precisely that.

According to GLM, the number of words in the English language now stands at 1,010,649.7. The difference between the two figures is about 11,350 words, or roughly 1.1% of the Google/Harvard total, a small gap for two separately derived estimates.

Google’s number, which is based on counting the words in the 15,000,000 English-language books it has scanned into the ‘Google Corpus,’ mirrors GLM’s analysis. GLM’s number is based on its algorithmic methodologies, an explanation of which is available from its site.

Q. The 1,000,000th word was ‘Web 2.0’; a number of lexicographers seemed to think this was not a word because it contains letters and a number and even a bit of punctuation. Is it a word?

A. It’s a lexical unit. Think about this for a moment: is O.K. a word? Or 24/7, or w00t, or 3-D? There is a long history of English words with numbers (or punctuation) intermixed. And it is a burgeoning trend; it’s called L33t Speak. Check the New York Times, where you will find a goodly number of headlines featuring Government 2.0 or Healthcare 2.0, and the like.

Q. What is the methodology?

A. The Global Language Monitor first established a base number of words in the language using the number of words in the generally accepted unabridged dictionaries (the O.E.D., Merriam-Webster’s, Macquarie’s, etc.), which contain the historic ‘core’ of the English language, including every word found in the historical codex of the language beginning with Beowulf, Chaucer, the Venerable Bede, on to the works of Shakespeare, the King James Bible, and the like.

The Global Language Monitor’s proprietary algorithm, the Predictive Quantities Indicator (PQI), tracks the frequency of words and phrases in the global print and electronic media, on the Internet, throughout the Blogosphere, and in social media, as well as accessing proprietary databases (Factiva, Lexis-Nexis, etc.).

GLM then assigned a number to the rate of creation of new words and the adoption and absorption of foreign vocabulary into the language. The result, though an estimate, has been found to be quite useful as a starting point of the discussion for lay persons, students, and scholars the world over.

Q. A million sounds like a lot of words?

A. The Global Language Monitor’s estimate of the number of words in the English language takes a relatively conservative approach. For example, the Introduction to Merriam-Webster’s 3rd International claims it was limited to the 450,000 words listed in that dictionary, because “the number of words available is always far in excess of and for a single volume dictionary many times the number that can possibly be included”. Many times the 450,000 included words results in a number far in excess of 1,000,000. In fact, if you included all the scientific terms, all the jargon, and all the species of life, you could claim tens of millions of words.

Q. So it is rather difficult to estimate the number of English Words.

A. Nearly impossible.  But, of course, you can make the same argument for anything a human being can measure: the number of stars in the galaxy, the number of galaxies in the universe, the number of people on the planet, the depth of the oceans, fish in the sea, moves possible on a chessboard, throughput of the latest supercomputer, amount of CO2 in the atmosphere (and hence predict Global Warming), even the number of planets in the Solar System (Take that, Pluto!).

Answers to questions like these have been settled, from the beginning of the scientific revolution and the Enlightenment, through a number of methodologies, including statistical analysis and rigidly defining the subjects of study. We see no reason to exclude language from such inquiry.

Q. Did you count variations of words such as run, runs and running as separate words?

A. GLM counts only headwords, so run, runs, and running are only counted once. We do not count the named numerals as separate words, e.g., two hundred twenty-four thousand one hundred ten … one hundred eleven … one hundred twelve. Doing so would result in an infinite number of words since the set of named numerals is infinite.
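The headword rule can be illustrated with a toy example. This is only a sketch: the lookup table below is hypothetical, a real system would need a full lexicon or lemmatizer, and GLM's own procedure is not described in this text.

```python
# Hypothetical lookup from inflected forms to headwords (illustration only).
HEADWORDS = {
    "run": "run",
    "runs": "run",
    "running": "run",
    "word": "word",
    "words": "word",
}

def count_headwords(tokens):
    """Count distinct headwords; unknown tokens count as their own headword."""
    return len({HEADWORDS.get(t.lower(), t.lower()) for t in tokens})

print(count_headwords(["run", "runs", "running"]))           # 1
print(count_headwords(["Run", "words", "word", "lexicon"]))  # 3
```

Treating unknown tokens as their own headword is a simplifying choice made here; it keeps the sketch short but would overcount inflected forms missing from the table.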

Q. OK, so what makes English special?

A. The English language is not any more special than any of the other 6,919 languages spoken on the planet. All languages are of great cultural value and are worthy of study and preservation. What is special about English, however, is the fact that it has acquired an immense number of words and is the first truly global language. Of course, Greek was certainly spoken throughout that part of the world conquered by Alexander, as was Latin in the Roman Empire and later throughout Medieval Europe. And French was certainly the language of diplomacy in the late nineteenth and early twentieth centuries. However, English is the first language to literally span the globe.

Q. How many people now speak English?

A. In 1960, there were 250 million English speakers in the world, mostly in former British colonies; the future of English as a major language was very much in question. Today, English is spoken by some 1.85 billion people as their first, second or business language.

Q. Have your years in high technology influenced your thinking?

A. When I began in technology, what would come to be known as the world wide web consisted of some 138 ‘endpoints’; today there are more than 8,000,000,000, more than one for every person on the planet.

My first computer system was approximately 80 feet long and weighed hundreds, if not thousands, of pounds. Today, you carry all that computational power – and more – in the 4G phone in your pocket, just as your coffee maker is undoubtedly more powerful than all the computer systems aboard Apollo XI.

It is in this type of environment that one rarely ponders why something cannot be done, but rather how to do something that has never been done before.

Q. What about newly coined words, or neologisms? What gives GLM the authority to add new words into the dictionary?

A.  In the English-speaking world there is no authority that judges the ‘worthiness’ of words to become an official part of the English Language, which is one reason why English has so many more words than many other languages.

Millionth Word Finalists Announced

English Language Millionth Word Finalists Announced, including:  alcopops, bangster, de-friend, n00b, quendy-trendy, slumdog, and wonderstar

English to Pass Millionth Word June 10 at 10:22 am GMT

Million Word March Now Stands at 999,824

Austin, Texas May 29, 2009 – The Global Language Monitor today announced the finalists for the Million Word March.  The English Language will cross the 1,000,000 word threshold on June 10, 2009 at 10:22 am Stratford-Upon-Avon time.
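The announced date can be sanity-checked against the rate GLM quotes elsewhere in this text (one new word every 98 minutes). The sketch below is an illustration only, not GLM's WordClock: it assumes a constant rate and takes the count of 999,824 from the headline above. The back-calculated timestamp is an inference, since the text gives no time of day for that count.

```python
from datetime import datetime, timedelta

MINUTES_PER_WORD = 98        # rate quoted by GLM
CURRENT_COUNT = 999_824      # count given in the May 29, 2009 headline
TARGET = 1_000_000

remaining = TARGET - CURRENT_COUNT
time_needed = timedelta(minutes=remaining * MINUTES_PER_WORD)
print(remaining, time_needed)    # 176 11 days, 23:28:00

# Work backward from the announced crossing time (GMT) to the implied
# moment at which the count would have stood at 999,824.
crossing = datetime(2009, 6, 10, 10, 22)
print(crossing - time_needed)    # 2009-05-29 10:54:00
```

Under these assumptions the implied starting point falls on May 29, 2009, the date of the announcement, so the quoted count, rate and crossing time are mutually consistent.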

“The Million Word milestone brings to notice the coming of age of English as the first, truly global Language”, said Paul JJ Payack, president and chief word analyst of the Global Language Monitor.  “There are three major trends involving the English language today: 1) An explosion in word creation; English words are being added to the language at the rate of some 14.7 words a day; 2) a geographic explosion where some 1.53 billion people now speak English around the globe as a primary, auxiliary, or business language; and 3) English has become, in fact, the first truly global language.”

Due to the global extent of the English language, the Millionth Word is as likely to appear from India, China, or East L.A. as it is to emerge from Stratford-upon-Avon (Shakespeare’s home town). The final words and phrases under consideration are listed below. These words represent each of the categories of Global English that GLM tracks. Since English appears to be adding a new word every 98 minutes, or about 14.7 words a day, the Global Language Monitor is selecting a representative sampling. You can follow the English Language WordClock counting down to the one millionth word at www.LanguageMonitor.com.

These are the words on the brink of entering the language as finalists for the One Millionth English Word:

Australia: Alcopops – Sugary-flavored mixed drinks very much en vogue.

Chinglish: Chengguan – Urban management officers, a cross between mayors, sheriffs, and city managers.

Economics:  1) Financial Tsunami – The global financial restructuring that seemingly swept out of nowhere, wiping out trillions of dollars of assets, in a matter of months.  2) Zombie Banks – Banks that would be dead if not for government intervention and cash infusion.

Entertainment: Jai Ho! – From the Hindi, “it is accomplished”; achieved English-language popularity through the multiple Academy Award winner “Slumdog Millionaire”.

Fashion: 1) Chiconomics – The ability to maintain one’s fashion sense (chicness) amidst the current financial crisis. 2) Recessionista – The fashion-conscious who use the global economic restructuring to their financial benefit. 3) Mobama – Relating to the fashion sense of the US First Lady, as in ‘that is quite mobamaish’.

Popular Culture:  Octomom (the media phenomenon of the mother of the octuplets).

Green Living: 1) Green washing – Re-branding an old product as environmentally friendly. 2) E-vampire – Appliances and machines on standby mode, which continually use electrical energy while they ‘sleep’. 3) Slow food – Food other than the fast-food variety, hopefully produced locally (locavores).

Hinglish:   Cuddies – Ladies’ underwear or panties.

Internet:  1) De-follow – No longer following the updates of someone on a social networking site.  2) De-friend – No longer following the updates of a friend on a social networking site; much harsher than de-following. 3) Web 2.0 – The next generation of web services.

Language: Toki Pona – The only language (constructed or natural) with a trademark.

Million Word March:   MillionWordWord — Default entry if no other word qualifies.

Music: Wonderstar – as in Susan Boyle, an overnight sensation, exceeding all reasonable expectations.

Poland:  Bangsters – A description of those responsible for ‘predatory’ lending practices, from a combination of the words banker and gangster.

Politically incorrect: 1) Slumdog – a formerly disparaging comment on those residing in the slums of India; 2) Seatmates of size – US airline euphemism for passengers who carry enough weight to require two seats.

Politics:  1) Carbon neutral — One of the many phrases relating to the effort to stem Climate Change.  2) Overseas Contingency Operations – The Obama re-branding of the Bush War on Terror.

Sports:  Phelpsian – The singular accomplishments of Michael Phelps at the Beijing Olympics.

Spirituality: Renewalist – Movements that encompass renewal of the spirit; also called ‘Spirit-filled’ movements.

Technology:  1) Cloud Computing – The ‘cloud’ has been technical jargon for the Internet for many years.  It is now passing into more general usage. 2) N00b — From the Gamer Community; a neophyte in playing a particular game; used as a disparaging term.  3) Sexting – Sending email (or text messages) with sexual content.

YouthSpeak:  Quendy-Trendy — British youth speak for hip or up-to-date.

Extra Credit:

French word with least chance of entering English Language:  le courriel – E-Mail.

Most recognized English-language word on the planet:  O.K.

Each word is being analyzed to determine which is attaining the greatest depth (number of citations) and breadth (geographic extent of word usage), as well as the number of appearances in the global print and electronic media, the Internet, the blogosphere, and social media (such as Twitter and YouTube). The word with the highest PQI score will be deemed the 1,000,000th English-language word. The Predictive Quantities Indicator (PQI) is used to track and analyze word usage.

Global Language Monitor has been tracking English word creation since 2003.  Once it identifies new words (or neologisms) it measures their extent and depth of usage with its PQI technology.

In Shakespeare’s day, there were only 2,000,000 speakers of English and fewer than 100,000 words. Shakespeare himself coined about 1,700 words. Thomas Jefferson invented about 200 words, and George W. Bush created a handful, the most prominent of which is misunderestimate. US President Barack Obama’s surname passed into wordhood last year with the rise of obamamania.

About The Global Language Monitor

Austin, Texas-based Global Language Monitor analyzes and catalogues the latest trends in word usage and word choices, and their impact on the various aspects of culture, with a particular emphasis upon Global English. For more information, email info@GlobalLanguageMonitor.com, visit www.LanguageMonitor.com, or call +1.925.367.7557.

A Million Words and Counting

If you are interested in learning more about the Million Word March, you can read about it in “A Million Words and Counting” by Paul JJ Payack. This book from Kensington’s Citadel imprint takes you on a whirlwind tour of the English language and its dramatic impact on the various aspects of culture, including politics, the economy, entertainment, commerce and technology. Now available as a quality paperback.

 
