Did Watson Really Beat Humans on Jeopardy? We Think Not!

An analysis of the ‘natural language processing’ claim.

AUSTIN, TEXAS.  March 1, 2011 - An analysis by the Global Language Monitor has found that the win by Watson, the IBM computer specifically designed to compete on the Jeopardy television show, was not the victory of a machine tackling ‘natural language processing’ that many had been led to believe, but rather “a massive marketing coup,” as described in the Boston Globe.

When Watson bested two liveware, carbon-based lifeforms named Ken Jennings and Brad Rutter on the Jeopardy television show a few days ago, it was widely viewed as a great advance in ‘natural language processing’.  Natural language processing is concerned with the interactions between computers and human (natural) languages.

As Ben Zimmer in the New York Times put it, Watson “came through with flying colors.”  And he was certainly not alone in his judgment.  There were many comparisons to the John Henry man vs. machine tale where the legendary ‘steel-driving’ railroad man challenges a steam hammer, and wins, only to collapse and die shortly thereafter.  It appeared as if the entire media went a little bit gaga (no pun intended) with stories on this great milestone in cyber (and possibly human) history.

Is this analysis true?  As Stephen Colbert might put it, there is some ‘truthiness’ in the statement.  Watson did, in fact, best his human competitors, but if we are to be “speaking truthiness to power,” we should ensure that we fully understand the nature of the competition.

“Comments like the above missed the mark for a very simple reason,” said Paul JJ Payack, President and Chief Word Analyst at GLM.  “Watson did not prove adept at processing language in a manner similar to humans.  In fact, computers have dramatically failed at this task for four decades now.  What Watson has accomplished is a far cry from ‘natural language processing’.

“Rather, what Watson achieved was a very close approximation of appearing to understand the English language.  This, in itself, is an accomplishment to be acknowledged.  (But as the old joke about a talking dog goes, it’s not that it was done well but rather that it was done at all.)  After all, Watson was designed from the ground up as a ‘question-answering machine,’ as IBM readily admits.  However, even this is not quite accurate, because Watson was specifically built as a ‘Jeopardy game-show answering machine’.”

One problem is that few commentators understand what it means to actually program a computer at all, let alone the ‘machine coding’ which might be construed as the most basic unit of computer ‘thought’.  Even those who are familiar with today’s coding techniques typically know HTML, some variation of C++, or their way around Linux.  All of these are as distant from machine code as they are from understanding the mathematics of the Higgs boson and why it has been described as the ‘God particle’.  Unfortunately, there will be no friendly, Watson-like avatar to announce from the CERN lab that the God particle has been identified, if and when that ever happens.  We might also find out about that discovery when (as has been estimated by the CERN staff) the acceptable risk, the 1 out of 50,000,000 chance, hits and the whole enterprise results in the destruction of the entire planet through the creation of an, admittedly small, black hole.

The field of artificial intelligence has for decades been handicapped by the idea of emulating humans: their thinking, their speaking, their chess-playing ability or their ability to perambulate.  To make the advances we have seen recently, computer scientists had to literally re-think (and in many cases reverse) their earlier positions.

The key, as found in recent research, is not to emulate humans; rather the key is to define ‘machine logic’, or how a machine would do it, given its capabilities and limitations.  In other words, do not attempt to see as the human eye sees, but attempt to see as a machine would see.  Rather than teach a machine everything there is to know about how a human gets around, the task becomes to teach a machine the few basic rules it needs to move forward, back up and work around obstacles.  This is very different from a baby learning how to crawl, which involves cognition, motor skills, sight, volition, and the sense of touch.
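To make the idea of ‘a few basic rules’ concrete, here is a minimal sketch in Python.  It is purely illustrative and does not come from any robotics system: the grid, the rule order, and every name in it (GRID, is_free, step, run) are assumptions made up for this example.

```python
# Hypothetical grid: '.' is free floor, '#' is an obstacle.
# The machine tries to reach the east (right-hand) edge.
GRID = [
    "....",
    ".#..",
    "....",
]

def is_free(row, col):
    """True if the cell is inside the grid and not an obstacle."""
    return 0 <= row < len(GRID) and 0 <= col < len(GRID[0]) and GRID[row][col] == "."

def step(row, col):
    """Apply three rules in order: move forward, work around, or back up."""
    if is_free(row, col + 1):      # Rule 1: move forward (east)
        return row, col + 1, "forward"
    if is_free(row + 1, col):      # Rule 2: work around the obstacle (south)
        return row + 1, col, "sidestep"
    if is_free(row, col - 1):      # Rule 3: back up (west)
        return row, col - 1, "back up"
    return row, col, "stuck"

def run(row, col, max_steps=10):
    """Repeat the rules until the east edge is reached or steps run out."""
    moves = []
    for _ in range(max_steps):
        if col == len(GRID[0]) - 1:
            break
        row, col, move = step(row, col)
        moves.append(move)
        if move == "stuck":
            break
    return (row, col), moves

position, moves = run(1, 0)
```

Nothing here resembles how a baby learns to crawl: the machine has no concept of what an obstacle is, only a fixed order of rules to try.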

In the same way, most would construe natural language processing as the ability to understand basic sentences, concepts or instructions in a straightforward manner.  Is this what Watson accomplished?  Consider the following:

Here’s what Watson needed to handle the ‘natural language’ of Jeopardy.

  • 90 IBM Power 750 servers
  • Each of the 90 IBM Power 750 servers is equipped with eight-core processors
  • A total of 2,880 processor cores
  • 1 network-attached storage (NAS) cluster
  • 21.6TB of data
  • 15 full-time technical professionals, as well as any number of advisors and consultants
  • 5 years of development time
  • ‘1,000s’ of computer algorithms running simultaneously
  • 1 overlying algorithm to review the results of all the others
  • 1 powered robotic finger
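The arrangement of many algorithms running simultaneously, with one overlying algorithm reviewing their results, can be sketched in a few lines of Python.  This is a toy illustration under stated assumptions, not IBM’s method: the three scorer functions, their canned confidence values, and the simple sum used to combine them are all hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the many independent algorithms. Each takes a
# clue and returns (candidate_answer, confidence) pairs. Values are made up.
def keyword_scorer(clue):
    return [("answer A", 0.4), ("answer B", 0.3)]

def category_scorer(clue):
    return [("answer A", 0.5)]

def date_scorer(clue):
    return [("answer B", 0.1), ("answer A", 0.2)]

def rank_candidates(clue, scorers):
    """Run every scorer concurrently, then merge their output in one step."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda scorer: scorer(clue), scorers))

    # The single 'overlying' step: add up the confidence per candidate.
    totals = defaultdict(float)
    for candidates in results:
        for answer, confidence in candidates:
            totals[answer] += confidence

    # Highest combined confidence first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

ranking = rank_candidates(
    "an example clue",
    [keyword_scorer, category_scorer, date_scorer],
)
best_answer, best_score = ranking[0]
```

The point of the sketch is that nothing in it ‘understands’ the clue: each scorer produces numbers, and the final step simply picks the candidate with the largest total.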

Incidentally, the effort required a minimum of $100,000,000 in funding for personnel and some $25,000,000 in equipment, as well as all the costs associated with cooling, administration, transportation, and the like.

All of this reminds us of Garry Kasparov losing the famous chess match to IBM’s Deep Blue back in 1997.  IBM was allowed to modify its program between games.  In effect, this let IBM programmers compensate for any Deep Blue weaknesses Kasparov exposed during the games.  How, in any way, could this be considered a level playing field?  Once this was discovered, Kasparov requested a rematch, but IBM had already dismantled Deep Blue.

As for those comparisons with the legendary ‘steel-driving man’, we have one piece of advice:  John Henry, call your lawyer.

Note:  Each year GLM releases the Top High Tech Words Everyone Uses But Nobody Quite Understands.  This year’s edition will be released in conjunction with SXSWi on March 13, 2011.
