Replacing the Turing Test
Written by Sue Gee   
Saturday, 07 February 2015

A plan is afoot to replace the Turing test as a measure of a computer's ability to think. The idea is for an annual or bi-annual Turing Championship consisting of three to five different challenging tasks.

A recent workshop at the 2015 AAAI Conference on Artificial Intelligence was chaired by Gary Marcus, a professor of psychology at New York University. His opinion, and one that we share (see Passing The Turing Test Brings It Into Disrepute), is that the Turing Test has reached its expiry date and has become

“an exercise in deception and evasion.” 

Referring to the incident that can be regarded as the final straw, in which a chatbot adopting the persona of a 13-year-old Ukrainian boy called Eugene Goostman was hailed as the first program to pass the Turing Test, Marcus wrote:

The considerable hype around the announcement—nearly every tech blog and newspaper reported on the story—ignored a more fundamental question: What, exactly, is Eugene Goostman, and what does “his” triumph over the Turing Test really mean for the future of A.I.?



Marcus went on to say: 

What Goostman’s victory really reveals ... [is] the ease with which we can fool others. 

This is a sentiment we also expressed at the time, speculating:

However, before we accept that this is a real breakthrough for AI we perhaps need to ask more questions about whether this is evidence that computers can learn to think or just that computers can learn tricks.

In an article, What Comes After the Turing Test?, Marcus points out that:

the real value of the Turing Test comes from the sense of competition it sparks amongst programmers and engineers  

which has motivated the new initiative for a multi-task competition.

 


 

After the recent workshop, two challenges are firm front runners for the Turing Championship. One is the language-based test proposed by Hector Levesque and built on the work of Terry Winograd that we first reported on last August, see A Better Turing Test - Winograd Schemas.

This requires participants to grasp the meaning of sentences that are easy for humans to understand through their knowledge of the world. One simple example is:

The trophy would not fit in the brown suitcase because it was too big. What was too big? 

This is an ambiguous question because "it" could refer either to the trophy or to the suitcase. The "right" answer is immediately obvious to a human, who will draw on knowledge about the relative sizes of suitcases and trophies. In this case a computer could probably pass the test, but in other, more subtle, cases a computer might be stumped:

The town councillors refused to give the angry demonstrators a permit because they feared violence. 
Who feared violence?
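
To see why this kind of test resists the tricks that chatbots like Goostman rely on, it helps to sketch how a Winograd schema might be represented and attacked programmatically. The short Python example below is purely illustrative: the WinogradSchema class, its field names and the naive_resolver heuristic are our own inventions, not part of any official test harness. It encodes the trophy/suitcase sentence together with a "small" variant, in which changing a single word flips the correct answer, and shows that a resolver relying only on word order does no better than chance.

# A minimal sketch of how a Winograd schema pair could be represented.
# The class, field names and the naive "resolver" are illustrative only;
# they are not taken from any official challenge materials.

from dataclasses import dataclass

@dataclass
class WinogradSchema:
    sentence: str          # sentence containing the ambiguous pronoun
    pronoun: str           # the pronoun to resolve
    candidates: tuple      # the two possible referents
    answer: str            # the referent a human would choose

schemas = [
    WinogradSchema(
        sentence="The trophy would not fit in the brown suitcase because it was too big.",
        pronoun="it",
        candidates=("the trophy", "the suitcase"),
        answer="the trophy",
    ),
    WinogradSchema(
        # Swapping "big" for "small" flips the correct answer - the crux of the test.
        sentence="The trophy would not fit in the brown suitcase because it was too small.",
        pronoun="it",
        candidates=("the trophy", "the suitcase"),
        answer="the suitcase",
    ),
]

def naive_resolver(schema: WinogradSchema) -> str:
    """Pick whichever candidate is mentioned closest before the pronoun.
    This purely syntactic heuristic ignores world knowledge entirely."""
    position = schema.sentence.find(" " + schema.pronoun + " ")
    return max(schema.candidates,
               key=lambda c: schema.sentence.lower().rfind(c.split()[-1], 0, position))

correct = sum(naive_resolver(s) == s.answer for s in schemas)
print(f"Naive resolver: {correct}/{len(schemas)} correct")

Because every schema comes in such a matched pair, a program that merely exploits word order or surface statistics can be expected to hover around 50%, while humans score close to 100% - which is exactly the gap the test is designed to expose.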

The second test is a variation on one Marcus himself proposed: 

Build a computer program that can watch any arbitrary TV program or YouTube video and answer questions about its content—“Why did Russia invade Crimea?” or “Why did Walter White consider taking a hit out on Jessie?”

As Marcus points out:

Chatterbots like Goostman can hold a short conversation about TV, but only by bluffing. (When asked what “Cheers” was about, it responded, “How should I know, I haven’t watched the show.”) But no existing program—not Watson, not Goostman, not Siri—can currently come close to doing what any bright, real teenager can do: watch an episode of “The Simpsons,” and tell us when to laugh.

It transpired that Fei-Fei Li, director of the Stanford AI Lab, was working on a similar idea using images. They have therefore joined forces to create an event in which a machine will face “journalist-type” questions about images, video, or audio.

Two others have emerged as likely candidates. One is an elaboration of the Watson question/answer format to produce machines that could answer elementary-school standardized-test questions, and perhaps eventually use that knowledge to tutor human students.

The last, dubbed the Ikea challenge, asks robots to co-operate with humans to build flatpack furniture. This involves interpreting written instructions, choosing the right piece, and holding it in just the right position for a human teammate to turn the screw. This at least is a useful skill that might encourage us to welcome machines into our homes.

 

 

 


Last Updated ( Saturday, 07 February 2015 )