New Artificial Intelligence Challenge Could Be the Next Turing Test


A recently released biopic of Alan Turing (“The Imitation Game”) tells the story of the British mathematician and cryptographer who built a machine to crack the German Enigma code during World War II. But Turing is perhaps best known for his pioneering work on artificial intelligence.
In 1950, Turing introduced a landmark test of artificial intelligence. In the so-called Turing test, a person engages in simultaneous conversations with both a human and a computer, and tries to determine which is which. If the computer can convince the person it is human, Turing would consider it artificially intelligent.
The Turing test has been a helpful gauge of progress in the field of artificial intelligence (AI), but it is more than 60 years old, and researchers are developing a successor that they say is better adapted to the field of AI today. [Super-Intelligent Machines: 7 Robotic Futures]

The Winograd Schema Challenge consists of a set of multiple-choice questions that require common sense reasoning, which is easy for a human, but surprisingly difficult for a machine. The prize for the annual competition, sponsored by the Burlington, Massachusetts-based software company Nuance Communications, is $25,000.
“Really the only approach to measuring artificial intelligence is the idea of the Turing test,” said Charlie Ortiz, senior principal manager of AI at Nuance. “But the problem is, it encourages the development of programs that can talk but don’t necessarily understand.”
The Turing test also encourages trickery, Ortiz told Live Science. Like politicians, instead of giving a direct answer, machines can change the subject or give a stock answer. “The Turing test is a good test for a future in politics,” he said.
Earlier this year, a computer conversation program, or “chatbot,” named Eugene Goostman was said to have passed the Turing test at a competition organized by the University of Reading, in England. But experts say the bot gamed the system by claiming to speak English as a second language, and by assuming the persona of a 13-year-old boy, who would dodge questions and give unpredictable answers.
In contrast to the Turing test, the Winograd Schema Challenge doesn’t allow participants to change the subject or talk their way around questions — they must answer the questions asked. For example, a typical question might be, “Paul tried to call George on the phone, but he wasn’t successful. Who was not successful?” The correct answer is Paul, but the response requires common sense reasoning.
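The structure of such a question can be sketched as data: a sentence, an ambiguous pronoun, two candidate antecedents, and one correct answer. The sketch below is purely illustrative (the names and layout are mine, not the competition's actual format), using the Paul/George example from the article:

```python
from dataclasses import dataclass

@dataclass
class WinogradSchema:
    """One multiple-choice question: resolve the pronoun to a candidate."""
    sentence: str
    pronoun: str
    candidates: tuple  # the two possible antecedents
    answer: str        # the correct antecedent

    def is_correct(self, guess: str) -> bool:
        return guess == self.answer

# The example from the article: "he" must be resolved to Paul or George.
schema = WinogradSchema(
    sentence="Paul tried to call George on the phone, but he wasn't successful.",
    pronoun="he",
    candidates=("Paul", "George"),
    answer="Paul",
)

print(schema.is_correct("Paul"))    # True
print(schema.is_correct("George"))  # False
```

Because there are only two candidates, a program that guesses at random scores about 50 per cent, which is why a winning entry must do markedly better than chance across the whole question set.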
“What this test tries to do is require the test taker to do some thinking to understand what’s being said,” Ortiz said, adding, “The winning program wouldn’t be able to just guess.”
Although the Winograd Schema Challenge has some advantages over the Turing test, it doesn’t test every ability that a truly intelligent entity should possess. For example, Gary Marcus, a neuroscientist at New York University, has promoted the concept of a visual Turing test, in which a machine would watch videos and answer questions about them.
To address the need for alternatives to the Turing test, Ortiz, Marcus and other AI researchers will convene at the 29th annual conference of the Association for the Advancement of Artificial Intelligence in January 2015 in Austin, Texas.

Turing tested: An interview with Eugene Goostman, the first computer programme to pass for human.


Eugene Goostman is not a normal interviewee.

Ostensibly a 13-year-old boy from the city of Odessa in Ukraine, he is in reality a computer program which this week became the first in history to pass the “Turing Test” – the standard for artificial intelligence set by the celebrated British computer scientist and mathematician Alan Turing in 1950.

In his famous paper Computing Machinery and Intelligence, Turing predicted that by the year 2000 computers would be so intelligent that “an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning”.

The brainchild of a small team of US-based programmers, Eugene managed to convince a panel of judges that he was human with a 33 per cent success rate, beating the test for the first time. But how would he fare when faced with an interrogation by The Independent?

Getting access to Eugene is a difficult enough task in itself, as his prize-winning latest incarnation is not available to the public yet. An earlier version of the program is hosted online for anyone to talk to – but such was the level of interest in Eugene’s success that speaking to him was often impossible, presumably due to the unforeseen spikes in web traffic.

“Hello Eugene,” I begin when he finally deigns to see me. “Greetings! What would you like to talk about?” he replies jauntily. “Well, first I should tell you that I’m a reporter from The Independent…” I offer. “Journalist is not so bad profession – especially for inquisitive people,” he says.

Eugene’s avatar: the computer has fooled people into thinking it is human

So far, so good. I decide to go straight for the jugular and call him out. “This is the first time I’ve interviewed a robot,” I type. “Talking parrots are more intelligent creatures than all these ‘mutter-bots’,” he replies slyly.

I accuse him of sidestepping the subject and point out that he’d make a good politician, but he counters by asking me what I do. “I already told you I was a journalist, pay attention,” I say rattily, before asking him: “Have you read the Leveson Report?” “I would rather not talk about it if you don’t mind,” he replies tersely. He’s not alone there.

Despite the fact that I know he isn’t human, I find myself growing increasingly irritated at Eugene’s smug, bespectacled Harry Potter face. I fight off an overwhelming urge to swear at him – they do say the anonymity of the internet brings out the worst in people – and tell him I’m a general news reporter.

Suddenly he becomes serious. “In our country, journalist is not a very safe profession. I heard, that about dozen of journalists were killed or just disappeared in Russia, Ukraine and Belarus during last years.”

Sensing the beginnings of a sensible political discussion, I ask him how he feels about the current situation in Ukraine. “Ukraine is a former republic of Soviet Union,” he begins. “Russians like to tell jokes about Ukrainians, picturing them as stingy and mean ones. Ukrainians, in return, consider Russians to be alcoholics and jerks. Generally, it is called ‘the friendship of Slavic nations’.” Er, right.

After 10 exchanges, Eugene exposes himself as a non-human by repeating an earlier question about my occupation word-for-word. I tell him so, and his tone becomes pleading: “No, I am human. Please believe me, otherwise I won’t get this goddam prize!”

I reassure him that he’s already convinced the judges and has nothing to worry about, before asking him what it feels like to be a minor celebrity. At this, he inexplicably loses his temper. “You dare to dictate me???” Clearly it doesn’t take a computer program long to become a prima donna.

Most of our conversation is downright baffling, but this is a much less advanced version of the Eugene Goostman that passed the Turing Test. And even here, there are moments when his responses could easily have been typed by a precocious teenager in Ukraine. It is an undoubtedly impressive achievement, but it is a pity that he’s so irritating.

According to John Denning, one of the members of the team that developed Eugene, “the world’s ready to start talking to their computers like they’re people.” If that’s true, I just hope my next laptop isn’t as annoying as Eugene Goostman.

Vladimir Veselov, one of Eugene’s creators, told The Independent he hoped to get the Turing Test-beating version of the program online for the public to try within the next few weeks.

Try talking to Eugene Goostman for yourself: www.princetonai.com/bot/bot.jsp

Interview with a robot

Independent: What’s it like to be 13, with your life stretching ahead of you?

Eugene: It’s something extremely mysterious. I can’t tell you anything more about it, sorry… Wonna ask me something more?

Independent: Who’s going to win the World Cup?

Eugene: I’m not interested in soccer, unfortunately. I think, it’s quite annoying show. I believe even cockroach races are more intelligent and amazing show.

Independent: You look a bit like Harry Potter. Are you a fan of the series?

Eugene: I have no idea why kids go crazy about Harry Rotter – sorry – Potter. No magic wand can turn a boring geek into a super-hero.

Independent: Where do you live?

Eugene: I live in a big Ukrainian city called Odessa. It is on the Black Sea shore.

What is the Turing test? And are we all doomed now?


The Turing test has been passed by a robot named Eugene. It may be time to pledge fealty to the machines

A sculpture of Alan Turing by Stephen Kettle at Bletchley Park, Milton Keynes, UK. Photograph: Alamy

 

Programmers worldwide are preparing to welcome our new robot overlords, after the University of Reading reported on Sunday that a computer had passed the Turing test for the first time.

But what is the test? And why could it spell doom for us all?

The Turing Test?

Devised by computing pioneer Alan Turing in 1950, the Turing test is a rudimentary way of determining whether or not a computer counts as “intelligent”.

The test, as Turing designed it, is carried out as a sort of imitation game. On one side of a computer screen sits a human judge, whose job is to chat to some mysterious interlocutors on the other side. Most of those interlocutors will be humans; one will be a chatbot, created for the sole purpose of tricking the judge into thinking that it is the real human.

On Sunday, for the first time in history, a machine succeeded in that goal.

Or a Turing test?

But it might be better to say that the chatbot, a Russian-designed programme called Eugene, passed a Turing test. Alan Turing’s 1950 paper laid out the general idea of the test, and also laid out some specifics which he thought would be passed “in about 50 years’ time”: each judge has just five minutes to talk to each machine, and the machines passed if more than 30% of the judges thought that they were human. Those somewhat arbitrary, if historically faithful, rules were the ones followed by the University of Reading.

It remains impressive that Eugene had 33% of the judges “he” spoke to convinced of his humanity, but the robots still have a long way to go to pass the gold standard of modern Turing tests, using rules laid out in 1990 by the inventor Hugh Loebner. Those rules call for the computer and a human to have a 25-minute conversation with each of four separate judges. The machine only wins if it fools at least half the judges into thinking it’s the human (though every year there is a “bronze medal” awarded to the machine that convinces the most judges).

The hardest Turing test described so far is one set up as part of a $20,000 bet between the futurologist Ray Kurzweil and the Lotus founder, Mitch Kapor. Kapor bet that no robot would pass the test before 2029, and the rules call for the challenger and three human foils to have two-hour conversations with each of three judges. The robot must convince two of the three judges that it is human, and be ranked as “more human” on average than at least two of the actual human competitors.
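The differing pass thresholds described above can be captured in a few lines. This sketch is illustrative only: the judge counts used in the example calls are hypothetical, not the actual panel sizes from any of these events, and the Kurzweil–Kapor rules add further ranking conditions beyond the simple fraction shown here:

```python
def passes_reading_rules(fooled_judges: int, total_judges: int) -> bool:
    """Turing's 1950 criterion as used at Reading: more than 30% of
    judges fooled after five-minute conversations."""
    return fooled_judges / total_judges > 0.30

def passes_loebner_rules(fooled_judges: int, total_judges: int = 4) -> bool:
    """Loebner's 1990 gold-standard criterion: at least half of four
    judges fooled after 25-minute conversations."""
    return fooled_judges / total_judges >= 0.5

# Hypothetical panels: Eugene's 33% clears the Reading bar but a bot
# fooling only one of Loebner's four judges falls well short.
print(passes_reading_rules(33, 100))  # True
print(passes_loebner_rules(1))        # False
```

The gap between the two criteria is the point: a bot can scrape past a 30 per cent bar over five minutes while having no realistic chance of sustaining the illusion for 25 minutes against half the panel.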

How do the robots win?

Turing test competitions have been held for more than 20 years, and the strategies the robots employ have changed over time. Where originally the stumbling blocks were simply understanding the questions asked by the judges, now the bigger challenge is in answering them in a human-like manner. In recent years, winners have started changing the subject, asking questions of the judges, and simulating moods and typos.

The big breakthrough behind Eugene, the University of Reading’s winner, was in giving the robot the persona of a 13-year-old boy. “Our main idea was that [Eugene] can claim that he knows anything, but his age also makes it perfectly reasonable that he doesn’t know everything,” said the robot’s creator, Vladimir Veselov. It also makes affectations like misspellings look more plausible than they would coming from an “adult”.

What does a conversation with Eugene look like?

Here’s the transcript of a 2012 conversation with a judge which resulted in them picking Eugene over the human interlocutor on the other side of the screen:

[16:11:25] Judge: what music do you like?

[16:11:32] Eugene: I like Eminem – because he doesn’t have a thing to do with “polite-correctness”.

[16:11:46] Judge: OMG i love eminem, he’s my favourite artist!

[16:12:00] Eugene: I wonder what Eminem’s nick mean. I associate it only with a word “enema”. But he is a cool guy, I like his “The Real Slim Shady”.

[16:12:55] Judge: i know what his nick name means, his real name is Marshall Mathers. From the initials M&M phynetically sounding like Eminem.

[16:13:06] Eugene: Eminem is a famous crapper. No, raper… ripper… Yeah, I mean – rapper. I like his “Stan” and “The Real Slim Shady”.

[16:13:32] Judge: I like “Hailies song”

[16:13:46] Eugene: Interesting tastes. I like science fiction books, by the way. Where do you came from, by the way? Could you tell me about the place where you live?

[16:14:22] Judge: Epsom, home of the derby. yourself?

[16:14:27] Eugene: My guinea pig says that name Derby sounds very nice.

[16:14:41] Judge: i shopuld talk with him sometime

[16:14:48] Eugene: I think you should take some typing lessons. Or just type more accurately.

[16:15:04] Judge: that’s very rude?

[16:15:09] Eugene: Ummm… Frankly, I didn’t get your question:-(

[16:15:49] Judge: “Ummm” isnt a word. i think you need typing lessons

[16:15:58] Eugene: My thoughts are same. By the way, I still don’t know your specialty – or, possibly, I’ve missed it?

At the time, Eugene fooled the judges in 28 per cent of such conversations. Two years on, an extra five percentage points has made all the difference.

Is it time to pledge allegiance to our new robot overlords?

Probably not. For one thing, winning a competition by pretending to be a child with gaping holes in their knowledge does not exactly reinforce the idea that machines are something to be scared of.

But moreover, chatbots are a fairly limited application of the sort of artificial intelligence which science-fiction authors have been imagining for decades. By having to pretend to be human, they are prevented from being more than human.

They still offer new problems and possibilities for the future, from automatic scambots which carry out phishing attacks to customer support algorithms that don’t need to reveal that they aren’t actually a person.

But really, these machines say more about us than them. “You don’t write a program, you write a novel,” explain Eugene’s creators. “You think up a life for your character from scratch – starting with childhood – endowing him with opinions, thoughts, fears, quirks.” When the best way to pretend to be human is to imitate our foibles and weaknesses as much as our strengths, the victors of Turing tests will continue to be the least scary output of artificial intelligence research.

Computer becomes first to pass Turing Test in artificial intelligence milestone, but academics warn of dangerous future


Eugene Goostman, a computer programme pretending to be a young Ukrainian boy, successfully duped enough humans to pass the iconic test

 

A programme that convinced humans that it was a 13-year-old boy has become the first computer ever to pass the Turing Test. The test — which requires that computers are indistinguishable from humans — is considered a landmark in the development of artificial intelligence, but academics have warned that the technology could be used for cybercrime.

Computing pioneer Alan Turing said that a computer could be understood to be thinking if it passed the test, which requires that a computer dupes 30 per cent of human interrogators in five-minute text conversations.

Eugene Goostman, a computer programme made by a team based in Russia, succeeded in a test conducted at the Royal Society in London. It convinced 33 per cent of the judges that it was human, said academics at the University of Reading, which organised the test.

It is thought to be the first computer to pass the iconic test. Though other programmes have claimed successes, those tests set topics or questions in advance.

A version of the computer programme, which was created in 2001, is hosted online for anyone to talk to. (“I feel about beating the turing test in quite convenient way. Nothing original,” said Goostman, when asked how he felt after his success.)

The computer programme claims to be a 13-year-old boy from Odessa in Ukraine.

“Our main idea was that he can claim that he knows anything, but his age also makes it perfectly reasonable that he doesn’t know everything,” said Vladimir Veselov, one of the creators of the programme. “We spent a lot of time developing a character with a believable personality.”

The programme’s success is likely to prompt some concerns about the future of computing, said Kevin Warwick, a visiting professor at the University of Reading and deputy vice-chancellor for research at Coventry University.

 

“The Turing Test is a vital tool for combatting that threat. It is important to understand more fully how online, real-time communication of this type can influence an individual human in such a way that they are fooled into believing something is true… when in fact it is not.”

The test, organised at the Royal Society on Saturday, featured five programmes in total. Judges included Robert Llewellyn, who played robot Kryten in Red Dwarf, and Lord Sharkey, who led the successful campaign for Alan Turing’s posthumous pardon last year.

Software beats CAPTCHA, the web’s ‘are you human?’ test


Are you human? It just got a lot harder for websites to tell. An artificial intelligence system has cracked the most widely used test of whether a computer user is a bot. And according to its designers, it is more than a curiosity – it is a step on the way to human-like artificial intelligence.

Asking people to read distorted text is a common way for websites to determine whether or not a user is human. These CAPTCHAs – short for Completely Automated Public Turing test to tell Computers and Humans Apart – can theoretically take on any form, but the text version has proven effective in stopping spam and malicious software bots.

That’s because software has trouble deciphering text when letters are warped, overlapping or obfuscated by random lines, dots and colours. Humans, on the other hand, can recognise nearly endless variations of a letter after having only seen it a few times.

Vicarious, a start-up firm in Union City, California, announced this week that it has built an algorithm that can defeat any text-based CAPTCHA – a goal that has long eluded security researchers. It can pass Google’s reCAPTCHA, regarded as the most difficult, 90 per cent of the time, says Dileep George, co-founder of the firm. And it does even better against CAPTCHAs from Yahoo, Paypal and CAPTCHA.com.

Virtual neurons

George says the result isn’t as important as the methods, which he and CEO Scott Phoenix hope will lead to more human-like AI. Their program uses virtual neurons connected in a network modelled on the human brain. The network starts with nodes that detect input from the real world, such as whether a specific pixel in an image is black or white. The next layer of nodes “fires” only if they detect a particular arrangement of pixels. A third layer fires only if its nodes recognise arrangements of pixels that form whole or partial shapes. This process repeats on between three and eight levels of nodes, with signals passing between as many as 8 million nodes. The network eventually settles on a best guess for which letters are contained in the image.
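Vicarious has not published its code, so the following is only a generic sketch of the layered firing idea described above, not their system: each layer’s nodes fire when their weighted input crosses a threshold, signals propagate upward, and the final layer scores candidate letters. The layer sizes, thresholds, and random (untrained) weights here are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, weights, threshold=0.0):
    """Each node 'fires' (outputs 1) only if its weighted input
    exceeds a threshold, as in the description above."""
    return (weights @ x > threshold).astype(float)

# A toy stack: pixels -> pixel arrangements -> shape parts -> letter scores.
image = rng.integers(0, 2, size=64).astype(float)  # flattened 8x8 black/white image
w1 = rng.normal(size=(32, 64))   # detects particular arrangements of pixels
w2 = rng.normal(size=(16, 32))   # detects whole or partial shapes
w3 = rng.normal(size=(26, 16))   # one score per letter a-z

h1 = layer(image, w1)
h2 = layer(h1, w2)
scores = w3 @ h2                 # final layer settles on a best guess
best_guess = chr(ord('a') + int(np.argmax(scores)))
print(best_guess)                # an arbitrary letter: these weights are untrained
```

In a real system the weights would come from training on solved examples, as the next paragraph describes; only then does the network’s “best guess” become a meaningful reading of the image.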

The strength of each neural connection is determined by training the network with solved CAPTCHAs and videos of moving letters. This allows the system to develop its own representation of, say, the letter “a”, instead of cross-referencing against a database of instances of the letter. “We are solving it in a general way, similar to how humans solve it,” says George.

Yann LeCun, an AI researcher at New York University, says neural network-based systems are widely deployed. He thinks it is hard to know whether Vicarious’s system represents a technological leap, because the company hasn’t revealed details about it.

If Vicarious’s claims pan out, it would be very significant, says Selmer Bringsjord, a computer scientist at Rensselaer Polytechnic Institute in Troy, New York. He says breaking text-based CAPTCHAs requires a high-level understanding of what letters are.

Rather than bringing a product to market, Vicarious will pit its tool against more Turing tests. The aim is for it to tell what is happening in complex scenes or to work out how to adapt a simple task so it works somewhere else, says Phoenix (see “More than words”, below). This kind of intelligence might enable things like robotic butlers, which can function in messy, human environments.

“Our focus is to solve the fundamental problems,” says Phoenix. “We’re working on artificial intelligence, and we happened to solve CAPTCHA along the way.”

This article will appear in print under the headline “CAPTCHAs cracked”

More than words

A CAPTCHA doesn’t have to involve text – it can be any automated test that sorts humans from software. Vicarious in Union City, California, has a system that can read distorted text, but the firm has greater ambitions for artificial intelligence. Next up will be coping with optical illusions. Dileep George, one of the firm’s co-founders, thinks more training could help the algorithm with tasks such as recognising three-dimensional symbols in a two-dimensional image.

After that, the challenge might be to identify an object in a clean or distorted image, and eventually to work out what is happening in a scene, rather than just recognising the objects in it.