The Truth About Sentient AI: Could Machines Ever Really Think or Feel?


“We’re talking about more than just code; we’re talking about the ability of a machine to think and to feel, along with having morality and spirituality,” a scientist tells us.

Amid the surge of interest in large language model bots like ChatGPT, an Oxford philosopher recently claimed that artificial intelligence has shown traces of sentience. Machine sentience is usually treated as an all-or-nothing milestone, but Nick Bostrom, Ph.D., argues that we should look at it as more of a sliding scale. “If you admit that it’s not an all-or-nothing thing … some of these [AI] assistants might plausibly be candidates for having some degree of sentience,” he told The New York Times.

To make sense of Bostrom’s claim, we need to understand what sentience is and how it differs from consciousness within the confines of AI. The two phenomena are closely related and were debated in philosophy long before artificial intelligence entered the picture, so it’s no accident that sentience and consciousness are often conflated.

Plain and simple, all sentient beings are conscious beings, but not all conscious beings are sentient. But what does that actually mean?

Consciousness

Consciousness is your own awareness that you exist. It’s what makes you a thinking, sentient being—separating you from bacteria, archaea, protists, fungi, plants, and certain animals. As an example, consciousness allows your brain to make sense of things in your environment—think of it as how we learn by doing. American psychologist William James explains consciousness as a continuously moving, shifting, and unbroken stream—hence the term “stream of consciousness.”

Sentience

Star Trek: The Next Generation looks at sentience as consciousness, self-awareness, and intelligence—and that was actually pretty spot on. Sentience is the innate human ability to experience feelings and sensations without association or interpretation. “We’re talking about more than just code; we’re talking about the ability of a machine to think and to feel, along with having morality and spirituality,” Ishaani Priyadarshini, a Cybersecurity Ph.D. candidate from the University of Delaware, tells Popular Mechanics.

💡AI is very clever and able to mimic sentience, but it has never actually become sentient itself.

Philosophical Difficulties

The very idea of consciousness has been heavily contested in philosophy for decades. The 17th-century philosopher René Descartes famously said, “I think therefore I am.” A simple statement on the surface, but it was the result of his search for a statement that couldn’t be doubted. Think about it: he couldn’t doubt his own existence, because he was the one doing the doubting in the first place.

Multiple theories address the biological basis of consciousness, but there’s still little agreement on which should be taken as gospel. The two main schools of thought differ on whether consciousness is a result of neurons firing in our brains or whether it exists completely independently of us. Meanwhile, much of the work that’s been done to identify consciousness in AI systems merely looks to see if they can think and perceive the same way we do—with the Turing Test being the unofficial industry standard.

While we have reason to believe AI can exhibit conscious behaviors, it doesn’t experience consciousness—or sentience, for that matter—in the same way that we do. Priyadarshini says AI involves a lot of mimicry and data-driven decision making, meaning it could theoretically be trained to display leadership skills; it could feign the business acumen needed to work through difficult business decisions, for example. AI’s current fake-it-till-you-make-it strategy makes it incredibly difficult to determine whether it’s truly conscious or sentient.

Can We Test For Sentience or Consciousness?

Many look at the Turing Test as the first standardized evaluation for discovering consciousness and sentience in computers. While it has been highly influential, it’s also been widely criticized.

In 1950, Alan Turing created the Turing Test—initially known as the Imitation Game—in an effort to discover if computing “machines” could exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. Human evaluators would engage in blind, text-based conversations with a human and a computer. The computer passes the test if its conversational ability leaves the evaluator unable to reliably distinguish it from the human participant; passing has been taken as a sign of humanlike intelligence and, by some, of sentience.

The Turing Test has returned to the spotlight with AI models like ChatGPT that are tailor-made to replicate human speech. We’ve seen conflicting claims about whether ChatGPT has actually passed the Turing Test, but its abilities are apparent; for perspective, the famed AI model has passed the bar exam, the SAT, and even select Chartered Financial Analyst (CFA) exams. Still, many experts believe we need an updated test to evaluate this latest AI tech—and that we may be looking at AI the wrong way entirely.

Alternatives to the Turing Test

Many experts have stated that it’s time for us to create a new Turing Test that provides a more realistic measure of AI’s capabilities. For instance, Mustafa Suleyman recently published his book, The Coming Wave: Technology, Power, and The Twenty-First Century’s Greatest Dilemma, which not only proposes a new benchmark but also argues that our understanding of AI needs to change. The book describes a misplaced narrative about AI’s ability to match or surpass the intelligence of a human being—sometimes referred to as artificial general intelligence.

Rather, Suleyman believes in what he calls artificial capable intelligence (ACI), which refers to programs that can complete tasks with little human interaction. His next-generation Turing Test asks an AI to turn $100,000 of seed money into $1 million by drawing up an e-commerce game plan: a blueprint for a product and a strategy for selling it through marketplaces like Alibaba, Amazon, or Walmart. AI systems are currently unable to pass this theoretical test, but that hasn’t stopped wannabe entrepreneurs from asking ChatGPT to dream up the next great business idea. Regardless, sentience remains a moving target.

Suleyman writes in his book that he doesn’t really care about what AI can say. He cares about what it can do. And we think that really says it all.

What Happens If AI Becomes Sentient?

We often see a fair amount of doom and gloom associated with AI systems becoming sentient and reaching the point of singularity, defined as the moment machine intelligence equals or surpasses that of humans. Machines have been beating humans at narrow tasks since at least 1997, when IBM’s Deep Blue supercomputer defeated Garry Kasparov in a chess match—though winning at chess is a long way from sentience.

In reality, our biggest challenge with AI reaching singularity is eliminating bias while programming these systems. I always revisit Priyadarshini’s 21st-century version of the Trolley Problem: a hypothetical scenario in which a driverless car carrying a passenger approaches an intersection. Five pedestrians jump into the road, leaving little time to react; swerving out of the way would save the pedestrians, but the resulting crash would kill the passenger. The Trolley Problem is a moral dilemma that weighs a good outcome against the sacrifices needed to achieve it.

AI is currently nothing more than decision-making based on rules and parameters, so what happens when it has to make ethical decisions? We don’t know. Beyond the confines of the Trolley Problem, Bostrom notes that, given room to learn and grow, these large language models have a chance of developing consciousness, but the resulting capabilities are still unknown. We don’t actually know what sentient AI would be capable of doing because we’re not superintelligent ourselves.

Never mind the Elon—the forecast isn’t that spooky for AI in business


Don’t fear the machines—AI tech isn’t nearly ready to think for itself.

 Space Imaging’s IKONOS satellite detected this jack-o-lantern corn maze in Bell County, Kentucky. Satellite images are being paired with other data to find a totally different sort of pattern—predicting crop yields and failures.

Despite Elon Musk’s warnings this summer, there’s not a whole lot of reason to lose any sleep worrying about Skynet and the Terminator. Artificial Intelligence (AI) is far from becoming a maleficent, all-knowing force. The only “Apocalypse” on the horizon right now is an overreliance by humans on machine learning and expert systems, as demonstrated by the deaths of Tesla owners who took their hands off the wheel.

Examples of what currently pass for “Artificial Intelligence”—technologies such as expert systems and machine learning—are excellent for creating software that can help in contexts that involve pattern recognition, automated decision-making, and human-to-machine conversations. Both types have been around for decades. And both are only as good as the source information they are based on. For that reason, it’s unlikely that AI will replace human beings’ judgment on important tasks requiring decisions more complex than “yes or no” any time soon.

Expert systems, also known as rule-based or knowledge-based systems, are programs in which computers follow explicit rules written down by human experts. The computers can then apply the same rules, but much faster, 24×7, to reach the same conclusions as the human experts. Imagine asking an oncologist how she diagnoses cancer and then programming medical software to follow those same steps. For a particular diagnosis, an oncologist can review which of those rules were activated to validate that the expert system is working correctly.

However, it takes a lot of time and specialized knowledge to create and maintain those rules, and extremely complex rule systems can be difficult to validate. Needless to say, expert systems can’t function beyond their rules.
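
To make this concrete, here is a minimal sketch of what an expert-system-style rule engine can look like in code. The rules, thresholds, and field names below are invented for illustration (they are not a real diagnostic protocol); the point is that every conclusion traces back to an explicit, human-written rule that a reviewer can inspect.

```python
# Minimal rule-based (expert-system-style) sketch. Rules are checked in priority
# order; the first one that matches supplies the conclusion, and we report which
# rule fired so a human expert can audit the reasoning.
RULES = [
    ("large lesion with irregular border",
     lambda case: case["lesion_size_mm"] > 20 and case["irregular_border"],
     "refer for biopsy"),
    ("large lesion",
     lambda case: case["lesion_size_mm"] > 20,
     "schedule follow-up imaging"),
    ("small lesion",
     lambda case: case["lesion_size_mm"] <= 20,
     "routine monitoring"),
]

def diagnose(case):
    """Return (conclusion, rule_that_fired) for the first matching rule."""
    for name, condition, conclusion in RULES:
        if condition(case):
            return conclusion, name
    return "no rule applies", None   # the system cannot reason beyond its rules

print(diagnose({"lesion_size_mm": 27, "irregular_border": True}))
# -> ('refer for biopsy', 'large lesion with irregular border')
```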

One-trick pony

Machine learning allows computers to come to a decision—but without being explicitly programmed. Instead, they are shown hundreds or thousands of sample data sets and told how they should be categorized, such as “cancer | no cancer,” or “stage 1 | stage 2 | stage 3 cancer.”

Sophisticated algorithms “train” on those data sets and “learn” how to make correct diagnoses. Machine learning can train on data sets where even a human expert can’t verbalize how the decision was made. Thanks to the ever-increasing quantity and quality of data being collected by organizations of all types, machine learning in particular has advanced AI technologies into an ever-expanding set of applications that will transform industries—if used properly and wisely.

There are some inherent weaknesses to machine learning, however. For example, you can’t reverse-engineer the algorithm. You can’t ask it how a particular diagnosis was made. And you also can’t ask machine learning about something it didn’t train on.

For instance, a classic example of machine learning is to show it pictures of pets and have it indicate “cat | dog | both | neither.” Once you’ve done that, you can’t ask the resulting machine learning system to decide if an image contains a poodle or a cow—it can’t adapt to the new question without retraining or the addition of one more layer of machine learning.
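
Here is a toy sketch of that limitation, using scikit-learn with random numbers standing in for image features (purely illustrative data, not a real vision model): once trained on the four labels, the classifier can only ever answer in terms of those labels.

```python
# Toy classification sketch: synthetic "image features" and pet labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 16))                 # stand-ins for image feature vectors
y_train = rng.choice(["cat", "dog", "both", "neither"], size=200)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

new_image = rng.normal(size=(1, 16))
print(clf.predict(new_image))   # always one of: cat / dog / both / neither
print(clf.classes_)             # "poodle" and "cow" simply don't exist for this model
```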

Viewed as a type of automation, AI techniques can greatly add to business productivity. In some problem areas, AI is doing great, and that’s particularly true when the decision to be made is fairly straightforward and not heavily nuanced.

I’m beginning to see a pattern here

One of the most widely applied types of machine learning is pattern recognition, based on clustering and categorization of data. Amazon customers have already experienced how machine learning-based analytics can be used in sales: Amazon’s recommendation engine uses “clustering” based on customer purchases and other data to determine products someone might be interested in.

Those sorts of analytics have been used in brick-and-mortar stores for years—some grocery stores place “clustered” products on display near frequently purchased items. But machine learning can automate those sorts of tasks in something approaching real time.
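
Here is a rough sketch of the clustering idea using k-means on made-up purchase-count data. A production recommendation engine is far more elaborate, but the basic mechanics of grouping similar shoppers, then suggesting what the rest of their cluster tends to buy, look something like this.

```python
# Clustering sketch: group shoppers by what they buy (synthetic data).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# rows = shoppers; columns = purchase counts for [produce, snacks, diapers, beer]
purchases = rng.poisson(lam=[4.0, 1.0, 0.2, 0.3], size=(500, 4))

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(purchases)
print(kmeans.cluster_centers_.round(1))    # the "typical basket" for each cluster

new_shopper = np.array([[0, 5, 3, 2]])     # a basket heavy on snacks, diapers, beer
print("assigned cluster:", kmeans.predict(new_shopper)[0])
```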

Machine learning excels in all sorts of pattern recognition—in medical imaging, financial services (“is this a fraudulent credit-card transaction?”), and even IT management (“if the server workload is too high, try these things until the problem goes away”).

That sort of automation based on data is being used outside the retail world to drive other routine tasks. The startup Apstra, for example, has tools that use machine learning and real-time analytics to automatically fine-tune and optimize data center performance, not only reducing the need for some IT administrative staff but also reducing the need to upgrade hardware.

Another startup, Respond Software, has expert systems that corporate Security Operations Centers (SOCs) can use to automatically diagnose and escalate security incidents. And Darktrace, another security vendor, uses machine learning to identify suspicious behavior on networks—the company’s Enterprise Immune System looks for activities that fall outside of previously observed behaviors, and it alerts SOC staffers to things that may be of interest. And a module called Antigena can automate response to detected problems, disrupting network connections that appear to be malicious.

Human intelligence

Machine learning has also been applied to analysis of more human communications. With a good bit of work by data scientists and developers up front, machine learning algorithms have been able to relatively reliably detect the “sentiment” of a piece of text—determining whether the contents are positive or negative. That has begun to be applied to “text mining” in social media and to image processing as well.
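
As a rough illustration of how this kind of sentiment scoring works under the hood, here is a tiny bag-of-words classifier trained on a handful of invented, hand-labeled sentences. The commercial services mentioned below are far more sophisticated, but the basic mapping from text to a positive/negative label is similar in spirit.

```python
# Toy sentiment classifier: TF-IDF features + logistic regression on invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["love this product", "works great, very happy", "fantastic value",
         "terrible support", "broke after one day", "would not recommend"]
labels = ["positive", "positive", "positive",
          "negative", "negative", "negative"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["fantastic product, very happy"]))   # -> ['positive']
```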

Microsoft’s Project Oxford created an application interface for checking the emotional expression of people in images and also created a text-processing API that detects sentiment. IBM’s Watson also performs this sort of analysis with its Tone Analyzer, which can rank the emotional weight of tweets, e-mails, and other texts.

These types of technologies are being integrated into customer service systems, which identify customer complaints about products or services and prompt a human to respond to them. IBM partnered with Genesys to build Watson into Genesys’ “Customer Experience Platform,” providing a way to respond to customer questions directly and connect people with complaints to employees armed with the best information to resolve them. The system has to learn from humans along the way but gradually improves in responses—though the effectiveness of the system has yet to be fully tested.

Even the ultimate people field—human resources—is benefitting from AI in terms of measuring worker productivity and efficiency, conducting performance reviews, and even deploying intelligent chatbots that can help employees schedule vacations or express concerns to management using plain language. AI startups are optimizing mundane HR tasks: Butterfly offers coaching and mentoring, Entelo helps recruiters scour social media to find employment candidates, and Textio helps with writing more effective job descriptions.

But AI doesn’t do well with uncertainty, and that includes biases in the training data or in the expert rules. Different doctors, after all, might honestly make different diagnoses or recommend different treatments. So, what’s the expert diagnosis system to do?

An often-discussed case of machine learning is screening college admission applications. The AI was trained on several years’ worth of admissions files, such as school report cards, test scores, and even essays, and was told whether each student had been admitted or rejected by human admissions officers.

The goal was to mimic those admissions officers, and the system worked—but also mimicked their implicit flaws, such as biases toward certain racial groups, socio-economic classes, and even activities like team sports participation. The conclusion: technical success but epic fail otherwise.

Until there are breakthroughs in handling ambiguity or disagreements in rules and implicit or explicit biases in training data, AI will struggle.

Help wanted

To get better, machine learning systems need to be trained on better data. But in order to understand that data, in many cases, humans have to pre-process the information—applying the appropriate metadata and formatting, then directing machine learning algorithms at the right parts of data to get better results.

Many of the advances being made in machine learning and artificial intelligence applications today are happening because of work done by human experts across many fields to provide more and better data.

Cheap historical satellite imagery and improved weather data, for example, make it possible for machine learning engines to forecast crop failures in developing countries. Descartes Labs was able, using LANDSAT 8 satellite data, to build a 3.1 trillion pixel mosaic of the world’s arable land and track changes in plant growth. Combined with meteorological data, the company’s machine learning-based system was able to accurately predict corn and soybean yield in the US, county by county. With the increasingly large volume of low-cost satellite imagery and pervasive weather sensors, forecasting systems will continue to become more accurate—with the help of data scientists and other human experts.

Forecasting of other sorts may well change the shape of businesses. A recent paper by researchers at Nanyang Technological University in Singapore demonstrated that machine learning forecasts using neural networks could predict manufacturing demand more accurately than expert systems or other forecasting methodologies that rely only on time-series data, allowing companies to plan their inventory better. The advantage was particularly pronounced in industries with “lumpy” demand—where demand is either high or low but seldom in between—because the neural networks can find patterns without being told how to model the data in advance.
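
As a highly simplified sketch of that idea (not the researchers’ actual method or data), the snippet below trains a small neural network to predict the next period of a synthetic “lumpy” demand series from its recent history, with no hand-specified time-series model.

```python
# Neural-network demand forecasting sketch on synthetic, intermittent ("lumpy") demand.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
# Demand is zero most periods, with occasional spikes around 20 units.
demand = rng.binomial(1, 0.3, size=400) * rng.poisson(20, size=400)

lags = 8                                           # predict from the previous 8 periods
X = np.array([demand[i:i + lags] for i in range(len(demand) - lags)])
y = demand[lags:]

model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0).fit(X, y)
next_forecast = model.predict(demand[-lags:].reshape(1, -1))[0]
print("forecast for the next period:", round(float(next_forecast), 1))
```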

These sorts of systems, as they grow more complex and apply more types of data, could provide businesses and organizations with the power to find patterns in even more vast datasets. But while we can use AI to help humans make decisions about things we already know how to do, we can’t send AI-based agents into the true unknown without human oversight to provide expert rules or create new training data from scratch.

While some AI systems, like IBM’s Watson or Amazon’s Alexa, can hoover up huge amounts of unstructured data from the Internet and use it for text-based searches and for building up a knowledge base to help answer questions, that won’t help in creating new training databases for pattern recognition, at least not yet. The science-fiction trope of a computer intelligently and autonomously searching out its own data sources (and, for some inexplicable reason, flashing black-and-white battlefield pictures on a screen) is beyond today’s AI—and beyond tomorrow’s as well. The decisions—and the questions—will continue to be made by humans.

Stephen Hawking Says We Should Really Be Scared Of Capitalism, Not Robots


“If machines produce everything we need, the outcome will depend on how things are distributed.”


Machines won’t bring about the economic robot apocalypse ― but greedy humans will, according to physicist Stephen Hawking.

In a Reddit Ask Me Anything session on Thursday, the scientist predicted that economic inequality will skyrocket as more jobs become automated and the rich owners of machines refuse to share their fast-proliferating wealth.

If machines produce everything we need, the outcome will depend on how things are distributed. Everyone can enjoy a life of luxurious leisure if the machine-produced wealth is shared, or most people can end up miserably poor if the machine-owners successfully lobby against wealth redistribution. So far, the trend seems to be toward the second option, with technology driving ever-increasing inequality.

Essentially, machine owners will become the bourgeoisie of a new era, in which the corporations they own won’t provide jobs to actual human workers.

As it is, the chasm between the super rich and the rest is growing. For starters, capital ― such as stocks or property ― accrues value at a much faster rate than the actual economy grows, according to the French economist Thomas Piketty. The wealth of the rich multiplies faster than wages increase, and the working class can never even catch up.

But if Hawking is right, the problem won’t be about catching up. It’ll be a struggle to even inch past the starting line.

Elon Musk Says the Future of Humanity Depends on Us Merging With Machines


As society becomes increasingly automated by robots and artificial intelligence systems, one of the ways forward for humanity will be to physically merge with machines, Elon Musk said this week.

In comments made at the World Government Summit in Dubai on Monday, the billionaire entrepreneur behind Tesla and SpaceX said the kind of daily dependence we already have on personal technology will only increase as time goes on – to the point where human intelligence and machine thinking effectively become one.

“To some degree we are already a cyborg – you think of all the digital tools that you have – your phone, your computer,” Musk told the crowd.

“The applications that you have. The fact that you can ask a question and instantly get an answer from Google and other things.”

Calling this existing dependence on personal technology a “digital tertiary layer”, Musk said it’s effectively already part of our neural biology, affecting the way we think, but also with the ability to outlive our physical selves.

“Think of the limbic system — the animal brain and the cortex as the thinking part of the brain, and your digital self as a third layer,” he said. “If you die, your digital ghost is still around. All of their emails, and social media – that still lives if they die.”

While it’s easy to see the logic in this, the more ‘out there’ part of Musk’s vision is how humanity will evolve in the future – especially in a world where robots can perform many of the jobs humans currently do, and even outperform us.

In that kind of a world, where huge numbers of unemployed workers subsist on a universal basic income, how will people find satisfaction?

Musk’s answer is that humans will deepen our ties with technology even further, to the point where we become cyborgs, as a way of upgrading our inherent natural abilities.

“Over time we will see a closer merger of biological intelligence and digital intelligence. It is all about the bandwidth of the brain,” he said.

“Some high-bandwidth interface to the brain will be something which helps achieve symbiosis between human and machine intelligence, which solves a control and usefulness problem.”

It’s not the first time Musk has talked along these lines.

Last year, he spoke about the need for humans to consider getting brain implants in the future to keep up with the rapid evolution of AI – a technology Musk fears could become dangerous to the world if it isn’t tightly controlled.

He’s also been known to share some other, shall we say, interesting observations about life as we know it.

This is the guy who says it’s not impossible – and maybe even likely – that the physical Universe we perceive is a giant simulation being run by an alien civilisation more advanced than our own, with Musk calculating that “the odds that we’re in base reality is one in billions”.

He’s also suggested that we nuke Mars to make it habitable – but we might want to hold off on that for now, since newer developments with SpaceX rockets could have us making return trips to the Red Planet in less than 10 years.

Of course, as much as we love the idea of augmenting our biological selves with cybernetic enhancements that give us all kinds of new abilities, Musk’s visions of tomorrow’s Earth have their problems too.

In a future world where it’s illegal to drive cars because self-driving vehicles are so much safer, and there aren’t any jobs or even household chores to occupy us, making a fresh start on Mars doesn’t actually sound like such a bad idea.

One ticket, please.

Becoming Human: Intel is Bringing the Power of Sight to Machines


IN BRIEF

Intel acquires eight-year-old startup Movidius to position itself as the leader in computer vision and depth-sensing technologies. While the details of the acquisition remain undisclosed, both Intel and Movidius stand to gain from the deal.

COMPUTER VISIONARY

No, this is not the beginning of a Terminator-esque world.

But yes, it certainly is the start of major developments in computer vision and machine learning technology. Intel is intent on boosting its RealSense platform by acquiring Dublin-based computer vision startup Movidius.

Intel’s existing framework, coupled with Movidius’ power-efficient system on chip (SoC), is bound to lead to major developments in consumer and enterprise products.

“As part of Intel, we’ll remain focused on this mission, but with the technology and resources to innovate faster and execute at scale. We will continue to operate with the same eagerness to invent and the same customer-focus attitude that we’re known for,” Movidius CEO Remi El-Ouazzane writes in a statement posted on the company’s site.

TO ADD SIGHT TO MACHINES

With the existing applications of Intel’s RealSense platform, Movidius is even better equipped to realize its dream of giving sight to machines. But Movidius is not the only one that will benefit from this deal.

Remi El-Ouazzane and Josh Walden. Credit: Intel

“We see massive potential for Movidius to accelerate our initiatives in new and emerging technologies. The ability to track, navigate, map and recognize both scenes and objects using Movidius’ low power and high performance SoCs opens up opportunities in areas where heat, battery life and form factors are key,” explains Josh Walden, Senior Vice President and General Manager of Intel’s New Technology Group.

Movidius has existing deals with Lenovo, for its Myriad 2 processors, and with Google, to use its neural computation engine to improve machine learning capabilities of mobile devices.

Why it’s time to prepare for a world where machines can do your job


Radical changes in employment patterns are on the way as artificial intelligence takes on many routine, repetitive tasks currently performed by people.

For decades movies have warned of intelligent machines taking our lives while ignoring a more plausible near-future threat: that they will take our jobs.

A growing number of economists and artificial intelligence researchers are recommending that societies prepare for a world where large numbers of jobs are automated.

If they’re right, the disruption to labour markets would be significant: the jobs identified as vulnerable are held by swathes of the population including supermarket cashiers and shop assistants, waiters, truck drivers and office admins. All of these tasks have a high probability of being carried out by software within “a decade or two”, according to a study by the Oxford Martin School & Faculty of Philosophy in the UK.

Not everyone agrees, but these predictions have struck a chord with some of the best-known names in AI research.

Andrew Ng is the chief scientist for Chinese search giant Baidu and specialises in the field of deep learning, having previously worked on the “Google Brain” project. Recently, Baidu demonstrated a deep learning system that is able to describe what’s in images and get it right almost 95 percent of the time.

“I do think there’s a significant risk of technological unemployment over the next few decades,” said Ng. “Many people are doing routine, repetitive jobs. Unfortunately, technology is especially good at automating routine, repetitive work.”

Jobs may already be being destroyed at a faster rate than they are being created. MIT (Massachusetts Institute of Technology) economists Erik Brynjolfsson and Andrew McAfee drew attention to how technology might have broken the centuries-old link between employment and productivity in their recent book The Second Machine Age.

The book outlines how for most of the second half of the twentieth century the economic value generated in the US — the country’s productivity — grew hand-in-hand with the number of workers. But in 2000 the two measures began to diverge. From the turn of the century a gap opened up between productivity and total employment. By 2011, that delta had widened significantly, reflecting continued economic growth with little associated increase in job creation.

“In the US it’s pretty clear that the labour force participation ratio has been falling for about a decade, the share of the population that’s working is lower and median income has also stagnated,” said Brynjolfsson.

“There’s clearly something going on there that needs to be better understood and our view is that technology is a big part of the story.”

Brynjolfsson isn’t a neo-luddite trying to hold back new advances. He’s pointing out that we are undergoing a technologically-driven shift in labour of the kind witnessed throughout history — a shift that societies should prepare for.

“Technology has always been creating jobs and always been destroying jobs. There’s this flow, but the jobs that are created and the jobs that are destroyed tend to be different kinds of jobs,” he said, stressing that those displaced may not be suited to carry out the jobs created by AI and automation.

Brynjolfsson gives the example of truck driving — a job he sees as ripe for automation and which he said is the “number one occupation for US males”, employing more than three million people.

“That particular role I can easily see becoming much less important in the next decade or two,” he said, referencing recent advances in the development of self-driving cars.

He questions how many displaced truck drivers would be well-placed to take on newly created roles, or those jobs resistant to automation because they rely on emotional understanding or complex physical tasks.

“Then the question is, which occupations become more important? Maybe data scientist or pre-school teacher or massage therapist. How many of those truck drivers are going to be comfortable being reskilled and moving into those other roles, and be able to do those other jobs effectively? You can see there may be a mismatch.”

REAPING THE BOUNTY

But what about the positives of technologically-driven change? Some commentators believe the negative effects of widespread automation could be offset by reduced costs of goods and services and by the wider population sharing in greater profits from lowering the cost of production.

Robert D. Atkinson, president of the ITIF (Information Technology & Innovation Foundation), believes that increasing the use of technology in workplaces “cuts costs, and these cost savings are passed on in the form of lower prices and/or higher wages”.

“If we were somehow able to triple productivity in a decade (something that has never ever happened in any nation ever in history), consumers would absolutely not have a lack of things to spend that money on (more vacations, bigger TV, more eating out, a motor boat, etcetera) and all that would create jobs.”

Brynjolfsson and McAfee talk about the role of technology in lowering costs and driving up wages in The Second Machine Age — referring to the benefits it generates as “the bounty”. This effect can be seen in the many ways modern information technology has lowered costs, with the web making it affordable for anyone with an internet-connected computer to try their hand at being a writer or broadcaster, rent rooms in homes on the cheap or access crowdfunding.

MIT economist Erik Brynjolfsson

However, some observations suggest that technology-fuelled returns are often poorly distributed and insufficient to offset other rising costs. For example, one theory proposes that the internet enables everyone to access the very best there is — the best writing, the best software, the cheapest retailers. This creates a “winner takes all” economy, where the top performers have access to a huge audience who aren’t inclined to use anyone else. In this model the majority of the “bounty” isn’t shared but is captured by those sitting on top of the pile. Another point in The Second Machine Age is that modern software companies often employ far fewer people than the companies they disrupt, an example being Facebook and its photo-sharing service Instagram, which employ around 10,000 people — a fraction of the number working at the photography firm Kodak in its heyday.

And while the cost of broadcasting yourself may have plummeted, the same cannot be said of many of the essentials people need to survive, such as food, drink and fuel. The Second Machine Age cites research by Jared Bernstein, who compared increases in median family income in the US between 1990 and 2008 with changes in the cost of housing, healthcare, and college. He found that while family income grew by around 20 percent during that time, prices for housing and college grew by about 50 percent, and healthcare by more than 150 percent.

The recent spread of information technology has also not coincided with a growth in wages. For the first time since the Great Depression, over half the total income in the United States went to the top 10 percent of Americans in 2012. On top of that, between 1973 and 2011 the median hourly wage in the US barely changed, growing by just 0.1 percent per year.

IS THE TECHNOLOGY READY?

In general, the abilities of AI tend to be narrow: systems can recognise what’s in an image or learn how to screw a top on a bottle, but, unlike people, they can’t switch from these specific tasks to do something entirely unrelated, such as make a sandwich.

Without a human’s ability to react to the multitude of unexpected circumstances the real world can throw up, software and robots still have many challenges to overcome if they are to take on jobs outside of tightly-controlled environments, such as factory production lines.

Google’s self-driving cars may have travelled more than one million miles, for example, but they still struggle with scenarios that human drivers could take in their stride.

“It’s pretty clear that AI at the moment, using driverless cars as an example, isn’t at a level where it can entirely be trusted to take over,” said Sean Holden, senior lecturer in Machine Learning in the Computer Laboratory at Cambridge University.

“No matter what you read by PR departments with deep pockets, an AI cannot at the moment, if someone is standing at the side of a road waving their arms about, work out whether it’s someone saying hello to their friend, and therefore nothing to do with them, or someone gesticulating at it to stop.”

There is also still a gulf between the abilities of robots and humans when it comes to certain complex physical tasks that we take for granted. These shortcomings were very apparent at this year’s Darpa Robotics Challenge where many bots failed to stay upright. Robots also struggle with manual tasks that we find simple, such as picking items from warehouse shelves.

Google’s self-driving car.

But Baidu’s Ng points out that automation doesn’t require that software be capable of replacing humans entirely, noting that it can and likely will be used to reduce humans’ share of the work. He gives the example of hospital radiologists: a skilled job, but one that involves considerable amounts of routine, repetitive work.

“It’s also not just about full automation. For example, if 50 percent of a radiologist’s job can be automated, this will put pricing pressure on their salaries.”

In the case of truck driving, automated vehicles might control the bulk of the trip along highways, with humans taking over the last leg of the journey through built-up areas. And for taxi drivers, self-driving chauffeurs could be restricted to city routes that have been well-mapped and understood, as will be the case in Milton Keynes in the UK.

Other machine intelligence researchers are more bullish about the prospects for AI-driven automation, contending that software will rapidly become more accomplished as it takes on new tasks.

“It all boils down to machine learning. Most of the automation will be driven by software that learns from its own experience,” said Hod Lipson, professor of Mechanical Engineering, Columbia University in NYC.

“As it learns, it gets better. Not just that specific instance of the software gets better, but all instances learn from each other’s experiences. This compounding effect means that there is tremendous leverage.”

Lipson gives the example of a self-driving car that shares its “wisdom” with other instances of the same software inside other autonomous vehicles.

“In a relatively short while, the driverless car’s AI will have accumulated a billion hours of driving experience — more than a thousand human lifetimes. That’s difficult to beat. And it’s the same situation for medical diagnostics, strategic investment, farming, pharmacy. The AI doctor that sees patients will quickly have seen millions of patients and encountered almost all possible types of problems — more than even the most experienced doctor will see in her lifetime.”

NOT ALL DOOM AND GLOOM

There’s another, more optimistic outcome from all this automation. That companies will use it to augment what people can do, rather than replace or reduce their role. In this scenario people are freed from the more boring, rote aspects of their jobs and instead focus on tasks requiring creativity and other qualities that software struggles with.

Brynjolfsson talks about this possibility, describing it as “racing with machines”, rather than against them.

The power of human-machine collaboration was neatly illustrated in the Playchess.com tournament in 2005. Two amateur players teamed up with custom chess software running on a laptop to win the contest, beating human grandmasters and a supercomputer working individually.

That same complementary relationship is at the heart of the success of the hugely popular ride-sharing and taxi company Uber, says Teppo Felin, professor of strategy at Oxford Said Business School. Uber uses a system that directs drivers to the nearest passengers, who summon their ride using a smartphone app. The system relies on humans to drive the passengers while the drivers rely on the system to guide them. It’s a good illustration of how humans and machines can achieve more together than individually, says Felin.

Uber: An example of man and machine working in co-operation. 

In spite of the disruption Uber has caused to existing taxi drivers and the firm’s work to develop self-driving cars, which could take humans out of the equation in the long run, Brynjolfsson said it is an example of IT creating, rather than destroying, employment.

“For now, it is creating a lot of work opportunities. That’s not because people have learned new skills, it was rather because a group of entrepreneurs invented a new business model that found new ways of using existing skills.

“In some places, like in San Francisco, there are far more Uber drivers than there ever were taxi and limo drivers put together. So there’s a net increase in that category.”

That automation can lead to positive outcomes of this sort is by no means at odds with Brynjolfsson’s stance. He isn’t arguing that widespread joblessness and social unrest is inevitable or that automation will happen overnight, rather that such technologically-driven change is happening and societies should be prepared.

“It’s not that the overall demand for labour falls, so much that the demand for certain types of skills fall and demand for other skills increase and if we don’t have a good match in the economy, and if we don’t think about it and develop our institutions correctly, then you’re going to have losers as well as winners.”

A large part of that preparation, he argues, involves reforming education — looking beyond the Victorian obsessions with reading, writing and arithmetic to fostering skills that are tricky for computers, such as ideation (the creation of new ideas), large-frame pattern recognition, and complex communication — as well as making it easier for people to continue to learn throughout their lives.

Baidu’s Ng agrees, and believes more effort needs to be put into making the kind of education offered at the world’s top universities available online — something he’s engaged in as the co-founder of the open online course service Coursera.

“Our educational system just isn’t set up right now for getting huge numbers of people to do non-routine, creative work. The top universities in the world do this well, but for the most part we haven’t been able to give people this type of education at scale,” he said.

But reforming schooling systems and making Ivy-League education available to all are not overnight jobs, and Columbia University’s Lipson stresses the importance of getting to grips with these issues today.

“Often people ask me about the dangers of AI, thinking that AI robots will one day ‘take over the world’. The truth is more subtle. There will be no titanium robots marching down the street and shooting people. There will be AI that gradually learns to do everything we do. And when a machine can do almost everything better than almost everyone, our social structure will begin to unravel. And that’s something we need to prepare for.”

 

Minds and machines: The art of forecasting in the age of artificial intelligence


The human/artificial intelligence (AI) relationship is just heating up. So when is AI better at predicting outcomes, and when are humans? What happens when you combine forces? And more broadly, what role will human judgment play as machines continue to evolve?

Human judgment in the age of smart machines

Two of today’s major business and intellectual trends offer complementary insights about the challenge of making forecasts in a complex and rapidly changing world. Forty years of behavioral science research into the psychology of probabilistic reasoning have revealed the surprising extent to which people routinely base judgments and forecasts on systematically biased mental heuristics rather than careful assessments of evidence. These findings have fundamental implications for decision making, ranging from the quotidian (scouting baseball players and underwriting insurance contracts) to the strategic (estimating the time, expense, and likely success of a project or business initiative) to the existential (estimating security and terrorism risks).

The bottom line: Unaided judgment is an unreliable guide to action. Consider psychologist Philip Tetlock’s celebrated multiyear study concluding that even top journalists, historians, and political experts do little better than random chance at forecasting such political events as revolutions and regime changes.1

The second trend is the increasing ubiquity of data-driven decision making and artificial intelligence applications. Once again, an important lesson comes from behavioral science: A body of research dating back to the 1950s has established that even simple predictive models outperform human experts’ ability to make predictions and forecasts. This implies that judiciously constructed predictive models can augment human intelligence by helping humans avoid common cognitive traps. Today, predictive models are routinely consulted to hire baseball players (and other types of employees), underwrite bank loans and insurance contracts, triage emergency-room patients, deploy public-sector case workers, identify safety violations, and evaluate movie scripts. The list of “Moneyball for X” case studies continues to grow.

More recently, the emergence of big data and the renaissance of artificial intelligence (AI) have made comparisons of human and computer capabilities considerably more fraught. The availability of web-scale datasets enables engineers and data scientists to train machine learning algorithms capable of translating texts, winning at games of skill, discerning faces in photographs, recognizing words in speech, piloting drones, and driving cars. The economic and societal implications of such developments are massive. A recent World Economic Forum report predicted that the next four years will see more than 5 million jobs lost to AI-fueled automation and robotics.2

Let’s dwell on that last statement for a moment: What about the art of forecasting itself? Could one imagine computer algorithms replacing the human experts who make such forecasts? Investigating this question will shed light on both the nature of forecasting—a domain involving an interplay of data science and human judgment—and the limits of machine intelligence. There is both bad news (depending on your perspective) and good news to report. The bad news is that algorithmic forecasting has limits that machine learning-based AI methods cannot surpass; human judgment will not be automated away anytime soon. The good news is that the fields of psychology and collective intelligence are offering new methods for improving and de-biasing human judgment. Algorithms can augment human judgment but not replace it altogether; at the same time, training people to be better forecasters and pooling the judgments and fragments of partial information of smartly assembled teams of experts can yield still-better accuracy.

We predict that you won’t stop reading here.

When algorithms outperform experts

While the topic has never been timelier, academic psychology has studied computer algorithms’ ability to outperform subjective human judgments since the 1950s. The field known as “clinical vs. statistical prediction” was ushered in by psychologist Paul Meehl, who published a “disturbing little book”3 (as he later called it) documenting 20 studies that compared the predictions of well-informed human experts with those of simple predictive algorithms. The studies ranged from predicting how well a schizophrenic patient would respond to electroshock therapy to how likely a student was to succeed at college. Meehl’s study found that in each of the 20 cases, human experts were outperformed by simple algorithms based on observed data such as past test scores and records of past treatment. Subsequent research has decisively confirmed Meehl’s findings: More than 200 studies have compared expert and algorithmic prediction, with statistical algorithms nearly always outperforming unaided human judgment. In the few cases in which algorithms didn’t outperform experts, the results were usually a tie.4 The cognitive scientists Richard Nisbett and Lee Ross are forthright in their assessment: “Human judges are not merely worse than optimal regression equations; they are worse than almost any regression equation.”5
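
For a sense of what “almost any regression equation” looks like in practice, here is a minimal sketch with invented, standardized data standing in for inputs like test scores: a handful of observed variables, weights estimated from past cases, and the same formula applied mechanically to every new case.

```python
# A simple linear scoring model of the kind these studies compare against experts.
# The data are synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
past_cases = rng.normal(size=(300, 2))        # e.g., [standardized test score, prior GPA]
outcome = 0.6 * past_cases[:, 0] + 0.3 * past_cases[:, 1] + rng.normal(0, 0.5, 300)

model = LinearRegression().fit(past_cases, outcome)
print("estimated weights:", model.coef_.round(2))    # the whole "judgment" is these numbers

new_applicant = np.array([[1.2, -0.4]])
print("predicted outcome:", model.predict(new_applicant).round(2))
```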

Subsequent research summarized by Daniel Kahneman in Thinking, Fast and Slow helps explain these surprising findings.6 Kahneman’s title alludes to the “dual process” theory of human reasoning, in which distinct cognitive systems underpin human judgment. System 1 (“thinking fast”) is automatic and low-effort, tending to favor narratively coherent stories over careful assessments of evidence. System 2 (“thinking slow”) is deliberate, effortful, and focused on logically and statistically coherent analysis of evidence. Most of our mental operations are System 1 in nature, and this generally serves us well, since each of us makes hundreds of daily decisions. Relying purely on time- and energy-consuming System 2-style deliberation would produce decision paralysis. But—and this is the non-obvious finding resulting from the work of Kahneman, Amos Tversky, and their followers—System 1 thinking turns out to be terrible at statistics.

The major discovery is that many of the mental rules of thumb (“heuristics”) integral to System 1 thinking are systematically biased, and often in surprising ways. We overgeneralize from personal experience, act as if the evidence before us is the only information relevant to the decision at hand, base probability estimates on how easily the relevant scenarios leap to mind, downplay the risks of options to which we are emotionally predisposed, and generally overestimate our abilities and the accuracy of our judgments.7

It is difficult to overstate the practical business implications of these findings. Decision making is central to all business, medical, and public-sector operations. The dominance and biased nature of System 1-style decision making accounts for the persistence of inefficient markets (even when the stakes are high) and implies that even imperfect predictive models and other types of data products can lead to material improvements in profitability, safety, and efficiency. A very practical takeaway is that perfect or “big” data is not a prerequisite for highly profitable business analytics initiatives. This logic, famously dramatized in the book and subsequent movie Moneyball, applies to virtually any domain in which human experts repeatedly make decisions in stable environments by subjectively weighing evidence that can be quantified and statistically analyzed. Because System 1-style decision making is so poor at statistics, economically substantial benefits can often result from using even limited or imperfect data to de-bias our decisions.8

While this logic has half-century-old roots in academic psychology and has been commonplace in the business world since the appearance of Moneyball, it is still not universally embraced. For example, given that Michael Lewis’s book was, in essence, about data-driven hiring decisions, it is perhaps ironic that hiring decisions at most organizations are still commonly influenced by subjective impressions formed in unstructured job interviews, despite well-documented evidence about the limitations of such interviews.9

Though even simple algorithms commonly outperform unaided expert judgment, they do not “take humans out of the loop,” for several reasons. First, the domain experts for whom the models are designed (hiring managers, bank loan or insurance underwriters, physicians, fraud investigators, public-sector case workers, and so on) are the best source of information on what factors should be included in predictive models. These data features generally don’t spontaneously appear in databases that are used to train predictive algorithms. Rather, data scientists must hard-code them into the data being analyzed, typically at the suggestion of domain experts and end users. Second, expert judgment must be used to decide which historical cases in one’s data are suitably representative of the future to be included in one’s statistical analysis.10

The statistician Rob Hyndman expands on these points, offering four key predictability factors that the underlying phenomenon must satisfy to build a successful forecasting model:11

  1. We understand and can measure the causal factors.
  2. There is a lot of historical data available.
  3. The forecasts do not affect the thing we are trying to forecast.
  4. The future will somewhat resemble the past in a relevant way.

For example, standard electricity demand or weather forecasting problems satisfy all four criteria, whereas all but the second are violated in the problem of forecasting stock prices. Assessing these four principles in any particular setting requires human judgment and cannot be automated by any known techniques.
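
To see why the electricity-demand case is the friendly one, here is a minimal seasonal-naive sketch on synthetic hourly data: when the drivers are understood, history is plentiful, and the future resembles the past, even “repeat the same hour from last week” makes a respectable baseline forecast.

```python
# Seasonal-naive forecasting sketch on synthetic hourly electricity demand.
import numpy as np

rng = np.random.default_rng(4)
hours = np.arange(24 * 29)                                   # 29 days of hourly observations
demand = 100 + 20 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 3, hours.size)

history, actual = demand[:-24], demand[-24:]                 # hold out the final day
forecast = history[-24 * 7:][:24]                            # the same day, one week earlier

print("mean absolute error:", np.abs(forecast - actual).mean().round(2))   # small (~3 units)
```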

Finally, even after the model has been built and deployed, human judgment is typically required to assess the applicability of a model’s prediction in any particular case. After all, models are not omniscient—they can do no more than combine the pieces of information presented to them. Consider Meehl’s “broken leg” problem, which famously illustrates a crucial implication. Suppose a statistical model predicts that there is a 90 percent probability that Jim (a highly methodical person) will go to the movies tomorrow night. While such models are generally more accurate than human expert judgment, Nikhil knows that Jim broke his leg over the weekend. The model indication, therefore, does not apply, and the theater manager would be best advised to ignore—or at least down-weight—it when deciding whether or not to save Jim a seat. Such issues routinely arise in applied work and are a major reason why models can guide—but typically cannot replace—human experts. Figuratively speaking, the equation should be not “algorithms > experts” but instead, “experts + algorithms > experts.”

Of course, each of these principles predates the advent of big data and the ongoing renaissance of artificial intelligence. Will they soon become obsolete?

What computers still can’t do

Continually streaming data from Internet of Things sensors, cloud computing, and advances in machine learning techniques are giving rise to a renaissance in artificial intelligence that will likely reshape people’s relationship with computers.12 “Data is the new oil,” as the saying goes, and computer scientist Jon Kleinberg reasonably comments that, “The term itself is vague, but it is getting at something that is real. . . . Big Data is a tagline for a process that has the potential to transform everything.”13

A classic AI application based on big data and machine learning is Google Translate, a tool created not by laboriously encoding fundamental principles of language into computer algorithms but, rather, by extracting word associations in innumerable previously translated documents. The algorithm continually improves as the corpus of texts on which it is trained grows. In their influential essay “The unreasonable effectiveness of data,” Google researchers Alon Halevy, Peter Norvig, and Fernando Pereira comment:

[I]nvariably, simple models and a lot of data trump more elaborate models based on less data. . . . Currently, statistical translation models consist mostly of large memorized phrase tables that give candidate mappings between specific source- and target-language phrases.14

Their comment also pertains to the widely publicized AI breakthroughs in more recent years. Computer scientist Kris Hammond states:

[T]he core technologies of AI have not changed drastically and today’s AI engines are, in most ways, similar to years’ past. The techniques of yesteryear fell short, not due to inadequate design, but because the required foundation and environment weren’t built yet. In short, the biggest difference between AI then and now is that the necessary computational capacity, raw volumes of data, and processing speed are readily available so the technology can really shine.15

A common theme is applying pattern recognition techniques to massive databases of user-generated content. Spell-checkers are trained on massive databases of user self-corrections, “deep learning” algorithms capable of identifying faces in photographs are trained on millions of digitally stored photos,16 and the computer system that beat the Jeopardy game show champions Ken Jennings and Brad Rutter incorporated a multitude of information retrieval algorithms applied to a massive body of digitally stored texts. The cognitive scientist Gary Marcus points out that the latter application was feasible because most of the knowledge needed to answer Jeopardy questions is electronically stored on, say, Wikipedia pages: “It’s largely an exercise in data retrieval, to which Big Data is well-suited.”17

The variety and rapid pace of these developments have led some to speculate that we are entering an age in which the capabilities of machine intelligence will exceed those of human intelligence.18 While too large a topic to broach here, it’s important to be clear about the nature of the “intelligence” that today’s big data/machine learning AI paradigm enables. A standard definition of AI is “machines capable of performing tasks normally performed by humans.”19 Note that this definition applies to more familiar data science applications (such as scoring models capable of automatically underwriting loans or simple insurance contracts) as well as to algorithms capable of translating speech, labeling photographs, and driving cars.

Also salient is the fact that all of the AI technologies invented thus far—or likely to appear in the foreseeable future—are forms of narrow AI. For example, an algorithm designed to translate documents will be unable to label photographs and vice versa, and neither will be able to drive cars. This differs from the original goals of such AI pioneers as Marvin Minsky and Herbert Simon, who wished to create general AI: computer systems that reason as humans do. Impressive as they are, today’s AI technologies are closer in concept to credit-scoring algorithms than they are to 2001’s disembodied HAL 900020 or the self-aware android Ava in the movie Ex Machina.21

Returning to the opening question of this essay: What about forecasting? Do big data and AI fundamentally change the rules or threaten to render human judgment obsolete? Unlikely. As it happens, forecasting is at the heart of a story that prompted a major reevaluation of big data in early 2014. Some analysts had extolled Google Flu Trends (GFT) as a prime example of big data’s ability to replace traditional forms of scientific methodology and data analysis. The idea was that Google could use digital exhaust from people’s flu-related searches to track flu outbreaks in real time; this seemed to support the arguments of pundits such as Chris Anderson, Kenneth Cukier, and Viktor Mayer-Schönberger, who had claimed that “correlation is enough” when the available data achieve sufficient volume, and that traditional forms of analysis could be replaced by computer algorithms seeking correlations in massive databases.22 However, during the 2013 flu season, GFT’s predictions proved wildly inaccurate—roughly 140 percent off—and left analysts questioning their models. The computational social scientist David Lazer and his co-authors published a widely cited analysis of the episode, offering a twofold diagnosis23 of the algorithm’s ultimate failure:

Neglect of algorithm dynamics. Google continually tweaks its search engine to improve search results and user experience. GFT, however, assumed that the relation between search terms and external events was static; in other words, the GFT forecasting model was calibrated on data that was no longer representative of the data available when forecasts were made. In Rob Hyndman’s terms, this was a violation of the assumption that the future sufficiently resembles the past.

Big data hubris. Built from correlations between Centers for Disease Control and Prevention (CDC) data and millions of search terms, GFT violated the first and most important of Hyndman’s four key predictability factors: understanding the causal factors underlying the data relationships. The result was a plethora of spurious correlations due to random chance (for instance, “seasonal search terms unrelated to the flu but strongly correlated to the CDC data, such as those regarding high school basketball”).24 As Lazer commented, “This should have been a warning that the big data were overfitting the small number of cases.”25 While this is a central concern in all branches of data science, the episode illustrates the seductive—and unreliable—nature of the tacit assumption that the sheer volume of “big” data obviates the need for traditional forms of data analysis.
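The overfitting mechanism is easy to demonstrate. The short simulation below is purely illustrative and uses no real GFT or CDC data: both the target series and the tens of thousands of candidate “search term” series are random noise, yet screening that many candidates against a short series reliably produces strong correlations by chance alone.

    # Simulated "big data hubris": many unrelated predictors, short target series.
    import numpy as np

    rng = np.random.default_rng(0)
    n_weeks, n_terms = 100, 50_000                   # short "CDC" series, huge term pool
    flu = rng.normal(size=n_weeks)                   # stand-in flu series (pure noise)
    terms = rng.normal(size=(n_terms, n_weeks))      # unrelated "search term" series

    flu_z = (flu - flu.mean()) / flu.std()
    terms_z = (terms - terms.mean(axis=1, keepdims=True)) / terms.std(axis=1, keepdims=True)
    corrs = terms_z @ flu_z / n_weeks                # Pearson correlation of each term with "flu"

    print(f"strongest spurious correlation: {np.abs(corrs).max():.2f}")
    print(f"terms with |r| > 0.3: {(np.abs(corrs) > 0.3).sum()}")

None of these correlations would hold up out of sample, which is precisely the trap the Lazer team described.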

“When Google quietly euthanized the program,” GFT quickly went from “the poster child of big data into the poster child of the foibles of big data.”26 The lesson of the Lazer team’s analysis is not that social media data is useless for predicting disease outbreaks. (It can be highly useful.) Rather, the lesson is that generally speaking, big data and machine learning algorithms should be regarded as supplements to—not replacements for—human judgment and traditional forms of analysis.

In Superforecasting: The Art and Science of Prediction, Philip Tetlock (writing with Dan Gardner) discusses the inability of big data-based AI technologies to replace human judgment. Tetlock reports a conversation he had with David Ferrucci, who led the engineering team that built the Jeopardy-winning Watson computer system. Tetlock contrasted two questions:

  1. Which two Russian leaders traded jobs in the last 10 years?
  2. Will two top Russian leaders trade jobs in the next 10 years?

Tetlock points out that the former question concerns a matter of historical record, electronically documented in many online sources, which computer algorithms can identify using pattern-recognition techniques. The latter question requires an informed guess about the intentions of Vladimir Putin, the character of Dmitry Medvedev, and the causal dynamics of Russian politics. Ferrucci expressed doubt that computer algorithms could ever automate this form of judgment under uncertainty. As data volumes grow and machine learning methods continue to improve, pattern recognition applications will better mimic human reasoning, but Ferrucci comments that “there’s a difference between mimicking and reflecting meaning and originating meaning.” That space, Tetlock notes, is reserved for human judgment.27

The data is bigger and the statistical methods have evolved, but the overall conclusion would likely not surprise Paul Meehl: It is true that computers can automate certain tasks traditionally performed only by humans. (Credit scores largely eliminating the role of the bank loan officer is a half-century-old example.) But more generally, they can only assist—not supplant—the characteristically human ability to make judgments under uncertainty.

That said, the nature of human collaboration with computers is likely to evolve. Tetlock cites “freestyle chess” as a paradigm example of the type of human-computer collaboration we are likely to see more of in the future. A discussion of a 2005 “freestyle” chess tournament by grandmaster Garry Kasparov (whom IBM’s Deep Blue famously defeated in 1997) nicely illustrates the synergistic possibilities of such collaborations. Kasparov comments:

The surprise came at the conclusion of the event. The winner was revealed to be not a grandmaster with a state-of-the-art PC but a pair of amateur American chess players using three computers at the same time. Their skill at manipulating and “coaching” their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants. Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.28

Many minds

Human-computer collaboration is therefore a major avenue for improving our abilities to make forecasts and judgments under uncertainty. Another approach is to refine the process of making judgments itself. This is the subject of the increasingly prominent field of collective intelligence. Though it has only recently emerged as an integrated field of study, notions of collective intelligence date back millennia.29 For example, Aristotle wrote that when people “all come together . . . they may surpass—collectively and as a body, although not individually—the quality of the few best.”30 In short, groups are capable of pooling disparate bits of information from multiple individuals to arrive at a better judgment or forecast than any member of the group could produce alone. Speaking figuratively, a “smart” group can be smarter than the smartest person in the group.31

A famous early example of collective intelligence involved the inventor of regression analysis, Francis Galton.32 At a Victorian-era English country fair, Galton encountered a contest involving hundreds of participants who were guessing the weight of an ox. He expected the guesses to be well off the mark, and indeed, they were—even the experts in the crowd failed to come close to the ox’s true weight of 1,198 lbs. But the average of the guesses, made by amateurs and professionals alike, was a near-perfect 1,197 lbs.33
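A few lines of simulation show why the averaging worked. The only figure below taken from the story is the ox’s 1,198-lb weight; the number of guesses and their spread are invented, and the sketch assumes individual errors are independent and roughly unbiased, which is the condition under which crowd averaging shines.

    # Wisdom-of-crowds toy: noisy individual guesses, accurate average.
    import numpy as np

    rng = np.random.default_rng(42)
    true_weight = 1198                                   # lbs, from Galton's story
    guesses = true_weight + rng.normal(0, 75, size=800)  # hypothetical crowd of 800

    print(f"typical individual error: {np.abs(guesses - true_weight).mean():.0f} lbs")
    print(f"error of the crowd average: {abs(guesses.mean() - true_weight):.0f} lbs")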

Prediction markets are another device for combining forecasts. The logic of prediction markets mirrors economist Friedrich Hayek’s view that a market mechanism’s primary function is not simply to facilitate buying and selling but, rather, to collect and aggregate information from individuals.34 The Hollywood Stock Exchange, for example, is an online prediction market in which people use simulated money to buy and sell “shares” of actors, directors, films, and film-related options; it predicts each year’s Academy Award winners with a 92 percent reported accuracy rate. A more business-focused example is the Information Aggregation Mechanism (IAM), created by a joint Caltech/Hewlett-Packard research team. The goal was to forecast sales by aggregating “small bits and pieces of relevant information [existing] in the opinions and intuition of individuals.” After several HP business divisions implemented IAM, the team reported that “the IAM market predictions consistently beat the official HP forecasts.”35 Of course, like financial markets, prediction markets are not infallible. For example, economist Justin Wolfers and two co-authors document a number of biases in Google’s prediction market, finding that “optimistic biases are significantly more pronounced on days when Google stock is appreciating” and that predictions are highly correlated among employees “who sit within a few feet of one another.”36

The Delphi method is a collective intelligence method that attempts to refine the process of group deliberation; it is designed to yield the benefits of combining individually held information while also supporting the type of learning characteristic of smart group deliberation.37 Developed at the Cold War-era RAND Corp. to forecast military scenarios, the Delphi method is an iterative deliberation process that forces group members to converge on a single point estimate. The first round begins with each group member anonymously submitting her individual forecast. In each subsequent round, members must deliberate and then offer revised forecasts that fall within the interquartile range (25th to 75th percentile) of the previous round’s forecasts; this process continues until all the group members converge on a single forecast. Industrial, political, and medical applications have all found value in the method.
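For readers who think in code, the toy sketch below captures the mechanical core of the iteration: each round, forecasts outside the previous round’s interquartile range are pulled back inside it, and deliberation is modeled (crudely, as an assumption of this sketch) by nudging every member toward the group median until the spread is tight. Real Delphi exercises add anonymous written feedback and expert re-judgment between rounds.

    # Schematic Delphi iteration; starting estimates and tolerance are invented.
    import numpy as np

    def delphi(estimates, tol=1.0, max_rounds=20):
        est = np.asarray(estimates, dtype=float)
        for round_no in range(1, max_rounds + 1):
            q1, q3 = np.percentile(est, [25, 75])
            est = np.clip(est, q1, q3)              # out-of-range forecasts move into the IQR
            est = 0.5 * est + 0.5 * np.median(est)  # deliberation nudges members toward the center
            if est.max() - est.min() < tol:         # treat a tight spread as convergence
                break
        return float(est.mean()), round_no

    forecast, rounds = delphi([120, 95, 150, 110, 200, 105])  # e.g., days to complete a task
    print(f"consensus of roughly {forecast:.0f} after {rounds} rounds")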

In short, tapping into the “wisdom” of well-structured teams can result in improved judgments and forecasts.38 What about improving the individual forecasts being combined? The Good Judgment Project (GJP), co-led by Philip Tetlock, suggests that this is a valuable and practical option. The project, launched in 2011, was sponsored by the US intelligence community’s Intelligence Advanced Research Projects Activity; the GJP’s goal was to improve the accuracy of intelligence forecasts for medium-term contingent events such as “Will Greece leave the Euro zone in 2016?”39 Tetlock and his team found that: (a) certain people demonstrate persistently better-than-average forecasting abilities; (b) such people are characterized by identifiable psychological traits; and (c) education and practice can improve people’s forecasting ability. Regarding the last of these points, Tetlock reports that mastering the contents of the short GJP training booklet alone improved individuals’ forecasting accuracy by roughly 10 percent.40

Each year, the GJP selects the most consistently accurate 2 percent of its forecasters. These individuals—colloquially referred to as “superforecasters”—reportedly perform 30 percent better than intelligence officers with access to actual classified information. Perhaps the most important characteristic of superforecasters is their tendency to approach problems from the “outside view” before proceeding to the “inside view,” whereas most novice forecasters tend to proceed in the opposite direction. For example, suppose we wish to forecast the duration of a particular consulting project. The inside view would approach this by reviewing the pending work streams and activities and summing up the total estimated time for each activity. By contrast, the outside view would begin by establishing a reference class of similar past projects and using their average duration as the base scenario; the forecast would then be further refined by comparing the specific features of this project to those of past projects.41
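As a rough numerical sketch of that outside-view-first habit (all figures invented): start with the base rate implied by a hypothetical reference class of past projects, then make a bounded inside-view adjustment for this project’s specifics.

    # Outside view first, inside view second; numbers are hypothetical.
    import numpy as np

    past_project_durations = [14, 22, 18, 30, 16, 25, 19]   # weeks, reference class
    base_rate = np.mean(past_project_durations)             # outside view

    adjustment = +3 - 2   # inside view: one extra work stream, but a more senior team
    forecast = base_rate + adjustment

    print(f"outside-view base rate: {base_rate:.1f} weeks")
    print(f"forecast after inside-view adjustment: {forecast:.1f} weeks")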

Beyond this tendency to ground forecasts in reference-class base rates derived from hard data, Tetlock identifies several psychological traits that superforecasters share:

  1. They are less likely than most to believe in fate or destiny and more likely to believe in probabilistic and chance events.
  2. They are open-minded and willing to change their views in light of new evidence; they do not hold on to dogmatic or idealistic beliefs.
  3. They possess above-average (but not necessarily extremely high) general intelligence and fluid intelligence.
  4. They are humble about their forecasts and willing to revise them in light of new evidence.
  5. While not necessarily highly mathematical, they are comfortable with numbers and the idea of assigning probability estimates to uncertain scenarios.

Although the US intelligence community sponsors the Good Judgment Project, the principles of (1) systematically identifying and training people to make accurate forecasts and (2) bringing together groups of such people to improve collective forecasting accuracy could be applied to such fields as hiring, mergers and acquisitions, strategic forecasting, risk management, and insurance underwriting. Advances in forecasting and collective intelligence methods such as the GJP are a useful reminder that in many situations, valuable information exists not just in data warehouses but also in the partial fragments of knowledge contained in the minds of groups of experts—or even informed laypeople.42

Mind this

Although predictive models and other AI applications can automate certain routine tasks, it is highly unlikely that human judgment will be outsourced to algorithms any time soon. More realistic is the prospect of using both data science and psychological science to de-bias and improve upon human judgments. When data is plentiful and the relevant aspects of the world aren’t rapidly changing, it’s appropriate to lean on statistical methods. When little or no data is available, collective intelligence and other psychological methods can be used to get the most out of expert judgment. For example, Google—a company founded on big data and AI—uses “wisdom of the crowd” and other statistical methods to improve hiring decisions; the philosophy there is to “complement human decision makers, not replace them.”43

In an increasing number of cases involving web-scale data, “smart” AI applications will automate the routine work, leaving human experts with more time to focus on aspects requiring expert judgment and/or such non-cognitive abilities as social perception and empathy. For example, deep learning models might automate certain aspects of medical imaging, which would offer teams of health care professionals more time and resources to focus on ambiguous medical issues, strategic issues surrounding treatment options, and providing empathetic counsel. Analogously, insurance companies might use deep learning models to automatically generate cost-of-repair estimates for damaged cars, providing claims adjusters with more time to focus on complex claims and insightful customer service.

Human judgment will continue to be realigned, augmented, and amplified by methods of psychology and the products of data science and artificial intelligence. But humans will remain “in the loop” for the foreseeable future. At least that’s our forecast.

Scientists teach machines to learn like humans

This paper compares human and machine learning for a wide range of simple visual concepts, or handwritten characters selected from alphabets around the world. The image accompanying this article is an artist’s interpretation of that theme.

A team of scientists has developed an algorithm that captures our learning abilities, enabling computers to recognize and draw simple visual concepts that are mostly indistinguishable from those created by humans. The work, which appears in the latest issue of the journal Science, marks a significant advance in the field—one that dramatically shortens the time it takes computers to ‘learn’ new concepts and broadens their application to more creative tasks.

“Our results show that by reverse engineering how people think about a problem, we can develop better algorithms,” explains Brenden Lake, a Moore-Sloan Data Science Fellow at New York University and the paper’s lead author. “Moreover, this work points to promising methods to narrow the gap for other machine learning tasks.”

The paper’s other authors were Ruslan Salakhutdinov, an assistant professor of Computer Science at the University of Toronto, and Joshua Tenenbaum, a professor at MIT in the Department of Brain and Cognitive Sciences and the Center for Brains, Minds and Machines.

When humans are exposed to a new concept—such as a new piece of kitchen equipment, a new dance move, or a new letter in an unfamiliar alphabet—they often need only a few examples to understand its make-up and recognize new instances. While machines can now replicate some pattern-recognition tasks previously done only by humans—ATMs reading the numbers written on a check, for instance—machines typically need to be given hundreds or thousands of examples to perform with similar accuracy.

“It has been very difficult to build machines that require as little data as humans when learning a new concept,” observes Salakhutdinov. “Replicating these abilities is an exciting area of research connecting machine learning, statistics, computer vision, and cognitive science.”

Salakhutdinov helped to launch recent interest in learning with ‘deep neural networks,’ in a paper published in Science almost 10 years ago with his doctoral advisor Geoffrey Hinton. Their algorithm learned the structure of 10 handwritten character concepts—the digits 0-9—from 6,000 examples each, or a total of 60,000 training examples.
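The scale of that training requirement is easy to appreciate with a stand-in experiment. The sketch below is emphatically not the deep belief network from that paper; it trains a small off-the-shelf neural network from scikit-learn on the library’s built-in 8x8 digit images, and it still relies on roughly a hundred labeled examples per digit class to reach high accuracy.

    # Stand-in for the "many examples per concept" regime: a small neural network
    # on scikit-learn's built-in digits (1,797 images, roughly 180 per class).
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0, stratify=y)

    clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
    clf.fit(X_train, y_train)          # roughly 125 training examples per digit
    print(f"test accuracy: {clf.score(X_test, y_test):.2%}")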

In the work appearing in Science this week, the researchers sought to shorten the learning process and make it more akin to the way humans acquire and apply new knowledge—i.e., learning from a small number of examples and performing a range of tasks, such as generating new examples of a concept or generating whole new concepts.

To do so, they developed a ‘Bayesian Program Learning’ (BPL) framework, where concepts are represented as simple computer programs. For instance, the letter ‘A’ is represented by computer code—resembling the work of a computer programmer—that generates examples of that letter when the code is run. Yet no programmer is required during the learning process: the algorithm programs itself by constructing code to produce the letter it sees. Also, unlike standard computer programs that produce the same output every time they run, these probabilistic programs produce different outputs at each execution. This allows them to capture the way instances of a concept vary, such as the differences between how two people draw the letter ‘A.’
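A schematic sketch (in Python, and not the authors’ BPL implementation) makes the ‘concept as a probabilistic program’ idea concrete: the concept is a small function whose stroke parameters are sampled from distributions, so every run produces a slightly different exemplar of the same character.

    # Toy probabilistic "program" for a T-like character: two strokes with noisy parameters.
    import random

    def concept_T():
        horizontal = {"start": (random.gauss(0.1, 0.02), random.gauss(0.9, 0.02)),
                      "length": random.gauss(0.80, 0.05)}
        vertical = {"start": (random.gauss(0.5, 0.03), random.gauss(0.9, 0.02)),
                    "length": random.gauss(0.85, 0.05)}
        return [("horizontal stroke", horizontal), ("vertical stroke", vertical)]

    # Each run yields a different exemplar of the same concept.
    for _ in range(3):
        print(concept_T())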

Can you tell the difference between humans and machines? Humans and machines were given an image of a novel character (top) and asked to produce new exemplars. The nine-character grids in each pair that were generated by a machine are (by row) B, A; A, B; A, B. Credit: Brenden Lake

While standard pattern recognition algorithms represent concepts as configurations of pixels or collections of features, the BPL approach learns “generative models” of processes in the world, making learning a matter of ‘model building’ or ‘explaining’ the data provided to the algorithm. In the case of writing and recognizing letters, BPL is designed to capture both the causal and compositional properties of real-world processes, allowing the algorithm to use data more efficiently. The model also “learns to learn” by using knowledge from previous concepts to speed learning on new concepts—e.g., using knowledge of the Latin alphabet to learn letters in the Greek alphabet. The authors applied their model to over 1,600 types of handwritten characters in 50 of the world’s writing systems, including Sanskrit, Tibetan, Gujarati, Glagolitic—and even invented characters such as those from the television series Futurama.

In addition to testing the algorithm’s ability to recognize new instances of a concept, the authors asked both humans and computers to reproduce a series of handwritten characters after being shown a single example of each character or, in some cases, to create new characters in the style of those they had been shown. The scientists then compared the outputs from both humans and machines through ‘visual Turing tests.’ Here, human judges were given paired examples of both the human and machine output, along with the original prompt, and asked to identify which of the symbols were produced by the computer.
Can you tell the difference between humans and machines? Humans and machines were given an image of a novel character (top) and asked to produce new exemplars. The nine-character grids in each pair that were generated by a machine are (by row) 1, 2; 2, 1; 1, 1. Credit: Brenden Lake

While judges’ correct responses varied across characters, for each visual Turing test, fewer than 25 percent of judges performed significantly better than chance in assessing whether a machine or a human produced a given set of symbols.
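For the statistically minded, “significantly better than chance” for a single judge can be checked with an exact binomial test against the 50 percent guessing rate, as sketched below; the trial counts are invented, and scipy.stats.binomtest requires SciPy 1.7 or later.

    # Is one judge's accuracy significantly above 50 percent? (hypothetical counts)
    from scipy.stats import binomtest

    n_trials, n_correct = 40, 26
    result = binomtest(n_correct, n_trials, p=0.5, alternative="greater")
    print(f"judge accuracy: {n_correct / n_trials:.0%}, one-sided p-value: {result.pvalue:.3f}")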

“Before they get to kindergarten, children learn to recognize new concepts from just a single example, and can even imagine new examples they haven’t seen,” notes Tenenbaum. “I’ve wanted to build models of these remarkable abilities since my own doctoral work in the late nineties. We are still far from building machines as smart as a human child, but this is the first time we have had a machine able to learn and use a large class of real-world concepts—even simple visual concepts such as handwritten characters—in ways that are hard to tell apart from humans.”