The real-world potential and limitations of artificial intelligence, Part II
David Schwartz: Right. I’m hearing that we’re dealing with very
complicated problems, very complex issues. How would someone, from the outside in, ever
understand what may appear to be—may in fact be—almost a black box?
James Manyika: This is the question of
explainability, which is: How do we even know that? You think about where we
start applying these systems in the financial world—for example, to lending. If
we deny you for a mortgage application, you may want to know why. What is the
data point or feature set that led to that decision? If you apply these systems to the criminal-justice system, where somebody’s been let out on bail and somebody else wasn’t, you may want to understand why it is that we came to that conclusion. It may also be an important question for purely research purposes, where you’re trying to discover particular behaviors, and so you’re trying to understand what particular part of the data leads to a particular set of behaviors.
This is a very hard
problem structurally. The good news, though, is that we’re starting to make
progress on some of these things. One of the ways in which we’re making
progress is with so-called GAMs. These are generalized additive models where, as opposed to taking massive numbers of features at the same time, you almost take one feature-model set at a time, and you build on it.
For example, when you
apply the neural network, you’re exploring one particular feature, and then you
layer on another feature; so, you can see how the results are changing based on
this kind of layering, if you like, of different feature models. You can see,
when the results shift, which model feature set seemed to have made the biggest
difference. This is a way to start to get some insight into what exactly is
driving the behaviors and outcomes you’re getting.
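The layering James describes, adding one feature model at a time and watching how the fit changes, can be sketched with a toy example. The synthetic data and the single-pass, one-feature-at-a-time fitting below are illustrative assumptions, not a production GAM implementation:

```python
import random

random.seed(0)

# Synthetic data: the target depends strongly on x1 and only weakly on x2.
n = 200
x1 = [random.uniform(0, 1) for _ in range(n)]
x2 = [random.uniform(0, 1) for _ in range(n)]
y = [3.0 * a + 0.2 * b + random.gauss(0, 0.1) for a, b in zip(x1, x2)]

def fit_1d(x, r):
    """Least-squares slope and intercept of residual r against feature x."""
    mx, mr = sum(x) / len(x), sum(r) / len(r)
    slope = (sum((a - mx) * (b - mr) for a, b in zip(x, r))
             / sum((a - mx) ** 2 for a in x))
    return slope, mr - slope * mx

def mse(r):
    return sum(v * v for v in r) / len(r)

# Start from a mean-only model, then layer on one feature at a time,
# recording how much each addition improves the fit.
mean_y = sum(y) / len(y)
residual = [v - mean_y for v in y]
errors = {"mean only": mse(residual)}
for name, x in [("x1", x1), ("x2", x2)]:
    slope, intercept = fit_1d(x, residual)
    residual = [r - (slope * a + intercept) for r, a in zip(residual, x)]
    errors[name] = mse(residual)

for stage, err in errors.items():
    print(f"{stage}: MSE {err:.3f}")
```

The big drop in error comes when x1 is layered in and only a small one when x2 is, which is the kind of insight into "what made the biggest difference" that James is pointing at.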
Michael Chui: One of the other big drivers
for explainability is regulation and regulators. If a car decides to make a
left turn versus a right turn, and there’s some liability associated with that,
the legal system will want to ask the question, “Why did the car make the left
turn or the right turn?” In the European Union, there’s the General Data
Protection Regulation that will require explainability for certain types of
decisions that these machines might make. The machines are completely
deterministic. You could say, “Here are a million weights that are associated
with our simulated neurons. Here’s why.” But that’s not engaging to a human
being.
Another technique is known by the acronym LIME, which stands for local interpretable model-agnostic explanations. The
idea there is from the outside in—rather than look at the structure of the
model, just be able to perturb certain parts of the model and the inputs and
see whether that makes a difference on the outputs. If you’re taking a look at
an image and trying to recognize whether an object is a pickup truck or an
ordinary sedan, you might say, “If I change the windscreen on the inputs, does that cause me to have a different output? On the other hand, if I change the back end of the vehicle, it looks like that makes a difference.” That tells you that what this model is paying attention to, as it’s determining whether it’s a sedan or a pickup truck, is the back part of the vehicle. It’s basically doing
experiments on the model in order to figure out what makes a difference. Those
are some of the techniques that people are trying to use in order to explain
how these systems work.
David Schwartz: At some level, I’m hearing
from the questions and from what the rejoinder might be that there’s a very
human element. A question would be: Why is the answer such and such? And the
answer could be: it’s the algorithm. But somebody—or a team of somebodies—and machines built that algorithm. That brings
us to a limitation that is not quite like the others: bias—human predilections.
Could you speak a little bit more about what we’re up against, James?
James Manyika: The question of bias is a
very important one. And I’d put it into two parts.
Clearly, these
algorithms are, in some ways, a big improvement on human biases. This is the
positive side of the bias conversation. We know that, for example, sometimes,
when humans are interpreting data on CVs [curriculum vitae], they might
gravitate to one set of attributes and ignore some other attributes because of
whatever predilections that they bring. There’s a big part of this in which the
application of these algorithms is, in fact, a significant improvement compared
to human biases. In that sense, this is a good thing. We want those kinds of
benefits.
But I think it’s worth
having the second part of the conversation, which is, even when we are applying
these algorithms, we do know that they are creatures of the data and the inputs
you put in. If those inputs you put in have some inherent biases themselves,
you may be introducing different kinds of biases at much larger scale.
The work of people like Julia Angwin and others has shown what happens if the data collected is already biased. If you take policing as an example, we know that there are some
communities that are more heavily policed. There’s a much larger police
presence. Therefore, the data we’ve got and that’s collected about those
environments is much, much, much higher. If we then start to compare, say, two
neighborhoods, one where it’s oversampled—meaning there’s lots and lots of data
available for it because there’s a larger police presence—versus another one
where there isn’t much policing so, therefore, there isn’t much data available,
we may draw the wrong conclusions about the heavily policed, heavily observed environment, simply because there’s more data available for it than for the other one.
The biases can go
another way. For example, in the case of lending, the implications might go the
other way. For populations or segments where we have lots and lots of financial
data about them, we may actually make good decisions because the data is
largely available, versus in another environment where we’re talking about a
segment of the population we don’t know much about, and the little bit that we
know sends the decision off in one way. And so, that’s another example where
the undersampling creates a bias.
The point about this
second part is that I think it becomes very, very important to make sure that
we think through what might be the inherent biases in the data, in any
direction, that might be in the data set itself—either in the actual way it’s
constructed, or even the way it’s collected, or the degree of sampling of the
data and the granularity of it. Can we debias that in some fundamental way?
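The oversampling effect James describes can be made concrete with a small simulation; the neighborhoods, the incident rate, and the observation counts below are invented for illustration. Raw counts make the heavily observed area look far worse even though the underlying rates are identical:

```python
import random

random.seed(1)

# Two neighborhoods with the *same* true incident rate, but A is
# observed twenty times more often than B. All numbers are made up.
true_rate = 0.05
observations = {"A": 10000, "B": 500}

counts = {
    name: sum(random.random() < true_rate for _ in range(n))
    for name, n in observations.items()
}
rates = {name: counts[name] / observations[name] for name in observations}

print("raw incident counts:", counts)  # A looks far worse on raw counts
print("incidents per observation:",
      {name: round(r, 3) for name, r in rates.items()})
```

Normalizing by how often each place is observed recovers roughly equal rates, which is exactly the correction that the raw, oversampled data hides.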
This is why the
question of bias, for leaders, is particularly important, because it runs a
risk of opening companies up to all kinds of potential litigation and social
concern, particularly when you get to using these algorithms in ways that have
social implications. Again, lending is a good example. Criminal justice is
another example. Provision of healthcare is another example. These become very,
very important arenas to think about these questions of bias.
Michael Chui: In some of the difficult cases where there’s bias in the data, the primary factor, at least in the first instance, isn’t people’s inherent biases about choosing one thing or the other. In many cases, it’s these issues of sampling—sampling bias, data-collection bias, et cetera—which, again, are not necessarily about unconscious human bias but an artifact of where the data came from.
There’s a very famous
case, less AI related, where an American city used an app in the early days of
smartphones that determined where potholes were based on the accelerometer
shaking when you drove over a pothole. Strangely, it discovered that if you
looked at the data, it seemed that there were more potholes in affluent parts
of the city. That had nothing to do with the fact there were actually more
potholes in that part of the city, but you had more signals from that part of
the city because more affluent people had more smartphones at the time. That’s
one of those cases where it wasn’t because of any intention to not pay
attention to certain parts of the city. Understanding the provenance of data—understanding what’s being sampled—is incredibly important.
There’s another
researcher who has a famous TED Talk, Joy Buolamwini at MIT Media Lab. She does
a lot of work on facial recognition, and she’s a black woman. And she says,
“Look, a lot of the other researchers are more male and more pale than I am.
And as a result, the accuracy for certain populations in facial recognition is
far higher than it is for me.” So again, it’s not necessarily because people are trying to exclude populations, although sometimes that happens; it really has to do with understanding the representativeness of the sample that you’re using in order to train your systems.
So, as a business
leader, you need to understand, if you’re going to train machine-learning systems:
How representative are the training sets that you’re using?
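A first-pass version of that representativeness check can be sketched as below; the groups, the counts, and the one-half-of-population-share threshold are all hypothetical choices for illustration, not a standard:

```python
from collections import Counter

# Hypothetical group labels in a training set, and the shares those
# groups hold in the population the system will actually serve.
training_labels = ["group_a"] * 800 + ["group_b"] * 150 + ["group_c"] * 50
population_share = {"group_a": 0.5, "group_b": 0.3, "group_c": 0.2}

counts = Counter(training_labels)
total = sum(counts.values())

# Flag any group whose training share is under half its population share.
underrepresented = [
    group for group, share in population_share.items()
    if counts[group] / total < 0.5 * share
]

print("training shares:",
      {g: counts[g] / total for g in population_share})
print("underrepresented groups:", underrepresented)
```

Even a crude audit like this makes the question a business leader can ask concrete: which groups does the training set systematically undercount?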
James Manyika: It actually creates an
interesting tension. That’s why I described the part one and the part two.
Because in the first instance, when you look at the part-one problem, which is
the inherent human biases in normal day-to-day hiring and similar decisions, you
get very excited about using AI techniques. You say, “Wow, for the first time,
we have a way to get past these human biases in everyday decisions.” But at the
same time, we should be thoughtful about where that takes us to when you get to
these part-two problems, where you now are using large data sets that have
inherent biases.
I think people forget
that one of the things in the AI machine-deep-learning world is that many
researchers are using largely the same data sets that are shared—that are
public. Unless you happen to be a company that has these large, proprietary
data sets, people are using this famous CIFAR data set, which is often used for
object recognition. It’s publicly available. Most people benchmark their
performance on image recognition based on these publicly available data sets.
So, if everybody’s using common data sets that may have these inherent biases
in them, we’re kind of replicating large-scale biases. This tension between
part one and part two and this bias question are very important ones to think
through. The good news, though, is that in the last couple years, there’s been
a growing recognition of the issues we just described. And I think there are
now many places that are putting real research effort into these questions about
how you think about bias.
David Schwartz: What are best practices for
AI, given what we’ve discussed today about the wide range of applications, the
wide range of limitations, and the wide range of challenges before us?
Michael Chui: It is early, so to talk about
best practices might be a little bit preliminary. I’ll steal a phrase that I
once heard from Gary Hamel: we might be talking about next practices, in a
certain sense. That said, there are a few things that we’ve observed from leaders who are pioneers and vanguards.
The first thing is one
we’ve described as “get calibrated,” but it’s really just to start to
understand the technology and what’s possible. For some of the things that
we’ve talked about today, business leaders over the past few years have had to
understand technology more. This is really on the tip of the spear, on the
cutting edge. So, really try to understand what’s possible in the technology.
Then, try to understand what the potential implications are across your entire
business. As we said, these technologies are widely applicable. So, understand
where in your business you’re deriving value and how these technologies can
help you derive value, whether it’s marketing and sales, whether it’s supply
chain, whether it’s manufacturing, whether it’s in human capital or risk.
And then, don’t be
afraid to be bold. At least experiment. This is a type of technology where it’s
a learning curve, and the earlier you start to learn, the faster you’ll go
up the curve and the quicker you’ll learn where you can add value, where you
can find data, and how you can have a data strategy in order to unlock the data
you need to do machine learning. Getting started early—there’s really no
substitute for that.
James Manyika: The only other thing I would
add is something you’ve been working a lot on, Michael. One of the things that
leaders are going to have to understand, or make sure that their teams
understand, is this question of which techniques map to which kinds of
problems, and also which techniques lead to what kind of value.
We know that the vast majority of the techniques, in the end, are largely classifiers. Knowing that is helpful. Then ask whether the problem sets in your business system look like classification problems; if so, you have an enormous opportunity. From there, you can think about where the economic value is and whether you have the data available.
There’s a much more
granular understanding that leaders are going to have to have, unfortunately.
The reason why this matters, back to Michael’s next-practice point, is that we
are already seeing, if you like, a differentiation between those leaders and companies who are at the frontier of understanding this and applying these techniques, versus others who are, quite frankly, dabbling—or, at least, paying lip service.
It’s worth occasionally
as a leader, I would think, visiting or spending time with researchers at the
frontier, or at least talking to them, just to understand what’s going on and what is and isn’t possible. Because this field is moving so quickly. Things that may
have been seen as limitations two years ago may not be anymore. And if you’re
still relying on a conversation you had with an AI scientist two years ago, you
may be behind already.
David Schwartz: James and Michael, absolutely
fascinating. Thank you for joining us.
James Manyika: Thank you.
Michael Chui: Thank you.
https://www.mckinsey.com/featured-insights/artificial-intelligence/the-real-world-potential-and-limitations-of-artificial-intelligence