Ethics is a very hot topic in the field of analytics, even more than it is in civil society.
Why are ethics and analytics related? Simply because an amazing amount of data is being generated, captured in all sorts of ways by all sorts of players, and traded. Add the advances in technology, and the amount of information we can grasp in the course of our work, even personal information, has increased much, much faster than awareness of it, and therefore faster than the thinking that has gone into setting ethical standards.
In sum, the ability to grasp information has grown much faster than the awareness of it. Since people are not aware of what you and I could know about them, they don’t see the ethical issues. For most of us, our ethics are rooted in our past.
Of course
there are pioneers who subscribe to “it’s easier to ask for forgiveness than
permission”, who may know ethics will change over time and that it is highly
profitable to be ahead of the ethical changes. This is especially true in the
technology and analytics spaces.
Let me first illustrate some possible ethical issues in famous uses of technology in recent years (in no particular order); I am not including hacking exploits. In my next blog I will illustrate real-life cases that people in analytics have faced.
1 Google car also captured wifi information and Google kept the data
Remember the Google car? It went around many countries worldwide, taking photos and enabling the maps. However, it also collected more: "It is now clear that we have been mistakenly collecting samples of payload data from open wifi networks, even though we never used that data in any Google products", Google said in 2010 (1). However, if you nowadays want to 'benefit' from high accuracy in your location services (something Grab demands), you will 'benefit' from a service that "calls upon every service available: GPS, Wi-Fi, Bluetooth, and/or cellular networks in whatever combination available, and uses Google's location services to provide the most accurate location." (2)
The fun bit was that Google claimed to have "mistakenly" collected the data. I may be a technical ignoramus, but somehow I think that a camera and something that snoops on wifi networks are quite different. Plus, even the first time they obtained the data, someone should have asked "hey! what is this?" (which must have happened if it was a mistake) and, hopefully, "Should we keep this?".
At the
very least, this "should", a question that demands a value judgement,
should have been asked.
2 Facebook and the emotion contagion experiment of 2012 (3)
Facebook
simply wanted to understand whether people’s mood could spread to their friends
and contacts.
So, for some arbitrarily chosen users, they showed only the posts of friends who displayed negative sentiments, while suppressing those that showed positive sentiments. Basically you only see that your friends are not in a good way, and nothing from those who are doing well.
Do you
think that would dampen your mood too?
Well, Facebook showed it did.
This experiment affected 689,000 Facebook users (like an old-fashioned cheque, I will spell it out: six hundred and eighty-nine thousand users).
Interestingly, someone (Clay Johnson of Blue State Digital, who helped with Obama’s 2008 election campaign (4)) had anticipated Facebook’s next move: “Could the CIA incite revolution in Sudan by pressuring Facebook to promote discontent? Should that be legal? Could Mark Zuckerberg swing an election by promoting Upworthy [a website aggregating viral content] posts two weeks beforehand? Should that be legal?”
3 Facebook’s electoral activism in the 2010 mid-term elections (5)
Here Facebook ran experiments to find out whether they could influence people to vote.
The idea again is very simple: put up an “I voted” button, encourage people who voted to click on it, and publish on their friends’ feeds that they had voted. The aim was to see whether people who were shown that their friends had voted were more likely to vote too, and whether the contagion spread.
Do you think it was ethical? Does knowing that 61 million people (sixty-one million) were affected make a difference to your answer?
Would it
make a difference if they targeted only a certain group of people, in a
specific area, say where elections are expected to be undecided?
I am quite sure that this experiment, in one way or another, played a part in the Cambridge Analytica campaign to get Mr Trump elected in 2016 (6).
4 Target and Pregnancy Prediction (7)
Target
identified a combination of products purchased from their stores that indicated
that a person was likely to be pregnant.
Pregnancy is expensive, and Target decided that it would be a good idea to give customers discounts to alleviate the burden while forming habits that would continue well beyond the pregnancy period.
Since this was 2012, it wasn’t straightforward to identify customers as they walked in and make them offers. Therefore, Target decided to mail the coupons and discount vouchers to the customers’ homes.
In at least one case, a father got to know that Target was sending vouchers for cribs and other baby-related items to his daughter, who was still in high school, and thought it inappropriate.
Do you think Target violated any ethics in building the model (pregnancy prediction)? How about the way they chose to make use of the information (discounts with a view to forming a long-term habit)? Or the way they chose to implement the campaign (mailers to homes)?
I ask these questions because, to me, an analytics project should encompass all aspects, from data collection through modelling to execution/implementation. This is probably my ethical view, but if there is no responsibility throughout the chain, then ethics can easily disappear: “I just collect the data, for all I know nobody ever looks at it” or “I just build the models as a challenge, it can only potentially cause harm if employed wrongly” or “I have no idea how this thing was built, I am only executing on a plan”…
5 Amazon’s sexist recruiting tool (8)
Some of you may be old enough to remember when, say, the calendar was a separate programme you had to install on your desktop computer instead of it being preloaded into Windows. After a while Microsoft simply decided to move into this space and drive the calendar companies out of business.
Today’s
cloud providers are doing the same.
Amazon decided they would get into the recruitment space. Using AI, they figured, they could find the best people for various jobs. After all, they had a treasure trove of data: 10 years’ worth of applications.
Note that I am not knocking Amazon specifically; Microsoft via LinkedIn and Google via Google recruit (9) are also in the business.
What is different with the Amazon effort is that they realised that their predictions were very heavily biased against women. Their post-hoc analysis found that the mention of the word “women” in a CV, for example in “captain of the women’s football team”, tended to give applicants a lower score. So did being a graduate of a female-only college.
Amazon
pulled the tool.
Do you
think Amazon was ethical (or others are) when using CVs and hiring outcomes to
score applicants? Or are they unethically perpetuating human biases?
6 Uber and one-night-stands
I have argued before that companies like Grab and Uber are not in the transportation business but in the data business. Their focus is on transforming the data that people give them for free into information they can sell.
In 2012, Uber showed what they can do with the data when they published a research piece entitled “rides of glory”. Basically, they identified people who had one-night-stands. It is simple really, and goes something like this: if you are picked up at night from, say, an area with bars (assuming you consumed alcohol, and given that alcohol lowers inhibitions), go to a residential area you rarely visit (judging from your past behaviour), and leave after a short while or up to dawn (with or without breakfast), then chances are you had a one-night-stand.
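To show how little data such a rule needs, here is a minimal sketch of it in Python. Everything specific in it is my assumption for illustration: the Trip fields, the flags for nightlife areas and unfamiliar places, and the hour thresholds are hypothetical and are not taken from Uber's article.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Trip:
    rider: str
    start: datetime
    pickup_is_nightlife: bool    # hypothetical flag: pickup area has many bars
    pickup_is_unfamiliar: bool   # hypothetical flag: rider rarely visits the pickup area
    dropoff_is_unfamiliar: bool  # hypothetical flag: rider rarely visits the drop-off area


def is_late_night(moment: datetime) -> bool:
    # Assumed definition of "at night": 22:00 up to 04:00.
    return moment.hour >= 22 or moment.hour < 4


def flag_pairs(trips, max_gap=timedelta(hours=10)):
    """Return (outbound, return) trip pairs that match the rule in the text."""
    ordered = sorted(trips, key=lambda t: t.start)
    pairs = []
    for i, out in enumerate(ordered):
        outbound_matches = (
            is_late_night(out.start)
            and out.pickup_is_nightlife
            and out.dropoff_is_unfamiliar
        )
        if not outbound_matches:
            continue
        # Look for the same rider leaving an unfamiliar place soon afterwards.
        for back in ordered[i + 1:]:
            same_rider = back.rider == out.rider
            soon_enough = back.start - out.start <= max_gap
            if same_rider and back.pickup_is_unfamiliar and soon_enough:
                pairs.append((out, back))
                break
    return pairs


trips = [
    Trip("a", datetime(2012, 3, 3, 23, 30), True, False, True),
    Trip("a", datetime(2012, 3, 4, 6, 15), False, True, False),
    Trip("b", datetime(2012, 3, 3, 18, 0), False, False, False),
]
flagged = flag_pairs(trips)
```

Two timestamps and a notion of "places you rarely go" are enough for rider "a" to be flagged; nothing in the data says what actually happened, which is part of the ethical problem.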
Note that even if Uber doesn’t have the data on you going to an unfamiliar place (after all, your date could have paid for that ride), the fact that you leave the unfamiliar place is also a very good indicator.
Uber
pulled the article, but you may still retrieve it from cached data or snapshots
or refer to articles that describe the experiment (10).
This becomes more interesting when you consider that since 2016 (11), Uber tracked people who had downloaded the app even when they were not using it; after some furore they have apparently deactivated this feature (12).
Do you think it is ethical for Uber to use the data from passengers to predict one-night-stands, or would you feel this is an invasion of privacy? How about how often you go to a doctor? I bet your insurer would love to know that…
7 Baidu and Wei Zexi (13)
Just in case you thought I was being racist, targeting only organisations based in the US, or that ethical issues were a purely western concern, I have included the case of Wei Zexi, a Baidu user.
In simple terms, Baidu is the Google of China, the number one search engine used by people in China.
Mr Wei Zexi found that he had a very rare form of cancer, and he decided to do some research using Baidu. The top recommendation (the post that appeared at the top of the results) was from the XXX Military Academy. The website then proceeded to detail their successes in treating that form of cancer. Mr Wei Zexi was convinced and, with his family’s help, moved city to undertake treatment.
He did
not survive.
It is only later that it was found that the hospital did not really have the success rates it promoted, and arguably Baidu, by placing it at the top of their list, added some credibility to the claims.
The
Chinese authorities launched an investigation, and news of that investigation
caused Baidu’s value on the share market to drop by USD5bn.
Do you
think Baidu has an obligation to check on the veracity of the claims made by
its clients when being paid to promote certain links to the top of the list? Or
should people instead place little value on the rankings (thereby making the
ranking algorithm wars obsolete)?
I am not sure how Baidu ranks websites, but in the early years Google differentiated itself by placing on top the websites that most other websites link to. The underlying assumption was that if other websites refer to yours, then chances are you have something important/relevant to say, or something many people agree with, hence higher credibility/relevance. Google has, of course, improved on PageRank (14), but the idea is there.
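To make the "links as votes" idea concrete, here is a toy sketch in Python. It is a simplified power-iteration version of PageRank on a made-up four-page web; the page names, the damping factor of 0.85 and the number of iterations are my assumptions for illustration, and it is certainly not how Google or Baidu rank pages today.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: links maps each page to the pages it links to."""
    pages = sorted(set(links) | {p for targets in links.values() for p in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical mini web: three pages link to "hospital".
toy_web = {
    "clinic": ["hospital"],
    "blog": ["hospital", "clinic"],
    "forum": ["hospital"],
    "hospital": ["clinic"],
}
ranks = pagerank(toy_web)
top_page = max(ranks, key=ranks.get)
```

Note what the score measures: how many pages point at you, not whether what you say is true. A page that is heavily linked, or that simply pays to be on top, looks credible either way.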
Ask yourself, how many times do you use a non-Google search engine? How many times do you go to the second page of results? How many times do you read past the top 3 websites? How much trust are you putting in Google? Now ask the question again: do you think that, ethically, Google (and Baidu and other search engines) should check on the veracity of claims, especially in cases where the price of getting things wrong is high, like in the case of medical treatment?
8 People at Google are actually listening to you (15)
Recently, Google fired a contractor who leaked individuals’ “ok google” conversations. While Google argued that no personally identifiable information is captured, it is possible that some people may have mentioned their names or addresses.
I assume that users of “ok google” would never have imagined that it is possible for someone to actually be listening to their recordings. For those of you who do use “ok google”, is my assumption correct? Do you assume your chats with “ok google” are man/woman to machine and thereby relatively private? Do you feel this privacy is violated when a human is listening?
9 Alexa is always listening, and recording, and sharing (16)
Alexa is Amazon’s assistant, a device you bring into your home and to whom you can make requests, ranging from doing internet searches (just like ok google) to controlling things around the house (“Alexa, play romantic music”, “Alexa, dim the lights”…).
Did you
know that Alexa is always listening?
Think
about it, Alexa is designed to “wake up” and know you are addressing her/it
when keywords are spoken such as “Alexa”, “Echo”, “Computer”. But in order to
“hear” these words when they are spoken, Alexa needs to always be listening.
Makes sense?
Where it becomes a bit more interesting is that Alexa keeps the conversations, transcribes them and stores them on the cloud, and these are analysed, including by humans.
Do you
think this is an invasion of your privacy? What if only machines ‘listened’ to
your conversations, would that be ok? Is a line crossed if there are people
listening?
10 Smart TV, who is watching who?
It’s not only voice-activated machines that are at it; how smart is your smart TV?
Very.
Google makes millions by selling the idea of placing ads you may be interested in when you are surfing. Wouldn’t it be great if someone could do the same for when you watch TV?
The first step is to simply know who is watching TV at a point in time. For example, I spend most of my time watching cartoons/anime. If you only looked at my channel behaviour, you might assume that I live in a household with kids. But at this moment, since each TV gets the same advertisement, the ads are targeted at people with kids.
Once the TV is connected to the internet, advertisements can be personalised, or at least different segments of people can be shown different ads. While judging based on viewing behaviour is better than nothing, wouldn’t it be better if you knew for sure, say, how many people are watching TV, maybe their gender and approximate age group, on top of the viewing behaviour? That’s what smart TVs do. (17)
Smart TV manufacturers collect the data and sell it to people who would like to personalise the ads that will be shown on the smart TV, a nice little cycle.
Do you think it is an invasion of privacy if your smart TV is watching you? Is it ethical for the smart TV manufacturers to do this, and even make it really hard to deactivate the features?
11 And it does not stop there… vroom vroom
TVs are just there in your living room, and presumably voice-activated devices such as Alexa just stay where you put them. But there is a device that moves autonomously around many people’s homes. I have a friend who proudly declares that he has not swept or mopped his home for 2 years.
No, he is not a hoarder, nor does he love filth; he uses an autonomous cleaner.
So what
is the big deal? Well, recently, the technology has moved in the direction of
mapping the internal layout of your home, and this is stored on the cloud, and
may be shared with partners (18).
How do you feel about a map of what’s inside your home being potentially shareable? It becomes more interesting when some of these devices are also hooked up to Alexa. Are there any ethical issues here?
12 You can run but you cannot hide, facial recognition everywhere
How do you feel about facial recognition? Isn’t it great to be recognised and greeted/treated properly? There are offices that allow access to employees using just facial recognition. Do you feel that there is any downside to facial recognition?
My guess is that most people would be ok with the technology. They might not be with the millions of cameras constantly trying to identify everyone, everywhere. However, if facial recognition is used to send personalised warnings to children going towards areas with high drowning risk, it’s a good thing, right? But maybe not when the system also automatically sends a message to the student’s parents and school (19).
So
probably it’s a case of: “it’s not the technology, but the people who apply it”?
Not
really. Facial recognition technology isn’t as accurate as you may think.
When the European Cup final was held in Cardiff a few years ago, the authorities thought it would be a great opportunity to identify and catch football fans who had been captured on camera engaging in various offences but were not caught then. They wrongly identified 2000 people (20). Of 2,470 potential hooligans pointed out by the system, 2,297 were wrong, which means about 93% of the matches were false positives.
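As a quick check on the Cardiff figures above, the arithmetic fits in a few lines of Python. Strictly speaking, the percentage is the share of alerts that were wrong (sometimes called the false discovery rate); a false positive rate in the statistical sense would need the number of innocent people scanned, which the figures above do not give. The variable names are mine.

```python
flagged = 2470             # people the system pointed out
wrong = 2297               # of those, how many were misidentified
correct = flagged - wrong  # people correctly identified

share_wrong = wrong / flagged  # share of alerts that were wrong
precision = correct / flagged  # share of alerts that were right

print(f"{correct} correct matches out of {flagged} alerts")
print(f"{share_wrong:.1%} of alerts were wrong, {precision:.1%} were right")
```

Put differently, for every correct match the system raised about thirteen wrong ones.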
And it gets more interesting: Amazon Rekognition, for example, is actually bad at recognising Asian faces (21). If I were in Washington, I’d be worried (22).
Do you think there are ethical questions around the use of facial recognition? Or does it depend on the use case? For example, in cases where the cost of getting it wrong is high (arresting and questioning the wrong person, or preventing some people from accessing some services), maybe facial recognition should not be used?
There are cities now back-pedalling on the use of facial recognition in law enforcement (23)(24)(25). And there is an interesting map on the usage of facial recognition (26). Would you be more or less likely to visit/live in cities that use facial recognition at scale? And why? Anything to do with ethics?
And hot off the press, Google is apparently paying people cash or giving Starbucks vouchers for allowing them to capture and use their faces (27). In case you think it is fake news, I can offer you another piece of news on the topic (28).
And just in case you think that your face can be captured by any CCTV anywhere, do bear in mind that CCTVs were not designed to collect information for facial recognition; specialised devices are. Is your identity worth only USD5?
What do
you think?
Conclusion
Just as an indicator for yourself, take this little test to see where you fall in terms of ethics.
| No. | Use case | Score of ethical issues (1 = no issues, 5 = huge issues) |
| 1 | Google Street View captured and stored wifi details | |
| 2 | Facebook’s emotion contagion experiment | |
| 3 | Facebook’s electoral activism (2010 mid-terms) | |
| 4 | Target and pregnancy prediction | |
| 5 | Amazon’s sexist recruiting tool | |
| 6 | Uber and one-night-stands | |
| 7 | Baidu and Wei Zexi | |
| 8 | People at Google are listening to you | |
| 9 | Alexa is always listening, recording, sharing | |
| 10 | Smart TV is watching you | |
| 11 | Autonomous robot cleaner maps inside your home | |
| 12 | Facial recognition everywhere | |
Do you
think there is a positive relationship between your score above and your age?
In my next blog, I will follow up with real-life cases faced at work. Chances are that blog will be shorter :)
My personal view is: "just because you can, doesn't mean you should"; yes, I often sit in corners, as anyone reading my blogs has probably guessed. Anybody want to guess my score in the table above? :)
20 https://www.walesonline.co.uk/news/wales-news/facial-recognition-wrongly-identified-2000-14619145