Tuesday, 17 April 2018

#metoo, the question of identification, understanding and biases we and AI may not be aware of


I was more than halfway through a “data science” blog post when I read a piece of news about a reporter who was kissed on both cheeks during a live report on the Hong Kong rugby sevens (1). Another thought piece mentioned that the reporter looked humiliated afterwards (2).

What I found most interesting is that the headline called the men “rude”. Rude? Rude is not saying hello to people, or not saying “thank you” or “please”. Is kissing someone, in an obviously pre-planned way, merely rude? Especially given that the person was apparently humiliated, surely it ought to be more than that? Where is the #metoo movement?

Then I recalled another recent incident on American Idol where a judge, hearing that a contestant had never kissed anyone before because the contestant believed this should only happen within a relationship, tricked the contestant into a kiss on the lips (3). Again, there was a minor backlash, but no #metoo against the judge. (4)

When you compare this to the groundswell of the #metoo movement, you cannot help but wonder... The crux of #metoo is identification: you identify with something, or someone... Maybe it’s not just about sexual harassment, or even sexual harassment of a woman.

Earlier this year I read an open letter that I highly recommend everyone read (5). Emma Watson talks about feminism, and what I like most is the idea that introspection is needed to understand your own position, especially the things you may not be aware of. She writes:

When I gave my UN speech in 2015, so much of what I said was about the idea that “being a feminist is simple!” Easy! No problem! I have since learned that being a feminist is more than a single choice or decision. It’s an interrogation of self. Every time I think I’ve peeled all the layers, there’s another layer to peel. But, I also understand that the most difficult journeys are often the most worthwhile. And that this process cannot be done at anyone else’s pace or speed.

When I heard myself being called a “white feminist” I didn’t understand (I suppose I proved their case in point). What was the need to define me — or anyone else for that matter — as a feminist by race? What did this mean? Was I being called racist? Was the feminist movement more fractured than I had understood? I began...panicking.

It would have been more useful to spend the time asking myself questions like: What are the ways I have benefited from being white? In what ways do I support and uphold a system that is structurally racist? How do my race, class and gender affect my perspective? There seemed to be many types of feminists and feminism. But instead of seeing these differences as divisive, I could have asked whether defining them was actually empowering and bringing about better understanding. But I didn’t know to ask these questions.

Basically, what I am trying to say is that the stories above (the reporter at the rugby sevens and the American Idol incident) didn’t create that huge an outcry, possibly because of some bias which people may or may not be aware of.

So how does that relate to “data science”?

Well, it seems blogs get tagged by keywords, and since I would rather this doesn’t get tagged as a political post, I have to add a “data science” bit; for this purpose I will use the words data science without quotation marks. So here comes the data science bit...

As data science gets more and more automated, and as ML and AI become more popular (especially with non-data scientists), one of the hidden dangers is bias.

You are what you eat, even if you are a machine or are artificial.

It takes a lot of effort even to identify biases in a bunch of data. That’s something I mentioned before in the context of Human Resource Analytics, where the impact could arguably be the worst (6).

There have been many articles on questions such as whether computers can be racist (7); the answer is yes, if the data set used for training happened to have a bias towards or against a certain race, even if it was purely unintentional. For example, if your part of the world has virtually no orange people, and one happens to apply for a role (say, president) and gets accepted, a machine could pick up that orangeness makes one suited for the presidency (the probability of becoming president given that the race is orange is 1, haha).
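To make the orange-president example concrete, here is a tiny toy calculation in Python (entirely made-up data, just for illustration) showing how a single lucky sample can push an empirical conditional probability to 1, which a naively trained model could then treat as a genuine pattern:

# Toy illustration (hypothetical data, not a real hiring dataset):
# if the only "orange" person in the data happened to become president,
# the empirical conditional probability P(president | orange) is 1.
from collections import Counter

# each record: (race, became_president)
records = [
    ("blue", False), ("blue", False), ("blue", True),
    ("green", False), ("green", False),
    ("orange", True),          # the single orange applicant succeeded
]

counts = Counter(race for race, _ in records)
successes = Counter(race for race, ok in records if ok)

for race in counts:
    p = successes[race] / counts[race]
    print(f"P(president | race={race}) = {p:.2f}")

The orange group scores 1.00 purely because of one fortunate sample; that is exactly the kind of spurious pattern a naively trained model could latch onto.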

Serious thought is being given to the topic. Barocas and Selbst (8) argue that “Addressing the sources of this unintentional discrimination and remedying the corresponding deficiencies in the law will be difficult technically, difficult legally, and difficult politically. There are a number of practical limits to what can be accomplished computationally.” Ransbotham (9) from Boston College also argued that having more data doesn’t necessarily remove sampling bias.
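Ransbotham’s point is easy to demonstrate with a quick simulation (my own sketch, not taken from his article): if the sampling process over-represents one group, collecting more observations only makes the biased estimate more precise, not more correct.

# Imagine a population that is 50% group A (value 1.0) and 50% group B
# (value 0.0), so the true population mean is 0.5. Our sampling process,
# however, picks group A 80% of the time.
import random

random.seed(0)

def biased_sample(n, p_pick_a=0.8):
    """Draw n observations, with group A over-represented in the sample."""
    return [1.0 if random.random() < p_pick_a else 0.0 for _ in range(n)]

for n in (100, 10_000, 1_000_000):
    estimate = sum(biased_sample(n)) / n
    print(f"n = {n:>9,}: estimated mean = {estimate:.3f} (true mean = 0.500)")

The estimate converges to roughly 0.8, not 0.5: more data gives a more precise answer to the wrong question, because the sampling bias never goes away.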

In sum, when we, whether as data scientists or ordinary human beings, want to analyse an issue, it is a good idea to understand our own biases and the biases in the data we have, so that we can do justice to interpreting the information and generating results.

(3) https://www.youtube.com/watch?v=ce3_D3IG96w I guess people who didn't know of this case would have assumed the genders were reversed. While some men would have loved to be kissed by Katy Perry, not everyone would, especially people who believe you need to be in a relationship before you kiss someone.
(4) Just FYI the contestant did not get past the round.
