Thursday, 20 December 2018

Tale of the left-handed samurai


A few days ago I was asked to prepare a piece to help people who have not had much exposure to thinking with data learn how to go about it.

Since the presentation did not take place, I decided that rather than let the slides go to waste, I would share them here. Please note, though, that my presentation style relies heavily on the spoken word, with only visual cues on the slides rather than full explanations, so do forgive the shortcomings.


Why left-handed? Because many people believe that left-handed people are more creative, and I believe that analysis is a creative endeavour, at least analysis that creates an impact.

And why a samurai? Well I hope that by the end of this post, you should have your own answer.

Analysis/Analytics is a way of thinking

Everyone analyses things before making a decision, whether consciously or not. So this is nothing new.

In a business context, however, it is worth asking whether the idea has business implications, whether it may have some benefit. There is no point, in a business context, in doing things just for the sake of doing them; RoI matters.

Once the potential business implications have been cleared up, the next step is to decide upon a metric, or set of metrics, we would use to judge whether the idea works. The metrics are usually very closely linked to the business objective: if the idea we are testing is about increasing sales, then it makes more sense to use sales, or sales growth, as a metric rather than, say, employee satisfaction.

Once the metric is chosen (or a list of metrics in order of preference, including some proxies in case the preferred metric is not available), the next step is to look for the data needed for the analysis; this covers both history and granularity.

Next is the number crunching, based on the algorithms chosen.

And finally we reach a certain conclusion.

There are a couple of things worth pointing out.
  1. The first step always has to be the business context; the assumption is that analytics is designed to help the business. This also means that some level of subject matter expertise, and some creativity, is needed, at a minimum to translate the business issues into something solvable by analysing data.
  2. I deliberately placed metrics, data, and algorithm before any analysis starts, and in this order. We need a clean and objective view of the situation: pick the best approach, the second best (or more), and tackle them in that order. It is not difficult to find a combination that gives us the answer we would like to have, but that would be fishing, not analysing. We let the numbers tell their story; we do not torture them until they say what we want to hear.
  3. While I have described the process, or at least one crank of the wheel, as linear, it is not necessarily so. For example, if we had chosen year-on-year growth as our metric but find that we only have 6 months of history, we should go back and decide whether 6 months of data can support a good enough analysis; if not with the original metric, we should amend it to monthly or quarterly sales growth, for example.
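The feedback step in point 3 can be sketched in code. A minimal illustration, where the metric names and the history thresholds are my own assumptions rather than rules from any real project:

```python
# Hypothetical sketch: fall back to the best metric the available history
# supports. The thresholds (24 and 6 months) are illustrative assumptions.

def choose_growth_metric(months_of_history: int) -> str:
    """Return the preferred growth metric given how much history we hold."""
    if months_of_history >= 24:
        return "year-on-year sales growth"   # first preference
    if months_of_history >= 6:
        return "quarterly sales growth"      # fallback when history is short
    return "monthly sales growth"            # last resort

print(choose_growth_metric(6))  # -> quarterly sales growth
```

With only 6 months of history, the preferred year-on-year metric is dropped and the process loops back, echoing the example in point 3.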

But analysis cannot exist on its own.


Analysis is part of a process: a cycle of forming hypotheses, analysing them, assessing their viability, and then reaching a conclusion.

The process can then be repeated so as to get better and better answers, either by testing different things or by refining ideas.

There are 2 points to bear in mind:
  1. The key is the approach of experimentation; we cannot expect to have the perfect answer the first time, every time. Another way of looking at it is that "no" is a good answer too. It may dent our ego if the hypothesis was "dear to us", but in the spirit of experimentation, a conclusive "no" is as important as a conclusive "yes". And this brings us to the second point.
  2. It is worth reiterating (I touched on it earlier) the need to be as objective as possible. While a hypothesis may be "dear to us", they should all be analysed and evaluated objectively. I am not saying passion should be excluded; passion is great in hypothesis forming, but hypothesis testing (and prediction) should be done with a cold heart and mind.
What about the process of discovery, you could ask, just letting the data tell its story? I think it is perfectly acceptable, but as we interpret the data a story will form, whether externally (say from experience or the past) or from the data itself, and this will lead to a hypothesis; and so the cycle begins.
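Judging a hypothesis "with a cold heart" often boils down to a simple significance test. A minimal sketch of the classic two-proportion z-test for comparing two conversion rates; the experiment numbers are entirely made up:

```python
import math

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """z statistic for H0: the two underlying rates are equal."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Illustrative experiment: 120/1000 conversions in group A, 90/1000 in B.
z = two_proportion_z(120, 1000, 90, 1000)
# |z| > 1.96 means a conclusive answer at the 5% level, whether "yes" or "no".
print(round(z, 2))
```

A conclusive "no" exits this test exactly the same way a conclusive "yes" does; the cold-hearted part is accepting either.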

It is also important to remember, again, that analysis, at least in a business context, cannot be purely for the sake of analysis.



I keep saying it, but analysis should be done to help the business, and we should avoid analysis-paralysis.

If, as a result of the analysis, given our choice of metrics and methodology there is no conclusive answer, then we should review. For example, we may decide to rephrase the business issue, or decide to collect more data to allow for a more definitive answer.

If we have a conclusive answer then we can exit the process and move to the next step; remember, there is nothing wrong with concluding that the data does not support the hypothesis. Experimentation is about learning and moving on.

It may not be easy to switch to an experimentation driven methodology, but the rewards are worth it.

The ultimate aim of analysis is to take action; analysis, in a business context, only truly comes to life when there is a resulting action, the experiment is tested in real life, and to me that’s one of the most exciting times, when you really get to see whether your analysis has been accurate and how much it actually helps the business.

The other side of the coin is that taking action without proper analysis can be likened to gambling. Of course, people with experience can use it to guide their actions, but what is experience if not an accumulation of data? The danger is that, since memories associated with emotion are more lasting and easier to recall, we may be remembering a distorted view that only an objective analysis can reveal. Also, situations change, and in fluid environments data analysis is invaluable as a tool to guide decision making and a precursor to action.

But just taking action is not enough; the result matters.


Finally, once action is taken and the experiment is run, it is critical to gather the results, compare them to expectations, and learn whether things went as planned.

Furthermore, analysing the results can lead to new hypotheses or refinements of existing ones, kick-starting a virtuous cycle of improvement.

Still, why the Samurai?

Because it is very important to have the right attitude when analysing data, and the (perhaps romanticised) image of a samurai as someone who is zen-like but decisive helps. It is important, when analysing data, to be able to put everything aside and focus on what the data is saying.


So who wants to be a left-handed Samurai?






Monday, 10 December 2018

Lies, Damned lies, and football statistics (with a sprinkling of fake news)


“People who don’t understand football analyse with statistics”(1), so said  Jose Mourinho, 4 times world’s best coach, 2 times champions league winner, 3 times English champion, 1 time English FA cup winner, 4 time English league cup winner, 3 times Spanish champion, 3 times Spanish cup winner, 2 times Spanish super cup winner, 2 times Italian Champion, 1 time Italian Cup winner, 1 time Italian super cup winner, 2 times Portuguese champion, 2 times Portuguese cup winner, 1 time UEFA cup winner, 1 time Europa League Winner, 1 time UEFA Super Cup winner, 2 times Portuguese Super Cup Winner, 2 times English Super Cup winner (2)

On the other hand, Pep Guardiola kind of referred to statistics when arguing that his team is not dirty: “Normally when a team has 65 or 70 per cent of the ball we cannot kick the opponent. We can kick each other, okay, but we have the ball. Normally when for every 10 minutes you have the ball for seven of them there is less option to make fouls. I don't think we're a team that make a lot of fouls in games.”(3).

So Mr Guardiola, 2 times world’s best coach, 2 times champions league winner, 1 time English champion, 1 time English league cup winner, 3 times Spanish champion, 2 times Spanish Cup winner, 3 times Spanish Super cup winner, 3 times German champion, 2 times German cup winner, 3 times FIFA club world cup winner, 3 times UEFA super cup winner and 1 time English super cup winner (4) on the other hand, uses statistics (sort of) when assessing his team.

And then there is the adage my friend Ramesh reminded me of: "Lies, Damn Lies, and Statistics".

Given his past record with respect to lies (5), let us consider the argument of Mr Guardiola. Taking data from whoscored (6), I focused on the number of fouls committed per game, and made a distinction between home games and away games.

In the chart above, the teams who commit more fouls per game sit toward the exterior, whereas those who commit fewer fouls sit closer to the origin. It is easy to see that Mr Guardiola was right: Manchester City is one of the teams that commit the fewest fouls per game. It is an undeniable fact.

So is Mr Mourinho wrong then?

This is where context and subject matter expertise are important. I am very fond of the Drew Conway data science diagram, and it emphasises the need for subject matter expertise.

What subject matter expertise, you may ask?

Well, enough to understand that Manchester City play a possession-based strategy; basically, they keep the ball for a huge chunk of each game. This element provides context.

This is football (or, as Americans call it, soccer), and players are not allowed to tackle players who do not have the ball. Basically, your opponents are much more likely to try to attack you when you have the ball; when you do not have the ball, you are unlikely to be attacked. The more time the ball spends in your possession, the less likely you are to commit a foul.

Hence, what matters is not fouls per game, but fouls per number of minutes the opposition has the ball.

Now the situation looks totally different, doesn't it? Manchester City is not among the teams that commit fewer fouls per minute out of possession; they commit more than their fair share of fouls when the opposition has the ball. In fact, if you look only at home games, they commit more possession-adjusted fouls than any other team in the Premier League.
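The adjusted metric is straightforward to compute. A small sketch with hypothetical numbers (the real figures come from whoscored) showing how a team can look clean per game yet dirty per minute out of possession:

```python
def fouls_per_minute_out_of_possession(fouls_per_game, possession_share, minutes=90):
    """Fouls committed per minute the opposition has the ball."""
    minutes_without_ball = minutes * (1 - possession_share)
    return fouls_per_game / minutes_without_ball

# Hypothetical teams: A dominates the ball, B does not.
team_a = fouls_per_minute_out_of_possession(9.0, 0.65)   # fewer fouls, high possession
team_b = fouls_per_minute_out_of_possession(11.0, 0.45)  # more fouls, low possession

# A commits fewer fouls per game (9 vs 11) but has far fewer minutes
# without the ball in which to commit them, so its adjusted rate is higher.
print(round(team_a, 3), round(team_b, 3))
```

Same raw data, different denominator, and the ranking flips: this is the whole argument against Mr Guardiola's fouls-per-game framing.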

Interestingly, Chelsea and Liverpool also foul consistently. It makes sense: if your tactics are built around overloading opponents, then when an opponent gets the ball (by beating your press) you are left overloaded yourself, hence the tactical foul to let you regroup and rebalance the situation.

In fact, that was what Gary Neville was referring to when he said that Manchester City is a cynical team (and that he likes that).

What I actually find interesting is that Manchester City's triangle is very asymmetric: they foul much more at home than away. They are much more aggressive at home, having scored twice as many goals there as away; Chelsea and Liverpool are much more balanced.

Anyway, I can happily disagree with Mr Guardiola; after all, you wouldn't expect a coach to admit that he asks his players to commit fouls, but I would have expected him to keep quiet rather than manipulate the data in his favour. I thought the temple of "fake news" was located at the White House; apparently it has a branch at the Etihad (and I am not commenting on the FFP and other allegations by Der Spiegel (7), such as "We do what we want").

Was this a case of "Lies, Damn Lies, and Statistics"? "Fake news", yes; deliberate misdirection or white lying, maybe; but it is not the fault of statistics. The fault lies with the person who chose the metric (number of fouls per game rather than number of fouls adjusted for possession), not with the metric itself.

So was it an illustration of what Mr Mourinho was saying, that “people who do not understand football analyse with statistics”?

Well, since I am currently spending a lot of my energy trying to get an organisation to increase its adoption and usage of statistics in decision making (not a football club, though; any takers?), I would neither agree nor disagree, and hide, as usual, behind "it depends".

It depends on what Mr Mourinho actually meant. Saying that people who do not understand football analyse with statistics is not the same as saying that people who understand football do not analyse with statistics. Please note that, since we are dealing with absolutes, the intersections will be shown as changes in colour of the affected regions (for example, red overlapping with yellow makes orange, and red with blue makes purple).


The world is made up of people who understand football and those who don’t.
Now let's add people who analyse football with statistics; Jose Mourinho's words can be seen as:
There is a perfect overlap between people who do not understand football, and those who use statistics to analyse football.
But his statement is equally valid if:
In this case there are people who understand football and do use statistics to analyse it; presumably Mr Mourinho has at least one such analyst in his team.
So what am I saying?
I deliberately started this blog with a seemingly controversial statement by Mr Mourinho, who is someone with many detractors. It is possible that a proportion of people would have interpreted his words as the orange and blue diagram above simply because of what they perceive Mr Mourinho to be, that is, negatively. Hence they may not have seen his statement as representing the last diagram above, which is not very controversial.
On the other hand, if you just go on a search engine and look for Guardiola and Gary Neville, you will find many more articles on Mr Guardiola's response than on Mr Neville's statement. Again, Mr Guardiola has a better image and people tend not to analyse his statements as critically, whereas Mr Neville can be polarising; hence the focus on the rebuttal of his statement.
But as you can see, at least in this case, Mr Guardiola was dealing in “fake news”.
Conclusion(s)
While the source of any data should be examined, personal feelings towards the person delivering the message should not enter the picture. Data is data and should be analysed without prejudice.
That being said, if the person who delivers the analysis has "mis-spoken" frequently in the past, then it makes sense to review their data a little more closely; after all, frequency of mis-speaking is a characteristic…
One of the simplest ways of analysing data is to put it in a proper context, and this takes some understanding of the data, the process of data creation, some subject matter expertise, and an open but critical mind.

P.S. While I wrote this blog last week, this weekend Chelsea played and beat Manchester City. In total Chelsea committed 12 fouls and Manchester City 11, with possession at 39%-61%; adjusted for possession, Manchester City therefore fouled at the higher rate (11 fouls in roughly 35 minutes without the ball, against Chelsea's 12 in roughly 55). Chelsea won, by the way, and disrupted Manchester City, restricting them to only 4 shots on target against their average of 6.1.

  1. Pep Guardiola’s changing defense against his convictions for taking performance enhancing drugs, and the power of unstable urine in all 4 tests as proposed by his still close collaborator, Mr Manuel Estiarte http://www.sportingintelligence.com/2017/04/25/sharapova-guardiola-doping-darkness-and-light-250401/

Tuesday, 9 October 2018

Analytical Maturity: why organisations should embed analytics across all departments, so all departments grow together

Last year, a friend of mine asked me my thoughts on an HBR article “If Your Company Isn’t Good at Analytics, It’s Not Ready for AI” (1). It was an interesting one and caused me to pause. I had started writing a blog post about it but abandoned it half-way. Today this topic is more relevant than ever. So here goes.

If you read some of the thousands of articles, or listen to people's reactions around AI, you would think everything is AI nowadays. But this isn't the case; why?

Similarly, the same friend was surprised to learn that organisations such as Tencent have huge teams of data scientists; so does Zhong An (2), by the way (at least 54% of their employees are engineers or technicians). That seems to indicate that the deeper you get into analytics/ML/AI, the more data scientists and machine learning engineers you need, not fewer.

Why is this so? Why can’t an organisation “put all the data in one machine and all answers come out” like what one of my bosses wanted? Can’t everyone just adopt AI?

If you build it, they will come (3)

I love the movie "Field of Dreams". It's about a whacky farmer who decides to put all his eggs in one innovation and goes the whole hog, uprooting his previous business model and establishing a new one driven by his passion (and a voice in his head).

To me, the most memorable line of the movie is “If you build it, they will come”.

This is often the idea behind transformation projects, where the aim is to transplant analytics (let alone AI) into an organisation. Whether it is done by a bunch of external consultants or by hiring a relatively experienced head to build a team internally, the result is most often the same: there are only ghosts on the baseball field.

This is why I consider most insurance company big data labs failures; the men and women in white coats/uniforms are playing amongst themselves, and life goes on as normal for ordinary folks, like 2 distinct worlds. So what is the RoI of these labs? I would call this a failure, since I believe the main aim of analytics/"data science" is to generate RoI and benefit all parties: grow the pie and everyone gets more.

So why do so many attempts at embedding analytics in organisations end in failure? Why do 85% of data lake projects fail (4)? The technology is there. Sure, there are messy implementations, broken pipelines, choked pipelines, clogged processing engines, extremely dirty data where ELT works only sporadically, or knowledge that disappears with staff attrition…
Well, as the article (4) says “More than 85 percent of respondents report that their firms have started programs to create data-driven cultures, but only 37 percent report success thus far. Big Data technology is not the problem; management understanding, organizational alignment, and general organizational resistance are the culprits. If only people were as malleable as data.”

Even if you build it, they may not come (for longer than the rah-rah show).

Basically, the production of analytical pieces is 'easy'. Drop me, or any decent analytical person, in a data lake, throw in an SME and a good data engineer (my claim of analytics as a team sport (5)), and we are bound to catch some fish for the client, clean it, and cook it for them; but they are unlikely to know how to continuously include fish in their diet unless the people on the client's team are ready.

What is most often missing is the ability to consume the pieces of analytics in a consistent and on-going manner.

Consumption of analytical/"data science"/AI output is not as obvious as you may think. And this is the part that most failed implementations have in common (it is also why I have consistently refused to join any organisation trying to transform itself in terms of analytics if the role focuses on the IT production side).

There can only be one, can’t there?

You could argue that it is only necessary to have 1 good consumer in the organisation: 1 department adopts analytics/"data science", shows the benefits, and drags all other departments along. After all, once a piece of analytics is successful, each head of department can choose to adopt analytics and enjoy the benefits at a much lower risk.

There are 2 flaws in this argument. Firstly, we are forgetting the ability to consume: wanting to consume is one thing, but being able to (being analytically mature enough) is not a given. Secondly, departments rarely exist in isolation in an organisation. A simple example will illustrate this.

A while ago, I was demonstrating how quickly a selection of customers based on their behavioural similarities can be made and readied for an experiment. I gave up when the customer informed me it usually takes 6 months to run a campaign (even a mini-one) and that was the only way to run experiments. An organisation often moves at the pace of its slowest department.

This brings us to organisational analytical maturity.

I will admit that this topic is very close to my heart and mind at the moment (hence the idea to revive the blog from last year). I fundamentally believe that for an organisation to fully benefit from the advantages provided by analytics, or to eventually become data-driven, it is critical for all parts of the organisation to be pulling in the same direction, and preferably at the same speed. So how do I define analytical maturity?

To me, the easiest way to understand how mature an organisation is, is to understand the kind of questions that the people within the organisation are trying to answer using data. 



The questions analytics can answer well range from "what has happened" to "how can we make this happen". For simplicity, analytical maturity can be broken into 4 stages.

Descriptive Stage

The descriptive stage is the first encounter many organisations have with data. It often takes the shape of backward-looking reports: What has happened? How many of item X did I sell last month? This is the stage most organisations will be familiar with.

Diagnostic Stage

After getting the hang of static reports, the next stage is the diagnostic stage, where hypotheses are formed. Questions are asked around "why", and they often require further slicing and dicing of the data to find potential answers.

Predictive Stage

The predictive stage is when the questions move from looking backwards to looking forwards. While concerns about the future may have been implicit in the diagnostic stage, it is in the predictive stage that specific tools, methodologies and algorithms are employed to uncover what is likely to happen, often how likely it is to happen, and what the drivers of the behaviour are.

Pre-emptive/Pro-active stage

At this more advanced stage, instead of taking certain variables/inputs as given and trying to predict the outcome, the idea is to influence the variables and thereby cause a change in the behaviour/status… Nudging, Behavioural Economics, Game Theory are common strategies and approaches.

A simple example, the "drain the swamp" (6) example, can illustrate the difference:

·         Descriptive Stage: How many people voted against me?
·         Diagnostic Stage: Why did these people vote against me?
·         Predictive Stage: Who is that person likely to vote for?
·         Pre-emptive/Pro-active Stage: How do I get that person to vote for me?
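The same progression can be illustrated against a toy sales table. All names and numbers below are made up, and the pre-emptive/pro-active stage is deliberately absent, since it requires an intervention (a campaign, a nudge), not just a query:

```python
# Toy data: (month, region, units_sold) -- entirely made up for illustration.
sales = [
    ("Jan", "North", 100), ("Jan", "South", 40),
    ("Feb", "North", 110), ("Feb", "South", 30),
    ("Mar", "North", 120), ("Mar", "South", 20),
]

# Descriptive: what has happened? Total units sold over the quarter.
total = sum(units for _, _, units in sales)

# Diagnostic: why? Slice by region to see where any decline sits.
by_region = {}
for _, region, units in sales:
    by_region[region] = by_region.get(region, 0) + units

# Predictive: what is likely to happen? Naive extrapolation of South's
# average monthly change onto next month.
south = [units for _, region, units in sales if region == "South"]
trend = south[-1] + (south[-1] - south[0]) / (len(south) - 1)

print(total, by_region, trend)
```

Each stage reuses the same data but asks a harder question of it; the pre-emptive/pro-active stage would then try to change South's trajectory rather than merely forecast it.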

It is all too easy to underestimate how difficult it can be for people to climb through the stages of analytical maturity; some never reach the pre-emptive/pro-active stage.

I believe that people usually do not want to make their lives harder than they are; hence the best way to make people in various parts of an organisation more analytically mature is to show them direct benefits to themselves. It is about change management.

At eternity’s gate (7)

For organisations whose departments are used only to static reports, or are so busy that they don't look at the reports at all, making descriptive analytics visual is a natural step. To anyone interested in making reports relevant to people, creating meaningful dashboards, and triggering people to think using numbers, I would recommend the books by Stephen Few (8); I had the opportunity to attend a course by the author a few years ago, would like to think I learnt a lot, and try to follow the guidelines as much as I can.

The great thing about these books is that the principles can be applied using most software, so you can start from today itself.

One of the more logical approaches to (re-)introduce the use of simple reports in an organisation is to take stock of existing reports, gather business requirements, and do a gap analysis. In parallel, or even prior to that, it is good to have special-purpose pieces of work answering specific ad hoc business questions. Once immediate needs are met, the focus can switch to future needs, and the discussion can move more easily to dashboard design and the ability to drill, slice and dice.

Basically, the idea is to use ad hoc analyses and visualisations to encourage people to think about data, and to use data to try to solve their problems, moving from the descriptive stage to the diagnostic stage.

One of the important aspects of the diagnostic stage is the culture of experimentation. Hypotheses can be formed, maybe even theoretically tested, but true learning comes from actual experimentation, and this becomes more important in the next phase.

Back to the future (9)

The move from backward-looking to forward-looking is a very important one. Creating hypotheses (as in the diagnostic stage) can still be done without, say, knowledge of statistics, but evaluating them and making inferences requires some statistical knowledge, as does the evaluation of the results of experiments. This is even more so when one moves into the realm of predictive analytics.


Why statistics? Well, I believe that having a working knowledge of maths and stats allows the understanding of many techniques used for predictive analytics. And I will, as usual, place my favourite data science diagram (10):



Advanced analytics/"data science" is concerned with predictions, and as can be seen above, knowledge of stats/maths is an important ingredient of "data science".

Once an organisation is comfortable in the world of creating hypotheses and possibly testing them, the next step is to use predictions to guide the ‘best’ course of action. It is important to note that in order to maximise the impact of predictive analytics, the culture of the organisation must have evolved to one of experimentation.

Once the culture of experimentation is established, we have a learning organisation that can become data driven. Again, it is important that experimentation permeates the organisation; it is critical to understand that some experiments will not produce the expected results, and that learning from them is the point. Not learning is the real failure.

Minority Report: A Beautiful Mind (11)(12)

Predictive analytics assumes that the behaviour variables are given; pre-emptive/pro-active analytics attempts to change the behaviour. This falls in the realm of behavioural economics, game theory, nudging, precogs(11)… Most organisations are not there yet, plus there may be some ethical implications (after all the swamp hasn’t been drained yet, has it?)

In sum, analytical maturity is critical to ensure the successful adoption of the more advanced tools of analytics/"data science" (to me AI is a tool); to paraphrase the article quoted earlier (4), people are not 'malleable', putty is. So, as long as we are dealing with people, change management (bringing people across an organisation up the analytical maturity stages) is important.

However, that is not to say that it is impossible for an organisation to leapfrog technologically. One of the interesting aspects of technology is that you do not need to understand it fully to use it to make decisions. As someone said at the Global Analytics Summit in Bali last year (you can find a piece I presented there in a previous blog post (13)), "managers who know how to use data to make decisions will replace managers who don't".

Once a technology reaches the bottom of the trough of despair in the hype cycle (14), what brings it back up the slope of enlightenment is that it starts being applied beyond the purely technical hype; real-life applications are what bring technologies to the plateau of productivity.




In Sum
To me, it is our job as analytics/"data science" practitioners to help organisations move through the analytical maturity stages. What about new technologies to come, you would ask? The answer is that if an organisation is mature enough and has become data-driven, it will naturally seek to adopt new technologies and compete with data.

So, to answer my friend: yes, if an organisation is not doing analytics, it can't simply adopt AI. However, it does not necessarily take that long to learn and become analytically mature, as long as there is a framework and commitment throughout to do so. And I would like to add that I certainly believe in technological leapfrogging; I am betting on it.


  1.  https://hbr.org/2017/06/if-your-company-isnt-good-at-analytics-its-not-ready-for-ai
  2. https://asia.nikkei.com/Business/Chinese-online-insurer-leaves-traditional-rivals-in-the-dust
  3. https://en.wikipedia.org/wiki/Field_of_Dreams
  4. https://www.techrepublic.com/article/85-of-big-data-projects-fail-but-your-developers-can-help-yours-succeed/
  5. http://thegatesofbabylon.blogspot.com/2018/08/if-you-dont-have-phd-dont-call-yourself.html
  6. https://www.tampabay.com/florida-politics/buzz/2018/03/20/and-i-was-in-florida-with-25000-people-going-wild/
  7. https://www.imdb.com/title/tt6938828/
  8. https://www.goodreads.com/book/show/336258.Information_Dashboard_Design
  9. https://www.imdb.com/title/tt0088763/
  10. http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
  11. https://www.imdb.com/title/tt0181689/
  12. https://www.imdb.com/title/tt0268978/
  13. http://thegatesofbabylon.blogspot.com/2018/01/
  14. https://en.wikipedia.org/wiki/Hype_cycle

Sunday, 2 September 2018

What is Value? Or how much should a "data science" project cost / "data scientist" be paid?


Singapore has been rocked by the SingHealth hack (1), the fact that the government has been downplaying it by saying the data was not valuable (no state secret (2)), and the ex-PM's description of people who do not make a million dollars annually as very very mediocre (3). My question is "so what?" (4)

What is this thing called value? What is the value of the data? Is a salary of less than a million dollars very very mediocre?

In this blog, I’ll take a stab at “value”.

SingHealth

First let’s take the case of SingHealth. You must understand that Singapore takes defence and security very seriously, the concept of total defence (5) includes cybersecurity, and some people were amused that the CEO of SingHealth is the wife of the minister for defence (6). But that’s not the point.
So what is the value of the data? How do you measure it? To me this is a very subjective area. However, I assume everyone will agree that the accuracy of data is important, even (or especially) for AI/ML.

How accurate is SingHealth data?

There was another story recently (7): SingHealth actually tagged someone as HIV positive when she was not. So my question is: how many such mistakes are there in the data? Saying someone is HIV positive when they are not is a huge mistake. In fact, the article states that, since the 'victim' was pregnant, the husband talked about divorce, abortion...

I am not slagging off SingHealth specifically, but the value of data is tightly tied to its accuracy, and there are some doubts over the accuracy of SingHealth's data; after all, how many of us have looked into what is in our files at the doctor/hospital/clinic (especially those of us who are not medically trained)?
However, the government stated that only basic data such as name, NRIC number, age and gender was compromised. But the NRIC is the key to most (if not all) databases in Singapore; in fact, some shops want to use the NRIC as their loyalty card number, which I refuse to give out. With these details, someone can apply for a loan and make you responsible for it.

SingHealth is aware of the issue and has taken immediate mitigating measures; that rather shows the depth of the problem (8)(9).

So what is the value of this data?

To me, there are a couple of other criteria that determine the value of data: the use it will be put to and the skills of the person using it. So it depends who you are talking about. And this brings us to very very mediocre people.

The ex-PM basically called people who do not make SGD1m a year very very mediocre, and needless to say it created some noise in cyberspace. But was it warranted? Many Singapore ministers come from three backgrounds: lawyers, doctors, and army officers.



The above charts from payscale.com (10) show the distribution of yearly salaries for these job roles in Singapore. Note that the army numbers are highly skewed since they include people undergoing national service, and the data has some quality issues (a salary of $72 for a doctor), but the median is reasonable.

Basically, it is quite clear that people who make $1m or more are well beyond the top 10% of salaries for their domain.

The ex-PM also argued (3) that ministers' salaries need to compensate ministers for the salaries they are giving up. Given that Singapore ministers' salaries are quite high (11), it does make sense, if you aim to compensate people for the salaries they are giving up, to only look at people making S$1m yearly.

I am specifically avoiding questions/discussions around whether this is a good way to remunerate people serving the public. But if the aim is to entice people at the top of their domains as measured by salary drawn, and to compensate them similarly, then given that ministers' salaries are close to 7 figures, saying that those below S$1m are not what they would be looking at - hence “very very mediocre” - is acceptable in this context. (You wouldn't consider an 18-year-old doing his national service as a ministerial candidate, for example; the segment you go for is not the whole spectrum but a small slice at the right of the distribution.)

Surprised that I think so?

I believe that people should be paid based on the value they generate.

If the salary that the minister-to-be draws is a reflection of his/her value in his/her domain, and if that value can be transferred to how much he/she contributes as a minister, then it is perfectly alright that the salary they receive is similar.

But the real question is how do we measure the value that someone generates?

That is precisely the great thing about being in the Analytics/”Data Science” space. The value you bring in a project can and should be easily measured.

When I took up my first contract more than a decade ago, the business sponsor had a choice between employing a new salesperson or spending the money on an analytics guy, so my contract had targets just like salespeople (but no variable income, unfortunately). Hence the value I brought to the organisation – indeed that of my fellow analytics guys – was tracked and measured, with the methods of measurement and the metrics all discussed and agreed.

These were extremely exciting times for me. In fact, when I resigned less than 6 months into a new contract, I told my colleagues to tell my replacement that he/she could rest easy if he/she was paid around the same as I was, since I had already justified my existence for the year and therefore his/hers.

This is why, whenever there is a “data science”/analytics project, I insist on having metrics that reflect the impact of the piece of work on the organisation it is being done for. Whether it is savings (for churn, say, the decline in the number of churners relative to past trends, or even monetised by spend – although that adds an extra dimension and gives more room to play), revenue increase, or market share increase, the metric should be whatever the KPI of the project sponsor is: measurable and measured.
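As a rough illustration of monetising churn impact along the lines above, here is a minimal sketch; the helper function and all the numbers are hypothetical, not from any real project:

```python
# Hypothetical sketch: monetise a churn campaign's impact as churners avoided
# (versus the past-trend baseline) multiplied by the spend retained.

def churn_campaign_impact(baseline_churn_rate, observed_churn_rate,
                          customers, avg_monthly_spend, horizon_months=12):
    """Estimate savings: the drop in monthly churn rate times the customer
    base gives churners avoided; multiply by spend over a chosen horizon."""
    churners_avoided = (baseline_churn_rate - observed_churn_rate) * customers
    return churners_avoided * avg_monthly_spend * horizon_months

# e.g. 100k customers, churn down from 2.0% to 1.5% monthly, S$40 average spend
savings = churn_campaign_impact(0.020, 0.015, 100_000, 40, horizon_months=12)
print(round(savings))  # 500 churners avoided x S$40 x 12 months = 240000
```

The choice of horizon is itself one of the things to discuss and agree with the sponsor, just like the metric.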

When I started in analytics more than a decade ago, we had to prove ourselves to a sceptical business, hence we ‘manually’ tracked our impact to justify our existence and gain trust. I spent almost 4 years in that organisation, and we eventually set up proper campaign tracking. Imagine my shock when I went back and found out that the organisation had stopped tracking campaigns. They ran campaigns simply because they had the budget, and “use it or lose it”, without caring whether there were better ways of “using it”.

Some people may like this environment where you get to experiment without risk; but how would you know if the risk paid off, or how good your ideas/hypotheses/skills are, if you do not measure the outcome? How do you know the value you bring to an organisation? How would you know your value?

Value is not a measurement of input, but of output.

Once you have an idea of how much you will be able to contribute to the organisation, you can apply RoI/break-even rules and determine how much it would be acceptable for you to charge, thereby delivering a win-win situation.

The value of a project is a proportion of the value of the benefits the project generates for the client and that proportion is usually based on the typical RoI or Break-even period for projects the client undertakes.
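The break-even logic can be sketched with hypothetical figures (the function name, the benefit, and the payback period below are all illustrative, not a standard formula):

```python
# Hypothetical sketch: cap the project fee at the benefit accrued within
# the client's expected break-even period.

def acceptable_fee(annual_benefit, breakeven_months=12):
    """If the client expects to break even within `breakeven_months`,
    the fee should not exceed the benefit generated over that period."""
    return annual_benefit * breakeven_months / 12

# A project expected to add S$300k a year, client wants payback in 6 months
print(acceptable_fee(300_000, breakeven_months=6))  # 150000.0
```

The same arithmetic also shows why identical effort commands a lower fee at an SME: if the annual benefit is smaller, the break-even-capped fee shrinks with it.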

This means that the same effort may generate less value to an SME than an MNC (in $ terms), hence the value you’d bring to an SME is lower than that to an MNC.

I recently had this discussion with someone who works closely with and helps bring innovation to SMEs. I think SMEs have similar problems to MNCs, albeit at a lower scale. While it is true that SMEs are less likely to have a full set of data to start work on, the analytical methods of solving the problems are similar. Furthermore, analytics is not as expensive as many people think it is.

I think SMEs have an advantage over MNCs: they are more flexible. Hence arrangements where the client pays a low base fee plus a proportion of the value generated by a project/analytical piece of work can be done with SMEs, whereas MNCs may not have that flexibility (nor would large consultancies, who would have to account for revenue recognition risk and so on).

Basically, to me it is very simple: tie what you are paid to the value you deliver to your customer. This is a very simple way of having win-win situations. And it all starts by knowing the value you bring, which is based on measuring the impact of your work.

Similarly, as someone considering paying for the services of a “data scientist” or “data science team”, you should base the payment on the expected returns from the services received, and that starts by looking at the impact delivered in the past.

P.S. As I wrote this blog, a new ruling from the PDPC (Personal Data Protection Commission) recognises the value of the NRIC and therefore restricts unwarranted use of it. (12)



  1. https://www.businesstimes.com.sg/government-economy/singhealth-hacked-records-of-15m-patients-including-pm-lee-hsien-loong-stolen
  2. https://www.straitstimes.com/singapore/singhealth-cyber-attack-pm-lee-says-nothing-alarming-in-his-data-that-was-stolen-no-dark
  3. https://sg.news.yahoo.com/ministers-not-paid-enough-says-goh-chok-tong-reports-043024792.html
  4. https://www.youtube.com/watch?v=FJfFZqTlWrQ
  5. https://www.scdf.gov.sg/home/community-volunteers/community-preparedness/total-defence
  6. https://www.theonlinecitizen.com/2018/07/26/singhealth-hack-exposing-the-cracks-of-elitism-and-entitlement/
  7. https://www.straitstimes.com/singapore/health/singhealth-apologises-after-polyclinic-doctor-mistakenly-marks-woman-as-hiv
  8. https://twitter.com/mrbrown/status/1021332747204218880
  9. https://twitter.com/mrbrown/status/1020223366341394433
  10. https://www.payscale.com/research/SG/Job=Attorney_%2F_Lawyer/Salary , https://www.payscale.com/research/SG/Job=Physician_%2F_Doctor%2C_Cardiologist/Salary/67bbbe3e/Singapore , https://www.payscale.com/research/SG/Job=Army_Officer/Salary
  11. https://en.wikipedia.org/wiki/Cabinet_of_Singapore
  12. https://www.straitstimes.com/singapore/stricter-rules-to-protect-nric-data-from-next-sept


Sunday, 26 August 2018

Business Internet Banking, 5 working days to unlock an account? yticoleV. Nudge nudge branch transaction red flag...

I had a very interesting experience last week and I thought it was worth sharing. Basically it is about OCBC and the push towards decreasing an organisation’s costs by either replacing the humans in its ranks with machines or shifting costs onto the customer (one of my pet peeves with supermarket self-checkouts), and especially how that push is being made and the possible impacts on existing staff, customers, and customer experience.

I had managed to lock myself out of my business banking internet account with OCBC. (Hey, what can I say, I love the fact that I have a personalised debit card... have always been a sucker for looks). What is important is that I am/was using self-service internet banking, apparently exactly what the bank wants. When I was locked out, I called the hotline, and was told it only operates during office hours.



The next day, I tried the hotline again, and decided to try and get the problem fixed without having to go to the branch. The superb robot couldn’t understand what I was saying. (I am not complaining about the robot, although I spoke in my version of English throughout (no mid sentence language switch which is horrible to deal with in speech recognition), but the simple fact is I couldn’t get what I wanted after trying a couple of times.)

You can stop laughing at my engriss please. The OCBC robot’s standards are way higher than mine, and its accent quite posh :)

So I decided to take a quick walk to the nearby branch.

Anyway, after a not-so-small wait, I went to the counter and met a perfectly polite and helpful lady (Ms Lim, if I remember correctly). She understood my problem, got the appropriate paper forms, ticked the relevant boxes, got me to sign, and all was done!

Just before I left the counter, I asked, “around how long do I have to wait to have my access re-instated please?” and she replied “up to 5 days” and after she realised I was shocked she helpfully explained that they couldn’t do this on-site, and this piece of paper had to be sent to the relevant department for action, so it would take up to 5 days.

Ridiculous, don’t you agree?

Here you have a bank pushing its customers to use the internet, but when they are locked out, it can take up to 5 days to reinstate access to the account. Now if, like stupid me, you thought internet access would mean you could always take advantage of the really fast bank transfers to pay for business expenses, you’d end up like stupid me with an invoice you cannot pay, because you have to wait 5 days for the piece of paper to make its way from the branch to the relevant department (the internet banking department?) and for someone to take action.

I lost it.

I asked the counter staff to put herself in my shoes, and whether she didn’t find it ridiculous that a bank investing a huge amount of money to replace counter staff with ‘robots’ (1) didn’t instead spend the money on making internet services for customers more seamless. (Unless I am one of the few idiots, if not the only one, who got himself locked out of his internet banking access – in which case I will hang my head in shame.)

Anyway, the counter staff said she would talk to her colleague to expedite my case and I left the branch.

A few hours later, I received an SMS from the bank telling me my access had been restored, then an email, followed by a call from a human. Wow, talk about service! OCBC managed to turn an unpleasant experience into a surprisingly pleasant one.

So am I really writing a congratulatory blog?


I started thinking...

If it is possible to resolve the issue so fast, why wait for the customer to complain? Either the bank’s processes were short-circuited to expedite my case as an exception (“can you do me a favour, I have this customer who is very unhappy...”?), or the process does indeed take minutes and it is a deliberate policy on the part of the bank to make branch transactions as painful as possible.

The counter staff did ask me if I had tried calling in, and I said I did, but it didn’t work, the machine couldn’t understand what I was asking for. Was that a factor? So I did try the way the bank preferred, and hence I wasn’t made to pay the full price (5 days locked out).

This reminded me of an interesting article I read on how banks are pushing us towards ‘the cashless society’ (1).

Basically the key to making customers do what you want is “nudging”.

To quote the Wikipedia definition (emphasis mine) “Nudge is a concept in behavioral science, political theory and economics which proposes positive reinforcement and indirect suggestions as ways to influence the behavior and decision making of groups or individuals. Nudging contrasts with other ways to achieve compliance, such as education, legislation or enforcement.”

A classic example of nudging is illustrated below:



For people not familiar with this, public urinals often now come with a fly attached; the idea is that the urinator takes aim and thus spills less. The urinator is nudged towards doing the right thing. I think most of us would agree that a toilet with fewer spills on the floor is a good thing. It’s good for the urinator, who is less likely to get splashback on his pants; it is good for the other toilet users, who enjoy a cleaner environment; and it is good for the people cleaning the toilet (and the company that employs them), who have less mess to deal with. Wins all around! Some organisations have even cashed in (2).

Nudging is another area that interests me, but again, I’ll hold off for another blog and reserve this one for how OCBC’s latest moves appear to me, as an outsider, based on my personal experiences and what I have read. But for now what I am saying is that nudging is not, in and of itself, a bad thing; it can create win-win situations all round.

Let’s assume OCBC is nudging me away from branches to using internet banking.

You could argue that my complaint about the hotline not operating around the clock is a non-starter, since I wouldn’t be getting 24-hour service if I were using a branch either.

But that ignores the issue of expectations. We have been sold the idea that internet banking is superior because it is 24/7. But if support is not 24/7, is internet banking for businesses really 24/7? I don’t think so.

Plus, personally, I much prefer dealing with a human than a robot, face to face rather than via phone. But that’s just me. So if I am moved away from this personal touch, there needs to be a compelling value proposition.

What else is OCBC doing at the branches?

OCBC is planning to replace half of its teller staff (3) with machines by 2020 (4). Thankfully, it is not firing people, but retraining them as highlighted (5).

There are a couple of things I’d like to highlight from the articles.

I object to tellers being described as providing relatively low value-added services: “The bank said the tellers would move into roles that allow them to take on ‘higher value-added’ tasks that require decision-making or physical verification.” What is the measure of value? That’s a blog post in itself, but to me being able to get my issues resolved and queries answered is very valuable, and it’s a definite plus if it comes from a human. I am a human, and I relate better to humans; I ‘feel’ a brand if I am dealing with a human representative of that brand, and I feel nothing if I am dealing with a machine from that brand. So to me there is tangible value (issues that are not out of the box, specific preferences, services that require more stringent verification, as pointed out by OCBC themselves) and intangible value (the human factor).

Another fact that I find interesting is that, even though the new generation of machines developed by OCBC at a cost of $14m, and over which it holds a 5-year registered design licence, is ‘future-proof’, tellers are being retrained into advisory roles (which is great) and into roles where verification is required. This latter piece puzzles me.

To me, anything that is done multiple times is something that can be automated. For example, if releasing funds requires verifying documentation and signatures, there are basic steps that are always followed; and even for signature verification, technology enables you to score how different one signature is from another. And if there is anything you want when deciding whether signatures are “similar enough”, it is some objectivity in deciding the “enough”.
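A minimal sketch of what an objective “similar enough” check could look like, assuming signatures have already been reduced to numeric feature vectors (stroke counts, aspect ratio, pen-pressure statistics, and the like); the feature values and the 0.9 threshold are purely illustrative, not OCBC’s actual method:

```python
# Hypothetical sketch: score a signature pair and compare against an agreed
# threshold, instead of relying on a teller's gut feel.
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def signatures_match(features_a, features_b, threshold=0.9):
    """Objective 'similar enough' decision: the threshold is agreed up front."""
    return cosine_similarity(features_a, features_b) >= threshold

on_file = [12.0, 1.8, 0.35, 4.0]    # hypothetical stored feature vector
presented = [11.5, 1.7, 0.33, 4.2]  # hypothetical features of signature presented
print(signatures_match(on_file, presented))  # True (score ~0.9996)
```

The point is not this particular similarity measure, but that the “enough” becomes a number that can be discussed, agreed, and audited.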

How about advisory then?

Well I am sure that, especially given the evolution of the role of counter staff from pure customer servicing and transactional role to having some sales component, many tellers will be able to make the switch. However, OCBC has also debuted ‘robo-advisors’ a few days ago (6). And this is not a one-off; it is a direction that OCBC has taken for at least a year (7).

The recently launched service targets younger investors; well-heeled clients already have relationship managers, so which market will the newly trained advisors target?


I applaud the efforts by OCBC to re-train its staff. I think it is too easy to replace people without a thought, and putting resources into improving the skills of existing employees is great. Furthermore, this is in line with the efforts by the government to upgrade the skills of the workforce, for example SkillsFuture (8).

However, I believe that an organisation should not cry victory at the beginning of a programme, but only once there is a victory. OCBC claims that all tellers who are being replaced by the new machines will be re-employed in different roles. As articles (4) and (5) point out, OCBC disclosed neither the current number of tellers nor the number of people affected. I believe that OCBC should follow up this PR exercise and put its money where its mouth is: disclose the number of tellers affected now, and the number from this group still employed in 2020. That would be a true win-win.

I hope OCBC can truly create a win-win-win situation where customers get better service, employees remain gainfully employed and enjoy their roles, and the organisation naturally makes more profits. Automation and analytics have the power to create all-around wins, and organisations that create such situations will certainly emerge as winners in the medium to long run.

I believe that automation, finding better ways to serve customers, and embedding analytics at all levels of an organisation’s processes (effectively making organisations data-driven) can enhance people’s ability to do their jobs and make them more productive, decreasing the cost of services and thus allowing more people to enjoy them. It’s about decreasing costs to decrease prices and make things more affordable to more people, not decreasing costs to increase profits.