Wednesday, 23 May 2018

Which gig/contract "data science" option is suitable for your organisation?


In my previous blog, I argued that few organisations should have full-time “data scientists” on their books. In this blog, inspired further by comments I received, I explore who the “data science” functions should be outsourced to.

To simplify things, I will start with 4 options.

Large Management Consultancies with “Data Science” arm
The first, most obvious, option is to go with the big names. The big management consultancies (Bain, BCG, McKinsey, to which you may add, say, Accenture) and the big accountancy firms (Deloitte, EY and, to a lesser degree, PWC and KPMG) are all trying to establish a practice to gorge on the “data science” pie in the sky. So if you are from a huge organisation, then chances are that someone, likely high up in your organisation, has been approached by one or more of these.

I am definitely not against management consultancies getting into “data science”. In fact I believe that an organisation’s “Analytical Maturity” is key to its ability to make use of data and become “data driven”. The journey is therefore much more than just applying some models/algorithms here and there; the ability to consume and exploit them is critical. (In my previous blog (1), I kind of proxied that by “variety of ‘data science’ projects”.)

The question is, can you afford these, is the RoI (Return on Investment) worth it...

Individual “data scientist”
On the other extreme, you can choose to pick an individual and either kick-start your analytical journey by showing quick-wins and good RoI, or start building an analytical culture from scratch – in which case the individual should be well rounded (it’s more akin to getting a sort of CDO).

Niche “data science” consultancies
A mid-way solution would be to engage niche consultancies specialised in “data science”. This is kind of the best of both worlds (or the worst); you get a group of people who may have complementary skills/specialisations, without the extreme overheads of layers of management and partners.

Technology Luminaries?
How about technology luminaries such as Cloudera, HortonWorks, MapR, Google, AliCloud... you may ask. Personally I think technology is a very important component of “data science” (after all, many algorithms have been available for years but the compute capabilities required weren’t ready); however, technology is a tool. So, while I believe collaboration with these behemoths is a good way to go to equip your organisation, they should not be leading the efforts; carts and horses...

Positioning vis-a-vis Data Science Venn Diagram
If you look at the Drew Conway data science Venn Diagram (2), it becomes easier to see. The diagram below is inspired by the Drew Conway version, updated if you want.




Large Management or Accounting consultancies come from the area of Substantive Expertise and, in order to reach data science, have hired people with IT and Maths/Stats skills to form “Data Science” capable teams.

Individuals have specific skills, or a combination of skills, and can come from any area of the Venn Diagram (although, given the depth of knowledge and effort required to be in the centre of the diagram, they are less likely to be there).

Large Technology companies have an abundance of IT/Hacking Skills, and are picking up Maths/Stats, very often ignoring the “Substantive Expertise”, relying on “Machine Learning”/”AI”/”Deep Learning”. While it is debatable whether proponents/practitioners of machine learning today have enough maths/stats understanding, I think most would agree that there is a lack of business knowledge.

Some ML/AI/DL practitioners see this as a benefit, believing that all you need is data; I do not agree.

Small consultancies, like individuals, can be anywhere in the diagram, but mostly are a combination of 2 aspects, and most I am aware of are in the Machine Learning space, being staffed or started by people with strong Computer Science backgrounds. I am not saying that niche consultancies have no domain knowledge; they may have, but either it comes from the IT side (as someone recently reminded me: working in the IT side of a bank doesn’t mean you understand how a bank works, but you understand how the IT in a bank works) or it is quite specialised, since the number of such experts is likely to be limited. They are after all niche.

What does this mean to your organisation?

If you are large enough and the kind of transformation you are willing to undertake is big enough, by all means engage a large management consultancy with a good “data science”/Analytics practice.

If your organisation is not that large, or prefers to spend more conservatively (could be the same amount but stretched over a longer period), then you could choose to find a niche consultancy and supplement them with your own substantive expertise (adding burden to your staff, since the person should more or less be embedded with the niche vendor). The problem is that the talent pool is quite limited.

An alternative would be to constitute a team from individuals with various skill sets; you may even get an external expert with substantive expertise. The issue with this is that not everyone has the knowledge to hire specialist skills, and screening and choosing people can be quite costly.
I am not sure why any organisation would look to technology companies to provide “data science”; it is just not their forte, and you would be better off pairing them with one of the above options to form a slightly more rounded team, more likely to reach the centre of the Venn Diagram.

Is there another option?

Well I expect Sesh (hi Sesh!) to say he knew it was coming, but frankly I didn’t. My blogs are usually written as I think about something, so they are quite raw (and the diagrams worse... The only blog that took me a lot of effort to write was the one about Hindu temple builders being data scientists). When I started thinking about this question, I genuinely expected to end up with conditions where each of the “data science” providers would have a place (except the large technology companies who really can only play an important supporting role, not a driving role). However ...

The long prescription

If your organisation is really large and serious about really transforming, then you should look at a large Management Consultancy with an analytics/”data science” arm. They have the scale and capability to help you without adding much extra workload on your organisation, whereas you would have to manage the process yourself if you engaged niche consultancies or even individuals.

Organisation size is important – some of the large Management Consultancies would not even consider engaging with small organisations – but so is the analytical maturity of the organisation. Large Management Consultancies really come to the fore when there is an organisational transformation, since they can leverage not only the analytics/”data science” arm, but also the change management and associated skills where their traditional expertise lies.

If you are not large enough and/or you are not looking for organisational transformation, then you would be better off looking elsewhere. That is the whole point of analytics/”data science”; you run small experiments and keep what is good, chuck what is not. You do not need an army for that (as I mentioned in my earlier blog).

The advantage of working with individual “data scientists” is that you get dedicated people who you know or will get to know and who can fit well with your organisation (else you can always get someone else). The fit with the organisation can be experience in the specific subject area, ability to fit culturally with the organisation... and should not be underestimated.

However, no one individual will have the breadth of skills that you may need. Of course every data scientist can be adequate at a whole range of skills and subject areas, but no one can be a specialist at everything. Furthermore I believe analytics/”data science” is a team sport; you need more than just one person once you reach a certain level of maturity, especially when you are operationalising.

Hence niche consultancies look good.

The main advantage of niche consultancies is that they have a decent breadth of skill-sets under one roof, and you may be able to access these skill-sets as and when you need them. Proper niche “data science” consultancies would at least have a team of people covering the three circles of the Drew Conway diagram.

Looking at a “data science” project as a flow:


It takes a lot of different skill sets to have a successful “data science” project. For example: solution architects to design data flows, database experts and data engineers to manage the data (especially for operationalisation), domain experts to interpret and craft the data, “data scientists” to build models/algorithms, visualisation experts to help take action, among others... Not that this is a full list of roles, nor that these are all specialised people, nor that all are external to an organisation, but it gives an idea of what is needed to operationalise “data science” and of the advantage a niche consultancy would have over an individual, or a group of individuals brought together for a specific project.

However, when you get into consultancies, you enter the realm of overheads. While an individual has very few overheads, the larger the niche consultancy the more overheads: management, sales, adjustments for time on the bench, office space, administration expenses... Furthermore you may lose the advantage that individual consultants, free from official partnerships/administrative work..., are able to learn and grow in their areas of interest, as opposed to what the consultancy requires/prefers.

Furthermore, niche consultancies are niche because of some specialisation, be it by industry verticals or by function, or by a combination. Therefore you need to find the right partner. One thing you may lose out on is cross-pollination, an emerging trend in “data science” where ideas and techniques from a different field/industry are modified and reused.

So what is the solution?

The solution is to find networks of people who work together, of course preferably with an emphasis on analytics/”data science”. There are quite a few organisations that offer this “new” approach to analytics/”data science” and a few variations. My views are based on the ones I am familiar with.
Such an analytics/”data science” focused network basically combines the advantages of individual people with those of niche consultancies, adding the potential for cross-pollination, without the disadvantages such as high overheads.

An organisation is free to choose the right person while that person benefits from being part of a network for learning and support, or a group of people with complementary skill sets to deliver an analytics/”data science” project without the huge overheads. Furthermore, all members have access to common resources and a support group of fellow experts just like larger formally organised consultancies.

What do the individual consultants get from joining such a network? Simple, the opportunity of working on projects they did not uncover themselves, of working on bigger projects than they could have by themselves, of learning from peers with similar mindset whether via discussions, training, or participating in projects in different roles.

Looks like I did end up with some simple rules of thumb...

In sum:
  1. If your organisation is large and serious about transformation, go for a large management consultancy with a “data science” practice and run a comprehensive transformation programme.
  2. If you have very specific needs and know an individual or a niche consultancy with reasonable overheads that exactly suits these needs, then go for either of these options, bearing in mind where you will need to manage/supplement.
  3. For other cases, go for a network of analytics/”data science” experts that incorporates the advantages of the two options above without the disadvantages.
  4. In general, use large technology vendors for technology rather than for “data science” services; horses for courses.
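For readers who think in code, the rules of thumb above can be written down as a small decision function. This is only a rough sketch: the function name `suggest_provider` and its three yes/no inputs are hypothetical simplifications introduced here for illustration, and real decisions will of course involve more judgment than three booleans.

```python
def suggest_provider(is_large: bool,
                     wants_transformation: bool,
                     has_specific_need_and_known_fit: bool) -> str:
    """Map the rules of thumb onto a provider type.

    All three inputs are hypothetical simplifications of the rules above.
    Rule 4 (technology vendors supply technology, not "data science"
    services) applies in every case, so it is not a branch here.
    """
    # Rule 1: large organisation that is serious about transformation.
    if is_large and wants_transformation:
        return "large management consultancy"
    # Rule 2: very specific needs and a known individual/niche firm that fits.
    if has_specific_need_and_known_fit:
        return "individual or niche consultancy"
    # Rule 3: everything else.
    return "network of analytics experts"


print(suggest_provider(is_large=False,
                       wants_transformation=False,
                       has_specific_need_and_known_fit=False))
```

The order of the checks mirrors the order of the list: the first rule that applies wins.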

So what does this mean for independent “data scientists” and niche consultancies?

Join an analytics/”data science” focused Network! A large portion of demand generation is via networks anyway, hence networking is not new to independent “data scientists” and niche consultancies. But it is an advantage to join more formal networks and get the benefits from there.
Individuals would benefit from joining networks by broadening their knowledge and gaining the ability to participate in larger projects. Note this does not have to mean loss of independence; in fact, as long as there are no fees or other commitments, there is no downside for an individual to join a network.

Niche consultancies would still have high overheads, because of their structure, but they would at least gain the ability to collaborate with other members of the network, allowing them to continue specialising while broadening the scope of projects that they can embark on.

Eventually, the choice of which network to join will be the critical one. The right network has to bring value to the individual or the niche consultancy. Value can be measured in many ways, and not all are purely monetary. As Doc argued (3), the values of the leader (and of the network) are critical.

P.S.
Personally I believe the labour market is changing so much that we will soon be back to the times where most of us would independently be selling our skills rather than being “full time employed” especially with benefits; back to middle ages/very early industrialisation.


Wednesday, 9 May 2018

Should you hire data scientists on gigs or as FTEs (Full Time Employees)?


Recently a conversation with a client caused me to re-think this situation. Yes, I fundamentally believe analytics/”Data science” is a comparative advantage an organisation can possess. However this need not mean an FTE (Full Time Employee) army, or does it?

As in most such questions, the answer is… "it depends".
To make things simple, I’ll consider 3 points of view:
A "Data Scientist"'s point of view
B Hiring Manager's point of view
C Organisation's point of view

Of course the ideal situation is where all 3 parties’ interests are in agreement.

I simplify the issue by measuring the status on 2 axes:
X. Number of “data science” projects undertaken in a year
This is an indication of how often the skills of the “data scientist” are being used. These have to be projects that go beyond BI and have a predictive component. A “data scientist” is an expensive resource and it is a waste, from all parties’ point of view, to use a “data scientist” to build dashboards most of the time, for example. There are other people much more skilled at this than “data scientists”.
Y. Variety of projects undertaken in a year
The variety of projects is an indication of how far down the path of utilising “data science” an organisation is, or how broad its adoption of, or experimentation with, “data science” is.

These 2 axes can be thought of as components of analytical maturity – necessary but not sufficient conditions (I will discuss analytical maturity in a subsequent blog). The more “data science” projects you undertake in a year, the more likely you are to be using them. More importantly, the more varied the types of problems you are trying to solve using “data science”, the more likely it is that the adoption of “data science”, or the attempts at adoption, permeate the organisation.

However, this looks at only the production side of things, not the consumption. There are many organisations out there that adopt an “if you build it…” approach and end up with white elephants.

At the end of this blog, I’ll describe an Occam’s Razor, but for now let’s look at things a bit more management school style.

A “Data Scientist”'s point of view


The top right is a sweet spot for the “Data Scientist”; he/she gets to continually learn things and use the skills in a variety of projects. This is a happily tired “Data Scientist” where it’s not work, just play.


The bottom right is where the “Data Scientist” is kept busy on projects that require his/her skills and knowledge, but these projects are repetitive. This can lead to boredom, and the palliative is to build strong feedback loops to keep improving and challenging the “Data Scientist”. Else it might make sense to rotate “data scientists” and bring in new pairs of eyes to try and bring quantum improvements rather than continual polishing.

The top left is where the “Data scientist” gets a variety of projects but is under-utilised, a stop-start adventure. This is likely the case of an organisation that is starting with data science operationalisation, or is not mature enough to exploit “data science” fully. In this case, it might be better to hire specialist “data scientists” on an as-needed basis. This would be a much better use of resources and would make better use of “data scientists’” skills.

The bottom left is where no “data scientist” would want to tread; apparently he/she did not ask the right questions in the interview.

B Hiring Manager's point of view

I was going to write an analysis of a hiring manager’s point of view, but I think it is enough to say that if their interests are not aligned with those of the organisation (principal agent problem) then it doesn’t matter what is really going on; all that matters is the impression you can give. It is an ego or resume padding trip and no reasoning can be applied.

C Organisation’s point of view
The top-right is the sweet spot for the organisation; the “data scientist” is involved in many projects (well utilised) and a variety of projects (organisational analytical maturity). Chances are organisations in this quadrant are able to generate sufficient RoI (Return on Investment) from their “data scientist”. But does that mean that organisations in this quadrant should not hire data scientists on gigs? The answer depends on what the “data scientist” can bring.

The bottom right is where the “data scientist” keeps doing the same things. Basically, at every iteration of a model, unless there have been structural changes, there will be incremental improvements. The question is whether this generates sufficient RoI for an FTE.

One of the questions clients (both when I was FTE and on gigs) often ask me is how they know when a model needs to be relooked at. My usual answer is all to do with metrics. In the same way as I insist on proper performance metrics of models being agreed at the start of a project (to ensure RoI), I also encourage clients to define acceptable variability in results. It’s a bit like having a point estimate and a band of acceptable values around it (like a confidence interval, CI). So only if the results degrade below an acceptable level would it be worth looking at (at that point you cannot assume that the degradation is due to random fluctuations, and the impact on returns is too negative).

So chances are, unless each application of the model generates huge returns, it might make more sense not to have a “data scientist” on the payroll but to use a “data scientist” on gigs (1).

The top left is where the “data scientist” is engaged on a variety of projects but not many projects. This is a clear case that a “data scientist” by gig is better. Not only do you utilise resources only when you need them, but you can ensure you get the best resource for the project.

The bottom left shows an organisation that has no use for a full time data scientist, but should explore “data science” via gigs.
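To make the four quadrants concrete, here is a minimal sketch that places an organisation on the two axes and looks up the organisation-side reading described above. The cut-off values (`min_projects`, `min_types`) are purely hypothetical; the post does not define where "many" or "varied" begins, so any real use would need thresholds agreed for the organisation in question.

```python
from dataclasses import dataclass


@dataclass
class YearOfProjects:
    n_projects: int  # X axis: predictive ("beyond BI") projects in the year
    n_types: int     # Y axis: distinct kinds of problem tackled in the year


def quadrant(year: YearOfProjects,
             min_projects: int = 6,
             min_types: int = 3) -> str:
    """Return 'top right', 'bottom right', 'top left' or 'bottom left'.

    The default cut-offs are hypothetical placeholders, not values
    taken from the post.
    """
    vertical = "top" if year.n_types >= min_types else "bottom"
    horizontal = "right" if year.n_projects >= min_projects else "left"
    return f"{vertical} {horizontal}"


# Organisation's point of view, paraphrased from the text above.
RECOMMENDATION = {
    "top right": "FTE can pay off; gigs only if they bring something extra",
    "bottom right": "FTE only if each model run returns enough; else gigs",
    "top left": "gigs: use the best specialist, only when needed",
    "bottom left": "no FTE; explore via gigs",
}

# Few projects, but varied ones: the top left quadrant.
print(RECOMMENDATION[quadrant(YearOfProjects(n_projects=2, n_types=4))])
```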

Conclusion:

It is quite clear that the best case scenario for both the “data scientist” and the organisation is when the “Data scientist” is fully engaged, building a variety of models/algorithms to solve different business use cases and generating RoI.

In the rest of the cases, the situation is unlikely to last. In cases where there is either variety or quantity but not both, the “data scientist” is likely to feel lack of growth and leave, causing the organisation to go through the expensive hiring process again and again, further impacting RoI.

In the case when there is neither quantity nor variety, there is no point engaging a full time “data scientist”. This is a very clear-cut case for having “data scientists” on a gig basis. The approach should be one of proof of concept, prove the RoI that can be obtained, using “data scientists” on gigs; not only is the cost overall lower, but the organisation can get specialists.

Similarly, when there is variety but lack of quantity, “data scientists” on gigs offer the possibility of specialist help, and help only when needed. When you do not have projects requiring the skills of a “data scientist”, why pay someone?

When there is no variety, the challenge is different. It is likely that this is the case when the analytical development of the organisation has stalled; for example, “data science” is used only in one area of the organisation, hence a lack of variety. Here again it might make more sense to only get “data scientists” on a gig basis, to try and expand the variety and increase RoI by opening new avenues for returns. Another way of increasing RoI in this case is to hire “data scientists” for prototyping, but leave maintenance to less expensive resources (2).

Finally, does that mean that if you have both variety and quantity you do not need “data scientists” on gigs? Well, it depends how well your RoI is doing. In the corporate world, where performance has to increase over time, using specialist help for prototyping, having a “different set of eyes” looking at business issues, can be a solution. Of course you may choose to rotate your “data scientists” to ensure that freshness, but it could come at the cost of returns.

In a nutshell:

In a nutshell it all depends on the RoI. When I first took up a contract, it was very exciting as someone in the analytics field. My headcount was directly funded by the business and I had to justify my existence every year; I had specific RoI targets. That made me so alive. I guess that’s why I believe so strongly in RoI.

Apart from vanity and bragging rights on the part of management, why would an organisation spend on a resource that is generating low or no RoI?

Organisations have to measure the returns on their investments; that includes human capital. As long as the RoI is met, then hiring an FTE “data scientist” makes sense. Else, at the least, until the organisation matures enough to be able to allow the “data scientist” to generate that RoI, it should stick to “data scientists” on gigs.


1 There have been cases when clients want to put data scientists on a gig but also pay a retainer. This is sort of a compromise between the 2 models. However, retainers may not work. From the organisation’s perspective there will be an incentive to use the “data scientist” for non-“data science” work.
2 Another question I often get is “when do I need to review my models?”. I believe that the model metrics should not be decided just for a one time use, but also for on-going performance. For example, a targeted response rate of 15%, where review will take place if the response rate dips below say 12% for 2 consecutive runs.
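The review rule in the example above (review once the response rate dips below 12% for 2 consecutive runs) is simple enough to sketch in a few lines. The function name `needs_review` and its parameters are hypothetical, introduced here only for illustration; the 12% floor and the 2-run window come from the example, and in practice both would be agreed with the business at the start of the project.

```python
from typing import Iterable


def needs_review(response_rates: Iterable[float],
                 floor: float = 0.12,
                 consecutive: int = 2) -> bool:
    """Return True once `consecutive` runs in a row fall below `floor`.

    A single bad run is treated as possible random fluctuation and
    does not trigger a review on its own.
    """
    streak = 0
    for rate in response_rates:
        # Extend the streak on a bad run, reset it on an acceptable one.
        streak = streak + 1 if rate < floor else 0
        if streak >= consecutive:
            return True
    return False


# Two isolated dips below 12%: no review yet.
print(needs_review([0.15, 0.11, 0.13, 0.11]))
# Two dips in a row: time to look at the model.
print(needs_review([0.16, 0.11, 0.10]))
```

Requiring consecutive dips is what keeps the rule from reacting to noise; a stricter or looser window is a business choice, not a statistical one.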