I am now working with a team of people, and have
conducted a fair number of interviews during which I tried to be as informative
as possible. That also gave me the opportunity to prepare my sales pitch: why
you should join or not join my team (I am perfectly alright if someone decides
‘this is not for me’; I’d rather that than deal with disappointed people).
This gave me a little list of things you should look out for if you are
interviewing for an analytics/data science role. (These obviously come on top of the things that apply to any role, such as whether you feel comfortable with the manager/supervisor and the organisation culture. I am also not talking about aptitude for a role in this blog post.)
First though, you need
to know yourself.
What excites you the most about doing
analytics/data science?
- The chance to be at the bleeding edge of algorithms and trying them out?
- The chance to use bleeding edge algos to squeeze the last few % points?
- The chance to solve business problems and receive immediate feedback?
- Eventually heading a lab where cool algos come out (or running such a start-up)
- Develop sharper and sharper algos to squeeze further micro %
- Out on the speaking circuit promoting your knowledge and bagging consulting pieces
- Keep solving business problems, but more complex ones (or in such a start-up)?
Actually, this second question became important
to me a few years ago, when the team I was part of was struggling under
bad management and we tried to get our collective mid-management act together.
It is a question one of my fellows asked the staff, and the variety of answers
opened my eyes.
Now I will be very
crude and create some personas, like the good old times of simple segments.
Please note that I am not saying this is an
exhaustive list, or that there are no people with combinations I have not shown; it’s
just that these are the types of people I have mainly met during my time in
this line of work.
So we have roughly 12 personas; but
before we go there, I’d like to place my favourite “data science” diagram, Drew
Conway’s(1):
A unicorn data scientist will be
proficient in all 3 of the skill sets in that diagram (hacking skills, math and
statistics knowledge, and substantive expertise); if you have been reading my previous
blog posts, you will know I don’t believe many people have the time/skill-set to be
unicorns.
1 The Scientist – from bleeding edge algos to algo creation
The key ingredient missing from
professional PhDs is the experience with real life datasets. While there are
efforts to make datasets available to the public, most companies rightly guard
their data trove.
There are 2 main places where “the
scientist” would fit.
The first is start-ups whose remit
is to develop new algorithms, quite detached from the commercial world, like
a private lab. The main criteria to look out for would be:
- the line of work/interest compatibility
- the technology compatibility
- the general mission/vision of the organisation
The second would be
“labs” at more established organisations, where further criteria should
be added to the above:
- track record of lab at delivering output (if career growth matters)
- degree of independence of lab and collaboration with parent organization (data)
- aim of the parent company (it’s never altruism for long)
2 The Practical Scientist – from bleeding edge algos to sharper algos in the real world
The practical scientist is similar
to the people making the transition from a strong theoretical background to the
commercial world.
It is very unlikely that the
“practical scientist” would find one organization that he/she can
grow and change together with; hence the choice would be of a good stepping stone.
Here, compared to the pure scientist, an academic environment can be
useful. Things to look at:
- Collaboration partners of the academic institution (industry problems to deal with)
- The mode of collaboration with the partners (this often is a breaking point, where the academics think they know everything and ignore the business view, but this is unlikely for someone who intends to truly help the business at a later stage of his/her career)
The other option would be to follow
the path of the scientist for a while, but the next step has to be borne in
mind.
- Commercialisation track record
- Degree of collaboration with partner organization
3 The Teaching Scientist – from bleeding edge algos to the speaking and consulting circuit
The speaking circuit is a great
place for skilled people and people with opinions to spread the word, and also
for learning from others. That said, the speaking circuit for analytics/data science
may have fewer controls and lower standards than areas where there is a longer
legacy and more established circuits. (There is a guy in Asia who calls himself
“Chief Data Scientist” of a famous bank in Asia, but he definitely is not:
while he participated in the “data science process”, he is more on the infrastructure
side. He is nonetheless announced and publicized as “Chief Data Scientist” at
some conferences, which tells you about the quality…)
The best way to get on the circuit –
and the honest way – is to have an achievement that buys you a place at the
table.
Hence the criteria for the first
role are similar to the two types above, but there is a very special focus on:
- Track record in delivering measurable and transferable output
4 The Scientist Practitioner – from bleeding edge algos to impacting complex business problems
This is a rare beast, especially
since this is a planned change. From my point of view, analytics/“data science”
should be used to benefit people. Hence the scientist practitioner is someone
who decides to first hone his/her technical saw to scalpel-like sharpness, then
use it to dissect complex business issues.
Again, the best place for such an
individual would be to join an academic institution with strong business ties,
so on top of the previous case, the type of partnership with the commercial
entities is even more important.
- Track record of embedding scientists into partner organisations for duration of project
This will ensure the person is well equipped with real subject matter expertise before making the jump into trying to solve complex business problems.
5 The Kaggler – from squeezing the last few % to building algorithms
The kaggler is someone who enjoys
squeezing the last few percentage points from algorithms, doing his/her utmost
to optimize the algorithms.
As such, if the person starts in the
commercial world then the organization he/she should target should:
- Be in a highly competitive market where 1% makes a huge difference
- Have a culture of experimentation with very rapid cycles
- Have a variety of offerings so that optimization can take various forms
Then, once the person has proven his
or her worth, and before burnout sets in (in some such organisations the tour of duty
rarely exceeds 2 years once you survive the first 6 months),
the next step would be to join an
organization where all of this experience will allow the development of
custom-built algorithms:
- Strong related/same industry focus
- Start-up, niche player, or specialized academic institution.
6 The Survivor – Kaggle style – from squeezing the last % to squeezing the last .%
Another dedicated creature plays
survivor kaggle style. As mentioned above, fighting for the last few % of gains
in a highly competitive market takes quite a toll and while the learnings are
fabulous, the price to pay is a high burn out rate.
Frankly I am in awe of people who
can do this, amazing stamina; and for them to thrive, the organization has to
provide an optimal environment, so on top of the above criteria for the
Kaggler, there should be
- Competitive but sharing learning environment
- Flexibility in work conditions: time/location, purely results focused
The basic idea is to allow such
people to focus on their thing, and to remove distractions such as admin and HR
chores, which just clog up bandwidth and create frustration.
7 The show-off Kaggler – from squeezing the last % to joining the speaking/consulting circuit
The show-off kaggler is someone who
plans to take his/her hard-earned experience and use it for a less hectic
life on the speaking and consulting circuit. I personally think it is a good
plan; the only thing to watch for is having enough consulting work to ensure
the individual keeps up to date and the advice stays practical.
The first step is similar to the
kaggler’s, and the second would be to associate themselves with organisations
that support this lifestyle and provide more options for keeping in touch with
‘the real world’, rather than depending purely on one’s own power on the
speaking circuit:
- Loose network of analytics/data science experts who exchange views relatively freely
- Combination of people with skills across the spectrum required for analytics projects
- Able to generate high quality opportunities/projects (or deliver on them in case the show-off kaggler generates opportunities)
8 The misplaced kaggler – from squeezing the last % to solving complex business problems
This transition is one you see quite
often across organisations, but personally I think the success rate is not
likely to be that high. On paper the idea makes sense from the point of view of an
organization trying to build or grow a “data science” team: you take someone
successful from a highly competitive environment, with a track record, and ask
him/her to transfer those skills to make your organization as successful.
However, I would say that being
successful in dealing with complex business problems requires more than great
kaggling skills: domain knowledge, a good deal of EQ, and the ability to switch
from chasing the last decimal points to “good enough” are also required.
Hence the first organization has to
prepare the kaggler for such a role, so on top of the criteria for the kaggler:
- Allow a variety of projects on top of just tactical optimisations
- Have a lower degree of segregation of roles (Data engineer/data scientist…)
Then the second organization must
also be appropriate for the kaggler who is switching tack. It should have:
- an existing team the kaggler could learn from, leadership does not mean dictatorship
- good clearly defined roles and responsibilities
- a management that understands the transition the kaggler will have to undergo
- a strong data engineering counterpart (not just IT) who can collaborate with the kaggler.
It is also possible for the kaggler
to join a network of analytics professionals who have practical experience to
spare, can generate interesting projects, and are willing to collaborate with
the kaggler to achieve better results. Of course there needs to be mutual respect and appreciation of each other’s strengths.
9 The visionary geek – from solving business problems to creating algos
This is a very interesting path to
take. The first stint allows the person to understand the business much better,
see the deficiencies in the way things are done, and then go fix these
deficiencies. Basically the first step is to gain substantive expertise while
developing analytical techniques. In order to achieve this, the person needs to
be proficient at coding to start with, but the organization needs to provide
the right environment.
- High level of analytical maturity (across the organization)
- Implementation of results of analytics with proper feedback
- See analytics/”data science” as a collaborative effort rather than a scientific project/lab
- Be in an industry where the person has an initial interest.
After the learning ground, I
actually believe the person should ideally have his or her own start-up. A
second choice would be to join an existing analytics systems vendor in that
space or, in the worst case, a niche business consultancy in that space. The
organisations should offer:
- Financing to do the required research and development
- Space for experimentation and clients willing to participate in experiments/co-develop
- Management that can provide all non-analytics support and advice, or else knows to stay away
10 The Specialist – from solving business problems to squeezing the last .%
The specialist is someone who starts
in a general “helping the business” role, then gets much more into
algorithms and becomes more technical, while still helping the business. This is a
strategy of depth/specialization rather than breadth.
For this to happen, the first
organisation the person selects must have some key attributes, similar to those
suitable for the visionary geek:
- High level of analytical maturity (across the organization)
- Implementation of results of analytics with proper feedback
- See analytics/”data science” as a collaborative effort rather than a scientific project/lab
- Have management buy-in to compete using analytics, especially in areas such as pricing, and even more so if prices are dynamic and require up-to-the-second optimisation
- Be in an industry where competition is truly fierce
If the first organisation does not
offer the challenge of optimising to the nth degree, then it is likely that
the person should look for an organisation that does:
- Same industry so the substantive experience gathered is not wasted
- Highly competitive
- Analytics/”Data Science” team driving at least one aspect of business (for example pricing)
11 Doer to talker (D2T) – from solving business problems to joining the speaking/consulting circuit
There are a fair number of people
in this category. Again, this has to do with the speaking circuit in the
Analytics/“Data Science” space being quite ‘unregulated’, with more demand
(especially from people willing to learn but who may only have 1 of the
components in Drew Conway’s diagram and are thus bound to be easily impressed)
and a willing supply of people; after all, it looks good on your CV. But in order
to stay on the speaking circuit and do some consulting on the side, the first
stop has to offer a few things:
- A brand name
- Recognition for taking part in conferences during work hours, or at least time off, or at the very least lax control (see the Chief Data Scientist in 3 above)
- Real use cases that clients are willing to allow you to speak about (brand here helps too)
For the second step, especially at
the beginning, it helps to be associated with a loose organization that offers:
- Quality projects of varying lengths to keep the saw sharp
- Flexible schedules so that the speaking engagements can be combined with projects
- Good network of skills in analytics/”data science” so as to fulfill opportunities the Doer to Talker may create
12 The Business Partner – from solving business problems to solving more complex business problems
There are also a fair number of
people in this category; the passion of these people lies in solving business
problems, and they see their career path as getting closer and closer to the
business. The dream is to create a truly data driven business organization.
They are also the people who end up being the most frustrated.
Hence the choice of set-up is
critical
- Analytics/Data Science team needs to report to business, not IT
- KPIs need to be set in terms of business objectives, dollars, incremental market share…
- Senior management buys into analytics and sees the project as long term
- The manager must have the clout in the organization to run experiments
- A manager who is knowledgeable (technical and business) or wise enough to focus on his/her strengths (usually relationships with business units)
There is no reason to change
organization if it allows growth and is on the path to being data driven, but
the reality is that very often organizations don’t, despite the above. So the business partner
should go in search of a different organization where:
- The organization structure is such that each member of senior management has a narrow, well-defined role (for example, if you have sales (short term optimisation) and marketing (medium to long term optimisation) reporting to the same member of senior management, then you know strategies will be muddled, or even contradictory: a disaster in the making)
- The head of analytics/”data science” has a seat at senior management meetings
- The head of analytics/”data science” is also aligned to making his/her team business partners
- There is a clear roadmap of how the organization will become more and more data driven
Alternatively, there always is the
option of joining a loose group of analytics/”data science” professionals where
you can share experiences and get to work on projects that would appeal to you.
The choice of network is critical.
- Network of reputable analytics/”data science” people
- Specialises in analytics/”data science” – does not do everything, but does enough to get measurable results
- Has the depth of people required (here to address complex business problems)
- Provides quality assignments
- Allows the freedom to choose assignments
Some extra questions
What if you are someone who just
wants to stay on the left hand side/vertical axis and does not want to move to the
horizontal?
Well, I think that if you are
starting your career, the least you should do is pick an organization with the
characteristics that suit the role you have chosen. Where you transition to
then becomes a second or third step. You might have to change organization in
view of your longer term goal, and there is nothing wrong with that; what is wrong
is staying in an organization that definitely does not suit you.
What if your aim is none of the 3
starting aims I listed?
Then maybe your participation in
the “data science” process is at a different stage, not at the “data scientist”
piece, and there is nothing wrong with that. Personally I don’t believe that there
are many people proficient enough in all 3 skills of Drew Conway’s diagram to be the unicorn
“data scientist”; in fact, in real life, I have a team with complementary
skills that I am trying to build into a data science team.
Or maybe you prefer to be on the
business side pushing the output of “analytics”/“data science”… or maybe you
need some experience before you choose a path for yourself.
What if you just want to become a
“data scientist” because you heard they make tons of cash?
Well, if you have good
connections in the business world, look good, and/or can talk, then you can join
one of the large management consultancies; they do pay well.
If you have any other questions, please pop
them in the comments on my LinkedIn page; I am putting my flame-proof suit on.
Conclusion
What I tried to do here is to give
some advice to people starting in the analytics/“data science” field, a field I
have been in for a very long time, since it was called “data-mining”. I
basically stumbled through my career, made a few interesting decisions and
learnt quite a bit along the way. So I have tried to help people learn at a
lower price than I did; after all, my blog is not that torturous, I hope.
Oh, any ideas which of the 12
personas I may be? Haha…
(1) http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram