The “Data Scientist” is dead; long live the programmer! The
world will soon not need “data scientists”, but instead programmers will rule
the world.
This brutal assessment comes from the World Economic Forum
(WEF), and I must say that their arguments make a lot of sense.
In this age of fake news, credentials are important; so what
is the WEF? Founded in 1971 and committed
to improving the state of the world, it is the International Organization for
Public-Private Cooperation (1). Its board of trustees includes world leaders
such as Jim Yong Kim, President of the World Bank, and Christine Lagarde,
Managing Director of the IMF, but also Jack Ma, founder of Alibaba Group, and
Mukesh Ambani, Chairman of Reliance Group, among others (2).
Furthermore, the “Centre
for the Fourth Industrial Revolution” is one of its major undertakings (3),
and one of its focus areas is artificial intelligence and machine learning (4).
So this is not fake
news, neither is it flimsy covfefe gossamer; sad!
In an article
entitled “You’ve heard about it, but do you understand? Everything you need to know about Machine
Learning” (5), the WEF gives a convincing, easy-to-understand description of
what machine learning is, how it can be used, and what the way forward is.
The article covers
the history of machine learning and its broad classes: supervised (the machine
is told what the different groups are), unsupervised (the machine is left alone
to find the different groups) and reinforcement learning (after acting, the
machine is told whether it was right or wrong, and this feeds back into the
learning cycle).
In illustrating the
concept of reinforcement ML, the WEF shows a clip where a machine learns to
play Atari Breakout – where a horizontally mobile paddle at the bottom of the
screen is controlled and the aim is to reflect a ball back to destroy an
obstacle made of bricks at the top of the screen. The results are amazing (6).
But even more astounding
is the commentary:
“The most important
thing to know is that all the agent is given is sensory input (what you see on
the screen) and it was ordered to maximise the score on the screen”
“No domain
knowledge is involved! This means that the algorithm doesn’t know the concept
of a ball or what the controls exactly do”
After 10 minutes
the machine is clumsy. After 120 it is playing like an expert. After 240, it
has developed its own strategy!
Isn’t this amazing?
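To make the “no domain knowledge” idea concrete, here is a minimal sketch. It is not the method used in the clip (which, as far as I know, relies on a deep neural network reading the screen); it swaps in tabular Q-learning, a much simpler reinforcement learning technique, on a made-up five-cell corridor. The environment, the function names (`step`, `train`, `greedy_policy`) and all parameter values are assumptions chosen for illustration. The agent only ever sees a cell number and a score; it is never told what “left” or “right” mean.

```python
import random

N_STATES = 5        # hypothetical corridor: cells 0..4, reaching cell 4 scores a point
ACTIONS = (-1, +1)  # action 0 moves left, action 1 moves right (unknown to the agent)


def step(state, action):
    """Toy environment: move along the corridor, reward 1.0 at the last cell."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning: learn action values from (state, reward) feedback only."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # cap on episode length
            # Explore with probability epsilon, or when both actions look equal.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward plus discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best learned action in each non-terminal cell."""
    return ["left" if row[0] > row[1] else "right" for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Nobody programs “go right” into the agent; that preference emerges from the score alone, which is the point the WEF commentary is making.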
So you will ask,
what does that have to do with the death of the “data scientist”?
Well, like
me, you may think that Drew Conway was on to something when he came up with the basic “Data
Scientist” skill diagram (7).
In this diagram, data science requires substantive expertise, or “domain
knowledge”.
What the WEF is agreeing with is that machine learning does not require “domain
knowledge”. Furthermore, there is no mention of the “data scientist” in the WEF
viewpoint; rather, the article highlights “the programmer”: ML “needs a programmer to tell it what
to do when it is fed with data.” Not a “data scientist”, a “programmer”.
But does that mean the death of the “data scientist”?
Yes it does.
Think about it: if a problem can be solved without “domain knowledge”,
then ML is all you need.
And what does ML need? Not the pompous “data scientist”,
but the humble “programmer” who had been cast aside and finally makes his/her
triumphant return.
There is a wave washing the veneer off the “data scientist” and
restoring that of the programmer. One example is an article arguing that coding is not fun, but
is technically and ethically complex (8).
What I find most interesting is
the acknowledgement of the complexity of coding (I am sure most people would
agree that it takes serious skill to develop industrial grade code), but also
the introduction of ethics into the mix.
To quote from the article: “As well as being highly analytical
and creative, software developers need almost superhuman focus to manage the
complexity of their tasks. Manic attention to detail is a must; slovenliness is
verboten. Attaining this level of
concentration requires a state of mind called being ‘in the flow,’ a quasi-symbiotic
relationship between human and machine that improves performance and
motivation.”
To push the idea further, the
latest triumph of machine over man was AlphaGo. AlphaGo was built by the same
organisation that developed the machine playing Atari Breakout above. And the
same principle was used: “no domain knowledge”, but lots of Monte Carlo
simulation. Basically, the machine learns which move is best by starting from the
current position, randomly trying different plays and evaluating the end
result of each. It then assigns a value to each play, picks the one that gives
the higher chance of winning (or lower chance of losing), and repeats the process at
every turn.
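The idea described above can be sketched in a few lines. This is only an illustration of plain Monte Carlo move evaluation, not AlphaGo itself, which combines tree search with neural networks and is far more elaborate. Go is too large for a short example, so the sketch uses a made-up stand-in game: a pile of stones, each player removes 1 to 3, and whoever takes the last stone wins. The game, the function names (`legal_moves`, `random_playout`, `choose_move`) and the number of playouts are all assumptions for the sake of the example.

```python
import random


def legal_moves(pile):
    """A player may remove 1, 2 or 3 stones, but no more than remain."""
    return [m for m in (1, 2, 3) if m <= pile]


def random_playout(pile, my_turn, rng):
    """Play random moves to the end; return True if we took the last stone."""
    while True:
        if pile == 0:
            # The side that just moved took the last stone and won.
            return not my_turn
        pile -= rng.choice(legal_moves(pile))
        my_turn = not my_turn


def choose_move(pile, playouts=2000, seed=0):
    """Score every legal move by its win rate over random playouts, pick the best."""
    rng = random.Random(seed)
    win_rates = {}
    for move in legal_moves(pile):
        # After our move it is the opponent's turn, hence my_turn=False.
        wins = sum(random_playout(pile - move, False, rng) for _ in range(playouts))
        win_rates[move] = wins / playouts
    best = max(win_rates, key=win_rates.get)
    return best, win_rates


if __name__ == "__main__":
    move, rates = choose_move(5)
    print("take", move, "estimated win rates:", rates)
```

Notice that nothing in `choose_move` encodes any strategy for the game. It only needs the rules (which moves are legal, who wins at the end) and enough random games to tell good moves from bad ones.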
No domain knowledge
No data scientist
One of the greatest triumphs of machine over man was enabled by programmers,
not data scientists (9).
The “data scientist”
is dead, long live the programmer!
Further readings/references:
(1) https://www.weforum.org/about/world-economic-forum
(2) https://www.weforum.org/about/leadership-and-governance
(3) https://www.weforum.org/center-for-the-fourth-industrial-revolution/
(4) https://www.weforum.org/center-for-the-fourth-industrial-revolution/areas-of-focus
(5) https://www.weforum.org/agenda/2017/05/what-is-machine-learning
(6) https://www.youtube.com/watch?v=V1eYniJ0Rnk
(7) http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
(8) https://qz.com/987170/coding-is-not-fun-its-technically-and-ethically-complex/
(9) https://www.tastehit.com/blog/google-deepmind-alphago-how-it-works/