A while ago, I wrote about a new
insurance product launched in Singapore that required you to submit your DNA as
part of the deal – you got ‘personalised’ advice in exchange. The ad
ridiculously showed two identical-looking twins receiving different advice
(since identical twins share the same DNA...). (1) In that blog-post, I mentioned
that the insurance company was at pains to stress that they had no access to
the DNA, but I raised the prospect of someone buying the company that collects
the DNA and not being bound by the same rules. Unfortunately, this prospect
is very real.
Let’s take a step back: am I
talking about DNA or Facebook?
These few weeks have been
exciting for people interested in data and “Big Data”, since the extent of the
data collected by Cambridge Analytica via Facebook, very often without the
subjects being aware, came to light (2). I have been going on about the need for
us to own our data but this really takes the cake: you were not only giving away
your data but that of your connections too (53 Australians took the test – and
possibly gained something – but the data of 311,127 was harvested. Similarly, 10
New Zealanders did so, and data from 63,724 was harvested. I am not saying there
were national boundaries, but these numbers give an idea of the scale of the pandemic).
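To put those numbers in perspective, here is a quick back-of-the-envelope sketch of how many profiles were harvested per person who actually took the test. It uses only the figures quoted above; the function name `profiles_per_taker` is just my own label, and I am assuming the harvested totals can simply be divided by the number of test takers (the totals may well include the takers themselves).

```python
# Figures quoted in this post: (people who took the test, people whose data was harvested)
harvested = {
    "Australia": (53, 311_127),
    "New Zealand": (10, 63_724),
}


def profiles_per_taker(takers, total):
    """Average number of harvested profiles per test taker (a simple ratio)."""
    return total / takers


for country, (takers, total) in harvested.items():
    ratio = profiles_per_taker(takers, total)
    print(f"{country}: roughly {ratio:,.0f} profiles harvested per test taker")
```

In other words, each person who took the test exposed, on average, several thousand other people.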
Ok, so people’s surfing habits,
likes, comments and photos they posted in public were accessed and used, but what
use can be made of this data? As the Time magazine article (1) mentioned, one
use was for Mr Trump’s presidential campaign. And as this article shows, the
efforts started in 2014 (4), and were very effective as confirmed by Mr Trump
himself (5):
“But they had this expression ‘drain the swamp.’ And I hated it, I
thought it was so hokey. I said, ‘that is the hokiest, give me a break, I am
embarrassed to say it.’ And I was in Florida where 25,000 people were going
wild, and I said, ‘and we will drain the swamp’ — the place went crazy. I
couldn’t believe it. And then the next speech I said it again and they went
even crazier. ‘We will drain the swamp… we will drain the swamp,’ and every
time I said it I got the biggest applause”
So we can say that the
data Facebook ‘allowed’ Cambridge Analytica to harvest from the subjects was,
at the very least, ‘useful’.
So what does that have to do with
DNA?
Basically if you think that
someone getting their hands on your surfing history and using it for their own
purposes without your consent is bad, what if they get their hands on your DNA?
The organisation that holds the
DNA for myDNA from Prudential is Prenetics Limited (7). Recently I read that
Alibaba and Ping An Insurance are the major investors in Prenetics (8). On one
hand, I find it amusing that Ping An possibly has access to data that
Prudential helped collect. On the other hand, I find it scary that the data of these
people (of course I did not purchase myDNA) is now in the hands of another
insurer.
Anyway, Prenetics claims that the
DNA of over 200,000 people across South East Asia, China and Hong Kong was in
their hands as early as October 2017 (9).
But, I am sure some nice people
will say, there is a legitimate reason to do research into DNA; hospitals and
universities have been doing so to the benefit of mankind for years. Yes, but I
would argue that the CEOs of hospitals and universities have different
experiences as compared to the CEO of Prenetics (Mr Danny Yeung) and that may
affect how the data is being used:
“Prenetics started out as ‘Multigene’ in 2009 when it span out from
Hong Kong’s City University. Yeung joined the firm as CEO in 2014, after
leaving Groupon following its acquisition of his Hong Kong
startup uBuyiBuy, and it has been in startup mode since
then. Prenetics has raised over $52 million from investors which, aside
from Alibaba, include 500 Startups, Venturra Capital and Chinese insurance
giant Ping An.”
This, I will admit, is pure
speculation on my part. For all I know, Prenetics really wants to help mankind
and bless everyone whose DNA they hold with better health and lower health care
costs (prevention rather than cure). But I have other reasons to be sceptical.
Basically, even if humans
‘decoded’ the whole DNA sequence (which hasn’t been achieved yet (10)), and even if
you have inherited a predisposition to a condition, nobody can tell whether you
will actually get affected by it:
“Genetic testing can provide only limited information about an inherited
condition. The test often can't determine if a person will show symptoms of a
disorder, how severe the symptoms will be, or whether the disorder will
progress over time.” (11)
And to make things more
interesting, the pieces of the genetic code that have not been sequenced were
considered useless or too hard to analyse given technological limitations, but
are now being re-evaluated. Does that sound familiar? For people in the “Big
Data” space (especially proponents of the “Data Lake”), it should.
One of the arguments of the “Data
Lake” is that we do not know what data can be useful; even if we cannot extract
it and use it now, we might as well keep it since it might be useful later.
When I first started in this line
of work, the kind of conversations I would have would be along these lines:
Q: “What data
do you need?”
A: “Just give
me what you have and I’ll analyse”
Q: “That is
impossible, tell me what data do you need?”
A: “Ok, can I
have the list of pieces of data that you have?”
Q: “That is
impossible, tell me what you want and I will see if I have it...” ad nauseam
Now technology and acceptance of
the usefulness of data have advanced and it is possible to “keep all the data”
in a “Data Lake”, or “Data Swamp” as some friends call it (Drain it! Drain it!
Sorry, I got carried away for a moment).
Pieces of data that we would have
had trouble analysing a few years ago such as weblogs, or pictures, or voice
recordings can now be analysed relatively easily. But these pieces of data were
routinely considered to be useless.
It is the same thing with DNA
data. And to make it worse, there is the uncertain link between being at risk of
some condition as per your DNA profile and actually getting that condition.
Basically, there is way too much
data that would be needed to transform this ‘risk’ into something that can be
measured with ‘enough accuracy’. That is what insurance companies try to do
when they ask questions about your lifestyle, smoking, drinking... but these
are very crude.
So is it fair that you could be
penalised because of a feature of your DNA make-up? Are we slaves of our DNA?
What I am getting at is not the
importance of DNA data, but rather at the care that must be taken when
conclusions are made, and people penalised for things they may not be aware of.
To make things more fun, not only
is Prenetics in China, Hong Kong and South East Asia, but it has recently
acquired DNAFit (12). This impacts Prenetics in 2 ways. Firstly, geographically:
DNAFit’s market presence is mainly in Europe and is expanding to the USA.
Secondly, DNAFit goes direct to the consumer whereas Prenetics tended to reach
the consumer via insurance or medical companies. (In fact, even LinkedIn is one
of DNAFit’s customers.)
The impact of direct-to-consumer
DNA kits is debatable (13), but “a little learning is a dangerous thing” (14);
add to this the emotional weight of ‘learning’ not necessarily pleasant things
about your own self...
So what I am saying is:
- As individuals we should have control over the data we produce by living (web/call/messaging behaviour, surveillance footage...).
- But we should also have control over data we produce by existing (DNA).
I think there are many gaps
between the general public (who have no issues with being facebook’s product in
exchange for a quiz (15)) and those who have some idea of what can be done with
such data; the same for DNA. And it is critical for people to be educated or
educate themselves on this. As long as there is such an asymmetry of
information, together with major issues with how people/machines use the data
(people/machines, not technology or data itself), the cost of exploitation can
be very high.
I would like to end this post
with the poem by Alexander Pope (14):
A little learning is a dangerous thing ;
Drink deep, or taste not the Pierian spring :
There shallow draughts intoxicate the brain,
And drinking largely sobers us again.
Fired at first sight with what the Muse imparts,
In fearless youth we tempt the heights of Arts ;
While from the bounded level of our mind
Short views we take, nor see the lengths behind,
But, more advanced, behold with strange surprise
New distant scenes of endless science rise !
So pleased at first the towering Alps we try,
Mount o’er the vales, and seem to tread the sky ;
The eternal snows appear already past,
And the first clouds and mountains seem the last ;
But those attained, we tremble to survey
The growing labours of the lengthened way ;
The increasing prospect tires our wandering eyes,
Hills peep o’er hills, and Alps on Alps arise !
5 https://thinkprogress.org/trump-drain-the-swamp-cambridge-analytica-steve-bannon-nrcc-09152b7a90b9/
7 https://www.prudential.com.sg/en/prumydna/mydnapromotnc/
(see point g: “‘myDNA report’ means the personalised report that Eligible
Customers receive from Prenetics Limited”)