Microsoft shut down Tay and apologised for the way it was
interacting with people after 16 hours of being online. (http://www.theguardian.com/technology/2016/mar/26/microsoft-deeply-sorry-for-offensive-tweets-by-ai-chatbot).
When I read about this experiment, I was immediately
reminded of Migi, a character from ‘Parasyte: The Maxim’. Migi was a creature who
started essentially from a blank slate and had to learn about the world on its
own, and decided to access the internet and all its resources. Migi had been
called a “demon” by the human Shinichi, and since this concept was alien to
Migi, it decided to spend a night understanding what the word meant. The conclusion,
as illustrated above, was that humans are “the closest to” a “demon”.
It was a very interesting thought, and one I felt was not
necessarily wrong.
Migi was like an AI engaged in unsupervised learning.
Shinichi was asleep and Migi had the whole internet to play with and learn
from. It was not said whether Migi entered into chats, but even assuming Migi
just browsed, I do not think we can easily reject the conclusion.
Unsupervised learning can show us unexpected things, which
we may or may not like.
We should also bear in mind that the internet is not a
random medium: people who ‘have something to say’ are more likely to go online
and say it, and these people are not usually proponents of the status quo. (http://www.huffingtonpost.com/2011/03/29/internet-polarizing-politics_n_842263.html).
Hence Microsoft decided to make changes to Tay. It opted
for supervised learning: “it would revive Tay only if its engineers could
find a way to prevent Web users from influencing the chatbot in ways that
undermine the company’s principles and values”. Basically, Microsoft decided that
the AI could not be trusted to learn from interactions with just about anyone.
It sounds like the AI needs to go into a controlled environment (school) before
growing up, establishing principles, and being released into the world again.
In a business environment, I think this makes sense. First of
all, we must recognize that despite “Big Data”, the amount of information
unsupervised learning models are based on is limited. Secondly, “correlation
does not mean causation”, and humans know to take things with a pinch of salt,
that is, to attach some statistical uncertainty to facts.
For example, imagine a business rule that immediately offers a
supplementary card to the spouses of approved high-end credit card
customers. If the business rule is not part of the mass of data available for
unsupervised machine learning, all the machine would see is high card-issue
rates for this segment, and you could end up with a rule such as ‘offer cards
to spouses of high-net-worth customers’, whereas in Singapore, for example, you
need salary information to get a card. It gets worse if the rule is hidden by
being refined: ‘offer cards to ladies living in this district who do not
have existing cards’, or ‘offer cards to men aged 20 to 40 who use their ATM in
the vicinity of these schools between 8 and 10 am or 3 and 5 pm…’.
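A minimal sketch of how this could happen, using entirely made-up numbers and a hypothetical approval policy: a hidden business rule (spouses of approved customers get a card automatically) contaminates the historical data, so any model mining that data will find a strong “spouse implies approved” pattern without ever seeing the salary requirement behind the real policy.

```python
import random

random.seed(0)

# Hypothetical simulation: the *real* policy is "approve if the salary
# requirement is met, OR if the applicant is the spouse of an approved
# customer" -- but only the outcomes end up in the data, not the rule.
applicants = []
for _ in range(10_000):
    salary_ok = random.random() < 0.3           # meets the salary requirement
    spouse_of_customer = random.random() < 0.1  # spouse of an approved customer
    approved = salary_ok or spouse_of_customer  # hidden supplementary-card rule
    applicants.append((salary_ok, spouse_of_customer, approved))

def approval_rate(rows):
    return sum(approved for _, _, approved in rows) / len(rows)

spouses = [r for r in applicants if r[1]]
non_spouses = [r for r in applicants if not r[1]]

print(f"approval rate, spouses:     {approval_rate(spouses):.2f}")      # 1.00
print(f"approval rate, non-spouses: {approval_rate(non_spouses):.2f}")  # ~0.30
# A model mining this data would happily emit the rule "offer cards to
# spouses", never having seen the salary check that the real policy requires.
```

The point is not the specific numbers: any segment touched by a rule that is absent from the training data will look, to the machine, like a segment with an inexplicably high approval rate.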
Hence, there are reasons for supervised learning.
But what is equally interesting is that the full statement
from Microsoft makes reference to earlier experiments (http://blogs.microsoft.com/blog/2016/03/25/learning-tays-introduction/).
Tay’s predecessor was “XiaoIce” (https://en.wikipedia.org/wiki/Xiaoice),
which was launched in China without such negative impact. Microsoft wanted to
replicate the experience in the US, and blames “a coordinated attack by a
subset of people exploited a vulnerability in Tay”. Does this make Tay’s experiences,
and therefore its growth via learning, any less real? This is what happens in real
life, and most of us make it through.
Microsoft is working hard at improving: “To do AI right, one
needs to iterate with many people and often in public forums. We must enter
each one with great caution and ultimately learn and improve, step by step, and
to do this without offending people in the process”. That sounds like supervised
learning to me, the kind most of us went through.
In conclusion, I would urge caution at this stage with regard
to unsupervised learning: we first need to make sure the results make sense. Secondly,
we need to recognize the ingredients that went into the learning, and understand
that changes in the environment, for example, will limit the applicability of any model.