A friend recently asked me to help him
understand the “gen ai thing” at a level that will allow him to have
discussions (and since he knows me well, he knows this comes with opinions). I
decided to go a level simpler, and try to explain Gen AI in a way my mum would
understand (she’s in her 80s and my recent victory was getting her to carry her
mobile phone when she is out of the house). I figured it would take me a
while, so I broke the explanation into smaller, more digestible pieces. Here is Part 1.
What is Gen AI?
Before we go there….
First, what is AI? (With apologies to my brother, Dr. AI.)
Humans are a very arrogant species, so we
decided that the way we think is something worth replicating. Hence, if we
could make machines think like humans, then we would have something fantastic. Basically,
machines don’t get tired easily, and you can expand the capacity of a machine
much faster than a human (hopefully (1)).
AI is basically that: how do we get
machines to think like humans?
So, what does it mean to think like a
human?
How do you think?
Let’s take a simple example (a simple
application of thinking like a human): you see a piece of furniture in a shop.
How do you decide that it is a chair or a table (assuming someone hasn’t
written “this chair/table for $xxx” on it)?
Enter Plato!
This is not a new question. Plato (~428-348 BC,
that’s close to 2500 years ago) came up with a theory of forms, and that made
me fall in love with PH102. The basic idea is that there is this world where
the perfect form of every item in our world exists. So, I thought that makes
sense! I know if something is a chair or a table by comparing it to the ideal
form: is it closer to the ideal chair, or the ideal table?
What does closer mean?
If you have read other articles by me, you
will remember I love talking about distance: closer means smaller distance. An
object a is more likely to be an A than a B if it is closer to form/ideal A than
to form/ideal B. This is easy; how you define closer is where the fun begins 😊
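For readers who like to see things in code, here is a minimal Python sketch of "closer means smaller distance". Everything in it is an assumption made for illustration: the three numbers used to describe an object (height of the flat surface, number of legs, whether there is a back), the values picked for the two "ideal forms", and the function names. Straight-line (Euclidean) distance is also only one of many possible ways to define closer.

```python
import math

# Hypothetical "ideal forms", each described by three made-up numbers:
# (height of the flat surface in cm, number of legs, has a back: 1 or 0)
IDEAL_FORMS = {
    "chair": (45.0, 4.0, 1.0),
    "table": (75.0, 4.0, 0.0),
}

def distance(a, b):
    """Straight-line (Euclidean) distance between two descriptions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_form(obj, forms=IDEAL_FORMS):
    """Return the name of the form with the smallest distance to obj."""
    return min(forms, key=lambda name: distance(obj, forms[name]))

new_object = (50.0, 4.0, 1.0)  # flat surface at 50 cm, 4 legs, has a back
print(closest_form(new_object))  # chair
```

Note that in this sketch the height dominates simply because it is measured in centimetres while the other two numbers are tiny; choosing and scaling the measurements is part of "how you define closer".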
Plato’s Theory of Forms
So, when I started playing with data, Plato’s
theory of forms helped me a lot. The main difference is that, since I can’t
access the world of ideals/forms, I have to base my version of a form on what I
have seen before.
The tables I had seen were 4-legged, came
up to waist height (since my teens), and had a large flat surface at the top so you
can put stuff on it. Usually they were made of wood, although the legs could
be made of metal. Chairs were shorter, below waist height, but also usually had
4 legs and were made of similar material. However, chairs also had a back: the
highest point of the chair was not the flat surface but the back, so a person can
sit on the flat seat and rest his/her back against it.
So, when I see a new object, I decide
whether it looks more like a chair or a table, based on whether it is closer to
the typical form I have in mind. Note that I am not comparing just the words
I used to describe table and chair, but the more complicated concept I have in mind
(like an ideal form).
While humans learn from experience,
machines can be made to learn. Instead of giving the machine the short,
ungainly description of a chair and a table above, based on what I have seen,
the trick is simply to give it thousands of examples of things we know are chairs
and tell the machine "these are chairs", and the same thing for tables. So, you
train the machine so that it comes up with its own view of what a chair is and
what a table is. This is the training part of a model.
In this case, we train the model by feeding
it images of chairs with the label that these are chairs, and the same for
tables. This is called supervised learning, since someone supervised the
process by providing these presumably accurate labels.
For now, we skip over how the machine breaks
down the images, and let’s just assume that the machine now knows what chairs
look like, and what tables look like. We then feed it a new, unlabelled
picture of a piece of furniture, and it will tell us: this is
likely a chair (or a table), depending on what it has learnt. The machine has
solved a classification problem: it has decided whether the new unlabelled
piece of furniture is a chair or a table.
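To make the train-then-classify idea concrete, here is a small sketch, assuming each picture has already been boiled down to three made-up numbers (surface height, number of legs, has a back). It uses a nearest-centroid classifier, which is my choice for simplicity and not what real image models do: training just averages the labelled examples of each kind, and classifying picks the nearest average. The data and function names are hypothetical.

```python
from collections import defaultdict
import math

# Made-up labelled examples: ((surface height in cm, legs, has a back), label)
LABELLED = [
    ((44.0, 4.0, 1.0), "chair"),
    ((46.0, 4.0, 1.0), "chair"),
    ((48.0, 3.0, 1.0), "chair"),
    ((72.0, 4.0, 0.0), "table"),
    ((75.0, 4.0, 0.0), "table"),
    ((78.0, 1.0, 0.0), "table"),
]

def train(examples):
    """Learn one 'typical form' per label by averaging its examples."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }

def classify(model, features):
    """Pick the label whose learnt typical form is nearest."""
    return min(model, key=lambda label: math.dist(features, model[label]))

model = train(LABELLED)                   # the training part
print(classify(model, (50.0, 4.0, 1.0)))  # chair
```

The labels in `LABELLED` are the "supervision": without them, `train` would have no way of knowing which examples to average together.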
Now, nobody stops you from training the
machine with other pieces of furniture, and animals, and all sorts of other
things… After all, that’s how we learnt, no?
Thought experiment:
Imagine you are walking about, and from afar
you see something. How do you decide whether this thing with 4 black legs, and a
black and white splotchy pattern on the top and sides, is a table or a cow or
maybe a dalmatian?
How would your thinking process go?
Would it be faster if you remembered you
were in a field in the middle of a farm, or close to a nature-inspired
furniture shop?
For me, yes; based on the context (where
the object is), I can make the process simpler by focusing on a smaller list of
likely choices, rather than the whole list.
This is why you get faster, likely better
results, on a specialised machine (a farm animal identifier in the first case
or a furniture classifier in the second) rather than a generic machine: a
machine trained only on furniture would identify the table much faster and more
accurately than one that has also learnt about cows and dalmatians. However,
the furniture classifier would fail if someone asked it to identify a
dalmatian… Hence, machines/algos trained on a specific set of data are usually
better at working on that theme/context, but will not do so well at things in
different contexts.
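Here is a toy sketch of that trade-off. The typical forms, the measurements (height, number of legs, whether the thing moves) and the two contexts are all made up for illustration. The specialist only compares against its own short list, so it has less work to do, but it is forced to answer with something from that list even when the object does not belong there.

```python
import math

# Made-up typical forms: (height in cm, number of legs, moves: 1 or 0)
TYPICAL = {
    "table": (75.0, 4.0, 0.0),
    "chair": (45.0, 4.0, 0.0),
    "cow": (140.0, 4.0, 1.0),
    "dalmatian": (58.0, 4.0, 1.0),
}

# Hypothetical contexts, each with its own shortlist of likely answers
CONTEXTS = {
    "farm": ["cow", "dalmatian"],
    "furniture shop": ["table", "chair"],
}

def identify(obj, candidates):
    """Nearest typical form, looking only at the shortlisted candidates."""
    return min(candidates, key=lambda name: math.dist(obj, TYPICAL[name]))

mystery = (62.0, 4.0, 1.0)  # about 62 cm tall, 4 legs, and it moves

print(identify(mystery, CONTEXTS["farm"]))            # dalmatian
print(identify(mystery, CONTEXTS["furniture shop"]))  # table (wrong context!)
print(identify(mystery, list(TYPICAL)))               # dalmatian
```

The furniture specialist happily calls the dalmatian a table: it has never heard of dogs, so the best it can do is pick the nearest piece of furniture.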
This should not be surprising. If someone from
the tropics had never even heard of snow, he/she would be flabbergasted the
first time he/she saw it, and might even think it was volcanic ash… But someone
who has lived in the snow would even be able to tell you the type of snow (4).
It all depends on what you need. Similarly, I know of many Mandarin/Cantonese/French
speakers who claim that there are many nuances in their languages that are not present in
English. Again, it depends on what the people who use the language use it for.
If I had not seen a chair and a table before,
maybe I could look them up in a dictionary:
- Chair: a piece of furniture for one person to sit on, with a back, a seat and four legs (2)
- Table: a piece of furniture that consists of a flat top supported by legs (3)
Then, based on these
definitions, I could try and decide…
But you will tell me: wait, the human
has a lot of work to do, he/she has to label the pictures.
Well, yes, for supervised learning, just
as a child asks adults: “What is this? And this? And this? How about this?”
But you will recognise the work the child puts in: the child takes in the image
he/she sees, commits it to memory in one shape or form, then later, when he/she
sees a new object, decides whether it is a chair, a table or something else.
It is also possible to feed the machine unlabelled
pictures, and it will decide by itself how many categories of objects there are
(you can tell it how many if you want). It will create its own view of things
and, once trained, decide which of its groups a new picture belongs to (say, the
chair group or the table group; the machine only forms the groups, naming them
is still up to us). This is called unsupervised learning.
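As a rough sketch of grouping without labels, here is a bare-bones version of k-means clustering, one common unsupervised technique (my choice of example, with made-up measurements and hypothetical function names). Here I tell it to look for 2 groups. Notice that no "chair" or "table" label appears anywhere: the machine only hands back group 0 and group 1.

```python
import math

def k_means(points, k, rounds=10):
    """Group points into k clusters; no labels are used anywhere."""
    centres = list(points[:k])  # simple deterministic start: the first k points
    for _ in range(rounds):
        # Step 1: assign every point to its nearest centre
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centres[i]))
            groups[nearest].append(p)
        # Step 2: move each centre to the average of its group
        for i, group in enumerate(groups):
            if group:  # keep the old centre if a group ends up empty
                centres[i] = tuple(sum(col) / len(group) for col in zip(*group))
    return centres

def assign(point, centres):
    """Index of the group whose centre is nearest to point."""
    return min(range(len(centres)), key=lambda i: math.dist(point, centres[i]))

# Made-up unlabelled measurements: (surface height in cm, total height in cm)
UNLABELLED = [
    (45.0, 90.0), (75.0, 75.0), (44.0, 88.0),
    (72.0, 72.0), (47.0, 95.0), (78.0, 78.0),
]

centres = k_means(UNLABELLED, k=2)
print(assign((46.0, 92.0), centres))  # 0 (the group we would call chairs)
print(assign((74.0, 74.0), centres))  # 1 (the group we would call tables)
```

Starting from the first k points keeps the sketch repeatable; real implementations usually pick the starting centres more carefully, often at random.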
There is also reinforcement learning,
whereby the machine is given feedback on what it has predicted, and can therefore
continue learning by analysing what went right and what went wrong.
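A toy loop in the spirit of learning from feedback could look like the sketch below. It is heavily simplified and entirely hypothetical: the machine only sees whether an object has a back, guesses a label, gets +1 or -1 from a made-up `feedback` function, and keeps a running score per guess. Real reinforcement learning methods also deliberately explore and deal with rewards that arrive late, which this sketch ignores.

```python
# Hypothetical feedback giver: +1 for a right guess, -1 for a wrong one
def feedback(has_back, guess):
    truth = "chair" if has_back else "table"
    return 1 if guess == truth else -1

LABELS = ["chair", "table"]

# The machine's running score for each (observation, guess) pair, all zero at first
scores = {(has_back, label): 0 for has_back in (True, False) for label in LABELS}

def guess_label(has_back):
    """Pick the guess with the best score so far (the first label wins ties)."""
    return max(LABELS, key=lambda label: scores[(has_back, label)])

# Objects shown to the machine one by one: only whether each has a back (made up)
for has_back in [False, True, False, True, False]:
    guess = guess_label(has_back)
    reward = feedback(has_back, guess)
    scores[(has_back, guess)] += reward  # learn from what went right or wrong
    print(has_back, guess, reward)

print(guess_label(False))  # table
```

On the very first object the machine wrongly guesses "chair" for something without a back, gets -1, and from then on prefers "table" for such objects.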
Now, whether you choose to use supervised or
unsupervised learning is up to you; there are reasons for and against using
either form. Not only that, but how you choose to learn or group things also
makes a difference to the output you will get and the ability of the
model/algorithm to properly classify things. This is something I am geeky
about, but it is not for this blog post.
You will agree this is a very useful thing
to have in your back pocket and the practical applications are very very vast.
For example, a few years ago, I found it was not too hard to build something
that, once you feed it a photo of a piece of meat from a supermarket, it can
identify the meat with reasonable accuracy, and you can slap on features such as
estimating price (after estimating volume), freshness… You can easily do the
same for fruit: auntie, no need to press-press anymore!
Ok, but this is only classification of
objects, doesn’t AI do many many more things? Is this really AI, or is it ML?
AI vs ML
AI is, as mentioned above, focused on
making machines think like humans. ML is how we apply specific pieces of this
to solve problems. The classification piece I used is a piece of ML, and ML is
part of AI. But there is more than that.
Classification is just a small piece of
what ML can do. ‘Traditionally’, ML has been used to do 3 things: classifying
things, as I illustrated above (think of the photo app in your phone tagging
pictures by recognising what is inside); finding out what affects what
(regression), for example understanding how the weather affects the price of
tomatoes; and predicting things, such as the price of tomatoes next
week.
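For the tomato example, a minimal sketch of "what affects what" plus prediction is a straight-line fit (ordinary least squares with a single input). The rainfall and price numbers are invented and deliberately tidy, and the idea that rain alone drives the price is an assumption made purely to keep the example small.

```python
# Made-up weekly data: rainfall in mm and tomato price per kg
rainfall = [10.0, 20.0, 30.0, 40.0, 50.0]
price = [2.0, 2.5, 3.0, 3.5, 4.0]

def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

# "What affects what": how much the price moves per extra mm of rain
slope, intercept = fit_line(rainfall, price)
print(slope, intercept)

# "Predicting things": the price if next week brings 60 mm of rain
next_week = intercept + slope * 60.0
print(next_week)  # roughly 4.5 with these made-up numbers
```

The slope is the "how does the weather affect the price" part, and plugging a new rainfall figure into the fitted line is the "predict next week" part.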
A little diagram will illustrate what I am
talking about:
So basically, while trying to explain Gen
AI as it is today, I used ML (basically applied AI) and took 1 aspect
(classification). I skipped over neural networks, which can be used to classify
the images by, say, automatically varying the importance of different aspects
(is the height of the horizontal piece more important than the number of legs?),
and even deep learning, which is basically a more complex neural network.
But, simply by looking at the name, Neural
Network, you can get a hint that the original idea was to mimic the layers of the
human brain; deep learning adds more layers and other complexities. So, fret
not, I am not misleading you, I am simplifying. Remember, my aim is that even
someone like my mum can understand.
In my next blog post, I will explain the most
common understanding of GenAI, ‘ChatGPT’, or basically the LLM (Large Language
Model), because people using it are not coding (speaking machine language) but
speaking their own Natural Language (oops, I slipped in NLP).
(1) Elon Musk’s Neuralink has been approved to run human trials: https://www.cnbc.com/2023/09/20/elon-musks-neuralink-is-recruiting-patients-for-its-first-human-trial.html
(2) https://www.oxfordlearnersdictionaries.com/definition/english/chair_1
(3) https://www.oxfordlearnersdictionaries.com/definition/english/table_1
(4) https://en.wikipedia.org/wiki/Eskimo_words_for_snow