Spend a vision with me: Gen AI thing Part I: Plato’s spark

A friend recently asked me to help him understand the “gen ai thing” at a level that will allow him to have discussions (and since he knows me well, he knows this comes w opinion). I decided to go a level simpler, and try to explain Gen AI in a way my mum would understand (she’s in her 80s and my recent victory was getting her to carry her mobile phone when she is out of the house). I figured out it would take me a while, so I broke the explanations into smaller more digestible pieces. Here is Part 1.

What is Gen AI?

Before we go there….

First what is AI.(with apologies to my brother, Dr. AI)

Humans are a very arrogant species, so we decided that the way we think is something worth replicating. Hence, if we could make machines think like humans, then we would have something fantastic. Basically, machines don’t get tired easily, and you can expand the capacity of a machine much faster than a human (hopefully (1)).

AI is basically that, how do we get machines to think like humans.

So, what does it mean to think like a human?

How do you think?

Let’s take a simple example (a simple application of thinking like a human), you see a piece of furniture in a shop, how do you decide that it is a chair, or a table (assuming someone hasn’t written: this chair/table for $xxx)?

Enter Plato!

This is not a new question. Plato (~428-342BC that’s close to 2500 years ago) came up with a theory of forms, and that made me fall in love with PH102. The basic idea is that there is this world where the perfect form of every item in our world exists. So, I thought that makes sense! I know if something is a chair or a table by comparing it to the ideal form: it is closer to the ideal chair, or the ideal table?

What does closer mean?

If you have read other articles by me, you will remember I love talking about distance, closer means smaller distance. An object a is more likely to be an A than a B if it is closer to form A than form/ideal B. This is easy; how you define closer is where the fun begins 😊

Plato’s Theory of Forms

So, when I started playing with data, Plato’s theory of forms helped me a lot. The main difference is that, since I can’t access the world of ideals/forms, I have to base my version of form on what I had seen before.

The tables I had seen were 4 legged, came up to waist high (since my teens), had at the top a large flat surface so you can put stuff on top. Usually they were made of wood, although the legs could be made of metal. Chairs, were shorter, below waist high, but also usually had 4 legs, and made of similar material. However, chairs also had a back, the flat surface was not the highest point of the chair, but the back, so the person can sit on the flat seat, and rest his/her back on the back.

So, when I see a new object, I decide whether it looks more like a chair or a table, based on whether it is closer to the typical form I had in mind. Note that, I am not comparing just these words as I described table and chair, but the more complicated concept I have in mind (like an ideal form)

While humans learn from experience, machines can be made to learn. Instead of telling the machine the short ungainly description of a chair and a table above based on what I have seen, the trick is simply to give thousands of examples of things we know are chairs and tell the machine, these are chairs, and same thing for tables. So, you train the machine so that it comes up with its own view of what a chair is and what a table is. This is the training part of a model.

In this case, we train the model by feeding it images of chairs with the label that these are chairs, and the same for tables. This is called supervised learning, since someone supervised the process by providing these presumably accurate labels.

For now, we skip on how the machine breaks down the images, and let’s just assume that the machine now knows what chairs look like, and what tables look like. We then feed it a new image with a picture of a piece of furniture without label, and it will tell us: this is likely a chair (or a table) depending on what it has learnt. The machine has solved the classification problem, by deciding the new unlabelled furniture is classified as a chair/table accordingly.

Now, nobody stops you from training the machine with other pieces of furniture, and animals, and all sorts of other things… Afterall, that’s how we learnt, no?

Thought experiment:

Imagine you are walking about, and from far you see something. How do you decide whether this thing with 4 black legs, and black and white splotchy pattern on the top and sides is a table or a cow or may be a dalmatian?

How would your thinking process go?

Would it be faster if you remembered you were in a field in the middle of a farm, or close to a nature inspired furniture shop?

For me, yes; based on the context (where the object is), I can make the process simpler by focusing on a smaller list of likely choices, than the whole list.

This is why you get faster, likely better results, on a specialised machine (a farm animal identifier in the first case or a furniture classifier in the second) rather than a generic machine: a machine trained only on furniture would identify the table much faster and more accurately than one that has also learnt about cows and dalmatians. However, the furniture classifier would fail if someone asked it to identify a dalmatian… Hence, machines/algos trained on a specific set of data are usually better at working on that theme/context, but will not do so well at things in different contexts.

It should not be surprising, if someone from the tropics had never even heard of snow, he/she would be flabbergasted the first time, may be even think it was volcanic ash… But someone who has lived in the snow would even be able to tell you the type of snow (4), it all depends on what you need. Similarly, I know of many Mandarin/Cantonese/French speakers who claim that there are many nuances in their languages that are not present in English. Again, depends on what the people who use the language use it for.

If I had not seen a chair and table before, maybe I could check out in a dictionary:

Chair: a piece of furniture for one person to sit on, with a back, a seat and four legs (2)
Table: a piece of furniture that consists of a flat top supported by legs (3)

Then based on these definitions try and decide…

But you will tell me, wait, the human has a lot of work to do, he/she has to label the pictures.

Well, yes, for supervised learning, as a child asks adults: “what is this? And this? And this? How about this?”. But you will recognise the work the child put in: the child takes in the image he/she sees, commits it to memory in one shape or form, then later, when he/she sees a new object decides whether it is a chair, table or something else.

It is also possible to feed the machine unlabelled pictures, and it will decide by itself how many categories of objects there are (you can tell it that if you want) and it will create its own view of things and when presented with a new picture, after having been trained, decide whether that object is a chair of a table. This is called unsupervised learning.

There also is reinforcement learning, whereby the machine is given feedback on what it has predicted, therefore can continue learning by analysis what went right and what went wrong.

Now whether you choose to use supervised or unsupervised learning is up to you, there are reasons for and against using either form. Not only that, but how you choose to learn or group things also makes a difference to the output you will get and the ability of the model/algorithm to properly classify things. This is something I am geeky about, but is not for this blog post

You will agree this is a very useful thing to have in your back pocket and the practical applications are very very vast. For example, a few years ago, I found it was not too hard to build something that, once you feed it a photo of a piece of meat from a supermarket, it can identify the meat with reasonable accuracy, and you can slap on features such as estimating price (after estimating volume), freshness… You can easily do the same for fruit: auntie, no need to press-press anymore!)

Ok, but this is only classification of objects, doesn’t AI do many many more things? Is this really AI, or is it ML?

AI vs ML

AI is, as mentioned above, focused on making machines think like humans. ML is how we apply specific pieces if this to solve problems. The classification piece I used is a piece of ML, but ML is part of AI. But there is more than that.

Classification is just a small piece of what ML can do. ‘Traditionally’, ML has been used to do 3 things: classifying things as I illustrated above (think the photo app in your phone tagging the pictures by recognizing what is inside), finding out what affects what (regression) for example understanding how the weather affects price of tomatoes, and predicting things such as predicting the price of tomatoes next week.

A little diagram will illustrate what I am talking about:

So basically, while trying to explain Gen AI as it is today, I used ML, basically applied AI, and took 1 aspect (classification). I skipped over neural networks, that can be used to classify the images by say automatically varying the importance of different aspects – is height of horizontal piece mor important than number of legs – or even deep learning that basically is a more complex neural network.

But, simply by looking at the name: Neural Network, you can get a hint that the original idea was to mimic the layers of the human brain, deep learning is adding layers and other complexities. So, fret not, I am not misleading, I am simplifying. Remember, my aim is that even someone like my mum can understand.

In my next blog, I will explain the most common understanding of GenAI, ‘ChatGPT’, or basically LLM (Large Language Model) because people using it are not coding (speaking machine language) but speaking their own Natural Language (oops I slipped in NLP)

Elon Musk’s Neuralink has been approved to have human trials https://www.cnbc.com/2023/09/20/elon-musks-neuralink-is-recruiting-patients-for-its-first-human-trial.html
https://www.oxfordlearnersdictionaries.com/definition/english/chair_1
https://www.oxfordlearnersdictionaries.com/definition/english/table_1
https://en.wikipedia.org/wiki/Eskimo_words_for_snow

Spend a vision with me

Sunday, 15 October 2023

Gen AI thing Part I: Plato’s spark

No comments:

Post a Comment