What is Machine Learning?


Artificial Intelligence (AI) and Machine Learning (ML) are all the rage today! What exactly is machine learning? Can a computer really learn? What do we mean by learning? All these questions deserve some answers.

It is important to realise that in Computer Science, we do not really worry about the philosophical, metaphysical and semantic underpinnings of the word learning. We take the functional approach. Imagine this situation: a mother is showing an animal to her one-year-old baby. She tells the baby that it is a cow: it has two horns, is large, gives milk, has a tail, and so on. She shows the baby a few more animals and tells the baby their names - cow, dog, etc. - along with their features such as size, colour, and so on. A few days later, the mother points to a cow and asks the baby to name the animal. The baby, previously trained on many animals, now correctly says "cow" to the proud mommy! We say that the baby learned how to recognise animals.

So what did the mother and baby do so that some learning took place? The mother passed her experience in naming animals to the baby through a series of examples. In each example, she told the baby certain features of the animal and then its name. The baby used this experience to relate the features to the names and improve its ability to recognise and name animals. How do we measure this ability? By seeing how many animals the baby names correctly after getting the mother's inputs. We say the baby learnt to recognise animals if, after listening to the mother's inputs, it names more animals correctly than it did before.

The same idea has been beautifully captured and applied to computers (machines) by Tom Mitchell, one of the biggies in AI/ML, who defined machine learning as follows:

"A machine is said to learn a task $T$ from experience $E$ if its performance, measured by $P$, on $T$ improves after being exposed to $E$"

It is very easy now to realise what these $T$, $E$ and $P$ are! In Machine Learning, it is common to see the experience $E$ called training. The setting above, where a teacher (the mother) tells the learner (the baby) the correct answer, is called Supervised Learning. As we impose and relax conditions on $T$, $E$ and $P$, we get many different types of learning.
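Mitchell's definition can be made concrete with a tiny sketch. The code below is purely illustrative - the numbers, feature choices and the nearest-neighbour rule are my own assumptions, not part of the definition. The task $T$ is naming an animal from two numeric features, the experience $E$ is a list of labelled examples from a "teacher", and the performance $P$ is the fraction of test animals named correctly:

```python
# Toy illustration of Mitchell's T, E, P (all names and numbers made up).

def nearest_neighbour(experience, query):
    """Task T: name an animal by finding the closest remembered example."""
    best_label, best_dist = None, float("inf")
    for features, label in experience:
        dist = sum((a - b) ** 2 for a, b in zip(features, query))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def performance(experience, test_set):
    """Performance measure P: share of test examples named correctly."""
    correct = sum(nearest_neighbour(experience, f) == label
                  for f, label in test_set)
    return correct / len(test_set)

# Experience E: (size in kg, number of horns) -> animal name.
experience = [((500, 2), "cow"), ((20, 0), "dog"),
              ((450, 2), "cow"), ((25, 0), "dog")]
test_set = [((480, 2), "cow"), ((18, 0), "dog")]

print(performance(experience[:1], test_set))  # with very little experience
print(performance(experience, test_set))      # with more experience
```

With only one example the learner calls everything "cow" and scores 0.5; with the full experience it scores 1.0 on this toy test set - performance on $T$ improved with exposure to $E$, which is exactly what "learning" means here.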

Further, human experts used to invent features and compute them mathematically, and systems were then trained using such features. Some of the features used are edges, corners, textures, SIFT, SURF, wavelets and many more. The systems were programmed to calculate these features from the given inputs and associate them with names or labels in the training phase. Later, for every new image, the system would compute the same features and predict a name/label using the associations learned during training.
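That classical pipeline - an expert-designed feature extractor, followed by training that associates features with labels - can be sketched as follows. The two "features" here (mean intensity and a crude edge count) are stand-ins I invented for illustration; real systems used far richer descriptors like SIFT or SURF:

```python
# Sketch of the classical hand-crafted-feature pipeline (toy features only).

def extract_features(image):
    """Expert-designed features: mean intensity and a crude edge count."""
    flat = [p for row in image for p in row]
    mean = sum(flat) / len(flat)
    edges = sum(abs(row[i] - row[i + 1]) > 50
                for row in image for i in range(len(row) - 1))
    return (mean, edges)

class FeatureClassifier:
    def __init__(self):
        self.memory = []  # (features, label) associations built in training

    def train(self, image, label):
        # Training phase: compute features, associate them with the label.
        self.memory.append((extract_features(image), label))

    def predict(self, image):
        # Prediction: compute the SAME features for the new image,
        # then return the label of the closest stored association.
        f = extract_features(image)
        return min(self.memory,
                   key=lambda m: sum((a - b) ** 2
                                     for a, b in zip(m[0], f)))[1]

bright = [[200, 200], [200, 200]]   # toy 2x2 "images"
dark = [[10, 10], [10, 10]]
clf = FeatureClassifier()
clf.train(bright, "day")
clf.train(dark, "night")
print(clf.predict([[190, 180], [200, 190]]))  # a new brightish image
```

The key design point is that the feature extractor is fixed and hand-written: only the feature-to-label associations are learned, never the features themselves.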

Deep learning gave us a way to calculate important and essential features automatically from the data, without requiring human experts. This was a significant advance because the process of learning became much more automatic... but that is another story! To learn more, read any of the many articles on convolutional neural networks available on the Internet.

We move on to how the above ideas are converted into programs in Part - II, titled How do machines learn?


END OF PART - I