AI and Curve Fitting

This article from last year popped up in my newsfeed recently. It discusses whether today's AI systems display true intelligence. I would like to focus on the following quote from computer scientist Prof Judea Pearl, which appears in the article.

As much as I look into what’s being done with deep learning, I see they’re all stuck there on the level of associations. Curve fitting. That sounds like sacrilege, to say that all the impressive achievements of deep learning amount to just fitting a curve to data. From the point of view of the mathematical hierarchy, no matter how skillfully you manipulate the data and what you read into the data when you manipulate it, it’s still a curve-fitting exercise, albeit complex and nontrivial.

To provide more context, Prof Pearl said the above during an interview in response to a comment on the public excitement over the possibilities in AI. He was actually very impressed that curve fitting could solve so many practical problems. So, what exactly is the “curve fitting” he was talking about?

It turns out that many problems in the real world can be represented as spaces populated by data points, where each point represents an object of interest in the real world. Take the rather simple examples illustrated below.

The first diagram is a classification model. It predicts healthy (blue) and diseased (red) specimens based on two quantitative characteristics – gene 1 and gene 2.

The second diagram is a regression model. That is just a jargon-y way of saying that instead of predicting a category, the model predicts a numerical value – in this case the salary of an individual – based on work experience in years.

In each case, the model is represented by a line (the eponymous “curve”). How good the model is depends on how well the curve fits the data. For the classification model, any specimen falling above the line is deemed diseased. For the regression model, the predicted salary is found by drawing a vertical line from the number of years of experience up to the model line and reading the value off the Salary axis.
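To make the two readings of the curve concrete, here is a minimal Python sketch using NumPy. All the numbers are made up for illustration, and the names predict_salary and classify, as well as the decision-line coefficients a and b, are hypothetical; they do not come from the diagrams.

```python
import numpy as np

# Made-up data: years of experience vs. salary (in thousands).
years = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
salary = np.array([35.0, 40.0, 45.0, 50.0, 55.0])

# Regression: fit the line salary = slope * years + intercept by least squares.
slope, intercept = np.polyfit(years, salary, deg=1)

def predict_salary(experience):
    """Read the predicted salary off the fitted line."""
    return slope * experience + intercept

# Classification: an assumed decision line gene2 = a * gene1 + b.
# Anything above the line is labelled diseased, anything else healthy.
def classify(gene1, gene2, a=-1.0, b=10.0):
    return "diseased" if gene2 > a * gene1 + b else "healthy"

print(predict_salary(6.0))
print(classify(8.0, 8.0), classify(1.0, 1.0))
```

In the regression case the line itself is the prediction; in the classification case the line is a boundary, and the prediction depends on which side of it a point falls.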

That, in very simple terms, is what Prof Pearl was talking about when he said “curve fitting”. It lies at the heart of many problems that have been solved by AI, like spam detection and object classification.

Represent, Transform, Optimise

Of course, a lot more goes on in developing an AI model. For starters, it is not always immediately clear which characteristics of the real world (which can number in the hundreds or thousands) should be considered. That is what feature selection is concerned with. The characteristics must also be represented in numerical form (quantified). Often, the original data must even be transformed mathematically before a good curve can be fitted. This is called feature engineering.

All machine learning algorithms consist of automatically finding such transformations that turn data into more useful representations for a given task.

– Deep Learning with Python by François Chollet
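As a rough illustration of why a transformation can help, the sketch below uses made-up data that grows exponentially. A straight line fits the raw values poorly, but fits them almost exactly once the target is log-transformed. The data and variable names are assumptions chosen for the example.

```python
import numpy as np

# Made-up data following y = 2 * exp(0.5 * x).
x = np.linspace(0.0, 5.0, 20)
y = 2.0 * np.exp(0.5 * x)

# Straight line fitted to the raw data.
raw_coeffs = np.polyfit(x, y, deg=1)
raw_error = np.mean((np.polyval(raw_coeffs, x) - y) ** 2)

# Same straight-line fit after transforming the target with a logarithm.
log_coeffs = np.polyfit(x, np.log(y), deg=1)
# Undo the transformation so both errors are measured on the original scale.
transformed_pred = np.exp(np.polyval(log_coeffs, x))
transformed_error = np.mean((transformed_pred - y) ** 2)

print(raw_error, transformed_error)
```

The model is still just a line in both cases; only the representation of the data has changed.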

Finally, there must be a way to measure the “goodness-of-fit” of the curve, because AI model development is basically an iterative optimisation problem: the curve is adjusted step by step so that it becomes increasingly better at the particular task, be it classification or regression.
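One common way to do this, sketched here under assumptions, is to use the mean squared error as the goodness-of-fit measure and gradient descent as the iterative procedure. The data is made up, and the learning rate and step count are arbitrary choices that happen to work for these numbers.

```python
import numpy as np

# Made-up data: years of experience vs. salary (in thousands).
years = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
salary = np.array([35.0, 40.0, 45.0, 50.0, 55.0])

def mse(slope, intercept):
    """Goodness-of-fit: mean squared error of the line (lower is better)."""
    return np.mean((slope * years + intercept - salary) ** 2)

slope, intercept = 0.0, 0.0   # start from a poor curve
learning_rate = 0.02          # assumed step size
history = []

for step in range(5000):
    error = slope * years + intercept - salary
    # Gradients of the mean squared error with respect to each parameter.
    grad_slope = 2.0 * np.mean(error * years)
    grad_intercept = 2.0 * np.mean(error)
    # Nudge the curve in the direction that reduces the error.
    slope -= learning_rate * grad_slope
    intercept -= learning_rate * grad_intercept
    history.append(mse(slope, intercept))

print(slope, intercept, history[0], history[-1])
```

Each pass through the loop measures how badly the current line fits and then adjusts it slightly, which is the "iterative optimisation" in miniature.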

In conclusion, for all the apparent magic of present-day AI, such as facial recognition, it is basically a number-crunching exercise. It is probably not the same thing as the intelligence displayed by human beings, but it is here to stay and is finding an increasing number of use cases in our lives.

(To get a formal introduction to AI, sign up for the AI for Everyone and AI for Industry programmes!)