Can AI be used to match and classify images? Of course! Machine vision models do this all the time, looking at everything from paint chips to x-rays. In today’s post, I use an established model called ResNet-50 to match and classify post-impressionist artists. For example, Braque and Picasso have a 70% similarity score.
The “cosine similarity” between Braque and Picasso is 0.70.
ResNet-50 is a convolutional neural network (CNN) introduced in 2015. Normally, we would use it as the base for image interpretation, and then add layers to learn the specific application. In this case, we are only interested in the coding system it uses, called an “embedding.”
ResNet-50 encodes each image as a list of 2,048 numbers, known as a “vector” in machine learning. This vector is not a way to store the image, since the JPEG file already does that. Instead, it encodes whatever features the model deems useful.
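Here is a minimal sketch of how such a vector might be extracted, assuming PyTorch, torchvision, and Pillow are installed. This is not necessarily the code behind this post: the helper names build_encoder and embed_image are hypothetical, and I am assuming the stock ImageNet weights that ship with torchvision. The idea is to swap the final classification layer for a pass-through, so the model returns the 2,048 pooled features instead of class scores.

```python
import numpy as np

EMBEDDING_DIM = 2048  # width of ResNet-50's pooled feature layer


def build_encoder(pretrained: bool = True):
    """Return (model, preprocess) with the classification head removed.

    Hypothetical helper: pretrained=True downloads torchvision's ImageNet weights.
    """
    # Imported lazily so this file can be loaded without PyTorch installed.
    import torch
    from torchvision.models import ResNet50_Weights, resnet50

    weights = ResNet50_Weights.DEFAULT if pretrained else None
    model = resnet50(weights=weights)
    model.fc = torch.nn.Identity()  # drop the 1,000-class ImageNet classifier
    model.eval()  # inference mode: fixes batch-norm statistics
    # The weights enum carries the matching resize/crop/normalize pipeline.
    preprocess = ResNet50_Weights.DEFAULT.transforms()
    return model, preprocess


def embed_image(path, model, preprocess) -> np.ndarray:
    """Encode one image file as a 2,048-element vector."""
    import torch
    from PIL import Image

    image = Image.open(path).convert("RGB")
    batch = preprocess(image).unsqueeze(0)  # add a batch dimension
    with torch.no_grad():
        features = model(batch)  # shape (1, 2048)
    return features.squeeze(0).numpy()
```

Usage would look something like `model, preprocess = build_encoder()` followed by `embed_image("some_self_portrait.jpg", model, preprocess)`, where the file name is a placeholder.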
For this demonstration, I collected examples from fourteen artists. To avoid complications over the choice of subject, I used self-portraits by each artist.
Experiments with CNNs show that they recognize shapes, colors, styles, and textures – everything you would expect from “machine vision.” Our model is not going to know anything about the painters, though – not who cut off an ear, or who moved to Tahiti. It’s just the pixels.
With the fourteen paintings vectorized, we can do things like compute similarity scores. For instance, Braque, Chagall, and Picasso seem to hang together. I also ran a hierarchical clustering analysis.
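For the curious, here is a rough sketch of both steps, assuming NumPy and SciPy. Cosine similarity is the dot product of two vectors divided by the product of their lengths, so 1.0 means the vectors point the same way and 0 means they are unrelated. The clustering settings (average linkage on cosine distance) are my assumption for illustration, and the vectors are random stand-ins, not real embeddings of the paintings.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cosine_similarity_matrix(vectors: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / norms  # scale every row to length 1
    return unit @ unit.T  # dot products of unit vectors are cosines


def cluster_labels(vectors: np.ndarray, n_clusters: int) -> np.ndarray:
    """Agglomerative clustering, cut so that at most n_clusters remain."""
    # Assumed settings: average linkage on cosine distance (1 - similarity).
    tree = linkage(vectors, method="average", metric="cosine")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Stand-in data: two synthetic "styles" of three vectors each.
rng = np.random.default_rng(0)
style_a = rng.normal(size=2048)
style_b = rng.normal(size=2048)
vectors = np.stack(
    [style_a + 0.1 * rng.normal(size=2048) for _ in range(3)]
    + [style_b + 0.1 * rng.normal(size=2048) for _ in range(3)]
)

sim = cosine_similarity_matrix(vectors)
labels = cluster_labels(vectors, n_clusters=2)
print(np.round(sim, 2))
print(labels)
```

With real embeddings, the rows of `vectors` would be the fourteen 2,048-number lists, and `sim[i, j]` would be the score for paintings i and j.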
It’s hard to imagine what the clustering algorithm “sees” in high-dimensional space, so wherever possible I reduce the data to three dimensions, using principal component analysis (PCA) or UMAP. In this case, because of the small sample, a 3D chart captures 40% of the variance.
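As a sketch of the PCA step, assuming only NumPy: center the vectors, take a singular value decomposition, and keep the top three directions. The share of variance captured is the sum of the top three squared singular values over the sum of all of them. The function name pca_project is hypothetical, and the input is random stand-in data, so the ratio it prints will not match the figure above.

```python
import numpy as np


def pca_project(vectors: np.ndarray, n_components: int = 3):
    """Project rows onto the top principal components.

    Returns (coordinates, explained_variance_ratio).
    """
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    coords = centered @ vt[:n_components].T
    variance = s**2  # proportional to the variance along each axis
    ratio = variance[:n_components] / variance.sum()
    return coords, ratio


# Stand-in data: fourteen random vectors in place of real embeddings.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(14, 2048))

coords, ratio = pca_project(vectors)
print(coords.shape)  # (14, 3)
print(ratio.sum())   # fraction of total variance kept by the three axes
```

The three columns of `coords` are what would go on the X, Y, and Z axes of the chart.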
The human eye naturally finds clusters – there are Picasso, Braque, and Chagall down at the bottom, and here is Kandinsky off by himself. Also note that Cezanne, Gauguin, and Schiele are spread out along the Y axis, but together on the X axis.
Unfortunately, these axes have no built-in meaning. Each one is just a direction of high variance in the embedding space, and ResNet-50 can’t tell us if Z is the “axis of cubism,” or whatever. That’s the usual knock against neural nets: their reasoning is a “black box.” We can see, though, that the PCA plot roughly agrees with the cluster analysis.
So, that was about two hundred lines of code as a proof of concept, plus some fun charts. If you were really doing this for your MFA, you would want to use many more paintings, and stash them in a vector database. For more on vector databases, see Literary Analysis with RAG.
Today’s featured image nods to a common gaffe in generative AI. Yes, Marc Chagall really did paint a “Self-Portrait with Seven Fingers.”