Analyzing the masterpieces of painting with classic ML

Hello everyone! A friend of mine is studying to be an artist and regularly tells me about one masterpiece or another: about unique compositional techniques, about color perception, about the evolution of painting and brilliant artists. Under this constant influence, I decided to check whether my engineering knowledge and skills are of any use for analyzing the world's cultural heritage.

Armed with a homemade scraper, I snuck into an online gallery under cover of night and carried off almost 50 thousand paintings. Let's see what interesting things we can do with them using only classic ML tools (warning: lots of images).

Naive transformation


As many of us remember from computer science lessons, an image is represented as an array of bytes, each responsible for the color of an individual pixel. As a rule, the RGB scheme is used, in which the color is split into three components (red / green / blue) that, added together against a black background, reproduce the color a person perceives.
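A quick illustration with opencv shows this structure directly (the file name here is just a placeholder):

import cv2

# An image is an H x W x 3 array of bytes; OpenCV stores the channels in BGR order
img = cv2.imread("some_painting.jpg")  # placeholder file name
print(img.shape, img.dtype)            # e.g. (1200, 1600, 3) uint8
print(img[0, 0])                       # blue, green and red components of the top-left pixel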

Since all the masterpieces have, for now, become mere arrays of numbers on disk, let's try to characterize those arrays by building histograms of the intensity distribution for each channel.

We will use numpy for the calculations and matplotlib for visualization.

Source
# Imports used throughout (paintings_links is the DataFrame of scraped links and file paths)
import cv2
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path
from matplotlib.gridspec import GridSpec

# Load a painting from disk by its index in paintings_links
def load_image_by_index(i):
    image_path = paintings_links.iloc[i].img_path
    img = cv2.imdecode(np.fromfile(str(Path.cwd()/image_path), np.uint8), cv2.IMREAD_UNCHANGED)
    return img

# Build per-channel intensity histograms (each bin covers 5 intensity levels)
def get_hist_data_by_index(img_index):
    bin_div = 5
    img = load_image_by_index(img_index)
    b, bins = np.histogram(img[:,:,0], bins=255//bin_div, range=(0,255), density=True)
    g = np.histogram(img[:,:,1], bins=255//bin_div, range=(0,255), density=True)[0]
    r = np.histogram(img[:,:,2], bins=255//bin_div, range=(0,255), density=True)[0]
    return bins, r, g, b

# Plot the painting next to its three channel histograms
def plot_image_with_hist_by_index(img_index, height=6):
    bins, r, g, b = get_hist_data_by_index(img_index)
    img = load_image_by_index(img_index)
    fig = plt.figure(constrained_layout=True)

    # Give a landscape-oriented image more horizontal space than a portrait one
    if img.shape[0] < img.shape[1]:
        width_ratios = [3, 1]
    else:
        width_ratios = [1, 1]

    gs = GridSpec(3, 2, figure=fig,
                  width_ratios=width_ratios)
    ax_img = fig.add_subplot(gs[:, 0])

    ax_r = fig.add_subplot(gs[0, 1])
    ax_g = fig.add_subplot(gs[1, 1], sharey=ax_r)
    ax_b = fig.add_subplot(gs[2, 1], sharey=ax_r)

    # OpenCV loads images as BGR, matplotlib expects RGB
    ax_img.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB), aspect='equal')
    ax_img.axis('off')

    ax_r.bar(bins[:-1], r, width=5, color='red', alpha=0.7)
    ax_g.bar(bins[:-1], g, width=5, color='green', alpha=0.7)
    ax_b.bar(bins[:-1], b, width=5, color='blue', alpha=0.7)

    for ax in (ax_r, ax_g, ax_b):
        ax.axes.get_xaxis().set_ticks([])
        ax.axes.get_yaxis().set_ticks([])

    fig.suptitle("{} - {}".format(paintings_links.iloc[img_index].artist_name,
                                  paintings_links.iloc[img_index].picture_name), ha="left")

    fig.set_figheight(height)
    plt.axis('tight')
    if img.shape[0] < img.shape[1]:
        fig.set_figwidth(img.shape[1] * height / img.shape[0] * 1.25)
    else:
        fig.set_figwidth(img.shape[1] * height / img.shape[0] * 1.5)
    plt.show()


Examples of works:









Looking carefully at the histograms of different paintings, we notice that their shape is quite distinctive and varies greatly from work to work.

So let's assume that a histogram is a kind of fingerprint of a painting that characterizes it to some extent.

First model


We collect all the histograms into one large dataset and look for "anomalies" in it. A fast, convenient and, frankly, my favorite algorithm for this purpose is the one-class SVM. We will use the implementation from the sklearn library.

Source
# Build a histogram vector for every painting; unreadable files get a zero vector
# and their indices go to the bad_images list
from tqdm import tqdm
import pandas as pd
from sklearn.svm import OneClassSVM

res = []
bad_images = []
for img_index in tqdm(range(paintings_links.shape[0])):
    try:
        bins, r, g, b = get_hist_data_by_index(img_index)
        res.append(np.hstack([r, g, b]))
    except Exception:
        res.append(np.zeros(153,))
        bad_images.append(img_index)

np_res = np.vstack(res)

# Cache the dataset on disk
pd.DataFrame(np_res).to_pickle("histograms.pkl")
histograms = pd.read_pickle("histograms.pkl")

# Train the one-class SVM; nu is the expected share of outliers, here roughly 10 paintings
one_class_svm = OneClassSVM(nu=10 / histograms.shape[0], gamma='auto')
one_class_svm.fit(histograms[~histograms.index.isin(bad_images)])

# Predict: the label -1 marks an anomaly
svm_outliers = one_class_svm.predict(histograms)
svm_outliers = np.array([1 if label == -1 else 0 for label in svm_outliers])

# Show the anomalous paintings, skipping the unreadable files
uncommon_images = paintings_links[(svm_outliers == 1) & (~histograms.index.isin(bad_images))].index.values
for i in uncommon_images:
    plot_image_with_hist_by_index(i, 4)


Let's see what anomalies turn up in the storerooms of our gallery.

Work done in pencil:


Work in very dark colors:

Lady in red:

Something sketchy:

Very dark portrait:


Searching for similar works


So, our model does manage to find unusual works that stand far apart from everything else.

But can we build a tool that helps find works similar in color?

Each painting is now characterized by a vector of 153 values (when building the histograms, each bin covers 5 intensity levels, giving 255 // 5 = 51 frequencies per channel and 3 × 51 = 153 values in total).

We can measure the "degree of similarity" by computing distances between the vectors of interest. The Euclidean distance we know from school pays a lot of attention to the magnitudes of the vector components, while we care more about the mix of shades that make up a painting. The cosine distance, widely used, for example, in text analysis, is a better fit here. Let's apply it to this task, taking the implementation from the scipy library.

Source
# Return the indices of the n paintings whose histograms are closest
# (by cosine distance) to the target painting
from scipy import spatial

def find_closest(target_id, n=5):
    distance_vector = np.apply_along_axis(spatial.distance.cosine,
                                          axis=1,
                                          arr=histograms.values,
                                          v=histograms.values[target_id])
    return np.argsort(distance_vector)[:n]
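
A quick usage sketch (the index here is hypothetical): the first hit is the query painting itself, since its distance to itself is zero.

target_id = 1234  # hypothetical index of the query painting
for i in find_closest(target_id, n=5):
    plot_image_with_hist_by_index(i, 4)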


Let's see what looks similar to Aivazovsky's "The Ninth Wave".

Original:



Similar works:





And what looks similar to Van Gogh's "Almond Blossoms"?

Original:



Similar works:





And what looks similar to the anomalous lady in red we found earlier?

Original:



Similar works:






Color spaces


Up to this point we have worked in the RGB color space. It is very intuitive, but far from ideal for our purposes.

Look, for example, at the faces of the RGB color cube:



You can see with the naked eye that the faces contain large areas where our eyes notice no change, and relatively small areas where our color perception shifts very sharply. This non-linearity of perception prevents a machine from evaluating colors the way a person would.

Fortunately, there are many color spaces, and some of them will probably fit our task better.

We will pick our favorite color space by comparing how useful each one is for solving a human task: say, guessing the artist from the contents of the canvas!

Let's take all the color spaces available in the opencv library, train xgboost on each one, and compare the metrics on a hold-out set.
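
The listing below refers to X_train, X_test, y_train and y_test, which were prepared outside of it; the exact split is not shown, so here is only a minimal sketch of what it might look like (variable names follow the listing):

from sklearn.model_selection import train_test_split

# Rows of the histogram matrices are matched to paintings by index,
# so it is enough to split the painting index and the artist labels
X_train, X_test, y_train, y_test = train_test_split(
    paintings_links[["img_path"]], paintings_links["artist_name"],
    test_size=0.2, random_state=42)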

Source
# Build per-channel histograms for an image converted to the given color space
# (the image comes in as BGR from OpenCV; list_of_color_spaces holds cv2 conversion
# flag names such as "COLOR_BGR2LUV")
from xgboost import XGBClassifier
from sklearn.metrics import classification_report

def get_hist_data_by_index_and_colorspace(bgr_img, colorspace):
    bin_div = 5
    img_cvt = cv2.cvtColor(bgr_img, getattr(cv2, colorspace))
    c1, bins = np.histogram(img_cvt[:,:,0], bins=255//bin_div, range=(0,255), density=True)
    c2 = np.histogram(img_cvt[:,:,1], bins=255//bin_div, range=(0,255), density=True)[0]
    c3 = np.histogram(img_cvt[:,:,2], bins=255//bin_div, range=(0,255), density=True)[0]
    return bins, c1, c2, c3

# Collect histograms for every painting in every candidate color space
all_res = {}
all_errors = {}
for colorspace in list_of_color_spaces:
    all_res[colorspace] = []
    all_errors[colorspace] = []

for img_index in tqdm(range(paintings_links.shape[0])):
    for colorspace in list_of_color_spaces:
        try:
            bgr_img = load_image_by_index(img_index)
            bins, c1, c2, c3 = get_hist_data_by_index_and_colorspace(bgr_img, colorspace)
            all_res[colorspace].append(np.hstack([c1, c2, c3]))
        except Exception:
            all_res[colorspace].append(np.zeros(153,))
            all_errors[colorspace].append(img_index)

all_res_np = {}
for colorspace in list_of_color_spaces:
    all_res_np[colorspace] = np.vstack(all_res[colorspace])

# Train an XGBoost classifier per color space and compare macro-averaged metrics on the hold-out set
res = []
for colorspace in tqdm(list_of_color_spaces):
    temp_df = pd.DataFrame(all_res_np.get(colorspace))
    temp_x_train = temp_df[temp_df.index.isin(X_train.index.values)]
    temp_x_test = temp_df[temp_df.index.isin(X_test.index.values)]

    xgb = XGBClassifier()
    xgb.fit(temp_x_train, y_train)
    current_res = classification_report(y_test, xgb.predict(temp_x_test),
                                        labels=None, target_names=None,
                                        output_dict=True).get("macro avg")
    current_res["colorspace"] = colorspace
    res.append(current_res)

pd.DataFrame(res).sort_values(by="f1-score")


precision | recall   | f1-score | colorspace
0.001329  | 0.003663 | 0.001059 | COLOR_BGR2YUV
0.003229  | 0.004689 | 0.001849 | COLOR_BGR2RGB
0.003026  | 0.004131 | 0.001868 | COLOR_BGR2HSV
0.002909  | 0.004578 | 0.001934 | COLOR_BGR2XYZ
0.003545  | 0.004434 | 0.001941 | COLOR_BGR2HLS
0.003922  | 0.004784 | 0.002098 | COLOR_BGR2LAB
0.005118  | 0.004836 | 0.002434 | COLOR_BGR2LUV

The LUV color space gives a tangible increase in quality.

Its creators tried to make perceived color change as uniform as possible along each axis, so that the perceived difference between colors and its numerical estimate stay as close as possible.
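
A rough sketch of how such a slice can be rendered with opencv (using OpenCV's 8-bit LUV encoding; colors outside the RGB gamut simply get clipped):

# Fix L (lightness) and sweep the u and v axes of the 8-bit LUV cube
L_fixed = 180
u, v = np.meshgrid(np.arange(256, dtype=np.uint8), np.arange(256, dtype=np.uint8))
luv_slice = np.dstack([np.full_like(u, L_fixed), u, v])
plt.imshow(cv2.cvtColor(luv_slice, cv2.COLOR_LUV2RGB))
plt.xlabel("u"); plt.ylabel("v")
plt.show()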

This is how a slice of this color space looks when one of the axes is fixed:



Let's look at the model


After the previous step we are left with a model that can predict something.
Let's see whose works it recognizes most accurately.
precision | recall   | f1-score | artist
0.042553  | 0.019417 | 0.026667 | Ilya Efimovich Repin
0.055556  | 0.020000 | 0.029412 | William Merrit Chase
0.071429  | 0.022222 | 0.033898 | Bonnard pierre
0.035461  | 0.035211 | 0.035336 | Jill elvgren
0.100000  | 0.021739 | 0.035714 | Jean Auguste Dominic Ingres
0.022814  | 0.224066 | 0.041411 | Pierre Auguste Renoir
0.100000  | 0.028571 | 0.044444 | Albert Bierstadt
0.250000  | 0.032258 | 0.057143 | Hans Zatska
0.030396  | 0.518797 | 0.057428 | Claude Oscar Monet
0.250000  | 0.037037 | 0.064516 | Girotto walter

The metrics themselves are far from ideal, but remember that the color scheme carries only a small fraction of the information about a work; the artist has many other expressive means. The fact that we found a certain "signature" of the artist in this data is already a victory.

Let's pick one artist for a deeper analysis. Let it be Claude Oscar Monet (a nice gesture for my wife, who loves the Impressionists).

Let's take his works, ask the model to name the author, and count how often each author is predicted.
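A hedged sketch of how such a count can be obtained from the variables in the earlier listings (assuming the LUV histograms and a classifier trained on them; whether to restrict this to the hold-out part is a detail not shown in the original):

monet_idx = paintings_links[paintings_links.artist_name == "Claude Oscar Monet"].index
luv_histograms = pd.DataFrame(all_res_np["COLOR_BGR2LUV"])
monet_preds = xgb.predict(luv_histograms.loc[monet_idx])
print(pd.Series(monet_preds).value_counts().head())

The resulting counts: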
Predicted author       | Number of predictions
Claude Oscar Monet     | 186
Pierre Auguste Renoir  | 171
Vincent Van Gogh       | 25
Peter Paul Rubens      | 19
Gustave Dore           | 17

People tend to confuse Monet with Manet, while our model prefers to confuse him with Renoir and Van Gogh. Let's look at the works that the model attributed to Van Gogh.










Now let's use our similarity search function to find Van Gogh paintings that resemble the works above (this time measuring distances in the LUV space).
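
A small sketch of the same search, just pointed at the LUV histograms (assuming luv_histograms is built from all_res_np["COLOR_BGR2LUV"] as above):

def find_closest_luv(target_id, n=5):
    distance_vector = np.apply_along_axis(spatial.distance.cosine,
                                          axis=1,
                                          arr=luv_histograms.values,
                                          v=luv_histograms.values[target_id])
    return np.argsort(distance_vector)[:n]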

Original:



Similar work:



Original:



Similar works:






Original:



Similar works:





Pleased with myself, I showed the results to my friend and learned that the histogram approach is actually quite crude: it analyzes not the distribution of the colors themselves but of their components separately. Besides, it is not so much the frequency of colors that matters as their combination. It turns out artists have long had proven approaches to choosing color schemes. That is how I learned about Johannes Itten and his color wheel.

Itten's color wheel




Johannes Itten was an artist, art theorist and teacher, the author of famous books on form and color. His color wheel is one of the best-known tools for combining colors in a way that pleases the eye.

Let's illustrate the most popular color selection methods (a small sketch of how these combinations can be indexed follows the list):



  1. Complementary colors - located on opposite sides of the wheel
  2. Analogous colors - adjacent on the wheel
  3. Classic triad - colors at the vertices of an equilateral triangle
  4. Contrast triad - colors at the vertices of an isosceles triangle
  5. Rectangle rule - colors at the vertices of a rectangle
  6. Square rule - colors at the vertices of a square
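
One common way to express these rules in code is to number the colors 0..11 clockwise around a 12-sector wheel; a small sketch:

n = 12  # number of sectors on the wheel
complementary_pairs = [(i, (i + 6) % n) for i in range(n // 2)]                  # opposite sectors
analogous_pairs     = [(i, (i + 1) % n) for i in range(n)]                       # neighbouring sectors
classic_triads      = [(i, (i + 4) % n, (i + 8) % n) for i in range(n // 3)]     # equilateral triangles
contrast_triads     = [(i, (i + 5) % n, (i + 7) % n) for i in range(n)]          # isosceles triangles
rectangles          = [(i, (i + 2) % n, (i + 6) % n, (i + 8) % n) for i in range(n // 2)]
squares             = [(i, (i + 3) % n, (i + 6) % n, (i + 9) % n) for i in range(n // 4)]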

Analyzing like artists


Let's try to put this new knowledge into practice. To start, we extract an array of the colors lying on the color wheel by reading them off its image.



Now we can compare every pixel of our paintings with the array of Itten wheel colors, replace each original pixel with the nearest wheel color, and compute the color frequencies in the resulting image.
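
The listing below relies on a wheel palette (wheel_colors2 in BGR, wheel_colors_luv in LUV) and a helper get_image_colors that are built elsewhere; here is only a hedged sketch of what they might look like (the wheel image path is a placeholder, and in practice the antialiased edges of the wheel would need filtering or clustering):

from sklearn.metrics.pairwise import euclidean_distances

# Palette: the unique non-background colors of the wheel image, in BGR and in LUV
wheel_img = cv2.imread("itten_wheel.png")                       # placeholder path
wheel_pixels = wheel_img.reshape(-1, 3)
wheel_pixels = wheel_pixels[(wheel_pixels < 250).any(axis=1)]   # drop the white background
wheel_colors2 = np.unique(wheel_pixels, axis=0)
wheel_colors_luv = cv2.cvtColor(wheel_colors2.reshape(1, -1, 3), cv2.COLOR_BGR2LUV).reshape(-1, 3)

# For a painting, snap every pixel to the nearest wheel color (in LUV)
# and return the wheel color indices together with their relative frequencies
def get_image_colors(image_index):
    luv_img = cv2.cvtColor(load_image_by_index(image_index), cv2.COLOR_BGR2LUV)
    closest = np.argmin(euclidean_distances(luv_img.reshape(-1, 3), wheel_colors_luv), axis=1)
    values, counts = np.unique(closest, return_counts=True)
    return values, counts / counts.sum()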

Source
# Show the painting, its version with every pixel snapped to the Itten wheel,
# and a bar of the resulting color frequencies. The wheel palette (wheel_colors2 in BGR,
# wheel_colors_luv in LUV) and get_image_colors come from the previous step.
def plot_composition_analysis(image_index):
    img = load_image_by_index(image_index)
    luv_img = cv2.cvtColor(load_image_by_index(image_index), cv2.COLOR_BGR2LUV)
    # For every pixel, find the nearest wheel color in LUV space
    closest_colors = np.argmin(euclidean_distances(luv_img.reshape(-1,3), wheel_colors_luv), axis=1)
    color_areas_img = wheel_colors2[closest_colors].reshape(img.shape)

    v, c = get_image_colors(image_index)
    # Turn frequencies into bar widths in pixels; absorb the rounding error into the widest bar
    c_int = (c*img.shape[1]).astype(int)
    c_int_delta = img.shape[1] - sum(c_int)
    c_int[np.argmax(c_int)] = c_int[np.argmax(c_int)] + c_int_delta

    bar_segments = []
    for i, vi in enumerate(v):
        bar_width = c_int[i]
        bar_segments.append(np.tile(wheel_colors2[vi], (150, bar_width, 1)))
    color_bar_img = np.hstack(bar_segments)

    # Stitch the original, the recolored image and the frequency bar together with white margins
    final_image = np.hstack([
                                np.vstack([img,
                                           np.tile(np.array([254,254,254]), (160, img.shape[1], 1))]),

                                np.tile(np.array([254,254,254]), (img.shape[0]+160, 10, 1)),
                                np.vstack([color_areas_img,
                                           np.tile(np.array([254,254,254]), (10, img.shape[1], 1)),
                                           color_bar_img])
                            ])
    h = 12
    w = h / final_image.shape[1] * final_image.shape[0]
    fig = plt.figure(figsize=(h, w))
    plt.imshow(cv2.cvtColor(final_image.astype(np.uint8), cv2.COLOR_BGR2RGB), interpolation='nearest', aspect='auto')
    plt.title("{} - {}".format(paintings_links.iloc[image_index].artist_name,
                               paintings_links.iloc[image_index].picture_name), ha="center")
    plt.axis('off');












Have you noticed how little "Young Arlesian" changed after our transformation? Perhaps it is worth measuring not only the frequencies of the new colors but also statistics of the conversion error - that could help in the analysis.

But this is still not enough for a real analysis. Shall we look for harmonious combinations on the wheel?

All complementary pairs:



All classical triads:



And all squares:



We will look for these combinations in our paintings, pick the most prominent ones (by frequency), and see what we get.
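
A hedged sketch of how one kind of combination can be scored, reusing get_image_colors and a list of complementary pairs like the one sketched earlier (complementary_pairs is assumed to hold pairs of indices into the same palette that get_image_colors uses):

# The score of a pair is the total share of the painting covered by its two colors
def best_complementary_pair(image_index, complementary_pairs):
    v, c = get_image_colors(image_index)
    freq = dict(zip(v, c))
    scores = [(freq.get(i, 0) + freq.get(j, 0), (i, j)) for i, j in complementary_pairs]
    return max(scores)  # (combined frequency, (color index, opposite color index))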

First, the pairs:











Then the triads:













And, finally, the squares:















Not bad to the eye, but will the new features help determine the author of a work?

We will train a model using only the Itten color frequencies, the conversion-error statistics and the harmonious combinations we found.
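
A minimal sketch of what such a feature vector could look like under the assumptions above (per-wheel-color frequencies, a crude conversion-error statistic, and the score of the strongest complementary pair; the exact feature set used in the original is not shown):

def itten_features(image_index, complementary_pairs):
    v, c = get_image_colors(image_index)
    freqs = np.zeros(len(wheel_colors2))
    freqs[v] = c
    # mean LUV distance from each pixel to its nearest wheel color as a "conversion error"
    luv_img = cv2.cvtColor(load_image_by_index(image_index), cv2.COLOR_BGR2LUV)
    conversion_error = euclidean_distances(luv_img.reshape(-1, 3), wheel_colors_luv).min(axis=1).mean()
    best_pair_score = best_complementary_pair(image_index, complementary_pairs)[0]
    return np.hstack([freqs, conversion_error, best_pair_score])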

This time the list of the most "predictable" artists changed a bit, which means the different approach to the analysis let us extract some additional information from the contents of a painting.
precision | recall   | f1-score | artist
0.043478  | 0.012195 | 0.019048 | Martin, Henri-Jean-Guillaume
0.032680  | 0.029070 | 0.030769 | Camille Pissarro
0.166667  | 0.019608 | 0.035088 | Jean-Leon Jerome
0.076923  | 0.027778 | 0.040816 | Turner, Joseph Mallord William
0.133333  | 0.024390 | 0.041237 | Poortvliet, Rien
0.100000  | 0.026316 | 0.041667 | Max Klinger
0.026725  | 0.228216 | 0.047847 | Pierre Auguste Renoir
0.200000  | 0.028571 | 0.050000 | Brasilier, Andre
0.028745  | 0.639098 | 0.055016 | Claude Oscar Monet

Conclusion


Works of art are unique. Artists use many compositional techniques that make us admire their work again and again.
The color scheme is an important, but far from the only, component in analyzing an artist's work.

Many real art critics will laugh at the naivety of this analysis, but I am still satisfied with the work done. There are a few more ideas I have not gotten around to yet:

  1. Apply clustering algorithms to artists' color schemes. We could surely find interesting groups there and distinguish various movements in painting
  2. Apply clustering algorithms to individual paintings and look for "subjects" identifiable by color scheme - for example, landscapes, portraits and still lifes ending up in different clusters
  3. Search not only for pairs, triads and squares, but also for other combinations from Itten's wheel
  4. Move from frequency analysis to the analysis of color patches by grouping pixels by their location
  5. Find works whose authorship is disputed and see whom the model votes for

PS


This article started out as a graduation project for a machine learning course, but several people recommended turning it into an article for Habr.

I hope you found it interesting.

All the code used in this work is available on github.
