Picasso “styles” visualized with t-SNE

In an earlier blog post I explained how I came to discover that “style,” as described in the Gatys et al. style transfer paper, has little correlation with what humans think of when they think of a particular artist’s style, and instead has more to do with color and texture. In this sense, a particular artist can have wildly different “styles” over the course of their career.

This led me to wonder whether this notion of style could be used to identify different periods within a particular artist’s career, even if it can’t be used to identify the artist. Coincidentally, I recently learned about t-SNE, a machine learning algorithm for visualizing high-dimensional data.

Without going into too much detail, t-SNE projects high-dimensional data into a lower-dimensional space that humans can visualize, in such a way that “similar” points in the original space appear close together in the projection. The user can define this notion of similarity however she likes by supplying a metric between points in the original space. Unlike PCA, which is a linear projection, t-SNE can help preserve nonlinear structure in the data.
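To make the idea of a user-defined metric concrete, here is a minimal sketch that builds the matrix of pairwise distances between a few points under a metric of your choosing. The names `pairwise_distances` and `manhattan` are hypothetical helpers written for this illustration, and the three points are toy data, not anything from the Picasso dataset.

```python
import numpy as np

def pairwise_distances(points, metric):
    """Return the symmetric matrix D with D[i, j] = metric(points[i], points[j])."""
    n = len(points)
    dists = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # Each pair is evaluated once and mirrored, so the metric is assumed symmetric
            dists[i, j] = dists[j, i] = metric(points[i], points[j])
    return dists

def manhattan(a, b):
    # Hypothetical example metric: sum of absolute coordinate differences
    return np.abs(a - b).sum()

# Toy data for illustration only
points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
D = pairwise_distances(points, manhattan)
print(D)
```

A matrix like `D` is all t-SNE needs to know about the original space. scikit-learn's `TSNE` can take either a callable metric or, with `metric='precomputed'` and `init='random'`, a distance matrix computed ahead of time.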

For this exploration, I focused on 359 works by Picasso (namely, those available from this Kaggle dataset that had year data available). For each image in the dataset, I computed its Gram matrices from 5 convolutional layers (conv1_1 through conv5_1) of the 19-layer VGG network. (As explained in the style transfer paper and my previous post, a Gram matrix captures the correlations between the activations of the different feature maps in a given layer of the neural net, and thus represents the style of the original image.) I took all 5 Gram matrices, flattened them, and concatenated them to get an embedding of the image in a very high-dimensional space (610,304 dimensions, to be exact):

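import os

import numpy as np
import tensorflow as tf
from sklearn.manifold import TSNE

# Note: graph, model, tf_image, get_imarray and im_dir are not defined in this
# snippet; they are assumed to come from the VGG-19 setup code.

# Number of feature maps in conv1_1 ... conv5_1
NUM_CHANNELS = [64, 128, 256, 512, 512]
# Side length of each layer's output for a 224x224 input image
LAYER_IM_SIZE = [224, 112, 56, 28, 14]
# Length of the concatenated, flattened Gram matrices (610304)
EMBED_SIZE = sum(c * c for c in NUM_CHANNELS)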

def gram_matrix(F, N, M):
    # F is the output of the given convolutional layer on a particular input image
    # N is the number of feature maps in the layer
    # M is the number of entries in each feature map (height * width)
    Ft = np.reshape(F, (M, N))
    return np.dot(np.transpose(Ft), Ft)

def flattened_gram(imarray, session):
    # Concatenate the flattened Gram matrices of conv1_1 ... conv5_1 into one vector
    grams = np.empty([EMBED_SIZE])
    index = 0
    for i in range(5):
        layer = model['conv' + str(i + 1) + '_1']
        activations = session.run(layer, feed_dict={tf_image: imarray})
        gram = gram_matrix(activations, NUM_CHANNELS[i], LAYER_IM_SIZE[i]**2)
        grams[index:(NUM_CHANNELS[i]**2 + index)] = gram.flatten()
        index += NUM_CHANNELS[i]**2
    return grams

filenames = []
for filename in os.listdir(im_dir):
    if os.path.splitext(filename)[1] in ('.jpg', '.png'):
        filenames.append(filename)

embeddings = np.empty([len(filenames), EMBED_SIZE])

with tf.Session(graph=graph) as sess:
    count = 0
    for filename in filenames:
        embeddings[count, :] = flattened_gram(get_imarray(os.path.join(im_dir, filename)), sess)
        count += 1
        if count % 10 == 0:
            print("Embedded " + str(count) + " images")

print("Large embeddings generated. Shape: " + str(embeddings.shape))

def distance(fg1, fg2):
    # Style distance between two flattened embeddings: a weighted sum, over the
    # 5 layers, of the element-wise squared differences of their Gram matrices
    dist = 0
    index = 0
    for i in range(5):
        square_1 = np.reshape(fg1[index:NUM_CHANNELS[i]**2 + index], (NUM_CHANNELS[i], NUM_CHANNELS[i]))
        square_2 = np.reshape(fg2[index:NUM_CHANNELS[i]**2 + index], (NUM_CHANNELS[i], NUM_CHANNELS[i]))
        index += NUM_CHANNELS[i]**2
        dist += (1.0 / (4 * NUM_CHANNELS[i] * LAYER_IM_SIZE[i]**2)) * np.square(square_1 - square_2).sum()
    return dist

tsne = TSNE(perplexity=30, n_components=2, init='pca', n_iter=5000, metric=distance)
print("Projecting onto two dimensions... this might take a while")
two_d_embeddings = tsne.fit_transform(embeddings)
print("2D embeddings generated")
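As a sanity check on the embedding step, here is a small sketch that runs the same Gram matrix computation on a synthetic activation tensor and confirms the embedding size quoted above. The random tensor is a stand-in for a real layer output, and the sketch assumes the channels-last layout (batch, height, width, feature maps) that the reshape in `gram_matrix` relies on.

```python
import numpy as np

NUM_CHANNELS = [64, 128, 256, 512, 512]  # feature maps per layer, as in the post

def gram_matrix(F, N, M):
    # Flatten the spatial dimensions so that each column holds one feature map
    Ft = np.reshape(F, (M, N))
    return Ft.T @ Ft

# Synthetic stand-in for one layer's output: batch of 1, 4x4 grid, 3 feature maps
rng = np.random.default_rng(0)
F = rng.standard_normal((1, 4, 4, 3))
G = gram_matrix(F, N=3, M=16)

# Entry (i, j) of the Gram matrix is the inner product of feature maps i and j
manual_01 = (F[0, :, :, 0] * F[0, :, :, 1]).sum()

# Total number of Gram entries across the 5 layers
embed_size = sum(c * c for c in NUM_CHANNELS)
print(G.shape, np.isclose(G[0, 1], manual_01), embed_size)
```

The Gram matrix is symmetric and has one row and column per feature map, regardless of the image's spatial size. That is why the embedding length depends only on `NUM_CHANNELS`.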

Once I had the 2D embeddings, I plotted them using matplotlib with the color corresponding to the year in which the artwork was made. In the visualization, you can see how the styles spread out as Picasso’s career progresses:
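For readers who want to produce a plot of this kind, here is a minimal sketch using matplotlib. The function name `plot_by_year`, the colormap, and the randomly generated points and years are all placeholders chosen for illustration; in practice you would pass in the 2D embeddings and the year of each artwork.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch also runs without a display
import matplotlib.pyplot as plt

def plot_by_year(points, years, path=None):
    """Scatter 2D points, coloring each one by its year."""
    fig, ax = plt.subplots(figsize=(8, 6))
    sc = ax.scatter(points[:, 0], points[:, 1], c=years, cmap="viridis", s=20)
    fig.colorbar(sc, ax=ax, label="Year")
    ax.set_title("2D style embeddings colored by year")
    if path is not None:
        fig.savefig(path)
    return fig, ax, sc

# Placeholder data: random points and random years, NOT the real embeddings
rng = np.random.default_rng(0)
points = rng.standard_normal((359, 2))
years = rng.integers(1890, 1974, size=359)
fig, ax, sc = plot_by_year(points, years)
```

Passing the years directly as `c` lets matplotlib map them onto a continuous colormap, so earlier and later works show up as a gradient rather than as discrete classes.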


Later, I teamed up with Lyn N. to make this awesome version in JavaScript using Chart.js, where you can explore styles decade by decade and click on points to see the image they represent. I’m now looking to generalize the scripts I used to embed the images and produce the plots for more versatile uses, like exploring stylistic influences or comparing different artists within similar artistic schools.

[UPDATE] The interactive JavaScript version is now live on the web for your enjoyment! Clicking on a data point shows its corresponding artwork image, and clicking on the keys at the bottom of the page toggles the data points from the corresponding decade on or off.




2 thoughts on “Picasso “styles” visualized with t-SNE”

  1. Super fun blog posts! It would be great if you could start including import statements in the code so we can follow along better. Some are obvious like tf is probably TensorFlow, but if your readers aren’t used to scientific python, they might have no idea that np is numpy, etc.

    Keep the posts coming! 🙂

