Lab 5: Neural Style Transfer

Introduction

In this lab, you'll take an image and then apply the style of an artist to it (original paper). You'll take a picture of a bird:

robin JPEG

and then apply the style of Van Gogh's The Starry Night:

yielding something like:

Approach

You'll be doing gradient descent to update values, but instead of updating the weights in a neural network, you'll update the pixels of the picture that is fed into the network. Here's a sample notebook that shows how you can use gradient descent to update inputs in PyTorch.
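As a rough illustration of updating inputs rather than weights, here is a minimal sketch. The tiny linear layer, the target values, the Adam optimizer, and the learning rate are all assumptions chosen for the example; they are not taken from the sample notebook.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# A tiny frozen "network" (hypothetical stand-in): its weights never change.
net = torch.nn.Linear(4, 3)
for p in net.parameters():
    p.requires_grad = False

# Arbitrary output we would like the network to produce.
target = torch.tensor([[1.0, -1.0, 0.5]])

# The input is what gets learned, so it must track gradients.
x = torch.randn(1, 4, requires_grad=True)

# The optimizer is given the input tensor, not the model's parameters.
optimizer = torch.optim.Adam([x], lr=0.1)

initial_loss = F.mse_loss(net(x), target).item()
for _ in range(200):
    optimizer.zero_grad()
    loss = F.mse_loss(net(x), target)
    loss.backward()   # gradients flow back to x only
    optimizer.step()  # x is nudged; the network is untouched
final_loss = loss.item()
```

The only structural difference from ordinary training is which tensors have requires_grad set and which tensors are handed to the optimizer.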

Parts

Part 1

In the first part, you'll use the content picture and attempt to recreate a similar-content picture starting from random pixels.

Steps

  1. Resize the content picture to 224x224.
  2. Create a random tensor that will represent an image, also of size 224x224. You'll be feeding this tensor into the neural network. What shape should it be? What is the batch size? How many channels does it have?
  3. Create your VGG19 model:
      from torchvision.models import vgg19_bn
      m_vgg = vgg19_bn(pretrained=True).cuda()
    Note that this model has fully-connected layers at the end which you don't need. You can remove them, if you wish, so that your training runs faster.
  4. Make sure your model is frozen (you do not want to update the weights of the neural network). You can obtain the parameters (learnable Tensors) of a model with m_vgg.parameters(). Given a parameter, p, you can freeze it with: p.requires_grad = False.
  5. You'll need a way to get at the activations at a particular layer of your model. Model hooks are a good way to do that. For example, hook_outputs will save the activations of any passed-in modules. In particular, you'll want to save the output of the second Conv2d layer between the next-to-last MaxPool and the MaxPool before it.
  6. Convert your resized content image to a tensor (image2tensor will do such a thing).
  7. Run your model with your content image tensor (just call the model as a function, passing in the content image tensor). Save the output of the appropriate Conv2d layer. This represents the content of the content image.
  8. Now run your optimizer to optimize the input. Use an MSE loss between the output of the appropriate Conv2d layer for the input being optimized and the output of that same layer for the content image.
  9. When you're done, your tensor that was originally random should now look like a bird. Here is an example of what the output might look like:
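The steps above can be sketched end to end on a toy problem. Everything here is a simplified assumption so that the example runs quickly on a CPU: a tiny convolutional stack stands in for VGG19, random 32x32 tensors stand in for the resized content image and the starting image, PyTorch's built-in register_forward_hook is used in place of hook_outputs, and the Adam optimizer, learning rate, and step count are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-in for the convolutional part of VGG19.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
).eval()
for p in model.parameters():
    p.requires_grad = False  # freeze the network

# Forward hook that stores the most recent output of the chosen layer.
saved = {}
def save_activation(module, inputs, output):
    saved["content"] = output

content_layer = model[3]  # assumed "content" layer for this toy model
handle = content_layer.register_forward_hook(save_activation)

# Placeholder for the resized content image: (batch, channels, height, width).
content_img = torch.rand(1, 3, 32, 32)
model(content_img)
content_target = saved["content"].detach()  # fixed target, no gradient

# The image being optimized starts as random pixels.
x = torch.rand(1, 3, 32, 32, requires_grad=True)
optimizer = torch.optim.Adam([x], lr=0.05)

losses = []
for _ in range(50):
    optimizer.zero_grad()
    model(x)  # the hook refreshes saved["content"]
    loss = F.mse_loss(saved["content"], content_target)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

handle.remove()  # detach the hook once it is no longer needed
```

In the lab itself, the model, the layer choice, and the image sizes come from the steps above; only the overall flow carries over from this sketch.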

Part 2

In this part, you'll change your loss function so that you're no longer trying to match the content of the content image, but instead match the style of the style image.

The style is represented by the activations of many different layers. You'll want to repeat much of what you did in Part 1, except that you'll save the activations of more layers: the first Conv2d layer of the network, plus the first Conv2d layer after each MaxPool.

In addition, the loss will be different. For each layer of activations, you'll create a Gram matrix. The loss should be the average, across all layers, of the MSE loss between the Gram matrix of the image being optimized and the Gram matrix of the style image. Here is an example of what the output might look like:
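One possible way to write the Gram matrix and the averaged style loss is sketched below. The function names are hypothetical, and dividing by the number of activation entries is an assumed normalization (a common convention, not something the lab specifies).

```python
import torch
import torch.nn.functional as F

def gram_matrix(acts):
    """Map (batch, channels, height, width) to (batch, channels, channels)."""
    b, c, h, w = acts.shape
    flat = acts.reshape(b, c, h * w)  # one row per channel
    # Channel-by-channel inner products; the divisor is an assumed normalization.
    return flat @ flat.transpose(1, 2) / (c * h * w)

def style_loss(generated_acts, style_acts):
    """Average, over layers, of the MSE between corresponding Gram matrices."""
    per_layer = [
        F.mse_loss(gram_matrix(g), gram_matrix(s))
        for g, s in zip(generated_acts, style_acts)
    ]
    return torch.stack(per_layer).mean()
```

Because the Gram matrix sums over all spatial positions, it records which channels fire together but discards where in the image they fire, which is why it captures texture rather than layout.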

Part 3

For this part, you'll use a combined loss that takes into account both the content loss and the style loss.
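A combined loss is often written as a weighted sum of the two terms. The sketch below assumes that form; the function name and the default weights are hypothetical and would need tuning, since the two losses usually live on very different scales.

```python
import torch

def combined_loss(content_loss, style_loss, content_weight=1.0, style_weight=1e3):
    """Weighted sum of the two losses; the weights are assumed hyperparameters."""
    return content_weight * content_loss + style_weight * style_loss
```

Raising style_weight relative to content_weight pushes the result toward the style image's textures; lowering it preserves more of the content image's structure.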

Challenges

Challenge 1: Exploit the GPU's ability to process a whole batch at once. Use 10 different content images and 10 different style images and apply Neural Style Transfer to all 10 at once by putting them in a single batch. Take advantage of any tensor operations that you can do in parallel (for example, calculating the Gram matrices).

Challenge 2: Use full-size images rather than 224x224 images (for content, style, and the new image).

Challenge 3: The paper says that better results are obtained by replacing MaxPool layers with AvgPool layers. Make that replacement and evaluate whether you get better-looking results.
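For the pooling replacement, one approach is to walk the module tree and swap each layer in place. The helper below is a hypothetical sketch, not part of PyTorch or of the lab's starter code; it copies the kernel size, stride, padding, and ceil_mode from each MaxPool2d so that output shapes stay the same.

```python
import torch.nn as nn

def replace_maxpool_with_avgpool(module):
    """Recursively swap every nn.MaxPool2d under `module` for an nn.AvgPool2d."""
    # Copy the child list first so the module tree is not mutated mid-iteration.
    for name, child in list(module.named_children()):
        if isinstance(child, nn.MaxPool2d):
            setattr(module, name, nn.AvgPool2d(
                kernel_size=child.kernel_size,
                stride=child.stride,
                padding=child.padding,
                ceil_mode=child.ceil_mode,
            ))
        else:
            replace_maxpool_with_avgpool(child)
    return module
```

Pooling layers have no learnable weights, so the pretrained convolution weights are unaffected by the swap.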

This completes the lab.

Submission instructions

  1. Make sure that the output of all cells is up-to-date.
  2. Rename your notebook:
    1. Click on the notebook name at the top of the window.
    2. Rename to "CS152Sp21Lab5 FirstName1/FirstName2" (using the correct lab number, along with your two first names). I need this naming so I can easily navigate through the large number of shared docs I will have by the end of the semester.
  3. Choose File/Save
  4. Share your notebook with me:
    1. Click on the Share button at the top-right of your notebook.
    2. Enter rhodes@g.hmc.edu as the email address.
    3. Click the pencil icon and select Can comment.
    4. Click on Done.
  5. Enter the URL of your colab notebook in this submittal form. Do not copy the URL from the address bar (which may contain an authuser parameter and which I will not be able to open). Instead, click Share and Copy link to obtain the correct link. Enter your names in alphabetical order.
  6. At this point, you and I will go back and forth until the lab is approved.
    1. I will provide inline comments as I evaluate the submission (Google should notify you of these comments via email).
    2. You will then need to address those comments. Please do not resolve or delete the comments. I will use them as a record of our conversation. You can respond to them ("Fixed" perhaps).
    3. Once you have addressed all the comments in this round, fill out the submittal form again.
    4. Once I am completely satisfied with your lab, I will add an LGTM (Looks Good to Me) comment.
    5. At that point, set up an office hour appointment with me. I'll meet with you and your partner and we'll have a short discussion about the lab. Both of you should be able to answer questions about any part of the lab.
