Skip to content

Commit

Permalink
Merge branch 'developer'
Browse files Browse the repository at this point in the history
  • Loading branch information
raulbonet committed Aug 8, 2020
2 parents 7d070ef + eea1f61 commit d1ee648
Show file tree
Hide file tree
Showing 9 changed files with 212 additions and 2 deletions.
6 changes: 6 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -127,3 +127,9 @@ dmypy.json

# Pyre type checker
.pyre/

# Pycharm
.idea/

# Windows
desktop.ini
46 changes: 44 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,2 +1,44 @@
# gatys-style-transfer
TensorFlow 2 impelemntation of Gatys et al. style transfer algorithm
# Neural Style Transfer
### Implementation of Gatys et al. algorithm using Tensorflow 2

The following repository contains an implementation of the Neural Style Transfer algorithm described by Gatys et al. in their paper [A Neural Algorithm of Artistic Style](https://arxiv.org/pdf/1508.06576).
The algorithm has been implemented using Tensorflow 2, and unlike many existing implementations, it has been written using its new *eager execution* feature. It also deploys Tensorflow's built-in VGG model, included in the Keras module.

### Background of the algorithm
The original algorithm, the one implemented in this repository, is of an iterative nature: a randomly generated initial image successively converges to the desired image, one mixing both the **content** of an image and the **style** of another.

<p align="center">
<img src="data/input-example.jpg" alt="Content Image" height=350px width=350px/>
<img src="data/style-example.jpg" alt="Style Image" height=350px width=350px/>
</p>

Output:

<p align="center">
<img src="data/out_example3000.jpg" alt="Content Image" height=350px/>
</p>



At each iteration, a loss function is evaluated, its gradient computed, and the image being iterated, updated. The loss function is made up of 2 parts:

* Content loss <img src="https://render.githubusercontent.com/render/math?math=J_c">: loss aimed to evaluate how close the content of an image is to the original content of the reference image. Let <img src="https://render.githubusercontent.com/render/math?math=l"> be a given layer, <img src="https://render.githubusercontent.com/render/math?math=f_{l}(x)"> the function that computes the flattened feature maps of that layer <img src="https://render.githubusercontent.com/render/math?math=l"> for a given image <img src="https://render.githubusercontent.com/render/math?math=x">. Then, the content loss at each iteration is defined as <img src="https://render.githubusercontent.com/render/math?math=f(a)-f(b)">, where <img src="https://render.githubusercontent.com/render/math?math=a"> is the original reference image, and <img src="https://render.githubusercontent.com/render/math?math=b"> the image being iterated.
* Style loss <img src="https://render.githubusercontent.com/render/math?math=J_s">: similarly, loss aimed to evaluate how close the style of two images are from one another. Let <img src="https://render.githubusercontent.com/render/math?math=\{l_i\}"> be a sequence of layers. Let <img src="https://render.githubusercontent.com/render/math?math=G_i(x)"> be the Gram Matrix of the flattened feature maps of layer <img src="https://render.githubusercontent.com/render/math?math=l_i"> for an input image <img src="https://render.githubusercontent.com/render/math?math=x">. Then, the style loss is defined as the sum over all <img src="https://render.githubusercontent.com/render/math?math=l_i"> of <img src="https://render.githubusercontent.com/render/math?math=G_i(a)-G_i(b)">.

The total loss <img src="https://render.githubusercontent.com/render/math?math=J"> is defined as <img src="https://render.githubusercontent.com/render/math?math=J = \alpha J_c %2B \beta J_s">, where <img src="https://render.githubusercontent.com/render/math?math=\alpha"> and <img src="https://render.githubusercontent.com/render/math?math=\beta"> are fixed weights.


### Details

The source code includes:
* A Python script named `main-transfer.py`, which transfers the style of an image to another image. The script is designed either to be called using command-line arguments, or be imported as a Python module to be used at other projects.
* A Jupyter notebook, with the same code as the `main-transfer.py` file, but explaining thoroughly each step. This file is aimed both to do further research with the code and give some insight on how Tensorflow works.


When executing the script through the command-line, several arguments are required, although all of them come with a default value. Please, execute `python src/main-transfer.py --help` to get help for the arguments. All default values are contained inside the `src/constants.py` file.

When executing `python src/main-transfer.py`, the script will look for the image `data/input-example.jpg`, which will be the content image, and the image `data/style-example.jpg`, and will transfer the style of the latter to the former. The result, after 3000 iterations (epochs), can be found at `out_example3000.jpg`. These images can be changed using command-line arguments as described in the previous section.

The code has been developed (or tested) in the following platform; see the `requirements.txt` file for further information on Python modules' versions:
* Ubuntu 18.04.03 / Windows 10 Pro version 1909 18363.959
* Tensorflow 2.2.0
Binary file added data/input-example.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added data/out_example3000.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added data/style-example.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Empty file added output/.gitkeep
Empty file.
3 changes: 3 additions & 0 deletions requirements.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
numpy==1.18.4
tensorflow==2.2.0
pillow==7.1.2
23 changes: 23 additions & 0 deletions src/constants.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
import os

BASE_PATH = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
SRC_PATH = os.path.join(BASE_PATH, 'src')

EX_IM_PATH = os.path.join(os.path.join(BASE_PATH, 'data'), 'input-example.jpg')
STYLE_IM_PATH = os.path.join(os.path.join(BASE_PATH, 'data'), 'style-example.jpg')
# EX_OUT_PATH = os.path.join(os.path.join(BASE_PATH, 'output'), 'output-example.jpg')
EX_OUT_PATH = os.path.join(BASE_PATH, 'output')


CONTENT_LAYER_NAME = 'block4_conv2'
STYLE_LAYER_NAMES = [
'block1_conv1',
'block2_conv1',
'block3_conv1',
'block4_conv1',
'block5_conv1',
]
STYLE_WEIGHTS = [0.2, 0.2, 0.2, 0.2, 0.2]
ALPHA = 1
BETA = 10000
EPOCHS = 3000
136 changes: 136 additions & 0 deletions src/main-transfer.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,136 @@
import argparse
import os
import tensorflow as tf
import numpy as np
from PIL import Image

import constants as c


def parse_f():
parser = argparse.ArgumentParser(description='Neural Style Transfer')
parser.add_argument('--content_im_path', default=c.EX_IM_PATH, help='path of input image, including filename')
parser.add_argument('--output_im_path', default=c.EX_OUT_PATH, help='path of the folder in which the output image will be stored')
parser.add_argument('--content_layer', default=c.CONTENT_LAYER_NAME, help='content layer name whose feature maps will be used for loss computation. Read Readme for further information')
parser.add_argument('--style_layer', default=c.STYLE_LAYER_NAMES, help='style layer name(s) whose feature maps will be used for loss computation. Read Readme for further information')
parser.add_argument('--style_layer_weights', default=c.STYLE_WEIGHTS, help='weights for each style layer')
parser.add_argument('--content_loss_weight', default=c.ALPHA, help='weight of content loss on loss function')
parser.add_argument('--style_loss_weight', default=c.BETA, help='weight of style loss on loss function')
parser.add_argument('--epochs', default=c.EPOCHS, help='number of iterations when training')
return parser


def layer_dims_f(model, layer_names):
num_kernels = np.zeros(len(layer_names), dtype='int32')
dim_kernels = np.zeros(len(layer_names), dtype='int32')
for i, l_name in enumerate(layer_names):
num_kernels[i] = model.get_layer(l_name).output_shape[3]
dim_kernels[i] = model.get_weights()[0].shape[0] ** 2
return num_kernels, dim_kernels


def loss_f(content_ext_model, style_ext_models, content_ref_fmap, style_ref_grams, num_kernels, dim_kernels, gen_im,
style_layer_weights, content_loss_weight, style_loss_weight):

def _loss_f():
gen_im_preproc = tf.keras.applications.vgg19.preprocess_input(gen_im)

content_gen_fmap = content_ext_model(gen_im_preproc)
content_loss = 0.5 * tf.math.reduce_sum(tf.math.square(content_ref_fmap - content_gen_fmap))

style_gen_grams = style_grams_f(style_ext_models, gen_im_preproc)
diff_styles = [tf.math.square(a - b) for a, b in zip(style_ref_grams, style_gen_grams)]
diff_styles_reduced = tf.TensorArray(dtype='float32', size=len(diff_styles))
for i, diff_style in enumerate(diff_styles):
diff_styles_reduced = diff_styles_reduced.write(i, tf.math.reduce_sum(diff_style))
diff_styles_reduced = diff_styles_reduced.stack()

style_loss = 1. / ((2 * num_kernels * dim_kernels) ** 2) * diff_styles_reduced
style_loss = tf.tensordot(style_layer_weights, style_loss, axes=1)

total_loss = content_loss_weight * content_loss + style_loss_weight * style_loss
return total_loss

return _loss_f


def style_grams_f(style_ext_models, im_preproc):
style_ref_fmaps = [style_model(im_preproc) for style_model in style_ext_models]

def _gram_matrix_f(kernels):
kernels = kernels[0, :, :, :]
num_kernels = kernels.shape[2]
flattened_kernels = tf.reshape(kernels, (num_kernels, -1))
gram_matrix = tf.tensordot(flattened_kernels, tf.transpose(flattened_kernels), axes=1)
return gram_matrix

style_ref_grams = [_gram_matrix_f(f_maps) for f_maps in style_ref_fmaps]
return style_ref_grams


def load_image_tf_f(path):
im = Image.open(path)
im = np.array(im)
im = np.expand_dims(im, 0)
im = tf.Variable(im, dtype='float32')
return im


def main(**kwargs):

# Load content image
content_im = load_image_tf_f(kwargs['content_im_path'])
content_im_preproc = tf.keras.applications.vgg19.preprocess_input(content_im)

# VGG model
_, width, height, channels = content_im.shape
model_vgg = tf.keras.applications.VGG19(include_top=False, weights='imagenet', input_tensor=None,
input_shape=(width, height, 3), pooling='max')
model_vgg.trainable = False
num_kernels, dim_kernels = layer_dims_f(model_vgg, kwargs['style_layer'])

# Content loss: model to compute feature maps
content_ext_model = tf.keras.Model(inputs=model_vgg.inputs,
outputs=model_vgg.get_layer(kwargs['content_layer']).output)

# Content loss: feature maps of input (reference) image
content_ref_fmap = content_ext_model(content_im_preproc)

# Style loss: load style image
style_im = load_image_tf_f(c.STYLE_IM_PATH)
style_im_preproc = tf.keras.applications.vgg19.preprocess_input(style_im)

# Style loss: gram matrix of input (reference) image
style_ext_models = [tf.keras.Model(inputs=model_vgg.inputs, outputs=model_vgg.get_layer(layer).output) \
for layer in kwargs['style_layer']]
style_ref_grams = style_grams_f(style_ext_models, style_im_preproc)

# Create noise image
# gen_im = np.random.randint(0, 256, (1, width, height, channels))
# gen_im = np.zeros((1, width, height, channels))
# gen_im = tf.Variable(np.array(gen_im), dtype='float32')

# Start iterating from original image
gen_im = tf.Variable(content_im)

# Optimizer
opt = tf.keras.optimizers.Adam(learning_rate=1)

for epoch in range(kwargs['epochs']):
opt.minimize(
loss=loss_f(content_ext_model, style_ext_models, content_ref_fmap, style_ref_grams, num_kernels, dim_kernels, gen_im,
kwargs['style_layer_weights'], kwargs['content_loss_weight'], kwargs['style_loss_weight']),
var_list=[gen_im])
if epoch % 5 == 0:
save_im = gen_im.numpy()[0, :, :, :]
save_im = np.where(save_im <= 255, save_im, 255)
save_im = np.where(save_im >= 0, save_im, 0)
save_im = save_im.astype(np.uint8)
save_im = Image.fromarray(save_im)
save_im.save(os.path.join(kwargs['output_im_path'], 'output-example' + str(epoch) + '.jpg'))


if __name__ == '__main__':
args = parse_f().parse_args()
kwargs = args.__dict__
main(**kwargs)

0 comments on commit d1ee648

Please # to comment.