It is the beginning of a new month, so it’s time again for me to write a summary of what we did at Helsinki Rephotography in September. Last month was the Hack4OpenGLAM event, so most of our time went into preparing for that. We participated with a project about upscaling images with machine learning and artificial neural networks. For this we used the OpenCV library, whose name is short for Open Source Computer Vision Library.
Artificial neural networks are computing systems inspired by the biological neural networks found in our brains. They can be used, for example, in pattern recognition, 3D reconstruction and machine translation. For testing, we used EDSR, a neural network model trained for upscaling images.
Setting everything up
I was the one responsible for doing the initial legwork and getting OpenCV up and running. This required compiling it from the source code, since the super resolution module needed for upscaling images was not included in the pre-compiled version of OpenCV. I first built a test version without GPU acceleration, just to see how fast the upscaling was on the CPU alone. This worked well enough with simpler upscaling models such as FSRCNN, but their results weren’t as good as those of EDSR. EDSR, in turn, took well over a minute to upscale a small image on the CPU.
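As a rough sketch of what the upscaling step can look like in Python, the snippet below uses the `dnn_superres` module from OpenCV's contrib modules. It is an illustration under a few assumptions, not the exact script from the project: the helper names `parse_model_file` and `upscale` are made up for this example, and the model file is assumed to follow the `EDSR_x4.pb` naming pattern (model name, then the scale factor) used by the commonly distributed pre-trained models.

```python
from pathlib import Path


def parse_model_file(model_path):
    """Split a file name like 'EDSR_x4.pb' into ('edsr', 4).

    Assumes the '<NAME>_x<SCALE>.pb' naming pattern (an assumption of this sketch).
    """
    stem = Path(model_path).stem              # e.g. 'EDSR_x4'
    name, _, scale = stem.rpartition("_x")    # ('EDSR', '_x', '4')
    if not name or not scale.isdigit():
        raise ValueError(f"unexpected model file name: {model_path}")
    return name.lower(), int(scale)


def upscale(image_path, model_path, output_path):
    """Upscale one image with a pre-trained super resolution model."""
    # Imported here because it needs an OpenCV build that includes
    # the contrib 'dnn_superres' module.
    import cv2

    name, scale = parse_model_file(model_path)

    sr = cv2.dnn_superres.DnnSuperResImpl_create()
    sr.readModel(str(model_path))   # load the trained weights
    sr.setModel(name, scale)        # tell OpenCV which model and scale they are for

    image = cv2.imread(str(image_path))
    if image is None:
        raise FileNotFoundError(image_path)

    result = sr.upsample(image)
    cv2.imwrite(str(output_path), result)
    return result.shape
```

With a model file on disk, a call such as `upscale("input.jpg", "EDSR_x4.pb", "output.png")` would write an image with four times the width and height of the input. The file names here are placeholders.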
I then figured out how to get CUDA working with OpenCV, which makes it possible to use Nvidia GPUs in the upscaling process. It required installing the CUDA toolkit and recompiling OpenCV with CUDA support. After setting it up, I ran into another problem: my GPU did not have enough memory to do the upscaling. This prompted Kimmo to set up an analytics server with a more powerful GPU that we could try, and this ended up working. It now took only 5 seconds to upscale the same picture that had taken over a minute on the CPU alone.
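Switching between the CPU and the GPU can be sketched roughly as follows. This is a hedged example rather than the project's actual code: `configure_backend` and `timed` are hypothetical helper names, `sr` is assumed to be a `DnnSuperResImpl` object, and `cv2_module` is the imported `cv2` module (passed in as a parameter so the logic is easy to try out on its own). The CUDA backend is only selected when OpenCV reports at least one CUDA-capable device.

```python
import time


def configure_backend(sr, cv2_module):
    """Select the CUDA backend if a CUDA device is visible, otherwise the CPU."""
    cuda = getattr(cv2_module, "cuda", None)
    device_count = cuda.getCudaEnabledDeviceCount() if cuda is not None else 0

    if device_count > 0:
        sr.setPreferableBackend(cv2_module.dnn.DNN_BACKEND_CUDA)
        sr.setPreferableTarget(cv2_module.dnn.DNN_TARGET_CUDA)
        return "cuda"

    sr.setPreferableBackend(cv2_module.dnn.DNN_BACKEND_OPENCV)
    sr.setPreferableTarget(cv2_module.dnn.DNN_TARGET_CPU)
    return "cpu"


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

Calling `timed(sr.upsample, image)` once with each backend is a simple way to compare the two, similar to the comparison between over a minute and 5 seconds described above.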
After this we could try it on different images and see how it worked. We ended up using 3 different image upscaling models: GFPGAN, Real-ESRGAN and ESRGAN. These models are based on a different neural network technique called a generative adversarial network (GAN). In this technique, two neural networks compete against each other. One of them creates data points, while the other tries to distinguish between the real ones and the ones created by the first network.
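The competition between the two networks can be made a bit more concrete by looking at how each side is scored. The sketch below shows the textbook GAN losses in plain Python, assuming the second network (usually called the discriminator) outputs a probability between 0 and 1 that an input is real. The function names are chosen for this example, and the actual upscaling models combine an adversarial term like this with other loss terms, so treat it as an illustration of the idea only.

```python
import math

EPS = 1e-12  # avoids log(0) when a score is exactly 0 or 1


def discriminator_loss(real_scores, fake_scores):
    """Low when real inputs score near 1 and generated inputs score near 0."""
    real_term = sum(-math.log(p + EPS) for p in real_scores) / len(real_scores)
    fake_term = sum(-math.log(1.0 - p + EPS) for p in fake_scores) / len(fake_scores)
    return real_term + fake_term


def generator_loss(fake_scores):
    """Low when the discriminator mistakes generated inputs for real ones."""
    return sum(-math.log(p + EPS) for p in fake_scores) / len(fake_scores)
```

During training the two losses pull in opposite directions: when the discriminator gives generated images low scores, its own loss goes down while the generator's loss goes up, which pushes the generator toward more convincing output.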
The end result was that the images themselves looked good, but the AI sometimes tended to create weird things, such as badly shaven mustaches, faces inside faces and walls that looked like they were made of fabric. So it was not perfect, but it could work if you weren’t looking for a perfect result. Here is a good example of the upscaling that we did on an image provided to us by Fortepan, with whom we did this project in collaboration. This image was upscaled with ESRGAN. In it you can clearly see the difference in sharpness and detail between the original and the upscaled image.
For me it was interesting to see and learn what machine learning and neural networks can be used for. I hope that in the future we can also use this for something useful.