Super-resolution with OpenCV and Python

Summary

This recipe describes how you can make a Python tool which uses OpenCV functions and deep neural networks for image analysis. The tool scales a small image up while trying to preserve details.

Detailed description

Setting up the environment

As explained in Python tool API, the Builder will execute all Python modules whose name ends with *toolplugin.py in $HOME/VisionAppster/plugins/tool/. If these modules register tools, they will appear in Builder’s tool box. In this example, the Python module will be rescale_toolplugin.py and the name of the tool in the plugin is Rescale.

The Rescale tool uses a deep neural network (DNN) to zoom an image up with enhanced resolution. The DNN first zooms the image up in steps that are powers of two; the size is then fine-tuned with OpenCV’s resize function to match the requested zoom factor. The DNN data can be downloaded from GitHub. The Python code searches for the DNN data files (LapSRN_x*.pb) in a directory called superresolution under the directory the Python module itself is stored in, so you must copy the data files to $HOME/VisionAppster/plugins/tool/superresolution as shown below.

Python code and DNN data files

Should you want to experiment with superresolution models other than LapSRN, the DNN files and further information can be found here.
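To verify that the data files ended up where the tool expects them, you can run a quick check like the one below. This is only a sketch; the check_models helper is hypothetical and not part of the recipe.

import os

def check_models(plugin_dir=os.path.expanduser('~/VisionAppster/plugins/tool')):
    # The Rescale tool looks for the LapSRN models in the superresolution subdirectory.
    data_dir = os.path.join(plugin_dir, 'superresolution')
    for scale in (2, 4, 8):
        path = os.path.join(data_dir, 'LapSRN_x%d.pb' % scale)
        print(path, 'OK' if os.path.isfile(path) else 'MISSING')

check_models()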

Use the va-pkg command-line tool to enable Python support and to install the required NumPy and OpenCV packages:

va-pkg install com.visionappster.extensions.python
va-pkg install python:numpy python:opencv-contrib-python

NOTE: Make sure not to install python:opencv-python as opencv-contrib-python already contains OpenCV and the conflict isn’t handled well by Python’s package system.
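After installation you can quickly verify that the contrib build (the one that ships dnn_superres) is the cv2 that Python actually imports. The snippet below assumes nothing beyond the packages installed above:

import cv2
from cv2 import dnn_superres    # only available in the contrib build

# If no ImportError is raised, super-resolution support is in place.
print(cv2.__version__)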

Python code walk-through

Let us walk through the Python code.

import visionappster as va
import numpy as np
import cv2
from cv2 import dnn_superres
import os

Import the necessary modules. visionappster is needed in every Python tool, numpy almost as often. cv2 is the name of the OpenCV module. os is needed for path manipulation.

class Rescale:
    # Shared by all instances of this class
    _sr_dict = {}

    @staticmethod
    def _load_superresolution_models():
        data_dir = os.path.dirname(__file__) + '/superresolution/'
        for scale in (2, 4, 8):
            Rescale._sr_dict[scale] = dnn_superres.DnnSuperResImpl_create()
            Rescale._sr_dict[scale].readModel(data_dir + 'LapSRN_x' + str(scale) + '.pb')
            Rescale._sr_dict[scale].setModel('lapsrn', scale)

    def __init__(self):
        if not Rescale._sr_dict:
            Rescale._load_superresolution_models()

Initialization: super-resolution objects are created and the neural network models are loaded, one for each supported scale factor (2, 4 and 8). The name of the class, Rescale, will show up as the name of the tool in the Builder. Since the DNN models are not modified at run time, they are loaded only once and shared between all instances of the tool class.
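As a side note, the same OpenCV super-resolution API can be exercised outside the plugin, which is a handy way to check that a downloaded model loads. The sketch below assumes that LapSRN_x2.pb and some test.png are in the current directory; both file names are placeholders.

import cv2
from cv2 import dnn_superres

sr = dnn_superres.DnnSuperResImpl_create()
sr.readModel('LapSRN_x2.pb')    # path to the downloaded model file
sr.setModel('lapsrn', 2)        # algorithm name and scale must match the file

img = cv2.imread('test.png')    # any color image
big = sr.upsample(img)          # doubles the width and height
cv2.imwrite('test_x2.png', big)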

def process(self,
            inputs: [('image', va.Image, va.Image()),
                     ('zoomFactor', float, 1.0, {'min': 0.0})],
            outputs: [('image', va.Image)]):

This declares the tool’s public interface. It has two inputs:

  • image is the input image. Its type is va.Image, which is the standard data type for images passed between tools. As you’ll see later, conversions to other data types are straightforward. The default value of the input is an empty image, so the user of the tool must connect or set a real image.

  • zoomFactor determines how much the image is zoomed in the X and Y directions. For example, if zoomFactor = 3, the output image will have 3*3 = 9 times as many pixels as the input. The default value is 1 (i.e. no change). The value must be positive.

The output is also called image, and its type is va.Image.

if inputs.image.is_empty():
    raise ValueError('Input image must be non-empty.')
if inputs.zoomFactor <= 0:
    raise ValueError('Zoom factor must be greater than zero.')

# VA image to numpy array
np_img = np.array(inputs.image.to_rgb(), copy=False)

The first step is to ensure valid input. An empty image cannot be scaled, and a non-positive zoom factor is equally invalid.

va.Image (and other array-like types like va.Matrix and va.Tensor) implements the Python buffer protocol and can therefore be used as a NumPy array without copying the data. Passing copy=False to np.array wraps the existing pixel buffer instead of making a copy.

The input for the superresolution DNN must be a color image. If the input is a gray-scale image, to_rgb copies the intensity to the red, green and blue color channels. Now the shape of the image array np_img is (rows, columns, 3).
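As a small aside, the zero-copy conversion can be observed directly; this sketch assumes image refers to a non-empty va.Image such as inputs.image:

np_img = np.array(image.to_rgb(), copy=False)   # wraps the existing pixel buffer
print(np_img.shape)                             # (rows, columns, 3)
print(np_img.dtype)                             # typically uint8 for 8-bit images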

rows, cols, _ = np_img.shape
zoom_factor = inputs.zoomFactor

# Maximum number of pixels in the output image
max_pixels = 4.0 * 2**20
# Actual number of pixels before limiter
pixels = rows * cols * (zoom_factor * zoom_factor)
overshoot = pixels / max_pixels

# Limit the zoom factor so that the output image has
# no more than max_pixels pixels.
if overshoot > 1:
    zoom_factor = (max_pixels / (rows * cols)) ** 0.5

This copies the zoom factor from the input arguments to a local variable and sanity-checks it. If the factor is so big that the output image would exceed max_pixels (4 * 2**20, about 4.2 million pixels), the factor is reduced so that the output image stays within that limit.
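As a quick worked example of this limit (illustrative numbers only): a 1000 x 1000 input with zoomFactor = 10 would produce 100 million pixels, far above the limit, so the factor is cut back to roughly 2.05.

rows, cols = 1000, 1000
zoom_factor = 10.0
max_pixels = 4.0 * 2**20                        # same limit as in the tool
pixels = rows * cols * zoom_factor * zoom_factor
if pixels / max_pixels > 1:
    zoom_factor = (max_pixels / (rows * cols)) ** 0.5
print(zoom_factor)                              # about 2.048, i.e. ~4.2 Mpix output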

# List of zoom factors to be applied by DNN
power2list = []
while zoom_factor > 1:
    if zoom_factor > 4:
        power2list.append(8)
        zoom_factor = zoom_factor / 8.0
    elif zoom_factor > 2:
        power2list.append(4)
        zoom_factor = zoom_factor / 4.0
    else:
        power2list.append(2)
        zoom_factor = zoom_factor / 2.0

Break the zoom factor into a product of powers of two. The result of this while loop is the list power2list, where each element is 2, 4 or 8. The list will be empty if zoom_factor is 1 or less. After the loop, zoom_factor will be in the half-open range (0, 1].
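For example, a zoom factor of 6 becomes one DNN step of 8 followed by a residual factor of 0.75. The helper below merely restates the loop above so the decomposition can be tried in isolation; it is not part of the tool.

def split_zoom(zoom_factor):
    # DNN steps of 8, 4 or 2, followed by a residual factor in (0, 1].
    power2list = []
    while zoom_factor > 1:
        if zoom_factor > 4:
            power2list.append(8)
            zoom_factor /= 8.0
        elif zoom_factor > 2:
            power2list.append(4)
            zoom_factor /= 4.0
        else:
            power2list.append(2)
            zoom_factor /= 2.0
    return power2list, zoom_factor

print(split_zoom(6))    # ([8], 0.75): zoom up by 8, then resize down by 0.75
print(split_zoom(10))   # ([8, 2], 0.625)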

for z in power2list:
    np_img = Rescale._sr_dict[z].upsample(np_img)

This is where the magic happens. For each element z in power2list, run the preloaded model (LapSRN_x2.pb, LapSRN_x4.pb or LapSRN_x8.pb) that zooms the image up by the corresponding factor.

# Scale down by the zoom_factor (which is <= 1)
rows, cols, _ = np_img.shape
if zoom_factor != 1:
    np_img = cv2.resize(np_img, (int(cols * zoom_factor), int(rows * zoom_factor)))

If the requested zoom factor was not exactly a power of two, the image coming out of the DNN steps is too large (or, for factors below one, no DNN step was applied at all), so the size is adjusted down by calling OpenCV’s resize() function.
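By default cv2.resize uses bilinear interpolation. Since this final step only ever shrinks the image (the residual factor is at most one), the area-based interpolation shown below could be considered as a refinement; this is a suggestion, not part of the original recipe.

# Possible refinement: INTER_AREA tends to give nicer results when shrinking.
np_img = cv2.resize(np_img, (int(cols * zoom_factor), int(rows * zoom_factor)),
                    interpolation=cv2.INTER_AREA)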

# Convert NumPy array to an image.
outputs.image = va.Image(va.Image.RGB32, np_img)
# The scaled image represents the same piece of world as the input
# image, just with a different resolution. This aligns the output image
# according to the input so that subsequent analysis steps know its
# exact position.
outputs.image.align_with(inputs.image)

The final step is to copy the NumPy array to a va.Image and place it in the output arguments. The align_with call adjusts the coordinate system of the output image so that subsequent analysis steps can correctly transform coordinates between its pixel coordinates and the world coordinate system. The Yolo cookbook illustrates why this is important.

va.Tool.publish('com.visionappster.opencvpython/1', Rescale)

This publishes the tool to the VisionAppster runtime. The unique component ID of tools in this Python plugin is com.visionappster.opencvpython/1, where “1” denotes the major version number.

Using the tool in the Builder

Once you save the Python file in $HOME/VisionAppster/plugins/tool/rescale_toolplugin.py and start the Builder (or click the “Refresh user plugins” button if the Builder is already running), you’ll see the new tool in the tool box:

The new tool in the tool box

Let’s make a simple application which uses the tool. The image coming from the Image Source is first deliberately scaled down by a factor of two with Scale Image, then zoomed up by a factor of ten.

The complete app

This is the original test image:

Pena, our standard test image

The picture zoomed with super-resolution (lower image) is slightly better than the one produced by traditional bilinear zooming (upper image).

The difference becomes more obvious when zooming into the details.

After bilinear upscaling

After super-resolution upscaling

Bilinear upscaling details

Super-resolution upscaling details