Developers Club geek daily blog

1 year, 1 month ago
Disposal of the image of noise – one of fundamental operations of computer sight. Smoothing algorithms are applied almost everywhere: they can be both independent procedure for improvement of the photo, and the first step for more difficult procedure, for example, for object recognition on the image. Therefore there is a huge set of methods of smoothing, and I would like to tell about one of them differing from the others in good applicability on textures and images with a large number of identical parts.

Under a cat there are a lot of pictures, is accurater with traffic.

Preliminary information

In brief about an objective. It is supposed that in some ideal world we can receive the ideal imageI (a matrix or a vectorN \times M of elements) which ideally transfers information on the outside world. Unfortunately, we live not in so beautiful place and therefore for various reasons we should operate with images I + nwheren – a matrix or a vector of noise. There are many models of noise, but in this article we will assume thatn – white noise is Gaussian. According to the available information we try to recover the image I_dwhich will be closest to the ideal imageI. Unfortunately, it is almost always impossible as at a zashumleniye the part of data is inevitably lost. Besides, the problem definition assumes infinite set of solutions.

As a rule, normal rectifier filters set new value for each pixel, using information on his neighbors. The cheap and angry Gaussian filter, for example, displaces an image matrix with a matrix Gaussian distributions. As a result, in the new image each pixel is the weighed pixel sum with the same number of the initial picture and his neighbors. A few other approach uses the median filter: it replaces each pixel with a median among all nearby pixels. Development of ideas of these methods of smoothing is the bilateral filter. This procedure takes the weighed sum around, relying both on distance to the selected pixel, and on proximity of pixels on color.

Idea of the filter

Nevertheless, all above-mentioned filters use only information on nearby pixels. The main idea of the nonlocal rectifier filter consists in using all information on the image, and not just pixels next to processed. How it works, often values in the remote points do not depend from each other in any way?

It is obvious that if we have several images of the same object with different noise level, then we can group them in one picture without noise. If these several objects are placed in different places of one image, then we all the same can use this additional information (things are easy – at first to find these objects). And even if these objects look a little differently or are partially blocked, then we all the same have redundant data which can use. The nonlocal rectifier filter uses this idea: finds similar areas of the image and applies information from them to mutual averaging. Certainly that parts of the image are similar, does not mean that they belong to the same objects, but, as a rule, approach is rather good. Look at drawing with lines. It is obvious that windows 2 and 3 are similar to a window 1, and windows 4, 5 and 6 are not. In the photo of the Hungarian parliament, it is also possible to notice a large number of identical elements.


Let's assume that if the neighborhood of pixel in one window is similar to the neighborhood of the corresponding pixel in other window, then values of these pixels can be used for mutual averaging (blue and green points in drawing). But how to find these similar windows? How to deliver to them in compliance of weight with what force they influence at each other? We formalize the concept "similarities" of pieces of the image. First, we need function of distinction of two pixels. Here everything is simple: for black-and-white images the module of a difference of values of pixels is usually used just. If there is any additional information on the image, options are possible. For example, if it is known that the blank image consists of objects of absolutely black color on completely white background, good idea will use a metrics which compares zero distance to the pixels not very different on color. In case of color images options are besides possible. It is possible to use an Euclidean metrics in the RGB format or something is more cunning in the HSV format. "Dissimilarity" of windows of the image is only the sum of distinctions of their pixels.

Now it is good to transform the received value to more usual format of weightw \in [0, 1]. Let's give the distinction of windows calculated on the previous step in some decreasing (or at least nonincreasing) function and we normalize the received result the sum of values of this function at all possible windows. Usually as such function the acquaintance as everything fiercely e^{-\frac{x^2}{h^2}wherex – distance, andh – the parameter of dispersion of scales acts. The moreh, the distinction between windows influences smoothing less. Ath \rightarrow \infty, all windows regardless of distinction between them equally make a contribution to each pixel, and the image will turn out ideally gray, ath \rightarrow 0, only the window corresponding to itself will have a significant weight and smoothing will not turn out.

Thus, the naive mathematical model of the nonlocal averaging filter looks so:

I_d(j) = \sum_{j \in I}w(i, j)I(j)

i- the ty pixel of the resulting image is equal to the sum of all pixels of the source image taken with scales wwhere weight it

w(i, j) = \frac{1}{Z(i)}e^{-\frac{\|I(N_i) - I(N_j)\|^2_2}{h^2}}

and the normalizing divider

Z(i) = \sum_{j \in I}e^{-\frac{\|I(N_i) - I(N_j)\|^2_2}{h^2}}

The alternative metrics offered above:

\rho(x, y, t)=\begin{cases}
    0, & \text{if $|x-y|<t$}.\\
    |x-y|-t , & \text{otherwise}.

In all cases\|I(N_i) - I(N_j)\| designates a step-by-step difference between windows as it is described above.

If attentively to look narrowly, then the final formula turns out almost same, as well as at the bilateral filter, only in an exponent instead of geometrical distance between pixels and a color difference there is a difference between picture rags, and also summing is carried out on all pixels of the image.

Reasonings on efficiency and implementation

It is obvious that complexity of the offered algorithm O(n^2(2r+1))wherer – the radius of a window on which similarity of parts of the image is calculated, andn – the complete number of pixels we for each pixel compare its neighborhood of the size2r+1 to the neighborhood of each other pixel. It is not too good because naive implementation of algorithm rather slowly works even at images 300x300. The following improvements are usually offered:

  • It is obvious that it is optional to recalculate all of weight all the time, pixel weighti for pixelj and vice versa – are equal. If to save calculated weight, runtime is cut by half, but it is requiredO(n^2) to memory for storage of scales.
  • For the majority of real images of weight between the most part of pixels will be equal to zero. So why not to nullify them at all? It is possible to use heuristics on the basis of head-banding which will prompt where to consider weight pixel-by-pixel senselessly.
  • The exponent at calculation of scales can replace with function, cheaper for calculation. At least piecewise interpolation of the same exponent.

Normal question which arises by consideration of this algorithm: why we use only values of the central pixel in the area? Generally speaking, prevents to put nothing in a basic formula summing on all area as in bilateral smoothing, and it even slightly will affect complexity, but the result of the changed algorithm, most likely, will turn out too indistinct. Do not forget that in real images even identical objects seldom happen absolutely identical, and such modification of algorithm will average them. To use information from the next image elements, it is possible to execute normal bilateral smoothing previously. Though if it is known that source objects were or have to turn out as a result absolutely identical (smoothing of the text), then similar change will just do good.

Naive implementation of algorithm (With + OpenCV 2.4.11):

#include "targetver.h"

#include <stdio.h>
#include <math.h>

#include <opencv2/opencv.hpp>
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/imgproc/imgproc_c.h"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/core/core.hpp"
#include "opencv2/features2d/features2d.hpp"
#include "opencv2/calib3d/calib3d.hpp"
#include "opencv2/nonfree/nonfree.hpp"
#include "opencv2/contrib/contrib.hpp"
#include <opencv2/legacy/legacy.hpp>

#include <stdlib.h>
#include <stdio.h>

using namespace cv;

	Returns measure of diversity between two pixels. The more pixels are different the bigger output number.
	Usually it's a good idea to use Euclidean distance but there are variations.
double distance_squared(unsigned char x, unsigned char y)
	unsigned char zmax = max(x, y);
	unsigned char zmin = min(x, y);
    return (zmax - zmin)*(zmax - zmin);

    Returns weight of interaction between regions: the more dissimilar are regions the less resulting influence.
    Usually decaying exponent is used here. If distance is Euclidean, being combined, they form a typical Gaussian function.
double decay_function(double x, double dispersion)
	return exp(-x/dispersion);

int conform_8bit(double x)
	if (x < 0)
		return 0;
	else if (x > 255)
		return 255;
		return static_cast<int>(x);

double distance_over_neighbourhood(unsigned char* data, int x00, int y00, int x01, int y01, int radius, int step)
    double dispersion = 15000.0; //should be manually adjusted
    double accumulator = 0;
	for(int x = -radius; x < radius + 1; ++x)
		for(int y = -radius; y < radius + 1; ++y)
            accumulator += distance_squared(static_cast<unsigned char>(data[(y00 + y)*step + x00 + x]), static_cast<unsigned char>(data[(y01 + y)*step + x01 + x]));
    return decay_function(accumulator, dispersion);

int main(int argc, char* argv[])
	int similarity_window_radius = 3; //may vary; 2 is enough for text filtering
	char* imageName = "text_noised_30.png";
    IplImage* image = 0;
	image = cvLoadImage(imageName, CV_LOAD_IMAGE_GRAYSCALE);
	if (image == NULL) {
		printf("Can not load image\n");
	CvSize currentSize = cvGetSize(image);
	int width = currentSize.width;
	int height = currentSize.height;
	int step = image->widthStep;
	unsigned char *data = reinterpret_cast<unsigned char *>(image->imageData);
	vector<double> processed_data(width*height, 0);
	//External cycle
	for(int x = similarity_window_radius + 1; x < width - similarity_window_radius - 1; ++x) 
		printf("x: %d/%d\n", x, width - similarity_window_radius - 1);
		for(int y = similarity_window_radius + 1; y < height - similarity_window_radius - 1; ++y) 
			//Inner cycle: computing weight map
			vector<double> weight_map(width*height, 0);
			double* weight_data = &weight;_map[0];
			double norm = 0;
			for(int xx = similarity_window_radius + 1; xx < width - similarity_window_radius - 1; ++xx)
				for(int yy = similarity_window_radius + 1; yy < height - similarity_window_radius - 1; ++yy)
                    double weight = distance_over_neighbourhood(data, x, y, xx, yy, similarity_window_radius, step);
                    norm += weight;
                    weight_map[yy*step + xx] = weight;
			//After all weights are known, one can compute new value in pixel
			for(int xx = similarity_window_radius + 1; xx < width - similarity_window_radius - 1; ++xx)
				for(int yy = similarity_window_radius + 1; yy < height - similarity_window_radius - 1; ++yy)
                    processed_data[y*step + x] += data[yy*step + xx]*weight_map[yy*step + xx]/norm;
	//Transferring data from buffer to original image
	for(int x = similarity_window_radius + 1; x < width - similarity_window_radius - 1; ++x)
		for(int y = similarity_window_radius + 1; y < height - similarity_window_radius - 1; ++y)
			data[y*step + x] = conform_8bit(processed_data[y*step + x]);
	cvSaveImage("gray_denoised.png", image);
	return 0;


Let's look at some results:


Pay attention to almost ideal smoothing and accurate regions of lines. The more identical square areas on the image, the algorithm works better. Here the program had enough examples for each type of the neighborhood, all square areas along lines approximately identical.

Certainly, everything is not so iridescent for those cases when "similarity" of areas cannot be found a simple square window. For example:


Though the background and an interior of a circle smoothed out well, along a circle there were artifacts. Certainly, the algorithm cannot understand that the right side of a circle same, as well as left, is only turned by 180 degrees and to use it when smoothing. Such obvious miss can be corrected consideration of the turned windows, but then runtime will increase in so many time how many turns of a window are considered.

That there were no doubts that the algorithm is able to smooth not only direct lines, here one more example which illustrates importance of the choice of the smoothing parameter at the same time:


In the second drawing the same artifacts, as well as in the previous case are visible approximately. Here it is not a lack of similar windows, and in too small parameterh: even similar areas receive too small weight. On the fourth picture, on the contrary, it was selected too bigh, and all areas received approximately identical weight. The third represents golden mean.

Still an example of work on the text. Even small and strongly noisy image is processed rather well for the subsequent use by machine (after an ekvalization of the histogram and targeting of sharpness, certainly):


Still a little bit examples from the website on which it is possible to apply already optimized nonlocal smoothing algorithm:

Nonlocal algorithm for smoothing of imagesNonlocal algorithm for smoothing of images
Nonlocal algorithm for smoothing of imagesNonlocal algorithm for smoothing of images
Nonlocal algorithm for smoothing of imagesNonlocal algorithm for smoothing of images

Order: initial, noisy, recovered. Notice how Asher's drawings in which there are a lot of similar elements were well cleared. Only black lines became more transparent and the texture smoothed out.

Calculation of scales by means of descriptors of special points

Unfortunately, high computing complexity of algorithm limits its practical application, nevertheless, it shows good results on problems of smoothing of images with the repeating elements. The main problem of this algorithm is covered in his main idea of calculation of scales. It is necessary to pass for each pixel across neighborhoods of all other pixels. Even if to assume that double viewing of all pixels is an inevitable evil, then viewing of his neighbors seems superfluous, even for window 3 radius it increases time of calculation of scales by 49 times. Let's get back a little back to initial idea. We want to find similar places on the image to use them for smoothing of each other. But to us optional pixel-by-pixel to compare all possible windows on the picture to find similar places. We already have a good method for designation and comparison of interesting image elements! Certainly, I speak about various descriptors of features. Usually it is meant as SIFT, but in our case it is better to use something less exact, the purpose at this stage is to find a lot of enough similar areas, but not several just the same. In case of application of descriptors the step of calculation of scales looks so:

  • Previously we consider descriptors in each point of the image
  • We pass on images and for each pixel
  • o Is passable on each other pixel
  • o we Compare descriptors
  • o If descriptors similar, then we compare windows pixel-by-pixel and we consider weight
  • o If different, we nullify weight
  • Not to pass according to the image the second time, counted weight and the coordinate of pixels are entered in the separate list. When in the list the sufficient number of elements collected, it is possible to stop search.
  • If it did not turn out to find the set quantity of similar points (the pixel is not remarkable), simply we pass according to the image and we consider weight as usual

Advantages of such approach:

  • Smoothing in special points which usually most of all and are interesting to us improves
  • If on the image there are a lot of similar special points, fall forward is noticeable


  • Not the fact that there will be many similar points to pay back calculation of descriptors
  • The result and acceleration of algorithm strongly depend on the choice of a descriptor



  • The algorithm well works at periodic textures
  • The algorithm well smoothes homogeneous areas
  • The more the image, the is more than similar areas and the better the algorithm works. But godlessly long.
  • Areas for which it was not succeeded to find similar smooth out badly (decides preliminary use of the bilateral filter or consideration of turns of windows)
  • As well as for many other algorithms, it is necessary to adjust value of the smoothing parameter

Thus, we considered the smoothing algorithm which uses information on similar objects on the image for the best approximation of the blank image. From such "pseudo-recognition" transition to this smoothing after selection of similar objects is near. And it is valid, modern approaches to recovery of images use neural networks for this purpose and they, as a rule, really cope better and quicker. However neural networks demand bigger skill in programming and thin setup. Even such algorithm only imitating recognition can give comparable, and sometimes the best result.


Antoni Buades, Bartomeu Coll A Non-local Algorithm For Image Denoising
Yifei Lou, Paolo Favaro, Stefano Soatto, and Andrea Bertozzi Nonlocal Similarity Image Filtering

This article is a translation of the original post at
If you have any questions regarding the material covered in the article above, please, contact the original author of the post.
If you have any complaints about this article or you want this article to be deleted, please, drop an email here:

We believe that the knowledge, which is available at the most popular Russian IT blog, should be accessed by everyone, even though it is poorly translated.
Shared knowledge makes the world better.
Best wishes.

comments powered by Disqus