Affine Transformation in Image Processing: Explained with C++

Transformations are used to change the geometry of the contents of an image. A geometric operation performed on an image changes the spatial relationships between the image pixels. Mathematically, it maps a pixel at a specific coordinate (x,y) in the input image to a new coordinate (x’,y’) in the output image. This modification of the spatial relationship between pixels can be linear (as in the affine transform) or non-linear (as in the projective transform).

G(x,y) → H(x’,y’)

An affine transformation is a linear transformation whose mapping function provides a new coordinate for each pixel in the input image, with the new coordinates related linearly to the original ones.

The mapping function can be specified as two separate functions,

(x’,y’) = M(x,y)

x’ = Mx(x,y)

y’ = My(x,y)

In polynomial form, it is expressed as,

x’ = a0x + a1y + a2

y’ = b0x + b1y + b2

In matrix form,

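\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} =
\begin{bmatrix} a_0 & a_1 & a_2 \\ b_0 & b_1 & b_2 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]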

In image processing, we often use the matrix form.
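To make the matrix form concrete, here is a minimal sketch in plain C++ (no OpenCV) that applies a 3x3 affine matrix to a single point in homogeneous coordinates. The names `Mat3`, `Point2` and `applyAffine` are made up for this illustration and are not part of any library.

```cpp
#include <array>

// A 3x3 matrix stored row by row; the name Mat3 is only used in this sketch.
using Mat3 = std::array<std::array<double, 3>, 3>;

// A 2D point; hypothetical helper type for this example.
struct Point2 {
    double x;
    double y;
};

// Apply the affine matrix m to the point p, treating p as (x, y, 1).
Point2 applyAffine(const Mat3& m, const Point2& p)
{
    const double xNew = m[0][0] * p.x + m[0][1] * p.y + m[0][2];
    const double yNew = m[1][0] * p.x + m[1][1] * p.y + m[1][2];
    // For an affine matrix the last row is (0, 0, 1), so the third
    // homogeneous component stays 1 and no division is needed.
    return {xNew, yNew};
}
```

With a0 = 2, a1 = 0, a2 = 3 and b0 = 0, b1 = 1, b2 = -1, the point (1, 4) maps to (2*1 + 3, 4 - 1) = (5, 3), which is the same result the polynomial form gives.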

An affine transform preserves parallelism in the image and the ratio of distances between points that lie on the same line. That is, a rectangle or square might become a parallelogram after the transformation, but it won’t become a trapezoid.

The three types of affine transformation that are often used are:
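The parallelism property can be checked numerically. The sketch below, in plain C++, transforms the four corners of the unit square and tests whether opposite edges are still parallel using the 2D cross product. `Vec2`, `Affine2`, `transformPoint`, `edgesParallel` and `squareStaysParallelogram` are hypothetical names chosen for this example.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

struct Vec2 {
    double x;
    double y;
};

// Coefficients of x' = a0*x + a1*y + a2 and y' = b0*x + b1*y + b2.
struct Affine2 {
    double a0, a1, a2;
    double b0, b1, b2;
};

Vec2 transformPoint(const Affine2& t, const Vec2& p)
{
    return {t.a0 * p.x + t.a1 * p.y + t.a2,
            t.b0 * p.x + t.b1 * p.y + t.b2};
}

// Two edges are parallel when the cross product of their directions is zero.
bool edgesParallel(const Vec2& p0, const Vec2& p1, const Vec2& q0, const Vec2& q1)
{
    const double cross = (p1.x - p0.x) * (q1.y - q0.y)
                       - (p1.y - p0.y) * (q1.x - q0.x);
    return std::fabs(cross) < 1e-9;
}

// Transform the unit square and report whether both pairs of
// opposite edges remain parallel.
bool squareStaysParallelogram(const Affine2& t)
{
    const std::array<Vec2, 4> corners = {{{0.0, 0.0}, {1.0, 0.0}, {1.0, 1.0}, {0.0, 1.0}}};
    std::array<Vec2, 4> c{};
    for (std::size_t i = 0; i < corners.size(); ++i) {
        c[i] = transformPoint(t, corners[i]);
    }
    return edgesParallel(c[0], c[1], c[3], c[2]) &&
           edgesParallel(c[1], c[2], c[0], c[3]);
}
```

Any choice of the six coefficients should pass this check, because both edges of a pair are mapped by the same linear part (a0, a1, b0, b1) and the translation cancels out when the edge directions are computed.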

  • Translation
  • Rotation
  • Scaling

Translation:

The translation in an affine transformation yields a linear shift of pixels along the X and Y coordinates. The translation matrix is given as,

\[
\begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix}
\]

The effect of translation in image processing can be seen from the following images,

[Images: three example outputs showing the effect of translation]

Rotation:

The rotation matrix applied to an image rotates the image by the specified angle. The rotation matrix is given as,

\[
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

The effect of rotation in image processing can be seen in the following image,

[Image: example output showing the effect of rotation]
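As a quick sanity check of the rotation formula, the following plain C++ sketch rotates a single point about the origin. `Point2` and `rotatePoint` are hypothetical names used only here, and the angle is taken in degrees to match the snippets in this post.

```cpp
#include <cmath>

struct Point2 {
    double x;
    double y;
};

// Rotate p about the origin (0, 0) by angleDeg degrees using
// x' = x*cos(t) - y*sin(t) and y' = x*sin(t) + y*cos(t).
Point2 rotatePoint(const Point2& p, double angleDeg)
{
    const double pi = std::acos(-1.0);
    const double t = angleDeg * pi / 180.0;
    return {p.x * std::cos(t) - p.y * std::sin(t),
            p.x * std::sin(t) + p.y * std::cos(t)};
}
```

Rotating (1, 0) by 90 degrees gives approximately (0, 1), up to floating-point rounding. Note that this matrix rotates about the coordinate origin, which for an image is a corner rather than the centre.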

Scaling:

The scaling operation resizes the image. The scaling matrix is given as,

\[
\begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

The effect of scaling in image processing can be seen from the following images,

[Image: example output showing the effect of scaling]

The following snippet (OpenCV + C++) initializes an Affine transformation kernel based on the given values of translation, rotation and scale,


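// Requires <cmath> and <opencv2/core.hpp>.
// theta (in degrees), scaleX, scaleY and translate are assumed to be defined elsewhere.

// Stores the affine transformation matrix; it must be allocated as a
// 3x3 matrix of doubles before at<double>() can be used on it
cv::Mat affineTransform = cv::Mat::eye(3, 3, CV_64FC1);

// Rotation angle converted from degrees to radians
const double angle = theta * CV_PI / 180.0;

// Initialise translational, rotational and scaling values for transformation
affineTransform.at<double>(0,0) = std::cos(angle) * scaleX;
affineTransform.at<double>(0,1) = -std::sin(angle) * scaleY;
affineTransform.at<double>(0,2) = translate.x;

affineTransform.at<double>(1,0) = std::sin(angle) * scaleX;
affineTransform.at<double>(1,1) = std::cos(angle) * scaleY;
affineTransform.at<double>(1,2) = translate.y;

// Last row of an affine matrix in homogeneous coordinates
affineTransform.at<double>(2,0) = 0;
affineTransform.at<double>(2,1) = 0;
affineTransform.at<double>(2,2) = 1;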
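The entries of this kernel are what you get when the three basic matrices are multiplied in the order translation * rotation * scaling, so a point is scaled first, then rotated, then shifted. The sketch below verifies that in plain C++ without OpenCV; `Mat3`, `multiply`, `translation`, `rotation`, `scaling` and `composeKernel` are hypothetical helpers written for this example.

```cpp
#include <array>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

// Plain 3x3 matrix product a * b.
Mat3 multiply(const Mat3& a, const Mat3& b)
{
    Mat3 r{};  // value-initialised to all zeros
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

Mat3 translation(double tx, double ty)
{
    return {{{1.0, 0.0, tx}, {0.0, 1.0, ty}, {0.0, 0.0, 1.0}}};
}

Mat3 rotation(double thetaDeg)
{
    const double t = thetaDeg * std::acos(-1.0) / 180.0;
    return {{{std::cos(t), -std::sin(t), 0.0},
             {std::sin(t),  std::cos(t), 0.0},
             {0.0, 0.0, 1.0}}};
}

Mat3 scaling(double sx, double sy)
{
    return {{{sx, 0.0, 0.0}, {0.0, sy, 0.0}, {0.0, 0.0, 1.0}}};
}

// Translate * Rotate * Scale: scale first, then rotate, then translate.
Mat3 composeKernel(double thetaDeg, double sx, double sy, double tx, double ty)
{
    return multiply(translation(tx, ty),
                    multiply(rotation(thetaDeg), scaling(sx, sy)));
}
```

Because matrix multiplication is not commutative, a different order (for example scaling * rotation * translation) would generally give a different kernel, with the translation itself being rotated and scaled.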

The following snippet uses the above affine kernel to transform the given input image,

// Requires <cstdlib>, <iostream> and <opencv2/core.hpp>.
// Uses the affineTransform matrix and translate values initialised above.
bool AffineTransform(const cv::Mat& src, cv::Mat& dst)
{
    // The pixel access below uses at<uchar>, which is only valid for
    // 8-bit single-channel (grayscale) images
    if (src.empty() || src.type() != CV_8UC1)
    {
        std::cout << "Expected a non-empty 8-bit single-channel image" << std::endl;
        return false;
    }

    // Check entered translation points are within image bounds
    // (in this implementation x runs along rows and y along columns)
    if (std::abs(translate.x) > src.rows || std::abs(translate.y) > src.cols)
    {
        std::cout << "Initialised translate points exceed image bounds" << std::endl;
        return false;
    }

    // Create an empty 3x1 matrix for storing original frame coordinates
    cv::Mat xOrg = cv::Mat(3, 1, CV_64FC1);

    // Create an empty 3x1 matrix for storing transformed frame coordinates
    cv::Mat xTrans = cv::Mat(3, 1, CV_64FC1);

    // Default initialisation of output matrix
    dst = cv::Mat::zeros(src.size(), src.type());

    // Go through entire image
    for (int i = 0; i < src.rows; i++)
    {
        for (int j = 0; j < src.cols; j++)
        {
            // Get current coordinates in homogeneous form (row, column, 1)
            xOrg.at<double>(0,0) = i;
            xOrg.at<double>(1,0) = j;
            xOrg.at<double>(2,0) = 1;

            // Get transformed coordinates
            xTrans = affineTransform * xOrg;

            // Homogeneous scale factor (always 1 for an affine matrix);
            // kept as double so it is not truncated
            const double w = xTrans.at<double>(2,0);

            // Homogeneous to cartesian transformation, rounded to the nearest pixel
            const int newX = cvRound(xTrans.at<double>(0,0) / w);
            const int newY = cvRound(xTrans.at<double>(1,0) / w);

            // Make sure boundary is not exceeded
            if (newX < 0 || newX >= src.rows || newY < 0 || newY >= src.cols)
            {
                continue;
            }

            // Put the values of original coordinates to transformed coordinates
            dst.at<uchar>(newX, newY) = src.at<uchar>(i, j);
        }
    }
    return true;
}
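One limitation of copying each source pixel forward to its new position is that, when the image is scaled up or rotated, some output pixels may never be written and stay black. A common alternative, which is not what the function above does, is backward mapping: loop over the output pixels and use the inverse matrix to look up where each one came from. The following is a minimal sketch of that idea in plain C++ with nearest-neighbour sampling. `GrayImage`, `Affine2`, `invertAffine` and `warpBackward` are hypothetical names, and the image type is only a stand-in for cv::Mat.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal grayscale image; a hypothetical stand-in for cv::Mat.
struct GrayImage {
    int rows;
    int cols;
    std::vector<std::uint8_t> data;  // row-major, rows * cols entries
};

// Coefficients of x' = a0*x + a1*y + a2 and y' = b0*x + b1*y + b2.
struct Affine2 {
    double a0, a1, a2;
    double b0, b1, b2;
};

// Closed-form inverse; returns false when the linear part is singular.
bool invertAffine(const Affine2& m, Affine2& inv)
{
    const double det = m.a0 * m.b1 - m.a1 * m.b0;
    if (std::fabs(det) < 1e-12) return false;
    inv.a0 =  m.b1 / det;
    inv.a1 = -m.a1 / det;
    inv.b0 = -m.b0 / det;
    inv.b1 =  m.a0 / det;
    // Inverse translation is -(A^-1 * t)
    inv.a2 = -(inv.a0 * m.a2 + inv.a1 * m.b2);
    inv.b2 = -(inv.b0 * m.a2 + inv.b1 * m.b2);
    return true;
}

// Backward mapping: visit every output pixel and fetch the nearest source
// pixel. As in the function above, x indexes rows and y indexes columns.
// src and dst must be different objects.
bool warpBackward(const GrayImage& src, const Affine2& m, GrayImage& dst)
{
    Affine2 inv{};
    if (!invertAffine(m, inv)) return false;
    dst.rows = src.rows;
    dst.cols = src.cols;
    dst.data.assign(static_cast<std::size_t>(src.rows) * src.cols, 0);
    for (int i = 0; i < dst.rows; ++i) {
        for (int j = 0; j < dst.cols; ++j) {
            const int si = static_cast<int>(std::lround(inv.a0 * i + inv.a1 * j + inv.a2));
            const int sj = static_cast<int>(std::lround(inv.b0 * i + inv.b1 * j + inv.b2));
            // Output pixels whose source falls outside the image stay 0
            if (si < 0 || si >= src.rows || sj < 0 || sj >= src.cols) continue;
            dst.data[static_cast<std::size_t>(i) * dst.cols + j] =
                src.data[static_cast<std::size_t>(si) * src.cols + sj];
        }
    }
    return true;
}
```

Since every output pixel is visited exactly once, the only pixels left empty are those whose source location lies outside the input image. Replacing the nearest-neighbour lookup with bilinear interpolation would be a natural refinement.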

 
