Transformations change the geometry of the contents of an image. A geometric operation performed on an image changes the spatial relationships between its pixels. Mathematically, it is the process of mapping a pixel at coordinate (x, y) in the input image to a new coordinate (x’, y’) in the output image. This modification of the spatial relationship between pixels can be linear (like the affine transform) or non-linear (like the projective transform).
G(x,y) → H(x’,y’)
Affine transformation is a linear transformation: it yields a mapping function that provides a new coordinate for each pixel in the input image, with a linear relationship between the old and new coordinates.
The mapping function can be specified as two separate functions, one per output coordinate:
(x’,y’) = M(x,y)
x’ = Mx(x,y)
y’ = My(x,y)
In polynomial form, it is expressed as,
x’ = a0x + a1y + a2
y’ = b0x + b1y + b2
In matrix form,

[x’]   [a0 a1 a2]   [x]
[y’] = [b0 b1 b2] · [y]
[1 ]   [0  0  1 ]   [1]
In image processing, we often use the matrix form.
Affine transform preserves parallelism and the ratios of distances between points on a line. That is, a rectangle or square might become a parallelogram after the transformation, but won’t become a trapezoid.
The three types of affine transformation that are most often used are:
- Translation
- Rotation
- Scaling
Translation:
Translation yields a linear shift of pixels along the X and Y coordinates. The translation matrix is given as,

[1 0 tx]
[0 1 ty]
[0 0 1 ]
The effect of translation in image processing can be seen from the following images,

Rotation:
The rotation matrix applied to an image rotates the image by the specified angle θ. The rotation matrix is given as,

[cos θ  -sin θ  0]
[sin θ   cos θ  0]
[0       0      1]
The effect of rotation in image processing can be seen in the following image,

Scaling:
The scaling operation resizes the image. The scaling matrix is given as,

[sx 0  0]
[0  sy 0]
[0  0  1]
The effect of scaling in image processing can be seen from the following images,

The following snippet (OpenCV + C++) initialises an affine transformation kernel based on the given values of translation, rotation and scale,

// Stores the affine transformation matrix
cv::Mat affineTransform = cv::Mat::eye(3, 3, CV_64FC1);
// Initialise translational, rotational and scaling values for the transformation
// (theta, scaleX, scaleY, translate and PI are assumed to be defined elsewhere)
affineTransform.at<double>(0,0) = cos(theta * PI/180) * scaleX;
affineTransform.at<double>(0,1) = -sin(theta * PI/180) * scaleY;
affineTransform.at<double>(0,2) = translate.x;
affineTransform.at<double>(1,0) = sin(theta * PI/180) * scaleX;
affineTransform.at<double>(1,1) = cos(theta * PI/180) * scaleY;
affineTransform.at<double>(1,2) = translate.y;
// The last row stays (0, 0, 1) for an affine transform
affineTransform.at<double>(2,0) = 0;
affineTransform.at<double>(2,1) = 0;
affineTransform.at<double>(2,2) = 1;
The following snippet uses the above affine kernel to transform the given input image,
bool AffineTransform(const cv::Mat& src, cv::Mat& dst)
{
    // Check that the entered translation values are within image bounds
    if (abs(translate.x) > src.rows || abs(translate.y) > src.cols)
    {
        std::cout << "Initialised translate points exceed image bounds" << std::endl;
        return false;
    }
    // Empty 3x1 matrix for the original (homogeneous) frame coordinates
    cv::Mat xOrg = cv::Mat(3, 1, CV_64FC1);
    // Empty 3x1 matrix for the transformed frame coordinates
    cv::Mat xTrans = cv::Mat(3, 1, CV_64FC1);
    // Default initialisation of the output matrix
    dst = cv::Mat::zeros(src.size(), src.type());
    // Go through the entire image
    for (int i = 0; i < src.size().height; i++) {
        for (int j = 0; j < src.size().width; j++) {
            // Current coordinates in homogeneous form
            xOrg.at<double>(0,0) = i;
            xOrg.at<double>(1,0) = j;
            xOrg.at<double>(2,0) = 1;
            // Transformed coordinates
            xTrans = affineTransform * xOrg;
            // Depth (w = 1 for an affine transform)
            const double w = xTrans.at<double>(2,0);
            // Homogeneous to Cartesian conversion
            const int newX = (xTrans.at<double>(0,0)) / w;
            const int newY = (xTrans.at<double>(1,0)) / w;
            // Make sure the image boundary is not exceeded
            if (newX >= src.size().height || newY >= src.size().width || newX < 0 || newY < 0)
            {
                continue;
            }
            // Copy the value at the original coordinates to the transformed
            // coordinates (assumes a single-channel 8-bit image)
            dst.at<uchar>(newX, newY) = src.at<uchar>(i, j);
        }
    }
    return true;
}
