For the predecessor paper on SIFT (Lowe, 2004), check . This is a review of Automatic Panoramic Image Stitching using Invariant Features (Brown & Lowe, 2007). Assignment 1 of EE5731 Visual Computing is based on these two papers.

Homography
Input: pairs of anchors in both images
Output: the homography matrix
This paper uses homography as the fundamental transformation to warp and stitch images together to create a panorama. Think of homography as a mathematical function that maps points from one image plane to corresponding points on another image plane, as if you were viewing a flat scene from different angles. It's more powerful than simple translation, rotation, and scaling because it accounts for perspective distortions.
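Written out, the mapping can be sketched as follows. This is the standard textbook formulation; the symbols $H$, $h_{ij}$, $(x, y)$ and $(x', y')$ are notation chosen here for illustration, not taken from the paper.

```latex
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
\sim
H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix},
\qquad
H =
\begin{bmatrix}
h_{11} & h_{12} & h_{13} \\
h_{21} & h_{22} & h_{23} \\
h_{31} & h_{32} & h_{33}
\end{bmatrix}

x' = \frac{h_{11} x + h_{12} y + h_{13}}{h_{31} x + h_{32} y + h_{33}},
\qquad
y' = \frac{h_{21} x + h_{22} y + h_{23}}{h_{31} x + h_{32} y + h_{33}}
```

Here $\sim$ means "equal up to a scale factor", so only 8 of the 9 entries of $H$ are independent. The division by the third coordinate is what produces the perspective effect that translation, rotation, and scaling cannot.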
The paper emphasizes that homography works best when the scene is mostly flat or when the camera rotates around its center of projection, so that there is no parallax. Because real scenes often aren't perfect, the paper prioritizes homographies between neighboring images in the panorama sequence. This becomes essential when we have multiple images and some of them are not neighbors. We'll discuss this in Automatic Image Stitching.
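In this example, if we correctly select at least 4 pairs of keypoints (each pair gives two equations for the eight unknowns of the homography), we can solve for the homography and recover the change in camera angle between the two views.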
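A minimal sketch of how a homography can be computed from anchor pairs, assuming NumPy and the direct linear transform (DLT) solved with an SVD. The function names `apply_homography` and `estimate_homography`, the matrix `H_true`, and the square corners used as anchors are all hypothetical, made up for illustration. Coordinate normalization, which usually improves numerical stability, is left out to keep the sketch short.

```python
import numpy as np

def apply_homography(H, pts):
    """Map Nx2 points through a 3x3 homography H."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coords
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]             # divide by w

def estimate_homography(src, dst):
    """Estimate H with dst ~ H @ src from at least 4 point pairs (DLT)."""
    rows = []
    for (x, y), (u, v) in zip(np.asarray(src, float), np.asarray(dst, float)):
        # Each pair contributes two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the free scale (assumes H[2, 2] is not close to 0)

# Hypothetical anchors: corners of a 100x100 square and where they land.
H_true = np.array([[1.0, 0.2, 5.0],
                   [0.1, 0.9, -3.0],
                   [0.001, 0.002, 1.0]])
src = np.array([[0, 0], [100, 0], [100, 100], [0, 100]], dtype=float)
dst = apply_homography(H_true, src)

H_est = estimate_homography(src, dst)
print(np.round(H_est, 3))
```

With exactly 4 pairs the system has a unique solution up to scale, provided no three of the points lie on a line. With more pairs, the same SVD gives a least-squares fit.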

RANSAC
Input: all keypoints in both images
Output: the best pairs of keypoints to be the anchors
RANSAC stands for Random Sample Consensus, a method for robustly finding the best match. Here's how it's used:
RANSAC algorithm
Feature matching
First, SIFT features are detected in each image and matched between pairs of images.
Random sampling
RANSAC randomly selects a minimal set of matches (4 pairs, just enough to compute a candidate homography).
Homography calculation
Based on this small set of matches, a homography transformation is calculated.
Consensus set
The homography is then used to transform all other points in one image to the other. The algorithm then checks how many of the other matches also "agree" with this homography (i.e., the transformed point is close to its corresponding point in the other image). These agreeing matches form the "consensus set."
Iteration
This process (random sampling, homography calculation, consensus set building) is repeated many times. The homography that yields the largest consensus set is selected as the best transformation between the images.
In essence, RANSAC is a robust method for filtering out bad matches ("outliers") that would otherwise throw off the homography estimation. By relying on a majority consensus of good matches, it provides a much more accurate and reliable result.
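The loop described above (random sampling, homography calculation, consensus set, iteration) can be sketched in a few lines. This is an illustrative sketch assuming NumPy, not the paper's implementation: the function names, the defaults `n_iters=500` and `inlier_tol=3.0` (in pixels), the final refit on the consensus set, and the synthetic matches are all assumptions made for this example. Feature matching is skipped; `src` and `dst` stand in for already matched keypoint coordinates.

```python
import numpy as np

def fit_homography(src, dst):
    """DLT fit of a 3x3 H with dst ~ H @ src, from at least 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.array(rows, dtype=float))
    return vt[-1].reshape(3, 3)  # defined only up to scale

def project(H, pts):
    """Map Nx2 points through H and divide by the third coordinate."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def ransac_homography(src, dst, n_iters=500, inlier_tol=3.0, seed=0):
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        # Random sampling: a minimal set of 4 matches.
        sample = rng.choice(len(src), size=4, replace=False)
        # Homography calculation from the sample only.
        H = fit_homography(src[sample], dst[sample])
        # Consensus set: matches whose transformed point lands close enough.
        # Degenerate samples may yield inf/nan errors, which count as outliers.
        with np.errstate(all="ignore"):
            err = np.linalg.norm(project(H, src) - dst, axis=1)
            inliers = err < inlier_tol
        # Iteration: keep the largest consensus set seen so far.
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 4:
        raise ValueError("no consensus set with at least 4 matches found")
    # Refit on the whole consensus set (an added step, assumed here).
    H_best = fit_homography(src[best_inliers], dst[best_inliers])
    return H_best / H_best[2, 2], best_inliers

# Synthetic matches: 40 consistent with H_true, 20 deliberately wrong.
rng = np.random.default_rng(1)
H_true = np.array([[0.9, -0.1, 20.0],
                   [0.1, 0.95, -10.0],
                   [2e-4, 1e-4, 1.0]])
src = rng.uniform(0, 200, size=(60, 2))
dst = project(H_true, src)
dst[40:] += rng.uniform(50, 150, size=(20, 2))  # corrupt the last 20 matches

H_est, inliers = ransac_homography(src, dst)
print(inliers.sum(), "inliers out of", len(src))
```

The returned boolean mask marks which matches survived, which is the "best pairs of keypoints" output listed at the top of this section. The tolerance trades off strictness against noise: too small and good but noisy matches are rejected, too large and bad matches slip into the consensus set.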


We are now able to handle basic image stitching tasks with anchors automatically extracted by RANSAC.