This page is the report for programming assignment #2 of VFX, Spring 2008.
Introduction:
This programming assignment aims to build panoramas using computational methods. We mainly followed [1] and [2] as guidelines for our implementation. The components we implemented are: 1) SIFT feature extractor, 2) ANN-based feature matching, 3) RANSAC-based image matching, 4) connected component extraction, 5) bundle adjustment, 6) up-vector estimation and 7) multi-band blending. Since most of the details are already covered in the original papers, we only discuss the problems we encountered below.
Algorithm implementation & details:
We implemented our program in C++ with VS .NET, using OpenCV for common image processing routines.
SIFT and ANN:
The SIFT implementation follows Lowe's paper [3], with some parameters modified to suit our implementation. For nearest-neighbor search we adopted the widely used ANN library. We cut off the ANN search at a maximum of 250 points.
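As an illustration of the matching step, here is a minimal sketch of nearest-neighbor descriptor matching with the distance-ratio test described in [3]. It is not the code used for our results: a brute-force search stands in for the ANN kd-tree, the function names are hypothetical, and the 0.8 ratio is the value suggested in [3] rather than a parameter reported here.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

using Descriptor = std::vector<float>;  // e.g. 128 floats for a SIFT descriptor

// Squared Euclidean distance between two descriptors of equal length.
static float squaredDistance(const Descriptor& a, const Descriptor& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t k = 0; k < a.size(); ++k) {
        const float d = a[k] - b[k];
        sum += d * d;
    }
    return sum;
}

// Returns (query index, train index) pairs that pass the ratio test.
// Brute force is an assumption made for brevity; it replaces the ANN search.
std::vector<std::pair<int, int>> matchDescriptors(
        const std::vector<Descriptor>& query,
        const std::vector<Descriptor>& train,
        float ratio = 0.8f) {
    std::vector<std::pair<int, int>> matches;
    for (std::size_t i = 0; i < query.size(); ++i) {
        float best = std::numeric_limits<float>::max();
        float second = best;
        int bestIdx = -1;
        for (std::size_t j = 0; j < train.size(); ++j) {
            const float d = squaredDistance(query[i], train[j]);
            if (d < best) {
                second = best;
                best = d;
                bestIdx = static_cast<int>(j);
            } else if (d < second) {
                second = d;
            }
        }
        // Distances are squared, so the ratio is squared as well.
        if (bestIdx >= 0 && best < ratio * ratio * second) {
            matches.emplace_back(static_cast<int>(i), bestIdx);
        }
    }
    return matches;
}
```

A query whose two closest candidates are almost equally distant is rejected as ambiguous, which removes many false matches before RANSAC.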
RANSAC image matching / Connected component:
As suggested in the paper, for each input image we first find the images that share the largest number of matching features with it, and then perform pairwise feature matching on those candidate pairs to verify that they really match. We test at most 6 candidate images for each image, which was enough to extract the connected panoramas in all our experiments. We also speed up the time-consuming pairwise matching by randomly probing a few feature points and checking the matching ratio.
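The verification and grouping steps can be sketched as follows. The acceptance test n_i > alpha + beta * n_f with alpha = 8.0 and beta = 0.3 is the one given in [2]; whether our program used exactly these constants is not stated here, so treat them as an assumption. The union-find grouping and all names are hypothetical and only illustrate how verified pairs become connected components.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Acceptance test from [2]: a pair is a valid match when the number of RANSAC
// inliers exceeds alpha + beta * (number of features in the overlap area).
bool isVerifiedMatch(int numInliers, int numFeaturesInOverlap,
                     double alpha = 8.0, double beta = 0.3) {
    return numInliers > alpha + beta * numFeaturesInOverlap;
}

// Minimal union-find (disjoint set) structure over image indices.
class DisjointSet {
public:
    explicit DisjointSet(int n) : parent_(n) {
        std::iota(parent_.begin(), parent_.end(), 0);
    }
    int find(int x) {
        while (parent_[x] != x) {
            parent_[x] = parent_[parent_[x]];  // path halving
            x = parent_[x];
        }
        return x;
    }
    void unite(int a, int b) { parent_[find(a)] = find(b); }

private:
    std::vector<int> parent_;
};

// Gives every image a label; images with the same label belong to one panorama.
std::vector<int> labelPanoramas(
        int numImages, const std::vector<std::pair<int, int>>& verifiedPairs) {
    DisjointSet sets(numImages);
    for (const auto& p : verifiedPairs) {
        assert(p.first >= 0 && p.first < numImages);
        assert(p.second >= 0 && p.second < numImages);
        sets.unite(p.first, p.second);
    }
    std::vector<int> label(numImages);
    for (int i = 0; i < numImages; ++i) label[i] = sets.find(i);
    return label;
}
```

Each group of images that shares a label is then stitched as a separate panorama.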
Bundle Adjustment for camera direction and focal length:
This was the trickiest part of our attempt to reproduce the original work. Since a complete Levenberg-Marquardt (LM) implementation is too complex to write ourselves, we resorted to a free LM library for non-linear optimization. The author of this library also provides another library dedicated to bundle adjustment (BA), but we found it difficult to use for panorama stitching because no exact 3D point information is available. We therefore use the basic LM library for our BA. However, when we simply followed the formulation described in the paper, the BA did not converge. To alleviate this, we first compute pairwise homographies between all connected pairs and run optimizations incrementally to match the entries of those homographies. This alone still had convergence problems on several datasets, so we finally initialize BA with different focal lengths and add a regularization term to discourage negative focal lengths. The result is then refined by two passes of BA identical to the formulation in the paper, with different cutoff values for the robust error function (first infinity, then 2). With these changes the program works well in most situations.
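For reference, [2] parameterizes each camera by a rotation R and a focal length f, so the homography between images i and j is H_ij = K_i R_i R_j^T K_j^(-1) with K = diag(f, f, 1). The sketch below builds that matrix with plain arrays. It is only an illustration under these assumptions: the helper names are hypothetical, OpenCV is deliberately not used, and it is not the optimization code itself.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

// Returns a 3x3 matrix with the given diagonal and zeros elsewhere.
Mat3 diagonal(double a, double b, double c) {
    Mat3 m{};  // value-initialized, so every entry starts at zero
    m[0][0] = a;
    m[1][1] = b;
    m[2][2] = c;
    return m;
}

Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 out{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            for (int k = 0; k < 3; ++k)
                out[r][c] += a[r][k] * b[k][c];
    return out;
}

Mat3 transpose(const Mat3& a) {
    Mat3 out{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            out[r][c] = a[c][r];
    return out;
}

// Rotation about the vertical (y) axis, i.e. a horizontal pan of the camera.
Mat3 rotationY(double angle) {
    Mat3 m = diagonal(1.0, 1.0, 1.0);
    m[0][0] = std::cos(angle);
    m[0][2] = std::sin(angle);
    m[2][0] = -std::sin(angle);
    m[2][2] = std::cos(angle);
    return m;
}

// H_ij = K_i * R_i * R_j^T * K_j^(-1), mapping pixels of image j into image i.
Mat3 pairwiseHomography(double fi, const Mat3& Ri, double fj, const Mat3& Rj) {
    assert(fj != 0.0);
    const Mat3 Ki = diagonal(fi, fi, 1.0);
    const Mat3 KjInv = diagonal(1.0 / fj, 1.0 / fj, 1.0);
    return multiply(multiply(Ki, Ri), multiply(transpose(Rj), KjInv));
}
```

Comparing the entries of this predicted matrix with a homography estimated from feature matches gives one possible residual for the incremental optimization described above.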
Some tips to work with the LM lib:
1. The adata pointer is annoying to use. Is there a better method?
Yes. You can allocate the necessary data at global scope so that your estimation function can access it directly.
2. How do we use a robust error function (not a pure L2 norm)?
Simple. Set the measurement vector to zero and handcraft the error function inside the estimation function.
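Both tips can be sketched together. The callback below has the shape (parameters p, output hx, sizes m and n, user pointer adata) that LM libraries commonly expect; that signature, the toy line-fitting model and all names are assumptions for illustration, not the library's documented API. The robust error function is the one from [2]: h(r) = r^2 for |r| < sigma, and 2*sigma*|r| - sigma^2 otherwise. Writing sign(r) * sqrt(h(r)) into hx and passing an all-zero measurement vector makes the solver's sum of squares equal the robust cost.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Robust error function of [2]: quadratic near zero, linear beyond sigma.
double robustError(double r, double sigma) {
    const double a = std::fabs(r);
    return (a < sigma) ? a * a : 2.0 * sigma * a - sigma * sigma;
}

// Tip 1: keep the data at global scope instead of passing it through adata.
struct Observation {
    double x;
    double y;
};
static std::vector<Observation> g_observations;
static double g_sigma = std::numeric_limits<double>::infinity();

// Tip 2: estimation function for a toy model y = p[0] * x + p[1].
// The measurement vector given to the solver is all zeros, so the solver
// minimizes sum(hx[i]^2), which equals sum(robustError(r_i, g_sigma)).
void robustResiduals(double* p, double* hx, int m, int n, void* /*adata*/) {
    (void)m;  // number of parameters, unused in this toy model
    assert(n == static_cast<int>(g_observations.size()));
    for (int i = 0; i < n; ++i) {
        const Observation& o = g_observations[i];
        const double r = p[0] * o.x + p[1] - o.y;
        // Keep the sign of r so that hx equals r inside the quadratic region.
        hx[i] = std::copysign(std::sqrt(robustError(r, g_sigma)), r);
    }
}
```

With g_sigma left at infinity this reduces to ordinary least squares; setting it to 2 afterwards gives the two-pass schedule mentioned in the bundle adjustment section.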
Up-Vector Estimation / Multi-Band Blending:
These two parts follow the paper. We use three bands for multi-band blending. The only thing to take care of is "division by zero".
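One place where a division by zero can arise is the normalization of blend weights: each output pixel is a weighted average, and pixels covered by no image have a total weight of zero. The sketch below shows that guard for a single band stored as flat float arrays. The function name, the data layout and the epsilon threshold are assumptions for illustration; the pyramid construction of the full three-band blend is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Blends one frequency band. images[i][p] is the band value of image i at
// pixel p, and weights[i][p] is the (already blurred) blend weight there.
std::vector<float> blendBand(const std::vector<std::vector<float>>& images,
                             const std::vector<std::vector<float>>& weights,
                             float epsilon = 1e-6f) {
    assert(images.size() == weights.size());
    if (images.empty()) return {};
    const std::size_t numPixels = images[0].size();
    std::vector<float> out(numPixels, 0.0f);
    for (std::size_t p = 0; p < numPixels; ++p) {
        float weighted = 0.0f;
        float total = 0.0f;
        for (std::size_t i = 0; i < images.size(); ++i) {
            assert(images[i].size() == numPixels);
            assert(weights[i].size() == numPixels);
            weighted += weights[i][p] * images[i][p];
            total += weights[i][p];
        }
        // Guard: a pixel that no image covers has total weight zero.
        out[p] = (total > epsilon) ? weighted / total : 0.0f;
    }
    return out;
}
```

Uncovered pixels are simply left at zero here; the final panorama is the sum of the blended bands.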
Results:
Here we present several stitched panoramas from the test datasets and from images we captured in WoW (World of Warcraft). With the help of bundle adjustment and multi-band blending, there are no visible artifacts (seams or mis-registration) in any of the results. The last one was produced by the Windows Live Image Center for comparison. Note that our program even chooses the same reference view for centering the panorama. The typical execution time for one panorama is about 5-10 minutes; most of the time is spent on bundle adjustment and image stitching.
Test dataset - Parrington:
Test dataset - Abbey:
Test dataset - Mountain:

Test dataset - Matier:
Test dataset - Denny:
Test dataset - GRAIL:
WoW - Sha'tar (estimated f = 904):
WoW - Silver Moon (estimated f = 958):
Silver Moon by Windows Live Image Center:
Acknowledgment:
We thank Chia-Kai Liang for many thoughtful discussions and for sharing his experiments.
Download:
Reference:
1. M. Brown and D. G. Lowe, "Recognising Panoramas", ICCV 2003.
2. M. Brown and D. G. Lowe, "Automatic Panoramic Image Stitching using Invariant Features", IJCV 2007.
3. D. G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints", IJCV 2004.








