Visual Computing is a module I took this semester at NUS ECE, taught by Assoc. Prof. Robby Tan. It covers some of the best-known classic computer vision (CV) algorithms, such as face detection, image stitching and depth estimation.

A taste of classics

Deep learning (DL) methods, typically based on CNN architectures, are easy to implement, often achieve superior accuracy, and can result in lightweight models. Given these advantages, one might question the continued need to study traditional CV techniques.

The key lies in the distinct characteristics of each approach. Unlike the “black box” nature of many machine learning models, traditional CV algorithms are more explainable and provide greater control through manual tuning. Furthermore, they exhibit a degree of generality, as their feature extraction processes aren’t inherently tied to specific image datasets. Consequently, algorithms like SIFT remain prevalent in applications like image stitching and 3D mesh reconstruction, where class-specific knowledge isn’t required. Finally, DL can sometimes be an unnecessarily complex solution, while traditional CV techniques can often be simplified and implemented on resource-constrained devices like microcontrollers.

Course structure

The module has two main sections, each divided into several sub-sections. Every section comes with comprehensive explanations and extensive experimentation. The assignments can be found in the repos Assignment 1 and Assignment 2.

  1. Viola-Jones Face Detection Algorithm (Viola & Jones, 2004)
    1. Feature extraction: Haar-like features
    2. Speeding up convolution: integral image
    3. Feature selection: AdaBoost
    4. Speeding up classification: cascade classifier
  2. Histogram of Oriented Gradients (HOG) Features and Human Detection (Dalal & Triggs, 2005)
  3. Scale Invariant Feature Transform (SIFT) Algorithm (Lowe, 2004)
    1. Scale-space extrema: Difference of Gaussians
    2. Keypoint localization: Taylor expansion
    3. Orientation assignment
    4. Keypoint descriptor
  4. Image Stitching (Brown & Lowe, 2007)
    1. Homography
    2. RANSAC
  5. Depth Estimation (Zhang et al., 2009)
    1. Camera Parameters (Hartley & Zisserman, 2003)
    2. Depth from Stereo
    3. Markov Random Field
    4. Depth from Video
    5. Optical Flow
    6. Structure Decomposition
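
To give a flavour of item 1.2 in the outline above, here is a minimal sketch of how an integral image (summed-area table) lets a Haar-like feature be evaluated with a handful of table lookups instead of a full sum over pixels. It is written in pure Python for readability and is not taken from the course assignments; the function names `integral_image`, `box_sum` and `two_rect_haar`, as well as the left-minus-right sign convention, are assumptions made for illustration.

```python
def integral_image(img):
    """Build a summed-area table with one extra leading row and column of zeros.

    ii[y][x] holds the sum of all pixels img[r][c] with r < y and c < x.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_haar(ii, top, left, height, width):
    """Two-rectangle Haar-like feature: left half minus right half.

    The sign convention is an assumption for this sketch; width should be even.
    """
    half = width // 2
    left_sum = box_sum(ii, top, left, height, half)
    right_sum = box_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


if __name__ == "__main__":
    image = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16],
    ]
    table = integral_image(image)
    print(box_sum(table, 1, 1, 2, 2))        # sum of the central 2x2 block
    print(two_rect_haar(table, 0, 0, 4, 4))  # left two columns minus right two
```

Once the table is built, the cost of `box_sum` does not depend on the rectangle size, which is why a detector can afford to evaluate a very large number of Haar-like features per window.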

References

Brown, M., & Lowe, D. G. (2007). Automatic panoramic image stitching using invariant features. International Journal of Computer Vision, 74, 59–73.
Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), 1, 886–893.
Hartley, R., & Zisserman, A. (2003). Multiple view geometry in computer vision. Cambridge University Press.
Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60, 91–110.
Viola, P., & Jones, M. J. (2004). Robust real-time face detection. International Journal of Computer Vision, 57, 137–154.
Zhang, G., Jia, J., Wong, T.-T., & Bao, H. (2009). Consistent depth maps recovery from a video sequence. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(6), 974–988.