Introduction

This post summarizes my work for the course 16820 Advanced Computer Vision, taught by Prof. Matthew O’Toole at CMU in Fall 2024. The course covers advanced topics in computer vision, including multi-view geometry, 3D reconstruction, optical flow, and deep learning for vision. It consists of six assignments, each focusing on a different aspect of computer vision. Below are the summaries and some results from each assignment. GitHub link: CMU16820-CV

Assignment 1: Homography

Augmented Reality with Planar Homographies.

In this assignment, I compared several feature detection and description methods, including SIFT, FAST, BRIEF, and ORB. I implemented RANSAC to robustly estimate homographies between images, and used the estimated homography to fuse two images together. I also built a real-time AR application with the OpenCV API that warps Kung Fu Panda onto a book cover.
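To make the RANSAC step concrete, here is a minimal NumPy sketch of homography estimation from point matches. It is an illustration written for this post, not the code from the assignment: the function names (`fit_homography`, `project`, `ransac_homography`), the iteration count, and the 2-pixel inlier threshold are all assumptions, and the point matches are assumed to be given already.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: H (up to scale) such that dst ~ H @ src."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def project(H, pts):
    """Apply H to (N, 2) points and convert back from homogeneous coordinates."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    out = pts_h @ H.T
    with np.errstate(divide="ignore", invalid="ignore"):
        return out[:, :2] / out[:, 2:3]

def ransac_homography(src, dst, n_iters=500, thresh=2.0, seed=0):
    """Keep the 4-point model with the most inliers, then refit on those inliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(src), size=4, replace=False)
        H = fit_homography(src[idx], dst[idx])
        err = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = err < thresh  # NaN errors from degenerate samples count as outliers
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 4:
        raise ValueError("RANSAC did not find a model with at least 4 inliers")
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers
```

Calling `ransac_homography(src, dst)` with two `(N, 2)` arrays of matched points returns the refitted homography and a boolean inlier mask. The sketch skips coordinate normalization for brevity; with noisy real matches, normalizing the points before the SVD is usually worthwhile.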

Assignment 2: Lucas-Kanade Tracking

Motion Detection and Object Tracking

In this assignment, I implemented the Lucas-Kanade algorithm for optical flow estimation and object tracking. I ran the algorithm to track a moving person, cars, a cluster of ants, and other targets. I also tested and compared different hyperparameters to see how they affect tracking performance.
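The core of Lucas-Kanade is a small least-squares problem. The sketch below shows a single translation-only step on a whole patch, under the assumptions that brightness is constant and the motion is small. The function name `lucas_kanade_step` is made up for this post; a full tracker would warp the image and repeat this step until the update is tiny, which is omitted here.

```python
import numpy as np

def lucas_kanade_step(img1, img2):
    """One translation-only Lucas-Kanade step between two same-sized patches.

    Linearizing brightness constancy gives Ix*dx + Iy*dy = -(img2 - img1)
    at every pixel; the stacked system is solved in the least-squares sense.
    """
    Iy, Ix = np.gradient(img1)          # np.gradient returns d/drow, then d/dcol
    It = img2 - img1                    # temporal difference
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    d, *_ = np.linalg.lstsq(A, -It.ravel(), rcond=None)
    return d                            # estimated (dx, dy) in pixels
```

For a smooth patch that moves by a fraction of a pixel, one step already gives a good estimate. Larger motions need the iterative version, often combined with an image pyramid.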

Assignment 3: 3D Reconstruction

Reconstruct a 3D point cloud from images

I implemented the eight-point and seven-point algorithms to estimate the fundamental matrix between two images. I also implemented essential matrix estimation, triangulation, camera pose estimation, epipolar correspondence search, and bundle adjustment. Finally, I reconstructed a 3D point cloud of a scene from multiple images.
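As an illustration of the first step, here is a minimal NumPy sketch of the normalized eight-point algorithm. It is not the assignment code: the helper names (`normalize_points`, `eight_point`) and the choice of normalization (zero mean, average distance of sqrt(2) from the origin) are assumptions on my part, and the correspondences are assumed to be outlier-free.

```python
import numpy as np

def normalize_points(pts):
    """Shift (N, 2) points to zero mean and scale to mean distance sqrt(2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - mean, axis=1))
    T = np.array([[scale, 0, -scale * mean[0]],
                  [0, scale, -scale * mean[1]],
                  [0, 0, 1]])
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    return pts_h @ T.T, T

def eight_point(pts1, pts2):
    """Estimate F (unit Frobenius norm) such that x2^T F x1 = 0, from >= 8 matches."""
    x1, T1 = normalize_points(pts1)
    x2, T2 = normalize_points(pts2)
    # Each row holds the coefficients of the 9 entries of F (row-major order).
    A = np.stack([x2[:, i] * x1[:, j] for i in range(3) for j in range(3)], axis=1)
    _, _, vt = np.linalg.svd(A)
    F = vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    # Undo the normalization so F applies to the original pixel coordinates.
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)
```

The normalization step matters in practice: without it, the linear system mixes values that differ by several orders of magnitude, and the estimate becomes sensitive to noise.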

Assignment 4: Deep Learning

Train a network for character recognition

I implemented a CNN from scratch with NumPy, trained it on the NIST dataset for handwritten character recognition, and applied it to real handwritten images. I also trained a classifier for the CIFAR-10 dataset with PyTorch.

Assignment 5: Photometric Stereo

Stereo Reconstruction

I implemented a photometric stereo algorithm that recovers the surface normals and albedo of an object from multiple images taken under different lighting conditions, and used it to reconstruct a face.
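The calibrated case reduces to one least-squares solve per pixel. The sketch below assumes a Lambertian surface, known light directions, and no shadows or specular highlights. The function name `photometric_stereo` and the array layout (one row per image, one column per pixel) are choices made for this post rather than the assignment's interface.

```python
import numpy as np

def photometric_stereo(I, L):
    """Recover albedo and unit normals from intensities under known lights.

    I: (K, P) array, K images flattened to P pixels each.
    L: (K, 3) array, one light direction per image.
    Lambertian model: I = L @ B, where each column of B is albedo * normal.
    """
    B, *_ = np.linalg.lstsq(L, I, rcond=None)   # (3, P) pseudo-normals
    albedo = np.linalg.norm(B, axis=0)          # length of each pseudo-normal
    normals = B / np.maximum(albedo, 1e-12)     # avoid dividing by zero
    return albedo, normals
```

At least three images with non-coplanar light directions are needed for the system to have a unique solution. Turning the normals into a depth map requires a separate integration step that is not shown here.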

Assignment 6: Image Segmentation

Image Segmentation

I applied segmentation to images of an object taken from different views, and reconstructed a 3D point cloud of the object.

Note

Note that this post was written a year after I completed the course, so some details may be missing. TODO: add more details and descriptions.