UCF Lecture 01 - Introduction To Computer Vision
Lecture 01 Introduction to Computer Vision
This lecture covers the very basics of a what a digital image or digital video is. It also provides a brief overview of the main topics and applications in computer vision, without going into much detail about algorithms or implementations. I found it valuable to hear about what are currently the hot topics and research areas in computer vision.
My notes from the lecture:
-
He discusses what constitutes an image (2D array of pixels with values 0 - 255)
-
He discusses how an image is formed: projection of the a 3D object onto a 2D image plane.
-
Discusses approaches for reconstructing 3D information from 2D images.
-
Stereo - Depth information from two cameras.
-
Shading - makeup fools the human brain into giving your face a different shape.
-
Texture - Texture is a repeated pattern. You can look at distortions in the pattern to recover 3D information.
-
Shape from Motion - Looking at just a small collection of moving dots, we can make out that it’s a person based on the motion.
-
-
He recommends a book on computer vision by Rick Szeliski’s, a principal researcher at Microsoft research.
-
He shows a demo video of Microsoft’s Photosynth, which attempts 3D constructions of scenes from 2D images gathered on the web.
-
Shows some example applications of computer vision:
-
Mosaic - Stitching together images from a video sequence to construct a complete view of the scene.
- One example is video from UAV tracking a car down a road, mosaic stitches together all of the images of the road to create more of a map of the area.
-
“Human Detection” - Does the frame contain a person?
-
Airplane detection
-
Face Recognition
-
Facial Expressions
-
Detecting Driver Alertness
-
Lip Reading - Our brain supplements audio with lip reading.
-
Video Surveillance and Monitoring
- Automated Surveillance System - Detection & Tracking
-
They are working on a project for airport surveillance. Multiple high resolution cameras providing 360 degree view. Called wide-area surveillance (WAS), lots of people in the airport.
-
Homeland Security Advanced Research Project Agency - HSARPA
-
Called NONA system, couldn’t find links online though
-
-
UAV Surveillance
-
Currently the surveillance footage is reviewed by humans, because we don’t have the techniques to analyze these with a lot accuracy.
-
Part of the challenge is that you need to remove camera motion from the equation.
-
-
Unmanned Ground Vehicle (UGV) - Self driving cars.
-
Human Action Recognition - Recognizing the actions, activities that people are doing.
- Weizmann Action Dataset - A collection of videos constituting 9 actors and 10 actions. Try to figure out which action the person is performing.
-
Accurate Image Localization - “Where Am I?”
-
Layer Based Video Composition - Remove a foreground object from a video, filling it in with background information acquired over the sequence of frames. This is used by the film industry? Also background replacement.
-