COMP9517_24T2W1_Introduction.pdf
UNSW Sydney
2024
COMP9517 Computer Vision
2024 Term 2 Week 1
Professor Erik Meijering

Introduction

What is computer vision?
Computer science perspective: Computer vision is the interdisciplinary field that develops theories and methods to allow computers to extract relevant information from digital images or videos.
Computer engineering perspective: Computer vision is the interdisciplinary field that develops algorithms and tools to automate perceptual tasks normally performed by the human visual system.

Copyright (C) UNSW COMP9517 24T2W1 Introduction

Every picture tells a story
“A picture is worth a thousand words”
Computer vision automates and integrates many information processing and representation approaches useful for visual perception.
https://en.wikipedia.org/wiki/montparnasse_derailment

Can computers match (or beat) humans?
Yes and no.
Humans are still better at “hard” tasks: ambiguous data, leveraging prior knowledge, continual learning, working across applications.
Computers can be better at “easy” tasks: high-quality data, using mathematical models, consistent training set, single well-defined application.

Human vision has its limitations…
Which objects are brighter?
Which side of this object is brighter?
Are the cells “popping out” or “popping in”?
180°
What pattern do the squares form?
What object do you see in this image?
How do the main lines run with respect to each other?
In which direction are these particles moving?
https://www.youtube.com/watch?v=a7efEqgpIrE

Course rationale
Human vision has its limitations:
– Intensities, shapes, patterns, and motions can be misinterpreted
– It is labor-intensive, time-consuming, subjective, and error-prone
Computer vision can potentially improve this:
– Computers can work day and night without getting tired
– They analyze information quantitatively and objectively
– They are potentially more accurate, precise, and reproducible
If the methods and tools are well designed!
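To make the point about quantitative, objective analysis concrete, here is a minimal sketch in plain Python. It assumes a simultaneous-contrast setup (a classic brightness illusion, not necessarily the one shown on the slides): the same grey patch is placed on a dark and on a light background. People tend to judge the patch on the dark background as brighter, whereas a program simply measures the pixel values. The helper names, image sizes, and grey levels are all made up for illustration.

```python
# Illustrative sketch only: sizes and grey levels are arbitrary assumptions.

def make_panel(background, patch, size=9, patch_size=3):
    """Return a size x size grid of `background` with a centred square of `patch`."""
    start = (size - patch_size) // 2
    end = start + patch_size
    return [
        [patch if start <= r < end and start <= c < end else background
         for c in range(size)]
        for r in range(size)
    ]


def mean_intensity(image, top, left, height, width):
    """Mean grey level of a rectangular region of the image."""
    values = [image[r][c]
              for r in range(top, top + height)
              for c in range(left, left + width)]
    return sum(values) / len(values)


# Same patch value (128) on a dark (40) and a light (220) background.
dark_panel = make_panel(background=40, patch=128)
light_panel = make_panel(background=220, patch=128)

# The 3x3 patch occupies rows and columns 3 to 5 of each 9x9 panel.
mean_on_dark = mean_intensity(dark_panel, 3, 3, 3, 3)
mean_on_light = mean_intensity(light_panel, 3, 3, 3, 3)

print("patch on dark background: ", mean_on_dark)
print("patch on light background:", mean_on_light)
print("measured as equally bright:", mean_on_dark == mean_on_light)
```

The measurement is only as good as the choice of region and the quality of the image, which is the caveat the slide ends on: the methods and tools have to be well designed.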

Application: 3D shape reconstruction
Project VarCity recreates 3D city models using social media photos

Application: image captioning
Google’s Show and Tell open-source image captioning model in TensorFlow

Application: intelligent collision avoidance
Iris Automation provides safer drone operation with intelligent collision avoidance

Application: face detection and recognition
Facebook’s DeepFace project nears human accuracy in identifying faces
For improving image capture on digital cameras

Application: vision-based biometrics
How the Afghan girl was identified by her iris patterns
Who is she?
The remarkable story of Sharbat Gula, first photographed in 1984 aged 12 in a refugee camp in Pakistan by National Geographic photographer Steve McCurry, and traced 18 years later to a remote part of Afghanistan where she was again photographed by McCurry…

Application: logging in without a password
Fingerprint scanners on modern laptops and other devices
Windows Hello makes logging in as easy as looking at your PC

Application: optical character recognition (OCR)
Converting scanned documents or number plates to processable text

Application: landmark recognition

Application: autonomous vehicles
Intel’s Mobileye makes cars safer and more autonomous

Application: space exploration
NASA’s Mars Exploration Rover Spirit autonomously captured this picture in 2007
Vision systems used for panorama stitching, 3D terrain modeling, obstacle detection, position tracking
See Computer Vision on Mars for more information

Application: machine vision in robotics
NASA’s Mars Spirit Rover
RoboCup

Application: video surveillance
Software from TrafficVision turns traffic cameras into intelligent sensors
– Traffic monitoring
– Action recognition
– Incident detection
– Speed estimation
– Vehicle counting...

Application: medical imaging
Computer Aided Diagnosis
Image Guided Surgery

Goals and challenges of computer vision
Extract useful information from images, both metric and semantic
Data ambiguity, heterogeneity, and complexity are a big challenge
Significant progress in recent years due to various improvements:
– Processing power
– Storage capacity
– Memory capacity
– Data availability
Careful design of every step in the computer vision workflow:
images > measurements > representation > algorithms for learning and inference

Computer vision tasks
Obtain simple inferences from individual pixel values
Group pixels to separate object regions or infer shape information
Recognise objects using geometric or statistical pixel information
Combine information from multiple images into a coherent whole
Requires understanding of the physics of imaging and the use of mathematical and statistical models for information extraction

Low-level computer vision
This concerns mostly image processing (image in > image out)
Sensing: image capture and digitization
Preprocessing: suppress noise and enhance object features
Segmentation: separate objects from background and partition them
Description: compute feature maps which differentiate objects
Labeling: assign labels to image segments (regions of interest)

High-level computer vision
This concerns deeper image analysis (image in > knowledge out)
Detection: detect, localize, count objects of interest
Recognition: identify object types based on low-level information
Classification: assign unique labels to recognized objects
Interpretation: assign meaning to groups of recognized objects
Scene analysis: complete understanding of the captured scene
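The low-level steps listed above (preprocessing, segmentation, labeling) and the simplest high-level step (counting objects) can be sketched on a tiny synthetic image. This is a minimal illustration in plain Python, not the course's method: the image values, the 3x3 median filter, the fixed threshold of 100, 4-connectivity, and all function names are assumptions chosen for the example. In practice one would more likely use a library such as OpenCV or scikit-image for these operations.

```python
from collections import deque

# Synthetic 8x10 greyscale image (made-up values): two bright 3x3 objects on a
# dark background, plus one isolated noisy pixel at row 6, column 1.
IMAGE = [
    [10, 10, 10, 10, 10, 10, 10, 10, 10, 10],
    [10, 200, 200, 200, 10, 10, 10, 10, 10, 10],
    [10, 200, 200, 200, 10, 10, 10, 10, 10, 10],
    [10, 200, 200, 200, 10, 10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10, 10, 200, 200, 200, 10],
    [10, 10, 10, 10, 10, 10, 200, 200, 200, 10],
    [10, 250, 10, 10, 10, 10, 200, 200, 200, 10],
    [10, 10, 10, 10, 10, 10, 10, 10, 10, 10],
]


def median_filter_3x3(image):
    """Preprocessing: replace each pixel by the median of its 3x3 neighbourhood.

    Border pixels are handled by clamping coordinates (edge replication).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            ]
            out[r][c] = sorted(window)[4]  # middle of the 9 sorted values
    return out


def threshold(image, level):
    """Segmentation: 1 for pixels brighter than `level`, 0 for background."""
    return [[1 if value > level else 0 for value in row] for row in image]


def label_regions(mask):
    """Labeling: give each 4-connected foreground region its own integer label."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 1 and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one region
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def region_areas(labels, count):
    """Description: pixel area of each labeled region."""
    areas = [0] * count
    for row in labels:
        for value in row:
            if value:
                areas[value - 1] += 1
    return areas


# Without preprocessing, the noisy pixel is counted as an extra object.
raw_labels, raw_count = label_regions(threshold(IMAGE, 100))
# With median filtering first, the isolated pixel is suppressed.
clean_labels, clean_count = label_regions(threshold(median_filter_3x3(IMAGE), 100))

print("objects without preprocessing:", raw_count)
print("objects with median filtering:", clean_count)
print("areas after filtering:", region_areas(clean_labels, clean_count))
```

The sketch also hints at why every step needs careful design: the median filter removes the noisy pixel, but it also trims the corners of the small objects, so the measured areas change with the choice of preprocessing.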

Assumed knowledge
To do this course successfully you should:
– Be able to program well in Python or be willing to learn it independently
– Be familiar with data structures and algorithms and basic statistics
– Be able (or learn) to use software packages (OpenCV, Scikit-Learn, Keras)
– Be familiar with vector calculus and linear algebra or be willing to learn them
Please self-assess before deciding to stay/enroll in the course

Student learning outcomes
After completing this course, you will be able to:
– Explain basic scientific and engineering approaches to computer vision
– Implement and test computer vision algorithms using existing software
– Build larger computer vision applications by integrating software modules
– Interpret and comment on articles in the computer vision literature

Course topics and lecturers
Week  Topic                           Lecturer
1     Introduction & Image Formation  Professor Erik Meijering
2     Image Processing                Senior Lecturer Dong Gong
3     Feature Representation          Professor Erik Meijering
4     Pattern Recognition             Senior Lecturer Dong Gong
5     Image Segmentation              Professor Erik Meijering
6     Flexible Week (No Lectures)
7     Deep Learning I                 Senior Lecturer Dong Gong
8     Deep Learning II                Senior Lecturer Dong Gong
9     Motion and Tracking             Professor Erik Meijering
10    Applications                    Guest Lecturers

Weekly class structure
Lectures: Wednesdays 4-6pm & Thursdays 4-6pm (via BB Collab)
– Wednesday lectures will be on campus and online (links in Moodle)
Lab consultations: Wednesdays 6-7pm in weeks 2-5 (via BB Collab)
– Software demos and consultations with your assigned tutor (links in Moodle)
Project consultations: Wednesdays 6-7pm in weeks 6-10 (via BB Collab)
– All project consultations will be online with your assigned tutor (links in Moodle)

Assessments
Assessment     Marks  Release           Due               Where
Lab Work (4x)  10%    Weeks 2, 3, 4, 5  Weeks 3, 4, 5, 7  Online
Group Project  40%    Week 5            Week 10           Online
Exam           50%    Exam Day          Exam Day          On Campus (CSE Labs)
Late submission penalty: Unless you have been granted Special Consideration, work submitted after the deadline during term will incur a penalty of 5% per day, capped at 5 days, after which submissions are no longer accepted. For the final examination, university exam rules apply.

Communication modes and etiquette
– The online forum (Ed) is your first port of call for queries of wider interest on lectures, labs, project, exam, and general administrative things
– Contact the LIC for late submission, absence, assessment deadlines, and specific questions about the assignment, labs, project, and assessment contents
– Contact the course admin for issues with enrolment, file submission, group enrolment, or other administration-related matters
– The team is committed to responding quickly to queries, with a maximum turnaround of 24 hours
– Do observe standards of equity and respect in dealing with all students and staff, in person, in emails, in forum posts, and in all other communication
– The language of communication is English

Special Consideration
– If your work in this course is affected by unforeseen adverse circumstances, you should apply for Special Consideration via the UNSW website
– UNSW handles Special Consideration requests centrally, so use the website and do not email the Lecturer in Charge about Special Consideration requests
– Special Consideration requests must be accompanied by documentation
– Marks are calculated in the same way as for other students who sat the original assessment
– If you are awarded a Supplementary Exam and do not attend, your exam mark will be zero
– See the course webpage on WebCMS3 for more detailed information and links

Plagiarism Policy
READ the UNSW Policy and Procedure on this (links in the course outline on WebCMS3)
For the purposes of COMP9517, plagiarism includes copying or obtaining all, or a substantial part, of the material for your assignment, whether written or graphical report material, or software code, without written acknowledgement in your assignment from:
– A location on the internet (including ChatGPT, GitHub Copilot, Google Bard etc.)
– A book, article or other written document (published or unpublished) in any form
– Another student, whether in your class or another class, at UNSW or elsewhere
– Someone else (for example someone who writes assignments for money)
If you copy material from another student or non-student with acknowledgement, you will not be penalized for plagiarism, but the marks you get for this will be at the marker’s discretion and will reflect the marker’s perception of the amount of work you put into finding and/or adapting the code/text
If you use text found in a publication (on the internet or elsewhere), the marks you get for this will be at the marker’s discretion and will reflect the marker’s perception of the amount of work you put into finding and/or adapting the text
Assessments provide opportunities for you to develop important skills. Use these opportunities.

Copyright Notice
– All course materials made available to you are copyrighted by UNSW
– Reproducing, publishing, posting, distributing, or translating them is a copyright infringement
– Infringements will be reported to UNSW Student Conduct and Integrity for action

Further information on WebCMS3
Please be sure you are familiar with:
– Communication Etiquette
– Special Consideration
– Student Conduct
– Plagiarism Policy
– Academic Integrity

Further reading on discussed topics
In the lectures we will be referring to various online resources for further reading:
– Richard Szeliski, Computer Vision: Algorithms and Applications, 2nd Edition, Springer, 2021
– Dana H. Ballard and Christopher M. Brown, Computer Vision, Prentice Hall, 1982
– Ian Goodfellow, Yoshua Bengio, Aaron Courville, Deep Learning, MIT Press, 2016
– David A. Forsyth and Jean Ponce, Computer Vision: A Modern Approach, Prentice Hall, 2011
– Simon J. D. Prince, Computer Vision: Models, Learning and Inference, Cambridge University Press, 2012
– And other books, articles, and resources online or via the UNSW Library
For this week:
– Chapter 1 of Szeliski for a general introduction to computer vision
– Appendix A of Szeliski for a recap of linear algebra and numerical techniques