This repository contains exercise solutions for the ML4360 course. The course has six exercises (ex01 through ex06), each made up of pen-and-paper questions and programming problems; this repository contains only the implementations of the programming problems.
Course Introduction: The goal of computer vision is to compute geometric and semantic properties of the three-dimensional world from digital images. Problems in this field include reconstructing the 3D shape of an object, determining how things are moving, and recognizing objects or scenes. This course provides an introduction to computer vision, with topics including image formation, camera models, camera calibration, feature detection and matching, motion estimation, geometry reconstruction, object detection and tracking, and scene understanding. Applications include building 3D maps, creating virtual avatars, image search, organizing photo collections, human-computer interaction, video surveillance, self-driving cars, robotics, virtual and augmented reality, simulation, medical imaging, and mobile computer vision. Modern computer vision relies heavily on machine learning, in particular deep learning and graphical models. The course therefore assumes prior knowledge of deep learning (e.g., from a deep learning lecture) and introduces the basic concepts of graphical models and structured prediction where needed. The tutorials deepen the understanding of deep neural networks by implementing and applying them in Python and PyTorch. A strong emphasis of this course is on 3D vision.
Course Goals: Students gain an understanding of the theoretical and practical concepts of computer vision, including image formation, camera models, feature detection, multiple view geometry, 3D reconstruction, motion estimation, object recognition, scene understanding, and structured prediction using deep neural networks and graphical models. A strong emphasis of this course is on 3D vision. After this course, students should be able to understand and apply the basic concepts of computer vision in practice, develop and train computer vision models, reproduce research results, and conduct original research in this area.