Computer vision is the science that aims to give computers a visual sense so that they can understand the world much as we humans do. It's a field of Artificial Intelligence and Computer Science. It's concerned with the automatic extraction, analysis, and understanding of useful information from a single image or a sequence of images. It involves the development of a theoretical and algorithmic basis to achieve automatic visual understanding. A significant part of artificial intelligence deals with planning or deliberation for systems that can perform mechanical actions, such as moving a robot through some environment. This type of processing typically needs input data provided by a computer vision system, which acts as a vision sensor and provides high-level information about the environment.
So, how does it work?
Computer vision works on digital images through three main processing components: image acquisition, image processing, and image analysis and understanding.
Image acquisition
Image acquisition is the process of translating the world around us into binary data composed of zeros and ones, interpreted as digital images. A digital image is produced by one or several image sensors, which, besides various types of light-sensitive cameras, include range sensors, tomography devices, radar, ultrasonic cameras, etc. Depending on the type of sensor, the resulting image data is an ordinary 2D image, a 3D volume, or an image sequence. The pixel values typically correspond to light intensity in one or several spectral bands (gray images or colour images), but can also be related to various physical measures, such as depth, absorption or reflectance of sonic or electromagnetic waves, or nuclear magnetic resonance.
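To make the idea of pixel values concrete, here is a minimal sketch of how a small colour image can be held in memory and reduced to a single intensity band. The pixel values are made up for illustration, and the grayscale weights are the commonly used ITU-R BT.601 luma coefficients, which is an assumption on our part and not something a particular sensor prescribes.

```python
# A tiny 2x3 colour image: each pixel is an (R, G, B) triple in the range 0-255.
# The values are invented purely for illustration.
rgb_image = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(0, 0, 0), (128, 128, 128), (255, 255, 255)],
]

def to_gray(image):
    """Convert an RGB image to a single-band intensity image.

    Uses the common ITU-R BT.601 luma weights (0.299, 0.587, 0.114).
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]

gray_image = to_gray(rgb_image)
height, width = len(gray_image), len(gray_image[0])
print(f"{width}x{height} gray image:")
for row in gray_image:
    print(row)
```

A real pipeline would normally read such data from a camera or an image file, but the in-memory layout is essentially the same: a grid of numbers per spectral band.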
Devices commonly used for image acquisition -
Webcams & embedded cameras
Digital compact cameras & DSLR
Consumer 3D cameras & laser range finders
Image processing
In this step, algorithms are applied to the binary data acquired in the first step to infer low-level information about parts of the image. This information is characterized by, for example, image edges, point features, or segments, which are the basic geometric elements that build up objects in images. This second step usually involves advanced applied-mathematics algorithms and techniques, such as edge detection, segmentation, classification, and feature detection and matching.
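As an illustration of the low-level step, the sketch below runs edge detection on a toy image. It uses the Sobel operator, which is one common choice among many and not necessarily what any given system uses. The synthetic image and the threshold value are assumptions picked for this example.

```python
def sobel_magnitude(gray):
    """Return the gradient magnitude (|gx| + |gy|) at each interior pixel.

    Border pixels are left at 0 because the 3x3 window does not fit there.
    """
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to horizontal change
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to vertical change
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    p = gray[y + dy - 1][x + dx - 1]
                    gx += kx[dy][dx] * p
                    gy += ky[dy][dx] * p
            out[y][x] = abs(gx) + abs(gy)
    return out

# Synthetic 5x6 image: dark left half (0), bright right half (255).
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_magnitude(image)

threshold = 255  # arbitrary cut-off chosen for this toy image
edge_mask = [[1 if v > threshold else 0 for v in row] for row in edges]
for row in edge_mask:
    print(row)
```

The mask marks the two columns on either side of the dark-to-bright boundary, which is the kind of low-level information later steps build on.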
At this step, the input is typically a small set of data, for example, a set of points or an image region which is assumed to contain a specific object. The remaining processing deals with, for example -
Verification that the data satisfy model-based and application-specific assumptions.
Estimation of application specific parameters, such as object pose or object size.
Image recognition - classifying a detected object into different categories.
Image registration - comparing and combining two different views of the same object.
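The estimation and verification items above can be sketched on a small set of points. The example below estimates an object's position and size from the pixels assumed to belong to it, then applies a simple size check. The function names, the region, and the size limits are all hypothetical and chosen only for illustration.

```python
def estimate_region(points):
    """Estimate position and size of an object from its pixel coordinates.

    `points` is a list of (x, y) pixels assumed to belong to one object.
    Returns the centroid and the axis-aligned bounding-box width and height.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    centroid = (sum(xs) / len(xs), sum(ys) / len(ys))
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    return centroid, width, height

def passes_size_check(width, height, min_side=2, max_side=10):
    """Hypothetical application-specific check on the estimated size."""
    return min_side <= width <= max_side and min_side <= height <= max_side

# Made-up region: a filled 4x3 rectangle with its top-left corner at (2, 1).
region = [(x, y) for y in range(1, 4) for x in range(2, 6)]
centroid, width, height = estimate_region(region)
print(centroid, width, height)
print(passes_size_check(width, height))
```

Real systems estimate richer parameters, such as full 3D pose, but the pattern is the same: reduce a small set of data to a few numbers and check them against what the application expects.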
Image analysis and understanding
The last step of the Computer Vision pipeline is the actual analysis of the data, which allows decisions to be made. High-level algorithms are applied, using both the image data and the low-level information computed in previous steps.
3D scene mapping
Making the final decision required for the application.
The most popular datasets -
PASCAL VOC Dataset - This dataset provides standardised image data sets for object class recognition.
Open Images - This dataset contains 9 million images and 5,000 object categories.
ImageNet - An extremely useful resource, this database consists of 15 million images and 22,000 object categories.
CALTECH-101 - 9,000 images with 101 object categories.
Kaggle CIFAR-10 - One of the most popular datasets, it consists of 60,000 color images and 10 object categories.
Common Objects in Context (COCO) - Better known as COCO, this dataset consists of 330K images and 80 object categories.
Implementation in Real Life
Techniques developed for Computer Vision have many real-world applications. These include motion recognition, augmented reality, autonomous cars, domestic/service robots, medical computer vision or medical image processing, and image restoration such as denoising.
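To give a feel for one of these applications, here is a minimal sketch of image restoration by denoising. It uses a 3x3 median filter, which is one simple technique among many, and a made-up image containing a single noisy pixel.

```python
from statistics import median

def median_filter(gray):
    """Apply a 3x3 median filter; border pixels are copied unchanged."""
    h, w = len(gray), len(gray[0])
    out = [row[:] for row in gray]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [gray[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out

# Flat gray image (value 100) with one "salt" pixel of impulse noise.
noisy = [[100] * 5 for _ in range(5)]
noisy[2][2] = 255

denoised = median_filter(noisy)
print(denoised[2][2])
```

The median replaces the outlier with a value typical of its neighbourhood, which is why this filter is often used against impulse noise.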
One of the most prominent application fields is medical computer vision or medical image processing. This area is characterized by the extraction of information from image data for the purpose of making a medical diagnosis of a patient. Generally, image data is in the form of microscopy images, X-ray images, angiography images, ultrasonic images, and tomography images.
The next application area in computer vision is in industry, sometimes called machine vision, where information is extracted for the purpose of supporting a manufacturing process. Machine vision is also heavily used in agricultural processes to remove undesirable foodstuff from bulk material, a process called optical sorting.
Military applications are probably one of the largest areas for computer vision. More advanced systems for missile guidance send the missile to an area rather than a specific target, and target selection is made when the missile reaches the area based on locally acquired image data. Modern military concepts, such as "battlefield awareness", imply that various sensors, including image sensors, provide a rich set of information about a combat scene which can be used to support strategic decisions.
One of the newer application areas is autonomous vehicles, which include submersibles, land-based vehicles such as small robots with wheels, cars, or trucks, and aerial vehicles, including unmanned aerial vehicles. The level of autonomy ranges from fully autonomous vehicles to vehicles where computer vision based systems support a driver or a pilot in various situations. Fully autonomous vehicles typically use computer vision for navigation, that is, for knowing where they are, for producing a map of their environment, and for detecting obstacles.
Computer vision can also be used for detecting certain task-specific events, such as an unmanned aerial vehicle looking for forest fires. Examples of supporting systems are obstacle warning systems in cars and systems for autonomous landing of aircraft.
Space exploration is already being carried out with autonomous vehicles using computer vision. Examples - NASA's Mars Exploration Rover and ESA's ExoMars Rover.
Support of visual effects creation for cinema and broadcast. Example - camera tracking (matchmoving).
Surveillance systems.
Tracking and counting organisms in the biological sciences.
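The counting half of that last item can be sketched in a few lines. The example below counts separate blobs in a binary mask by flood-filling 4-connected foreground pixels. It assumes an earlier step has already separated the organisms from the background, and the mask itself is made up for illustration. Tracking across frames is not covered here.

```python
from collections import deque

def count_blobs(mask):
    """Count 4-connected groups of foreground (1) pixels in a binary mask."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] != 1 or seen[sy][sx]:
                continue
            count += 1  # found an unvisited blob; flood-fill all of it
            queue = deque([(sx, sy)])
            seen[sy][sx] = True
            while queue:
                x, y = queue.popleft()
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < w and 0 <= ny < h
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
    return count

# Made-up mask: 1 marks pixels that belong to some organism.
mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
print(count_blobs(mask))
```

Touching organisms would be merged into one blob by this approach, so real counting systems usually add a separation step before or after it.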
Career opportunities in vision
A career in Computer Vision can lead to a position within academe or industry. Many people now working in the vision industry initially worked in vision research in an academic institution, and a number of people with industrial experience have become academics. Opportunities for a career in Computer Vision in an industrial environment include working within software and hardware development, application-oriented companies, general purpose product sales, and research-based development. Thousands of companies around the world use and develop computer vision techniques.
Companies that engage in computer vision -
2d3 (Oxford) - Tracking; Mosaicing; Terrain generation; Change Detection; Camera Tracking; Metadata Extraction; Image Segmentation
Alpha Vision Design (London) - Automatic number plate recognition; People counting
Aralia Systems (West Sussex) - Surveillance; Tracking
Waterfall Solutions (Guildford) - Defence systems; Security and surveillance; Video analytics
Yotta DCL Ltd (Leamington Spa) - Pavement management; Assets management; Web-based inventory service
Image Metrics (Manchester) - Markerless Avatar Controlling via AAMs
Imorphics (Manchester) - Medical image analysis
Industrial Vision Systems Ltd (Kingston Bagpuize) - Industrial inspection
Ipsotek (London) - Video Analytics
Kinese (London) - Forensic video retrieval, search, and analysis
Microsoft Research (Cambridge) - Computer Vision; Machine Learning; Online services
AstraZeneca (Manchester) - Biomarkers
Aurora (Northampton) - Face Recognition; Biometrics
BBC (London, Salford) - Sports viewing enhancement; Person identification
Toshiba Research Europe (Cambridge) - 3D shape modeling; Gestural interfaces; Motion analysis
Vicon (Oxford) - Video analysis for motion tracking
Visio Ingenii (Luton) - Object recognition, tracking, robotics, detection & monitoring
Creative Dimension Software Ltd (Guildford) - 3D object shape modeling from photos
Disney Research (Edinburgh) - Motion Tracking & Image Analysis for Digital Acting, AR, VR
Five AI - Autonomous vehicles
Mobile Acuity (Edinburgh) - Mobile Visual Search Technology
NavTech Radar (London) - Threat detection and situation awareness via radar and CCTV
Framestore-CFC (London) - Visual Effects
FS Systems (Ringstead) - Smart cameras and sensors
In the end, we can say that every technology comes with its own concerns, and Computer Vision, where any visual is more than an image, is no exception. It's just the beginning of a new technological journey that has the potential to turn our science fiction into reality, whether it's for business, health, or pleasure. There's still a lot of work to be done, which is a good thing for anyone who is interested in this next big visual technology.