Sun Feb 27 2022

Computer Vision and Its Possibilities

Computer vision is the science of giving computers a visual sense so that they can understand the world much as we humans do. It is a field of artificial intelligence and computer science concerned with the automatic extraction, analysis, and understanding of useful information from a single image or a sequence of images. It involves developing the theoretical and algorithmic basis for automatic visual understanding. A significant part of artificial intelligence deals with planning or deliberation for systems that perform mechanical actions, such as moving a robot through an environment. This type of processing typically needs input data from a computer vision system, which acts as a vision sensor and provides high-level information about the environment.

So, How Does It Work?

Computer vision works on digital images through the following main processing stages -

Image acquisition

Image acquisition is the process of translating the world around us into binary data, composed of zeros and ones, that is interpreted as digital images. A digital image is produced by one or several image sensors, which, besides various types of light-sensitive cameras, include range sensors, tomography devices, radar, ultrasonic cameras, etc. Depending on the type of sensor, the resulting image data is an ordinary 2D image, a 3D volume, or an image sequence. The pixel values typically correspond to light intensity in one or several spectral bands (gray images or colour images), but can also be related to various physical measures, such as depth, absorption or reflectance of sonic or electromagnetic waves, or nuclear magnetic resonance.

Devices commonly used for image acquisition -

  • Webcams & embedded cameras

  • Digital compact cameras & DSLR

  • Consumer 3D cameras & laser range finders
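To make the idea of pixel values concrete, here is a minimal sketch of a tiny colour image stored as rows of (red, green, blue) intensities and converted to a gray image. The function name `rgb_to_gray` and the sample pixels are made up for illustration, and the weights are the common ITU-R BT.601 luma coefficients, which is one of several possible choices.

```python
def rgb_to_gray(pixel):
    """Convert one (R, G, B) pixel with 0-255 channels to a gray intensity.

    Assumption: BT.601 luma weights; other weightings are also in use.
    """
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


# A hypothetical 2x2 colour image: each row is a list of (R, G, B) pixels.
color_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

# The gray image has one intensity value per pixel instead of three.
gray_image = [[rgb_to_gray(pixel) for pixel in row] for row in color_image]

for row in gray_image:
    print(row)
```

A real sensor delivers far larger arrays, but the layout is the same: a grid of numbers per spectral band.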

Image processing

In this process, algorithms are applied to the binary data acquired in the first step to infer low-level information about parts of the image. This information includes image edges, point features, and segments, which are the basic geometric elements that make up objects in images. This second step usually relies on advanced applied mathematics. Common techniques are edge detection, segmentation, classification, and feature detection and matching.
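As an illustration of edge detection, the sketch below applies the Sobel operator to a small gray image held as a list of rows. It is a plain-Python teaching example under simple assumptions (border pixels are left at zero, and the helper name `sobel_magnitude` is made up here); production code would normally use an optimized library.

```python
import math

# Sobel kernels for horizontal (x) and vertical (y) intensity changes.
KERNEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KERNEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude of a 2D gray image (list of rows).

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    height, width = len(image), len(image[0])
    result = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    value = image[y + dy - 1][x + dx - 1]
                    gx += KERNEL_X[dy][dx] * value
                    gy += KERNEL_Y[dy][dx] * value
            result[y][x] = math.hypot(gx, gy)
    return result


# A hypothetical 5x6 image: dark on the left half, bright on the right half.
step_image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_magnitude(step_image)
print(edges[2])
```

The response is large only in the columns next to the dark-to-bright boundary and zero in the flat regions, which is what marks the edge.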

High-level processing

At this step, the input is typically a small set of data, for example, a set of points or an image region which is assumed to contain a specific object. The remaining processing deals with, for example -

  • Verification that the data satisfy model-based and application-specific assumptions.

  • Estimation of application-specific parameters, such as object pose or object size.

  • Image recognition - classifying a detected object into different categories.

  • Image registration - comparing and combining two different views of the same object.
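The parameter-estimation item above can be sketched for the simplest case: given the pixel coordinates of a region assumed to contain one object, estimate its position, size, and orientation. This is a minimal sketch using bounding boxes and second-order image moments; the function name `estimate_pose` and the sample points are hypothetical, and real systems often use more robust estimators.

```python
import math


def estimate_pose(points):
    """Estimate centroid, bounding-box size, and orientation of a point set.

    points: list of (x, y) pixel coordinates belonging to one object.
    The angle is measured in image axes (x to the right, y downward).
    """
    n = len(points)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    cx, cy = sum(xs) / n, sum(ys) / n

    # Size in pixels, counting both end pixels of the bounding box.
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1

    # Central second-order moments describe how the points spread out.
    mu20 = sum((x - cx) ** 2 for x in xs) / n
    mu02 = sum((y - cy) ** 2 for y in ys) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in points) / n
    angle = 0.5 * math.atan2(2 * mu11, mu20 - mu02)

    return {
        "centroid": (cx, cy),
        "size": (width, height),
        "angle_degrees": math.degrees(angle),
    }


# A hypothetical object: three pixels lying on a diagonal line.
pose = estimate_pose([(0, 0), (1, 1), (2, 2)])
print(pose)
```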

Image analysis and understanding

The last step of the computer vision pipeline is the actual analysis of the data, which makes decision making possible. High-level algorithms are applied, using both the image data and the low-level information computed in previous steps.

Techniques -

  • 3D scene mapping

  • Object recognition

  • Object tracking
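Object tracking, for example, can be sketched in its simplest form as matching each object seen in the previous frame to the nearest detection in the current frame. The code below is a greedy nearest-centroid sketch, not a production tracker; the function name `match_tracks`, the `max_distance` parameter, and the sample coordinates are all assumptions made for illustration.

```python
import math


def match_tracks(previous, detections, max_distance=50.0):
    """Associate known tracks with new detections by nearest centroid.

    previous:   dict mapping track id -> (x, y) centroid in the last frame.
    detections: list of (x, y) centroids found in the current frame.
    Returns a dict of track id -> (x, y); unmatched detections get new ids,
    and tracks with no detection within max_distance are dropped.
    """
    updated = {}
    unused = list(detections)
    for track_id, (px, py) in previous.items():
        if not unused:
            break
        nearest = min(unused, key=lambda d: math.hypot(d[0] - px, d[1] - py))
        if math.hypot(nearest[0] - px, nearest[1] - py) <= max_distance:
            updated[track_id] = nearest
            unused.remove(nearest)

    # Any detection left over is treated as a newly appeared object.
    next_id = max(previous, default=-1) + 1
    for detection in unused:
        updated[next_id] = detection
        next_id += 1
    return updated


# Hypothetical centroids from two consecutive frames.
last_frame = {0: (10, 10), 1: (100, 100)}
current = match_tracks(last_frame, [(102, 98), (12, 11), (300, 300)])
print(current)
```

Greedy matching is easy to follow but can make poor choices when objects are close together, which is why practical trackers usually add motion models or optimal assignment.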

Decision making

Making the final decision required for the application.

Examples -

  • Pass/fail on automatic inspection applications

  • Match / no-match in recognition applications
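Both kinds of decision often come down to comparing a measured value against a threshold chosen for the application. The sketch below shows that idea only; the function names and the default thresholds (1% defective pixels, a similarity of 0.8) are hypothetical values, not recommendations.

```python
def inspection_decision(defect_pixels, total_pixels, max_defect_ratio=0.01):
    """Return "pass" when the share of defective pixels is small enough."""
    if total_pixels <= 0:
        raise ValueError("total_pixels must be positive")
    ratio = defect_pixels / total_pixels
    return "pass" if ratio <= max_defect_ratio else "fail"


def recognition_decision(similarity, threshold=0.8):
    """Return "match" when a similarity score reaches the threshold."""
    return "match" if similarity >= threshold else "no-match"


# Hypothetical measurements produced by the earlier pipeline stages.
print(inspection_decision(defect_pixels=50, total_pixels=10_000))
print(recognition_decision(similarity=0.93))
```

In practice the thresholds are tuned on labelled examples, trading missed defects against false alarms.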

The most popular datasets -

  • PASCAL VOC Dataset - This dataset provides standardised image data sets for object class recognition.

  • Open Images - This dataset contains 9 million images and 5,000 object categories.

  • ImageNet - An extremely useful resource, this database consists of 15 million images and 22,000 object categories.

  • CALTECH-101 - 9,000 images with 101 object categories.

  • Kaggle CIFAR-10 - One of the most popular datasets, it consists of 60,000 color images and 10 object categories.

  • Common Objects in Context (COCO) - This dataset consists of 330K images and 80 object categories.

Implementation in Real Life

Techniques developed for computer vision have many real-world applications, including motion recognition, augmented reality, autonomous cars, domestic and service robots, medical computer vision or medical image processing, and image restoration such as denoising.

  • One of the most prominent application fields is medical computer vision or medical image processing. This area is characterized by the extraction of information from image data for the purpose of making a medical diagnosis of a patient. Generally, image data is in the form of microscopy images, X-ray images, angiography images, ultrasonic images, and tomography images.

  • A second application area is industry, where computer vision is sometimes called machine vision and information is extracted to support a manufacturing process. Machine vision is also heavily used in agricultural processes to remove undesirable foodstuff from bulk material, a process called optical sorting.

  • Military applications are probably one of the largest areas for computer vision. More advanced systems for missile guidance send the missile to an area rather than a specific target, and target selection is made when the missile reaches the area based on locally acquired image data. Modern military concepts, such as "battlefield awareness", imply that various sensors, including image sensors, provide a rich set of information about a combat scene which can be used to support strategic decisions.

  • One of the newer application areas is autonomous vehicles, which include submersibles, land-based vehicles such as small robots with wheels, cars or trucks, aerial vehicles, and unmanned aerial vehicles. The level of autonomy ranges from fully autonomous vehicles to vehicles where computer-vision-based systems support a driver or a pilot in various situations. Fully autonomous vehicles typically use computer vision for navigation, that is, for knowing where they are, for producing a map of their environment, and for detecting obstacles.

  • Computer vision can also be used for detecting certain task-specific events, such as an unmanned aerial vehicle looking for forest fires. Examples of supporting systems are obstacle warning systems in cars and systems for autonomous landing of aircraft.

  • Space exploration is already being carried out with autonomous vehicles using computer vision. Examples - NASA's Mars Exploration Rover and ESA's ExoMars Rover.

  • Support of visual effects creation for cinema and broadcast. Example - camera tracking (matchmoving).

  • Surveillance systems.

  • Tracking and counting organisms in the biological sciences.

Career opportunities in vision

A career in Computer Vision can lead to a position within academe or industry. Many people now working in the vision industry initially worked in vision research in an academic institution, and a number of people with industrial experience have become academics. Opportunities for a career in Computer Vision in an industrial environment include working within software and hardware development, application-oriented companies, general purpose product sales, and research-based development. Thousands of companies around the world use and develop computer vision techniques.

Companies that engage in computer vision -

2d3 (Oxford) - Tracking; Mosaicing; Terrain generation; Change Detection; Camera Tracking; Metadata Extraction; Image Segmentation

Alpha Vision Design (London) - Automatic number plate recognition; People counting

Aralia Systems (West Sussex) - Surveillance; Tracking

Waterfall Solutions (Guildford) - Defence systems; Security and surveillance; Video analytics

Yotta DCL Ltd (Leamington Spa) - Pavement management; Assets management; Web-based inventory service

Image Metrics (Manchester) - Markerless Avatar Controlling via AAMs

Imorphics (Manchester) - Medical image analysis

Industrial Vision Systems Ltd (Kingston Bagpuize) - Industrial inspection

Ipsotek (London) - Video Analytics

Kinese (London) - Forensic video retrieval, search, and analysis

Microsoft Research (Cambridge) - Computer Vision; Machine Learning; Online services

AstraZeneca (Manchester) - Biomarkers

Aurora (Northampton) - Face Recognition; Biometrics

BBC (London, Salford) - Sports viewing enhancement; Person identification

Toshiba Research Europe (Cambridge) - 3D shape modeling; Gestural interfaces; Motion analysis

Vicon (Oxford) - Video analysis for motion tracking

Visio Ingenii (Luton) - Object recognition, tracking, robotics, detection & monitoring

Creative Dimension Software Ltd (Guildford) - 3D object shape modeling from photos

Disney Research (Edinburgh) - Motion Tracking & Image Analysis for Digital Acting, AR, VR

Five AI - Autonomous vehicles

Mobile Acuity (Edinburgh) - Mobile Visual Search Technology

NavTech Radar (London) - Threat detection and situation awareness via radar and CCTV

Framestore-CFC (London) - Visual Effects

FS Systems (Ringstead) - Smart cameras and sensors


In the end, every technology comes with its own concerns, and computer vision, where any visual is more than an image, is no exception. It is just the beginning of a new technological journey that has the potential to turn our science fiction into reality, whether it's for business, health, or pleasure. There's still a lot of work to be done, which is a good thing for anyone who is interested in this next big visual technology.