3D Machine Vision

3D machine vision is inspired by the most natural and at the same time most complex of all machine vision components: the eye. People see their environment with two eyes that are set slightly apart. This spatial offset makes it possible to perceive depth in addition to horizontal and vertical information. Each eye perceives the world from a slightly different angle. Only when the brain processes the two images are they combined into a whole and the depth information added.

Knowledge of the third dimension also makes a significant difference for various industrial applications. In robotics, it allows production and manufacturing assistants to interpret their environment spatially and to move around in a room. In logistics, processes such as pick & place, package control, loading and unloading of trucks, and depalletizing can be automated using 3D machine vision.

Several types of 3D machine vision components are available, and the appropriate choice depends on the respective application.

What is the difference between a 3D camera and a 3D vision sensor?

When looking for a suitable component, one comes across both 3D cameras and 3D vision sensors. But what is the difference between these types of components?

Generally speaking, a vision sensor is a machine vision component that is optimized for a certain task. It takes images, evaluates them using machine vision algorithms, and also triggers the reaction to the result. A camera forms the basis of a vision sensor, but a device only qualifies as a vision sensor if the algorithms for evaluating the images are integrated. The advantage of a vision sensor is that both installation and operation are very simple. However, its adaptability is limited, as it can only be used for a certain task.

In contrast to a vision sensor, an industrial camera, or even a pure 3D camera, has no integrated processor and therefore cannot process the images itself. It must be connected to a computer that processes the images using machine vision software. This makes the application more complex, but also more flexible and adaptable than with a vision sensor.

Whether a camera or a vision sensor is more suitable depends on the respective application and also the vision expertise of the user.

With the rc_visard, there is no need to decide between a 3D vision sensor and a 3D camera. Thanks to an on-board processor, the rc_visard is a smart 3D camera that can also be used as a 3D stereo sensor with integrated software modules and algorithms. A connection to an external computer is generally not necessary, but it is still possible via the GigE Vision interface. Camera data can be processed flexibly, and independent 3D applications can be created.

This makes the rc_visard suitable for numerous applications and for all levels of vision expertise.

How does stereo vision work?

The rc_visard is based on stereo vision. But what does this term actually mean? 3D machine vision using stereo vision is a common technology that offers fast image recording and a large field of view. This method typically uses two area scan cameras. As with human eyes, the two cameras are offset from each other in order to record the scene from two perspectives. The 3D information is obtained by processing the images from the two perspectives. Finding the same scene points in both camera images is key to the accuracy of the data. This step is called the matching process.

Taking into consideration the relative positions of the cameras, the software compares corresponding points in the two images, identifies disparities, and generates a complete 3D point cloud.
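The step from disparities to 3D points can be illustrated with a short sketch. For a rectified stereo pair, the standard textbook model gives the depth of a point as Z = f · B / d, where f is the focal length in pixels, B is the distance between the two cameras (the baseline), and d is the disparity in pixels. The Python code below is a minimal illustration only: the function names and all numeric values (focal length, baseline, principal point, disparities) are assumptions made for this example and are not parameters of the rc_visard or of any specific camera.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth in metres of a point in a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        return None  # no valid match found, or the point is infinitely far away
    return focal_px * baseline_m / disparity_px


def pixel_to_point(u, v, disparity_px, focal_px, baseline_m, cx, cy):
    """Back-project left-image pixel (u, v) to a 3D point (X, Y, Z) in metres."""
    z = disparity_to_depth(disparity_px, focal_px, baseline_m)
    if z is None:
        return None
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)


# Hypothetical calibration values, chosen only to keep the arithmetic simple.
FOCAL_PX = 1000.0      # focal length in pixels
BASELINE_M = 0.16      # distance between the two cameras in metres
CX, CY = 640.0, 480.0  # principal point (image centre) in pixels

# Hypothetical matching result: pixel (u, v) -> disparity in pixels.
disparity_map = {(640, 480): 160.0, (740, 480): 80.0, (100, 100): 0.0}

point_cloud = []
for (u, v), d in disparity_map.items():
    point = pixel_to_point(u, v, d, FOCAL_PX, BASELINE_M, CX, CY)
    if point is not None:  # pixels without a valid match are skipped
        point_cloud.append(point)

print(point_cloud)
```

In this sketch a larger disparity means a closer point: halving the disparity from 160 to 80 pixels doubles the computed depth from 1 m to 2 m.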

Distinctive points, such as the corners of a cube, can be easily identified as related points. However, points that are on a smooth surface may be a challenge. This challenge can be tackled using a pattern projector.

This method is called Active Stereo Vision. The projected pattern creates a texture on the smooth surface, which makes the matching process easier.
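Why texture helps can be shown with a toy example. The sketch below matches a small window along a single image row using the sum of absolute differences (SAD), a simple textbook matching cost. It is used here purely for illustration and is not claimed to be the algorithm of the rc_visard; the function names, pixel values, and the pattern are all invented for the example.

```python
def sad(a, b):
    """Sum of absolute differences between two equally long pixel windows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_disparities(left_row, right_row, x, half_window, max_disparity):
    """Return every disparity that ties for the lowest SAD cost at left pixel x."""
    lo, hi = x - half_window, x + half_window + 1
    patch = left_row[lo:hi]
    costs = {}
    for d in range(max_disparity + 1):
        if lo - d < 0:
            break  # the candidate window would leave the image
        costs[d] = sad(patch, right_row[lo - d:hi - d])
    best = min(costs.values())
    return [d for d, cost in costs.items() if cost == best]


def shift_for_right_camera(left_row, disparity, fill=128):
    """Simulate the right view: every pixel moves left by `disparity` columns."""
    return left_row[disparity:] + [fill] * disparity


TRUE_DISPARITY = 2

# Passive case: a smooth, uniformly grey surface (hypothetical pixel values).
smooth_left = [128] * 12
smooth_right = shift_for_right_camera(smooth_left, TRUE_DISPARITY)

# Active case: a projected pattern adds texture to the same surface.
pattern = [0, 40, -30, 60, -50, 20, -10, 50, -40, 30, -20, 10]
active_left = [s + p for s, p in zip(smooth_left, pattern)]
active_right = shift_for_right_camera(active_left, TRUE_DISPARITY)

ambiguous = best_disparities(smooth_left, smooth_right, 6, 1, 4)
unique = best_disparities(active_left, active_right, 6, 1, 4)

print("smooth surface candidates:", ambiguous)
print("with projected pattern:   ", unique)
```

On the smooth row every candidate disparity fits equally well, so the match is ambiguous. With the pattern added, only the true disparity gives a perfect fit.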

[Figure: Passive Stereo Vision compared with Active Stereo Vision]


In industrial machine vision, there are other options for capturing a three-dimensional image apart from stereo vision. Some of the most well-known methods include Time of Flight (ToF), laser triangulation, and structured light.

The main advantage of stereo vision is that the 3D image is captured quickly, in a single shot. There is no need to scan the scene, as is typical with laser triangulation systems, so processing times are shorter. The coverage and the range are also generally much greater than with laser triangulation systems. For example, the large objects that are common in depalletization/palletization fit easily within the camera's field of view.

Stereo vision also delivers a point cloud of good quality, which at close range is generally superior to that of ToF technology. However, it does not achieve the level of detail of laser triangulation systems.
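How point cloud quality changes with distance can be estimated with a common first-order approximation for rectified stereo: a matching error of Δd pixels leads to a depth error of roughly Z² / (f · B) · Δd. The Python sketch below applies that approximation. The function name and all numbers are assumptions chosen for illustration and do not describe the rc_visard or any other product.

```python
def depth_error(depth_m, focal_px, baseline_m, disparity_error_px):
    """First-order depth uncertainty in metres: dZ ~= Z**2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


# Hypothetical values, not taken from any data sheet.
FOCAL_PX = 1000.0          # focal length in pixels
BASELINE_M = 0.16          # distance between the two cameras in metres
DISPARITY_ERROR_PX = 0.25  # assumed matching accuracy in pixels

for depth in (0.5, 1.0, 2.0, 4.0):
    err = depth_error(depth, FOCAL_PX, BASELINE_M, DISPARITY_ERROR_PX)
    print(f"Z = {depth:.1f} m -> approx. depth error {err * 1000:.1f} mm")
```

Under this model the error grows with the square of the distance, so doubling the distance roughly quadruples the depth error. This is one plausible reason why stereo systems tend to perform best at close range.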

In heavily structured scenes, stereo vision is also less susceptible to pseudo errors, and it offers the option of using color sensors.

These properties make stereo vision particularly suitable for use in logistics and robotics, e.g., for bin picking, depalletization/palletization, navigation of driverless vehicles, or robot guidance.