Computer Vision in ROS

Video encoding and decoding

Video encoding is the process of compressing video content and, where needed, converting it to a different format; in some cases this includes converting an analog source to a digital one. The goal of compression is to reduce the space the video consumes. Encoding is typically a lossy process that discards some of the information in the video.

Decoding is essentially the reverse of encoding. A decoder takes an encoded, compressed stream or file and decompresses it into raw video frames. Raw frames are needed for editing and for displaying the video. With lossy encoding, the decoded frames approximate the original rather than reproduce it exactly.

Video Encoding and Decoding. Source
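
The encode/decode round trip described above can be illustrated with a toy sketch. This is not a real video codec such as H.264: it assumes a single synthetic 8-bit grayscale frame, uses coarse quantization as a stand-in for the lossy step, and uses Python's standard zlib module for the lossless step. The function names (make_frame, encode, decode) and the quantization step of 16 are hypothetical choices made only for this illustration.

```python
import zlib


def make_frame(width=64, height=48):
    """Build a synthetic 8-bit grayscale frame as a flat, row-major byte string."""
    return bytes(
        (x * 2 + y * 3 + (x * y) % 7) % 256
        for y in range(height)
        for x in range(width)
    )


def encode(raw, step=16):
    """Toy encoder: quantize each sample (lossy), then compress with zlib (lossless)."""
    # Rounding every sample down to a multiple of `step` throws information away.
    quantized = bytes((p // step) * step for p in raw)
    return zlib.compress(quantized, 9)


def decode(encoded):
    """Toy decoder: decompress back to raw samples (the quantization loss remains)."""
    return zlib.decompress(encoded)


if __name__ == "__main__":
    raw = make_frame()
    encoded = encode(raw)
    decoded = decode(encoded)
    worst = max(abs(a - b) for a, b in zip(raw, decoded))
    print(f"raw: {len(raw)} bytes, encoded: {len(encoded)} bytes")
    print(f"largest per-sample error after decoding: {worst}")
```

The decoded frame has the same size as the original but is not identical to it, which is the trade-off a lossy encoder makes: less space in exchange for some lost detail. Real codecs make this trade-off far more cleverly, for example by exploiting similarity between consecutive frames.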

ROS 2 image transport for FFmpeg encoding

This ROS 2 image transport supports encoding and decoding with the FFmpeg library. With this transport you can encode H.264 and H.265, using NVIDIA hardware acceleration when available.

GitHub

image_common (contains image_transport)

GitHub

cv_bridge

cv_bridge is a ROS package that provides an interface between ROS and OpenCV. It is used to convert between ROS images and OpenCV images.

GitHub
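
To make the conversion above more concrete, here is a minimal sketch of the data layout involved. A ROS image message carries its pixels as a flat, row-major byte buffer together with height, width, encoding and step (bytes per row), while OpenCV works with a height x width x channels array. In a real Python node the conversion is normally a single call such as CvBridge().imgmsg_to_cv2(msg, desired_encoding="bgr8"), with cv2_to_imgmsg for the reverse direction. The sketch below does not use cv_bridge: the ImageMsg class is a hypothetical stand-in that models only some fields of sensor_msgs/msg/Image, and pixel_at and to_rows are illustrative helpers, so that it runs without a ROS installation.

```python
from dataclasses import dataclass


@dataclass
class ImageMsg:
    """Hypothetical stand-in for sensor_msgs/msg/Image (subset of its fields)."""
    height: int    # number of rows
    width: int     # number of columns
    encoding: str  # pixel format, e.g. "bgr8"
    step: int      # length of one full row in bytes
    data: bytes    # flat, row-major pixel buffer


def pixel_at(msg, row, col):
    """Return the (b, g, r) values stored at (row, col) of a bgr8 image."""
    if msg.encoding != "bgr8":
        raise ValueError(f"this sketch only handles bgr8, got {msg.encoding}")
    # Rows are `step` bytes apart; each bgr8 pixel occupies 3 consecutive bytes.
    offset = row * msg.step + col * 3
    return tuple(msg.data[offset:offset + 3])


def to_rows(msg):
    """Rebuild the height x width x 3 layout that an OpenCV image uses."""
    return [
        [pixel_at(msg, r, c) for c in range(msg.width)]
        for r in range(msg.height)
    ]


if __name__ == "__main__":
    # A 2x2 test image whose bytes are simply 0..11.
    msg = ImageMsg(height=2, width=2, encoding="bgr8", step=6, data=bytes(range(12)))
    for row in to_rows(msg):
        print(row)
```

Using step rather than width * 3 to locate a row matters because a row may be padded, in which case step is larger than the number of pixel bytes it contains.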

OpenCV

OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library aimed mainly at real-time computer vision. It is written in C++ and its primary interface is C++, although it retains an older, less comprehensive C interface. Bindings are available for Python, Java and MATLAB/Octave.

OpenCV is used for a wide range of applications, including medical image analysis, stitching street view images, surveillance video, detecting and recognizing faces, tracking moving objects, extracting 3D models, and much more.

Official Website

Cameras

Useful Resources and ROS 2 packages

  • depth_anything_v2_ros2 - a ROS 2 wrapper for the depth_anything_v2 library (monocular depth estimation). It provides a ROS 2 node that subscribes to a camera topic and publishes the depth map of the scene. GitHub

  • v4l2_camera - a ROS 2 camera driver using Video4Linux2 (V4L2). GitHub