Let’s look at the transportation industry-based case of extensive image processing.
Two video cameras were looking at the boxes moving fast on the conveyor belt. To provide high enough image resolution the cameras were placed close to the belt but they could not cover all the belt cross-section. They were placed on the sides of the belt, and could see parts of the boxes. Customer wanted good images of the texture on the top of the boxes, so the images from the two cameras needed to be stitched.
Two cameras see the same object at different angles and distances. Before merging the images from the different cameras the images must be transformed from the coordinate systems of the cameras to one common coordinate system, and placed in one common plane in XYZ space. Our developed software performed transformation automatically, based on the known geometry of the camera positions relative to the conveyor belt.
Still, after such transformation, multi-megapixel grayscale images from the left and the right cameras are shifted in common plane relative to each other: