Adaptive Multi-Resolution Gaze Compression for Video Streaming
Real-time acquisition, transmission, and storage of high frame rate / high-resolution video data are often constrained by the bandwidth of the system being used. For example, when transmitting video over a wireless network, limited network bandwidth may force the system to reduce the video frame rate or the video resolution in order to transmit the video in real time. The result is a video with degraded visual quality.
The human eye has high visual acuity only in a very small region at the center of the field of vision, which corresponds to the fovea. Outside of the fovea, vision has much lower acuity and cannot discern visual details clearly. This invention makes use of this fact to reallocate the pixels in a video frame so that high visual resolution and detail are maintained at the focal region of the image, while the remaining pixels in the periphery are gradually degraded based on the available system bandwidth. This allows for a significant reduction in the number of pixels that need to be acquired, transmitted, or stored, and enables real-time high frame rate / high visual quality video on bandwidth-constrained systems.
The invention is a process/algorithm for generating compound video frames that can be reconstructed into multi-resolution video frames for efficient video acquisition/transmission/processing/storage. A region-of-interest is first defined on the video frame (this can be done manually by the user, or automatically using eye gaze tracking or image processing techniques). Multiple image resolutions are then extracted from the current video frame, centered around this region-of-interest. The pixels within the region-of-interest are extracted at a high resolution, while regions further away from the region-of-interest are extracted at progressively lower resolutions. At each successive level, the x and y resolution is halved while the field of view is doubled, so that every extracted image has the same size in pixels. A compound video frame is constructed from the extracted images by compositing them next to each other. The number of resolution levels extracted is defined by the user, or automatically by a control system based on the available bandwidth of the system.
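The extraction and compositing steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the invention's actual implementation: the function names (clamp_origin, extract_level, build_compound_frame), the grayscale list-of-rows frame, the square tile, the box-average downsampling, and the choice to shift a window so that it stays inside the frame are all hypothetical choices made for the example.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame stored as rows of pixel values


def clamp_origin(center: int, size: int, limit: int) -> int:
    """Top-left coordinate of a window of `size` pixels centered on `center`,
    shifted if needed so the window stays inside [0, limit)."""
    return max(0, min(center - size // 2, limit - size))


def extract_level(frame: Frame, cx: int, cy: int, tile: int, level: int) -> Frame:
    """Extract one resolution level as a tile x tile image.

    Level 0 is the full-resolution region-of-interest. Each further level
    doubles the field of view and halves the resolution, so every level
    yields the same number of pixels.
    """
    step = 2 ** level            # downsampling factor for this level
    span = tile * step           # field of view in source pixels
    height, width = len(frame), len(frame[0])
    if span > width or span > height:
        raise ValueError("field of view of this level exceeds the frame")
    x0 = clamp_origin(cx, span, width)
    y0 = clamp_origin(cy, span, height)
    out: Frame = []
    for ty in range(tile):
        row = []
        for tx in range(tile):
            # Box-average the step x step block that maps to this output pixel
            # (assumed filter; any downsampling method could be substituted).
            total = 0
            for dy in range(step):
                for dx in range(step):
                    total += frame[y0 + ty * step + dy][x0 + tx * step + dx]
            row.append(total // (step * step))
        out.append(row)
    return out


def build_compound_frame(frame: Frame, cx: int, cy: int,
                         tile: int, levels: int) -> Frame:
    """Composite the level images side by side, finest level first."""
    tiles = [extract_level(frame, cx, cy, tile, k) for k in range(levels)]
    return [[pixel for t in tiles for pixel in t[r]] for r in range(tile)]
```

For a 16 x 16 frame with tile = 4 and levels = 3, the compound frame is 4 rows by 12 columns, i.e. 48 pixels instead of 256, while the coarsest level still spans the whole frame.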
The compound video frame can be compressed using existing video codecs to remove spatial and temporal redundancies further. When the video is displayed to the user, the various regions within the compound video frame are extracted, resized back to their original size, and embedded inside of each other in their corresponding locations to generate a multi-resolution image. This process is repeated for each video frame.
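The reconstruction step can be sketched in Python as below. It is an illustration only, and it rests on assumptions that are not stated in the text: the compound frame is taken to hold square, same-size tiles laid side by side with the finest level first, each tile is enlarged by nearest-neighbour pixel repetition, the window for each level is shifted to stay inside the frame, and pixels not covered by any level are left at 0. The names clamp_origin and reconstruct_frame are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame stored as rows of pixel values


def clamp_origin(center: int, size: int, limit: int) -> int:
    """Top-left coordinate of a window of `size` pixels centered on `center`,
    shifted if needed so the window stays inside [0, limit)."""
    return max(0, min(center - size // 2, limit - size))


def reconstruct_frame(compound: Frame, cx: int, cy: int, tile: int,
                      levels: int, width: int, height: int) -> Frame:
    """Rebuild a multi-resolution frame from a compound frame.

    Tiles are pasted from the coarsest level to the finest, so each finer
    level overwrites the center of the coarser one it is embedded in.
    """
    canvas: Frame = [[0] * width for _ in range(height)]
    for level in reversed(range(levels)):
        step = 2 ** level        # enlargement factor for this level
        span = tile * step       # size of the tile once resized
        if span > width or span > height:
            raise ValueError("resized level does not fit in the output frame")
        x0 = clamp_origin(cx, span, width)
        y0 = clamp_origin(cy, span, height)
        for y in range(span):
            src_row = compound[y // step]
            dst_row = canvas[y0 + y]
            for x in range(span):
                # Nearest-neighbour enlargement: each source pixel is
                # repeated step x step times in the output.
                dst_row[x0 + x] = src_row[level * tile + x // step]
    return canvas
```

Pasting coarse-to-fine keeps the logic simple at the cost of writing some pixels more than once; a smoother result would need blending at the level boundaries, which this sketch does not attempt.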
Benefits
The main benefit of the invention is that it significantly reduces the number of video pixels that need to be acquired, transmitted, stored, or processed while maintaining the perceived visual quality of the video frame. It does this by replacing a standard video frame with a compound video frame composed of various regions of the image that are composited into a single image frame at multiple resolutions. Regions closer to the user's focal point are maintained at high resolution and quality, while the resolution and quality of regions further away from the focal point are adaptively degraded based on the available system bandwidth. The reduction in the number of pixels achieved by the compound video frame enables faster execution of all downstream processes. Image compression codecs can be applied to the compound video frame to reduce the bit rate of the video further and allow for real-time transmission on bandwidth-constrained systems. The compound video frame is reconstructed into a seamless multi-resolution video frame before it is displayed to the user. The resolution level of the compound video frame can also be adjusted in real time in response to changes in the available system bandwidth.
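The size of the pixel reduction follows directly from the halve-resolution / double-field-of-view rule: N levels of a w x h tile hold N*w*h pixels but span a region of (2^(N-1)*w) x (2^(N-1)*h) source pixels, a ratio of N / 4^(N-1). The Python sketch below works through that arithmetic and adds one possible way a control system might pick the number of levels. The selection policy is purely an assumption for illustration (fixed tile size, uncompressed pixel rate compared against a bit budget); the text does not specify how the control system decides, and all function and parameter names are hypothetical.

```python
def compound_pixels(tile_w: int, tile_h: int, levels: int) -> int:
    """Pixels in a compound frame: one tile_w x tile_h image per level."""
    return levels * tile_w * tile_h


def covered_pixels(tile_w: int, tile_h: int, levels: int) -> int:
    """Source pixels spanned by the coarsest level, whose field of view
    has doubled (levels - 1) times in each dimension."""
    scale = 2 ** (levels - 1)
    return (tile_w * scale) * (tile_h * scale)


def levels_for_budget(tile_w: int, tile_h: int, fps: int,
                      bits_per_pixel: int, budget_bps: int,
                      max_levels: int) -> int:
    """Assumed policy: the largest level count whose uncompressed bit rate
    fits the budget. Returns 0 if even a single level does not fit."""
    best = 0
    for n in range(1, max_levels + 1):
        rate = compound_pixels(tile_w, tile_h, n) * fps * bits_per_pixel
        if rate <= budget_bps:
            best = n
    return best
```

As a worked case, four levels of a 480 x 270 tile hold 518,400 pixels yet span a 3840 x 2160 region of 8,294,400 pixels, which is 1/16 of the original pixel count before any codec is applied.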
Applications
The invention can be used to reduce the number of pixels that need to be acquired, transmitted, processed, or stored. It can be used in a number of applications to reduce video data rate or to increase the video frame rate as described in the following examples.
Example 1: High frame rate video acquisition for slow-motion video
Example 2: Low bandwidth real-time video transmission
Example 3: High speed / low bandwidth video storage
Relevant Key Words
Video compression, Real-time video, Low latency, Multi-resolution, Low Bandwidth, Adaptive.