HiDyVE: Hierarchical Dynamic Video Exploitation

Period of Performance: 01/01/2015 - 12/31/2015


Phase 1 SBIR

Recipient Firm

28 Corporate Drive
Clifton Park, NY 12065
Principal Investigator


ABSTRACT: Current full motion video (FMV) toolchains fall short of supporting semantically meaningful archive search for specific objects or object types. We propose the Hierarchical Dynamic Vision Exploitation system (HiDyVE), an end-to-end system for content-based object location and retrieval in operational FMV. HiDyVE combines convolutional neural networks (CNNs) specifically adapted to FMV's quality and data volume characteristics with state-of-the-art object proposal mechanisms for efficient processing. By exploiting CNN layering and sophisticated domain transfer techniques, we minimize the amount of domain-specific labeled data required for training, instead leveraging existing large labeled datasets (e.g., ImageNet) to train the lower layers of the CNN. High-level scene categorizations are also generated. We also store and index intermediate descriptors, allowing the system to dynamically adapt to query concepts not present during training. A sophisticated interactive query refinement (IQR) system further incorporates user feedback to refine the search space, enabling the analyst to more quickly converge on relevant results. Real-world issues of video quality and metadata burn-in are addressed by our proven FMV front-end, which automatically detects on-screen static elements. User interaction is facilitated by our open-source FMV GUI toolkit.

BENEFIT: This project will advance the state-of-the-art in computer vision, particularly content-based image recognition and scene understanding in video, but also more broadly by transitioning state-of-the-art techniques for processing high-resolution static images to lower-resolution video data. These improvements would positively impact many application areas, including aerial and ground-based video intelligence, surveillance, and reconnaissance (ISR); autonomous navigation; and social media understanding.
Contributions from this effort will drive significant interest in the computer vision research community in leveraging convolutional neural networks and harnessing contextual information for dramatically improved object recognition and matching in full motion video (FMV).

Further, there is significant commercial potential in addition to military and defense applications. FMV and other video data are growing at unprecedented rates, and companies are looking at unique ways to capitalize on commercial opportunities. Commercial industries such as automotive, semiconductor, consumer electronics, food & packaging, healthcare, and logistics are using vision tracking systems for applications including quality assurance & inspection, tracking, measurement, and identification. The proposed technology has the potential to significantly increase the performance of Kitware's existing technologies, such as video tracking and activity recognition, where scene understanding will help algorithms adapt well to new content.
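
ILLUSTRATION: The abstract describes storing and indexing intermediate descriptors and refining a search with analyst feedback (IQR), but does not say how the refinement is computed. The following is a minimal sketch of one common way such feedback could work, assuming a Rocchio-style relevance feedback update and cosine-similarity ranking. This is a swapped-in textbook technique, not the method of the proposed system. All names (`index`, `refine_query`, `rank`), the weights, and the toy 3-D vectors standing in for CNN descriptors are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mean(vectors, dim):
    """Component-wise mean; a zero vector when the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def refine_query(query, positives, negatives, alpha=1.0, beta=1.0, gamma=0.5):
    """Rocchio update: move the query toward accepted results, away from rejected ones.

    The weights alpha, beta, gamma are illustrative assumptions, not tuned values.
    """
    dim = len(query)
    pos, neg = mean(positives, dim), mean(negatives, dim)
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]

def rank(index, query):
    """Return clip ids sorted by descending cosine similarity to the query."""
    return sorted(index, key=lambda cid: cosine(index[cid], query), reverse=True)

# Hypothetical stored descriptors, one per indexed clip.
index = {
    "clip_a": [0.9, 0.1, 0.0],
    "clip_b": [0.1, 0.9, 0.0],
    "clip_c": [0.5, 0.5, 0.1],
}

query = [1.0, 0.0, 0.0]
first_pass = rank(index, query)

# The analyst marks clip_b as relevant and clip_a as not relevant.
refined = refine_query(query, [index["clip_b"]], [index["clip_a"]])
second_pass = rank(index, refined)

print(first_pass)
print(second_pass)
```

In this toy run, `clip_b` ranks last on the first pass and moves ahead of `clip_a` after one round of feedback. Because the update operates only on stored descriptors, no video needs to be reprocessed between rounds, which is the property the abstract attributes to indexing intermediate descriptors.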