): The model runs a full forward pass through the feature network ($N_{feat}$) to get feature maps. A lightweight FlowNet ($N_{flow}$) calculates the displacement field ($M_{i \to k}$) between the current frame and the last keyframe.
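The keyframe/non-keyframe split described above can be sketched as a simple scheduling loop. This is a minimal illustration, not the official implementation: `propagate_features`, `n_feat`, `n_flow`, `warp`, and `key_interval` are hypothetical names, the three networks are passed in as plain callables, and a fixed keyframe interval is assumed. The `warp` step (moving the cached keyframe features along the displacement field) follows the Deep Feature Flow idea of reusing keyframe features instead of recomputing them.

```python
from typing import Callable, Iterable, List, TypeVar

Frame = TypeVar("Frame")
Feat = TypeVar("Feat")
Flow = TypeVar("Flow")


def propagate_features(
    frames: Iterable[Frame],
    n_feat: Callable[[Frame], Feat],         # expensive feature network (assumed callable)
    n_flow: Callable[[Frame, Frame], Flow],  # lightweight flow network: (keyframe, current) -> field
    warp: Callable[[Feat, Flow], Feat],      # warps keyframe features with the displacement field
    key_interval: int = 10,                  # hypothetical fixed keyframe spacing
) -> List[Feat]:
    """Return one feature map per frame, running n_feat only on keyframes."""
    if key_interval < 1:
        raise ValueError("key_interval must be >= 1")

    features: List[Feat] = []
    key_frame = None
    key_feat = None
    for i, frame in enumerate(frames):
        if i % key_interval == 0:
            # Keyframe: full forward pass, then cache frame and features.
            key_frame = frame
            key_feat = n_feat(frame)
            features.append(key_feat)
        else:
            # Non-keyframe: estimate the displacement field relative to the
            # last keyframe and warp the cached features instead of
            # re-running the feature network.
            flow = n_flow(key_frame, frame)
            features.append(warp(key_feat, flow))
    return features
```

With real models, `n_feat` and `n_flow` would wrap the networks' forward passes and `warp` would be a bilinear sampling step. A smaller `key_interval` trades speed for accuracy, since the flow estimate tends to degrade as frames drift further from the keyframe.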
Does this video belong to a specific dataset like ImageNet VID, or are you looking to implement this on a custom real-time stream?
For further customization of the network architecture or training on specific datasets, refer to the official GitHub documentation.
To extract and visualize deep features for your specific MP4 file, run the inference script pointing to your video: