It is my absolute pleasure to say that video is now a first-class citizen in PyTorch!
No more re-encoding, no more convoluted FFmpeg scripts, and no more custom dataset hacking. It’s all here, and it will only get better. A huge thanks to Francisco for getting this out so quickly, and to everyone who helped with debugging.
Next up, we are planning to release a 3D channel-separated CUDA kernel, along with the state-of-the-art ip- and ir-CSN models pre-trained on Sports1M, Kinetics, and (hopefully) IG data. By ICCV, we hope to support most models from the Video Model Zoo, so that video classification has truly first-class support.