I want to be able to use larger input image resolutions. However, when I use an input size of 480*480, it takes almost 10 minutes to process a short 10-second clip.
It seems that when I increase the image size, the model's inference run time grows exponentially.
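To put a rough number on that scaling, here is a back-of-the-envelope sketch. It assumes the cost of the convolutional layers is proportional to the number of output positions (frames × height × width) and that kernel sizes, strides, and channel counts stay unchanged. The helper name `relative_conv_cost` is made up for illustration and is not part of the repository. Under those assumptions the expected growth is quadratic in the side length rather than exponential, so a much larger measured slowdown may point to something else, such as memory pressure.

```python
def relative_conv_cost(new_side, base_side, new_frames=1, base_frames=1):
    """Approximate ratio of convolution work between two input sizes.

    Assumption: work is proportional to the number of output positions,
    i.e. frames * height * width, with square frames and all other
    layer settings unchanged.
    """
    spatial = (new_side / base_side) ** 2   # height and width both scale
    temporal = new_frames / base_frames     # clip length in frames
    return spatial * temporal


# Going from 112*112 to 480*480 with the same number of frames.
ratio = relative_conv_cost(480, 112)
print(f"Estimated compute increase: {ratio:.1f}x")
```

This only estimates arithmetic work; actual run time also depends on the hardware, batch size, and available memory.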
Crucial motion information is lost when I downscale my images to 112*112, and this is affecting the precision of the model on my test sets.
Is there any alternative model or method that will allow me to proceed with larger image resolutions using the 3D-ResNet model?
Is it practical to use a 3D-CNN with 480*480 input images for video classification tasks?