After experimenting with the FFmpeg C API for a while, I found that hardware video decoding with h264_cuvid is not faster than decoding on the CPU. Its only advantage is that it can free CPU resources for image processing.
Additionally, processing an image generally takes much more time than reading it, which means that a significant speedup can be obtained by parallelizing the processing part (see figure below).
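To make the idea concrete, here is a minimal sketch of one sequential reader feeding a pool of processing workers. It is only an illustration: `read_frame` and `process_frame` are hypothetical stand-ins that sleep instead of decoding or processing anything, and the timings are made-up values, not measurements.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def read_frame(index, read_time=0.001):
    """Hypothetical stand-in for decoding one frame; sleeps to mimic the read."""
    time.sleep(read_time)
    return index


def process_frame(frame, process_time=0.005):
    """Hypothetical stand-in for the expensive processing step."""
    time.sleep(process_time)
    return frame * 2


def run_pipeline(num_frames, num_workers):
    """Read frames one by one in the calling thread and process them in a pool."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # read_frame runs sequentially here; each submitted process_frame
        # call overlaps with the reads that follow it.
        futures = [
            pool.submit(process_frame, read_frame(i)) for i in range(num_frames)
        ]
        # Collect results in the order the frames were read.
        return [future.result() for future in futures]
```

Note that in real Python code, CPU-bound processing only runs in parallel across threads if it releases the GIL (as many native libraries do); otherwise a process pool would be the closer analogue.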
As long as the multithreaded processing part takes longer on each thread than reading the image, there is no advantage in trying to read images faster.
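One way to read this condition is through a simplified throughput model, which is my own assumption rather than something measured here: a single reader delivers one frame every `t_read` seconds, and `num_workers` threads each need `t_process` seconds per frame. The function names below are hypothetical.

```python
def pipeline_rate(t_read, t_process, num_workers):
    """Frames per second for one sequential reader feeding num_workers workers.

    Simplified model: the slower of the two stages sets the overall rate.
    """
    read_rate = 1.0 / t_read              # frames/s the reader can deliver
    process_rate = num_workers / t_process  # frames/s the workers can absorb
    return min(read_rate, process_rate)


def reader_is_bottleneck(t_read, t_process, num_workers):
    """True when the workers could consume frames faster than they are read."""
    return t_process / num_workers < t_read
```

Under this model, with an illustrative `t_read` of 2 ms, `t_process` of 40 ms and 4 workers, the processing stage limits the rate and a faster reader would change nothing. Reading only becomes worth optimizing once `t_process / num_workers` drops below `t_read`.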
As a side note, many image processing routines in MATLAB rely on Intel's MKL (BLAS/LAPACK) and are already multithreaded. There is less to gain from processing images in parallel if MATLAB is already using all the available cores.