I have a project in which I need to analyse infants' movements from RGB and/or depth videos (both are taken by the same camera, so as soon as I can track the infant in one video, I can map the pixels to the other one). The thing is that I can't find any dataset of images of babies that young (around 6 months old) to train a neural network or a tree-based classifier to recognize the infant*, so I thought of selecting the desired object (the baby) in several frames of the video itself and then using that information to track him or her through the entire video.
I have no idea how to do that, though: I have thought of using motion analysis algorithms, or feature extraction... but I am quite lost, to be honest.
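To make it concrete, by "motion analysis" I mean something as basic as frame differencing. Below is only a rough sketch of that idea, not something I have working on my data: it assumes the frames are already loaded as grayscale NumPy arrays, and the function names and the threshold value are made up for illustration.

```python
import numpy as np

def motion_mask(prev_frame, curr_frame, threshold=25):
    """Mark pixels whose intensity changed by more than `threshold` between two frames."""
    # Cast to a signed type first so the subtraction cannot wrap around in uint8.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

def bounding_box(mask):
    """Return (row_min, row_max, col_min, col_max) of the True pixels, or None if there are none."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    return int(rows.min()), int(rows.max()), int(cols.min()), int(cols.max())

# Tiny synthetic example (hypothetical data): a bright patch appears in the second frame.
prev_frame = np.zeros((6, 8), dtype=np.uint8)
curr_frame = prev_frame.copy()
curr_frame[2:4, 3:6] = 200

mask = motion_mask(prev_frame, curr_frame)
box = bounding_box(mask)
print(box)
```

My worry with this is that it only reacts to the parts of the body that move between two frames, so a baby lying still would vanish from the mask.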
Also, the child lies in a hospital cradle, which means that he or she doesn't leave that fixed area (good thing) but also that I can't use background removal techniques to segment the silhouette (I think! But I'm not so sure about that).
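The reason I'm not sure is that the depth video might make a simple form of background removal possible. Here is a rough sketch of what I'm picturing, under assumptions I haven't verified: that I could record one depth frame of the empty cradle, that depth values are in millimetres, and that 0 means "no reading". The function name and the 30 mm margin are hypothetical.

```python
import numpy as np

def depth_foreground(depth, empty_depth, min_height_mm=30):
    """Mark pixels that are at least `min_height_mm` closer to the camera than the empty cradle."""
    # Assumption: a value of 0 means the sensor returned no measurement for that pixel.
    valid = (depth > 0) & (empty_depth > 0)
    # Cast to a signed type so the subtraction cannot wrap around in uint16.
    closer_by = empty_depth.astype(np.int32) - depth.astype(np.int32)
    return valid & (closer_by > min_height_mm)

# Tiny synthetic example (hypothetical data): flat mattress 1000 mm from the camera,
# with a 2x3 "body" lying 100 mm above it and one pixel with no depth reading.
empty_cradle = np.full((4, 5), 1000, dtype=np.uint16)
depth_frame = empty_cradle.copy()
depth_frame[1:3, 1:4] = 900
depth_frame[0, 0] = 0

foreground = depth_foreground(depth_frame, empty_cradle)
print(foreground.astype(int))
```

I don't know whether blankets or the mattress deforming under the baby would ruin this in practice.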
*An infant's silhouette is very different from an adult's; moreover, the infants are lying down, whereas most datasets contain images of standing adults.
Thank you for any help you may provide!