Abstract
Contemporary pose estimation methods enable precise measurements of behavior via supervised deep learning with hand-labeled video frames. Although effective in many cases, the supervised approach requires extensive labeling and often produces outputs that are unreliable for downstream analyses. Here, we introduce “Lightning Pose,” an efficient pose estimation package with three algorithmic contributions. First, in addition to training on a few labeled video frames, we use many unlabeled videos and penalize the network whenever its predictions violate motion continuity, multiple-view geometry, or posture plausibility (semi-supervised learning). Second, we introduce a network architecture that resolves occlusions by predicting pose on any given frame using surrounding unlabeled frames. Third, we refine the pose predictions post-hoc by combining ensembling and Kalman smoothing. Together, these components render pose trajectories more accurate and scientifically usable. We release a cloud application that allows users to label data, train networks, and generate predictions for new videos directly from the browser.
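To make the motion-continuity idea concrete, the following is a minimal illustrative sketch and not the package's actual implementation. It assumes predictions on unlabeled video are stored as a list of frames, each a list of (x, y) keypoint coordinates, and it charges a penalty only when a keypoint jumps more than a tolerance between consecutive frames. The function name `temporal_penalty` and the parameter `epsilon` are hypothetical names chosen for this example.

```python
import math


def temporal_penalty(preds, epsilon=0.0):
    """Mean hinge penalty on frame-to-frame keypoint jumps.

    preds[t][k] is the (x, y) prediction for keypoint k on frame t.
    epsilon is a hypothetical tolerance (in pixels): jumps smaller
    than epsilon are treated as plausible motion and cost nothing.
    """
    total = 0.0
    count = 0
    # Walk over consecutive frame pairs (t, t + 1).
    for prev, curr in zip(preds, preds[1:]):
        # Compare the same keypoint across the two frames.
        for (x0, y0), (x1, y1) in zip(prev, curr):
            jump = math.hypot(x1 - x0, y1 - y0)
            total += max(jump - epsilon, 0.0)
            count += 1
    # Fewer than two frames means there is nothing to penalize.
    return total / count if count else 0.0


# One keypoint over four frames, with an implausible 10-pixel jump.
track = [[(0.0, 0.0)], [(1.0, 0.0)], [(11.0, 0.0)], [(12.0, 0.0)]]
penalty = temporal_penalty(track, epsilon=2.0)
```

Because no labels are involved, a term of this kind can be evaluated on any unlabeled video and added to the supervised loss during training; the multiple-view and posture penalties described above would be further terms of the same kind.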
Competing Interest Statement
Robert S. Lee assisted in the initial development of the cloud application as a solution architect at Lightning AI in Spring-Summer 2022. He left the company in August 2022 and continues to hold shares. The remaining authors declare no competing interests.
Footnotes
We include a new dataset with two freely moving mice; new analyses now show performance as a function of keypoint difficulty; all data are released on FigShare; training and inference time benchmarks are added for the various models (Supp. Info); results on the "mirror-fish" dataset are improved after correcting labeling errors; new "ablation" experiments show which of our unsupervised losses contribute to the improvements we report; new analyses compare EKS to other common post-processors.