ABSTRACT
Hockey rink registration is a useful tool for aiding and automating sports analysis. It estimates a homography matrix that warps broadcast video frames onto an overhead template of the rink, or vice versa; combined with player tracking, this yields the locations of players on the rink. However, most existing techniques require accurate ground-truth annotations, which can take many hours to produce, and work only on the rink types seen during training. In this paper, we propose a generalized rink registration pipeline that, once trained, can be applied to both seen and unseen rink types with only an overhead rink template and the video frame as inputs. To achieve this and to overcome the lack of non-NHL training data, our pipeline uses domain adaptation techniques, semi-supervised learning, and synthetic data during training. The proposed method is evaluated on both NHL (source) and non-NHL (target) rink data, and the results demonstrate that our approach generalizes to non-NHL rinks while maintaining competitive performance on NHL rinks.
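As an illustration of the homography step mentioned in the abstract, the following minimal sketch shows how a 3x3 homography can map image points (for example, tracked player positions) to rink template coordinates and back. It is not the paper's pipeline: the function name `project_points`, the matrix `H`, and the sample coordinates are all hypothetical values chosen for the example.

```python
import numpy as np

def project_points(H, points):
    """Map N x 2 points through a 3 x 3 homography H."""
    pts = np.asarray(points, dtype=float)
    # Append a 1 to each point to obtain homogeneous coordinates.
    homog = np.hstack([pts, np.ones((pts.shape[0], 1))])
    mapped = homog @ H.T
    # Divide by the last coordinate to return to 2D.
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical frame-to-template homography (assumption, not from the paper):
# scale by 0.1 and translate by (5, 10).
H = np.array([[0.1, 0.0, 5.0],
              [0.0, 0.1, 10.0],
              [0.0, 0.0, 1.0]])

# Hypothetical player positions in frame pixel coordinates.
player_px = [[640.0, 360.0], [100.0, 700.0]]

# Frame -> template.
rink_xy = project_points(H, player_px)

# Template -> frame ("or vice versa") uses the inverse homography.
back_px = project_points(np.linalg.inv(H), rink_xy)
```

A real registration homography generally has a non-trivial last row, which is why the division by the third coordinate is kept even though it is a no-op for this simple matrix.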
Index Terms
- Rink-Agnostic Hockey Rink Registration