
Class-discriminative focal loss for extreme imbalanced multiclass object detection towards autonomous driving

  • Original article
Published in The Visual Computer

Abstract

Modern object detection algorithms still suffer from imbalance problems, especially foreground–background and foreground–foreground class imbalance. Existing methods generally adopt re-sampling based on class frequency or re-weighting based on the predicted category probability; focal loss, for example, was proposed to rebalance the loss assigned to easy negative and hard positive examples in single-stage detectors. However, two critical issues remain unresolved. First, in practical applications such as autonomous driving, class imbalance becomes more extreme due to the enlarged detection field and the characteristics of the target distribution, demanding a more effective way to balance foreground and background classes. Second, existing methods typically employ a sigmoid or softmax cross-entropy loss for the classification task, which we argue cannot achieve foreground–foreground class balance. In this paper, we propose a new form of focal loss by redesigning the re-weighting scheme so that it computes the weight from the prediction probability while widening the weight difference between examples. We further extend focal loss to the multi-class classification task by reformulating the standard softmax cross-entropy loss to better exploit the discriminative differences among foreground categories, thereby yielding a class-discriminative focal loss. Comprehensive experiments are conducted on the KITTI and BDD datasets. The results show that our approach easily surpasses focal loss with no additional training or inference cost. Moreover, when trained with the proposed loss function, current state-of-the-art object detectors in both one-stage and two-stage paradigms achieve significant performance gains.
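The two ingredients the abstract builds on can be sketched in code: the standard sigmoid focal loss of Lin et al. (the baseline the paper re-designs), and a softmax focal variant for the multi-class case. Note that the second function is only the common baseline form of a softmax focal loss; the paper's actual class-discriminative re-weighting scheme, which widens the weight gap between examples, is not specified in the abstract and is not reproduced here.

```python
import numpy as np

def sigmoid_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Standard focal loss (Lin et al., 2017) for binary fg/bg classification.
    Easy examples (p_t near 1) are down-weighted by (1 - p_t)**gamma."""
    p = 1.0 / (1.0 + np.exp(-logits))
    p_t = np.where(targets == 1, p, 1.0 - p)          # prob of the true class
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

def softmax_focal_loss(logits, labels, gamma=2.0):
    """Baseline multi-class extension (an illustrative assumption, not the
    paper's reformulation): apply the focal modulation to the softmax
    cross-entropy of the ground-truth class."""
    z = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    p_y = p[np.arange(len(labels)), labels]           # prob of the true class
    return -(1.0 - p_y) ** gamma * np.log(p_y)
```

With `gamma=0` both functions reduce to (alpha-weighted) cross-entropy; increasing `gamma` shifts the loss mass from easy, well-classified examples onto hard ones, which is the mechanism the paper's re-weighting scheme refines.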




Author information


Corresponding author

Correspondence to Huabiao Qin.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Chen, G., Qin, H. Class-discriminative focal loss for extreme imbalanced multiclass object detection towards autonomous driving. Vis Comput 38, 1051–1063 (2022). https://doi.org/10.1007/s00371-021-02067-9

