research-article

BS-MCVR: Binary-sensing based Mobile-cloud Visual Recognition

Authors:
Hongyi Zheng (The Hong Kong Polytechnic University & DAMO Academy, Alibaba Group, Hong Kong, Hong Kong)
Wangmeng Zuo (Harbin Institute of Technology, Harbin, China)
Lei Zhang (The Hong Kong Polytechnic University & DAMO Academy, Alibaba Group, Hong Kong, Hong Kong)

MM '20: Proceedings of the 28th ACM International Conference on Multimedia, October 2020, Pages 1339–1347, https://doi.org/10.1145/3394171.3413500

Published: 12 October 2020

ABSTRACT

Mobile-cloud visual recognition (MCVR) systems, in which low-end mobile sensors persistently collect visual data and transmit it to the cloud for analysis and recognition, are important for visual monitoring applications such as wildfire detection and wildlife monitoring. However, current MCVR systems are mostly human-perception-oriented: they consume substantial computation and energy for data sensing and substantial bandwidth for data transmission, which limits their large-scale deployment. In this work, we present a machine-perception-oriented MCVR system, called BS-MCVR, in which the mobile end efficiently senses highly compact and discriminative features directly from the scene, and the sensed features are analyzed on the cloud for recognition. In particular, the mobile end is designed to operate with completely binary operations and to generate fixed-point feature maps. Experiments on benchmark datasets show that our system needs to transmit only 1/200 of the original image data without much loss of recognition accuracy, while incurring minimal computational cost in the data sensing process. BS-MCVR provides a highly cost-effective solution for deploying MCVR systems at large scale.
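
To make "completely binary operations" and the 1/200 transmission figure more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify the sensing layers, so the sketch uses the generic XNOR-and-popcount dot product common in binarized networks. The function names (`pack_bits`, `binary_dot`, `payload_budget_bits`), the binarization threshold of 128, and the 224×224 RGB image size are all assumptions made for illustration.

```python
def pack_bits(values, threshold=0):
    """Pack numbers into an int: bit i is 1 where values[i] > threshold.

    A set bit stands for +1 and a cleared bit for -1.
    """
    word = 0
    for i, v in enumerate(values):
        if v > threshold:
            word |= 1 << i
    return word


def binary_dot(x_bits, w_bits, n):
    """Dot product of two {-1, +1} vectors of length n stored as bit masks.

    XNOR marks the positions where the signs agree. With `agree`
    agreements out of n, the dot product is agree - (n - agree).
    The result is a small integer, so no floating point is needed.
    """
    mask = (1 << n) - 1
    agree = bin(~(x_bits ^ w_bits) & mask).count("1")
    return 2 * agree - n


def payload_budget_bits(height, width, channels, bit_depth, reduction):
    """Bits left to transmit if the raw image is reduced by `reduction`."""
    return height * width * channels * bit_depth / reduction


# Hypothetical 3x3 patch of 8-bit intensities, binarized at an assumed
# threshold of 128, and a hypothetical binary filter.
patch = [12, 200, 35, 180, 90, 220, 10, 140, 160]
weights = [-1, 1, -1, 1, -1, 1, -1, 1, -1]

response = binary_dot(pack_bits(patch, 128), pack_bits(weights, 0), len(patch))
print(response)  # 7

# Assumed 224x224 RGB image at 8 bits per channel, reduced 200-fold.
print(payload_budget_bits(224, 224, 3, 8, 200))
```

Under these assumptions, the 200-fold reduction leaves roughly 6,021 bits (about 753 bytes) per image, compared with 1,204,224 bits for the raw frame.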

Supplemental Material

Available for download: mmfp2014aux.zip (122.8 KB)

The .zip only contains a .pdf of supplemental material. (There is no auxiliary material.)

Index Terms

BS-MCVR: Binary-sensing based Mobile-cloud Visual Recognition
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object recognition
2. Information systems
  1. Information systems applications
    1. Mobile information processing systems

Published in
MM '20: Proceedings of the 28th ACM International Conference on Multimedia
October 2020
4889 pages
ISBN: 9781450379885
DOI: 10.1145/3394171
General Chairs: Chang Wen Chen (Chinese University of Hong Kong, Shenzhen, China), Rita Cucchiara (UNIMORE, Italy), Xian-Sheng Hua (Alibaba Group, China)
Program Chairs: Guo-Jun Qi (Futurewei Technologies, USA), Elisa Ricci (UNITN & Fondazione Bruno Kessler, Italy), Zhengyou Zhang (Tencent, China), Roger Zimmermann (National University of Singapore, Singapore)
Copyright © 2020 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
binary-sensing
mobile-cloud system
visual recognition
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%