ABSTRACT
Resource-constrained Edge Devices (EDs), e.g., IoT sensors and microcontroller units, are expected to make intelligent decisions using Deep Learning (DL) inference at the edge of the network. Toward this end, developing tinyML models, i.e., DL models with reduced computation and memory requirements that can be embedded on these devices, is an active area of research. However, tinyML models typically achieve lower inference accuracy. On a different front, DNN partitioning and inference offloading techniques have been studied for distributing DL inference between EDs and Edge Servers (ESs). In this paper, we explore Hierarchical Inference (HI), a novel approach proposed in [19] for performing distributed DL inference at the edge. Under HI, for each data sample, an ED first uses a local algorithm (e.g., a tinyML model) for inference. The ED offloads the data sample only if the local inference is incorrect or, depending on the application, further assistance is required from large DL models on the edge or in the cloud. At the outset, HI seems infeasible because the ED, in general, cannot know whether the local inference is sufficient. Nevertheless, we demonstrate the feasibility of implementing HI for image classification applications, quantify its benefits, and show that HI provides a better trade-off between offloading cost, throughput, and inference accuracy than alternative approaches.
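To make the HI decision rule concrete, below is a minimal Python sketch of one simple realization for image classification: the ED accepts the tinyML model's prediction when its softmax confidence clears a threshold, and offloads the sample otherwise. This is an illustrative assumption, not the paper's implementation; the function names (`tinyml_infer`, `offload_to_edge_server`) and the threshold value 0.8 are hypothetical.

```python
import numpy as np

def tinyml_infer(image: np.ndarray, num_classes: int = 10) -> np.ndarray:
    """Hypothetical stand-in for the embedded tinyML classifier.

    On a real ED this would be, e.g., a quantized CNN executed with a
    framework such as TensorFlow Lite Micro; random logits are used here
    only to keep the sketch self-contained and runnable.
    """
    logits = np.random.randn(num_classes)
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()

def offload_to_edge_server(image: np.ndarray) -> int:
    """Hypothetical RPC: ship the sample to the ES and return its label."""
    return -1  # placeholder for the large DL model's predicted class

def hierarchical_inference(image: np.ndarray, threshold: float = 0.8):
    """HI decision rule (confidence-threshold variant, assumed here):
    keep the local prediction when the tinyML model is confident enough,
    otherwise pay the offloading cost and defer to the large DL model."""
    probs = tinyml_infer(image)
    if probs.max() >= threshold:
        return int(probs.argmax()), "local"
    return offload_to_edge_server(image), "offloaded"

if __name__ == "__main__":
    label, source = hierarchical_inference(np.zeros((32, 32, 3)))
    print(f"predicted class {label} via {source} inference")
```

The design point this sketch illustrates is that only samples the local model is unsure about incur transmission cost, in contrast to always-offload or DNN-partitioning schemes that pay a communication cost for every sample.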
REFERENCES
- Ghina Al-Atat, Andrea Fresa, Adarsh P. Behera, Vishnu N. Moothedath, James Gross, and Jaya P. Champati. 2023. The Case for Hierarchical Deep Learning Inference at the Network Edge. arXiv:2304.11763 [cs.DC]
- Ying Cui, Bixia Tang, Gangao Wu, Lun Li, Xin Zhang, Zhenglin Du, and Wenming Zhao. 2023. Classification of dog breeds using convolutional neural network models and support vector machine. bioRxiv (2023).
- Lei Deng, Guoqi Li, Song Han, Luping Shi, and Yuan Xie. 2020. Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey. Proc. IEEE 108, 4 (2020), 485--532.
- Chongwu Dong, Sheng Hu, Xi Chen, and Wushao Wen. 2021. Joint optimization with DNN partitioning and resource allocation in mobile edge computing. IEEE Transactions on Network and Service Management 18, 4 (2021), 3973--3986.
- Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In Proc. ICLR.
- Colby Banbury et al. 2021. MLPerf Tiny Benchmark. In Proc. Neural Information Processing Systems Datasets and Benchmarks Track (Round 1).
- Igor Fedorov, Ryan P. Adams, Matthew Mattina, and Paul N. Whatmough. 2019. SpArSe: Sparse architecture search for CNNs on resource-constrained microcontrollers. Advances in Neural Information Processing Systems 32 (2019).
- Andrea Fresa and Jaya P. Champati. 2022. An Offloading Algorithm for Maximizing Inference Accuracy on Edge Device in an Edge Intelligence System. In Proc. ACM MSWiM. 15--23.
- Song Han, Huizi Mao, and William J. Dally. 2016. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding. In Proc. ICLR.
- Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, and Song Han. 2018. AMC: AutoML for Model Compression and Acceleration on Mobile Devices. In Proc. ECCV. 815--832.
- Chuang Hu, Wei Bao, Dan Wang, and Fengming Liu. 2019. Dynamic Adaptive DNN Surgery for Inference Acceleration on the Edge. In Proc. IEEE INFOCOM. 1423--1431.
- Chenghao Hu and Baochun Li. 2022. Distributed Inference with Deep Learning Models across Heterogeneous Edge Devices. In Proc. IEEE INFOCOM. 330--339.
- Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, and Kurt Keutzer. 2016. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. arXiv:1602.07360
- Yiping Kang, Johann Hauswald, Cao Gao, Austin Rovinski, Trevor Mudge, Jason Mars, and Lingjia Tang. 2017. Neurosurgeon: Collaborative Intelligence Between the Cloud and Mobile Edge. In Proc. ACM ASPLOS. 615--629.
- Yiping Kang, Johann Hauswald, Cao Gao, Austin Rovinski, Trevor Mudge, Jason Mars, and Lingjia Tang. 2017. Neurosurgeon: Collaborative Intelligence Between the Cloud and Mobile Edge. SIGARCH Comput. Archit. News 45, 1 (April 2017), 615--629.
- Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet Classification with Deep Convolutional Neural Networks. In Proc. NIPS. 1097--1105.
- En Li, Liekang Zeng, Zhi Zhou, and Xu Chen. 2020. Edge AI: On-Demand Accelerating Deep Neural Network Inference via Edge Computing. IEEE Transactions on Wireless Communications 19, 1 (2020), 447--457.
- Pavel Mach and Zdenek Becvar. 2017. Mobile Edge Computing: A Survey on Architecture and Computation Offloading. IEEE Communications Surveys & Tutorials 19, 3 (2017), 1628--1656.
- Vishnu N. Moothedath, Jaya P. Champati, and James Gross. 2023. Online Algorithms for Hierarchical Inference in Deep Learning applications at the Edge. arXiv:2304.00891
- Ivana Nikoloska and Nikola Zlatanov. 2021. Data Selection Scheme for Energy Efficient Supervised Learning at IoT Nodes. IEEE Communications Letters 25, 3 (2021), 859--863.
- Emil Njor, Jan Madsen, and Xenofon Fafoutis. 2022. A Primer for tinyML Predictive Maintenance: Input and Model Optimisation. In Proc. Artificial Intelligence Applications and Innovations. 67--78.
- Julius Ruseckas. n.d. EfficientNet on CIFAR10. https://juliusruseckas.github.io/ml/efficientnet-cifar10.html
- Ramon Sanchez-Iborra and Antonio F. Skarmeta. 2020. TinyML-Enabled Frugal Smart Objects: Challenges and Opportunities. IEEE Circuits and Systems Magazine 20, 3 (2020), 4--18.
- Mark Sandler, Andrew G. Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. 2018. MobileNetV2: Inverted Residuals and Linear Bottlenecks. In Proc. IEEE CVPR. 4510--4520.
- Mingxing Tan and Quoc V. Le. 2019. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In Proc. ICML, Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.), Vol. 97. PMLR, 6105--6114.
- Surat Teerapittayanon, Bradley McDanel, and H.T. Kung. 2016. BranchyNet: Fast inference via early exiting from deep neural networks. In Proc. ICPR. 2464--2469.
- Yundong Zhang, Naveen Suda, Liangzhen Lai, and Vikas Chandra. 2017. Hello Edge: Keyword Spotting on Microcontrollers. CoRR abs/1711.07128 (2017).