This is a list of readings that I found useful and informative when I started working on mobile deep learning in 2017. The list is by no means comprehensive as this subfield is moving very fast! If you have any recommendations, please feel free to open an issue or send me a pull request.
- Bianco, S. et al. 2018. Benchmark Analysis of Representative Deep Neural Network Architectures. IEEE Access. 6, (2018). --- This is a comprehensive performance measurement study covering a wide range of image classification models.
- Guo, T. 2018. Cloud-Based or On-Device: An Empirical Study of Mobile Deep Inference. 2018 IEEE International Conference on Cloud Engineering (IC2E'18). --- This paper looks at how mobile applications can effectively use deep learning models, either by leveraging more powerful networked servers or by running directly on mobile resources.
- Hanhirova, J. et al. 2018. Latency and throughput characterization of convolutional neural networks for mobile computer vision. Proceedings of the 9th ACM Multimedia Systems Conference (MMSys'18).
- Xu, M. et al. 2019. A First Look at Deep Learning Apps on Smartphones. The World Wide Web Conference (WWW'19). --- This is one of the first large-scale empirical studies of its kind. It examines more than 16K Android apps to understand how smartphone apps exploit deep learning in the wild.
- Wu, C.-J. et al. 2019. Machine Learning at Facebook: Understanding Inference at the Edge. 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA'19).
- Zhang, C. et al. 2019. MArk: Exploiting Cloud Services for Cost-Effective, SLO-Aware Machine Learning Inference Serving. 2019 USENIX Annual Technical Conference (USENIX ATC'19). --- This paper offers a useful comparison of running inference services on different kinds of cloud services, including IaaS and FaaS.
- LeMay, M. et al. 2020. PERSEUS: Characterizing Performance and Cost of Multi-Tenant Serving for CNN Models. 2020 IEEE International Conference on Cloud Engineering (IC2E'20).
- Liang, Q. et al. 2020. AI on the Edge: Characterizing AI-based IoT Applications Using Specialized Edge Architectures. 2020 IEEE International Symposium on Workload Characterization (IISWC'20).