⚠️ This repository is not actively maintained. Check out our survey paper on efficient LLMs and the corresponding paper list.

Edge-AI-Paper-List

Target venues: system conferences (OSDI/SOSP/ATC/EuroSys/ASPLOS), network conferences (NSDI/SIGCOMM) and mobile conferences (MobiCom/MobiSys/SenSys/UbiComp).

We will keep maintaining this list :)

Note: Edge here refers to resource-constrained devices, not edge servers; AI here mostly refers to deep learning.

Attention: we are maintaining a dedicated paper list for resource-efficient LLM algorithms/systems.

Smartphones

2023

[ASPLOS'23] TelaMalloc: Efficient On-Chip Memory Allocation for Production Machine Learning Accelerators

2022

[MobiSys'22] FabToys: Plush Toys with Large Arrays of Fabric-based Pressure Sensors to Enable Fine-grained Interaction Detection
[MobiSys'22] Floo: Automatic, Lightweight Memoization for Faster Mobile Apps
[MobiCom'22] A-Mash: Providing Single-App Illusion for Multi-App Use through User-centric UI Mashup
[MobiCom'22] Tutti: Coupling 5G RAN and Edge Computing for Latency-critical Video Analytics

2021

[MobiCom'21] AsyMo: scalable and efficient deep-learning inference on asymmetric mobile CPUs
[MobiCom'21] Elf: accelerate high-resolution mobile deep vision with content-aware parallel offloading
[MobiCom'21] UltraSE: single-channel speech enhancement using ultrasound
[MobiCom'21] Experience: a five-year retrospective of MobileInsight
[MobiCom'21] LegoDNN: block-grained scaling of deep neural networks for mobile vision
[MobiSys'21] Tap: an app framework for dynamically composable mobile systems
[MobiSys'21] zTT: learning-based DVFS with zero thermal throttling for mobile devices
[ATC'21] Octo: INT8 Training with Loss-aware Compensation and Backward Quantization for Tiny On-device Learning

2020

[MobiCom'20] Deep Learning Based Wireless Localization for Indoor Navigation
[MobiCom'20] SPINN: Synergistic Progressive Inference of Neural Networks over Device and Cloud
[MobiCom'20] Heimdall: Mobile GPU Coordination Platform for Augmented Reality Applications
[MobiCom'20] NEMO: Enabling Neural-enhanced Video Streaming on Commodity Mobile Devices
[MobiCom'20] OnRL: Improving Mobile Video Telephony via Online Reinforcement Learning
[ASPLOS'20] PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning
[MobiSys'20] Deep Compressive Offloading: Speeding up Neural Network Inference by Trading Edge Computation for Network Latency
[MobiSys'20] Fast and scalable In-memory Deep Multitask Learning via Neural Weight Virtualization
[MobiSys'20] MDLdroidLite: A Release-and-inhibit Control Approach to Resource-efficient Deep Neural Networks on Mobile Devices
[MobiSys'20] RF-net: A Unified Meta-learning Framework for RF-enabled One-shot Human Activity Recognition
[SenSys'20] MobiPose: real-time multi-person pose estimation on mobile devices

2019 and before

[MobiCom'19] RNN-Based Room Scale Hand Motion Tracking
[MobiCom'19] MobiSR: Efficient On-Device Super-Resolution through Heterogeneous Mobile Processors
[EuroSys'19] µLayer: Low Latency On-Device Inference Using Cooperative Single-Layer Acceleration and Processor-Friendly Quantization
[SenSys'19] DeepAPP: A Deep Reinforcement Learning Framework for Mobile Application Usage Prediction
[MobiCom'18] DeepCache: Principled Cache for Mobile Deep Vision
[MobiCom'18] NestDNN: Resource-Aware Multi-Tenant On-Device Deep Learning for Continuous Mobile Vision
[MobiCom'18] FoggyCache: Cross-Device Approximate Computation Reuse
[MobiSys'18] On-Demand Deep Model Compression for Mobile Devices: A Usage-Driven Model Selection Framework
[MobiSys'18] FastDeepIoT: Towards Understanding and Optimizing Neural Network Execution Time on Mobile and Embedded Devices
[MobiSys'17] Accelerating Mobile Audio Sensing Algorithms through On-Chip GPU Offloading
[MobiSys'17] MobileDeepPill: A Small-Footprint Mobile Deep Learning System for Recognizing Unconstrained Pill Images
[MobiSys'17] DeepEye: Resource Efficient Local Execution of Multiple Deep Vision Models using Wearable Commodity Hardware
[MobiSys'17] DeepMon: Building Mobile GPU Deep Learning Models for Continuous Vision Applications
[ASPLOS'17] Neurosurgeon: Collaborative Intelligence Between the Cloud and Mobile Edge
[UbiComp'16] SpotGarbage: Smartphone App to Detect Garbage Using Deep Learning
[UbiComp'15] DeepEar: robust smartphone audio sensing in unconstrained acoustic environments using deep learning

AR/VR

[MobiCom'23] AccuMO: Accuracy-Centric Multitask Offloading in Edge-Assisted Mobile Augmented Reality
[MobiCom'22] SalientVR: saliency-driven mobile 360-degree video streaming with gaze information
[MobiCom'21] Face-Mic: inferring live speech and speaker identity via subtle facial dynamics captured by AR/VR motion sensors
[MobiSys'21] Xihe: a 3D vision-based lighting estimation framework for mobile augmented reality
[MobiSys'21] LensCap: split-process framework for fine-grained visual privacy control for augmented reality apps
[ASPLOS'20] Coterie: Exploiting Frame Similarity to Enable High-Quality Multiplayer VR on Commodity Mobile Devices
[MobiCom'19] Edge Assisted Real-time Object Detection for Mobile Augmented Reality
[EuroSys'19] Transparent AR Processing Acceleration at the Edge
[ASPLOS'21] Q-VR: System-Level Design for Future Collaborative Virtual Reality Rendering
[ATC'20] Firefly: Untethered Multi-user VR for Commodity Mobile Devices

IoTs

2023

[MobiCom'23] Re-thinking computation offload for efficient inference on IoT devices with duty-cycled radios
[NSDI'23] Gemel: Model Merging for Memory-Efficient, Real-Time Video Analytics at the Edge
[ASPLOS'23] Space-Efficient TREC for Enabling Deep Learning on Microcontrollers
[ASPLOS'23] STI: Turbocharge NLP Inference at the Edge via Elastic Pipelining
[ASPLOS'23] HuffDuff: Stealing Pruned DNNs from Sparse Accelerators
[HPCA'23] GROW: A Row-Stationary Sparse-Dense GEMM Accelerator for Memory-Efficient Graph Convolutional Neural Networks
[HPCA'23] Mix-GEMM: An efficient HW-SW Architecture for Mixed-Precision Quantized Deep Neural Networks Inference on Edge Devices
[HPCA'23] FlowGNN: A Dataflow Architecture for Real-Time Workload-Agnostic Graph Neural Network Inference
[ISCA'23] Inter-layer Scheduling Space Definition and Exploration for Tiled Accelerators

2022

[MobiSys'22] TEO: Ephemeral Ownership for IoT Devices to Provide Granular Data Control
[MobiSys'22] TinyNet: a Lightweight, Modular, and Unified Network Architecture for the Internet of Things
[MobiSys'22] Bringing WebAssembly to Resource-constrained IoT Devices for Seamless Device-Cloud Integration
[MobiCom'22] RetroIoT: Retrofitting Internet of Things Deployments by Hiding Data in Battery Readings
[MobiSys'22] DeepMix: Mobility-aware, Lightweight, and Hybrid 3D Object Detection for Headsets
[ATC'22] CoVA: Exploiting Compressed-Domain Analysis to Accelerate Video Analytics
[EuroSys'22] LiteReconfig: Cost and Content Aware Reconfiguration of Video Object Detection Systems for Mobile GPUs
[SenSys'22] AutoMatch: Leveraging Traffic Camera to Improve Perception and Localization of Autonomous Vehicles
[NeurIPS'22] On-Device Training Under 256KB Memory

2021

[ATC'21] Video Analytics with Zero-streaming Cameras
[ATC'21] Fine-tuning giant neural networks on commodity hardware with automatic pipeline model parallelism
[ATC'21] Palleon: A Runtime System for Efficient Video Processing toward Dynamic Class Skew
[ASPLOS'21] Rhythmic Pixel Regions: Visual sensing architecture for flexible spatiotemporal resolution towards high-precision visual computing at low power
[NSDI'21] AIRCODE: Hidden Screen-Camera Communication on an Invisible and Inaudible Dual Channel
[NSDI'21] MAVL: Multiresolution Analysis of Voice Localization

2020

[MobiSys'20] Approximate Query Service on Autonomous IoT Cameras
[MobiSys'20] EMO: Real-time Emotion Recognition From Single-eye Images for Resource-constrained Eyewear Devices
[MobiCom'20] CLIO: Enabling Automatic Compilation of Deep Learning Pipelines Across IoT and Cloud
[MobiCom'20] EagleEye: Wearable Camera-based Person Identification in Crowded Urban Spaces
[SIGCOMM'20] Reducto: On-Camera Filtering for Resource-Efficient Real-Time Video Analytics
[EuroSys'20] Balancing efficiency and fairness in heterogeneous GPU clusters for deep learning
[OSDI'20] A Unified Architecture for Accelerating Distributed DNN Training in Heterogeneous GPU/CPU Clusters
[OSDI'20] PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications
[OSDI'20] Serving DNNs like Clockwork: Performance Predictability from the Bottom Up

2019 and before

[MobiCom'19] Source Compression with Bounded DNN Perception Loss for IoT Edge Computer Vision
[SenSys'19] Neuro.ZERO: A Zero-energy Neural Network Accelerator for Embedded Sensing and Inference Systems
[UbiComp'19] Performance Characterization of Deep Learning Models for Breathing-based Authentication on Resource-Constrained Devices
[ASPLOS'18] SC-DCNN: Highly-Scalable Deep Convolutional Neural Network using Stochastic Computing
[SenSys'17] DeepIoT: Compressing Deep Neural Network Structures for Sensing Systems with a Compressor-Critic Framework
[MobiSys'17] Glimpse: A Programmable Early-Discard Camera Architecture for Continuous Mobile Vision
[UbiComp'17] Low-resource Multi-task Audio Sensing for Mobile and Embedded Devices via Shared Deep Neural Network Representations
[MobiSys'16] MCDNN: An Approximation-Based Execution Framework for Deep Stream Processing Under Resource Constraints
[SenSys'16] Sparsification and Separation of Deep Learning Layers for Constrained Resource Inference on Wearables

Energy-harvested devices

[MobiCom'23] LUT-NN: Empower Efficient Neural Network Inference with Centroid Learning and Table Lookup
[MobiCom'23] AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments
[MobiSys'20] Approximate Query Service on Autonomous IoT Cameras
[SenSys'20] Ember: Energy Management of Batteryless Event Detection Sensors with Deep Reinforcement Learning
[ASPLOS'19] Intelligence Beyond the Edge: Inference on Intermittent Embedded Systems
[ASPLOS'21] Quantifying the Design-Space Tradeoffs in Autonomous Drones
[ASPLOS'21] Rhythmic Pixel Regions: Visual sensing architecture for flexible spatiotemporal resolution towards high-precision visual computing at low power
[ATC'22] PilotFish: Harvesting Free Cycles of Cloud Gaming with Deep Learning Training

Privacy & Security

2023

[MobiCom'23] Efficient Federated Learning for Modern NLP
[MobiCom'23] Federated Few-shot Learning for Mobile NLP
[MobiCom'23] Enc2: Privacy-Preserving Inference for Tiny IoTs via Encoding and Encryption
[MobiCom'23] AutoFed: Heterogeneity-Aware Federated Multimodal Learning for Robust Autonomous Driving

2022

[MobiCom'22] Audio-domain Position-independent Backdoor Attack via Unnoticeable Triggers
[MobiCom'22] Sifter: Protecting Security-Critical Kernel Modules in Android through Attack Surface Reduction
[ASPLOS'22] Eavesdropping User Credentials via GPU Side Channels on Smartphones
[NSDI'22] Privid: Practical, Privacy-Preserving Video Analytics Queries
[EuroSys'22] Minimum Viable Device Drivers for ARM TrustZone
[ATC'22] PRIDWEN: Universally Hardening SGX Programs via Load-Time Synthesis
[ATC'22] HyperEnclave: An Open and Cross-platform Trusted Execution Environment
[OSDI'22] BlackBox: A Container Security Monitor for Protecting Containers on Untrusted Operating Systems
[OSDI'22] Blockaid: Data Access Policy Enforcement for Web Applications

2021

[MobiCom'21] PECAM: privacy-enhanced video streaming and analytics via securely-reversible transformation
[MobiSys'21] SafetyNOT: on the usage of the SafetyNet attestation API in Android
[MobiSys'21] Rushmore: securely displaying static and animated images using TrustZone
[OSDI'21] Privacy Budget Scheduling
[OSDI'21] Addra: Metadata-private voice communication over fully untrusted infrastructure
[OSDI'21] MAGE: Nearly Zero-Cost Virtual Memory for Secure Computation (Awarded Best Paper!)
[OSDI'21] Zeph: Cryptographic Enforcement of End-to-End Data Privacy

2020 and before

[MobiCom'20] FaceRevelio: A Face Liveness Detection System for Smartphones with A Single Front Camera
[ASPLOS'20] DNNGuard: An Elastic Heterogeneous DNN Accelerator Architecture against Adversarial Attacks
[UbiComp'20] Countering Acoustic Adversarial Attacks in Microphone-equipped Smart Home Devices
[UbiComp'19] DeepType: On-Device Deep Learning for Input Personalization Service with Minimal Privacy Concern
[UbiComp'19] Keyboard Snooping from Mobile Phone Arrays with Mixed Convolutional and Recurrent Neural Networks
[MobiCom'19] Occlumency: Privacy-preserving Remote Deep-learning Inference Using SGX
[EuroSys'19] Forward and Backward Private Searchable Encryption with SGX
[SOSP'19] Privacy Accounting and Quality Control in the Sage Differentially Private ML Platform
[SOSP'19] Honeycrisp: Large-scale Differentially Private Aggregation Without a Trusted Core
[SOSP'19] Yodel: Strong Metadata Security for Voice Calls

Learning

Strikethrough indicates that these papers may have nothing to do with mobile

2023

[ICLR'23] MocoSFL: enabling cross-client collaborative self-supervised learning
[EuroSys'23] REFL: Resource-Efficient Federated Learning
[NSDI'23] FLASH: Towards a High-performance Hardware Acceleration Architecture for Cross-silo Federated Learning
[NSDI'23] RECL: Responsive Resource-Efficient Continuous Learning for Video Analytics

2022

[MICRO'22] GCD2: A Globally Optimizing Compiler for Mapping DNNs to Mobile DSPs
[MobiSys'22] mGEMM: Low-latency Convolution with Minimal Memory Overhead Optimized for Mobile Devices
[MobiSys'22] Band: Coordinated Multi-DNN Inference on Heterogeneous Mobile Processors
[MobiSys'22] CoDL: Efficient CPU-GPU Co-execution for Deep Learning Inference on Mobile Devices
[MobiSys'22] FedBalancer: Data and Pace Control for Efficient Federated Learning on Heterogeneous Clients
[MobiSys'22] Memory-efficient DNN Training on Mobile Devices
[MobiSys'22] Melon: Breaking the Memory Wall for Resource-Efficient On-Device Machine Learning
[MobiCom'22] Real-time Neural Network Inference on Extremely Weak Devices: Agile Offloading with Explainable AI
[MobiCom'22] Romou: Rapidly Generate High-Performance Tensor Kernels for Mobile GPUs
[MobiCom'22] InFi: end-to-end learnable input filter for resource-efficient mobile-centric inference
[MobiCom'22] PyramidFL: A Fine-grained Client Selection Framework for Efficient Federated Learning
[MobiCom'22] Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading
[MobiCom'22] NeuLens: Spatial-based Dynamic Acceleration of Convolutional Neural Networks on Edge
[MobiCom'22] RF-URL: Unsupervised Representation Learning for RF Sensing
[MobiCom'22] Cosmo: Contrastive Fusion Learning with Small Data for Multimodal Human Activity Recognition
[SenSys'22] BlastNet: Exploiting Duo-Blocks for Cross-Processor Real-Time DNN Inference
[SenSys'22] PriMask: Cascadable and Collusion-Resilient Data Masking for Mobile Cloud Inference
[UbiComp'22] Context-Aware Compilation of DNN Training Pipelines across Edge and Cloud
[OSDI'22] Walle: An End-to-End, General-Purpose, and Large-Scale Production System for Device-Cloud Collaborative Machine Learning
[ATC'22] Campo: Cost-Aware Performance Optimization for Mixed-Precision Neural Network Training
[ATC'22] SOTER: Guarding Black-box Inference for General Neural Networks at the Edge
[EuroSys'22] Varuna: Scalable, Low-cost Training of Massive Deep Learning Models (Best Paper Award)

2021

[MobiCom'21] Hermes: an efficient federated learning framework for heterogeneous mobile clients
[MobiSys'21] PPFL: privacy-preserving federated learning with trusted execution environments
[MobiSys'21] ClusterFL: a similarity-aware federated learning system for human activity recognition
[MobiSys'21] nn-Meter: towards accurate latency prediction of deep-learning model inference on diverse edge devices
[SenSys'21] FedDL: Federated Learning via Dynamic Layer Sharing for Human Activity Recognition
[SenSys'21] Mercury: Efficient On-Device Distributed DNN Training via Stochastic Importance Sampling
[SenSys'21] FedMask: Joint Computation and Communication-Efficient Personalized Federated Learning via Heterogeneous Masking
[NSDI'21] Mistify: Automating DNN Model Porting for On-Device Inference at the Edge
[OSDI'21] Oort: Efficient Federated Learning via Guided Participant Selection
[ATC'21] Jump-Starting Multivariate Time Series Anomaly Detection for Online Service Systems
[ATC'21] Habitat: A Runtime-Based Computational Performance Predictor for Deep Neural Network Training

2020 and before

[MobiCom'20] Billion-scale Federated Learning on Mobile Clients: a submodel design with tunable privacy
[OSDI'20] A Tensor Compiler for Unified Machine Learning Prediction Serving
[SenSys'19] MetaSense: Few-shot Adaptation to Untrained Conditions in Deep Mobile Sensing
[UbiComp'18] DeepType: On-Device Deep Learning for Input Personalization Service with Minimal Privacy Concern

Another awesome paper list about Federated Learning