tLLM
Building Energy-Optimal Systems for On-Device LLM Inference
This project aims to build an energy-optimal inference system for LLMs on mobile devices. We are developing a novel DVFS (dynamic voltage and frequency scaling) governor that controls core and memory frequencies to minimize the energy consumed by LLM inference while still meeting the latency requirements of the applications that use the LLM.
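To make the problem statement concrete, here is a minimal sketch of the underlying constrained choice: among the available (core, memory) frequency pairs, pick the one with the lowest predicted energy whose predicted latency still meets the deadline. This is a generic illustration and not the project's governor. The function `pick_frequencies`, the toy latency and power models, and every constant in them are hypothetical and chosen only so the example runs.

```python
from itertools import product


def pick_frequencies(core_freqs_mhz, mem_freqs_mhz, latency_ms, energy_mj, deadline_ms):
    """Return the (core, mem) pair with the lowest predicted energy that meets the deadline.

    latency_ms and energy_mj are hypothetical predictors: (core_mhz, mem_mhz) -> float.
    If no pair meets the deadline, fall back to the highest frequencies.
    """
    feasible = [
        (energy_mj(c, m), c, m)
        for c, m in product(core_freqs_mhz, mem_freqs_mhz)
        if latency_ms(c, m) <= deadline_ms
    ]
    if not feasible:
        return max(core_freqs_mhz), max(mem_freqs_mhz)
    _, core, mem = min(feasible)  # tuples sort by energy first
    return core, mem


# Toy models (assumptions, not measurements): latency has a compute-bound part
# that scales with core frequency and a memory-bound part that scales with
# memory frequency; power has a static term, a cubic core term, and a linear
# memory term.
def toy_latency_ms(core_mhz, mem_mhz):
    return 60000.0 / core_mhz + 40000.0 / mem_mhz


def toy_power_mw(core_mhz, mem_mhz):
    return 500.0 + 1e-7 * core_mhz ** 3 + 0.2 * mem_mhz


def toy_energy_mj(core_mhz, mem_mhz):
    # mW * ms = microjoules; divide by 1000 to get millijoules.
    return toy_power_mw(core_mhz, mem_mhz) * toy_latency_ms(core_mhz, mem_mhz) / 1000.0


CORE_FREQS_MHZ = [1000, 2000, 3000]
MEM_FREQS_MHZ = [800, 1600, 3200]

relaxed = pick_frequencies(CORE_FREQS_MHZ, MEM_FREQS_MHZ, toy_latency_ms, toy_energy_mj, 100.0)
tight = pick_frequencies(CORE_FREQS_MHZ, MEM_FREQS_MHZ, toy_latency_ms, toy_energy_mj, 60.0)
```

With these toy numbers, a 100 ms deadline selects (1000, 1600) MHz and a 60 ms deadline selects (2000, 3200) MHz. Note that the lowest frequencies are not automatically the most energy-efficient, because static power accumulates over a longer run. A real governor would also have to account for measured power and latency, transition costs, and workload phases, none of which this sketch models.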
The details of the governor and system will be unveiled after our paper submission to ACM MobiSys 2026. Stay tuned!