#Computer Science# Running large language models on a single GPU for throughput-oriented scenarios.
#Large Language Models# Run Mixtral-8x7B models in Colab or on consumer desktops
PyTorch native quantization and sparsity for training and inference
A QoE-Oriented Computation Offloading Algorithm based on Deep Reinforcement Learning (DRL) for Mobile Edge Computing (MEC) | This algorithm captures the dynamics of the MEC environment by integrating ...
LLM Inference on consumer devices
DPDK infrastructure for software acceleration. Currently working on RX and ACL pre-filter.
DPU-Powered File System Virtualization over virtio-fs
A Dynamic Programming Offloading Algorithm for Mobile Cloud Computing
LeapIO: Efficient and Portable Virtual NVMe Storage on ARM SoCs (ASPLOS'20)
A framework for IoT devices to offload tasks to the cloud, resulting in efficient computation and decreased cloud costs.
A lightweight framework that enables serverless users to reduce their bills by harvesting non-serverless compute resources such as their VMs, on-premise servers, or personal computers.
Monero hardware wallet protocol implementation for Trezor, agent
#Android# The container-based cloud platform for mobile code offloading
Code for paper "Real-time Neural Network Inference on Extremely Weak Devices: Agile Offloading with Explainable AI" (MobiCom'22)
Monero wallet Trezor integration documentation
Implementation of the RTSS'23 Best Student Paper Award paper "Progressive Neural Compression for Adaptive Image Offloading under Timing Constraints"