go back
go back
Volume 17, No. 3
SmartLite: A DBMS-based Serving System for DNN Inference in Resource-constrained Environments
Abstract
Many IoT applications require the use of multiple deep neural networks (DNNs) to perform various tasks on low-cost edge devices with limited computation resources. However, existing DNN model serving platforms, such as TensorFlow Serving and TorchServe, are resource-intensive and require high-performance GPUs that are often not available on low-cost edge devices. In this paper, we propose SmartLite, a lightweight DBMS that addresses these challenges by storing the parameters and structural information of neural networks as database tables and implementing neural network operators inside the DBMS engine. SmartLite quantizes model parameters as binarized values, applies neural pruning techniques to compress the models, and transforms tensor manipulations into value lookup operations of the DBMS to reduce computation overhead. Experimental results show that SmartLite requires 98% less memory while achieving about a 134% performance speedup compared to TorchServe. Our proposed solution addresses the challenges of running multiple DNN models on low-cost edge devices and provides a significant contribution to the field of IoT applications.
PVLDB is part of the VLDB Endowment Inc.
Privacy Policy