A Serverless Quantization-as-a-Service Model to Run Compression Jobs for Edge Intelligence
De Novi, Danny (first author): Software
Dell'Acqua, Pierluigi (second author): Methodology
Carnevale, Lorenzo: Conceptualization
Fazio, Maria (penultimate author): Validation
Villari, Massimo (last author): Supervision
2025-01-01
Abstract
The rise of edge computing demands efficient compression strategies for deploying Machine Learning (ML) models on resource-constrained devices. As Artificial Intelligence (AI) shifts from the cloud to the edge, optimizing models across heterogeneous layers becomes crucial. Quantization reduces numerical precision, improving model size, inference speed, and energy efficiency, all key for edge deployments. However, its complexity limits accessibility. To address this, we propose Quantization-as-a-Service (QaaS), a serverless framework that automates model quantization for both cloud and edge environments. Built on OpenFaaS and Kubernetes, QaaS enables on-demand execution with dynamic resource orchestration, implementing Layer 5 of Edge Intelligence (EI). Our evaluation compares quantization performance on edge devices, in terms of CPU usage and execution time, when performed as a service versus locally. Results demonstrate that deploying quantization workflows using Function-as-a-Service (FaaS) not only maintains computational efficiency but also reduces CPU consumption compared to standalone execution, showcasing the potential of serverless solutions in EI.
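The abstract describes quantization as reducing numerical precision to shrink model size. The record does not include the paper's implementation, but the core numerical idea can be sketched as symmetric per-tensor int8 quantization; the function names below (`quantize_int8`, `dequantize`) are illustrative, not taken from the QaaS framework.

```python
# Illustrative sketch: symmetric per-tensor int8 quantization.
# Float weights are mapped to integers in [-127, 127] via a single
# scale factor, trading precision for a ~4x smaller representation.

def quantize_int8(weights):
    """Map a list of floats to int8 values plus a shared scale."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0  # one scale for the whole tensor
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.2, 0.03, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each restored weight differs from the original by at most one quantization step (`scale`), which is the precision/size trade-off the abstract refers to.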
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


