Efficient and Compact Representations of Deep Neural Networks via Entropy Coding

Article
Publication date:
2023
Citation:
Efficient and Compact Representations of Deep Neural Networks via Entropy Coding / G. Cataldo Marinò, F. Furia, D. Malchiodi, M. Frasca. - In: IEEE ACCESS. - ISSN 2169-3536. - 11:(2023 Oct 03), pp. 106103-106125. [10.1109/ACCESS.2023.3317293]
Abstract:
Matrix operations are nowadays central to many Machine Learning techniques, in particular Deep Neural Networks (DNNs), where the core of any inference is a sequence of dot-product operations. An increasingly important problem is how to engineer their storage and operations efficiently. In this article we propose two new lossless compression schemes for real-valued matrices that support efficient vector-matrix multiplication in the compressed format and are specifically suited to DNN compression. Building on several recent studies that use weight pruning and quantization techniques to reduce the complexity of DNN inference, our schemes are expressly designed to benefit from both, that is, from input matrices characterized by low entropy. In particular, our solutions take advantage of the depth of the model: the deeper the model, the higher the efficiency. Moreover, we derive space upper bounds for both variants in terms of the source entropy. Experiments show that our tools compare favourably, in terms of energy and space efficiency, against state-of-the-art matrix compression approaches, including Compressed Linear Algebra (CLA) and Compressed Shared Elements Row (CSER), the latter explicitly proposed in the context of DNN compression.
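The abstract's central idea, that pruned and quantized weight matrices have low entropy and therefore few distinct values, can be illustrated with a short sketch. The snippet below is not the paper's actual schemes; it is a minimal, self-contained Python/NumPy illustration, with hypothetical function names, of (i) the empirical entropy that bounds the bits per weight achievable by entropy coding, and (ii) a CSER-style matrix-vector product that sums the inputs sharing the same weight value before multiplying, so each distinct value costs one multiplication per row.

import numpy as np

def empirical_entropy(weights):
    # Zeroth-order empirical entropy of the weight values, in bits per entry.
    # Entropy coding a low-entropy (pruned + quantized) matrix approaches this rate.
    _, counts = np.unique(weights, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def shared_value_dot(W, x):
    # CSER-style matrix-vector product: within each row, inputs multiplied by
    # the same weight are summed first, so each distinct nonzero value costs a
    # single multiplication; pruned (zero) weights cost nothing at all.
    y = np.zeros(W.shape[0])
    for i, row in enumerate(W):
        acc = 0.0
        for v in np.unique(row):
            if v != 0.0:
                acc += v * x[row == v].sum()
        y[i] = acc
    return y

# Toy usage: heavy pruning plus 2-bit quantization yields a low-entropy matrix.
rng = np.random.default_rng(0)
W = rng.choice([0.0, 0.0, -0.5, 0.5], size=(4, 8))
x = rng.standard_normal(8)
print("bits per weight:", empirical_entropy(W.ravel()))  # well below 32
assert np.allclose(shared_value_dot(W, x), W @ x)

On such a heavily pruned, quantized matrix the empirical entropy is well below the 32 bits of an uncompressed float, which is exactly the low-entropy regime the proposed schemes target.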
IRIS type:
01 - Journal article
Keywords:
Neural network compression; space-conscious data structures; weight pruning; weight quantization; source coding; sparse matrices
Author list:
G. Cataldo Marinò, F. Furia, D. Malchiodi, M. Frasca
University authors:
FRASCA MARCO (author)
FURIA FLAVIO (author)
MALCHIODI DARIO (author)
Link to full record:
https://air.unimi.it/handle/2434/1012789
Link to full text:
https://air.unimi.it/retrieve/handle/2434/1012789/2314623/Efficient_and_Compact_Representations_of_Deep_Neural_Networks_via_Entropy_Coding.pdf
Project:
Multi-criteria optimized data structures: from compressed indexes to learned indexes, and beyond
Research Areas

Sectors

Sector INF/01 - Informatica (Computer Science)