
Training-aware Low Precision Quantization in Spiking Neural Networks

Item Type: Conference Paper
Authors: Shymyrbay, Ayan; Fouda, Mohammed E.; Eltawil, Ahmed
Citation: Shymyrbay, A., Fouda, M. E., & Eltawil, A. (2022). Training-aware Low Precision Quantization in Spiking Neural Networks. 2022 56th Asilomar Conference on Signals, Systems, and Computers. https://doi.org/10.1109/ieeeconf56349.2022.10051957
Eprint version: Post-print
DOI: 10.1109/IEEECONF56349.2022.10051957
Publisher: IEEE
Rights: This is an accepted manuscript version of a paper before final publisher editing and formatting. Archived with thanks to IEEE.
Download date: 2023-11-27 04:20:46
Link to Item: http://hdl.handle.net/10754/690669


Training-aware Low Precision Quantization in Spiking Neural Networks

Ayan Shymyrbay1, Mohammed E. Fouda2, and Ahmed Eltawil1

1 Department of ECE, CEMSE Division, King Abdullah University of Science and Technology, Saudi Arabia

2 Center for Embedded & Cyber-physical Systems, University of California-Irvine, Irvine, CA, USA 92697-2625

Abstract—Spiking neural networks (SNNs) have become an attractive alternative to conventional artificial neural networks (ANNs) due to their temporal information processing capability, energy efficiency, and high biological plausibility. Yet, their computational and memory costs still restrict them from being widely deployed on portable devices. Quantization of SNNs, which converts the full-precision synaptic weights into low-bit versions, has emerged as one of the solutions. The development of quantization techniques is far more advanced in the ANN domain than in the SNN domain. In this work, we adapt one of the most promising ANN quantization methods, Learned Step Size Quantization (LSQ), to SNNs.

Furthermore, we extend this technique to binary quantization of SNNs. Our analysis shows that the proposed method yields a negligible drop in accuracy and a significant reduction in the required memory.

Index Terms—Spiking Neural Networks, Quantization, Binarization, Edge Computing

I. INTRODUCTION

Spiking neural networks (SNNs) have recently become a major research field of interest to both machine learning enthusiasts and neuromorphic hardware engineers due to their biological plausibility, temporal data processing, and energy efficiency [1], [2]. These properties make SNNs well suited to overcome the obstacles faced when deploying deep neural networks (DNNs) on edge devices for real-world artificial intelligence (AI) applications [3]. Additionally, recent advancements in event-based audio and vision sensors have opened doors for many applications, and SNNs are viewed as one of the effective tools to process such information [3].

Increasing the size of a neural network (NN) is usually correlated with improved performance, but it also entails longer training and inference times as well as higher memory usage. While choosing the right NN model can greatly impact resource utilization, there are additional steps one can take to alleviate the disadvantages associated with deploying large models on edge devices. Quantization of SNN models is one such NN compression technique, in which the number of bits used to represent synaptic weights is reduced [4]. Storing and operating on reduced-bit weights allows for significant memory savings and improved power efficiency.

Quantization methods can be broadly classified into two classes: post-training quantization (PTQ) and quantization-aware training (QAT). As their names suggest, PTQ quantizes model parameters after NN training, while QAT does so during training.

PTQ is suitable for applications where speed and simplicity of quantization are preferred over extreme bit-precision reduction, while QAT should be chosen for low-precision quantization with negligible accuracy degradation [5].

Learned Step Size Quantization (LSQ) is one of the ANN QAT techniques showing best-to-date performance [6]. Although it provides competitive results in ANN quantization, its performance on other NN types such as SNNs has not been studied. Given the benefits of combining SNNs and quantization, this paper aims to report on the performance of LSQ for low-precision SNN quantization.

The original LSQ method supports only unsigned binary quantization. The lack of signed binarization deteriorates model performance, since unsigned binarization effectively prunes weight connections in the network. We address this limitation of LSQ by proposing a modification to the quantization function.

The main contributions of this paper are:

1) We present a framework for quantizing SNN models using LSQ as the quantization method. This framework utilizes a leaky integrate-and-fire (LIF) neuron model and the surrogate gradient method for training.

2) We propose a modification of the LSQ method for signed binary quantization.

3) We evaluate the quantization method on popular benchmark datasets: CIFAR10-DVS, DVS128 Gesture, N-Caltech101, and N-MNIST. We find that the presented quantization method exceeds the state-of-the-art accuracy and memory savings on all listed datasets.

The paper is organized as follows: Section II describes the LSQ method and our modification to support signed binary quantization, Section III provides the experimental setup and discusses the obtained results, and Section IV summarizes the work and suggests possible directions for extending the analysis.

II. PROPOSED METHOD

In this work, the learned step size quantization (LSQ) method [6] is used for SNN quantization. LSQ quantizes the NN weights during training by using a non-differentiable quantization function. Since it is not feasible to calculate gradients of non-differentiable functions, LSQ uses a straight-through estimator to approximate the gradients so that ANNs can be trained by gradient descent. At inference, the quantized model uses low-precision integer-valued weights, which reduces memory utilization and computational overhead.

A key difference between ANN and SNN quantization is the notion of time. Inputs to an ANN are static, whereas an SNN operates on dynamic inputs that are a function of time. This difference defines how neurons are activated: for an ANN, activation is a function of spatial information only, while for an SNN it is a function of both spatial and temporal information.

Therefore, traditional ANN training methods do not apply directly to SNNs. The surrogate gradient method is one of the most common SNN training approaches; it replaces the spike function with a differentiable surrogate in the backward pass to calculate gradients [7]. Thus, training quantized SNN models with LSQ requires a two-step gradient approximation. In the forward pass, LSQ uses the following quantization function:

\bar{w} = \lfloor \mathrm{clip}(w/s, -Q_N, Q_P) \rceil \quad (1)

\hat{w} = \bar{w} \times s. \quad (2)

Here w is a weight to be quantized, s is the quantizer step size, and Q_P and Q_N are the numbers of positive and negative quantization levels, respectively. \bar{w} is the resulting quantized weight at an integer scale, and \hat{w} is the quantized weight at the same scale as w. The function \mathrm{clip}(w/s, -Q_N, Q_P) sets values of w/s below -Q_N to -Q_N and values above Q_P to Q_P. The round function \lfloor\cdot\rceil rounds its argument to the nearest integer.

For quantization with b bits, Q_N = 0 and Q_P = 2^b - 1 for unsigned weights, and Q_N = 2^{b-1} and Q_P = 2^{b-1} - 1 for signed weights. It must be noted that in the original work [6], binary quantization is only feasible with unsigned levels (i.e., Q_N = 0 and Q_P = 1). To support signed binary quantization, we propose replacing the round function in (1) by a sign function, as shown below:

\bar{w} = \mathrm{sign}(w) \quad (3)

This allows the use of signed binary quantization levels with Q_N = Q_P = 1.
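Under the same assumptions as the sketch above, the signed-binary variant of Eq. (3) only replaces the clip-and-round step with a sign function:

```python
def lsq_binarize_signed(w: torch.Tensor, s: torch.Tensor):
    """Signed binary quantization (Eq. 3): levels -1 and +1, scaled by the step size s."""
    w_bar = torch.sign(w)  # note: torch.sign maps exact zeros to 0; exact zeros are rare in practice
    w_hat = w_bar * s
    return w_bar, w_hat
```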

Transitions between quantization levels are controlled by the step size s, which is a parameter learned during training. Since the quantization function is not differentiable, LSQ uses the following straight-through estimator for gradient approximation to learn s based on the training loss:

\frac{\partial \hat{w}}{\partial s} =
\begin{cases}
-w/s + \lfloor w/s \rceil & \text{if } -Q_N < w/s < Q_P \\
-Q_N & \text{if } w/s \le -Q_N \\
Q_P & \text{if } w/s \ge Q_P
\end{cases} \quad (4)

As in [6], the step size gradient is multiplied by a gradient scale g = 1/\sqrt{N_W Q_P}, where N_W is the number of weights in a network layer. This scale controls the magnitude of the step size update depending on the precision: a smaller update is needed for higher precision and a larger one for lower precision to achieve better convergence.
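A short sketch of this gradient scale as we read it from [6]; the square-root form follows the original LSQ paper, and the function name is ours.

```python
def step_size_grad_scale(weight: "torch.Tensor", q_p: int) -> float:
    """Gradient scale g = 1 / sqrt(N_W * Q_P) for a layer's step size update."""
    n_w = weight.numel()            # number of weights in the layer
    return 1.0 / (n_w * q_p) ** 0.5
```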

For binarization, the straight-through estimator in (4) is modified as follows, since the round function is replaced by a sign function:

TABLE I
SNN NETWORK STRUCTURE FOR EACH DATASET.

Dataset | Network
CIFAR10-DVS | 5×(128Conv3-MP2) - 512Dense100 - AP10
DVS128 Gesture | 5×(128Conv3-MP2) - 512Dense110 - AP10
N-Caltech101 | 5×(128Conv3-MP2) - 512Dense1010 - AP10
N-MNIST | 5×(128Conv3-MP2) - 512Dense100 - AP10

\frac{\partial \hat{w}}{\partial s} =
\begin{cases}
-w/s + \mathrm{sign}(w) & \text{if } -1 < w/s < 1 \\
-1 & \text{if } w/s \le -1 \\
1 & \text{if } w/s \ge 1
\end{cases} \quad (5)

During training, the full-precision weights are stored and updated, while the quantized weights are used for the forward and backward passes. To update the weights, the following straight-through estimator is utilized:

\frac{\partial \hat{w}}{\partial w} =
\begin{cases}
1 & \text{if } -Q_N < w/s < Q_P \\
0 & \text{otherwise.}
\end{cases} \quad (6)
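Eqs. (1), (2), (4), and (6) can be combined into a single custom autograd function in PyTorch. The sketch below reflects our reading of the method (class and variable names are ours, not the authors' code), with the step-size gradient scale g passed in explicitly:

```python
import torch

class LSQQuantizer(torch.autograd.Function):
    """Straight-through LSQ weight quantizer: Eqs. (1)-(2) forward, Eqs. (4) and (6) backward."""

    @staticmethod
    def forward(ctx, w, s, q_n, q_p, g):
        ctx.save_for_backward(w, s)
        ctx.q_n, ctx.q_p, ctx.g = q_n, q_p, g
        w_bar = torch.clamp(w / s, -q_n, q_p).round()   # Eq. (1)
        return w_bar * s                                # Eq. (2)

    @staticmethod
    def backward(ctx, grad_out):
        w, s = ctx.saved_tensors
        q_n, q_p, g = ctx.q_n, ctx.q_p, ctx.g
        v = w / s
        inside = (v > -q_n) & (v < q_p)
        grad_w = grad_out * inside.to(grad_out.dtype)                 # Eq. (6)
        grad_s_elem = torch.where(inside, v.round() - v,              # Eq. (4), interior branch
                                  torch.clamp(v, -q_n, q_p))          # Eq. (4), clipped branches
        grad_s = (grad_out * grad_s_elem).sum() * g                   # scaled step-size gradient
        return grad_w, grad_s, None, None, None

# Usage sketch: w_hat = LSQQuantizer.apply(layer.weight, step_size, q_n, q_p, g),
# where step_size is a learnable per-layer nn.Parameter.
```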

III. EXPERIMENTAL DATA AND RESULTS

We built a platform for SNN quantization using PyTorch and the SpikingJelly [8] package, which provides GPU-accelerated spiking neuron models.

A. Datasets

The performance of quantized SNN models is evaluated on four datasets commonly used as benchmarks: CIFAR10-DVS, DVS128 Gesture, N-Caltech101, and N-MNIST. These datasets consist of data in the address event representation (AER) format. Similar to [2], the event-to-frame integration method is used to pre-process the datasets. In this method, the large stream of event data is split into slices, and each slice is integrated into a single frame. For N-MNIST and DVS128 Gesture, the event data is sliced into 10 frames, and for N-Caltech101 and CIFAR10-DVS into 16 frames.
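As an illustration, SpikingJelly provides frame-integrated versions of these datasets; the sketch below assumes its DVS128Gesture interface with event-count splitting (module paths and argument names differ between SpikingJelly releases, so treat this as indicative rather than exact):

```python
# Path and arguments as in older SpikingJelly releases; newer versions move the
# dataset classes (e.g., under spikingjelly.activation_based) but keep similar options.
from spikingjelly.datasets.dvs128_gesture import DVS128Gesture

# Integrate each AER event stream into 10 frames, slicing by event count.
train_set = DVS128Gesture(root='data/dvs128_gesture', train=True,
                          data_type='frame', frames_number=10, split_by='number')
```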

B. Network structure

The leaky integrate-and-fire (LIF) neuron model is used, as it is one of the most widely used mathematical models for mimicking the behavior of biological neurons [9], [10], [2]. We use the SNN model formulated in [2], which consists of a spiking encoder network and a classifier network. The spiking encoder network consists of a convolutional layer, spiking neurons, and a pooling layer. The classifier network consists of a fully connected dense layer and spiking neurons. Similar to [2], we use a voting layer, implemented as an average pooling layer with a window size of 10, after the output spiking layer to improve robustness. For all datasets, we use the same convolutional layer parameters: kernel size = 3, stride = 1, and padding = 1. The network structures used for each dataset are summarized in Table I.
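A sketch of one 128Conv3-MP2 encoder block with SpikingJelly LIF neurons and an arctangent surrogate gradient; the module path, surrogate choice, and any details beyond what Table I states are our assumptions:

```python
import torch.nn as nn
from spikingjelly.clock_driven import neuron, surrogate  # 'activation_based' in newer releases

def encoder_block(in_channels: int, out_channels: int = 128) -> nn.Sequential:
    """One spiking encoder block: Conv (k=3, s=1, p=1) -> LIF neurons -> 2x2 max pooling."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=1, padding=1),
        neuron.LIFNode(tau=2.0, surrogate_function=surrogate.ATan()),
        nn.MaxPool2d(2),
    )

# The full encoder stacks five such blocks; the classifier adds a dense layer with LIF
# neurons and a voting layer (average pooling with window 10) over the output spikes.
```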


Fig. 1. Top-1 accuracies for different bit precision SNN models on DVS128 Gesture, N-MNIST, CIFAR10-DVS, and N-Caltech101 datasets.

Fig. 2. Accuracy drop and compression ratio of quantized SNN models on DVS128 Gesture, N-MNIST, CIFAR10-DVS, and N-Caltech101 datasets.

We also evaluate the performance of LSQ against other available quantization methods by adopting the same SNN models used in the corresponding literature: [11] for DVS128 Gesture and [12] for N-MNIST. These works are chosen for comparison because they report the highest quantized-model accuracies. This minimizes the influence of the network structure and emphasizes the efficiency of the quantization method itself. The corresponding SNN model configurations can be found in Table II.

C. Training

We use 1, 2, 4, and 8-bit precision quantization for all layers of the SNN models. For all experiments, the Adam optimizer with a momentum of 0.9 and a learning rate of 0.001 is used. As a learning rate scheduler, we utilize cosine annealing with the maximum number of iterations set to 64. The membrane time constant τ for all spiking neurons is set to 2.

Networks are trained from scratch using mean squared error (MSE) loss.
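A condensed sketch of this training configuration in PyTorch; the model and data loader are placeholders, and the MSE target encoding (one-hot firing-rate targets) is our assumption:

```python
import torch
import torch.nn.functional as F

def train_one_epoch(model, train_loader, optimizer, num_classes: int):
    """From-scratch training step with MSE loss on the network's output firing rates."""
    for frames, labels in train_loader:          # frames: [T, B, C, H, W] event frames
        out_rate = model(frames)                 # assumed to return a per-class firing rate
        target = F.one_hot(labels, num_classes).float()
        loss = F.mse_loss(out_rate, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# Optimizer and scheduler as described above (Adam's first beta corresponds to the 0.9 momentum):
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))
# scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=64)
```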

TABLE II
SNN NETWORK STRUCTURES ADOPTED FROM [11] FOR DVS128 GESTURE AND [12] FOR N-MNIST.

Dataset | Network
DVS128 Gesture | 16Conv5-AP2-32Conv5-AP2-8800Dense11
N-MNIST | 64Conv7-MP2-128Conv7-128Conv7-MP2-Dense11

D. Results

This section presents the results obtained with the experimental setup described above. First, test accuracies are evaluated for different bit-precision SNN models and datasets. Second, the accuracy drop versus memory savings trade-off is presented for each dataset. Finally, the performance of the proposed quantization method is compared to state-of-the-art SNN models, both full-precision and quantized.

1) Accuracy degradation: Fig. 1 illustrates the test accuracy obtained for each combination of bit precision and dataset. The blue bar corresponds to the non-quantized, full-precision model and hence serves as the baseline against which the other accuracies are compared.

For the DVS128 Gesture dataset, the full-precision SNN model achieves 96.53%. The 8-bit quantized SNN model does not encounter any accuracy drop compared to the full-precision baseline. There is a slight drop in accuracy (0.01%) for the 2-bit quantized model, while for the 4-bit version the degradation is higher (0.12%). The binarized SNN model encounters the highest accuracy drop of 0.7%.

Similar results are obtained for the N-MNIST dataset, where the 8-bit quantized model has the same accuracy as the full-precision counterpart, 99.65%. Accuracy degradation increases as the precision decreases: 0.01% for the 4-bit model, 0.06% for the 2-bit model, and 0.15% for the binary model.

For CIFAR10-DVS, the full-precision SNN model achieves 68.4%. The closest performance is obtained by the 4-bit model, which has an accuracy degradation of only 0.05%. The 8-bit model encounters a 0.2% drop in accuracy, while the 2-bit and binary models share the same degradation of 0.3%.


TABLE III
PERFORMANCE COMPARISON BETWEEN THE PROPOSED METHOD AND THE STATE-OF-THE-ART METHODS.

Dataset | Reference | Training Method | Quantization Method | Weights | Model Size (MB) | Accuracy (%)
CIFAR10-DVS | [13] | STDP | - | 32-bit | 392 | 60.50
CIFAR10-DVS | [14] | Tandem learning | - | 32-bit | 416 | 58.65
CIFAR10-DVS | [15] | STDP | - | 32-bit | 53 | 60.30
CIFAR10-DVS | This work | Spike-based BP | modified LSQ | 32-bit | 6.46 | 68.4
CIFAR10-DVS | This work | Spike-based BP | modified LSQ | 1-bit | 0.21 | 68.1
DVS128 Gesture | [16] | DECOLLE | Integer Quantization | 32-bit | - | 88.89
DVS128 Gesture | [16] | DECOLLE | Integer Quantization | 2-bit | - | 76.04
DVS128 Gesture | [11] | Spike-based BP | Threshold annealing | 32-bit | - | 92.87
DVS128 Gesture | [11] | Spike-based BP | Threshold annealing | 1-bit | - | 91.32
DVS128 Gesture | [17] | CNN-to-SNN + EEDN | Deterministic rounding | 2-bit | 38 | 91.77
DVS128 Gesture | This work | SGM | modified LSQ | 32-bit | 6.48 | 96.53
DVS128 Gesture | This work | SGM | modified LSQ | 1-bit | 0.21 | 95.83
N-Caltech101 | [18] | BPTT | - | 32-bit | 1.7 | 71.20
N-Caltech101 | [19] | SALT | - | 32-bit | - | 55.00
N-Caltech101 | This work | Spike-based BP | modified LSQ | 32-bit | 13 | 80.48
N-Caltech101 | This work | Spike-based BP | modified LSQ | 1-bit | 0.41 | 76.81
N-MNIST | [12] | DECOLLE | Stochastic rounding | 32-bit | 4.84 | 98.10
N-MNIST | [12] | DECOLLE | Stochastic rounding | 4-bit | 0.61 | 81.10
N-MNIST | [13] | STDP | - | 32-bit | 392 | 99.53
N-MNIST | [14] | Tandem learning | - | 32-bit | 22 | 99.31
N-MNIST | [15] | STDP | - | 32-bit | 53 | 99.42
N-MNIST | This work | SGM | modified LSQ | 32-bit | 4.84 | 99.65
N-MNIST | This work | SGM | modified LSQ | 1-bit | 0.15 | 99.5

Performance degradation of the quantized SNN models is most apparent on the N-Caltech101 dataset, where the baseline test accuracy is 80.48%. The 8-bit, 4-bit, 2-bit, and 1-bit quantized models encounter accuracy drops of 0.63%, 0.97%, 1.15%, and 6.94%, respectively.

2) Accuracy drop - memory savings trade-off: Fig. 2 shows the accuracy drop incurred to obtain a given reduction in model memory size. It can be seen that substantial memory can be saved by using a low-bit precision version of the same model. Around 31 times memory savings can be achieved at the cost of less than a 0.3% accuracy drop on the N-MNIST and CIFAR10-DVS datasets, while the same memory compression is achieved with a 0.7% accuracy degradation on the DVS128 Gesture dataset. For N-Caltech101, the model size can be reduced by about 32 times with a performance drop of about 3.67%.
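The reported compression ratios follow directly from the model sizes listed in Table III; a quick check (sizes in MB taken from the table):

```python
# Ratio of full-precision (32-bit) to binarized (1-bit) model size, per dataset (Table III).
size_fp32 = {"CIFAR10-DVS": 6.46, "DVS128 Gesture": 6.48, "N-Caltech101": 13.0, "N-MNIST": 4.84}
size_1bit = {"CIFAR10-DVS": 0.21, "DVS128 Gesture": 0.21, "N-Caltech101": 0.41, "N-MNIST": 0.15}
for name in size_fp32:
    print(name, round(size_fp32[name] / size_1bit[name], 1))  # ~30.8, 30.9, 31.7, 32.3
```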

3) Comparison with other works: Table III shows how the studied method compares with other full-precision and quantized SNN implementations. It summarizes the learning rules and quantization methodologies used in recent research, and highlights the bit precision, memory size, and accuracy.

To the best of our knowledge, there is no prior study of quantized SNN models on the CIFAR10-DVS and N-Caltech101 datasets. Hence, the results obtained by our quantized SNN models are compared with the best-to-date full-precision models in the literature. It can be seen that both the 32-bit and 1-bit models in this work significantly outperform other works in terms of accuracy on the CIFAR10-DVS and N-Caltech101 datasets.

The binarized model, while requiring almost 31 times less memory, still shows better performance than the full-precision models with much larger model sizes reported in other literature.

On the DVS128 Gesture dataset, there are three quantization techniques available in the literature. Two of them, [16] and [17], reduce the bit precision down to 2 bits, while the third, [11], binarizes the model. Using the SNN architecture in Table I, the full-precision model in this work achieves higher accuracy than the models reported in the literature. Consequently, the binarized model outperforms the quantization methods listed in Table III in both accuracy and memory savings. When using the SNN model structure from [11] (shown in Table II) to compare the performance of the quantization functions, the test accuracies for the full-precision and 1-bit quantized SNN models are slightly lower than those reported in [11], by 0.86% and 0.35%, respectively. However, the accuracy drop between the binarized model and its full-precision counterpart is 1.5 times smaller in our work than in [11].

With the SNN structure shown in Table I, the full-precision SNN model outperforms other available full-precision models [12], [13], [14], and [20], while the binary model is on par with full-precision models on the N-MNIST dataset. With the same SNN structure as in [12] (shown in Table II), the binarized and 4-bit precision models significantly outperform the 4-bit model from [12]: 99.55% (binary) and 99.58% (4-bit) in this work against 81.1% (4-bit) in [12]. At the same time, our full-precision model shows an accuracy that is 1.55% higher than the one reported in [12]. Also, the accuracy drop between full-precision and quantized models is significantly lower in this work.

IV. CONCLUSION

In this work, we study the performance of the QAT ANN quantization method LSQ for quantizing SNN models. We also propose a modification to this method to support signed binarization, reducing weight precision further without a significant performance drop. The main evaluation metrics are the memory savings and the accuracy drop of quantized SNN models compared to their full-precision counterparts. Using the modified LSQ method for SNN model binarization, the accuracy degradations on CIFAR10-DVS, DVS128 Gesture, N-MNIST, and N-Caltech101 are 0.3%, 0.7%, 0.15%, and 3.67%, respectively (compared with the full-precision counterparts), while providing up to 31× memory savings. The simulations conducted in this work allow us to gain a deeper understanding of truly efficient SNN quantization methods. We conclude that SNN models can be compressed aggressively without a significant loss in accuracy.

REFERENCES

[1] M. Bouvier, A. Valentian, T. Mesquida, F. Rummens, M. Reyboz, E. Vianello, and E. Beigne, "Spiking Neural Networks Hardware Implementations and Challenges: A Survey," ACM Journal on Emerging Technologies in Computing Systems, vol. 15, pp. 1–35, June 2019.

[2] W. Fang, Z. Yu, Y. Chen, T. Masquelier, T. Huang, and Y. Tian, "Incorporating Learnable Membrane Time Constant to Enhance Learning of Spiking Neural Networks," arXiv:2007.05785 [cs], Aug. 2021.

[3] M. Pfeiffer and T. Pfeil, "Deep learning with spiking neurons: Opportunities and challenges," Frontiers in Neuroscience, vol. 12, 2018.

[4] G. Menghani, "Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better," arXiv:2106.08962 [cs], June 2021.

[5] O. Weng, "Neural Network Quantization for Efficient Inference: A Survey," arXiv:2112.06126 [cs], Dec. 2021.

[6] S. K. Esser, J. L. McKinstry, D. Bablani, R. Appuswamy, and D. S. Modha, "Learned step size quantization," 2019.

[7] E. O. Neftci, H. Mostafa, and F. Zenke, "Surrogate Gradient Learning in Spiking Neural Networks: Bringing the Power of Gradient-Based Optimization to Spiking Neural Networks," IEEE Signal Processing Magazine, vol. 36, pp. 51–63, Nov. 2019.

[8] W. Fang, Y. Chen, J. Ding, D. Chen, Z. Yu, H. Zhou, and Y. Tian, "SpikingJelly," 2020.

[9] W. Gerstner, W. M. Kistler, R. Naud, and L. Paninski, Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition. Cambridge: Cambridge University Press, 2014.

[10] A. Abusnaina and R. Abdullah, "Spiking neuron models: A review," International Journal of Digital Content Technology and its Applications, vol. 8, pp. 14–21, June 2014.

[11] J. K. Eshraghian and W. D. Lu, "The fine line between dead neurons and sparsity in binarized spiking neural networks," arXiv:2201.11915 [cs, q-bio], Jan. 2022.

[12] H. W. Lui and E. Neftci, "Hessian Aware Quantization of Spiking Neural Networks," in International Conference on Neuromorphic Systems 2021, (Knoxville, TN, USA), pp. 1–5, ACM, July 2021.

[13] Y. Wu, L. Deng, G. Li, J. Zhu, Y. Xie, and L. Shi, "Direct Training for Spiking Neural Networks: Faster, Larger, Better," Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 1311–1318, July 2019.

[14] J. Wu, Y. Chua, M. Zhang, G. Li, H. Li, and K. C. Tan, "A Tandem Learning Rule for Effective Training and Rapid Inference of Deep Spiking Neural Networks," IEEE Transactions on Neural Networks and Learning Systems, pp. 1–15, 2021.

[15] L. Deng, Y. Wu, X. Hu, L. Liang, Y. Ding, G. Li, G. Zhao, P. Li, and Y. Xie, "Rethinking the performance comparison between SNNs and ANNs," Neural Networks, vol. 121, pp. 294–307, Jan. 2020.

[16] C. J. Schaefer and S. Joshi, "Quantizing Spiking Neural Networks with Integers," in International Conference on Neuromorphic Systems 2020, (Oak Ridge, TN, USA), pp. 1–8, ACM, July 2020.

[17] A. Amir, B. Taba, D. Berg, T. Melano, J. McKinstry, C. Di Nolfo, T. Nayak, A. Andreopoulos, G. Garreau, M. Mendoza, J. Kusnitz, M. Debole, S. Esser, T. Delbruck, M. Flickner, and D. Modha, "A Low Power, Fully Event-Based Gesture Recognition System," in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (Honolulu, HI), pp. 7388–7397, IEEE, July 2017.

[18] X. She, S. Dash, and S. Mukhopadhyay, "Sequence approximation using feedforward spiking neural network for spatiotemporal learning: Theory and optimization methods," in International Conference on Learning Representations, 2022.

[19] Y. Kim and P. Panda, "Optimizing Deeper Spiking Neural Networks for Dynamic Vision Sensing," Neural Networks, vol. 144, pp. 686–698, Dec. 2021.

[20] L. Deng, G. Li, S. Han, L. Shi, and Y. Xie, "Model compression and hardware acceleration for neural networks: A comprehensive survey," Proceedings of the IEEE, vol. 108, no. 4, pp. 485–532, 2020.
