SegFormer Quantization, Part 1: Introduction and Motivation

Purpose

  • Quantization to reduce SPACE, and its effect on TIME and model quality
  • Researching different quantization schemes on pre-trained models
  • Using HuggingFace's built-in quantization support, or custom functions
  • If HuggingFace is insufficient, using PyTorch Hub or TensorFlow Hub
  • If all else fails, falling back to low-level PyTorch
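As a first taste of the space savings, here is a minimal sketch of PyTorch dynamic quantization. It uses a small stand-in module rather than an actual pre-trained SegFormer (which would be loaded from HuggingFace; not downloaded here), so the layer sizes are illustrative only:

```python
import io

import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic


def state_dict_bytes(model: nn.Module) -> int:
    """Serialized size of the model's parameters, as a proxy for SPACE."""
    buf = io.BytesIO()
    torch.save(model.state_dict(), buf)
    return buf.getbuffer().nbytes


# Stand-in for a transformer MLP block (hypothetical sizes); the real
# target would be a pre-trained SegFormer.
model = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512))

# Dynamic quantization: Linear weights are stored as int8, activations
# are quantized on the fly at inference time.
qmodel = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

print(f"fp32: {state_dict_bytes(model)} B, int8: {state_dict_bytes(qmodel)} B")
```

The int8 copy should serialize to roughly a quarter of the fp32 size, while the effect on TIME and quality is exactly what the later parts set out to measure.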

What

  • Overcoming and recording difficulties along the way
  • From PoC to MVP
  • Keeping the pipeline as generic as possible, using jupytext and papermill
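That workflow can be sketched as follows; the filenames and the `scheme` parameter are hypothetical, and both tools are assumed to be installed:

```shell
# Pair the plain-Python script with a notebook (jupytext keeps the two in sync)
jupytext --to notebook quantize_segformer.py

# Re-run the notebook with an injected parameter (papermill's -p flag);
# 'scheme' would be defined in a notebook cell tagged "parameters"
papermill quantize_segformer.ipynb runs/dynamic_qint8.ipynb -p scheme dynamic
```

Keeping the source as a `.py` script makes it diffable and reviewable, while papermill turns each experiment into a parameterised, reproducible notebook run.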

How

To come:

  • Using PyTorch quantization capabilities such as QuantStub/DeQuantStub layers, quantized dtypes like torch.qint32, quantize_fx, and QConfigMapping
  • Inspecting the task-specific distributions of weights/biases, activations, and gradients
  • Using these learned task-specific distributions as initialisation
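To preview where the series is heading, here is a minimal sketch of FX graph mode quantization with `quantize_fx` and `QConfigMapping`, on a stand-in convolutional block rather than an actual SegFormer encoder (an assumption; nothing is downloaded). The calibration loop is where observers record the task-specific activation distributions mentioned above:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import QConfigMapping, get_default_qconfig
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx

# Stand-in for a SegFormer encoder block (hypothetical shapes).
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 8, 3)).eval()

# QConfigMapping selects observers globally, per module type, or per name.
qconfig_mapping = QConfigMapping().set_global(get_default_qconfig("fbgemm"))

example = torch.randn(1, 3, 32, 32)
prepared = prepare_fx(model, qconfig_mapping, example_inputs=(example,))

# Calibration: run representative data so observers learn activation ranges.
with torch.no_grad():
    for _ in range(4):
        prepared(torch.randn(1, 3, 32, 32))

quantized = convert_fx(prepared)  # int8 weights, fused quantized kernels
out = quantized(example)
```

Swapping the default observers in the `QConfigMapping` for ones initialised from learned, task-specific distributions is exactly the experiment the later parts will explore.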