SegFormer Quantization, Part 1: Short Intro and Rationale
Purpose
- Quantize to reduce SPACE and study its effect on TIME and model quality (a first sketch follows this list)
- Research different quantization schemes on pre-trained models
- Use HuggingFace built-in or custom functions
- If HuggingFace is insufficient, use PyTorch Hub or TensorFlow Hub
- If all else fails, fall back to low-level PyTorch
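To make the SPACE point concrete, here is a minimal sketch that applies PyTorch dynamic quantization to a pre-trained SegFormer from HuggingFace and compares serialized sizes. The checkpoint name is an assumption (one public example; any SegFormer checkpoint should behave similarly); the TIME and quality measurements come later in the series.

```python
# A minimal sketch: dynamic quantization of a SegFormer's Linear layers,
# comparing serialized model sizes. The checkpoint is one public example.
import io

import torch
from transformers import SegformerForSemanticSegmentation

model = SegformerForSemanticSegmentation.from_pretrained(
    "nvidia/segformer-b0-finetuned-ade-512-512"
)
model.eval()

# Dynamic quantization: Linear weights stored as int8, activations
# quantized on the fly at inference time.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

def state_dict_mb(m: torch.nn.Module) -> float:
    """Serialized size of a model's parameters in megabytes."""
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes / 2**20

print(f"fp32 model:  {state_dict_mb(model):.1f} MB")
print(f"int8 linear: {state_dict_mb(quantized):.1f} MB")
```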
What
- Overcoming and recording difficulties along the way
- From PoC to MVP
- Keep everything as generic as possible using jupytext and papermill (a usage sketch follows this list)
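For the jupytext/papermill point, here is a minimal sketch of how the two could be combined. All file names and parameter names below are hypothetical placeholders, not this series' actual notebooks.

```python
# A minimal sketch of the jupytext + papermill workflow; file and parameter
# names are hypothetical placeholders.
from pathlib import Path

import jupytext
import papermill as pm

# jupytext: keep the source as a plain .py script, render it to .ipynb.
nb = jupytext.read("quantize_segformer.py")
jupytext.write(nb, "quantize_segformer.ipynb")

# papermill: execute the notebook with injected parameters, so one generic
# notebook can be re-run per checkpoint / per quantization scheme.
Path("out").mkdir(exist_ok=True)
pm.execute_notebook(
    "quantize_segformer.ipynb",
    "out/quantize_segformer_qint8.ipynb",
    parameters={
        "checkpoint": "nvidia/segformer-b0-finetuned-ade-512-512",
        "dtype": "qint8",
    },
)
```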
How
To come:
- Using PyTorch quantization capabilities such as quant/dequant layers, the torch.qint32 dtype, quantize_fx, and QConfigMapping (see the sketch after this list)
- Task-specific distributions of weights/biases, activations, and gradients
- Use the learned task-specific distributions as initialisation
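As referenced in the list above, a minimal sketch of the quantize_fx + QConfigMapping workflow, shown on a toy module rather than SegFormer (that wiring is the subject of later parts). The default qconfig here uses qint8 weights and quint8 activations; qint32 variants and task-specific observer choices are among the topics to come. The calibration loop is where the task-specific activation distributions get observed.

```python
# A minimal sketch of FX graph-mode quantization on a toy module.
import torch
from torch.ao.quantization import QConfigMapping, get_default_qconfig
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx

model = torch.nn.Sequential(
    torch.nn.Linear(16, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 8),
).eval()

# QConfigMapping decides which observers watch which layers; the observed
# weight/activation distributions determine the quantization scales.
qconfig_mapping = QConfigMapping().set_global(get_default_qconfig("fbgemm"))

example_inputs = (torch.randn(4, 16),)
prepared = prepare_fx(model, qconfig_mapping, example_inputs)

# Calibration: run representative (task-specific) data through the observers.
with torch.inference_mode():
    for _ in range(8):
        prepared(torch.randn(4, 16))

# convert_fx inserts the quant/dequant ops and swaps in quantized kernels.
quantized = convert_fx(prepared)
print(quantized)
```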