🤯 Extra Installations
Install extra packages to speed up Overeasy
Extras are currently supported only on Linux.
Overeasy currently supports int4 quantization for running models like QwenVL.
To use these models, you need a performant AutoGPTQ install, which means building the relevant CUDA extensions!
In our example Colab, we install AutoGPTQ using a prebuilt wheel, which makes setup a bit easier.
You can install AutoGPTQ from pip using the provided instructions.
Alternatively, you can install AutoGPTQ from source:
Source Install
Note: Building the source install can take around 20 minutes.
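Once the install finishes, a quick preflight check can save a failed model load. The sketch below is not part of Overeasy: `preflight` and `has_enough_vram` are hypothetical helper names, and it assumes PyTorch is the CUDA runtime in use and that the 11GB figure can be read as GiB. It only checks that the `auto_gptq` package is importable, that you are on Linux, and that the first GPU reports enough memory; it does not verify that the CUDA extensions were actually built.

```python
import importlib.util
import platform

# Threshold taken from the VRAM requirement stated on this page.
# Assumption: interpreted as GiB (1024**3 bytes).
REQUIRED_VRAM_GB = 11


def has_enough_vram(total_bytes: int, required_gb: float = REQUIRED_VRAM_GB) -> bool:
    """Return True if total_bytes is strictly more than required_gb GiB."""
    return total_bytes > required_gb * 1024**3


def preflight() -> dict:
    """Collect basic environment checks without raising if a package is missing."""
    report = {
        "linux": platform.system() == "Linux",
        # find_spec returns None when a top-level package is not installed.
        "auto_gptq_installed": importlib.util.find_spec("auto_gptq") is not None,
        "cuda_available": False,
        "enough_vram": False,
    }
    if importlib.util.find_spec("torch") is None:
        return report  # without PyTorch we cannot query the GPU

    import torch

    if torch.cuda.is_available():
        report["cuda_available"] = True
        # Only the first GPU (device 0) is inspected in this sketch.
        total = torch.cuda.get_device_properties(0).total_memory
        report["enough_vram"] = has_enough_vram(total)
    return report


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

If any line prints `MISSING`, it is probably worth fixing that before trying to load a quantized model.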
After installing, you can use the int4 quantized model like this, as long as you have more than 11GB of VRAM: