Note: Extras are currently only supported on Linux.
Overeasy currently supports int4 quantization for running models like QwenVL. To use these models, you will need a performant install of AutoGPTQ, so make sure to build the relevant CUDA extensions! In our example Colab, we instead install a prebuilt AutoGPTQ wheel, which skips the compile and makes things a bit easier. You can install AutoGPTQ from pip following the instructions in its README (sketched below), or build it from source as described under Source Install.
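For reference, the prebuilt-wheel route looks roughly like this. The index URL and CUDA tag (cu118 here) follow AutoGPTQ's README at the time of writing and may change, so treat them as assumptions and double-check the upstream instructions:

!pip install auto-gptq==0.7.1 --no-build-isolation --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/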
Source Install
!pip install optimum tiktoken gekko einops transformers_stream_generator accelerate
!pip install git+https://github.com/AutoGPTQ/AutoGPTQ@v0.7.1
Note: Building from source can take around 20 minutes.
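Either way, a quick sanity check that the package and CUDA are visible is worthwhile. This is a minimal sketch; it uses only the standard library and PyTorch, since the internal layout of AutoGPTQ's compiled extensions varies between versions:

import importlib.metadata
import torch

# Confirm auto-gptq is installed and that PyTorch can see a CUDA device;
# without CUDA, the quantized kernels will be slow or unavailable.
print("auto-gptq version:", importlib.metadata.version("auto-gptq"))
print("CUDA available:", torch.cuda.is_available())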
After installing, you can use the int4-quantized model like this, provided you have more than 11 GB of VRAM:
from overeasy import QwenVL

# Load the int4-quantized QwenVL weights onto the GPU.
model = QwenVL("int4")
model.load_resources()

response = model.prompt("What is the capital of California?")
print(response)
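Because loading will fail or run out of memory on smaller cards, you may want to check available VRAM up front. A minimal sketch using PyTorch's device-properties API (the 11 GiB threshold mirrors the requirement above):

import torch

# total_memory is reported in bytes; convert to GiB for comparison.
total_gib = torch.cuda.get_device_properties(0).total_memory / 1024**3
if total_gib < 11:
    raise RuntimeError(f"QwenVL int4 needs over 11 GiB of VRAM; found {total_gib:.1f} GiB")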