Currently Instructor Agents only work with OpenAI models.

Initialization

Parameters

The InstructorImageAgent is initialized with two arguments:

InstructorImageAgent(response_model, model(Optional))
response_model
BaseModel
required

Specifies the structure of the data that the OpenAI API is expected to return. This structured model helps in parsing and validating the data returned from the API, ensuring it adheres to the expected format.

This parameter expects a class that is a subclass of BaseModel from the pydantic library.

model
Optional[String]

Specifies the model identifier used by the OpenAI API to process the image. The default value is "gpt-4o".

You can find a comprehensive list of all OpenAI API models here.

Example

Here is an example of the InstructorImageAgent designed for a Workflow to detect if workers are wearing personal protective equipment (PPE).

Step 1. Define a response_model.

example.py
class PPE(BaseModel):
    hardhat: bool

The PPE class has a single attribute hardhat that is a boolean, indicating whether a hardhat is present or not.

Step 2. Use InstructorImageAgent.

example.py
InstructorImageAgent(response_model=PPE)

The InstructorImageAgent processes images and extracts structured data based on the PPE model. The output from this agent will be an ExecutionNode containing the original image and a Pydantic object that may look like

{
  "hardhat": true
}