Initialization

Parameters

The VisionPromptAgent is initialized with two arguments:

VisionPromptAgent(query, model)
query
string
required

The prompt used to analyze the image.

Here are a few illustrative examples:

Given an image of an X-ray scan, query = "Is there a fracture in the bone?"

Given a frame of security camera footage, query = "Identify any suspicious activities or individuals in this security camera footage."

Given a photo of a manufactured product, query = "Inspect this product for any defects or irregularities."

model
MultimodalLLM
required

The multimodal model that analyzes the image against the query. All supported MultimodalLLM models can be found below:

Example

Here is an example of a VisionPromptAgent configured for a Workflow that detects damage on an egg.

example.py
VisionPromptAgent("Is the egg damaged in any way?", model=GPT4Vision())
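
The same constructor pattern can be applied to the other example queries listed under Parameters. The sketch below is illustrative only: so that it runs without the library installed, it defines minimal stand-in classes named after VisionPromptAgent, MultimodalLLM, and GPT4Vision. These stand-ins are not the library's implementation, and the argument checks and the QUERIES dictionary are assumptions added for illustration, based only on both arguments being documented as required.

```python
from dataclasses import dataclass


class MultimodalLLM:
    """Stand-in for the library's MultimodalLLM type (hypothetical stub)."""


class GPT4Vision(MultimodalLLM):
    """Stand-in for the GPT4Vision model used in the example above (stub)."""


@dataclass
class VisionPromptAgent:
    """Stand-in mirroring the documented signature: VisionPromptAgent(query, model)."""

    query: str
    model: MultimodalLLM

    def __post_init__(self):
        # Assumed checks: the page documents both arguments as required,
        # with query as a string and model as a MultimodalLLM.
        if not isinstance(self.query, str) or not self.query.strip():
            raise ValueError("query must be a non-empty string")
        if not isinstance(self.model, MultimodalLLM):
            raise TypeError("model must be a MultimodalLLM instance")


# Hypothetical mapping of use cases to the example queries from this page.
QUERIES = {
    "xray": "Is there a fracture in the bone?",
    "security": "Identify any suspicious activities or individuals "
                "in this security camera footage.",
    "inspection": "Inspect this product for any defects or irregularities.",
}

# Build one agent per use case, each with its own query and model instance.
agents = {
    name: VisionPromptAgent(query, model=GPT4Vision())
    for name, query in QUERIES.items()
}
```

With the real library, only the final dictionary comprehension would be needed, since the classes would come from the library itself. Keeping one agent per query keeps each prompt focused on a single question about the image.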