Initialization

Parameters

The VisionPromptAgent is initialized with two arguments:
VisionPromptAgent(query, model)
query
string
required
The prompt used to analyze the image. Here are a few illustrative examples:
Given an image of an X-ray scan, query = "Is there a fracture in the bone?"
Given a frame of security camera footage, query = "Identify any suspicious activities or individuals in this security camera footage."
Given a photo of a manufactured product, query = "Inspect this product for any defects or irregularities."
model
MultimodalLLM
required
The multimodal model that the agent uses to answer the query. All supported MultimodalLLM models can be found below:

Example

The following example creates a VisionPromptAgent for a Workflow that detects damage on an egg.
example.py
VisionPromptAgent("Is the egg damaged in any way?", model=GPT4Vision())
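
The two-argument constructor can also be sketched for the other queries listed under Parameters. This page does not show the library's import path, so the classes below are local stand-ins that mirror only the documented signature VisionPromptAgent(query, model); they are not the library's implementation. The type and non-empty checks, the QUERIES dictionary, and its keys are assumptions added for illustration. In real code, the stand-ins would be replaced by the library's own classes.

```python
from dataclasses import dataclass


class MultimodalLLM:
    """Stand-in base class (assumption); the real one comes from the library."""


class GPT4Vision(MultimodalLLM):
    """Stand-in for the GPT4Vision model used in the example above."""


@dataclass(frozen=True)
class VisionPromptAgent:
    """Stand-in that mirrors only the documented signature: (query, model)."""

    query: str
    model: MultimodalLLM

    def __post_init__(self):
        # Both arguments are documented as required. These checks are an
        # assumption of this sketch, not documented library behavior.
        if not isinstance(self.query, str) or not self.query.strip():
            raise ValueError("query must be a non-empty string")
        if not isinstance(self.model, MultimodalLLM):
            raise TypeError("model must be a MultimodalLLM instance")


# Hypothetical use-case names mapped to the example queries from the
# parameter description.
QUERIES = {
    "xray": "Is there a fracture in the bone?",
    "security": "Identify any suspicious activities or individuals "
                "in this security camera footage.",
    "inspection": "Inspect this product for any defects or irregularities.",
}

# Build one agent per use case, each with its own query and model instance.
agents = {
    name: VisionPromptAgent(query, model=GPT4Vision())
    for name, query in QUERIES.items()
}
```

Keeping one agent per query, rather than one agent reused with different prompts, matches the constructor shown above, where the query is fixed at initialization.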