Image Agents
Vision Prompt Agent
The VisionPromptAgent handles image-based queries.
Initialization
Parameters
The VisionPromptAgent is initialized with two arguments:
The prompt used to analyze the image.
Here are a few illustrative examples:
Given an image of an X-ray scan,
query = "Is there a fracture in the bone?"
Given a frame of security camera footage,
query = "Identify any suspicious activities or individuals in this security camera footage."
Given a photo of a manufactured product,
query = "Inspect this product for any defects or irregularities."
The selected model. All supported MultimodalLLM models can be found below:
Example
Here is an example of the VisionPromptAgent designed for a Workflow to detect damage on an egg.
example.py
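As a rough illustration of the two-argument initialization described above, the sketch below pairs a query with a model name for the egg-damage case. It does not import or call the library: VisionPromptAgentSketch, its build_request method, the argument name model, and the model name example-multimodal-model are all hypothetical stand-ins, and the real class signature may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisionPromptAgentSketch:
    """Hypothetical stand-in for the two initialization arguments; not the library's class."""

    query: str  # the prompt used to analyze the image
    model: str  # the selected multimodal model (argument name is an assumption)

    def __post_init__(self) -> None:
        # Reject empty arguments early so a misconfigured agent fails at construction time.
        if not self.query.strip():
            raise ValueError("query must be a non-empty prompt")
        if not self.model.strip():
            raise ValueError("model must be a non-empty model name")

    def build_request(self, image_path: str) -> dict:
        # Bundle the prompt, model, and image reference into one payload.
        # A real agent would pass something like this to the model; here it is only returned.
        return {"model": self.model, "query": self.query, "image": image_path}


egg_agent = VisionPromptAgentSketch(
    query="Is there any damage, such as cracks, on this egg?",
    model="example-multimodal-model",  # placeholder, not a real supported model
)
request = egg_agent.build_request("egg.jpg")
print(request["query"])
```

Keeping the query and the model as the only two fields mirrors the parameter list above: the same query can be reused across many images, and swapping the model does not require touching the prompt.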