Image Agents
Binary Choice Agent
The BinaryChoiceAgent performs binary classification.
The agent analyzes the image together with the query and classifies the result as either "yes" or "no".
Initialization
Parameters
The BinaryChoiceAgent is initialized with two arguments:
query
String, required
Specifies the text prompt that will be used to analyze the image.
model
MultimodalLLM
Represents the model used to perform the binary classification task. If left unspecified, this parameter defaults to the QwenVL() model. The supported MultimodalLLM models can be found below:
Example
example.py
The output from this agent is an ExecutionNode containing the previous image and the classification result ("yes" or "no").
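The behavior described above can be illustrated with a small, self-contained sketch. It does not import or reproduce the real library: StubMultimodalLLM, BinaryChoiceAgentSketch, ExecutionNodeSketch, the prompt_with_image and execute methods, and the rule for mapping a model reply to "yes" or "no" are all hypothetical stand-ins chosen for illustration. Only the two arguments (query and model) and the shape of the output (an image plus a "yes" or "no" result) come from the description above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


class StubMultimodalLLM:
    """Hypothetical stand-in for a MultimodalLLM; replies via a supplied function."""

    def __init__(self, answer_fn: Callable[[Any, str], str]):
        self.answer_fn = answer_fn

    def prompt_with_image(self, image: Any, query: str) -> str:
        # Assumed method name; a real model would run inference here.
        return self.answer_fn(image, query)


@dataclass
class ExecutionNodeSketch:
    """Hypothetical stand-in for ExecutionNode: the image plus the result."""

    image: Any
    data: str  # always "yes" or "no"


class BinaryChoiceAgentSketch:
    """Illustrates the documented interface: a required query, an optional model."""

    def __init__(self, query: str, model: Optional[StubMultimodalLLM] = None):
        if not query:
            raise ValueError("query is required")
        self.query = query
        # The documented default is QwenVL(); this sketch substitutes a stub
        # that always answers "no" so the example runs without any model.
        self.model = model or StubMultimodalLLM(lambda image, q: "no")

    def execute(self, image: Any) -> ExecutionNodeSketch:
        reply = self.model.prompt_with_image(image, self.query)
        # Assumed normalization: any reply starting with "yes" counts as "yes".
        label = "yes" if reply.strip().lower().startswith("yes") else "no"
        return ExecutionNodeSketch(image=image, data=label)


# Usage: a placeholder string stands in for a real image object.
agent = BinaryChoiceAgentSketch(
    query="Is there a dog in the image?",
    model=StubMultimodalLLM(lambda image, q: "Yes, there is a dog."),
)
node = agent.execute("dog.jpg")
print(node.data)  # prints "yes"
```

Forcing the reply into one of two labels keeps downstream steps simple, since they can branch on node.data without parsing free-form model text.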