This agent analyzes the image together with the query and classifies the result as either "yes" or "no".
Initialization
Parameters
The BinaryChoiceAgent is initialized with two arguments:
BinaryChoiceAgent(query, model(Optional))
query: The text prompt used to analyze the image.
model (optional): The model used to perform the binary classification task. If left unspecified, this parameter defaults to the QwenVL() model. The supported MultimodalLLM models are listed below:
Supports gpt-4-turbo, gpt-4o.
Supports claude-3-opus-20240229, claude-3-haiku-20240307, claude-3-sonnet-20240229.
Supports gemini-pro-vision.
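The optional-model behavior described above can be illustrated with a small, self-contained sketch. It does not use the library itself: `StubModel` and `BinaryChoiceAgentSketch` are hypothetical stand-ins invented for this illustration, and the real classes may be structured differently. The sketch shows only the pattern of falling back to a default model when none is passed.

```python
from dataclasses import dataclass
from typing import Optional


class StubModel:
    """Hypothetical stand-in for a multimodal LLM wrapper (not a library class)."""

    def __init__(self, name: str = "stub-default") -> None:
        self.name = name


@dataclass
class BinaryChoiceAgentSketch:
    """Illustrative stand-in mirroring the (query, model) argument pattern."""

    query: str
    model: Optional[StubModel] = None

    def __post_init__(self) -> None:
        # Fall back to a default model when the caller does not supply one,
        # analogous to the documented default for the real agent.
        if self.model is None:
            self.model = StubModel()


# Only the query is given, so the default model is used.
default_agent = BinaryChoiceAgentSketch("Is this person wearing a hardhat?")

# An explicit model overrides the default.
custom_agent = BinaryChoiceAgentSketch(
    "Is this person wearing a hardhat?",
    model=StubModel("stub-custom"),
)
```

Creating the default inside `__post_init__` rather than in the signature means each agent gets its own model instance instead of sharing one object across agents.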
Example
BinaryChoiceAgent("Is this person wearing a hardhat?")
The output of this agent is an ExecutionNode containing the previous image and the classification result ("yes" or "no").
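How the agent turns a model's reply into one of the two classes is not described here. As a rough illustration of the general idea, the sketch below maps a free-text reply to "yes" or "no" by looking at its first word. The helper `normalize_binary_answer` is hypothetical and is not part of the library.

```python
import re


def normalize_binary_answer(raw: str) -> str:
    """Map a free-text reply to "yes" or "no" (hypothetical helper).

    Only the first alphabetic word is inspected, so a reply such as
    "not sure" is rejected instead of being mistaken for "no".
    """
    words = re.findall(r"[a-z]+", raw.lower())
    first = words[0] if words else ""
    if first in ("yes", "no"):
        return first
    raise ValueError(f"expected a yes/no answer, got {raw!r}")


label = normalize_binary_answer(" Yes. ")
```

Raising an error on ambiguous replies, instead of silently picking a class, keeps downstream steps from acting on a classification the model never actually made.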