This agent analyzes the image against the query and classifies the result as either “yes” or “no”.

Initialization

Parameters

The BinaryChoiceAgent is initialized with two arguments:

BinaryChoiceAgent(query, model(Optional))
query
String
required

Specifies the text prompt that will be used to analyze the image.

model
MultimodalLLM

Represents the model used to perform the binary classification task.

If left unspecified, this parameter defaults to the QwenVL() model. The supported MultimodalLLM models can be found below:

Example

example.py
BinaryChoiceAgent("Is this person wearing a hardhat?")

The output from this agent will be an ExecutionNode containing the previous image and the classification result (“yes” or “no”).
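
The input and output contract described above can be illustrated with a small, self-contained sketch. This is not the library's implementation: SketchBinaryChoiceAgent, SketchNode, and default_model are hypothetical stand-ins invented for illustration, and the execute method name, the answer normalization, and the always-"no" default are assumptions. It shows only the general shape: a required query, an optional model with a fallback, and a result that carries the image alongside a "yes" or "no" answer.

sketch.py
```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class SketchNode:
    """Hypothetical stand-in for ExecutionNode: the image plus the result."""
    image: Any
    data: str


def default_model(image: Any, prompt: str) -> str:
    """Hypothetical stand-in for the default model; always answers "no"."""
    return "no"


class SketchBinaryChoiceAgent:
    """Illustrative only; mirrors the documented (query, model) parameters."""

    def __init__(self, query: str,
                 model: Optional[Callable[[Any, str], str]] = None):
        if not query:
            raise ValueError("query is required")
        self.query = query
        # Fall back to a default model when none is supplied.
        self.model = model if model is not None else default_model

    def execute(self, image: Any) -> SketchNode:
        raw = self.model(image, self.query)
        # Assumed normalization: trim whitespace and trailing punctuation.
        answer = raw.strip().strip(".!").lower()
        if answer not in ("yes", "no"):
            raise ValueError(f"unexpected model answer: {raw!r}")
        # The image is passed through unchanged next to the answer.
        return SketchNode(image=image, data=answer)
```

In this sketch, any callable that takes an image and a prompt and returns text can serve as the model, which makes the yes/no contract easy to exercise without loading a real multimodal model.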