Binary Choice Agent

This agent analyzes an image against a text query and classifies the result as either “yes” or “no”.

Initialization

Parameters

The BinaryChoiceAgent is initialized with two arguments:

BinaryChoiceAgent(query, model(Optional))

query

String

required

Specifies the text prompt that will be used to analyze the image.

model

MultimodalLLM

Represents the model used to perform the binary classification task. If left unspecified, this parameter defaults to the QwenVL() model. The supported MultimodalLLM models are listed below:

QwenVL()

MultimodalLLM (Default)

Supports qwen-vl-chat.

GPT4Vision()

MultimodalLLM

Supports gpt-4-turbo, gpt-4o.

Claude()

MultimodalLLM

Supports claude-3-opus-20240229, claude-3-haiku-20240307, claude-3-sonnet-20240229.

Gemini()

MultimodalLLM

Supports gemini-pro-vision.

Example

example.py

BinaryChoiceAgent("Is this person wearing a hardhat?")

The output from this agent will be an ExecutionNode containing the previous image and the classification result (“yes” or “no”).
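
Downstream code often wants a boolean rather than the “yes”/“no” string. The following is a minimal sketch of that conversion, assuming the classification result has already been read out of the ExecutionNode as a plain string; how to access it on the node is not covered on this page. The helper name answer_to_bool is hypothetical and not part of the library.

```python
def answer_to_bool(answer: str) -> bool:
    """Map a "yes"/"no" classification string to a boolean.

    Hypothetical helper, not part of the library. It assumes the
    classification result is available as a plain string.
    """
    # Tolerate surrounding whitespace, quotes, a trailing period, and casing.
    normalized = answer.strip().strip('."\'“”').lower()
    if normalized == "yes":
        return True
    if normalized == "no":
        return False
    # Anything else is unexpected for a binary classification.
    raise ValueError(f"expected 'yes' or 'no', got {answer!r}")
```

Raising on unexpected input, rather than silently treating it as “no”, keeps a malformed model response from being mistaken for a negative classification.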
