Skip to content

Image support? #85

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
nahco314 opened this issue Mar 12, 2025 · 3 comments
Closed

Image support? #85

nahco314 opened this issue Mar 12, 2025 · 3 comments
Labels
question Question about using the SDK

Comments

@nahco314
Copy link

Question

Is there any support for images?

It would be great if we could input images that exist locally (or at a URL) and then ask questions about them or analyse them.

@nahco314 nahco314 added the question Question about using the SDK label Mar 12, 2025
@rm-openai
Copy link
Collaborator

Yes, you can upload images. The input parameter accepts items of type EasyInputMessageParam or Message (from the openAI SDK), both of which allow ResponseInputImageParam in their content list. You can specify a base64 or fully qualified URL there.

Feel free to reopen for followups!

@omidsrezai
Copy link

omidsrezai commented Apr 1, 2025

Could you share a quick code snippet for this?

@DanieleMorotti
Copy link
Contributor

Hi, you can check the issue #159 .
Otherwise, if you prefer to use ResponseInputImageParam, this is another example:

from agents import Agent, Runner, ModelSettings
from openai.types.responses import ResponseInputImageParam, ResponseInputTextParam
from openai.types.responses.response_input_item_param import Message
import base64
import asyncio

# Function to encode the image
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

# Path to your image
image_path = "test_img.jpeg"

# Getting the Base64 string
base64_image = encode_image(image_path)

agent = Agent(
    name="Assistant",
    model="gpt-4o",
    model_settings=ModelSettings(temperature=0.4, max_tokens=1024),
    instructions="Given an input image you will generate the description of the image in the style specified by the user."
)

async def main():
    result = await Runner.run(agent, input=[
        Message(
            content=[
                ResponseInputTextParam(text="Describe this image with an haiku", type="input_text"),
                ResponseInputImageParam(detail="low", image_url=f"data:image/jpeg;base64,{base64_image}", type="input_image")
            ],
            role="user"
        )
    ])
    print(result.final_output)

asyncio.run(main())

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
question Question about using the SDK
Projects
None yet
Development

No branches or pull requests

4 participants