Support images directly in UserMessage #387

Merged
jackmpcollins merged 48 commits into main from allow-images-in-user-message
Jan 6, 2025

jackmpcollins
Owner

@jackmpcollins jackmpcollins commented Dec 3, 2024

  • Enable UserMessage to contain both text and image parts. Introduce new types ImageBytes and ImageUrl to identify the image parts.
  • Deprecate UserImageMessage
  • Update docs to use UserMessage instead of UserImageMessage

Possible breaking changes

  • Under strict type checking, a bare UserMessage type hint needs to be changed to UserMessage[Any]
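
A rough illustration of that breaking change, assuming UserMessage has become generic over its content type. The class `Message` and the function `describe_content` below are hypothetical stand-ins, not magentic's implementation. Once a class is generic, strict checkers (for example mypy with `--strict`) report the bare class name in annotations as missing type parameters, and parameterizing it with Any is the smallest fix.

```python
from typing import Any, Generic, TypeVar

ContentT = TypeVar("ContentT")


class Message(Generic[ContentT]):
    """Hypothetical stand-in for a message class generic over its content."""

    def __init__(self, content: ContentT) -> None:
        self.content = content


# Before: `def describe_content(message: Message) -> str` leaves the type
# parameter implicit, which strict type checkers flag.
# After: spell the parameter out, using Any when the content type is irrelevant.
def describe_content(message: Message[Any]) -> str:
    """Return the name of the content's runtime type."""
    return type(message.content).__name__


text_message = Message("Hello")
parts_message = Message(["Describe this image.", b"raw image bytes"])
```

Code that does care about the content type can use a narrower parameter instead of Any, such as `Message[str]` in this sketch.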

Example

from magentic import chatprompt, ImageUrl, Placeholder, UserMessage


IMAGE_URL_WOODEN_BOARDWALK = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"


@chatprompt(
    UserMessage(
        [
            "Describe the following image in one sentence.",
            Placeholder(ImageUrl, "image_url"),
        ]
    ),
)
def describe_image(image_url: str) -> str: ...


describe_image(IMAGE_URL_WOODEN_BOARDWALK)
# 'A wooden boardwalk meanders through lush green wetlands under a partly cloudy blue sky.'

This more closely aligns magentic with the provider APIs, which accept a list of text and image content parts within a single user message.

openai
https://platform.openai.com/docs/guides/vision?lang=python

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What’s in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
          },
        },
      ],
    }
  ],
  max_tokens=300,
)

print(response.choices[0])
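
To make the correspondence concrete, here is a minimal sketch of how a mixed list of text and image parts could map onto the OpenAI-style content list shown above. `ImageUrlPart` and `to_openai_content` are hypothetical names for illustration, the example URL is a placeholder, and this is not how magentic necessarily performs the conversion.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass(frozen=True)
class ImageUrlPart:
    """Hypothetical stand-in for an image-URL content part."""

    url: str


def to_openai_content(parts: list[Union[str, ImageUrlPart]]) -> list[dict[str, Any]]:
    """Map text and image parts to OpenAI-style content parts, preserving order."""
    content: list[dict[str, Any]] = []
    for part in parts:
        if isinstance(part, str):
            content.append({"type": "text", "text": part})
        elif isinstance(part, ImageUrlPart):
            content.append({"type": "image_url", "image_url": {"url": part.url}})
        else:
            raise TypeError(f"Unsupported content part: {type(part).__name__}")
    return content


user_message = {
    "role": "user",
    "content": to_openai_content(
        [
            "Describe the following image in one sentence.",
            ImageUrlPart("https://example.com/boardwalk.jpg"),  # placeholder URL
        ]
    ),
}
```

The resulting `user_message` has the same shape as the single entry in `messages` in the OpenAI example.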

anthropic
https://docs.anthropic.com/en/docs/build-with-claude/vision#about-the-prompt-examples

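import base64

import anthropic
import httpx

# Assumed setup: the snippet as posted leaves image1_media_type and image1_data
# undefined, so fetch an image and base64-encode it for the request below.
image1_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image1_media_type = "image/jpeg"
image1_data = base64.standard_b64encode(httpx.get(image1_url).content).decode("utf-8")

client = anthropic.Anthropic()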
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image1_media_type,
                        "data": image1_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Describe this image."
                }
            ],
        }
    ],
)
print(message)
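
For raw image bytes, the Anthropic-style block above needs a media type and base64 data. The sketch below shows one way to build such a block from bytes. `guess_media_type` and `to_anthropic_image_block` are hypothetical helpers, and sniffing the format from the leading bytes is an assumption made for this illustration rather than magentic's documented behavior.

```python
import base64
from typing import Any


def guess_media_type(data: bytes) -> str:
    """Guess the image media type from its leading magic bytes."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    raise ValueError("Unrecognized image format")


def to_anthropic_image_block(data: bytes) -> dict[str, Any]:
    """Build an Anthropic-style base64 image content block from raw bytes."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": guess_media_type(data),
            "data": base64.standard_b64encode(data).decode("utf-8"),
        },
    }


# A PNG signature followed by padding stands in for real image bytes here.
png_bytes = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
block = to_anthropic_image_block(png_bytes)
```

The block can then be placed in the `content` list next to a text part, as in the Anthropic example.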

@jackmpcollins jackmpcollins self-assigned this Dec 3, 2024
@jackmpcollins jackmpcollins changed the title from "Support for multi-part UserMessage" to "Support images directly in UserMessage" Jan 6, 2025
@jackmpcollins jackmpcollins marked this pull request as ready for review January 6, 2025 02:19
@jackmpcollins jackmpcollins merged commit 0cb9e7b into main Jan 6, 2025
1 check passed
@jackmpcollins jackmpcollins deleted the allow-images-in-user-message branch January 6, 2025 02:19