marco-agile
Contributor

πŸ“ Summary

This pull request introduces two new optional fields to the AgentDescriptor schema:
• supportedInput: defines which input types the agent can handle.
• supportedOutput: defines which output types the agent can produce.

Both fields accept any combination of the following values:
• text
• images
• video
• files

📌 Motivation

These additions aim to make agent capabilities more explicit and machine-readable, enabling better discoverability, filtering, and compatibility validation across agentic systems.

🔧 Changes
• Added supportedInput and supportedOutput as arrays of strings with fixed enums to components.schemas.AgentDescriptor.

✅ Example

supportedInput: ["text", "images"]
supportedOutput: ["text", "files"]

📚 Documentation

The OpenAPI specification has been updated accordingly in the AgentDescriptor schema block.

Contributor

tmnd1991 left a comment

The proposal makes sense. Thinking a bit more broadly, though: shouldn't we rely on HTTP Content-Types for this kind of thing, so that we don't reinvent the wheel? What's your position?

@@ -46,6 +46,8 @@ The agent descriptor follows an **OpenAPI 3.0-based** schema to enable easy docu
- `ModelDrivenWorkflow`: the agent is implemented as a workflow. The execution through the workflow is controlled by LLMs.
- `toolsUse` *(string)* – Define if the system can use tools in order to execute its task. Values: true/false.
- `learningCapability` *(string)* – Learning approach (None, Reinforcement Learning, Fine Tuning).
- `supportedInput` *(array)* – List of supported output formats or content types (e.g., text, images, video, files).
Contributor

Suggested change
- `supportedInput` *(array)* – List of supported output formats or content types (e.g., text, images, video, files).
- `supportedInput` *(array)* – List of supported input formats or content types (e.g., text, images, video, files).

Comment on lines +263 to +265
type: string
enum: [text, images, video, files]
examples: [["text", "images"]]
Contributor

Can you declare this as a reusable type and reference it from both the input and output fields?
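
For illustration, the reuse requested here could look roughly like the sketch below. It assumes an OpenAPI 3.0 document. The schema name `SupportedContentType` is hypothetical, and the other `AgentDescriptor` properties are omitted.

```yaml
# Sketch only: SupportedContentType is a hypothetical name, not part of the PR.
components:
  schemas:
    # Shared enum, declared once and referenced by both fields.
    SupportedContentType:
      type: string
      enum: [text, images, video, files]

    AgentDescriptor:
      type: object
      properties:
        # ...existing properties omitted...
        supportedInput:
          type: array
          items:
            $ref: '#/components/schemas/SupportedContentType'
          example: [text, images]
        supportedOutput:
          type: array
          items:
            $ref: '#/components/schemas/SupportedContentType'
          example: [text, files]
```

With a single shared definition, adding a new value such as audio would only need one change rather than two.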

@marco-agile
Contributor Author

I agree that using standard MIME types makes sense in principle.
That said, I think relying only on them might be too rigid, especially for agents that handle broad categories like video or images, where formats can vary and evolve.
A possible compromise could be to define a category field (e.g. video, text, image) and optionally list specific MIME types when needed. What do you think about it?
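
As a rough sketch of that compromise, assuming an OpenAPI 3.0 schema: the names `ContentCapability` and `mimeTypes` are hypothetical, and the category list is only illustrative.

```yaml
# Sketch only: ContentCapability and mimeTypes are hypothetical names.
components:
  schemas:
    ContentCapability:
      type: object
      required: [category]
      properties:
        # Broad category, always present.
        category:
          type: string
          enum: [text, image, video]   # illustrative, not a final list
        # Optional refinement: specific MIME types, when the agent needs to be precise.
        mimeTypes:
          type: array
          items:
            type: string
          example: [video/mp4, video/webm]
```

An agent descriptor could then declare a broad category on its own, or narrow it down where that matters:

```yaml
supportedInput:
  - category: text
  - category: video
    mimeTypes: [video/mp4, video/webm]
```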
