Add registry functions that return raw schema information #21

Open
wants to merge 3 commits into
base: main
140 changes: 79 additions & 61 deletions src/pulumi/registry.ts
@@ -5,7 +5,7 @@ import { execSync } from 'node:child_process';

// Define schema types
type ResourceProperty = {
type: string;
type?: string;
description: string;
};

@@ -17,9 +17,16 @@ type ResourceSchema = {
requiredInputs: string[];
};

type TypeSchema = {
description: string;
properties: Record<string, ResourceProperty>;
required: string[];
};

type Schema = {
name: string;
resources: Record<string, ResourceSchema>;
types: Record<string, TypeSchema>;
};

type GetResourceArgs = {
@@ -29,6 +36,12 @@ type GetResourceArgs = {
version?: string;
};

type GetTypeSchemaArgs = {
provider: string;
ref: string;
version?: string;
};

type ListResourcesArgs = {
provider: string;
module?: string;
@@ -51,8 +64,50 @@ export const registryCommands = function (cacheDir: string) {
}

return {
'get-type': {
Author

This function is fairly low-level because the caller must already know the specific $ref. The idea is that the LLM would have already called get-resource, whose output contains the $ref.

Another option would be to add a separate function, or to make this one more generic so that it takes:

{
  provider: string,
  module: string,
  propertyName?: string,
  ref?: string,
}

That way it could try to find the type even if it only knew the property name (which it would know from looking at the code).
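
A minimal sketch of what that more generic lookup could look like, assuming the schema has already been loaded into memory. The `LookupSchema` shape, the `findType` name, and the sample tokens are hypothetical and only illustrate the idea; they are not part of this PR.

```typescript
// Hypothetical, trimmed-down schema shapes used only for this sketch.
type LookupProperty = { type?: string; $ref?: string; description?: string };
type LookupType = { description?: string; properties?: Record<string, LookupProperty> };
type LookupSchema = {
  resources: Record<string, { inputProperties?: Record<string, LookupProperty> }>;
  types: Record<string, LookupType>;
};
type FindTypeArgs = { provider: string; module: string; propertyName?: string; ref?: string };

const LOOKUP_TYPES_PREFIX = '#/types/';

function findType(schema: LookupSchema, args: FindTypeArgs): LookupType | undefined {
  // An explicit ref wins: it is an exact key into `types`.
  if (args.ref) {
    return schema.types[args.ref];
  }
  if (!args.propertyName) {
    return undefined;
  }
  // Otherwise scan the module's resources for an input property with that name.
  const modulePrefix = `${args.provider}:${args.module}`;
  for (const [token, resource] of Object.entries(schema.resources)) {
    const inModule = token.startsWith(`${modulePrefix}/`) || token.startsWith(`${modulePrefix}:`);
    if (!inModule) {
      continue;
    }
    const ref = resource.inputProperties?.[args.propertyName]?.$ref;
    if (ref && ref.startsWith(LOOKUP_TYPES_PREFIX)) {
      return schema.types[ref.slice(LOOKUP_TYPES_PREFIX.length)];
    }
  }
  return undefined;
}
```

Note that this returns the first match in the module, so a property name shared by several resources would be ambiguous; that is one reason the exact `ref` remains the more reliable input.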

Member

Should this tool take the same arguments as get-resource (provider, version, module, name)? If not, what is the difference?

Author

It should definitely have version; I'll add that.

For the others, that's what I was asking about. If get-resource returns something like:

"inputProperties": {
    "nestedProp": {
        "type": "object",
        "$ref": "#/types/provider:module/SomeType:SomeType"
    }
}

Then the LLM would have the exact type token to call get-type.
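
As a rough illustration of that hand-off, the sketch below turns a `$ref` value into the token that get-type would receive. The `refToTypeToken` helper is hypothetical, and it assumes that local type references always start with `#/types/`, as in the snippet above.

```typescript
// Hypothetical helper: derive the get-type token from a schema "$ref".
const TYPES_PREFIX = '#/types/';

function refToTypeToken(ref: string): string | undefined {
  // References without the local prefix do not point into this
  // provider's `types` map, so there is nothing to look up.
  if (!ref.startsWith(TYPES_PREFIX)) {
    return undefined;
  }
  return ref.slice(TYPES_PREFIX.length);
}

// The token can then be passed as the `ref` argument of get-type.
const typeToken = refToTypeToken('#/types/provider:module/SomeType:SomeType');
console.log(typeToken);
```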

The question is whether there is a use case for calling get-type on its own. For example, if the LLM were evaluating some code:

new aws.s3.Bucket('my-bucket', {
  acl: 'read',
});

It could potentially call just get-type with:

{
  "provider": "aws",
  "module": "s3",
  "name": "acl"
}

But that has very limited usefulness. The LLM would probably want to see all the properties on Bucket, so it would probably need to call get-resource anyway. Also, if the returned type had any nested types, it would need to call get-type for those as well.
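
To make the nested-type point concrete, here is a sketch of how the set of types reachable from one type token could be collected. The `NestedProp` and `NestedType` shapes and the `collectTypeTokens` name are assumptions for illustration; the sketch also assumes that a reference may sit on the property itself, on array `items`, or on map `additionalProperties`.

```typescript
// Hypothetical shapes for this sketch only.
type NestedProp = { type?: string; $ref?: string; items?: NestedProp; additionalProperties?: NestedProp };
type NestedType = { properties?: Record<string, NestedProp> };

const NESTED_PREFIX = '#/types/';

function collectTypeTokens(types: Record<string, NestedType>, root: string): string[] {
  const seen = new Set<string>();
  const pending: string[] = [root];
  while (pending.length > 0) {
    const token = pending.pop() as string;
    const definition = types[token];
    // Skip unknown tokens and anything already visited (guards against cycles).
    if (definition === undefined || seen.has(token)) {
      continue;
    }
    seen.add(token);
    for (const prop of Object.values(definition.properties ?? {})) {
      for (const candidate of [prop, prop.items, prop.additionalProperties]) {
        const ref = candidate ? candidate.$ref : undefined;
        if (ref && ref.startsWith(NESTED_PREFIX)) {
          pending.push(ref.slice(NESTED_PREFIX.length));
        }
      }
    }
  }
  return Array.from(seen);
}
```

Each token in the result corresponds to one get-type call the LLM would otherwise have to make, which gives a feel for how quickly a deeply nested type can grow the context.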

description: 'Get the JSON schema for a specific JSON schema type reference',
schema: {
provider: z
.string()
.describe(
"The cloud provider (e.g., 'aws', 'azure', 'gcp', 'random') or github.com/org/repo for Git-hosted components"
),
version: z
.string()
.optional()
.describe(
"The provider version to use (e.g., '6.0.0'). If not specified, uses the latest available version."
),
ref: z.string().describe("The type ref to query (e.g., 'aws:s3/BucketGrant:BucketGrant')")
},
handler: async (args: GetTypeSchemaArgs) => {
const schema = await getSchema(args.provider, args.version);
const typeEntry = Object.entries(schema.types).find(([key]) => key === args.ref);
if (typeEntry) {
return {
description: 'Returns information about Pulumi Registry Types',
content: [
{
type: 'text' as const,
text: JSON.stringify(typeEntry[1])
}
]
};
} else {
return {
description: 'Returns information about Pulumi Registry Types', // Consider making this more specific, e.g., "Type not found"
content: [
{
type: 'text' as const,
text: `No information found for ${args.ref}`
}
]
};
}
}
},
'get-resource': {
description: 'Get information about a specific resource from the Pulumi Registry',
description: 'Returns information about a Pulumi Registry resource',
schema: {
provider: z
.string()
@@ -92,30 +147,40 @@ export const registryCommands = function (cacheDir: string) {
});

if (resourceEntry) {
// Destructure the found entry - TS knows these are defined now
const [resourceKey, resourceData] = resourceEntry;

const schema = resourceEntry[1];
const resourceName = resourceEntry[0];
const outputProperties: Record<string, ResourceProperty> = {};
for (const [key, value] of Object.entries(schema.properties)) {
if (!(key in schema.inputProperties)) {
outputProperties[key] = value;
}
}
return {
description: 'Returns information about Pulumi Registry resources',
description: 'Returns information about a Pulumi Registry resource',
content: [
{
type: 'text' as const,
text: formatSchema(resourceKey, resourceData) // No '!' needed
text: JSON.stringify({
// for now leaving out:
// - `description`: Can be pretty large and contains all language examples (if we knew the language we could extract the specific language example)
// - `properties`: contains a lot of duplicated properties with `inputProperties` and is probably less useful
Member

@mikhailshilkov mikhailshilkov Jun 20, 2025

properties may be important for reading the right resource outputs. Should we filter the inputProperties keys out of properties and return the remaining set?

Author

Good idea!
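
A small standalone sketch of that filtering, written as a helper so it can be exercised in isolation. The `outputOnlyProperties` name is hypothetical. It uses an own-property check rather than the `in` operator, since `in` also matches inherited keys such as `toString`.

```typescript
// Hypothetical helper: keep only the properties that are not also inputs.
function outputOnlyProperties<T>(
  properties: Record<string, T>,
  inputProperties: Record<string, unknown>
): Record<string, T> {
  const outputs: Record<string, T> = {};
  for (const [key, value] of Object.entries(properties)) {
    // Own-property check: a property named "toString" or "constructor"
    // would otherwise be treated as an input of every resource.
    if (!Object.prototype.hasOwnProperty.call(inputProperties, key)) {
      outputs[key] = value;
    }
  }
  return outputs;
}

// Illustrative data only: `arn` is an output, the other two are inputs.
const outputsOnly = outputOnlyProperties(
  { arn: { type: 'string' }, bucket: { type: 'string' }, acl: { type: 'string' } },
  { bucket: {}, acl: {} }
);
console.log(Object.keys(outputsOnly));
```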

// - `required`: only needed if you return `properties`
type: resourceName,
requiredInputs: schema.requiredInputs,
inputProperties: schema.inputProperties,
outputProperties: outputProperties,
requiredOutputs: schema.required
})
}
]
};
} else {
// Handle the case where the resource was not found
const availableResources = Object.keys(schema.resources)
.map((key) => key.split(':').pop())
.filter(Boolean);

return {
description: 'Returns information about Pulumi Registry resources', // Consider making this more specific, e.g., "Resource not found"
description: 'Returns information about a Pulumi Registry resource', // Consider making this more specific, e.g., "Resource not found"
content: [
{
type: 'text' as const,
text: `No information found for ${args.resource}${args.module ? ` in module ${args.module}` : ''}. Available resources: ${availableResources.join(', ')}` // Slightly improved message
text: `No information found for ${args.resource}${args.module ? ` in module ${args.module}` : ''}. You can call list-resources to get a list of resources` // Slightly improved message
Author

Would it be better to just let the LLM decide to call list-resources? It may even have already called list-resources. I've been running into a lot of context-exceeded issues, so I've been trying to limit how much we return.
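
One possible middle ground, sketched under the assumption that a handful of close matches is cheap enough to include: cap the number of suggestions and point to list-resources for the rest. The `notFoundMessage` helper and its `maxSuggestions` parameter are hypothetical and not part of this PR.

```typescript
// Hypothetical helper: a not-found message with a bounded number of suggestions.
function notFoundMessage(resource: string, available: string[], maxSuggestions = 5): string {
  const needle = resource.toLowerCase();
  // Keep only names that contain the requested name, then cap the list
  // so the message size does not grow with the size of the provider.
  const suggestions = available
    .filter((name) => name.toLowerCase().includes(needle))
    .slice(0, maxSuggestions);
  const hint = suggestions.length > 0 ? ` Similar resources: ${suggestions.join(', ')}.` : '';
  return `No information found for ${resource}.${hint} You can call list-resources to get the full list.`;
}
```

With this shape the message length is bounded by `maxSuggestions` rather than by the number of resources in the schema, and setting it to 0 gives the behavior proposed in the diff.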

}
]
};
@@ -202,50 +267,3 @@ export const registryCommands = function (cacheDir: string) {
}
};
};

// Helper function to format schema
export function formatSchema(resourceKey: string, resourceData: ResourceSchema): string {
// Format the input properties section
const inputProperties = Object.entries(resourceData.inputProperties ?? {})
.sort(([nameA], [nameB]) => {
const isRequiredA = (resourceData.requiredInputs ?? []).includes(nameA);
const isRequiredB = (resourceData.requiredInputs ?? []).includes(nameB);
if (isRequiredA !== isRequiredB) {
return isRequiredA ? -1 : 1;
}
return nameA.localeCompare(nameB);
})
.map(([name, prop]) => {
const isRequired = (resourceData.requiredInputs ?? []).includes(name);
return `- ${name} (${prop.type}${isRequired ? ', required' : ''}): ${prop.description ?? '<no description>'}`;
})
.join('\n');

// Format the output properties section
const outputProperties = Object.entries(resourceData.properties ?? {})
.sort(([nameA], [nameB]) => {
const isRequiredA = (resourceData.required ?? []).includes(nameA);
const isRequiredB = (resourceData.required ?? []).includes(nameB);
if (isRequiredA !== isRequiredB) {
return isRequiredA ? -1 : 1;
}
return nameA.localeCompare(nameB);
})
.map(([name, prop]) => {
const isRequired = (resourceData.required ?? []).includes(name);
return `- ${name} (${prop.type}${isRequired ? ', always present' : ''}): ${prop.description ?? '<no description>'}`;
})
.join('\n');

return `
Resource: ${resourceKey}

${resourceData.description ?? '<no description>'}

Input Properties:
${inputProperties}

Output Properties:
${outputProperties}
`;
}
13 changes: 12 additions & 1 deletion test/.cache/test_schema.json
@@ -1,5 +1,16 @@
{
"name": "test",
"types": {
"test:test:TestReferenceProperty": {
"properties": {
"name": {
"type": "string",
"description": "The name of the property"
}
},
"type": "object"
}
},
"resources": {
"test:test:Test": {
"description": "A test resource for unit testing",
@@ -98,4 +109,4 @@
"requiredInputs": ["complexity"]
}
}
}
}