Interface version canonicalization #536

lann · 2025-06-25T19:34:14Z

I stuck fullversion in the import/export productions rather than interfacename because I wanted it to be clear that it wouldn't be lowered into the core name.
The version canonicalization rules are adapted from Add BuildTargets.md #378. I'm still leaning toward omitting prerelease versions but I've only thought "medium hard" about it.
Still needs binary encoding; see comment below.
Not sure how best to capture the discussion about making canonicalization mandatory pre-1.0; the "Binary Warts" section doesn't seem quite right.

lann · 2025-06-25T21:28:14Z

For the binary encoding the most straightforward option from a quick review would seem to be adding variants of importname' / exportname' along the lines of:

importname' ::= 0x00 len:<u32> in:<importname>                       => in  (if len = |in|)
              | 0x01 len:<u32> in:<importname> fullverlen:<u16> fullver:<valid semver>

I suppose if we wanted to optimize the binary a bit this extra field could contain just the part of the original version that got lopped off by canonicalization.
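As a rough illustration of that optimization, here is a minimal Python sketch that splits a full version into the canonicalized part and the remainder that was "lopped off". The function name `split_version` is hypothetical. The truncation rule (keep everything up to the first non-zero of major/minor/patch) is an assumption based on the shapes discussed in this thread, and leaving pre-release versions untouched is a placeholder, since their handling is still undecided.

```python
import re

# Simplified MAJOR.MINOR.PATCH matcher with an optional "-pre" / "+build" tail.
# This is NOT the full SemVer grammar; it is only enough for the sketch.
_CORE = re.compile(
    r"(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)((?:[-+].*)?)"
)


def split_version(full: str) -> tuple[str, str]:
    """Return (canonical, remainder) with canonical + remainder == full."""
    m = _CORE.fullmatch(full)
    if m is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {full!r}")
    major, minor, patch, rest = m.groups()
    if rest.startswith("-"):
        # Assumption: pre-release versions are not truncated at all.
        return full, ""
    if major != "0":
        canonical = major              # 1.2.3 -> 1
    elif minor != "0":
        canonical = f"0.{minor}"       # 0.1.2 -> 0.1
    else:
        canonical = f"0.0.{patch}"     # 0.0.3 -> 0.0.3
    # The remainder is what the extra binary field would need to carry.
    return canonical, full[len(canonical):]
```

For example, `split_version("1.2.3")` yields `("1", ".2.3")`, so only `.2.3` would need to be stored next to the canonical name.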

On this field width:

> fullverlen:<u16>

https://semver.org/#does-semver-have-a-size-limit-on-the-version-string

> No, but use good judgment. A 255 character version string is probably overkill, for example. Also, specific systems may impose their own limits on the size of the string.

🤷

lukewagner · 2025-06-26T21:02:44Z

@lann Thanks for starting this! For the binary encoding question: yes, taking over the 0x00 byte and using it as a discriminant is a nice coincidence we can take advantage of (and could you update the corresponding bullet in the "Warts" section at the end)?

> I suppose if we wanted to optimize the binary a bit this extra field could contain just the part of the original version that got lopped off by canonicalization.

Is there a simplicity argument to be made here? That is, is requiring the concatenation of the version and the fullversion to match <valid semver> simpler than allowing the fullversion to be any <valid semver> and then adding the additional validation requirement (which I assume we want) that the fullversion has to "match" the version? If so, that would be a second argument in favor, in addition to size.
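To make the comparison concrete, here is a small Python sketch of the two validation styles. The function names are hypothetical, and the notion of "match" in the second function (the full version must extend the canonical version at a `.`, `-`, or `+` boundary) is an assumption, since the thread does not define it. The regular expression is the one suggested on semver.org.

```python
import re

# Regular expression suggested by semver.org for SemVer 2.0.0 strings,
# used here with fullmatch() and ASCII-only digit matching.
SEMVER = re.compile(
    r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?",
    re.ASCII,
)


def check_by_concatenation(version: str, suffix: str) -> bool:
    """Option A: the stored suffix is valid iff version + suffix is a semver."""
    return SEMVER.fullmatch(version + suffix) is not None


def check_by_matching(version: str, fullversion: str) -> bool:
    """Option B: fullversion is a semver AND agrees with version (assumed rule)."""
    if SEMVER.fullmatch(fullversion) is None:
        return False
    if not fullversion.startswith(version):
        return False
    rest = fullversion[len(version):]
    # Guard against "1" spuriously matching "10.2.3".
    return rest == "" or rest[0] in ".-+"
```

Option A is a single check, while Option B needs a second rule whose exact definition would have to be specified. Neither check verifies that `version` itself is in canonical form; that would remain a separate validation rule.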

lukewagner

Looking good! A few drive-by comments:

lukewagner · 2025-06-26T20:52:20Z

design/mvp/Explainer.md

-export ::= (export <id>? "<exportname>" <sortidx> <externdesc>?)
+import      ::= (import "<importname>" <fullversion>? bind-id(<externdesc>))
+export      ::= (export <id>? "<exportname>" <fullversion>? <sortidx> <externdesc>?)
+fullversion ::= (fullversion <valid semver>)


Suggested change

fullversion ::= (fullversion <valid semver>)

fullversion ::= (fullversion "<valid semver>")

lukewagner · 2025-06-26T20:52:45Z

design/mvp/Explainer.md

@@ -294,7 +294,7 @@ sort           ::= core <core:sort>
                 | type
                 | component
                 | instance
-inlineexport   ::= (export <exportname> <sortidx>)
+inlineexport   ::= (export <exportname> <fullversion>? <sortidx>)


Suggested change

inlineexport ::= (export <exportname> <fullversion>? <sortidx>)

inlineexport ::= (export "<exportname>" <fullversion>? <sortidx>)

(pre-existing, but since we're touching this line)

lukewagner · 2025-06-26T20:52:59Z

design/mvp/Explainer.md

+importdecl    ::= (import <importname> <fullversion>? bind-id(<externdesc>))
+exportdecl    ::= (export <exportname> <fullversion>? bind-id(<externdesc>))


Suggested change

importdecl ::= (import <importname> <fullversion>? bind-id(<externdesc>))

exportdecl ::= (export <exportname> <fullversion>? bind-id(<externdesc>))

importdecl ::= (import "<importname>" <fullversion>? bind-id(<externdesc>))

exportdecl ::= (export "<exportname>" <fullversion>? bind-id(<externdesc>))

(pre-existing)

lukewagner · 2025-06-26T21:11:53Z

design/mvp/Explainer.md

@@ -2379,6 +2383,33 @@ interpreted with the same [semantics][SemVerRange]. (Mostly this
 interpretation is the usual SemVer-spec-defined ordering, but note the
 particular behavior of pre-release tags.)

+The `version` production used in `interfacename`s accepts both `valid semver`


Could you frame "canonicalized interface version" as a validation rule factored out into a new "Canonical Interface Name" section alongside and symmetric to "Name Uniqueness", and say that it is temporarily not enforced but will start issuing warnings and be enforced post-Preview-3?

lukewagner · 2025-06-26T21:13:29Z

design/mvp/Explainer.md

@@ -2283,10 +2284,13 @@ words         ::= <word>
                | <words> '-' <word>
 projection    ::= '/' <label>
 version       ::= '@' <valid semver>


Probably good to rename this to interfaceversion since it's specific to interfacename (and to be symmetric with pkgversion). (I know that'll mess up the column alignment, and fixing it bloats the diff and obscures the change; maybe leave it unaligned and fix it right before merging.)

lukewagner

(oops, meant to "comment" not approve before it's even ready to review 🙃 )

alexcrichton · 2025-06-27T14:41:51Z

For the binary encoding, here's another possible encoding:

importname' ::= 0x00 len:<u32> in:<importname>                       => in  (if len = |in|)
              | 0x01 len:<u32> in:<importname>                       => "${in.name}@N"  (if len = |in|,  in.version = N.*)
              | 0x02 len:<u32> in:<importname>                       => "${in.name}@0.N"  (if len = |in|,  in.version = 0.N.*)
              | 0x03 len:<u32> in:<importname>                       => "${in.name}@0.0.N"  (if len = |in|,  in.version = 0.0.N.*)

maybe with affordances for rc/etc.; I'm unsure. The basic idea, though, is that the actual import name would always be foo:bar/baz@0.1.2 in the binary format, but the semantic meaning (e.g. the text format) would be a subslice of that string. This codifies that the name in the binary format always carries a valid semver, and the discriminant byte says how to shorten it. The goal here would be to keep it clear what the binary format can contain without changing the meaning at the parsed layer.
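A minimal Python sketch of how a decoder might apply the discriminant byte from the encoding above. The function name `semantic_name` is hypothetical, and the sketch only handles plain MAJOR.MINOR.PATCH versions because the treatment of rc/pre-release tags is left open here.

```python
import re

# Plain MAJOR.MINOR.PATCH only; pre-release/build tags are out of scope
# for this sketch because their handling is not settled in the thread.
_PLAIN = re.compile(r"(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)")


def semantic_name(discriminant: int, full_name: str) -> str:
    """Map the name stored in the binary to the name seen after parsing."""
    if discriminant == 0x00:
        return full_name  # stored name is used as-is
    name, sep, version = full_name.partition("@")
    m = _PLAIN.fullmatch(version) if sep else None
    if m is None:
        raise ValueError(f"expected name@MAJOR.MINOR.PATCH, got {full_name!r}")
    major, minor, patch = m.groups()
    if discriminant == 0x01:          # in.version = N.*
        return f"{name}@{major}"
    if discriminant == 0x02:          # in.version = 0.N.*
        if major != "0":
            raise ValueError("0x02 requires a 0.N.* version")
        return f"{name}@0.{minor}"
    if discriminant == 0x03:          # in.version = 0.0.N.*
        if major != "0" or minor != "0":
            raise ValueError("0x03 requires a 0.0.N.* version")
        return f"{name}@0.0.{patch}"
    raise ValueError(f"unknown discriminant: {discriminant:#04x}")
```

With this shape, `semantic_name(0x02, "foo:bar/baz@0.1.2")` gives `"foo:bar/baz@0.1"`, while the bytes in the binary still spell out the full version.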

> fullverlen:<u16>
>
> https://semver.org/#does-semver-have-a-size-limit-on-the-version-string
>
> No, but use good judgment. A 255 character version string is probably overkill, for example. Also, specific systems may impose their own limits on the size of the string.

For this I'd recommend using <u32> regardless. We already limit many strings far below the theoretical 4G limit with a 32-bit length and keeping <u32> makes it more consistent with the rest of the decoding process. Otherwise when implementing a decoder you'd have to implement a specific function for decoding a 16-bit LEB which is otherwise not required when parsing WebAssembly today. Basically while I agree that >255 characters for a version is silly, I'd say that for consistency with the rest of the binary format this'd want to be <u32> if we go with this variant.
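To illustrate the consistency point, here is a sketch of the one length-decoding routine a decoder would reuse if every length stays `<u32>`. It decodes an unsigned LEB128 value limited to 32 bits and then a length-prefixed UTF-8 string. The function names are hypothetical and this is not taken from any existing decoder.

```python
def decode_u32_leb128(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode an unsigned LEB128 u32; return (value, next_offset)."""
    result = 0
    for i in range(5):  # a u32 needs at most ceil(32 / 7) = 5 bytes
        if offset + i >= len(data):
            raise ValueError("truncated LEB128 value")
        byte = data[offset + i]
        if i == 4 and byte > 0x0F:
            # The fifth byte may only contribute the top 4 bits of a u32
            # and must not have its continuation bit set.
            raise ValueError("LEB128 value does not fit in u32")
        result |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:
            return result, offset + i + 1
    raise AssertionError("unreachable: fifth byte always returns or raises")


def decode_string(data: bytes, offset: int = 0) -> tuple[str, int]:
    """Decode a len:<u32>-prefixed UTF-8 string; return (text, next_offset)."""
    length, start = decode_u32_leb128(data, offset)
    end = start + length
    if end > len(data):
        raise ValueError("string extends past end of input")
    return data[start:end].decode("utf-8"), end
```

Under this scheme a version string field would go through `decode_string` like any other name, and a size cap well below 4G could still be enforced as a separate limit.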

lann force-pushed the truncated-versions branch 2 times, most recently from 79c15f7 to 7b6bd7d · June 25, 2025 19:54

Add interface version canonicalization

2f8eda8

lann force-pushed the truncated-versions branch from 7b6bd7d to 2f8eda8 · June 25, 2025 20:46

lann changed the title ~~WIP: Truncated interface versions~~ Interface version canonicalization Jun 25, 2025

lann mentioned this pull request Jun 25, 2025

Interface version / compatibilty changes #534

Open

lukewagner approved these changes Jun 26, 2025

lukewagner reviewed Jun 26, 2025
