From be8515ba27cd8d3632db73fe6c5b4c6e4290a901 Mon Sep 17 00:00:00 2001
From: Joshua Bell
Date: Mon, 28 Apr 2025 11:36:35 -0700
Subject: [PATCH 1/4] Add notes regarding label usage, provided by i18n review

As suggested by @xfq, add additional details about operator labels,
clarifying that they are not intended to be natural language strings.
And include an advisement that use of developer-provided labels is
subject to spoofing, and implementations should sanitize them.
---
 index.bs | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/index.bs b/index.bs
index d8085b49..e659e5ec 100644
--- a/index.bs
+++ b/index.bs
@@ -701,8 +701,12 @@ A key part of the {{MLGraphBuilder}} interface are methods such as {{MLGraphBuil
 
 An [=operator=] has a label, a string which may be included in diagnostics such as [=exception=] messages. When an [=operator=] is created its [=operator/label=] is initialized in an [=implementation-defined=] manner and may include the passed {{MLOperatorOptions/label}}.
 
+Note: The label is not intended to be a natural language string. It is a language-independent identifier, analogous to a variable name or error code, like `"mul#1234"`.
+
 Note: Implementations are encouraged to use the {{MLOperatorOptions/label}} provided by developers to enhance error messages and improve debuggability, including both synchronous errors during graph construction and for errors that occur during the asynchronous {{MLGraphBuilder/build()}} method.
 
+Advisement: When displaying labels provided by developers via {{MLOperatorOptions/label}} in debugging tools, logs, or error messages, implementations should sanitize the output to prevent security risks, such as injection of malicious Unicode sequences (e.g., bidirectional control characters as described in Unicode Technical Report #36, Trojan Source attacks). For example, implementations should escape or filter control characters (e.g., U+202A to U+202E, U+2066 to U+2069) or use a safe rendering mechanism to neutralize potential spoofing.
+
 ISSUE(778): Consider adding a mechanism for reporting errors during {{MLContext/dispatch()}}.
 
 At inference time, every {{MLOperand}} will be bound to a tensor (the actual data), which are essentially multidimensional arrays. The representation of the tensors is implementation dependent, but it typically includes the array data stored in some buffer (memory) and some metadata describing the array data (such as its shape).
@@ -10304,6 +10308,8 @@ Thanks to Jiewei Qian for Chromium implementation review and feedback.
 
 Thanks to Dwayne Robinson, Joshua Lochner and Wanming Lin for their work investigating and providing recommendation for transformer support. Additional thanks to Dwayne and Wanming for providing reviews of operator conformance and web-platform-tests implementation.
 
 Thanks to Feng Dai for his continuous contributions that keep web-platform-tests evolving alongside the specification.
+
+Thanks to Fuqiao Xue and the W3C Internationalization Activity for reviews and suggestions.
 {
   "Models": {

From a3f8ab2eff4650137cd6cd74ebdeba6c61a656da Mon Sep 17 00:00:00 2001
From: Joshua Bell 
Date: Mon, 5 May 2025 09:27:48 -0700
Subject: [PATCH 2/4] relocate notes

---
 index.bs | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/index.bs b/index.bs
index 65815842..ce2f607a 100644
--- a/index.bs
+++ b/index.bs
@@ -701,12 +701,6 @@ A key part of the {{MLGraphBuilder}} interface are methods such as {{MLGraphBuil
 
 An [=operator=] has a label, a string which may be included in diagnostics such as [=exception=] messages. When an [=operator=] is created its [=operator/label=] is initialized in an [=implementation-defined=] manner and may include the passed {{MLOperatorOptions/label}}.
 
-Note: The label is not intended to be a natural language string. It is a language-independent identifier, analogous to a variable name or error code, like `"mul#1234"`.
-
-Note: Implementations are encouraged to use the {{MLOperatorOptions/label}} provided by developers to enhance error messages and improve debuggability, including both synchronous errors during graph construction and for errors that occur during the asynchronous {{MLGraphBuilder/build()}} method.
-
-Advisement: When displaying labels provided by developers via {{MLOperatorOptions/label}} in debugging tools, logs, or error messages, implementations should sanitize the output to prevent security risks, such as injection of malicious Unicode sequences (e.g., bidirectional control characters as described in Unicode Technical Report #36, Trojan Source attacks). For example, implementations should escape or filter control characters (e.g., U+202A to U+202E, U+2066 to U+2069) or use a safe rendering mechanism to neutralize potential spoofing.
-
 ISSUE(778): Consider adding a mechanism for reporting errors during {{MLContext/dispatch()}}.
 
 At inference time, every {{MLOperand}} will be bound to a tensor (the actual data), which are essentially multidimensional arrays. The representation of the tensors is implementation dependent, but it typically includes the array data stored in some buffer (memory) and some metadata describing the array data (such as its shape).
@@ -1579,6 +1573,12 @@ Implementations may impose a more restricted lower bound and/or upper bound on t
         Optionally provided when an [=operator=] is created using {{MLGraphBuilder}} methods that create {{MLOperand}}s. The implementation may use this value to initialize the [=operator=]'s [=operator/label=].
 
 
+Note: The label is not intended to be a natural language string. It is a language-independent identifier, analogous to a variable name or error code, like `"mul#1234"`.
+
+Note: Implementations are encouraged to use the {{MLOperatorOptions/label}} provided by developers to enhance error messages and improve debuggability, including both synchronous errors during graph construction and for errors that occur during the asynchronous {{MLGraphBuilder/build()}} method.
+
+Advisement: When displaying labels provided by developers via {{MLOperatorOptions/label}} in debugging tools, logs, or error messages, implementations should sanitize the output to prevent security risks, such as injection of malicious Unicode sequences (e.g., bidirectional control characters as described in Unicode Technical Report #36, Trojan Source attacks). For example, implementations should escape or filter control characters (e.g., U+202A to U+202E, U+2066 to U+2069) or use a safe rendering mechanism to neutralize potential spoofing.
+
 ### Creating an {{MLOperand}} ### {#api-mloperand-create}
 The {{MLOperand}} objects are created by the methods of {{MLGraphBuilder}}, internally using the following algorithms.
 

From de96aff4b1371c3ae547251fd824ad0ceb1a6cf9 Mon Sep 17 00:00:00 2001
From: Joshua Bell 
Date: Mon, 12 May 2025 14:18:54 -0700
Subject: [PATCH 3/4] Add links

---
 index.bs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/index.bs b/index.bs
index a2e7148a..a235c8c7 100644
--- a/index.bs
+++ b/index.bs
@@ -1577,7 +1577,7 @@ Note: The label is not intended to be a natural language string. It is a languag
 
 Note: Implementations are encouraged to use the {{MLOperatorOptions/label}} provided by developers to enhance error messages and improve debuggability, including both synchronous errors during graph construction and for errors that occur during the asynchronous {{MLGraphBuilder/build()}} method.
 
-Advisement: When displaying labels provided by developers via {{MLOperatorOptions/label}} in debugging tools, logs, or error messages, implementations should sanitize the output to prevent security risks, such as injection of malicious Unicode sequences (e.g., bidirectional control characters as described in Unicode Technical Report #36, Trojan Source attacks). For example, implementations should escape or filter control characters (e.g., U+202A to U+202E, U+2066 to U+2069) or use a safe rendering mechanism to neutralize potential spoofing.
+Advisement: When displaying labels provided by developers via {{MLOperatorOptions/label}} in debugging tools, logs, or error messages, implementations should sanitize the output to prevent security risks, such as injection of malicious Unicode sequences (e.g. bidirectional control characters as described in Unicode Technical Report #36, Trojan Source attacks and other concerns described in Unicode Technical Standard #55). For example, implementations should escape or filter control characters (e.g., U+202A to U+202E, U+2066 to U+2069) or use a safe rendering mechanism to neutralize potential spoofing.
 
 ### Creating an {{MLOperand}} ### {#api-mloperand-create}
 The {{MLOperand}} objects are created by the methods of {{MLGraphBuilder}}, internally using the following algorithms.

From 2fff2980ca3f83a1667e0fa5696b9d1d9c5b716e Mon Sep 17 00:00:00 2001
From: Joshua Bell 
Date: Tue, 13 May 2025 06:46:33 -0700
Subject: [PATCH 4/4] Improve links

Co-authored-by: Anssi Kostiainen 
---
 index.bs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/index.bs b/index.bs
index a235c8c7..dd739079 100644
--- a/index.bs
+++ b/index.bs
@@ -1577,7 +1577,7 @@ Note: The label is not intended to be a natural language string. It is a languag
 
 Note: Implementations are encouraged to use the {{MLOperatorOptions/label}} provided by developers to enhance error messages and improve debuggability, including both synchronous errors during graph construction and for errors that occur during the asynchronous {{MLGraphBuilder/build()}} method.
 
-Advisement: When displaying labels provided by developers via {{MLOperatorOptions/label}} in debugging tools, logs, or error messages, implementations should sanitize the output to prevent security risks, such as injection of malicious Unicode sequences (e.g. bidirectional control characters as described in Unicode Technical Report #36, Trojan Source attacks and other concerns described in Unicode Technical Standard #55). For example, implementations should escape or filter control characters (e.g., U+202A to U+202E, U+2066 to U+2069) or use a safe rendering mechanism to neutralize potential spoofing.
+Advisement: When displaying labels provided by developers via {{MLOperatorOptions/label}} in debugging tools, logs, or error messages, implementations should sanitize the output to prevent security risks, such as injection of malicious Unicode sequences (e.g. Bidirectional Text Spoofing [[UTR36]], Source Code Spoofing [[UTS55]] and other concerns). For example, implementations should escape or filter control characters (e.g., U+202A to U+202E, U+2066 to U+2069) or use a safe rendering mechanism to neutralize potential spoofing.
 
 ### Creating an {{MLOperand}} ### {#api-mloperand-create}
 The {{MLOperand}} objects are created by the methods of {{MLGraphBuilder}}, internally using the following algorithms.
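
Reviewer note: the advisement above suggests escaping or filtering the bidirectional control characters U+202A to U+202E and U+2066 to U+2069 before a developer-provided label is displayed. A minimal sketch of the "escape" option might look like the following. The helper name `escapeLabelForDisplay` and the `\u{...}` output format are assumptions made for illustration, not part of the specification or of any implementation, and a real implementation may need to cover further control characters or rely on a safe rendering mechanism instead.

```typescript
// Hypothetical helper, not part of the WebNN specification.
// Matches only the bidirectional control characters named in the
// advisement: U+202A to U+202E and U+2066 to U+2069.
const BIDI_CONTROLS = /[\u202A-\u202E\u2066-\u2069]/g;

// Replaces each matched character with a visible \u{XXXX} escape so it
// can no longer reorder the surrounding text in a log or error message.
function escapeLabelForDisplay(label: string): string {
  return label.replace(BIDI_CONTROLS, (ch: string): string => {
    const hex = ch.charCodeAt(0).toString(16).toUpperCase();
    return `\\u{${hex}}`;
  });
}

// Usage: a label that carries a right-to-left override (U+202E).
const unsafeLabel = "mul\u202E#1234";
console.log(escapeLabelForDisplay(unsafeLabel)); // prints mul\u{202E}#1234
```

Escaping rather than silently dropping the characters keeps the diagnostic faithful to what the developer passed, which may help when debugging a label that was corrupted upstream.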