
Allow 0 size dimensions (dimensions containing a 0 in the list of sizes, not a rank of 0 which is valid) #391

Open · huningxin opened this issue May 23, 2023 · 21 comments

@huningxin
Contributor

In a Chromium CL review, @fdwr mentioned (thanks, Dwayne!):

0 size dimensions really should just be treated as nops. e.g. memcpy(dest, src, 0) is valid and does nothing, and adding two empty tensors just returns an empty tensor. There are legitimate cases within a graph where a tensor may be temporarily sliced down to emptiness and then reconcatenated later with other data (I've come across at least one ONNX model that does this).

From native ML API perspective, Dwayne also mentioned

Now, there are certain backends that are not prepared for empty dimensions (e.g. DML doesn't accept them and rejects the operator creation call), and those operators will need to be skipped inside the backend, but the front-end model builder imo should treat them validly and return 0 elements.

We may want to investigate more native ML APIs to understand the status of their support.

I am opening this issue to start tracking, e.g. adding TODO in the implementation. @fdwr, feel free to share more details. Thanks!

@fdwr
Collaborator

fdwr commented May 23, 2023

References

All the major Python ML APIs handle them robustly, and they are not considered degenerate:

NumPy

import numpy

x = numpy.ones(shape=(2,0,2), dtype=numpy.float32)
y = numpy.add(x, x)
print("NumPy:")
print("value:", y)
print("shape:", y.shape)

# Prints:
# value: []
# shape: (2, 0, 2)

TensorFlow

import tensorflow as tf

x = tf.ones(shape=(2,0,2), dtype=tf.float32)
y = tf.add(x, x)
print("TensorFlow:")
print("value:", y)
print("shape:", y.shape)

# Prints:
# value: tf.Tensor([], shape=(2, 0, 2), dtype=float32)
# shape: (2, 0, 2)

PyTorch

import torch

x = torch.ones(size=(2,0,2), dtype=torch.float)
y = torch.add(x, x)
print("PyTorch:")
print("value:", y)
print("shape:", y.shape)

# Prints:
# value: tensor([], size=(2, 0, 2))
# shape: torch.Size([2, 0, 2])

ONNX / ONNX Runtime

import onnx

# Empty tensor (0 elements) via vals=[].
x = onnx.helper.make_tensor(
    name="value", data_type=onnx.TensorProto.FLOAT, dims=[2,0,2], vals=[]
)
print(x)

# Prints:
# dims: 2
# dims: 0
# dims: 2
# data_type: 1
# name: "value"

In ONNX Runtime, these cases are handled as no-ops, either directly by the EP backend (if it handles them gracefully) or by the lower-level code just before the backend API call. For example, DirectML currently rejects 0's in the dimensions, so the EP skips operator creation while still leaving the overall graph connectivity intact.
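To make that pattern concrete, here is a minimal Python sketch (the node and backend objects are hypothetical stand-ins, not ORT's actual internals):

def lower_node(node, backend):
    # Hypothetical bypass: if any input or output of a node is empty
    # (contains a 0 dimension), skip creating the backend operator but
    # keep the node in the graph so connectivity stays intact.
    def is_empty(tensor):
        return 0 in tensor.shape

    if any(is_empty(t) for t in list(node.inputs) + list(node.outputs)):
        return None  # no backend op; outputs remain empty placeholders
    return backend.create_operator(node)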

XNNPack

Allows them (see Bin Miao's test code below).

SafeTensors

The SafeTensors file format (commonly used with Stable Diffusion models for custom weights) explicitly allows 0D scalars and 0-size tensors: "Empty tensors (tensors with 1 dimension being 0) are allowed" and "0-rank Tensors (tensors with shape []) are allowed, they are merely a scalar".
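For example, a round-trip sketch (assuming the safetensors Python package's numpy helpers; the file name is arbitrary):

import numpy as np
from safetensors.numpy import save_file, load_file

save_file(
    {
        "empty": np.zeros((2, 0, 2), dtype=np.float32),  # 0-size tensor
        "scalar": np.array(42.0, dtype=np.float32),      # 0-rank tensor
    },
    "zero_size.safetensors",
)
loaded = load_file("zero_size.safetensors")
print(loaded["empty"].shape)   # expected: (2, 0, 2)
print(loaded["scalar"].shape)  # expected: ()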

CoreML / MPS / BNNS

Unclear; support is not evident from the documentation.

DirectML

Disallows 0 for DML_BUFFER_TENSOR_DESC::Sizes. The backend must skip the operation.

@huningxin
Contributor Author

I'm unsure what XNNPack would do if you tried to add two empty tensors (needs research).

@miaobin would volunteer to help investigate XNNPACK's support. Thanks!

@miaobin

miaobin commented Jul 1, 2023

I'm unsure what XNNPack would do if you tried to add two empty tensors (needs research).

@miaobin would volunteer to help investigate XNNPACK's support. Thanks!

After deleting and modifying the errant validation statements in ml_graph_builder.cc and graph_validation_utils.cc, I verified that XNNPack supports both 0D scalars and 0-size tensors with the following two test cases:

Test for 0D scalars:

  {
    auto* input1 =
        BuildInput(builder, "input1", {}, V8MLOperandType::Enum::kFloat32,
                   scope.GetExceptionState());
    EXPECT_NE(input1, nullptr);
    EXPECT_EQ(scope.GetExceptionState().CodeAs<DOMExceptionCode>(),
              DOMExceptionCode::kNoError);
    EXPECT_EQ(input1->Kind(), MLOperand::OperandKind::kInput);
    EXPECT_EQ(input1->Type(), V8MLOperandType::Enum::kFloat32);
    EXPECT_EQ(input1->Dimensions(), Vector<uint32_t>({}));
    EXPECT_EQ(input1->Name(), "input1");

    auto* input2 =
        BuildInput(builder, "input2", {}, V8MLOperandType::Enum::kFloat32,
                   scope.GetExceptionState());
    EXPECT_NE(input2, nullptr);
    EXPECT_EQ(scope.GetExceptionState().CodeAs<DOMExceptionCode>(),
              DOMExceptionCode::kNoError);
    EXPECT_EQ(input2->Kind(), MLOperand::OperandKind::kInput);
    EXPECT_EQ(input2->Type(), V8MLOperandType::Enum::kFloat32);
    EXPECT_EQ(input2->Dimensions(), Vector<uint32_t>({}));
    EXPECT_EQ(input2->Name(), "input2");

    auto* output_operand = builder->add(input1, input2, scope.GetExceptionState());
    EXPECT_NE(output_operand, nullptr);
    EXPECT_EQ(scope.GetExceptionState().CodeAs<DOMExceptionCode>(),
              DOMExceptionCode::kNoError);
    EXPECT_EQ(output_operand->Kind(), MLOperand::OperandKind::kOutput);
    EXPECT_EQ(output_operand->Type(), V8MLOperandType::Enum::kFloat32);
    EXPECT_EQ(output_operand->Dimensions(), Vector<uint32_t>({}));

    auto [graph, build_exception] =
        BuildGraph(scope, builder, {{"output", output_operand}});
    EXPECT_NE(graph, nullptr);

    // Compute the graph.
    MLNamedArrayBufferViews inputs(
        {{"input1", CreateArrayBufferViewForOperand<float>(input1, {42.0})},
         {"input2", CreateArrayBufferViewForOperand<float>(input2, {42.0})}});
    MLNamedArrayBufferViews outputs(
        {{"output", CreateArrayBufferViewForOperand(output_operand)}});
    auto* compute_exception = ComputeGraph(scope, graph, inputs, outputs);
    EXPECT_EQ(compute_exception, nullptr);
    auto results = GetArrayBufferViewValues<float>(outputs[0].second);
    Vector<float> r{84.0};
    EXPECT_EQ(results, r);
  }

Test for 0-size tensors:

  {
    auto* input1 =
        BuildInput(builder, "input1", {2, 0, 2}, V8MLOperandType::Enum::kFloat32,
                   scope.GetExceptionState());
    EXPECT_NE(input1, nullptr);
    EXPECT_EQ(scope.GetExceptionState().CodeAs<DOMExceptionCode>(),
              DOMExceptionCode::kNoError);
    EXPECT_EQ(input1->Kind(), MLOperand::OperandKind::kInput);
    EXPECT_EQ(input1->Type(), V8MLOperandType::Enum::kFloat32);
    EXPECT_EQ(input1->Dimensions(), Vector<uint32_t>({2, 0, 2}));
    EXPECT_EQ(input1->Name(), "input1");

    auto* input2 =
        BuildInput(builder, "input2", {2, 0, 2}, V8MLOperandType::Enum::kFloat32,
                   scope.GetExceptionState());
    EXPECT_NE(input2, nullptr);
    EXPECT_EQ(scope.GetExceptionState().CodeAs<DOMExceptionCode>(),
              DOMExceptionCode::kNoError);
    EXPECT_EQ(input2->Kind(), MLOperand::OperandKind::kInput);
    EXPECT_EQ(input2->Type(), V8MLOperandType::Enum::kFloat32);
    EXPECT_EQ(input2->Dimensions(), Vector<uint32_t>({2, 0, 2}));
    EXPECT_EQ(input2->Name(), "input2");

    auto* output_operand = builder->add(input1, input2, scope.GetExceptionState());
    EXPECT_NE(output_operand, nullptr);
    EXPECT_EQ(scope.GetExceptionState().CodeAs<DOMExceptionCode>(),
              DOMExceptionCode::kNoError);
    EXPECT_EQ(output_operand->Kind(), MLOperand::OperandKind::kOutput);
    EXPECT_EQ(output_operand->Type(), V8MLOperandType::Enum::kFloat32);
    EXPECT_EQ(output_operand->Dimensions(), Vector<uint32_t>({2, 0, 2}));

    auto [graph, build_exception] =
        BuildGraph(scope, builder, {{"output", output_operand}});
    EXPECT_NE(graph, nullptr);

    // Compute the graph.
    MLNamedArrayBufferViews inputs(
        {{"input1", CreateArrayBufferViewForOperand<float>(input1, {})},
         {"input2", CreateArrayBufferViewForOperand<float>(input2, {})}});
    MLNamedArrayBufferViews outputs(
        {{"output", CreateArrayBufferViewForOperand(output_operand)}});
    auto* compute_exception = ComputeGraph(scope, graph, inputs, outputs);
    EXPECT_EQ(compute_exception, nullptr);
    auto results = GetArrayBufferViewValues<float>(outputs[0].second);
    Vector<float> r{};
    EXPECT_EQ(results, r);
  }

Both tests passed.

@fdwr
Collaborator

fdwr commented Jul 1, 2023

miaobin: Great - thanks for investigating and adding the test cases. It will be more interesting for the DirectML backend because the current API rejects zero-size tensors, and even if we were to update the API to accept them (and add test cases for all 100+ operators...), the older version would still be on the operating system. So we'll have to do the same thing as was done in ONNX Runtime, where operator creation is bypassed for such operators (the node is left null as a placeholder and is not added to the graph later).

@fdwr
Collaborator

fdwr commented Sep 14, 2023

Evidently LLaMA is another model that can encounter legal 0 size tensors during concat.

@huningxin
Contributor Author

@fdwr

There are legitimate cases within a graph where a tensor may be temporarily sliced down to emptiness and then reconcatenated later with other data (I've come across at least one ONNX model that does this).

WebNN's slice requires "the size must not be 0". Would this prevent the ONNX model you mentioned from slicing a tensor down to emptiness?

@fdwr
Collaborator

fdwr commented Nov 28, 2023

@fdwr

There are legitimate cases within a graph where a tensor may be temporarily sliced down to emptiness and then reconcatenated later with other data (I've come across at least one ONNX model that does this).

WebNN's slice requires "the size must not be 0". Would this prevent the ONNX model you mentioned from slicing a tensor down to emptiness?

@huningxin: 🤔 It could, as TF and ONNX support 0-size slices (see below). Granted, it's unlikely a TF or ONNX model would typically contain a 0-slice window (ends - starts = 0), but it could occur indirectly as a result of a model-generation process manipulating some other variable:

TF

import tensorflow as tf

values = tf.constant([0, 1, 2, 3, 4, 5], dtype=tf.uint8)
result = tf.slice(values, [1], [0])
print("value:", result)
print("shape:", result.shape)

ONNX


ir_version: 4
producer_name: "OnnxConformanceTest"
graph {
  node {
    input: "data"
    output: "output"
    op_type: "Slice"
    attribute {
      name: "axes"
      ints: 0
      type: INTS
    }
    attribute {
      name: "starts"
      ints: 1
      type: INTS
    }
    attribute {
      name: "ends"
      ints: 1
      type: INTS
    }
    domain: ""
  }
  name: "Slice_1d_zero_size"
  input {
    name: "data"
    type {
      tensor_type {
        elem_type: 1
        shape {
          dim {
            dim_value: 6
          }
        }
      }
    }
  }
  output {
    name: "output"
    type {
      tensor_type {
        elem_type: 1
        shape {
          dim {
            dim_value: 0
          }
        }
      }
    }
  }
}
opset_import {
  domain: ""
  version: 1
}
opset_import {
  domain: ""
  version: 7
}

@huningxin
Contributor Author

According to my testing, the XNNPACK concat (xnn_define_concatenate2/3/4) and split (xnn_define_even_split2/3/4) operators report an invalid-parameter error (xnn_status_invalid_parameter) when an input has a 0-size dimension.

@fdwr
Collaborator

fdwr commented Nov 29, 2023

Bin Miao showed that XNNPack's add supports empty tensors fine, and so if it fails on concat, then that's a bug in XNNPack. Shall I open a GitHub issue, or do you want to? It doesn't matter either way for the WebNN EP though, because you just skip passing that input tensor to concat, the same as the ORT DML EP. So if there were 3 inputs (a=[2,3], b=[2,0], c=[2,4]), then only pass the nonzero ones to the XNNPack call (inputs = [a, c]).
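A minimal NumPy sketch of that skip-empty-inputs pattern (illustrative only - NumPy's own concatenate already accepts empty inputs, so a real wrapper would call the backend, e.g. XNNPack, instead):

import numpy as np

def concat_skipping_empty(inputs, axis):
    # Pass only non-empty tensors to the backend concat call; empty
    # inputs contribute no elements along the concatenation axis.
    non_empty = [t for t in inputs if t.size != 0]
    if not non_empty:
        # All inputs empty: the result is empty, with the concat axis summed.
        shape = list(inputs[0].shape)
        shape[axis] = sum(t.shape[axis] for t in inputs)
        return np.zeros(shape, dtype=inputs[0].dtype)
    return np.concatenate(non_empty, axis=axis)

a = np.ones((2, 3), dtype=np.float32)
b = np.ones((2, 0), dtype=np.float32)
c = np.ones((2, 4), dtype=np.float32)
print(concat_skipping_empty([a, b, c], axis=1).shape)  # (2, 7)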

@huningxin
Contributor Author

Shall I open a GitHub issue, or do you want to?

Opened: google/XNNPACK#5807

It doesn't matter either way for the WebNN EP though

If frameworks can handle that, it would help simplify WebNN implementation.

@sushraja-msft

sushraja-msft commented Mar 18, 2024

Evidently LLaMA is another model that can encounter legal 0 size tensors during concat.

Another case we should consider is WebNN input operands that have 0-size dimensions; this is not allowed today.
However, for TinyLlama the first round of next-token generation requires representing the past key/value tensor as a tensor of dimensions [1, 4, 0, 64]. The graph then takes the .shape() of that tensor and performs operations on it to determine the sizes of other tensors it creates via generateConstantOfShape. (A sketch of this pattern follows.)
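For concreteness, a NumPy sketch of that decode pattern (only the [1, 4, 0, 64] shape comes from the comment above; the rest is illustrative):

import numpy as np

past_kv = np.zeros((1, 4, 0, 64), dtype=np.float32)  # empty cache on step 0
for step in range(3):
    new_kv = np.ones((1, 4, 1, 64), dtype=np.float32)  # key/value for one token
    past_kv = np.concatenate([past_kv, new_kv], axis=2)
print(past_kv.shape)  # (1, 4, 3, 64)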

@fdwr
Collaborator

fdwr commented Mar 18, 2024

And from @guschmue today, it sounds like this affects YOLOv9 too.

@huningxin
Contributor Author

@sushraja-msft

However, for TinyLama the first round of next token generation requires representing the past key value tensor as a tensor of dimension [1,4,0,64] the graph then takes the .shape() of that tensor and performs operations on it to determine the size of other tensors it creates via generateConstantOfShape.

Would this mean the shape of the key/value tensors keeps changing for each round of inference? WebNN only supports static shapes, so this may require re-compiling the WebNN graph for each round. We met a similar issue with Whisper model inference. A static key/value cache seems to be useful: huggingface/transformers#27931 (sketched below).
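A NumPy sketch of the static-cache idea (assumed shapes; preallocating at a maximum sequence length keeps every step's tensor shapes fixed):

import numpy as np

MAX_SEQ = 16
cache = np.zeros((1, 4, MAX_SEQ, 64), dtype=np.float32)  # fixed-shape cache
seq_len = 0
for step in range(3):
    new_kv = np.ones((1, 4, 1, 64), dtype=np.float32)
    cache[:, :, seq_len:seq_len + 1, :] = new_kv  # write in place
    seq_len += 1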

inexorabletash added a commit to inexorabletash/webnn that referenced this issue May 20, 2024
Noticed during a review of the Chromium prototype. These are all
pretty obvious except for slice() where there is subtlety for 0-size
dimensions. I added an issue linking to webmachinelearning#391 since the steps will need
to be revised depending on how that issue is resolved.
fdwr pushed a commit that referenced this issue May 23, 2024
* Add missing validation for pad(), slice(), and split()

Noticed during a review of the Chromium prototype. These are all
pretty obvious except for slice() where there is subtlety for 0-size
dimensions. I added an issue linking to #391 since the steps will need
to be revised depending on how that issue is resolved.

* Add another note for split()
@reillyeon
Contributor

@fdwr, you mentioned an ONNX model which depends on this. Can you elaborate on what this is used for in the model?

@fdwr
Collaborator

fdwr commented May 24, 2024

@fdwr, you mentioned an ONNX model which depends on this. Can you elaborate on what this is used for in the model?

@reillyeon It will take some history digging for full context (like which operators in the model hit the issue). Two affected operators I recall were concatenation and slice. ⌛

@bbernhar

@fdwr

If I permit MLBuffer to exist with a 0-size dimension, what does it mean for DML to execute an IDMLCommandRecorder::RecordDispatch using a DML_BUFFER_BINDING::SizeInBytes equal to 0? And can this binding be NULL or left unbound?

@fdwr
Collaborator

fdwr commented Aug 19, 2024

@fdwr, you mentioned an ONNX model which depends on this. Can you elaborate on what this is used for in the model?

@reillyeon It will take some history digging for full context (like which operators in the model hit the issue). Two affected operators I recall were concatenation and slice. ⌛

I know it hit a few more models, but my email search is just turning up RCNN models (like MaskRCNN) with operators {Cast, Xor, Unsqueeze, Concat, Scatter with empty indices, which becomes identity}, plus this ORT CUDA CR microsoft/onnxruntime#2337 (but I didn't see the context of impacted models for the CUDA EP).

If I permit MLBuffer to exist with 0-dim, what does it mean for DML to execute a IDMLCommandRecorder::RecordDispatch using a DML_BUFFER_BINDING::SizeInBytes equal to 0 and can this binding be NULL or left unbound?

@bbernhar Currently the DML API rejects empty tensors anyway, but I've been thinking of relaxing that (we see that it actually "just works" for a lot of operators when the validation is relaxed), and I think we'd still need the binding even for emptiness (so not unbound).

@bbernhar

@fdwr

it actually "just works" for a lot of operators when the validation is relaxed

So if we create an "empty binding" by giving IDMLCommandRecorder::RecordDispatch a 4-byte dummy buffer with a DML_BUFFER_BINDING::SizeInBytes of zero, the DML API will NOT reject it?

@fdwr
Collaborator

fdwr commented Aug 20, 2024

So if we create an "empty binding" by giving IDMLCommandRecorder::RecordDispatch a 4-byte dummy buffer with a DML_BUFFER_BINDING::SizeInBytes of zero, the DML API will NOT reject it?

@bbernhar I don't know what would happen in that case, but it's currently a moot point anyway because you cannot create an operator with empty tensors (so you wouldn't even get as far as RecordDispatch). I just don't want to back ourselves into an inoperable corner where we can't support this in the DML API later (empty tensors have also been an issue when DML is called from TensorFlow, PyTorch, and ORT). Note it probably requires a 16-byte dummy (per DML_MINIMUM_BUFFER_TENSOR_ALIGNMENT = 16).

@bbernhar

@fdwr

Currently, we can't specify MLBuffer using an operator (or "in the graph"), only as input/output to dispatch(), which basically calls nothing but RecordDispatch. We need to ensure we're not relying on undefined DML behavior - perhaps others on the DML team have thoughts on this "dummy buffer" approach? I believe DML_MINIMUM_BUFFER_TENSOR_ALIGNMENT is the offset-alignment requirement for bound buffers; DML buffer size alignment is 4B per MSDN [1]. Currently, the WebNN runtime disallows MLBuffer from being bound at a non-zero offset, so only the size requirement matters, I think.

[1] https://learn.microsoft.com/en-us/windows/ai/directml/dml-helper-functions#dmlcalcbuffertensorsize

fdwr changed the title from "Allow 0 size dimensions" to "Allow 0 size dimensions (dimensions containing a 0 in the list of sizes, not a rank of 0 which is valid)" on Aug 20, 2024
@fdwr
Collaborator

fdwr commented Aug 21, 2024

DML buffer size alignment is 4B

@bbernhar : Confirmed. Passing < 16 bytes is okay for DML_BUFFER_TENSOR_DESC::TotalTensorSizeInBytes, but it must be >= 4, or else you get: "The TotalTensorSizeInBytes of '...' for tensor '...' does not meet the minimum size required for this tensor, which is %llu bytes...".
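
So a hypothetical size helper for empty tensors would clamp to that minimum (a sketch reflecting the confirmed constraint, not an actual DML helper):

def total_tensor_size_in_bytes(sizes, element_size_in_bytes):
    count = 1
    for s in sizes:
        count *= s
    # DML requires TotalTensorSizeInBytes >= 4, even for 0 elements.
    return max(count * element_size_in_bytes, 4)

print(total_tensor_size_in_bytes([2, 0, 2], 4))  # 4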
