[js/api] introducing IO binding for tensor #16452


Merged: 22 commits merged into main from fs-eire/js-api-tensor-gpu on Aug 29, 2023

Conversation

@fs-eire (Contributor) commented on Jun 22, 2023

Description

This PR adds a few properties, methods, and factories to the `Tensor` type to support the IO-binding feature. This allows users to create tensors from GPU- or CPU-bound data without forcing a data transfer between CPU and GPU.

This change is a way to resolve #15312

Change Summary

  1. Add properties to the `Tensor` type:
    a. `location`: indicates where the data resides. Valid values are `cpu`, `cpu-pinned`, `texture`, and `gpu-buffer`.
    b. `texture`: sits alongside `data`; a readonly property of type `WebGLTexture`, available only when `location === 'texture'`.
    c. `gpuBuffer`: sits alongside `data`; a readonly property of type `GPUBuffer`, available only when `location === 'gpu-buffer'`.

  2. Add methods to the `Tensor` type (usually for dealing with inference outputs):

    • async function `getData()` lets the user download data from GPU to CPU manually.
    • function `dispose()` lets the user release GPU resources manually.
  3. Add factories for creating `Tensor` instances:
    a. `fromTexture()` creates a tensor bound to a WebGL texture.
    b. `fromGpuBuffer()` creates a tensor bound to a WebGPU `GPUBuffer`.
    c. `fromPinnedBuffer()` creates a tensor backed by a CPU-pinned buffer.

Examples:

Create tensors from a texture and pass them to an inference session as inputs:

```js
// when creating the session, specify that we prefer 'image_output:0' to be stored on GPU as a texture
const session = await InferenceSession.create('./my_model.onnx', {
  executionProviders: [ 'webgl' ],
  preferredOutputLocation: { 'image_output:0': 'texture' }
});

// ...

const myImageTexture = getTexture(); // user's function to get a texture
const myFeeds = { input0: Tensor.fromTexture(myImageTexture, { width: 224, height: 224 }) }; // shape [1, 224, 224, 4], RGBA format
const results = await session.run(myFeeds);
const myOutputTexture = results['image_output:0'].texture;
```
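
A similar flow should work for WebGPU buffers with the `webgpu` execution provider. The sketch below is illustrative only: it combines the `fromGpuBuffer()` factory, the `preferredOutputLocation` option, and the `getData()`/`dispose()` methods from the change summary; the model path, the tensor names, the `getGpuBuffer()` helper, and the `dataType`/`dims` option names are assumptions, not taken from this PR.

```js
// hypothetical model and tensor names; assumes a WebGPU-capable browser
const session = await InferenceSession.create('./my_model.onnx', {
  executionProviders: [ 'webgpu' ],
  // ask the runtime to keep 'output0' on the GPU instead of downloading it to CPU
  preferredOutputLocation: { output0: 'gpu-buffer' }
});

const myGpuBuffer = getGpuBuffer(); // user's (hypothetical) function to get a GPUBuffer
const myFeeds = {
  // 'dataType' and 'dims' are assumed option names describing the buffer's contents
  input0: Tensor.fromGpuBuffer(myGpuBuffer, { dataType: 'float32', dims: [1, 3, 224, 224] })
};
const results = await session.run(myFeeds);

const output = results.output0;
console.log(output.location);           // expected: 'gpu-buffer'
const buffer = output.gpuBuffer;        // use the data directly on the GPU, or ...
const cpuData = await output.getData(); // ... download it to CPU manually
output.dispose();                       // release the GPU resource when done
```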

@fs-eire (Contributor, Author) commented on Jun 22, 2023

A few questions still need to be figured out:

  • What is a reasonable definition for the options (second parameter) of `Tensor.fromTexture()`? Currently the definition only includes `width` and `height`; it may need more (layout/format/...?).
  • ~~Add `{ preserveGpuData: true }` to session options so that it can produce texture-bound tensors as output instead of the previous behavior (always download to CPU).~~ Now using the new `preferredOutputLocation` property in session options.
  • Add functions from wasm to allocate/free memory for CPU-pinned buffers.

@fs-eire requested a review from guschmue on June 23, 2023 at 00:35

guschmue previously approved these changes on Jul 11, 2023

@qjia7 (Contributor) left a comment:


I suppose this PR is only the tensor part. You will still need another PR to make sure the tensorFromTextureXXX's GPU resource is in the same context as the backend, so that the external resource can be recognized by the backend and used for copy/draw/dispatch.

For reference, in tfjs, creating a tensor from cpu/buffer/texture looks like this:

```ts
export function tensor<R extends Rank>(values: TensorLike|WebGLData|WebGPUData, shape?: ShapeMap[R], dtype?: DataType): Tensor<R>
```

And there are three methods for getting data from a tensor:

```js
tensor.data()      // asynchronously downloads the values
tensor.dataSync()  // synchronously downloads the values
tensor.dataToGPU() // copies the tensor's data to a new GPU resource; unlike
                   // data() and dataSync(), this prevents the data from being downloaded to CPU
```
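
For comparison, here is a minimal tfjs sketch of those three methods (the values and shape are made up, and the fields of the object returned by `dataToGPU()` depend on the active backend):

```js
import * as tf from '@tensorflow/tfjs';

// create a tensor from CPU values
const t = tf.tensor([1, 2, 3, 4], [2, 2], 'float32');

const values = await t.data();   // asynchronous download to CPU
const valuesSync = t.dataSync(); // synchronous download to CPU

// copy to a new GPU resource without downloading to CPU; on the webgl
// backend the result holds a texture, on webgpu a buffer
const gpuData = t.dataToGPU();
// ... use gpuData.texture (webgl) or gpuData.buffer (webgpu) ...
gpuData.tensorRef.dispose(); // release the GPU-resident copy when done
t.dispose();                 // release the original tensor
```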

@fs-eire (Contributor, Author) commented on Jul 24, 2023

> I suppose this PR is only the tensor part. You still need another PR to make sure the tensorFromTextureXXX's GPU resource is in the same context as the backend, so that the external resource can be recognized by the backend and used for copy/draw/dispatch.
>
> For reference, in tfjs, creating a tensor from cpu/buffer/texture looks like this: `export function tensor<R extends Rank>(values: TensorLike|WebGLData|WebGPUData, shape?: ShapeMap[R], dtype?: DataType): Tensor<R>`
>
> And there are three methods for getting data from a tensor: `tensor.data()` (asynchronously downloads the values), `tensor.dataSync()` (synchronously downloads the values), and `tensor.dataToGPU()` (copies the tensor's data to a new GPU resource without downloading it to CPU).

Unlike tfjs, ort-web always runs a whole model. ort-web users cannot run a single kernel, pause at a point in the middle of a graph, or use any graph API to construct a model graph. This offers less flexibility in return for a much simpler usage scenario. This is almost all that users do:

  • create input tensor(s)
  • call session.run()
  • get output tensor(s)

So in ort-web (and all the other ORT JavaScript libraries) there are two kinds of tensors: those created by users and those created by the runtime. Tensors created by users are used as input tensors or as pre-allocated bound outputs; tensors created by the runtime start out as a model's output tensors, although they can also be used as inputs to another model.

Tensors created by users do not "own" the underlying resources. Users are expected to use the non-internal APIs to create CPU tensors via the following constructors:

```js
new Tensor(type, data, dims?);
new Tensor(typedArrayData, dims?);
```

or to create location-specific tensors using:

```js
Tensor.fromTexture(texture, options);     // with no 'download' and 'dispose' in 'options'
Tensor.fromGpuBuffer(gpuBuffer, options); // with no 'download' and 'dispose' in 'options'
Tensor.fromPinnedBuffer(type, buffer, dims?);
```

On the other hand, tensors created by ORT as outputs should be created with 'download' and 'dispose' so that users can manually release the data.

Given the explanation above, we can identify the following scenarios:

  • Downloading GPU data to CPU for user-created tensors: NOT ALLOWED. We don't expect users to use the Tensor class to download GPU data from their own resources. If such a tensor is a model input, ORT handles the data transfer internally when a copy is required.
  • Uploading CPU data to GPU for user-created tensors: NOT ALLOWED. If such a tensor is a model input, ORT handles the data transfer internally.
  • Downloading GPU data to CPU for ORT-created tensors: via tensor.getData().
  • Uploading CPU data to GPU for ORT-created tensors: NOT ALLOWED. I assume this scenario is out of scope, as users can use the raw data to work with their canvas/image elements.

Overall, for several reasons, onnxruntime-web chose to design its Tensor class differently from the way tfjs does. This brings both pros and cons.
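
To make the two kinds of tensor concrete, here is a minimal sketch of the rules above (the session, the texture, and the output name are hypothetical):

```js
// user-created tensor: bound to the user's own texture; per the rules above,
// it cannot be downloaded or disposed through the Tensor API
const input = Tensor.fromTexture(myTexture, { width: 224, height: 224 });

// runtime-created tensor: produced by session.run() with a GPU-preferred output,
// so it carries the download/dispose capabilities
const results = await session.run({ input0: input });
const output = results.output0;      // hypothetical output name
const data = await output.getData(); // allowed: download ORT-created GPU data to CPU
output.dispose();                    // allowed: release the ORT-owned GPU resource

// NOT ALLOWED (per the scenarios above): input.getData() -- ORT handles any
// required transfer internally when the tensor is fed to session.run()
```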

guschmue previously approved these changes on Jul 25, 2023

@qjia7 (Contributor) left a comment:

LGTM with one nit. Thanks.

@fs-eire merged commit e5ca3f3 into main on Aug 29, 2023
@fs-eire deleted the fs-eire/js-api-tensor-gpu branch on August 29, 2023 at 19:58
@langhuihui commented:

Is there an ONNX example?

@langhuihui commented:

The results still return CPU data.

@langhuihui commented:

I found that preferredOutputLocation is only used in wasm/wasm-core-impl.ts, but the webgl backend only uses backend-onnxjs.ts:

```js
export * from 'onnxruntime-common';
import * as ort from 'onnxruntime-common';
export default ort;

import {registerBackend, env} from 'onnxruntime-common';
import {version} from './version';

if (!BUILD_DEFS.DISABLE_WEBGL) {
  const onnxjsBackend = require('./backend-onnxjs').onnxjsBackend;
  registerBackend('webgl', onnxjsBackend, -10);
}

if (!BUILD_DEFS.DISABLE_WASM) {
  const wasmBackend = BUILD_DEFS.DISABLE_TRAINING ? require('./backend-wasm-inference').wasmBackend :
                                                    require('./backend-wasm-training').wasmBackend;
  if (!BUILD_DEFS.DISABLE_WEBGPU) {
    registerBackend('webgpu', wasmBackend, 5);
    registerBackend('webnn', wasmBackend, 5);
  }
  registerBackend('cpu', wasmBackend, 10);
  registerBackend('wasm', wasmBackend, 10);
}

Object.defineProperty(env.versions, 'web', {value: version, enumerable: true});
```
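
Given the registration above, one quick way to see which path a session actually takes is to inspect the location of its outputs at run time. This is a sketch only; the model path, the feeds, and the output name are hypothetical:

```js
const session = await ort.InferenceSession.create('./model.onnx', {
  executionProviders: [ 'webgpu' ], // handled by the wasm backend registered above
  preferredOutputLocation: { output0: 'gpu-buffer' }
});
const results = await session.run(feeds);
// on the wasm/webgpu path (wasm-core-impl.ts) this should report 'gpu-buffer';
// a 'cpu' result would indicate the preference was not honored
console.log(results.output0.location);
```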

kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request on Mar 22, 2024
siweic0 pushed a commit to siweic0/onnxruntime-web that referenced this pull request on May 9, 2024

Successfully merging this pull request may close the following issue:

  • [Feature Request] ORT web API to use WebGL texture as model input
5 participants