
Is it possible to run this in node? #4


Closed
Madd0g opened this issue Mar 5, 2023 · 9 comments

Comments

@Madd0g

Madd0g commented Mar 5, 2023

I got this error when trying:

TypeError [ERR_WORKER_PATH]: The worker script or module filename must be an absolute path or a relative path starting with './' or '../'. Received "blob:nodedata:....

@xenova
Collaborator

xenova commented Mar 5, 2023

Hi there. Yes, this is actually a known bug in ONNX Runtime Web. (Similar problem to this: microsoft/onnxruntime#14445)

You can fix it as follows:

// 1. Fix "ReferenceError: self is not defined" bug when running directly with node
// https://github.com/microsoft/onnxruntime/issues/13072
global.self = global;

const { pipeline, env } = require('@xenova/transformers');

// 2. Disable spawning worker threads for testing.
// This is done by setting numThreads to 1.
env.onnx.wasm.numThreads = 1;

// 3. Continue as per usual:
// ...

You can see a working example in the testing script: https://github.com/xenova/transformers.js/blob/main/tests/index.js

(In that case, I also set env.remoteModels=false, for testing locally)
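Putting the settings from this thread together, a minimal setup block might look like the following; the property names follow the snippets above and the linked testing script, and may differ in later versions of the library:

```javascript
// Hedged sketch of the node testing setup described above.
global.self = global;            // work around "self is not defined"

const { env } = require('@xenova/transformers');

env.onnx.wasm.numThreads = 1;    // don't spawn worker threads
env.remoteModels = false;        // resolve models from the local file system
```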

There also seems to be a node-specific module for onnxruntime (https://www.npmjs.com/package/onnxruntime-node), but I haven't tested it.

@Madd0g
Author

Madd0g commented Mar 5, 2023

thanks, that made it work!

are you planning to integrate onnxruntime-node in the future? would be cool to be able to choose

@Madd0g Madd0g closed this as completed Mar 5, 2023
@Madd0g
Author

Madd0g commented Mar 5, 2023

(also seems like there's no caching when running this under node, so it redownloads the model every time)

@xenova
Collaborator

xenova commented Mar 5, 2023

(also seems like there's no caching when running this under node, so it redownloads the model every time)

Yes, that is correct. At the moment, caching is only implemented with the Cache Web API, which is not available for node.

I will hopefully add that functionality soon, so that it acts in a similar way to Hugging Face's system, which downloads models to your file system.

In the meantime, I suggest you just download the model and place it in the ./models/onnx/quantized folder (or another location, provided you set env.localURL)

@dkogut1996

dkogut1996 commented Mar 16, 2023

I'm getting the ReferenceError: self is not defined bug when running in TypeScript and trying to import AutoTokenizer.

I have global.self = global at the top of the file, and VS Code complains with this error:
Type 'typeof globalThis' is not assignable to type 'Window & typeof globalThis'.

Importing using import { AutoTokenizer } from '@xenova/transformers' and simply referencing the class produces the reference error.

Any ideas how this would work in ts?

@xenova
Collaborator

xenova commented Mar 16, 2023

I'm getting the ReferenceError: self is not defined bug when running in TypeScript and trying to import AutoTokenizer.

I have global.self = global at the top of the file, and VS Code complains with this error: Type 'typeof globalThis' is not assignable to type 'Window & typeof globalThis'.

Importing using import { AutoTokenizer } from '@xenova/transformers' and simply referencing the class produces the reference error.

Any ideas how this would work in ts?

I unfortunately haven't worked with TypeScript before, so I wouldn't be able to give you very good advice 😅 ... However, I do intend to one day convert the library to TS!

In the meantime, I tried asking ChatGPT. Here are some of its answers. Do take them with a grain of salt though, because I don't quite know what they mean haha 😅
[screenshots of ChatGPT's suggested TypeScript workarounds]


In the future, it will be best for the library to support both the web and node versions of onnxruntime, and to choose which to use based on the environment.

@dkogut1996

Ah ok, those didn't work but it was hilarious.

I'm actually using your tokenizer classes so that I may use onnxruntime-node without having to rewrite the tokenizer libraries for the various LLMs out there. This repo is awesome and you've been very responsive, so thanks so much for all the help!

@xenova
Collaborator

xenova commented Mar 16, 2023

Ah ok, those didn't work but it was hilarious.

I'm actually using your tokenizer classes so that I may use onnxruntime-node without having to rewrite the tokenizer libraries for the various LLMs out there. This repo is awesome and you've been very responsive, so thanks so much for all the help!

Great! 👍 No worries 😄

getting this version working out-of-the-box for node + typescript is on the TODO list :)

@Madd0g
Author

Madd0g commented Mar 16, 2023

Ah ok, those didn't work but it was hilarious.

I have this on the top of the file:

(global as any).self = global;

Actually, ChatGPT suggested better approaches, since this will not fly in a strict: true codebase.
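For what it's worth, here is a minimal sketch of the cast-based variant of this workaround in a Node + TypeScript file (assuming no DOM lib in tsconfig); whether it passes your lint rules is another matter, and codebases that forbid `any` would need a declaration-based approach instead:

```typescript
// Hedged sketch: under strict settings the plain assignment fails
// type-checking because globalThis has no `self` property without the DOM
// lib; casting through `any` keeps the runtime behaviour of the JS
// workaround while silencing the compiler.
(global as any).self = global;

// Browser-oriented code that references `self` now resolves to Node's global:
const aliasInstalled = (globalThis as any).self === globalThis;
```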
