How to pass a Tensor or data directory path to ChunkDataReader in javacpp-pytorch #1556

Open
mullerhai opened this issue Dec 26, 2024 · 7 comments

Comments

@mullerhai

Hi,
We have written some code to extend the storch framework (https://github.com/sbrunk/storch), which brings PyTorch to Scala, and we want to rewrite the dataset, data loader, and data reader to be simpler. javacpp-pytorch currently only has ChunkDataReader, but I do not know how to pass a data chunk path or a tensor Example to ChunkDataReader. Could you give me an example showing how to use it? @h @saudet. By the way, @sbrunk, if you know, please tell me. Thanks.

@saudet
Member

saudet commented Dec 26, 2024

There's some sample code here: https://github.com/bytedeco/javacpp-presets/blob/master/pytorch/samples/TestChunkData.java

@mullerhai
Author

Thanks very much. ChunkDataLoader and ChunkDataset can use ChunkDataReader, but JavaDataset also needs an org.bytedeco.javacpp.Pointer object. Should some DataReader object be passed to it as well? javacpp-pytorch currently only has ChunkDataReader, so what should we pass to JavaDataLoader and JavaDataset? I also thought of passing an InputStream, but it is not a Pointer subclass.


@saudet
Member

saudet commented Dec 27, 2024

I don't know what JavaDataset is for. @HGuillemet ?

@HGuillemet
Collaborator

JavaDataset is the abstract class to subclass for implementing stateless datasets in Java.

@HGuillemet HGuillemet removed their assignment Dec 27, 2024
@mullerhai
Author

How do I use JavaDataset? Please show me a use case. Thanks.

@HGuillemet
Collaborator

Just subclass it:

JavaDataset ds = new JavaDataset() {
  @Override public Example get(long idx) {
      // ...
  }

  @Override public SizeTOptional size() {
     // ...
  }
};

Then use it for instance with a random sampler and a random loader:

DataLoaderOptions opts = new DataLoaderOptions(2);
opts.workers().put(5);
JavaRandomDataLoader loader = new JavaRandomDataLoader(ds, new RandomSampler(ds.size().get()), opts);      

@mullerhai
Author

Thanks, but I do not see how to pass a data directory path or a tensor parameter to JavaDataset. Do I need to implement JavaDataset myself and define how the data is passed in?
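
Since JavaDataset is meant to be subclassed, one plausible reading is that the subclass itself decides how data gets in, for example through its own constructor and fields. The sketch below illustrates only that pattern in plain Java so it runs without the native library. `IndexedDataset`, `DirectoryDataset`, `InMemoryDataset`, and `DatasetSketch` are hypothetical stand-ins, not javacpp-pytorch classes, and `float[]` stands in for a tensor. In real code the base class would presumably be JavaDataset, with get and size returning Example and SizeTOptional as in the snippet earlier in the thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.OptionalLong;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Small demo: build one dataset from a directory and one from in-memory data.
class DatasetSketch {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("samples");
        Files.write(dir.resolve("a.bin"), new byte[] {1, 2, 3});
        Files.write(dir.resolve("b.bin"), new byte[] {4});

        IndexedDataset<byte[]> fromDir = new DirectoryDataset(dir);
        IndexedDataset<float[]> fromMemory =
                new InMemoryDataset(List.of(new float[] {0.5f}, new float[] {1.5f}));

        System.out.println(fromDir.size().getAsLong() + " files, first has "
                + fromDir.get(0).length + " bytes");
        System.out.println(fromMemory.size().getAsLong() + " in-memory samples");
    }
}

// Hypothetical stand-in mirroring the two methods overridden in the thread: get(idx) and size().
abstract class IndexedDataset<T> {
    abstract T get(long idx);
    abstract OptionalLong size();
}

// The data directory is passed through the constructor; files are listed once up front.
final class DirectoryDataset extends IndexedDataset<byte[]> {
    private final List<Path> files;

    DirectoryDataset(Path dataDir) throws IOException {
        List<Path> found;
        try (Stream<Path> entries = Files.list(dataDir)) {
            // Sort so that a given index always maps to the same file.
            found = entries.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        this.files = found;
    }

    @Override
    byte[] get(long idx) {
        try {
            return Files.readAllBytes(files.get((int) idx));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    OptionalLong size() {
        return OptionalLong.of(files.size());
    }
}

// In-memory data is passed through the constructor and kept in a field.
final class InMemoryDataset extends IndexedDataset<float[]> {
    private final List<float[]> samples;

    InMemoryDataset(List<float[]> samples) {
        this.samples = List.copyOf(samples);
    }

    @Override
    float[] get(long idx) {
        return samples.get((int) idx);
    }

    @Override
    OptionalLong size() {
        return OptionalLong.of(samples.size());
    }
}
```

The same idea applies to the anonymous subclass shown earlier: any path or tensor captured from the enclosing scope, or stored in a field of a named subclass, is available inside get and size.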
