Data augmentation with skorch #171

Open
arthur-thuy opened this issue Jun 22, 2023 · 0 comments

I am using modAL with skorch to integrate PyTorch models. Most tutorials, such as modAL's "Pytorch models in modAL workflows" tutorial, use the MNIST dataset, which typically gets the same data transformation in the train, pool, validation, and test sets.

From the tutorial:

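from torch.utils.data import DataLoader
from torchvision.datasets import MNIST
from torchvision.transforms import ToTensor

# Load all of MNIST as one batch of tensors
mnist_data = MNIST('.', download=True, transform=ToTensor())
dataloader = DataLoader(mnist_data, shuffle=True, batch_size=60000)
X, y = next(iter(dataloader))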

I would like to use data augmentation for more complex computer vision problems. The difficulty is that the augmentation should be applied only to the train set, not to the pool set. The modAL tutorial creates a DataLoader with a single large batch, applies the data transformations, splits the result into labeled (train) and unlabeled (pool) sets, and feeds these to the modAL functions.
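The labeled/pool split described above can be sketched roughly as follows. This is only an illustration: random NumPy arrays stand in for the MNIST batch, and `n_initial` and the array shapes are assumptions, not values taken from the tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the single large batch (assumed layout: N x C x H x W).
X = rng.random((100, 1, 28, 28), dtype=np.float32)
y = rng.integers(0, 10, size=100)

# Hypothetical size of the initial labeled set.
n_initial = 10
initial_idx = rng.choice(len(X), size=n_initial, replace=False)

# Labeled (train) set: the only place where augmentation should apply.
X_train, y_train = X[initial_idx], y[initial_idx]

# Unlabeled pool: every remaining sample, left un-augmented.
X_pool = np.delete(X, initial_idx, axis=0)
y_pool = np.delete(y, initial_idx, axis=0)
```

Because both sets come from the same already-transformed batch, any train-only augmentation has to be applied after this split.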

As such, observations moving from the pool set to the labeled set should receive additional transformations. However, I am having difficulty applying these transformations because the torchvision transforms expect PIL images, while the data have already been converted to tensors by ToTensor().
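One possible direction, shown here only as a minimal sketch, is to apply the augmentation per sample on tensors rather than on PIL images, and only in the dataset that wraps the labeled set. `AugmentedTensorDataset` and `random_hflip` are hypothetical names introduced for illustration; they are not part of modAL, skorch, or torchvision. The sketch also does not show how such a dataset would be wired into skorch or modAL.

```python
import torch
from torch.utils.data import Dataset


def random_hflip(x, p=0.5):
    """Flip a C x H x W tensor along its width axis with probability p."""
    if torch.rand(1).item() < p:
        return torch.flip(x, dims=[-1])
    return x


class AugmentedTensorDataset(Dataset):
    """Hypothetical wrapper that applies `transform` to each sample on access."""

    def __init__(self, X, y, transform=None):
        self.X, self.y, self.transform = X, y, transform

    def __len__(self):
        return len(self.X)

    def __getitem__(self, i):
        x = self.X[i]
        if self.transform is not None:
            x = self.transform(x)
        return x, self.y[i]


# Stand-in tensors for the labeled and pool sets (shapes are assumptions).
X_train, y_train = torch.rand(8, 1, 28, 28), torch.randint(0, 10, (8,))
X_pool = torch.rand(32, 1, 28, 28)

# Augmentation is attached to the labeled set only; the pool stays untouched.
train_ds = AugmentedTensorDataset(X_train, y_train, transform=random_hflip)
pool_ds = AugmentedTensorDataset(X_pool, torch.zeros(32), transform=None)
```

Samples that move from the pool to the labeled set would then pick up the augmentation simply by being stored in the labeled tensors. Many torchvision transforms in recent releases also accept tensors, and `torchvision.transforms.ToPILImage` can convert a tensor back to a PIL image, so either may be an alternative to a hand-written transform.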

What would be the best way to handle this?

Thank you
