Home
Electron microscopy datasets are available from a combination of Zenodo and Google Drive storage (mirror 1). They're also available from a publicly accessible University of Warwick dataserver (mirror 2). TEM and STEM Images/Crops datasets were collected by dozens of Warwick scientists working on hundreds of projects and therefore have a diverse constitution. Wavefunctions are for atom columns.
A preprint and a paper provide dataset details and visualizations. Datasets are in the public domain and can be used without restriction. Most datasets are large (100+ GB), so downloads may take a couple of hours or more depending on your internet connection. In addition, if many users have recently downloaded a dataset from mirror 1, you might get an error saying "download quota exceeded for this file so you can't download at this time". To avoid this, either # to Google Drive or use mirror 2.
Multiple datasets containing 98340 wavefunctions simulated with clTEM. In addition, there are 1000 experimental focal series. Wavefunctions are in 64-bit complex (320, 320) numpy array files (.npy) that can be opened with np.load(). Focal series images are in TIFF format. Featured in this preprint.
Datasets include:
- Wavefunctions (wavefunctions_partitioned_multiple_hq): n=3, multiple materials - 27.8 GB.
- Wavefunctions Unseen Training (wavefunctions_multiple_unseen_train_hq): n=3, multiple materials, materials in training set - 1.2 GB.
- Wavefunctions Single (wavefunctions_single_hq): n=3, single material - 3.7 GB.
- Wavefunctions Restricted (wavefunctions_multiple_forth_hq): n=3, multiple materials, simulation hyperparameter ranges reduced by a factor close to 1/4 - 9.1 GB.
- Wavefunctions n=1 (wavefunctions): n=1, multiple materials. See dataset_info.txt for partitioning into training, validation and test sets. - 28.6 GB.
- Wavefunctions n=1 Unseen Training (unseen_train): n=1, multiple materials, materials in training set - 1.1 GB.
- Wavefunctions n=1 Single (wavefunctions_single): n=1, single material - 3.7 GB.
- Experimental Focal Series (experimental_focal_series): 1000 experimental focal series. Series have a quadratically increasing defocus sequence; however, they are at different spatial scales - 13.7 GB.
- CIFs (cifs): Downloaded from the COD and used for clTEM simulations - 203.9 MB.
- URLs (url_lists): COD URLs that the CIFs were downloaded from.
Download mirror 1
Download mirror 2 (Password: W4rw1ck3m!)
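As a minimal sketch of working with one of the (320, 320) complex wavefunction files described above, the snippet below splits a wavefunction into amplitude and phase. A random array stands in for a downloaded file so the example runs anywhere. The helper name `amplitude_and_phase` and the commented file path are hypothetical, not part of the datasets.

```python
import numpy as np


def amplitude_and_phase(wavefunction):
    """Split a complex wavefunction into amplitude and phase (in radians)."""
    wavefunction = np.asarray(wavefunction)
    return np.abs(wavefunction), np.angle(wavefunction)


# Stand-in for a downloaded file. With real data, use something like
#   example = np.load("path/to/wavefunction.npy")  # hypothetical path
rng = np.random.default_rng(0)
example = (
    rng.standard_normal((320, 320)) + 1j * rng.standard_normal((320, 320))
).astype(np.complex64)

amplitude, phase = amplitude_and_phase(example)
print(example.dtype, example.shape)
```

Multiplying `amplitude` by `np.exp(1j * phase)` recovers the original wavefunction up to floating-point error.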
Wavefunctions downsampled to 96x96. They are in (dataset_size, 96, 96, 2) numpy array files (.npy), with 32-bit real and imaginary components, that can be opened with np.load(). Python index [...,0] is the real part, and [...,1] is the imaginary part. Training, validation, and test sets are concatenated along the batch axis (training data at low indices).
- Wavefunctions 96x96 (wavefunctions_n=3): Bilinearly downsampled from wavefunctions_multiple_hq with antialiasing. 36324 wavefunctions: 24530 training, 3399 validation, and 8395 test. - 2.62 GB.
- Wavefunctions 96x96 Restricted (wavefunctions_restricted_n=3): Bilinearly downsampled from wavefunctions_multiple_forth_hq with antialiasing. 11870 wavefunctions: 8002 training, 1105 validation, and 2763 test. - 855 MB.
- Wavefunctions 96x96 Single (wavefunctions_single_n=3): Bilinearly downsampled from wavefunctions_single_hq with antialiasing. 4825 wavefunctions: 3861 training, and 964 validation. - 347 MB.
Download mirror 1
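The sketch below shows one way to recombine the real and imaginary channels into a complex array and then slice the concatenated array back into its partitions. It assumes the partitions are stored in the order training, validation, test, which matches the order the counts are listed in but is not stated explicitly. The helper names are hypothetical, and a small synthetic array stands in for a downloaded file.

```python
import numpy as np


def to_complex(stacked):
    """Combine (..., 2) real/imaginary channels into one complex array."""
    return stacked[..., 0] + 1j * stacked[..., 1]


def split_by_counts(data, n_train, n_val):
    """Slice along the batch axis, assuming train, validation, test order."""
    train = data[:n_train]
    val = data[n_train:n_train + n_val]
    test = data[n_train + n_val:]  # whatever remains after train and val
    return train, val, test


# Synthetic stand-in with 10 wavefunctions; a real file would be loaded
# with np.load() instead.
stacked = np.zeros((10, 96, 96, 2), dtype=np.float32)
stacked[..., 1] = 1.0  # imaginary part set to 1 for illustration

waves = to_complex(stacked)
train, val, test = split_by_counts(waves, n_train=6, n_val=2)
print(waves.shape, len(train), len(val), len(test))
```

For the Wavefunctions 96x96 dataset, the listed counts would correspond to `split_by_counts(waves, n_train=24530, n_val=3399)`, leaving 8395 test wavefunctions.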
Size 96x96 images intended for rapid development. Images are in numpy array files (.npy) that can be opened with np.load().
- Full TEM images downsampled to 96x96 with antialiasing. Images are in a (17266, 96, 96, 1) numpy array file (.npy). - 607 MB.
- Full STEM images downsampled to 96x96 with antialiasing. Images are in a (19769, 96, 96, 1) numpy array file (.npy). - 695 MB.
- 96x96 crops from full STEM images. Images are in a (19769, 96, 96, 1) numpy array file (.npy). - 695 MB.
Download mirror 1
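Because these arrays are several hundred megabytes each, it can be convenient to memory-map a file rather than read it all at once. The following is a sketch under that assumption: `np.load()` accepts `mmap_mode="r"` for .npy files, and the helper `sample_batch` (a hypothetical name) draws a random batch. A small temporary file stands in for a downloaded dataset.

```python
import os
import tempfile

import numpy as np


def sample_batch(images, batch_size, rng):
    """Draw a random batch without replacement and copy it into memory."""
    idx = np.sort(rng.choice(images.shape[0], size=batch_size, replace=False))
    return np.asarray(images[idx], dtype=np.float32)


# Write a small synthetic (N, 96, 96, 1) array to stand in for a real file.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "images_96x96.npy")  # hypothetical file name
np.save(path, np.random.default_rng(0).random((32, 96, 96, 1), dtype=np.float32))

# mmap_mode="r" maps the file read-only instead of loading it into RAM.
images = np.load(path, mmap_mode="r")
batch = sample_batch(images, batch_size=4, rng=np.random.default_rng(1))
print(images.shape, batch.shape)
```

Sorting the indices keeps reads from the memory-mapped file roughly sequential. Only the rows selected for the batch are copied into memory.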
Full STEM images in a variety of shapes. Featured in this paper.
Info: 159.4 GB. 16227 images.
Download mirror 1
Download mirror 2 (Password: W4rw1ck3m!)
Non-overlapping 512x512 crops from images in the STEM full images dataset. Featured in this paper.
Info: 157.3 GB. 110933 training, 21259 validation and 28877 test set crops, totalling 161069 crops.
Download mirror 1
Download mirror 2 (Password: W4rw1ck3m!)
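For illustration, non-overlapping 512x512 crops can be taken from a 2D image with a reshape, as sketched below. This is not necessarily how the dataset was produced. In particular, discarding edge regions narrower than the crop size is an assumption, and the function name is hypothetical.

```python
import numpy as np


def non_overlapping_crops(image, size=512):
    """Return an (n, size, size) stack of non-overlapping crops of a 2D image.

    Edge regions narrower than `size` are discarded (an assumption).
    """
    rows, cols = image.shape[0] // size, image.shape[1] // size
    trimmed = image[:rows * size, :cols * size]
    # Split both axes into (tile index, offset within tile), then group tiles.
    tiles = trimmed.reshape(rows, size, cols, size).swapaxes(1, 2)
    return tiles.reshape(rows * cols, size, size)


# Synthetic 1100x1600 image: 2 rows x 3 columns of full crops fit.
image = np.arange(1100 * 1600, dtype=np.float32).reshape(1100, 1600)
crops = non_overlapping_crops(image)
print(crops.shape)
```

Crops are ordered row by row, so `crops[1]` is the tile to the right of `crops[0]`.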
Full TEM images. Featured in this paper.
Info: 269.8 GB. 11350 training, 2431 validation and 3486 test images, totalling 17267 images.
Download mirror 1
Download mirror 2 (Password: W4rw1ck3m!)
Jeffrey Ede: j.m.ede@warwick.ac.uk