MMM data loading and preprocessing #23

JoBurchert · 2024-06-05T10:43:40Z

Hi, thanks for the interesting paper! I am currently trying to reproduce the results from the paper and have some questions regarding the preprocessing. Following the documentation, I have obtained the SEED dataset, which has the following structure:

SEED_EEG
-- ExtractedFeatures_1s
-- ExtractedFeatures_4s
-- Preprocessed_EEG
-- SEED_RAW_EEG

Are you using the extracted features from the 1s or the 4s version? Furthermore, there are several issues with the script SEED_DE.py. In the current version, each filename already includes the data path, which causes an error when the file is opened in line 24. Additionally, the sorting of the filenames in line 12 returns the patients in the wrong order. The files are returned as ['10_xxx.mat', '11_xxx.mat', ..., '15_xxx.mat', '1_xxx.mat', '2_xxx.mat', ..., '9_xxx.mat'], which will lead to a misalignment with labels.mat.
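To illustrate the ordering problem, here is a minimal sketch that contrasts the default string sort with a sort on the numeric ID prefix. The helper name subject_sort_key and all filenames except 10_20131130.mat are assumptions made for illustration; this is not the code in SEED_DE.py.

```python
import re

def subject_sort_key(filename):
    """Hypothetical helper: parse '<id>_<date>.mat' into (id, date) as integers."""
    match = re.match(r"(\d+)_(\d+)", filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename}")
    return int(match.group(1)), int(match.group(2))

# Illustrative filenames; only '10_20131130.mat' appears in this thread.
filenames = ["10_20131130.mat", "1_20131027.mat", "2_20140404.mat", "9_20140620.mat"]

# Plain string sorting places '10_...' before '1_...' because '0' sorts before '_'.
lexicographic = sorted(filenames)

# Sorting on the parsed integers gives the order 1, 2, 9, 10.
numeric = sorted(filenames, key=subject_sort_key)
```

Whether the ordering matters depends on how labels.mat is indexed, so this only shows how to obtain a numeric order if one is needed.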

I would also be interested in the preprocessing for the SEED datasets as well as the TUEG, since those follow a different schema. Could you be so kind as to also include those in the repo?

As a last point, you describe how you perform the DE feature extraction in Eq. 1-5 in the appendix of your paper; however, I was unable to locate these steps in your code. Could you help me out in this regard and point me in the right direction?
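For context, differential entropy (DE) of a band-filtered EEG segment is commonly computed under a Gaussian assumption as 0.5 * ln(2 * pi * e * variance). The sketch below shows only that generic formula on synthetic data; it is not the paper's Eq. 1-5 and not the unreleased SEED code, and the function name differential_entropy is an assumption. Band-pass filtering is omitted.

```python
import math
import random
import statistics

def differential_entropy(segment):
    """DE of a 1-D signal under a Gaussian assumption: 0.5 * ln(2*pi*e*variance)."""
    variance = statistics.pvariance(segment)
    return 0.5 * math.log(2 * math.pi * math.e * variance)

# Synthetic stand-in for one already band-filtered channel window (assumption).
random.seed(0)
segment = [random.gauss(0.0, 1.0) for _ in range(200)]
de_value = differential_entropy(segment)
```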

Thanks a lot in advance!

victorywys · 2024-06-11T07:27:27Z

Hi,

Thank you for your interest in our paper and for bringing up these issues!

1. Data Preprocessing:
In our experiments, we used both the 1s and 4s versions of the data. The 4s data requires DE processing code that we have not released, because its copyright is held by the authors of the SEED datasets. The 1s data, however, can be used directly with the extracted features provided in the dataset, i.e., ExtractedFeatures_1s.

2. Issues in SEED_DE.py:
Line 24: I'm not sure why it causes an error. Did you set the data_path in line 9 to your local path? Or can you share more details about the error you are encountering?
Sorting: since each data_file.mat records a single experiment for one person, and the order of stimuli in an experiment is fixed, all data_file.mat files follow the same set of labels. Therefore, there shouldn’t be a misalignment no matter how the filenames are sorted.

3. Preprocessing for SEED:
The EEG data in SEED undergoes the same preprocessing as the original dataset, detailed here (Dataset Summary -> SEED_EEG -> B. "Extracted_features"), using the code mentioned in 1. We are sincerely sorry that, due to the copyright issue, we cannot publish this part of the code. However, the 1s features extracted by the authors of SEED are the same as those we use and are directly available for the experiments.

Thank you for your understanding. If you have any more questions or need further assistance, please feel free to ask.

JoBurchert · 2024-06-12T09:17:16Z

Thanks for your reply,

regarding SEED_DE.py, the combination of lines 11 and 24 is causing the issue: 'filenames' already contains the full path to the data, and each entry is then joined with 'data_path' again, producing the following error:

Traceback (most recent call last):
  File "/home/burchert/.local/lib/python3.10/site-packages/scipy/io/matlab/_mio.py", line 39, in _open_file
    return open(file_like, mode), True
FileNotFoundError: [Errno 2] No such file or directory: '../data/SEED/SEED_EEG/ExtractedFeatures_1s/../data/SEED/SEED_EEG/ExtractedFeatures_1s/10_20131130.mat'
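The doubled path in the error message can be reproduced with a few lines. This is a minimal sketch that assumes the file listing returns paths already prefixed with the data directory; the variable names mirror the ones mentioned above, but the code is illustrative and not taken from SEED_DE.py.

```python
import os

data_path = "../data/SEED/SEED_EEG/ExtractedFeatures_1s"

# Assumed listing result: each entry already carries the directory prefix.
filenames = [os.path.join(data_path, "10_20131130.mat")]

# Joining with data_path a second time duplicates the prefix.
broken = os.path.join(data_path, filenames[0])

# One possible fix: reduce each entry to its basename before joining.
fixed = os.path.join(data_path, os.path.basename(filenames[0]))
```

Using the listed paths directly, without a second join, would work equally well.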

victorywys · 2024-06-17T03:45:15Z

Apologies for the bug. We found that we had incorrectly modified the script while cleaning up comments and unnecessary code lines. This is now fixed in pull request #24. Thanks for your contribution!

samin9796 · 2024-07-20T22:50:54Z

@victorywys Hello! It looks like this repo contains the code for working with the SEED dataset. However, I would like to fine-tune MMM with the SEED IV dataset. Would it be possible to share the scripts for SEED IV?

Thank you.

victorywys · 2024-07-30T07:05:21Z

@samin9796 Added in pull request #28. Please refer to scipt/extract.py for details. Thanks for your interest!

victorywys closed this as completed Jul 2, 2024
