MMM data loading and preprocessing #23

Closed
JoBurchert opened this issue Jun 5, 2024 · 5 comments

@JoBurchert

Hi, thanks for the interesting paper! I am currently trying to reproduce the results from the paper and have some questions regarding the preprocessing. Following the documentation, I have obtained the SEED dataset, which has the following structure:

SEED_EEG
-- ExtractedFeatures_1s
-- ExtractedFeatures_4s
-- Preprocessed_EEG
-- SEED_RAW_EEG

Are you using the extracted features for the 1s or 4s versions? Furthermore, there are several issues with the script SEED_DE.py. In the current version, the filename includes the data path, which causes an error when the file is opened in line 24. Additionally, the sorting of the filenames in line 12 returns the subjects in the wrong order. The files come back as ['10_xxx.mat', '11_xxx.mat', ..., '15_xxx.mat', '1_xxx.mat', '2_xxx.mat', ..., '9_xxx.mat'], i.e. sorted as strings rather than by numeric ID, which will lead to a misalignment with labels.mat.
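To make the ordering problem concrete, here is a minimal sketch of sorting by the numeric prefix instead of by plain string order. It is not taken from SEED_DE.py: the helper name subject_sort_key and all file names except 10_20131130.mat are made up for illustration, and it assumes the files follow a '<subject>_<date>.mat' pattern.

```python
import re

def subject_sort_key(filename):
    """Sort key (numeric subject ID, rest of name) for names like '10_20131130.mat'."""
    match = re.match(r"(\d+)_(.*)", filename)
    if match is None:
        # Names without a numeric prefix sort last, in plain string order.
        return (float("inf"), filename)
    return (int(match.group(1)), match.group(2))

# Hypothetical file names; only the naming pattern matters here.
names = [
    "10_20131130.mat",
    "1_20131027.mat",
    "2_20140404.mat",
    "9_20140620.mat",
    "15_20130709.mat",
]

# Plain string sorting puts '10_' and '15_' before '1_', as described above.
lexicographic = sorted(names)

# Sorting on the integer prefix restores subject order 1, 2, 9, 10, 15.
by_subject = sorted(names, key=subject_sort_key)

print(lexicographic)
print(by_subject)
```

Whether the order matters at all depends on how the labels are stored: it only does if labels.mat is indexed per subject rather than per trial.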

I would also be interested in the preprocessing for the SEED datasets as well as the TUEG, since those follow a different schema. Could you be so kind as to also include those in the repo?

As a last point, you describe how you perform the DE feature extraction in Eq. 1-5 in the appendix of your paper; however, I was unable to locate these steps in your code. Could you help me out in this regard and point me in the right direction?
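For context, this is the kind of step I was looking for. The sketch below uses the textbook closed form for the differential entropy of a Gaussian signal, 0.5 * ln(2 * pi * e * variance), which is how DE features are commonly described for EEG. It is an assumption that Eq. 1-5 reduce to something like this; the function name differential_entropy is made up, the use of the population variance is my choice, and band-pass filtering into frequency bands is left out entirely.

```python
import math
import statistics

def differential_entropy(segment):
    """DE of one signal segment under a Gaussian assumption.

    Uses 0.5 * ln(2 * pi * e * variance) with the population variance.
    A constant segment (variance 0) has no finite DE and raises ValueError.
    """
    variance = statistics.pvariance(segment)
    return 0.5 * math.log(2 * math.pi * math.e * variance)

# Toy segment with variance 1, so DE = 0.5 * ln(2 * pi * e).
unit_variance_de = differential_entropy([-1.0, 1.0])

# Doubling the amplitude multiplies the variance by 4 and adds ln(2) to the DE.
scaled_de = differential_entropy([-2.0, 2.0])

print(unit_variance_de, scaled_de)
```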

Thanks a lot in advance!

@victorywys
Collaborator

Hi,

Thank you for your interest in our paper and for bringing up these issues!

  1. Data Preprocessing:
    In our experiments, we used both the 1s and 4s versions of the data. The 4s data requires DE processing code that we have not yet released, because its copyright is held by the authors of the SEED datasets. However, the 1s data can be used directly with the extracted features provided in the dataset, i.e., ExtractedFeatures_1s.

  2. Issues in SEED_DE.py:

  • Line 24. I'm not sure why it causes an error. Did you set the data_path in line 9 to your local path? Or can you share with us more details about the error you are encountering?

  • For the sorting problem, since each data_file.mat records a single experiment for one person, and the order of stimuli in an experiment is fixed, all data_file.mat follow the same set of labels. Therefore, there shouldn’t be a misalignment no matter how the filenames are sorted.

  3. Preprocessing for SEED
    The EEG data in SEED is preprocessed in the same way as the original dataset, as detailed here (Dataset Summary -> SEED_EEG -> B. "Extracted_features"), using the code mentioned in item 1. We are sorry that, due to the copyright issue, we cannot publish this part of the code. However, the 1s features extracted by the authors of SEED are the same as the ones we use and are directly available for the experiments.
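The sorting point in item 2 can be sketched as follows, assuming the labels are stored once per trial and shared by every recording. All names and values here are hypothetical (made-up labels, three trials instead of the real number, placeholder feature entries), and this is not the code in SEED_DE.py.

```python
# Hypothetical per-trial labels, shared by every recording (values are made up).
trial_labels = [1, 0, -1]

# Hypothetical recordings: file name -> one feature entry per trial, in stimulus order.
recordings = {
    "10_20131130.mat": ["s10_trial1", "s10_trial2", "s10_trial3"],
    "1_20131027.mat": ["s1_trial1", "s1_trial2", "s1_trial3"],
}

def label_pairs(file_order):
    """Pair each trial with its label, visiting files in the given order."""
    pairs = set()
    for name in file_order:
        # Labels are matched by trial position inside a file, not by file position.
        for trial, label in zip(recordings[name], trial_labels):
            pairs.add((trial, label))
    return pairs

ascending = label_pairs(sorted(recordings))
descending = label_pairs(sorted(recordings, reverse=True))

# The same (trial, label) pairs come out regardless of the file order.
print(ascending == descending)
```

The file order would only matter for anything keyed on the subject index, such as a per-subject train/test split.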

Thank you for your understanding. If you have any more questions or need further assistance, please feel free to ask.

@JoBurchert
Author

Thanks for your reply.

Regarding SEED_DE.py: the combination of lines 11 and 24 causes the issue, because 'filenames' already contains the full path to the data and is then joined with 'data_path' again, producing the following error:

Traceback (most recent call last):
  File "/home/burchert/.local/lib/python3.10/site-packages/scipy/io/matlab/_mio.py", line 39, in _open_file
    return open(file_like, mode), True
FileNotFoundError: [Errno 2] No such file or directory: '../data/SEED/SEED_EEG/ExtractedFeatures_1s/../data/SEED/SEED_EEG/ExtractedFeatures_1s/10_20131130.mat'
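A minimal sketch of how the doubled path in the traceback can arise, and one possible way around it. The variable names listed, doubled and fixed are made up for illustration; how SEED_DE.py actually builds its file list is an assumption on my side, and this is not necessarily the fix the maintainers chose.

```python
import os.path

data_path = "../data/SEED/SEED_EEG/ExtractedFeatures_1s"

# Assumed shape of the bug: the listed entry already carries data_path.
listed = os.path.join(data_path, "10_20131130.mat")

# Joining data_path onto it again repeats the directory, as in the traceback.
doubled = os.path.join(data_path, listed)

# One option: reduce the entry to its bare file name before joining.
fixed = os.path.join(data_path, os.path.basename(listed))

print(doubled)
print(fixed)
```

Alternatively, the file list could hold bare names only (as os.listdir returns them), so that the single join at load time stays correct.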

@victorywys
Collaborator

Apologies for the bug. We found that we had incorrectly modified the script when cleaning up comments and unnecessary code lines. It is now fixed in pull request #24. Thanks for your contribution!

@samin9796

@victorywys Hello! It looks like this repo contains the code for working with the SEED dataset. However, I would like to fine-tune MMM with the SEED IV dataset. Would it be possible to share the scripts for SEED IV?

Thank you.

@victorywys
Collaborator

victorywys commented Jul 30, 2024

@samin9796 Added in pull request #28. Please refer to scipt/extract.py for details. Thanks for your interest!
