Fix: Potential Data Leakage in Quantum Data Tutorial. #829

Open
wants to merge 1 commit into
base: master

OkuyanBoga

A solution to potential data leakage in #828.

Rather than concatenating the train and test sets, compute the stilted dataset for each split separately:

In lines L745-752:

y_train_new = get_stilted_dataset(S_pqk, V_pqk, S_original, V_original)
y_test_new = get_stilted_dataset(S_pqk_test, V_pqk_test, S_test_original, V_test_original)

where the spectrum for the test set is calculated separately:

S_pqk_test, V_pqk_test = get_spectrum(
    tf.reshape(x_test_pqk, [-1, len(qubits) * 3]))

S_test_original, V_test_original = get_spectrum(
    tf.cast(x_test, tf.float32), gamma=0.005)

print('Eigenvectors of pqk kernel matrix for test:', V_pqk_test)
print('Eigenvectors of original kernel matrix for test:', V_test_original)
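
To make the leakage concern concrete, here is a minimal, self-contained sketch. It does not use the tutorial's `get_spectrum` or `get_stilted_dataset`; the `relabel` function, the toy 1-D data, and the row-sum threshold are all hypothetical stand-ins for any kernel-based relabeling whose output depends on every point it is given. The sketch only illustrates that with concatenation the train labels can change when the test set changes, while per-split processing keeps them fixed.

```python
import math


def rbf_kernel(xs, gamma=1.0):
    """Dense RBF kernel matrix for a list of 1-D points."""
    return [[math.exp(-gamma * (a - b) ** 2) for b in xs] for a in xs]


def relabel(xs, gamma=1.0):
    """Hypothetical stand-in for a kernel-based relabeling step.

    A point gets label 1 if its kernel row sum exceeds the mean row sum
    of the data passed in, else 0. The threshold therefore depends on
    every point in `xs`, much like a spectrum of the full kernel matrix.
    """
    row_sums = [sum(row) for row in rbf_kernel(xs, gamma)]
    threshold = sum(row_sums) / len(row_sums)
    return [1 if s > threshold else 0 for s in row_sums]


# Toy data (assumed for illustration only).
x_train = [0.0, 0.1, 0.2, 2.0]
x_test_a = [0.05, 0.15]
x_test_b = [2.0, 2.1, 2.2, 2.3, 2.4, 2.5]
n_train = len(x_train)

# Concatenated processing: train labels are sliced from a joint result,
# so they depend on which test set was appended.
joint_a = relabel(x_train + x_test_a)[:n_train]
joint_b = relabel(x_train + x_test_b)[:n_train]

# Per-split processing: train labels depend on the train set alone.
separate = relabel(x_train)

leaks = joint_a != joint_b
print("train labels with test set A appended:", joint_a)
print("train labels with test set B appended:", joint_b)
print("train labels from train set alone:    ", separate)
print("concatenation lets test data alter train labels:", leaks)
```

In this toy setup the train labels flip entirely when the second test set is appended, which is the kind of dependence the separate `get_spectrum` calls above are meant to remove.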

Closes #828.

Successfully merging this pull request may close these issues.

Possible data leakage in quantum/docs/tutorials /quantum_data.ipynb