Added Entanglement Concentration Dataset for 3 and 4 qubits for Benchmarking Binary Classifiers #915

Open
wants to merge 7 commits into
base: main

RishiNandha
Contributor

Summary

Adds a binary classification dataset for 3 and 4 qubits, based on the concentration of entanglement (CE) in quantum states. Two pre-trained circuits are used to generate states with a given amount of CE. Users can use this dataset to benchmark their binary classification pipelines.

Pre-trained weights courtesy of https://github.com/LSchatzki/NTangled_Datasets. The CE values claimed in that repository had a mismatch for 8 qubits, so we've left other qubit counts for future development.
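For anyone who wants to check CE values independently, here is a minimal sketch. It assumes that CE refers to the concentratable entanglement used in the referenced paper, i.e. one minus the average purity of the reduced states over all subsets of qubits (including the empty set). The function names are hypothetical and are not part of this PR; the code is pure Python and only practical for small qubit counts.

```python
from itertools import combinations


def reduced_purity(state, n, subset):
    """Return Tr[rho_A^2] for the qubits in `subset` of an n-qubit pure state."""
    keep = list(subset)
    rest = [q for q in range(n) if q not in keep]

    def bits(index, qubits):
        # Read the bits of `index` at the given qubit positions (qubit 0 is
        # the most significant bit of the basis-state index).
        return tuple((index >> (n - 1 - q)) & 1 for q in qubits)

    # View the amplitudes as a matrix M: rows = kept qubits, columns = traced-out qubits.
    rows = {}
    for index, amp in enumerate(state):
        rows.setdefault(bits(index, keep), {})[bits(index, rest)] = amp

    # rho_A[a, b] = sum_c M[a, c] * conj(M[b, c]); purity = sum over a, b of |rho_A[a, b]|^2.
    purity = 0.0
    for row_a in rows.values():
        for row_b in rows.values():
            overlap = sum(row_a[c] * row_b[c].conjugate() for c in row_a)
            purity += abs(overlap) ** 2
    return purity


def concentratable_entanglement(state, n):
    """Assumed definition: CE = 1 - (1 / 2^n) * sum over all subsets A of Tr[rho_A^2]."""
    total = 0.0
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            total += reduced_purity(state, n, subset)
    return 1.0 - total / 2**n


# 3-qubit GHZ state (|000> + |111>) / sqrt(2)
ghz = [2**-0.5, 0, 0, 0, 0, 0, 0, 2**-0.5]
print(round(concentratable_entanglement(ghz, 3), 6))  # 0.375
```

Under this definition a product state such as |000> has CE equal to 0, which is a quick sanity check before comparing against values claimed elsewhere.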

I've verified mypy, spell, lint and black. For some reason, make html breaks on the other modules. I need some help resolving that.

Details and comments

  • We have made the order and default values of the parameters match the existing ad hoc data generator for consistency. Hence an equal number of datapoints is generated in each class.
  • There are two sampling options: the input states given to the circuit before its action can either be sampled by setting each qubit's state along one of the axes of the Bloch sphere ("cardinal") or be sampled randomly ("isotropic").
  • Each qubit count has an easy and a hard mode. Easy has a larger difference in CE values than hard, which makes benchmarking of pipelines easier to standardize. Easy can be used to verify that an algorithm works, while hard can be used to test the best the algorithm can achieve.
  • There are two formatting options: x_train and x_test can either be a numpy array or a list of quantum states.
  • Reference confirming the relevance of such a dataset: https://arxiv.org/abs/2109.03400. The authors show that QCNNs can learn from these datasets effectively.
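To illustrate the two sampling options described above, here is a rough sketch of how an input product state could be drawn before the circuit is applied. It is one reading of the description and not the code in this PR: the helper names are hypothetical, "cardinal" is taken to mean the six eigenstates of Z, X and Y, and "isotropic" is taken to mean a uniformly random point on each qubit's Bloch sphere.

```python
import cmath
import math
import random

S = 1 / math.sqrt(2)

# Six cardinal single-qubit states: the +/- eigenstates of Z, X and Y.
CARDINAL_STATES = [
    (1, 0), (0, 1),            # |0>, |1>
    (S, S), (S, -S),           # |+>, |->
    (S, 1j * S), (S, -1j * S)  # |+i>, |-i>
]


def sample_qubit(method, rng):
    """Return the two amplitudes of one single-qubit state."""
    if method == "cardinal":
        return rng.choice(CARDINAL_STATES)
    if method == "isotropic":
        # Uniform point on the Bloch sphere: cos(theta) is uniform in [-1, 1].
        theta = math.acos(rng.uniform(-1.0, 1.0))
        phi = rng.uniform(0.0, 2.0 * math.pi)
        return (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))
    raise ValueError(f"unknown sampling method: {method!r}")


def sample_product_state(n_qubits, method, rng):
    """Return the 2**n_qubits amplitudes of a tensor product of sampled qubits."""
    state = [1]
    for _ in range(n_qubits):
        qubit = sample_qubit(method, rng)
        # Kronecker product: the newly sampled qubit becomes the least significant one.
        state = [a * b for a in state for b in qubit]
    return state


rng = random.Random(0)
for method in ("cardinal", "isotropic"):
    state = sample_product_state(3, method, rng)
    norm = sum(abs(a) ** 2 for a in state)
    print(method, len(state), round(norm, 6))
```

Both options yield normalized product states, so any entanglement in the final datapoint comes from the pre-trained circuit rather than from the input.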

- Classification dataset for 3 and 4 qubits
- Pre-trained weights courtesy of https://github.com/LSchatzki/NTangled_Datasets
- The CE values claimed in the above repository had a mismatch for 8 qubits, hence we've left other qubit counts for future development
- make html breaks for some reason. Needs fixing in upcoming commits

Co-Authored-By: Nishant Vasan <[email protected]>
Co-Authored-By: rogue-infinity <[email protected]>
@RishiNandha
Contributor Author

Oh the init file seems to have gotten missed. I'll recommit

@RishiNandha
Contributor Author

Not sure why the 3.9 tests are passing but the 3.11 and 3.12 ones are failing. It also seems the routine is raising errors mostly on files I've left untouched. Any ideas on what might be happening?

@woodsp-ibm
Member

If you look at the Actions tab at the top of the page, i.e. here https://github.com/qiskit-community/qiskit-machine-learning/actions, you will see the scheduled Machine Learning Tests. These are the same tests but run nightly, so that any changes in dependencies that cause problems or failures are caught. The scheduled tests have been failing for a while, and the main branch needs updating in some way (maybe pinning a dependency to an earlier version, or changing code to suit) so that things pass again. Once that is done and merged with the code in your PR, only changes made by your PR could cause failures. While the base (i.e. main) is failing, that is not the case.

@woodsp-ibm
Member

The CI issues have been fixed, so I updated the branch (via the update button that was here). Any issues now should be down to this PR alone, unless it hits a random failure, for which there is issue #903.

@coveralls

coveralls commented May 14, 2025

Pull Request Test Coverage Report for Build 15074635331

Details

  • 95 of 119 (79.83%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.3%) to 90.561%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| qiskit_machine_learning/datasets/entanglement_concentration.py | 93 | 117 | 79.49% |

| Totals | Coverage Status |
| --- | --- |
| Change from base Build 15074608848 | -0.3% |
| Covered Lines | 4586 |
| Relevant Lines | 5064 |

💛 - Coveralls

4 participants