Added Entanglement Concentration Dataset for 3 and 4 qubits for Benchmarking Binary Classifiers #915

RishiNandha · 2025-05-09T16:03:34Z

Summary

Classification dataset for 3 and 4 qubits based on the concentration of entanglement (CE) in quantum states. Two pre-trained circuits are used to generate states with a given CE. Users can use this dataset to benchmark their binary classification pipelines.

Pre-trained weights courtesy of https://github.com/LSchatzki/NTangled_Datasets. The CE values claimed in that repository had a mismatch for 8 qubits, so we've left other qubit counts for future development.

I've verified mypy, spell, lint and black. For some reason, make html breaks the other modules. I need some help resolving that.

Details and comments

The order and default values of the parameters match the existing ad hoc data generator for consistency, so an equal number of data points is generated for each class.
There are two sampling options: the input states given to the circuit before its action can either be sampled by setting each qubit's state to one of the axes of the Bloch sphere ("cardinal") or be sampled randomly ("isotropic").
Each qubit count has an easy and a hard mode. Easy has a larger difference in CE values between the two classes than hard, which makes benchmarking of pipelines easier to standardize. Easy can be used to verify that an algorithm works, while hard can be used to test the best the algorithm can achieve.
There are two formatting options: x_train and x_test can be either NumPy arrays or lists of quantum states.
Reference confirming the relevance of such a dataset: https://arxiv.org/abs/2109.03400. The authors show that QCNNs can learn from these datasets effectively.
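The two sampling options can be illustrated with a minimal sketch. This is not the code in the PR: the names `CARDINAL_STATES`, `sample_qubit` and `sample_product_state` are hypothetical, and drawing the "isotropic" state by normalising a complex Gaussian vector is an assumed (though standard) way to sample a single-qubit state uniformly on the Bloch sphere.

```python
import math
import random

S = 1 / math.sqrt(2)

# The six single-qubit states along the +/-Z, +/-X and +/-Y axes of the Bloch sphere.
CARDINAL_STATES = [
    (1 + 0j, 0j),        # |0>,  +Z
    (0j, 1 + 0j),        # |1>,  -Z
    (S + 0j, S + 0j),    # |+>,  +X
    (S + 0j, -S + 0j),   # |->,  -X
    (S + 0j, 1j * S),    # |+i>, +Y
    (S + 0j, -1j * S),   # |-i>, -Y
]


def sample_qubit(method, rng):
    """Return one single-qubit state as a pair of complex amplitudes."""
    if method == "cardinal":
        return rng.choice(CARDINAL_STATES)
    if method == "isotropic":
        # Assumption: a normalised complex Gaussian vector is uniform on the sphere.
        a = complex(rng.gauss(0, 1), rng.gauss(0, 1))
        b = complex(rng.gauss(0, 1), rng.gauss(0, 1))
        norm = math.sqrt(abs(a) ** 2 + abs(b) ** 2)
        return (a / norm, b / norm)
    raise ValueError(f"unknown sampling method: {method!r}")


def sample_product_state(n, method, seed=None):
    """Build an n-qubit product state (2**n amplitudes) from sampled qubits."""
    rng = random.Random(seed)
    state = [1 + 0j]
    for _ in range(n):
        qubit = sample_qubit(method, rng)
        # Kronecker product of the running state with the new qubit.
        state = [amp * c for amp in state for c in qubit]
    return state


cardinal_state = sample_product_state(3, "cardinal", seed=7)
isotropic_state = sample_product_state(4, "isotropic", seed=7)
```

In this sketch the sampled product state would then be passed through the pre-trained circuit, which is where the entanglement is introduced; the circuit itself is not modelled here.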

- Classification dataset for 3 and 4 qubits
- Pre-trained weights courtesy to https://github.com/LSchatzki/NTangled_Datasets
- The CE values claimed in the above repository had a mismatch for 8 qubits, hence we've left other number of qubits for future development
- Make html breaks for some reason. Need to fix in upcoming commits

Co-Authored-By: Nishant Vasan <[email protected]>
Co-Authored-By: rogue-infinity <[email protected]>

RishiNandha · 2025-05-09T16:07:46Z

Oh, the init file seems to have been missed. I'll recommit.

RishiNandha · 2025-05-10T02:41:59Z

Not sure why the 3.9 tests are passing but the 3.11 and 3.12 ones are failing. The routine also seems to be raising errors mostly on files I've left untouched. Any idea what might be happening?

woodsp-ibm · 2025-05-11T14:20:18Z

If you look at the Actions tab at the top of the page, i.e. https://github.com/qiskit-community/qiskit-machine-learning/actions, you will see the scheduled Machine Learning Tests. These are the same tests, but they run nightly so that any changes in dependencies that cause problems or failures are caught. The scheduled tests have been failing for a while, and the main branch needs updating in some way (perhaps pinning an earlier version of a dependency, or changing the code to suit) so that they pass again. Once that is done and merged into your PR, only changes made by your PR could cause failures. With the base (i.e. main) failing, that is not the case.

woodsp-ibm · 2025-05-14T15:44:33Z

The CI issues have been fixed, so I updated the branch (via the update button that was here). Any issues now should be down to this PR alone, unless it hits a random failure, for which there is issue #903.

coveralls · 2025-05-14T16:21:42Z

Pull Request Test Coverage Report for Build 15622602924

Details

95 of 119 (79.83%) changed or added relevant lines in 2 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage decreased (-0.3%) to 90.542%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| qiskit_machine_learning/datasets/entanglement_concentration.py | 93 | 117 | 79.49% |

Totals
Change from base Build 15622557464:	-0.3%
Covered Lines:	4624
Relevant Lines:	5107

💛 - Coveralls

edoaltamura

I think this PR is nearly ready. I'd suggest not using .npy files, though, because decoding the binaries might require a specific version of NumPy, which will likely change in the future. Since the files are relatively small, we could convert the arrays in the .npy files to JSON or text files.
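As a rough illustration of that suggestion, the sketch below stores weights as nested lists in a JSON file and reads them back using only the standard library. The helper names, the `"weights"` key, the file name and the example values are all hypothetical; in practice the nested list would presumably come from `np.load(...).tolist()` on the existing file and be turned back into an array with `np.asarray` after loading.

```python
import json
import tempfile
from pathlib import Path


def save_weights(path, weights):
    """Write a nested list of floats to a JSON file (hypothetical layout)."""
    Path(path).write_text(json.dumps({"weights": weights}), encoding="utf-8")


def load_weights(path):
    """Read the nested list of floats back from the JSON file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))["weights"]


# Placeholder values standing in for the pre-trained circuit weights.
weights = [[0.1, -0.25, 3.0], [1.5, 0.0, -2.75]]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "weights.json"
    save_weights(target, weights)
    restored = load_weights(target)
```

A plain-text format like this does not depend on the NumPy binary format, at the cost of a somewhat larger file.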

woodsp-ibm · 2025-07-11T19:09:21Z

qiskit_machine_learning/datasets/entanglement_concentration.py

+    training_size: int,
+    test_size: int,
+    n: int,
+    mode: str = "easy",


My take, instead of suppressing the warning about too many positional args, would be to add `*,` after the `n` parameter. That would allow the first 3 arguments to be positional, while the ones that follow, which have defaults and so need not be provided at all, would have to be passed as keyword arguments whenever they are given.
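A minimal sketch of what that could look like is shown here. The function name `entanglement_dataset_sketch` and the `sampling_method` parameter are hypothetical stand-ins rather than the PR's actual signature; only `training_size`, `test_size`, `n` and `mode` appear in the diff above.

```python
def entanglement_dataset_sketch(
    training_size: int,
    test_size: int,
    n: int,
    *,  # everything after this marker is keyword-only
    mode: str = "easy",
    sampling_method: str = "isotropic",  # hypothetical parameter name
) -> dict:
    """Validate the arguments and echo them back as a dict."""
    if n not in (3, 4):
        raise ValueError("only 3 and 4 qubits are supported")
    if mode not in ("easy", "hard"):
        raise ValueError("mode must be 'easy' or 'hard'")
    return {
        "training_size": training_size,
        "test_size": test_size,
        "n": n,
        "mode": mode,
        "sampling_method": sampling_method,
    }


# The first three arguments may be positional; the rest must be named.
config = entanglement_dataset_sketch(10, 5, 3, mode="hard")

# Passing `mode` positionally is rejected by Python itself.
rejected = False
try:
    entanglement_dataset_sketch(10, 5, 3, "hard")
except TypeError:
    rejected = True
```

With the bare `*` in place, the function has only three positional parameters, so the lint warning no longer needs to be suppressed, and call sites stay readable because every optional setting is named.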

RishiNandha requested review from woodsp-ibm, adekusar-drl, smens, edoaltamura, oscar-wallis, OkuyanBoga and Benjamin-Symons as code owners May 9, 2025 16:03

RishiNandha added 3 commits May 9, 2025 21:38

Update __init__.py

506ebb4

Made __init__.py black, Added seed for unittest

d2a3ad9

Update .pylintdict

05135b4

Merge branch 'main' into PR_NTangled

fbfd42a

edoaltamura and others added 6 commits May 16, 2025 13:46

Merge branch 'main' into PR_NTangled

836a697

Merge branch 'main' into PR_NTangled

c630b74

Merge branch 'main' into PR_NTangled

b5e2e91

Merge branch 'main' into PR_NTangled

dd42bfb

Merge branch 'main' into PR_NTangled

2bc37d3

Merge branch 'main' into PR_NTangled

161b2bb

edoaltamura added type: enhancement ✨ Features or aspects to improve short project A task amounting to a small project (but larger than a "good first issue") labels Jun 12, 2025

edoaltamura requested changes Jul 4, 2025


edoaltamura mentioned this pull request Jul 4, 2025

QGAN functionality request #959

Open

woodsp-ibm reviewed Jul 11, 2025

