# Benchmarking for quantum machine learning models

This repository contains tools to compare the performance of near-term quantum machine learning (QML)
- as well as standard classical machine learning models on supervised learning tasks.
+ as well as standard classical machine learning models on supervised and generative learning tasks.

It is based on pipelines using [Pennylane](https://pennylane.ai/) for the simulation of quantum circuits,
[JAX](https://jax.readthedocs.io/en/latest/index.html) for training,
@@ -39,12 +39,12 @@ Dependencies of this package can be installed in your environment by running
pip install -r requirements.txt
```

- ## Adding a custom model
+ ## Adding a custom classifier

We use the [Scikit-learn API](https://scikit-learn.org/stable/developers/develop.html) to create
models and perform hyperparameter search.

- A minimal template for a new quantum model is as follows, and can be stored
+ A minimal template for a new quantum classifier is as follows, and can be stored
in `qml_benchmarks/models/my_model.py`:

``` python
@@ -61,18 +61,23 @@ class MyModel(BaseEstimator, ClassifierMixin):

          # reproducibility is ensured by creating a numpy PRNG and using it for all
          # subsequent random functions.
-         self._random_state = random_state
-         self._rng = np.random.default_rng(random_state)
+         self.random_state = random_state
+         self.rng = np.random.default_rng(random_state)

          # define data-dependent attributes
          self.params_ = None
          self.n_qubits_ = None
+
+     def initialize(self, args):
+         """
+         initialize the model if necessary
+         """
+         # ... your code here ...

      def fit(self, X, y):
          """Fit the model to data X and labels y.

          Add your custom training loop here and store the trained model parameters in `self.params_`.
-         Set the data-dependent attributes, such as `self.n_qubits_`.

          Args:
              X (array_like): Data of shape (n_samples, n_features)
@@ -146,9 +151,86 @@ model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```

+
+ ## Adding a custom generative model
+
+ The minimal template for a new generative model closely follows that of the classifier models.
+ Labels are set to `None` throughout to maintain scikit-learn functionality.
+
+ ``` python
+ import numpy as np
+
+ from sklearn.base import BaseEstimator
+
+
+ class MyModel(BaseEstimator):
+     def __init__(self, hyperparam1="some_value", random_state=42):
+
+         # store hyperparameters as attributes
+         self.hyperparam1 = hyperparam1
+
+         # reproducibility is ensured by creating a numpy PRNG and using it for all
+         # subsequent random functions.
+         self.random_state = random_state
+         self.rng = np.random.default_rng(random_state)
+
+         # define data-dependent attributes
+         self.params_ = None
+         self.n_qubits_ = None
+
+     def initialize(self, args):
+         """
+         initialize the model if necessary
+         """
+         # ... your code here ...
+
+     def fit(self, X, y=None):
+         """Fit the model to data X.
+
+         Add your custom training loop here and store the trained model parameters in `self.params_`.
+
+         Args:
+             X (array_like): Data of shape (n_samples, n_features)
+             y (array_like): not used (no labels)
+         """
+         # ... your code here ...
+
+     def sample(self, num_samples):
+         """Sample from the generative model.
+
+         Args:
+             num_samples (int): number of points to sample
+
+         Returns:
+             array_like: sampled points
+         """
+         # ... your code here ...
+
+         return samples
+
+     def score(self, X, y=None):
+         """An optional custom score function to be used with hyperparameter optimization.
+         Args:
+             X (array_like): Data of shape (n_samples, n_features)
+             y: unused (no labels for generative models)
+
+         Returns:
+             (float): score for the dataset X
+         """
+         # ... your code here ...
+         return score
+ ```
+
+ If the model samples binary data, we recommend constructing models that sample binary strings (rather than $\pm 1$-valued strings)
+ to align with the datasets designed for generative models.
+ Energy-based models can easily be constructed by replacing the multilayer perceptron neural network in `DeepEBM` with
+ any other differentiable network written in `flax`.
+
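As a rough illustration of the template and of the binary-string convention, the sketch below fits a toy model that samples every bit independently. `BernoulliModel`, its `smoothing` hyperparameter and the helper `pm1_to_binary` are hypothetical names that are not part of `qml_benchmarks`. To keep the example dependent on NumPy only, the class does not subclass `BaseEstimator`, which a model following the template would do.

```python
import numpy as np


def pm1_to_binary(X):
    """Map +/-1-valued samples to {0, 1}-valued samples (-1 -> 0, +1 -> 1)."""
    X = np.asarray(X)
    return ((X + 1) // 2).astype(int)


class BernoulliModel:
    """Toy generative model (hypothetical): each bit is sampled independently."""

    def __init__(self, smoothing=1.0, random_state=42):
        # store hyperparameters as attributes
        self.smoothing = smoothing
        self.random_state = random_state
        self.rng = np.random.default_rng(random_state)

        # data-dependent attribute, set in fit
        self.params_ = None

    def fit(self, X, y=None):
        X = np.asarray(X)
        # Laplace-smoothed estimate of P(bit = 1) for every feature
        counts = X.sum(axis=0) + self.smoothing
        self.params_ = counts / (X.shape[0] + 2 * self.smoothing)
        return self

    def sample(self, num_samples):
        # draw uniform numbers and threshold them against P(bit = 1)
        u = self.rng.random((num_samples, self.params_.shape[0]))
        return (u < self.params_).astype(int)

    def score(self, X, y=None):
        # mean log-likelihood per sample (higher is better)
        X = np.asarray(X)
        p = self.params_
        log_lik = X * np.log(p) + (1 - X) * np.log(1 - p)
        return float(log_lik.sum(axis=1).mean())


# data given as +/-1 strings is converted to binary strings before fitting
X_pm1 = np.array([[1, -1, 1], [1, 1, -1], [1, -1, -1], [-1, -1, 1]])
X_bin = pm1_to_binary(X_pm1)

model = BernoulliModel().fit(X_bin)
samples = model.sample(5)
print(samples)
print(model.score(X_bin))
```

The mean log-likelihood used in `score` is only one possible choice; any scalar where higher means better could serve the same purpose during hyperparameter optimization.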
## Datasets

- The `qml_benchmarks.data` module provides generating functions to create datasets for binary classification.
+ The `qml_benchmarks.data` module provides generating functions to create datasets for binary classification and
+ generative learning.
+
A generating function can be used like this:

``` python
@@ -158,7 +240,7 @@ X, y = generate_two_curves(n_samples=200, n_features=4, degree=3, noise=0.1, off
```

Note that some datasets might have different return data structures, for example if the train/test split
- is performed by the generating function.
+ is performed by the generating function. If the dataset does not include labels, `y = None` is returned.

The original datasets used in the paper can be generated by running the scripts in the `paper/benchmarks` folder,
such as:
@@ -172,15 +254,18 @@ This will create a new folder in `paper/benchmarks` containing the datasets.
## Running hyperparameter optimization

In the folder `scripts` we provide an example that can be used to
- generate results for a hyperparameter search for any model and dataset. The script
+ generate results for a hyperparameter search for any model and dataset. The script works
+ for both classifiers and generative models. The script
can be run as

```
- python run_hyperparameter_search.py --classifier-name "DataReuploadingClassifier" --dataset-path "my_dataset.csv"
+ python run_hyperparameter_search.py --model "DataReuploadingClassifier" --dataset-path "my_dataset.csv"
```

- where `my_dataset.csv` is a CSV file containing the training data such that each column is a feature
- and the last column is the target.
+ where `my_dataset.csv` is a CSV file containing the training data. For classification problems, each column should
+ correspond to a feature and the last column to the target. For generative learning, each row
+ should correspond to a binary string that specifies a unique data sample, and the model should implement a `score`
+ method.
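
The two CSV layouts can be sketched with NumPy as follows. The file names, the label values and the absence of a header row are assumptions made for illustration; how `run_hyperparameter_search.py` actually parses the file is not shown here and should be checked in the script itself.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)

# classification: feature columns followed by the target in the last column
X = rng.normal(size=(6, 3))
y = rng.choice([-1, 1], size=6)  # assumed label values
classification_table = np.column_stack([X, y])

# generative learning: every row is one binary string, no label column
generative_table = rng.integers(0, 2, size=(6, 4))

# hypothetical file names, written to a temporary folder
tmp_dir = tempfile.mkdtemp()
clf_path = os.path.join(tmp_dir, "my_classification_dataset.csv")
gen_path = os.path.join(tmp_dir, "my_generative_dataset.csv")
np.savetxt(clf_path, classification_table, delimiter=",")
np.savetxt(gen_path, generative_table, delimiter=",", fmt="%d")

# reading the files back recovers features/targets and the binary samples
loaded = np.loadtxt(clf_path, delimiter=",")
X_loaded, y_loaded = loaded[:, :-1], loaded[:, -1]
samples_loaded = np.loadtxt(gen_path, delimiter=",", dtype=int)
```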

Unless otherwise specified, the hyperparameter grid is loaded from `qml_benchmarks/hyperparameter_settings.py`.
One can override the default grid of hyperparameters by specifying the hyperparameter list,
@@ -189,7 +274,7 @@ For example, for the `DataReuploadingClassifier` we can run:
```
python run_hyperparameter_search.py \
- --classifier-name DataReuploadingClassifier \
+ --model DataReuploadingClassifier \
--dataset-path "my_dataset.csv" \
--n_layers 1 2 \
--observable_type "single" "full" \