|
5 | 5 | "id": "6ee8630c",
|
6 | 6 | "metadata": {},
|
7 | 7 | "source": [
|
8 |   | - "# Part 7: Deployment\n",
  | 8 | + "# Part 7a: Bitstream Generation\n",
9 | 9 | "\n",
|
10 | 10 | "In the previous sections we've seen how to train a Neural Network with a small resource footprint using QKeras, then to convert it to `hls4ml` and create an IP. That IP can be interfaced into a larger design to deploy on an FPGA device. In this section, we introduce the `VivadoAccelerator` backend of `hls4ml`, where we can easily target some supported devices to get up and running quickly. Specifically, we'll deploy the model on a [pynq-z2 board](http://www.pynq.io/)."
|
11 | 11 | ]
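
For orientation, here is a minimal sketch of how a (Q)Keras model might be converted with the `VivadoAccelerator` backend targeting the pynq-z2. The `model` object, config granularity, and output directory are assumptions taken from the surrounding tutorial, not the literal cells of this notebook:

```python
import hls4ml

# Sketch only: assumes `model` is the QKeras/Keras model trained in the earlier parts
config = hls4ml.utils.config_from_keras_model(model, granularity='name')

# The VivadoAccelerator backend wraps the NN IP with the board-level design for a supported device
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    backend='VivadoAccelerator',
    board='pynq-z2',
    output_dir='model_3/hls4ml_prj_pynq',  # assumed project directory, matching the paths used below
)
```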
|
|
26 | 26 | "_add_supported_quantized_objects(co)\n",
|
27 | 27 | "import os\n",
|
28 | 28 | "\n",
|
29 |    | - "os.environ['PATH'] = '/opt/Xilinx/Vivado/2019.2/bin:' + os.environ['PATH']"
   | 29 | + "os.environ['PATH'] = os.environ['XILINX_VIVADO'] + '/bin:' + os.environ['PATH']"
30 | 30 | ]
|
31 | 31 | },
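
As a hedge against `XILINX_VIVADO` not being set in the environment, one could fall back to an explicit install path; the 2019.2 location below is just an example taken from the old version of this cell:

```python
import os

# Prefer the XILINX_VIVADO environment variable; fall back to an assumed install path
vivado_root = os.environ.get('XILINX_VIVADO', '/opt/Xilinx/Vivado/2019.2')
os.environ['PATH'] = os.path.join(vivado_root, 'bin') + os.pathsep + os.environ['PATH']
```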
|
32 | 32 | {
|
|
136 | 136 | "np.save('model_3/y_hls.npy', y_hls)"
|
137 | 137 | ]
|
138 | 138 | },
|
139 |     | - {
140 |     | - "cell_type": "markdown",
141 |     | - "id": "9ca4f0e2",
142 |     | - "metadata": {},
143 |     | - "source": [
144 |     | - "## Synthesize\n",
145 |     | - "Now synthesize the model, and also export the IP."
146 |     | - ]
147 |     | - },
148 |     | - {
149 |     | - "cell_type": "code",
150 |     | - "execution_count": null,
151 |     | - "id": "ef6c817f",
152 |     | - "metadata": {
153 |     | - "tags": []
154 |     | - },
155 |     | - "outputs": [],
156 |     | - "source": [
157 |     | - "hls_model.build(csim=False, export=True)"
158 |     | - ]
159 |     | - },
160 | 139 | {
|
161 | 140 | "attachments": {
|
162 | 141 | "part7_block_design.png": {
|
|
167 | 146 | "id": "3412fa7c",
|
168 | 147 | "metadata": {},
|
169 | 148 | "source": [
|
170 |     | - "## Make bitfile\n",
171 |     | - "Now we've exported the NN IP, let's create a bitfile! The `VivadoAccelerator` backend design scripts create a Block Design in Vivado IPI containing our Neural Network IP, as well as the other necessary IPs to create a complete system.\n",
    | 149 | + "## Synthesize and make bitfile\n",
    | 150 | + "\n",
    | 151 | + "Now we will synthesize the model, export the IP, and create a bitfile! The `VivadoAccelerator` backend design scripts create a Block Design in Vivado IPI containing our Neural Network IP, as well as the other necessary IPs to create a complete system.\n",
172 | 152 | "\n",
|
173 | 153 | "In the case of our `pynq-z2`, we add a DMA IP to transfer data between the PS and PL containing the Neural Network. If you want to create a different design, for example to connect your NN to a sensor, you can use our block designs as a starting point and add in relevant IP for your use case.\n",
|
174 | 154 | "\n",
|
|
184 | 164 | "metadata": {},
|
185 | 165 | "outputs": [],
|
186 | 166 | "source": [
|
187 |     | - "hls4ml.templates.VivadoAcceleratorBackend.make_bitfile(hls_model)"
    | 167 | + "hls_model.build(csim=False, export=True, bitfile=True)"
188 | 168 | ]
|
189 | 169 | },
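
After `build` returns, a quick sanity check that the flow actually produced a bitstream and hardware handoff can save a long wait later. The paths below are taken from the packaging cell further down in this notebook:

```python
import os

# Outputs of the VivadoAccelerator flow, as referenced in the packaging step below
bitfile = 'model_3/hls4ml_prj_pynq/myproject_vivado_accelerator/project_1.runs/impl_1/design_1_wrapper.bit'
hwh = 'model_3/hls4ml_prj_pynq/myproject_vivado_accelerator/project_1.srcs/sources_1/bd/design_1/hw_handoff/design_1.hwh'
for f in (bitfile, hwh):
    print(('found   ' if os.path.exists(f) else 'missing ') + f)
```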
|
190 | 170 | {
|
|
222 | 202 | },
|
223 | 203 | {
|
224 | 204 | "cell_type": "markdown",
|
225 |     | - "id": "033cc4d9",
    | 205 | + "id": "aac3541d",
226 | 206 | "metadata": {},
|
227 | 207 | "source": [
|
228 |     | - "## Part 7b: running on a pynq-z2\n",
229 |     | - "The following section is the code to execute in the pynq-z2 jupyter notebook to execute NN inference. \n",
    | 208 | + "## Preparations for deployment\n",
    | 209 | + "First, you'll need to follow the [setup instructions for the pynq-z2 board](https://pynq.readthedocs.io/en/latest/getting_started/pynq_z2_setup.html).\n",
    | 210 | + "Typically, this includes connecting the board to your host via ethernet and setting up a static IP address for your host (192.168.2.1). \n",
    | 211 | + "The IP address for the pynq-z2 is 192.168.2.99.\n",
    | 212 | + "The default username and password is xilinx.\n",
230 | 213 | "\n",
|
231 |     | - "First, you'll need to follow the setup instructions for the pynq-z2 board, then transfer the following files from the earlier part of this notebook into a directory on the pynq-z2:\n",
    | 214 | + "Next you'll transfer the following files from the earlier part of this notebook into a directory on the pynq-z2:\n",
232 | 215 | "- bitfile: `model_3/hls4ml_prj_pynq/myproject_vivado_accelerator/project_1.runs/impl_1/design_1_wrapper.bit` -> `hls4ml_nn.bit`\n",
|
233 | 216 | "- hardware handoff: `model_3/hls4ml_prj_pynq/myproject_vivado_accelerator/project_1.srcs/sources_1/bd/design_1/hw_handoff/design_1.hwh` -> `hls4ml_nn.hwh`\n",
|
234 | 217 | "- driver: `model_3/hls4ml_prj_pynq/axi_stream_driver.py` -> `axi_stream_driver.py`\n",
|
235 | 218 | "- data: `X_test.npy`, `y_test.npy`\n",
|
| 219 | + "- notebook: `part7b_deployment.ipynb`\n", |
236 | 220 | "\n",
|
237 | 221 | "The following commands archive these files into `model_3/hls4ml_prj_pynq/package.tar.gz` that can be copied over to the pynq-z2 and extracted."
|
238 | 222 | ]
|
239 | 223 | },
|
240 | 224 | {
|
241 | 225 | "cell_type": "code",
|
242 | 226 | "execution_count": null,
|
243 |     | - "id": "22892f4b",
    | 227 | + "id": "3ca20d05",
244 | 228 | "metadata": {},
|
245 | 229 | "outputs": [],
|
246 | 230 | "source": [
|
247 |     | - "!mkdir model_3/hls4ml_prj_pynq/package/\n",
    | 231 | + "!mkdir -p model_3/hls4ml_prj_pynq/package\n",
248 | 232 | "!cp model_3/hls4ml_prj_pynq/myproject_vivado_accelerator/project_1.runs/impl_1/design_1_wrapper.bit model_3/hls4ml_prj_pynq/package/hls4ml_nn.bit\n",
|
249 | 233 | "!cp model_3/hls4ml_prj_pynq/myproject_vivado_accelerator/project_1.srcs/sources_1/bd/design_1/hw_handoff/design_1.hwh model_3/hls4ml_prj_pynq/package/hls4ml_nn.hwh\n",
|
250 | 234 | "!cp model_3/hls4ml_prj_pynq/axi_stream_driver.py model_3/hls4ml_prj_pynq/package/\n",
|
251 | 235 | "!cp X_test.npy y_test.npy model_3/hls4ml_prj_pynq/package\n",
|
| 236 | + "!cp part7b_deployment.ipynb model_3/hls4ml_prj_pynq/package\n", |
252 | 237 | "!tar -czvf model_3/hls4ml_prj_pynq/package.tar.gz -C model_3/hls4ml_prj_pynq/package/ ."
|
253 | 238 | ]
|
254 | 239 | },
|
255 | 240 | {
|
256 | 241 | "cell_type": "markdown",
|
257 |     | - "id": "a9a52cfb",
    | 242 | + "id": "06d6bd96",
258 | 243 | "metadata": {},
|
259 | 244 | "source": [
|
260 |     | - "The following cells are intended to run on a pynq-z2, they will not run on the server used to train and synthesize models!\n",
    | 245 | + "To copy it to the pynq-z2 with the default settings, the command would be:\n",
261 | 246 | "\n",
|
262 |     | - "First, import our driver `Overlay` class. We'll also load the test data."
263 |     | - ]
264 |     | - },
265 |     | - {
266 |     | - "cell_type": "code",
267 |     | - "execution_count": null,
268 |     | - "id": "89c67e4f",
269 |     | - "metadata": {},
270 |     | - "outputs": [],
271 |     | - "source": [
272 |     | - "from axi_stream_driver import NeuralNetworkOverlay\n",
273 |     | - "import numpy as np\n",
    | 247 | + "```bash\n",
    | 248 | + "scp model_3/hls4ml_prj_pynq/package.tar.gz xilinx@192.168.2.99:~/jupyter_notebooks\n",
    | 249 | + "```\n",
274 | 250 | "\n",
|
275 |     | - "X_test = np.load('X_test.npy')\n",
276 |     | - "y_test = np.load('y_test.npy')"
277 |     | - ]
278 |     | - },
279 |     | - {
280 |     | - "cell_type": "markdown",
281 |     | - "id": "551c5cd6",
282 |     | - "metadata": {},
283 |     | - "source": [
284 |     | - "Create a `NeuralNetworkOverlay` object. This will download the `Overlay` (bitfile) onto the PL of the pynq-z2. We provide the `X_test.shape` and `y_test.shape` to allocate some buffers for the data transfer."
285 |     | - ]
286 |     | - },
287 |     | - {
288 |     | - "cell_type": "code",
289 |     | - "execution_count": null,
290 |     | - "id": "cfb786f3",
291 |     | - "metadata": {},
292 |     | - "outputs": [],
293 |     | - "source": [
294 |     | - "nn = NeuralNetworkOverlay('hls4ml_nn.bit', X_test.shape, y_test.shape)"
295 |     | - ]
296 |     | - },
297 |     | - {
298 |     | - "cell_type": "markdown",
299 |     | - "id": "5fde9b2d",
300 |     | - "metadata": {},
301 |     | - "source": [
302 |     | - "Now run the prediction! When we set `profile=True` the function times the inference, and prints out a summary as well as returning the profiling information. We also save the output to a file so we can do some validation."
303 |     | - ]
304 |     | - },
305 |     | - {
306 |     | - "cell_type": "code",
307 |     | - "execution_count": null,
308 |     | - "id": "1fd6dee7",
309 |     | - "metadata": {},
310 |     | - "outputs": [],
311 |     | - "source": [
312 |     | - "y_hw, latency, throughput = nn.predict(X_test, profile=True)"
313 |     | - ]
314 |     | - },
315 |     | - {
316 |     | - "cell_type": "markdown",
317 |     | - "id": "1983e7d7",
318 |     | - "metadata": {},
319 |     | - "source": [
320 |     | - "An example print out looks like:\n",
    | 251 | + "Then you can navigate your web browser to http://192.168.2.99.\n",
    | 252 | + "You will see the JupyterHub running on the pynq-z2. Open a terminal and extract the tarball.\n",
    | 253 | + "\n",
    | 254 | + "```bash\n",
    | 255 | + "cd ~/jupyter_notebooks\n",
    | 256 | + "tar xvzf package.tar.gz\n",
    | 257 | + "```\n",
321 | 258 | "\n",
|
322 |     | - "Classified 166000 samples in 0.402568 seconds (412352.6956936468 inferences / s)"
    | 259 | + "Now open the notebook `part7b_deployment.ipynb` on the pynq-z2 JupyterHub."
323 | 260 | ]
|
324 | 261 | },
|
325 | 262 | {
|
326 | 263 | "cell_type": "markdown",
|
327 |     | - "id": "005ae126",
328 |     | - "metadata": {},
329 |     | - "source": [
330 |     | - "## Part 7c: final validation\n",
331 |     | - "We executed NN inference on the pynq-z2! Now we can copy the `y_hw.npy` back to the host we've been using for the training and synthesis, and make a final plot to check that the output we took on the board is as expected."
332 |     | - ]
333 |     | - },
334 |     | - {
335 |     | - "cell_type": "code",
336 |     | - "execution_count": null,
337 |     | - "id": "fee790be",
    | 264 | + "id": "f876eff5",
338 | 265 | "metadata": {},
|
339 |     | - "outputs": [],
340 |     | - "source": [
341 |     | - "import matplotlib.pyplot as plt\n",
342 |     | - "\n",
343 |     | - "%matplotlib inline\n",
344 |     | - "from sklearn.metrics import accuracy_score\n",
345 |     | - "\n",
346 |     | - "y_hw = np.load('y_hw.npy')\n",
347 |     | - "y_test = np.load('y_test.npy')\n",
348 |     | - "classes = np.load('classes.npy', allow_pickle=True)\n",
349 |     | - "y_qkeras = model.predict(X_test)\n",
350 |     | - "\n",
351 |     | - "print(\"Accuracy QKeras, CPU: {}\".format(accuracy_score(np.argmax(y_test, axis=1), np.argmax(y_qkeras, axis=1))))\n",
352 |     | - "print(\"Accuracy hls4ml, pynq-z2: {}\".format(accuracy_score(np.argmax(y_test, axis=1), np.argmax(y_hw, axis=1))))\n",
353 |     | - "\n",
354 |     | - "fig, ax = plt.subplots(figsize=(9, 9))\n",
355 |     | - "_ = plotting.makeRoc(y_test, y_qkeras, classes, linestyle='-')\n",
356 |     | - "plt.gca().set_prop_cycle(None) # reset the colors\n",
357 |     | - "_ = plotting.makeRoc(y_test, y_hw, classes, linestyle='--')\n",
358 |     | - "\n",
359 |     | - "from matplotlib.lines import Line2D\n",
360 |     | - "\n",
361 |     | - "lines = [Line2D([0], [0], ls='-'), Line2D([0], [0], ls='--')]\n",
362 |     | - "from matplotlib.legend import Legend\n",
363 |     | - "\n",
364 |     | - "leg = Legend(ax, lines, labels=['QKeras, CPU', 'hls4ml, pynq-z2'], loc='lower right', frameon=False)\n",
365 |     | - "ax.add_artist(leg)"
366 |     | - ]
    | 266 | + "source": []
367 | 267 | }
|
368 | 268 | ],
|
369 | 269 | "metadata": {
|
|