AP Float Tutorial #3

Open: wants to merge 7 commits into master
Conversation

tiwaria1 (Owner) commented:

Internal review of ap_float tutorial. This is a port of the 3 HLS ap_float tutorials.

Testing:
- [DONE] Linux local compile
- [TODO] Linux local compile with CMake
- [TODO] Linux regtest
- [TODO] Windows local compile with CMake
- [TODO] Windows regtest


@whitepau left a comment


I really like your tutorials, Abhishek. This one is pretty long, but overall I think it is composed of many bite-sized pieces. Perhaps one could argue for parcelling them out into smaller tutorials, but I know that having many tutorials can be overwhelming for users.

```cpp
ihc::ap_float<EW, MW> a;
```
Here EW specifies the exponent width and MW specifies the mantissa width of the number. Optionally, another template parameter can be specified to set the rounding mode. For more details please refer to the section titled `Variable-Precision Integer and Floating-Point Support` in the Intel® oneAPI DPC++ FPGA Optimization Guide.


I think that we should explicitly describe floating points, either here or in the other documentation, and include a picture like this:
[image: floating-point bit layout diagram]

```cpp
// We set the rounding mode to RZERO (truncate to zero) because this allows us
// to generate compile-time ap_float constants from double type literals shown
// below, which eliminates the area usage for initialization.
using APDoubleTypeC = ihc::ap_float<11, 44, kRoundingModeRZERO>;
```


I tried to test this but I couldn't get it to compile in HUB with these directions. My concern is that I believe we should show that it's possible to get identical performance between a native float/double and a similarly configured ap_float type. This was an issue in HLS.

@whitepau left a comment


Part 2


In C++ applications, the basic binary operations offer little configurability. FPGAs, by contrast, implement these operations in configurable logic, so you can improve your design's performance by fine-tuning the floating-point operations, which are usually area- and latency-intensive.

The kernel `SpecializedQuadraticEqnSolverKernel` demonstrates how to use the explicit versions of `ap_float` binary operators to perform floating-point arithmetic operations based on your need.



Suggested change
The kernel `SpecializedQuadraticEqnSolverKernel` demonstrates how to use the explicit versions of `ap_float` binary operators to perform floating-point arithmetic operations based on your need.
The kernel code in the function `TestSpecializedQuadraticEqnSolver()` demonstrates how to use the explicit versions of `ap_float` binary operators to perform floating-point arithmetic operations based on your need.

I won't fix these anymore, but please update them.


You should observe an area reduction of up to 30% in resource utilization of the binary operations.

TODO: Is simulation supported for customers yet?


good question...

Expand the lines with the kernel names by clicking on them, then expand the sub-hierarchies to observe how the `add`, `mult`, and `div` operations use fewer resources in the `ApproximateSineWithAPFloat` kernel.

You should observe an area reduction of up to 30% in resource utilization of the binary operations.


Suggested change
You should observe an area reduction of up to 30% in resource utilization of the binary operations.
You should observe an area reduction in resource utilization of up to 30% for the binary operations.

perhaps include a before/after screenshot? I know my first instinct would be to compare the overall area in the summary report.

2. Kernel: `ConversionKernelC`
This kernel shows how to use the `convert_to` function and modify the rounding mode for a specific operation.

In the graph for the cluster under `Kernel_C`, you will find that it contains two "cast" nodes, corresponding to the conversions:


does it make sense to include screenshots? probably not since we change the report graphics so much.

```cpp
#include <sycl/ext/intel/ac_types/ap_float_math.hpp>
```

Additionally, you must use the flag `-qactypes` in order to ensure that the headers are correctly included and that the compiler links against the necessary libraries for emulation support.


please tell the user where the -qactypes or /Qactypes flag appears.

tiwaria1 (Owner, Author) commented:


Added a note; however, I might have the -fintelfpga flag absorb -qactypes before this tutorial is released.
