This update simplifies the user experience:

- Remove URAM support - it will be available in a separate branch
- Add SIMPLE_SQ multiplier (a*a)%N as a basic starting point
- Use simple packed vector interface to squaring circuit
- Remove msu_tb, drive MSU AXI interface from software
- Add DIRECT_TB mode to connect the testbench directly to the squaring circuit
- Add Vivado projects for both example squarers
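The SIMPLE_SQ multiplier mentioned above implements a single modular squaring, out = (a*a) % N. A minimal Python golden model of that step follows; the modulus and operand are small illustrative values, not the ones used by the hardware:

```python
def simple_sq(a, N):
    """One SIMPLE_SQ step: square the input and reduce modulo N."""
    return (a * a) % N

# Small illustrative values; the real design uses a large (e.g. 1024-bit) modulus
N = 97
a = 15
print(simple_sq(a, N))  # (15 * 15) % 97 = 225 % 97 = 31
```

A software model like this is handy as a golden reference when driving the MSU AXI interface from software.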
Showing 39 changed files with 2,358 additions and 1,554 deletions.
```
@@ -1,6 +1,24 @@
**~
**vivado
**__pycache__
**logs
**obj_dir
**\.dat
**obj
**\.dat
**msuconfig.vh

**vivado_*backup.jou
**vivado_*backup.log
**vivado.jou
**vivado.log
**vivado_pid*.str

msu/rtl/vivado_ozturk/msu.cache
msu/rtl/vivado_ozturk/msu.hw
msu/rtl/vivado_ozturk/msu.ip_user_files
msu/rtl/vivado_ozturk/msu.runs
msu/rtl/vivado_ozturk/msu.srcs
msu/rtl/vivado_simple/msu.cache
msu/rtl/vivado_simple/msu.hw
msu/rtl/vivado_simple/msu.ip_user_files
msu/rtl/vivado_simple/msu.runs
msu/rtl/vivado_simple/msu.srcs
```
@@ -0,0 +1,146 @@

# AWS F1

AWS F1 supports hardware emulation as well as FPGA-accelerated execution.

The typical workflow involves two types of hosts:

- Development, using a z1d.2xlarge with no attached FPGA
- Accelerated, using an f1.2xlarge with an attached FPGA

You will most likely have to submit a request for an instance limit increase. The process is described in the error message shown if you try to instantiate one of these hosts and your limit is insufficient.
AWS provides general information for using F1 (<https://github.com/aws/aws-fpga/blob/master/SDAccel/README.md>).

A distilled set of instructions specific to this design follows.

**Note that you can also run AWS F1 hardware emulation and synthesis on-premise. See [SDAccel On-Premise](#sdaccel-on-premise).**
## Host instantiation

We assume some familiarity with the AWS environment. To instantiate a new AWS host for working with the FPGA, follow these steps:

1. Log in to the AWS console at <https://aws.amazon.com/> and go to the EC2 service portal
1. Click on Launch Instance
1. For the AMI, go to AWS Marketplace and search for FPGA
1. Choose the FPGA Developer AMI
1. For instance type choose z1d.2xlarge for development or f1.2xlarge for FPGA-enabled execution, then Review and Launch
1. For configuration of the host we recommend:
   - Increase root disk space by about 20GB for an f1.2xlarge, 60GB for a z1d.2xlarge
   - Add a descriptive tag to help track instances and volumes
1. Launch the instance
1. On the EC2 Instances page, select the instance and choose Actions->Connect. This shows the instance hostname that you can ssh to.
   - Note that for the FPGA Developer AMI the username is 'centos'
   - Log in with `ssh centos@HOST`

You may find it convenient to install additional ssh keys for GitHub, etc.
## Host setup

Some initial setup is required for new F1 hosts. See <https://github.com/aws/aws-fpga/blob/master/SDAccel/README.md> for more detail.

We've encapsulated a typical setup, including vnc:
```
./msu/scripts/f1_setup.sh
```

You can then optionally start a vncserver if you prefer to work in an X Windows environment:
```
# Start a vncserver; the first instance typically serves display :1 (port 5901)
vncserver
```

Connect using ssh to tunnel the vnc port:
```
# Forward local port 5908 (display :8) to the remote vncserver on port 5901
ssh -L 5908:localhost:5901 centos@HOST
```

And view it locally:
```
vncviewer :8
```

Once you have vnc up, run vncconfig to enable copy/paste:
```
vncconfig &
```
## Hardware Emulation

To build and run a test in hardware emulation:
```
source ./msu/scripts/sdaccel_env.sh
cd msu
make clean
make hw_emu
```

Rerunning without cleaning the build retains the hardware emulation (hardware) portion while rebuilding and executing the host (software) portion.

Tracing is enabled by default in the hw_emu run. To view the resulting waveforms run:
```
vivado -source open_waves.tcl
```
## Hardware Synthesis

Synthesis and place-and-route compile the design from RTL into a bitstream that can be loaded onto the FPGA. This step takes 1-3 hours depending on the complexity of the design, host speed, synthesis targets, etc.

You can enable a **faster run** by relaxing the kernel frequency (search for kernel_frequency in the Makefile) or building a smaller multiplier (comment out 1024b, uncomment 128b in the Makefile). This is often convenient when trying things out.

```
source ./msu/scripts/sdaccel_env.sh
cd msu/rtl/sdaccel
make clean
make hw
```

Once synthesis successfully completes you can register the new image. Follow the instructions in <https://github.com/aws/aws-fpga/blob/master/SDAccel/docs/Setup_AWS_CLI_and_S3_Bucket.md> to set up an S3 bucket. This only needs to be done once. We assume a bucket named 'vdfsn'; you will need to change this to match your bucket name. Once that is done run the following:

```
# Configure AWS credentials. You should only need to do this once on a given
# host.
#   AWS Access Key ID [None]: XXXXXX
#   AWS Secret Access Key [None]: XXXXXX
#   Default region name [None]: us-east-1
#   Default output format [None]: json
aws configure

# Register the new bitstream.
# Update S3_BUCKET in Makefile.sdaccel to reflect the name of your bucket.
cd msu/rtl/sdaccel
make to_f1

# Check status using the afi_id from the last step. It should say
# pending for about 30 minutes, then available.
cat *afi_id.txt
aws ec2 describe-fpga-images --fpga-image-ids afi-XXXXXXXXXXXX

# Copy the required files to an FPGA enabled host for execution:
HOST=xxxx  # your F1 hostname here
scp obj/to_f1.tar.gz centos@$HOST:.
```
## FPGA Execution

Once you have synthesized a bitstream, registered it using `create_sdaccel_afi.sh`, confirmed that describe-fpga-images reports it as available, and copied the necessary files to an f1 machine, you are ready to execute on the FPGA.

Currently debug mode is required due to a known AWS issue. Create an `sdaccel.ini` file in the directory you will be running from:
```
cat <<EOF > sdaccel.ini
[Debug]
profile=true
EOF
```

Execute the host driver code. This automatically loads the image referenced by the awsxclbin file onto the FPGA.
```
tar xf to_f1.tar.gz
sudo su
source $AWS_FPGA_REPO_DIR/sdaccel_runtime_setup.sh
# Run a short test and verify the result in software
./host -e -u 0 -f 100
# Run a billion iterations starting with an input of 2
./host -u 0 -s 0x2 -f 1000000000
```
The expected result of 2^2^1B using the default 1k (64-coefficient) modulus in the Makefile is:
`305939394796769797811431929207587607176284037479412924905827147439718856946037842431593490055940763973150879770720223457997191020439404083394702653096083649807090448385799021330059496823106654989629199132438283594347957634468046231084628857389350823217443926925454895121571284954146032303555585511855910526`
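For small iteration counts the FPGA result can be cross-checked in software: f iterated squarings of x modulo N equal x^(2^f) mod N. A sketch of that check, using a small assumed modulus in place of the design's 1024-bit Makefile modulus and a small f in place of the billion-iteration run:

```python
def iterated_square(x, f, N):
    """Model the MSU loop: x -> (x * x) % N, repeated f times."""
    for _ in range(f):
        x = (x * x) % N
    return x

N = 1000003   # small stand-in for the design's 1024-bit modulus
f = 100       # analogous to the short test: ./host -e -u 0 -f 100
# Closed form: f squarings of x yield x^(2^f) mod N
assert iterated_square(2, f, N) == pow(2, 2**f, N)
print(iterated_square(2, f, N))
```

Python's three-argument `pow` computes the closed form efficiently, which is what makes the software verification of short runs cheap.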
@@ -0,0 +1,48 @@

# SDAccel On-Premise

It's possible to perform hardware emulation and synthesis on-premise using the flow defined by AWS.

The steps to enable an on-premise flow are described here: <https://github.com/aws/aws-fpga/blob/master/SDAccel/docs/On_Premises_Development_Steps.md>.

You will need a license for the vu9p in Vivado and for SDAccel. Xilinx offers trial licenses on their website. The licenses should be loaded through the license manager, which is accessed from the Vivado Help menu.

Host requirements: 32GB of memory is preferred, though 16GB should be sufficient. Single-threaded performance is the main determinant of runtime.
## Ubuntu 18

While Ubuntu 18 is not officially supported, the on-premise flow can be made to work with a few additional changes after installing SDAccel.

```
# Link to the OS installed version of libstdc++:
cd /tools/Xilinx/SDx/2018.3/lib/lnx64.o/Default/
mv libstdc++.so.6 libstdc++.so.6_orig
ln -s /usr/lib/x86_64-linux-gnu/libstdc++.so.6

# After the change above this should report "ERROR: no card found"
/opt/xilinx/xrt/bin/xbutil validate

# Some of the python scripts reference /bin/env
cd /bin
sudo ln -s /usr/bin/env
```
## helloworld

The `helloworld_ocl` example should now successfully complete:
```
source ./msu/scripts/sdaccel_env.sh
cd $AWS_FPGA_REPO_DIR/SDAccel/examples/xilinx/getting_started/host/helloworld_ocl
# In the Makefile, change DEVICE to:
#   DEVICE := $(AWS_PLATFORM)
make cleanall; make TARGETS=sw_emu DEVICES=$AWS_PLATFORM check
```
You can now follow the hardware emulation and synthesis flows described in [aws_f1](docs/aws_f1.md).

To register an image built from on-premise synthesis, first copy the `msu/rtl/obj/xclbin/vdf.hw.xilinx_aws-vu9p-f1-04261818_dynamic_5_0.xclbin` and `host` files to an AWS F1 instance, then run `create_sdaccel_afi.sh`.