| 1 | +.. _AM62D-2dfft-dsp-offload-from-linux-user-guide: |
| 2 | + |
| 3 | +AM62D 2D FFT DSP offload from Linux |
| 4 | +################################### |
| 5 | + |
| 6 | +Overview |
| 7 | +******** |
| 8 | + |
| 9 | +This guide describes how to set up, build, and run the 2D Fast Fourier Transform (FFT) |
| 10 | +Digital Signal Processing (DSP) offload example on the Texas Instruments |
| 11 | +AM62D evaluation module (EVM). The example shows how to offload the 2D FFT |
| 12 | +computation to the C7x DSP from Linux user space. |
| 13 | +The input is a 128x128 complex matrix, and the output is the 2D FFT transformed |
| 14 | +data in the same format. |
| 15 | + |
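As a rough sizing sketch, the arithmetic below estimates how large one 128x128 complex buffer is. It assumes each complex sample is stored as two 32-bit floats (real and imaginary); the sample format is not stated in this guide, so treat the result as illustrative only.

```python
# Assumption: one complex sample = two single-precision floats (real, imaginary).
ROWS, COLS = 128, 128
BYTES_PER_FLOAT = 4                      # assumed 32-bit float
BYTES_PER_SAMPLE = 2 * BYTES_PER_FLOAT   # real part + imaginary part

buffer_bytes = ROWS * COLS * BYTES_PER_SAMPLE
print(buffer_bytes)           # 131072 bytes
print(buffer_bytes // 1024)   # 128 KiB
```

Under that assumption, the input and output buffers each occupy 128 KiB of the shared DMA region.
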
| 16 | +The following figure shows how this demo works: |
| 17 | + |
| 18 | +.. figure:: /images/AM62D_2DFFT_DSP_offload_Demo.png |
| 19 | + :height: 450 |
| 20 | + :width: 1000 |
| 21 | + |
| 22 | +- Step 1: Read test data |
| 23 | + - The 2D FFT offload example application reads the 128x128 complex test data from storage. |
| 24 | + |
| 25 | +- Step 2: Copy data to shared Direct Memory Access (DMA) Buffer (DDR) |
| 26 | + - Copies input data to a shared DMA buffer located in DDR memory. |
| 27 | + |
| 28 | +- Step 3: Send Buffer Ready Message |
| 29 | + - The A53 sends a control message via RPMsg to tell the C7x that the input buffer is ready. |
| 30 | + |
| 31 | +- Step 4: Notify DSP using RPMsg (IPC) |
| 32 | + - A53 sends a control message via RPMsg (Remote Processor Messaging) to |
| 33 | + the C7x core. This message informs the C7x that new input data is available |
| 34 | + for processing in the shared DDR buffer. |
| 35 | + |
| 36 | +- Step 5: C7x reads from shared DMA Buffer into L2 Static Random Access Memory (SRAM) |
| 37 | + - The C7x DSP copies the input data from the DMA buffer (DDR) into its local |
| 38 | + L2 SRAM for processing. This operation minimizes access latency compared |
| 39 | + to reading directly from DDR. |
| 40 | + |
| 41 | +- Step 6: 2D FFT computation on DSP |
| 42 | + - The following figure shows the 2D FFT computation on the C7x. |
| 43 | + |
| 44 | + .. figure:: /images/fft_2d_signal_chain.png |
| 45 | + :height: 140 |
| 46 | + :width: 1000 |
| 47 | + |
| 48 | + - 1D Batched FFT: C7x performs the first 1D FFT on the rows. |
| 49 | + - Matrix Transpose: The system transposes the data matrix to convert columns to |
| 50 | + rows and vice versa, because TI designs the FFTLIB libraries to perform FFT |
| 51 | + on 1D data in row format. |
| 52 | + - 1D Batched FFT: C7x performs the second 1D FFT on the column data. |
| 53 | + - During processing, the C7x moves data between L2SRAM (lower latency, lower |
| 54 | + capacity) and DDR (higher capacity, higher latency) to use memory efficiently. |
| 55 | + |
| 56 | +- Step 7: Processed data copied back to shared DMA Buffer (DDR) |
| 57 | + - Once DSP processing is complete, the C7x copies the output (2D FFT transformed data) |
| 58 | + back into the shared DMA buffer. |
| 59 | + - C7x sends a control message via RPMsg to the A53 core, informing it that |
| 60 | + processed output data is available in the shared DDR buffer. |
| 61 | + |
| 62 | +- Step 8: Send Complete Message |
| 63 | + - C7x notifies that processing is complete |
| 64 | + |
| 65 | +- Step 9: Notify A53 |
| 66 | + - RPMsg forwards completion notification to A53 |
| 67 | + |
| 68 | +- Step 10: A53 reads back processed data from DMA buffer |
| 69 | + - A53 copies the processed data from the shared buffer for validation. |
| 70 | + |
| 71 | +- Step 11: Validation and Performance Reporting |
| 72 | + - The application compares the output data against expected results to verify correctness. |
| 73 | + - The system displays performance metrics: |
| 74 | + - DSP Load (%) |
| 75 | + - Cycle Count |
| 76 | + - DDR Throughput (MB/s) |
| 77 | + |
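The row FFT, transpose, row FFT chain in Step 6 can be sketched in plain Python. This is a minimal illustration of the row-column decomposition, not the C7x firmware: `dft_1d`, `transpose`, and `fft_2d` are hypothetical helper names, and a naive O(n^2) DFT stands in for the optimized FFTLIB kernels.

```python
import cmath

def dft_1d(row):
    """Naive O(n^2) DFT of one row; a stand-in for an optimized 1D FFT kernel."""
    n = len(row)
    return [
        sum(row[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def transpose(matrix):
    """Swap rows and columns so that column data becomes row data."""
    return [list(col) for col in zip(*matrix)]

def fft_2d(matrix):
    """Row-column 2D DFT: row transforms, transpose, row transforms again."""
    stage1 = [dft_1d(row) for row in matrix]   # first 1D pass over the rows
    stage2 = transpose(stage1)                 # original columns are now rows
    stage3 = [dft_1d(row) for row in stage2]   # second 1D pass over former columns
    # The result is left in transposed layout: stage3[k2][k1] holds X[k1][k2].
    return stage3

# Usage: the 2D DFT of a unit impulse at the origin is flat (every bin is 1).
impulse = [[0j] * 4 for _ in range(4)]
impulse[0][0] = 1 + 0j
spectrum = fft_2d(impulse)
print(all(abs(v - 1) < 1e-9 for row in spectrum for v in row))
```

The sketch stops after the second pass, so its output stays transposed relative to the input orientation. Whether the demo firmware transposes back before writing to the shared buffer is not stated in this guide.
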
| 78 | +Hardware prerequisites |
| 79 | +********************** |
| 80 | + |
| 81 | +- `AM62D-EVM <https://www.ti.com/tool/AUDIO-AM62D-EVM>`__ |
| 82 | +- SD card (minimum 16 GB) |
| 83 | +- USB Type-C 20 W power supply (make sure to use a Type-C to Type-C cable) |
| 84 | +- USB-to-UART cable for console access |
| 85 | +- PC (Windows or Linux) to flash image onto an SD Card |
| 86 | +- Host PC Requirements: |
| 87 | + |
| 88 | + - Operating system: |
| 89 | + |
| 90 | + - Windows: |__WINDOWS_SUPPORTED_LONG__| |
| 91 | + - Ubuntu: |__LINUX_UBUNTU_VERSION_LONG__| |
| 92 | + |
| 93 | + - Memory: Minimum 4 GB RAM (8 GB or more recommended) |
| 94 | + - Storage: At least 10 GB of free space |
| 95 | + |
| 96 | +Software and tools |
| 97 | +****************** |
| 98 | + |
| 99 | +- TI Processor SDK Linux RT (AM62Dx) |
| 100 | +- MCU+ SDK for AM62Dx |
| 101 | +- `C7000-CGT <https://www.ti.com/tool/C7000-CGT#downloads>`__ compiler |
| 102 | +- `Code Composer Studio <https://software-dl.ti.com/mcu-plus-sdk/esd/AM62DX/11_00_00_16/exports/docs/api_guide_am62dx/CCS_PROJECTS_PAGE.html>`__ |
| 103 | +- `TI Clang Compiler Toolchain <https://www.ti.com/tool/download/ARM-CGT-CLANG>`__ |
| 104 | +- CMake, GCC, make, git, scp, minicom |
| 105 | +- `rpmsg-dma library <https://github.com/TexasInstruments/rpmsg-dma/tree/scarthgap>`__ |
| 106 | + |
| 107 | +EVM setup |
| 108 | +********* |
| 109 | + |
| 110 | +#. Cable Connections |
| 111 | + |
| 112 | + - The figure below shows some important cable connections, ports and switches. |
| 113 | + - Take note of the location of the "BOOTMODE" switch for SD card boot mode. |
| 114 | + |
| 115 | + .. figure:: /images/AM62D_evm_setup.png |
| 116 | + :height: 600 |
| 117 | + :width: 1000 |
| 118 | + |
| 119 | +#. Setup UART Terminal |
| 120 | + |
| 121 | + - First identify the UART port as enumerated on the host machine. |
| 122 | + - Make sure that the EVM and UART cable are connected to the UART to USB |
| 123 | + port as shown in Cable Connections. |
| 124 | + - In Windows, you can use the "Device Manager" to see the detected UART ports: |
| 125 | + |
| 126 | + - Search "Device Manager" in Windows Search Box in the Windows taskbar. |
| 127 | + |
| 128 | + - If you do not see any USB serial ports listed in "Device Manager" under |
| 129 | + "Ports (COM & LPT)", then make sure you have installed the UART to USB |
| 130 | + driver from `FTDI <https://www.ftdichip.com/drivers>`__. |
| 131 | + - For the A53 Linux console, select the UART boot port (for example, COM34), |
| 132 | + keep the other options at their defaults, and set the baud rate to 115200. |
| 133 | + |
| 134 | +#. Setup SD card Boot Mode |
| 135 | + |
| 136 | + - EVM SD card boot mode setting: |
| 137 | + |
| 138 | + - BOOTMODE [ 8 : 15 ] (SW3) = 0100 0000 |
| 139 | + - BOOTMODE [ 0 : 7 ] (SW2) = 1100 0010 |
| 140 | + |
| 141 | +Steps to validate 2D FFT DSP offload demo |
| 142 | +***************************************** |
| 143 | + |
| 144 | +#. Flash an SD card with the :file:`tisdk-default-image-rt-am62dxx-evm.rootfs.wic.xz` |
| 145 | + image by following the instructions in the :ref:`Create SD Card <processor-sdk-linux-create-sd-card>` guide. |
| 146 | + |
| 147 | +#. Insert the flashed SD card into the `AUDIO-AM62D-EVM <https://www.ti.com/tool/AUDIO-AM62D-EVM>`__ |
| 148 | + and power on the TI AUDIO-AM62D-EVM. |
| 149 | + |
| 150 | +#. Make sure the EVM boot mode switches are positioned for SD card boot as |
| 151 | + described earlier. |
| 152 | + |
| 153 | +#. Connect the USB-C cable from the power adapter to one of the two USB-C |
| 154 | + ports on the EVM. |
| 155 | + |
| 156 | +#. The EVM boots and shows the boot progress in the serial |
| 157 | + port console. At the end of booting, the Arago login prompt appears. |
| 158 | + Enter "root" to log in. |
| 159 | + |
| 160 | +#. Run the 2D FFT DSP offload demo application from the console: |
| 161 | + |
| 162 | + .. code-block:: console |
| 163 | + |
| 164 | + root@am62dxx-evm:~# rpmsg_2dfft_example |
| 165 | + |
| 166 | +#. The application will execute and display the results: |
| 167 | + |
| 168 | + .. code-block:: console |
| 169 | + |
| 170 | + RPMsg based 2D FFT Offload Example |
| 171 | + |
| 172 | + ***************************************** |
| 173 | + ***************************************** |
| 174 | + |
| 175 | + C7x 2DFFT Test PASSED |
| 176 | + C7x Load: 85% |
| 177 | + C7x Cycle Count: 1234567 |
| 178 | + C7x DDR Throughput: 123.45 MB/s |
| 179 | + |
| 180 | + ***************************************** |
| 181 | + ***************************************** |
| 182 | + |
| 183 | +.. note:: |
| 184 | + |
| 185 | + The test reports "PASSED" if the computed 2D FFT output matches the |
| 186 | + expected results within tolerance (0.01), otherwise it reports "FAILED". |
| 187 | + |
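The pass/fail decision described in the note can be sketched as an element-wise comparison. This is an illustrative guess at the check, assuming the 0.01 tolerance is an absolute per-value difference; `outputs_match` is a hypothetical helper, not a function from the demo source.

```python
TOLERANCE = 0.01  # tolerance value quoted in the note above

def outputs_match(actual, expected, tol=TOLERANCE):
    """Return True when every pair of values differs by at most tol (assumed absolute)."""
    if len(actual) != len(expected):
        return False
    return all(abs(a - e) <= tol for a, e in zip(actual, expected))

# Usage with made-up sample values (not real demo data).
computed = [1.000, -2.004]
reference = [1.003, -2.000]
print("PASSED" if outputs_match(computed, reference) else "FAILED")
```

If the demo instead uses a relative tolerance or compares real and imaginary parts separately, the comparison inside `all(...)` would change accordingly.
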
| 188 | +Demo output interpretation |
| 189 | +========================== |
| 190 | + |
| 191 | +The demo provides the following performance metrics: |
| 192 | + |
| 193 | +- **Test Result**: PASSED or FAILED based on output validation |
| 194 | +- **C7x Load**: DSP utilization percentage during FFT computation |
| 195 | +- **C7x Cycle Count**: Number of DSP cycles consumed for the operation |
| 196 | +- **C7x DDR Throughput**: Data transfer rate to/from DDR memory in MB/s |
| 197 | + |
| 198 | +How to build 2D FFT DSP offload demo |
| 199 | +************************************ |
| 200 | + |
| 201 | +Building 2D FFT DSP offload image from Yocto |
| 202 | +============================================ |
| 203 | + |
| 204 | +- To build the 2D FFT DSP offload image, refer to :ref:`Processor SDK - Building the SDK with Yocto <building-the-sdk-with-yocto>`. |
| 205 | + |
| 206 | +Building the Linux demo binary from sources |
| 207 | +=========================================== |
| 208 | + |
| 209 | +#. The source code for the 2D FFT DSP offload demo is available as part of |
| 210 | + the `rpmsg-dma <https://github.com/TexasInstruments/rpmsg-dma/tree/scarthgap>`__ repository. Clone it: |
| 211 | + |
| 212 | + .. code-block:: console |
| 213 | + |
| 214 | + host# git clone https://github.com/TexasInstruments/rpmsg-dma.git -b scarthgap |
| 215 | + |
| 216 | +#. Download and install the AM62D Linux SDK from |__SDK_DOWNLOAD_URL__| by following |
| 217 | + the steps in :ref:`Download and Install the SDK <download-and-install-sdk>`. |
| 218 | + |
| 219 | +#. Prepare the environment for cross compilation. |
| 220 | + |
| 221 | + .. code-block:: console |
| 222 | + |
| 223 | + host# source <path to linux installer>/linux-devkit/environment-setup |
| 224 | + |
| 225 | +#. Compile the source: |
| 226 | + |
| 227 | + .. code-block:: console |
| 228 | + |
| 229 | + [linux-devkit]:> cd <path to rpmsg-dma source> |
| 230 | + [linux-devkit]:> cmake -S . -B build; cmake --build build |
| 231 | + |
| 232 | + - This command builds: |
| 233 | + |
| 234 | + - The example application :file:`rpmsg_2dfft_example` |
| 235 | + |
| 236 | + - Transfer the following files to the SD card: |
| 237 | + |
| 238 | + - The example binary :file:`rpmsg_2dfft_example` to :file:`/usr/bin` |
| 239 | + - The test input data file :file:`2dfft_input_data.bin` to :file:`/usr/share/2dfft_test_data/` |
| 240 | + - The expected output data file :file:`2dfft_expected_output_data.bin` to :file:`/usr/share/2dfft_test_data/` |
| 241 | + - The C7 DSP firmware file :file:`fft2d_linux_dsp_offload_example.c75ss0-0.release.strip.out` to :file:`/lib/firmware/` |
| 242 | + |
| 243 | + - Optional: |
| 244 | + |
| 245 | + - To build only the library or only the example, use: |
| 246 | + |
| 247 | + .. code-block:: console |
| 248 | + |
| 249 | + cmake -S . -B build -DBUILD_LIB=OFF # disables library build |
| 250 | + cmake -S . -B build -DBUILD_EXAMPLE=OFF # disables example build |
| 251 | + |
| 252 | +Building the C7 firmware from sources |
| 253 | +===================================== |
| 254 | + |
| 255 | +- For the build instructions, refer to the `MCU+ SDK Documentation <https://software-dl.ti.com/mcu-plus-sdk/esd/AM62DX/latest/exports/docs/api_guide_am62dx/GETTING_STARTED_BUILD.html>`__. |
| 256 | +- For the firmware example, refer to the `C7 2D FFT Demo Firmware <https://software-dl.ti.com/mcu-plus-sdk/esd/AM62DX/latest/exports/docs/api_guide_am62dx/EXAMPLES_DRIVERS_IPC_RPMESSAGE_LINUX_2DFFT_OFFLOAD.html>`__ page. |