-
Congratulations on this. Documentation on the xtensa assembler would be great.
I'm guessing here, but I believe the MicroPython runtime yields periodically to FreeRTOS. Is it possible that the loop is simply blocking for too long? This could be verified (or ruled out) by running other blocking loops in assembler or C.
Re collaboration, I don't think I'm your man. I did write most of the ARMv7 inline assembler docs, but I did so from a position of some knowledge: I'd studied the ARM assembler manual, I'd written inline assembler code, and I knew the C code of the MP asm emitter. By contrast, my knowledge of the ESPx architecture is minimal, having only coded for it in Python. I hope someone steps up who can bring some relevant knowledge and experience to this worthwhile task.
-
Recently I created an initial POC for stubs/documentation for the rp2 asm_pio emitter. That works reasonably well. I know @jimmo was working on using stubs as the source for generating documentation. Combining these approaches might also work for the Xtensa emitter. I could help with the approach, structure and format, not so much with the ESP32 parts.
-
Update: a viper loop that runs for a few seconds also blocks FreeRTOS and leads to a crash. Since this is related to the state of the docs, here are some investigations of the generated machine code.
The hexadecimal dump of a function, e.g. this one for a viper function:
viper function object at: 0x3ffef500
number of args: 0
machine code at: 0x401079e8
-- code --
0600000012c1e00901c911d921e931f941f812f83ff85f2d033d04ed0556530042a00047120842a000022f2cc0000022a07b32a002084fc00000c6fffff841e831d821c811080112c1200df006010000
----------
can be pasted into a hex editor (I used bless) and saved to a file. The Xtensa version of objdump from the SDK/toolchain can then disassemble that file. (I did not find an online disassembler.) Here are the results, after some editing.
# We investigate the assembly of:
@micropython.asm_xtensa
def asm_ret123():
    movi(a2, 123) # hex(123) = 0x7b
Upon entry to the assembly function the registers a0, a12, a13, a14 and a15 are
pushed to the stack and the stack pointer (a1) decreased by 32. Upon
exit, these registers and the stack pointer are restored, and ret.n is
executed to return to the caller (caller address is in a0).
401079c0: 060100 j 0x401079c8 # jump over to actual program start
401079c3: 00 # probably just an align to 32 bits
401079c4: 7b000000 <-- the value 123 = 0x0000007b
401079c8: 12c1e0 addi a1, a1, -32 # actual program start: stack pointer a1 is decreased by 32
401079cb: 0901 s32i.n a0, a1, 0 # the registers a0, a12, a13, a14 and a15 are pushed on the stack
401079cd: c911 s32i.n a12, a1, 4 # there is room for another three 32-bit values left
401079cf: d921 s32i.n a13, a1, 8
401079d1: e931 s32i.n a14, a1, 12
401079d3: f941 s32i.n a15, a1, 16
401079d5: 21fbff l32r a2, 0x401079c4 <-- the asm_xtensa movi instruction is emitted as l32r (load 32 bits, PC-relative)
401079d8: f841 l32i.n a15, a1, 16 # finish: restore registers a15, a14, a13, a12 and a0
401079da: e831 l32i.n a14, a1, 12
401079dc: d821 l32i.n a13, a1, 8
401079de: c811 l32i.n a12, a1, 4
401079e0: 0801 l32i.n a0, a1, 0
401079e2: 12c120 addi a1, a1, 32 # and increase the stack pointer by 32 to its old value
401079e5: 0df0 ret.n # return to the caller (a0 --> next PC)
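The l32r and j targets in this listing can be cross-checked by decoding the immediates by hand. Below is a minimal Python sketch, assuming the standard Xtensa encodings: for l32r a 16-bit immediate in bits 8..23, extended with ones and counted in words from the address of the instruction plus 3 with the two low bits cleared; for j a signed 18-bit offset in bits 6..23, relative to PC + 4. The helper names are made up for illustration and are not part of MicroPython.

```python
def _le24(insn_hex):
    """Read a 3-byte instruction (hex, in memory order) as a little-endian int."""
    raw = bytes.fromhex(insn_hex)
    if len(raw) != 3:
        raise ValueError("expected a 3-byte instruction")
    return int.from_bytes(raw, "little")


def l32r_target(pc, insn_hex):
    """Address of the literal loaded by an l32r located at pc."""
    imm16 = _le24(insn_hex) >> 8      # bits 8..23 hold the immediate
    imm16 -= 0x10000                  # extended with ones: always points backwards
    return ((pc + 3) & ~3) + (imm16 << 2)


def j_target(pc, insn_hex):
    """Destination of a j instruction located at pc."""
    offset = _le24(insn_hex) >> 6     # bits 6..23 hold the signed 18-bit offset
    if offset & 0x20000:              # sign bit set: negative offset
        offset -= 0x40000
    return pc + 4 + offset


print(hex(l32r_target(0x401079d5, "21fbff")))  # 0x401079c4, the literal 123
print(hex(j_target(0x401079c0, "060100")))     # 0x401079c8, the program start
```

Both results agree with the targets objdump printed for asm_ret123.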
############################################
@micropython.asm_xtensa
def asm_add_43981(a2):
    movi(a12, 0xabcd) # hex(43981) = 0xabcd
    add(a2, a2, a12)
40107a54: 060100 j 0x40107a5c
40107a57: 00
40107a58: cdab0000 <-- the value 43981 = 0x0000abcd (bytes: cd ab 00 00)
40107a5c: 12c1e0 addi a1, a1, -32 # actual program start: stack pointer a1 is decreased by 32
40107a5f: 0901 s32i.n a0, a1, 0 # the registers a0, a12, a13, a14 and a15 are pushed on the stack
40107a61: c911 s32i.n a12, a1, 4 # there is room for another three 32-bit values left
40107a63: d921 s32i.n a13, a1, 8
40107a65: e931 s32i.n a14, a1, 12
40107a67: f941 s32i.n a15, a1, 16
40107a69: c1fbff l32r a12, 0x40107a58 <-- load a12 with value 0x0000abcd at this address
40107a6c: c02280 add a2, a2, a12 <-- and add to a2
40107a6f: f841 l32i.n a15, a1, 16
40107a71: e831 l32i.n a14, a1, 12
40107a73: d821 l32i.n a13, a1, 8
40107a75: c811 l32i.n a12, a1, 4
40107a77: 0801 l32i.n a0, a1, 0
40107a79: 12c120 addi a1, a1, 32
40107a7c: 0df0 ret.n
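Since the addresses in these listings were edited by hand, they can be re-derived from the length of each instruction: the narrow `.n` forms take 2 bytes, the others 3. Here is a small Python sketch using a hypothetical helper `assign_addresses` (not a MicroPython API); the byte strings are copied from the asm_add_43981 listing, starting at the addi that opens the function body.

```python
def assign_addresses(start, chunks):
    """Return (address, hex_bytes) pairs, advancing by each chunk's byte length."""
    out = []
    addr = start
    for chunk in chunks:
        out.append((addr, chunk))
        addr += len(bytes.fromhex(chunk))
    return out


# Instruction bytes of asm_add_43981, in listing order.
ASM_ADD_BODY = [
    "12c1e0", "0901", "c911", "d921", "e931", "f941",  # prologue
    "c1fbff", "c02280",                                # l32r a12 ; add a2, a2, a12
    "f841", "e831", "d821", "c811", "0801",            # restore registers
    "12c120", "0df0",                                  # addi a1, a1, 32 ; ret.n
]

for addr, chunk in assign_addresses(0x40107a5c, ASM_ADD_BODY):
    print(f"{addr:08x}: {chunk}")

# The literal is stored little-endian: bytes cd ab 00 00 read back as 0xabcd.
literal = int.from_bytes(bytes.fromhex("cdab0000"), "little")
print(hex(literal))  # 0xabcd
```

The last pair printed is 40107a7c: 0df0, which matches the address of ret.n in the listing.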
############################################
@micropython.viper
def vip_ret123() -> int:
    return 123 # 0x7b
40107a54: 060000 j 0x40107a58
40107a57: 00
40107a58: 12c1e0 addi a1, a1, -32 # actual program start: stack pointer a1 is decreased by 32
40107a5b: 0901 s32i.n a0, a1, 0 # the registers a0, a12, a13, a14 and a15 are pushed on the stack
40107a5d: c911 s32i.n a12, a1, 4
40107a5f: d921 s32i.n a13, a1, 8
40107a61: e931 s32i.n a14, a1, 12
40107a63: f941 s32i.n a15, a1, 16
40107a65: f812 l32i.n a15, a2, 4 # a15 = [a2+4] pointer lookup, use a2 for that
40107a67: f83f l32i.n a15, a15, 12 # a15 = [[a2+4]+12]
40107a69: f85f l32i.n a15, a15, 20 # a15 = [[[a2+4]+12]+20]
40107a6b: 2d03 mov.n a2, a3 # a3, a4, a5 --> a2, a3, a14
40107a6d: 3d04 mov.n a3, a4
40107a6f: ed05 mov.n a14, a5
40107a71: 565300 bnez a3, 0x40107a7a
40107a74: 42a000 movi a4, 0
40107a77: 471208 beq a2, a4, 0x40107a83
40107a7a: 42a000 movi a4, 0
40107a7d: 022f2c l32i a0, a15, 176
40107a80: c00000 callx0 a0
40107a83: 22a07b movi a2, 123 <-- the return value
40107a86: 32a002 movi a3, 2
40107a89: 084f l32i.n a0, a15, 16
40107a8b: c00000 callx0 a0
40107a8e: c6ffff j 0x40107a91 <-- jump to register pop from stack
40107a91: f841 l32i.n a15, a1, 16
40107a93: e831 l32i.n a14, a1, 12
40107a95: d821 l32i.n a13, a1, 8
40107a97: c811 l32i.n a12, a1, 4
40107a99: 0801 l32i.n a0, a1, 0
40107a9b: 12c120 addi a1, a1, 32
40107a9e: 0df0 ret.n
These examples nicely demonstrate how registers are pushed to and popped from the stack, which needs to go into the docs too.
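The hex-editor step described at the start of this comment can also be scripted. The following is a minimal Python sketch that turns the printed hex dump into a raw binary file; the helper name and the file name are arbitrary choices, and the objdump command line in the final comment is an assumption about the ESP32 toolchain that may need adjusting for your installation.

```python
import os
import tempfile

# Hex dump copied from the "-- code --" section of the viper function above.
HEX_DUMP = "0600000012c1e00901c911d921e931f941f812f83ff85f2d033d04ed0556530042a00047120842a000022f2cc0000022a07b32a002084fc00000c6fffff841e831d821c811080112c1200df006010000"


def hex_dump_to_file(hex_text, path):
    """Write the bytes encoded by hex_text (whitespace ignored) to path."""
    data = bytes.fromhex("".join(hex_text.split()))
    with open(path, "wb") as f:
        f.write(data)
    return data


path = os.path.join(tempfile.gettempdir(), "viper_ret123.bin")
data = hex_dump_to_file(HEX_DUMP, path)
print(len(data), "bytes written to", path)

# Assumed invocation (tool name and flags may differ on your setup):
#   xtensa-esp32-elf-objdump -D -b binary -m xtensa --adjust-vma=0x401079e8 viper_ret123.bin
```

Passing the "machine code at" address as the load address should make the disassembler print addresses relative to where the code actually lived on the device.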
-
This documentation seems very relevant. Thanks a lot for doing the research. Right now there is not even a mention of the existence of asm_xtensa in the MicroPython documentation, so it is impossible for people to get started with it. From this perspective, almost any addition to the documentation would be a good step forward.
-
Related, I also note that there are no automated tests for
-
I wanted to try the CRC algorithms with the @micropython.asm_xtensa assembler and found that information on it is scarce.
So I wrote a preliminary document that I present here.
It's a mix of briefing on instructions (in comments) and test/demonstration code that I used to check the functionality.
You should be able to copy/save and run that.
The last function - which is a loop - seems to reveal a bug. It crashes the system if the loop takes too many iterations.
If someone is interested in transforming this to the official MP documentation I would like to join/collaborate. Perhaps @pythoncoder, who has done much of the docs already?