Learn LEGv8

Mitchell Talyat edited this page Oct 19, 2022 · 4 revisions

Introduction

LEGv8 is a simplified subset of the ARMv8 instruction set, designed for teaching.

Assembly

The following is a guide to LEGv8 Assembly. I am no professional: I am currently a student learning this as I go. If any significant features are missing, or any information is wrong, please open an issue so it can be addressed.

Instructions

Instructions are the fixed set of commands that the computer hardware recognizes. The CPU itself only understands binary codes; assembly is the human-readable representation of those 1s and 0s. Every instruction, including its type and its operand values, is packed into a 32-bit string of 1s and 0s that the hardware can execute. Instructions are split into 6 categories, or formats, which define how each instruction is turned into a binary value. Each format is designed around the operands its instructions need.

Instruction Formats

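Every instruction format begins with an opcode in its highest bits. The opcode is the code that identifies which instruction is being executed. The size of the opcode varies from format to format. In each layout below, the first row names the fields and the second row gives their bit positions (bit 31 is the most significant).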

R (Register format)

| opcode | Rm | shamt | Rn | Rd |
| --- | --- | --- | --- | --- |
| 31-21 | 20-16 | 15-10 | 9-5 | 4-0 |

R format instructions hold 5 values: opcode, Rm, shamt, Rn and Rd. Rm, Rn, and Rd are all register indices. Shamt is short for shift amount, which is used for shift instructions such as LSL and LSR.
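As a rough illustration of the layout above, the following Python sketch packs the five R-format fields into a 32-bit word. The function name `encode_r` is hypothetical, and the ADD opcode value is taken from the standard LEGv8 reference card rather than from this page, so treat it as an assumption.

```python
def encode_r(opcode: int, rm: int, shamt: int, rn: int, rd: int) -> int:
    """Pack the five R-format fields into one 32-bit word."""
    # Reject any field that does not fit in its bit width.
    for name, value, bits in (("opcode", opcode, 11), ("Rm", rm, 5),
                              ("shamt", shamt, 6), ("Rn", rn, 5), ("Rd", rd, 5)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
    # Bit positions: opcode 31-21, Rm 20-16, shamt 15-10, Rn 9-5, Rd 4-0.
    return (opcode << 21) | (rm << 16) | (shamt << 10) | (rn << 5) | rd


ADD_OPCODE = 0b10001011000  # assumption: ADD opcode from the LEGv8 reference card

# ADD X0, X1, X2  ->  Rd = 0, Rn = 1, Rm = 2, shamt unused (0)
word = encode_r(ADD_OPCODE, rm=2, shamt=0, rn=1, rd=0)
print(f"{word:032b}")
print(hex(word))  # 0x8b020020
```

Note that the operand order in assembly (Rd, Rn, Rm) is the reverse of the field order in the encoded word.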

I (Immediate format)

| opcode | ALU_immediate | Rn | Rd |
| --- | --- | --- | --- |
| 31-22 | 21-10 | 9-5 | 4-0 |

I format instructions hold 4 values: opcode, ALU_immediate, Rn and Rd. Rn and Rd are both register indices. ALU_immediate holds the literal number used by the instruction.
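Going the other way, a 32-bit word can be split back into its I-format fields with shifts and masks. This is a minimal sketch: `decode_i` is a hypothetical name, and the sample word is the standard ARMv8 encoding of `ADDI X0, X1, 5`, which is assumed here rather than taken from this extension.

```python
def decode_i(word: int) -> dict:
    """Split a 32-bit word into I-format fields using shifts and masks."""
    return {
        "opcode": (word >> 22) & 0x3FF,         # bits 31-22 (10 bits)
        "ALU_immediate": (word >> 10) & 0xFFF,  # bits 21-10 (12 bits)
        "Rn": (word >> 5) & 0x1F,               # bits 9-5   (5 bits)
        "Rd": word & 0x1F,                      # bits 4-0   (5 bits)
    }


# Assumed encoding of ADDI X0, X1, 5
fields = decode_i(0x91001420)
print(fields)
```

The 12-bit width of ALU_immediate means a single I-format instruction can only carry values from 0 to 4095.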

D (Data transfer format)

| opcode | DT_address | op | Rn | Rt |
| --- | --- | --- | --- | --- |
| 31-21 | 20-12 | 11-10 | 9-5 | 4-0 |

D format instructions hold 5 values: opcode, DT_address, op, Rn and Rt. Rn and Rt are both register indices. Op is a special value used in some situations, and DT_address is the offset in memory. D format is used by the load and store instructions, such as LDUR and STUR. DT_address is combined with Rn: the Rn register holds a pointer, and DT_address is added to it to get the address in memory.
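The address calculation described above can be sketched in Python as follows. The helper names, the flat `bytearray` memory, and the choice of 8-byte little-endian values are all assumptions made for illustration; the extension's own memory model may differ.

```python
def effective_address(registers: list, rn: int, dt_address: int) -> int:
    """Address used by a D-format instruction: pointer in Rn plus DT_address."""
    return registers[rn] + dt_address


def stur(registers: list, memory: bytearray, rt: int, rn: int, dt_address: int) -> None:
    """Store the 8-byte value of register Rt at [Rn + DT_address]."""
    addr = effective_address(registers, rn, dt_address)
    memory[addr:addr + 8] = registers[rt].to_bytes(8, "little")


def ldur(registers: list, memory: bytearray, rt: int, rn: int, dt_address: int) -> None:
    """Load 8 bytes from [Rn + DT_address] into register Rt."""
    addr = effective_address(registers, rn, dt_address)
    registers[rt] = int.from_bytes(memory[addr:addr + 8], "little")


registers = [0] * 32
memory = bytearray(64)
registers[0] = 0x1122334455667788
registers[1] = 16                    # X1 holds a pointer

stur(registers, memory, 0, 1, 8)     # STUR X0, [X1, 8]
ldur(registers, memory, 2, 1, 8)     # LDUR X2, [X1, 8]
print(hex(registers[2]))
```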

B (Branch format)

| opcode | BR_address |
| --- | --- |
| 31-26 | 25-0 |

B format instructions hold 2 values: opcode and BR_address. BR_address holds a value that represents the line of code to branch/jump to.
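In the ARMv8 hardware encoding, BR_address is stored as a signed distance, counted in instructions, from the branch to its target. Whether this extension stores a distance or an absolute line index is not stated here, so the sketch below is only an illustration of the hardware convention. `encode_b` is a hypothetical name and the opcode for B is assumed from the LEGv8 reference card.

```python
B_OPCODE = 0b000101  # assumption: opcode for B from the LEGv8 reference card


def encode_b(opcode: int, offset: int) -> int:
    """Pack a 6-bit opcode and a signed 26-bit instruction offset."""
    if not -(1 << 25) <= offset < (1 << 25):
        raise ValueError(f"offset {offset} does not fit in 26 signed bits")
    # Masking keeps the low 26 bits, which gives the two's complement
    # representation for negative offsets.
    return (opcode << 26) | (offset & 0x3FFFFFF)


# A branch at instruction index 5 that jumps back to a label at index 3.
current_index, target_index = 5, 3
word = encode_b(B_OPCODE, target_index - current_index)
print(hex(word))  # 0x17fffffe
```

A backward branch therefore has its address field filled mostly with 1s, which is why loops often encode to values such as `0x17ff...`.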

CB (Conditional Branch format)

| opcode | COND_BR_address | Rt |
| --- | --- | --- |
| 31-24 | 23-5 | 4-0 |

CB format instructions hold 3 values: opcode, COND_BR_address, and Rt. Rt is a register index. COND_BR_address is a value that represents the line of code to branch/jump to. For CBZ and CBNZ, the value in the Rt register is tested to decide whether to branch; for the B.cond instructions, the flags are used instead.
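The sketch below decodes a CB-format word into readable text. It assumes the standard ARMv8 conventions: the opcodes for CBZ, CBNZ and B.cond come from the LEGv8 reference card, the address field is a signed instruction offset, and for B.cond the Rt field carries a condition code instead of a register index. None of these details are confirmed by this page, and `decode_cb` is a hypothetical name.

```python
# Assumption: ARMv8 condition codes stored in the Rt field of B.cond.
COND_NAMES = {0x0: "EQ", 0x1: "NE", 0x2: "HS", 0x3: "LO", 0x4: "MI",
              0x5: "PL", 0x6: "VS", 0x7: "VC", 0x8: "HI", 0x9: "LS",
              0xA: "GE", 0xB: "LT", 0xC: "GT", 0xD: "LE"}


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def decode_cb(word: int) -> str:
    """Render a CB-format word as text, with the branch offset in instructions."""
    opcode = (word >> 24) & 0xFF                      # bits 31-24
    offset = sign_extend((word >> 5) & 0x7FFFF, 19)   # bits 23-5
    rt = word & 0x1F                                  # bits 4-0
    if opcode == 0x54:
        return f"B.{COND_NAMES.get(rt, '??')} {offset:+d}"
    if opcode == 0xB4:
        return f"CBZ X{rt}, {offset:+d}"
    if opcode == 0xB5:
        return f"CBNZ X{rt}, {offset:+d}"
    raise ValueError(f"not a CB-format opcode: {opcode:#x}")


print(decode_cb(0xB4000060))
print(decode_cb(0x54000061))
print(decode_cb(0xB5FFFFE1))
```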

IM (Immediate Move format)

| opcode | MOV_immediate | Rd |
| --- | --- | --- |
| 31-21 | 20-5 | 4-0 |

IM format instructions hold 3 values: opcode, MOV_immediate, and Rd. Rd is a register index. MOV_immediate is an immediate value that is 'moved' or stored within a register.
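Because MOV_immediate is only 16 bits wide, larger constants are usually built in pieces. The following sketch models the two IM-format instructions, MOVZ and MOVK, on a 64-bit register value. The optional `shift` argument mirrors the ARMv8 `LSL 0/16/32/48` operand, which this page does not describe, so treat it as an assumption; the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def movz(imm16: int, shift: int = 0) -> int:
    """MOVZ: place the 16-bit immediate at `shift`, all other bits zero."""
    return ((imm16 & 0xFFFF) << shift) & MASK64


def movk(old: int, imm16: int, shift: int = 0) -> int:
    """MOVK: replace only the 16 bits at `shift`, keep every other bit of `old`."""
    cleared = old & ~(0xFFFF << shift) & MASK64
    return cleared | ((imm16 & 0xFFFF) << shift)


# Build the constant 0x12345678 in two steps.
x0 = movz(0x5678)            # X0 = 0x0000000000005678
x0 = movk(x0, 0x1234, 16)    # X0 = 0x0000000012345678
print(hex(x0))

# MOVK leaves the upper 48 bits alone.
print(hex(movk(MASK64, 5)))  # 0xffffffffffff0005
```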

Core Instructions

The following are all the instructions this extension supports or plans to support. These core instructions handle everything from adding values, to accessing memory, to conditional branching.

| Implemented? | Name | Mnemonic | Format | Definition | Example |
| --- | --- | --- | --- | --- | --- |
| YES | ADD | ADD | R | Adds two registers together. | ADD X0, X1, X2 // adds X1 and X2, stores result in X0 |
| YES | ADD and Set flags | ADDS | R | Adds two registers together and sets flags. | ADDS X0, X1, X2 // adds X1 and X2, stores result in X0, sets flags |
| YES | ADD Immediate | ADDI | I | Adds a register to an immediate value. | ADDI X0, X1, 5 // adds 5 to X1, stores result in X0 |
| YES | ADD Immediate and Set flags | ADDIS | I | Adds a register to an immediate value, and sets flags. | ADDIS X0, X1, 5 // adds 5 to X1, stores result in X0, sets flags |
| YES | AND | AND | R | Bitwise AND using two registers. | AND X0, X1, X2 // stores result of X1 AND X2 in X0 |
| YES | AND and Set flags | ANDS | R | Bitwise AND using two registers, and sets flags. | ANDS X0, X1, X2 // stores result of X1 AND X2 in X0, sets flags |
| YES | AND Immediate | ANDI | I | Bitwise AND using a register and an immediate value. | ANDI X0, X1, 5 // stores result of X1 AND 5 in X0 |
| YES | AND Immediate and Set flags | ANDIS | I | Bitwise AND using a register and an immediate value, and sets flags. | ANDIS X0, X1, 5 // stores result of X1 AND 5 in X0, sets flags |
| YES | Branch unconditionally | B | B | Starts executing instructions at the given label. | B loop // branches to label "loop:" |
| YES | Branch conditionally (EQual) | B.EQ | CB | Branches to the given label if the zero flag is set. | B.EQ loop // branches to label "loop:" if the last flag-setting instruction compared equal values |
| YES | Branch conditionally (Not Equal) | B.NE | CB | Branches to the given label if the zero flag is not set. | B.NE loop // branches to label "loop:" if the last flag-setting instruction compared unequal values |
| YES | Branch conditionally (Less Than) | B.LT | CB | ... | ... |
| YES | Branch conditionally (Less than or Equal) | B.LE | CB | ... | ... |
| YES | Branch conditionally (Greater Than) | B.GT | CB | ... | ... |
| YES | Branch conditionally (Greater than or Equal) | B.GE | CB | ... | ... |
| YES | Branch conditionally (HIgher) | B.HI | CB | ... | ... |
| YES | Branch conditionally (Higher or Same) | B.HS | CB | ... | ... |
| YES | Branch conditionally (LOwer) | B.LO | CB | ... | ... |
| YES | Branch conditionally (Lower or Same) | B.LS | CB | ... | ... |
| YES | Branch conditionally (on MInus) | B.MI | CB | ... | ... |
| YES | Branch conditionally (on PLus) | B.PL | CB | ... | ... |
| NO | Branch conditionally (on oVerflow Set) | B.VS | CB | ... | ... |
| NO | Branch conditionally (on oVerflow Clear) | B.VC | CB | ... | ... |
| YES | Branch with Link | BL | B | Branches to the given label, and sets the return register (LR/X30) to the current execution index. | BL loop // branches to label "loop:", saves execution index for return later |
| YES | Branch to Register | BR | R | Branches to the execution index stored in the given register. | BR X30 // branches to the execution index stored in X30 (could be any register) |
| YES | Compare and Branch if Not Zero | CBNZ | CB | Branches to the given label if the given register is not zero. | CBNZ X0, loop // branches to label "loop:" if the given register is not zero |
| YES | Compare and Branch if Zero | CBZ | CB | Branches to the given label if the given register is zero. | CBZ X0, loop // branches to label "loop:" if the given register is zero |
| YES | Exclusive OR | EOR | R | Exclusive OR using two registers. | EOR X0, X1, X2 // stores result of X1 XOR X2 in X0 |
| YES | Exclusive OR Immediate | EORI | I | Exclusive OR using a register and an immediate value. | EORI X0, X1, 5 // stores result of X1 XOR 5 in X0 |
| YES | LoaD Register Unscaled offset | LDUR | D | Retrieves data from memory and stores it in a register. | LDUR X0, [X1, 0] // loads the memory at pointer location X1 + offset (0) into register X0 |
| YES | LoaD Byte Unscaled offset | LDURB | D | ... | ... |
| YES | LoaD Half Unscaled offset | LDURH | D | ... | ... |
| YES | LoaD Signed Word Unscaled offset | LDURSW | D | ... | ... |
| YES | Logical Shift Left | LSL | R | Shifts the register to the left (<<) by the immediate value. | LSL X0, X1, 5 // shifts X1 left by 5 bits, stores result in X0 |
| YES | Logical Shift Right | LSR | R | Shifts the register to the right (>>) by the immediate value. | LSR X0, X1, 5 // shifts X1 right by 5 bits, stores result in X0 |
| YES | MOVe wide with Keep | MOVK | IM | Stores the immediate value in the register, but does not change the other bits in the register. | MOVK X0, 5 // sets the low 16 bits of X0 to 5, leaves the other bits unchanged |
| YES | MOVe wide with Zero | MOVZ | IM | Stores the immediate value in the register, and sets the other bits in the register to zero. | MOVZ X0, 5 // sets X0 to 5, all other bits become 0 |
| YES | Inclusive OR | ORR | R | Bitwise OR using two registers. | ORR X0, X1, X2 // stores result of X1 OR X2 in X0 |
| YES | Inclusive OR Immediate | ORRI | I | Bitwise OR using a register and an immediate value. | ORRI X0, X1, 5 // stores result of X1 OR 5 in X0 |
| YES | STore Register Unscaled offset | STUR | D | Stores a register value to memory. | STUR X0, [X1, 0] // stores the value in register X0 to memory at location X1 + offset (0) |
| YES | STore Byte Unscaled offset | STURB | D | ... | ... |
| YES | STore Half Unscaled offset | STURH | D | ... | ... |
| YES | STore Signed Word Unscaled offset | STURSW | D | ... | ... |
| YES | SUBtract | SUB | R | Subtracts one register from another. | SUB X0, X1, X2 // subtracts X2 from X1, stores result in X0 |
| YES | SUBtract and Set flags | SUBS | R | Subtracts one register from another, and sets flags. | SUBS X0, X1, X2 // subtracts X2 from X1, stores result in X0, sets flags |
| YES | SUBtract Immediate | SUBI | I | Subtracts an immediate value from a register. | SUBI X0, X1, 5 // subtracts 5 from X1, stores result in X0 |
| YES | SUBtract Immediate and Set flags | SUBIS | I | Subtracts an immediate value from a register, and sets flags. | SUBIS X0, X1, 5 // subtracts 5 from X1, stores result in X0, sets flags |
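The flag-setting instructions and the conditional branches work as a pair: an instruction such as SUBS records four flags (N, Z, C, V), and a later B.cond inspects them. The sketch below follows the standard ARMv8 rules for a 64-bit subtraction and for each condition code. It is an assumption that this extension behaves the same way, and the names `subs_flags` and `branch_taken` are hypothetical.

```python
MASK64 = (1 << 64) - 1


def subs_flags(a: int, b: int) -> dict:
    """Flags produced by the 64-bit subtraction a - b."""
    a &= MASK64
    b &= MASK64
    result = (a - b) & MASK64
    return {
        "N": bool(result >> 63),                    # result is negative
        "Z": result == 0,                           # result is zero
        "C": a >= b,                                # no unsigned borrow
        "V": bool(((a ^ b) & (a ^ result)) >> 63),  # signed overflow
    }


# Assumption: ARMv8 meaning of each condition suffix.
CONDITIONS = {
    "EQ": lambda f: f["Z"],
    "NE": lambda f: not f["Z"],
    "HS": lambda f: f["C"],
    "LO": lambda f: not f["C"],
    "MI": lambda f: f["N"],
    "PL": lambda f: not f["N"],
    "VS": lambda f: f["V"],
    "VC": lambda f: not f["V"],
    "HI": lambda f: f["C"] and not f["Z"],
    "LS": lambda f: not f["C"] or f["Z"],
    "GE": lambda f: f["N"] == f["V"],
    "LT": lambda f: f["N"] != f["V"],
    "GT": lambda f: not f["Z"] and f["N"] == f["V"],
    "LE": lambda f: f["Z"] or f["N"] != f["V"],
}


def branch_taken(cond: str, a: int, b: int) -> bool:
    """Would B.<cond> branch after comparing a with b (SUBS XZR, a, b)?"""
    return CONDITIONS[cond](subs_flags(a, b))


print(branch_taken("LT", 3, 5))   # signed: 3 < 5
print(branch_taken("LT", -1, 1))  # signed: -1 < 1
print(branch_taken("LO", -1, 1))  # unsigned: 0xFFFF...FFFF is not lower than 1
```

The last two lines show why both signed (LT, LE, GT, GE) and unsigned (LO, LS, HI, HS) conditions exist: the same bit pattern compares differently depending on how it is interpreted.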

Arithmetic Core Instructions

The following instructions perform arithmetic beyond simple addition and subtraction.

| Implemented? | Name | Mnemonic | Format | Definition | Example |
| --- | --- | --- | --- | --- | --- |
| YES | MULtiply | MUL | R | Multiplies two registers together. | MUL X0, X1, X2 // multiplies X1 and X2 together, stores result in X0 |
| YES | Signed DIVide | SDIV | R | Divides one register by another, treating both as signed values. | SDIV X0, X1, X2 // divides X1 by X2, stores result in X0 |
| NO | Signed MULtiply High | SMULH | R | ... | ... |
| YES | Unsigned DIVide | UDIV | R | Divides one register by another, treating both as unsigned values. | UDIV X0, X1, X2 // divides X1 by X2, stores result in X0 |
| NO | Unsigned MULtiply High | UMULH | R | ... | ... |
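SDIV and UDIV differ only in how they interpret the same 64 bits. The following sketch models both on Python integers, following the ARMv8 rules (signed division truncates toward zero, and dividing by zero yields 0). Whether this extension handles those edge cases the same way is an assumption, and the helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def to_signed(x: int) -> int:
    """Interpret a 64-bit pattern as a two's complement signed integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def udiv(a: int, b: int) -> int:
    """UDIV: both operands are treated as unsigned 64-bit values."""
    a &= MASK64
    b &= MASK64
    return 0 if b == 0 else a // b


def sdiv(a: int, b: int) -> int:
    """SDIV: operands are signed; the quotient truncates toward zero."""
    sa, sb = to_signed(a), to_signed(b)
    if sb == 0:
        return 0
    quotient = abs(sa) // abs(sb)
    if (sa < 0) != (sb < 0):
        quotient = -quotient
    return quotient & MASK64   # back to a 64-bit pattern


minus_seven = -7 & MASK64      # the bit pattern a register would hold for -7
print(to_signed(sdiv(minus_seven, 2)))
print(udiv(minus_seven, 2))
```

Python's own `//` operator rounds toward negative infinity, which is why `sdiv` divides the absolute values and restores the sign afterwards.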

Pseudo Instructions

These instructions are shorthand for another instruction. They can be written in a program and behave as expected; when the program is assembled, each one is swapped out for its equivalent.

| Implemented? | Name | Mnemonic | Equivalent | Definition | Example |
| --- | --- | --- | --- | --- | --- |
| YES | CoMPare | CMP | SUBS XZR, m, n where m and n are registers | Compares the two given registers by setting the flags. | CMP X0, X1 // compares X0 and X1 |
| YES | CoMPare Immediate | CMPI | SUBIS XZR, m, n where m is a register, and n is an immediate value | Compares the given register and the given value by setting the flags. | CMPI X0, 5 // compares X0 and 5 |
| NO | LoaD Address | LDA | ... | Loads the address of a label into a register. | LDA X0, loop // loads the address of the label "loop" into register X0 |
| YES | MOVe | MOV | ORR m, XZR, n where m and n are registers | Moves a register value to another register. | MOV X0, X1 // sets X0 to X1 |
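The substitution an assembler performs for these pseudo instructions can be sketched as a simple text rewrite. The function below is a hypothetical illustration that uses the equivalents listed in the table; it assumes each line holds one instruction with comma-separated operands and no trailing comment, and it is not the extension's actual implementation.

```python
def expand_pseudo(line: str) -> str:
    """Rewrite a pseudo instruction as its real equivalent; pass others through."""
    text = line.strip()
    mnemonic, _, rest = text.partition(" ")
    operands = [op.strip() for op in rest.split(",")] if rest else []

    if mnemonic == "CMP" and len(operands) == 2:
        # CMP m, n  ->  SUBS XZR, m, n
        return f"SUBS XZR, {operands[0]}, {operands[1]}"
    if mnemonic == "CMPI" and len(operands) == 2:
        # CMPI m, n  ->  SUBIS XZR, m, n
        return f"SUBIS XZR, {operands[0]}, {operands[1]}"
    if mnemonic == "MOV" and len(operands) == 2:
        # MOV m, n  ->  ORR m, XZR, n
        return f"ORR {operands[0]}, XZR, {operands[1]}"
    return text


for source_line in ("CMP X0, X1", "CMPI X0, 5", "MOV X0, X1", "ADD X0, X1, X2"):
    print(f"{source_line:16} -> {expand_pseudo(source_line)}")
```

Writing the result of a subtraction to XZR discards it, so only the flags survive, which is exactly what a comparison needs.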

Debug Instructions

These are instructions that are specific to this extension and may not be implemented in other LEGv8 applications.

| Implemented? | Name | Mnemonic | Definition | Example |
| --- | --- | --- | --- | --- |
| YES | DUMP | DUMP | Displays all registers and memory in the output, and stops the program. | DUMP // show all register values and memory at this point in time |
| YES | HALT program | HALT | Displays all registers and memory in the output, and stops the program. | HALT // show all register values and memory at this point in time |
| YES | PRiNT | PRNT | Prints a string to the output. | PRNT The value of X0 is {X0} // prints "The value of X0 is 5" to the output, or PRNT X0 // prints 5 to the output |
| YES | PRiNt Line | PRNL | Prints an empty line to the output. | PRNL // prints an empty line to the output |

Procedures

Procedures are the way to organize instructions within a file. It is tempting to think of them as functions, but although similar, they are not quite the same. Procedures are used with branch instructions, which jump around the code. They can implement functions, as well as other things such as loops or conditionals.

Procedures are not actual instructions themselves. Instead, they provide a reference for the assembler. That way, when a branch instruction is used, it can reference the procedure so it knows the line index that it needs to branch to.

The syntax for procedures is as follows.

Declaring a procedure:

procedure_name:

Branching to a procedure:

B procedure_name // branch to procedure_name

CBNZ X0, procedure_name // if X0 != 0, branch to procedure_name
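To show how an assembler can turn procedure names into line indices, here is a minimal Python sketch of a label-collecting pass. It assumes that a label is any line ending in `:`, that `//` starts a comment, and that indices count instructions only; the function name `resolve_labels` and the sample program are hypothetical and not taken from the extension.

```python
def resolve_labels(lines: list) -> tuple:
    """Collect label positions and return (labels, instructions)."""
    labels = {}
    instructions = []
    for raw in lines:
        text = raw.split("//", 1)[0].strip()  # drop comments and whitespace
        if not text:
            continue
        if text.endswith(":"):
            # A label refers to the next instruction that follows it.
            labels[text[:-1]] = len(instructions)
        else:
            instructions.append(text)
    return labels, instructions


source = [
    "MOVZ X0, 3",
    "loop:",
    "SUBI X0, X0, 1",
    "CBNZ X0, loop // repeat until X0 == 0",
    "done:",
    "HALT",
]

labels, instructions = resolve_labels(source)
print(labels)
print(instructions)
```

With the labels resolved up front, a branch such as `CBNZ X0, loop` only needs a dictionary lookup to find the index it should jump to, even when the label is declared later in the file than the branch that uses it.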