Category:ARM Assembly
This programming language may be used to instruct a computer to perform a task.
The ARM architecture is widely used in mobile phones and tablets. It is a RISC (Reduced Instruction Set Computer) design, which means it has fewer and simpler opcodes than a CPU such as those in the x86 family. However, it makes up for this with its speed. The ARM and its variants are used in many well-known systems such as the Raspberry Pi, Nintendo DS, iPad, and more.
Registers
The ARM has 16 main registers the programmer can use, numbered R0 through R15. The highest-numbered ones have special purposes (R13 is the stack pointer, R14 is the link register, and R15 is the program counter), but R0 through R10 can be used for anything. In other words, there are no commands that only work with R0 (system calls notwithstanding). Registers with a specific purpose have alternate abbreviations, such as SP, LR, and PC, that your assembler allows you to use for clarity.
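As a small illustration of those abbreviations, the sketch below writes the same operations twice, once with register numbers and once with names. It assumes an assembler that accepts the common aliases sp and lr (most do); each pair of lines is expected to assemble to identical machine code.
<lang ARM Assembly>mov r0,r13      ;copy the stack pointer into r0
mov r0,sp       ;the same instruction, written with the alias

mov r1,r14      ;copy the link register into r1
mov r1,lr       ;the same instruction, written with the alias</lang>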
Barrel Shifter
The ARM can add a bit shift or rotate to one of its operands at no additional cost to execution time or code size. If the operand being shifted is a register, the value of that register is not actually changed. The shift or rotate only applies during that instruction.
<lang ARM Assembly>add r0,r0,r1,lsl #2 ;shift r1 left 2 bits, add the shifted value to r0, store the result in r0. r1 is unchanged after this instruction</lang>
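A few more uses of the barrel shifter are sketched below, using the four basic shift types (lsl, lsr, asr, ror) with a constant shift amount. The register choices are arbitrary and only for illustration.
<lang ARM Assembly>add r0,r0,r0,lsl #2   ;r0 = r0 + (r0 * 4), i.e. multiply r0 by 5 in one instruction
mov r1,r2,lsr #8      ;r1 = r2 shifted right 8 bits, zeroes shifted in at the top
mov r1,r2,asr #1      ;r1 = r2 shifted right 1 bit, sign bit preserved (signed divide by 2, rounding down)
mov r1,r2,ror #16     ;r1 = r2 rotated right 16 bits, i.e. the two halfwords swapped
;r2 is unchanged by all of the above</lang>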
Separate Destination for Math
With the x86, 68000, and other similar processors, arithmetic functions take two operands: the source and the destination. Anytime you add two numbers, one of them gets changed. This is not the case with the ARM. The destination can be a third register that isn't involved in the arithmetic whatsoever!
<lang ARM Assembly> add r3,r2,r1 ;add r2 to r1 and store the result in r3. r1 and r2 are unchanged.</lang>
Conditional Opcodes
Checking for condition codes isn't just limited to branching on the ARM. Almost every instruction can be made conditional. If the condition is not met, the opcode will have no effect. This saves a lot of cycles that would be spent branching just to execute a single instruction.
Compare the following snippets of code. The first is written in 8086 Assembly, the second in ARM. Both do the same thing, but ARM can do it without branching.
<lang asm>mov ax, word ptr [ds:TestData] ;dereference the pointer to TestData and store the value contained within that address into ax
add ax,1        ;add 1 to ax
jo OverflowSet  ;the addition caused an overflow, jump to this label.
ret             ;return from subroutine

OverflowSet:
sub ax,1        ;rollback the previous addition.
ret             ;return from subroutine.</lang>
The same code translated to ARM doesn't need to branch:
<lang ARM Assembly>ldr r1,=TestData ;get the address of TestData
ldr r0,[r1]      ;load the 32-bit value stored at TestData into r0
adds r0,r0,#1    ;add 1 to r0 and store the result in r0, updating the flags accordingly.
subvs r0,r0,#1   ;subtract 1 from r0 and store the result in r0, only if the overflow flag was set.</lang>
If your code does one thing when a flag is set and another when that same flag is clear, the ARM can select the correct option without having to branch at all:
<lang ARM Assembly>;ARM ASSEMBLY
ldr r1,=TestData ;get the address of TestData
ldr r0,[r1]      ;load the 32-bit value stored at TestData into r0. Loads never update the flags.
cmp r0,#0        ;compare r0 to zero, updating the flags accordingly.
addeq r0,r0,r2   ;if r0 equals zero, add r2 to r0 and store the result in r0.
subne r0,r0,r2   ;if r0 doesn't equal zero, subtract r2 from r0 and store the result in r0.</lang>
The equivalent in x86 would take at least one branch, maybe 2 depending on the outcome:
<lang asm>;x86 ASSEMBLY
mov ax, word ptr [ds:TestData]
cmp ax,0
jne subtract_bx
add ax,bx
jmp done
subtract_bx:
sub ax,bx
done:</lang>
Setting Flags
The flags, or condition codes, are only set by instructions that end in an "s," or by compare commands such as CMP. This lets you "preserve" the processor's state after an important calculation, but do some other things before execution branches depending on the result of that calculation. On many other processors, such as the 6502 and 68000, even loading a register updates the flags, so the calculation that determines whether a branch occurs must happen immediately before that branch statement or the branch will be taken/not taken based on the wrong data.
<lang ARM Assembly>cmp r0,r1     ;compare r0 to r1
ldr r2,[r3]   ;load r2 from the address stored in r3
ldr r3,[r4]   ;load r3 from the address stored in r4
bne myLabel   ;branch to myLabel if r0 was not equal to r1 when "cmp r0,r1" ran.</lang>
Processors whose load instructions update the flags would have to push and pop the condition code register between the compare and the branch. Otherwise, the act of loading r2 and r3 would affect the outcome of the branch. Not so on the ARM!
Call Stack
Most processors, including the x86 family, will use the same hardware stack for function arguments, local variables, and return addresses. The ARM doesn't actually need to store a return address onto the stack until subroutines are nested (though ARM Assembly written by a compiler will most likely do so anyway). This is because the link register, or r14, is responsible for holding the return address. BL is the equivalent of CALL on the x86 architecture, and instead of pushing the return address to the stack, it copies the address of the following instruction to the link register before the branch. Once the function is complete, execution returns by moving the value in the link register back into the program counter. For nested subroutines, the link register will need to be pushed onto the stack, as the link register can only "remember" the return address of the most recent BL instruction.
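The difference between a leaf subroutine and a nested one can be sketched as follows. The labels outer and leaf are hypothetical, and the example assumes the usual full-descending stack, with the link register saved and restored using stmfd and ldmfd.
<lang ARM Assembly>outer:                ;calls another subroutine, so the link register must be saved
    stmfd sp!,{lr}    ;push the return address of outer's caller onto the stack
    bl leaf           ;this overwrites lr with the address of the next instruction
    ldmfd sp!,{pc}    ;pop the saved return address straight into the program counter

leaf:                 ;calls nothing, so the stack is never touched
    add r0,r0,#1      ;do some work
    bx lr             ;return by copying the link register into the program counter</lang>
Popping directly into the program counter saves an instruction compared with restoring the link register first and then returning.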
Limitations of the ARM
While the ARM has a rich set of features that other processors only dream of having, there are a few limitations.
The biggest one (which was more of an issue on earlier versions such as the ARM7TDMI CPU in the Game Boy Advance) is the limitation of the MOV command. Arguably the most important command any processor has (apart from JMP), the MOV command on the ARM is limited in what can be loaded into a register in a single command. Depending on the pattern of bits, some immediate values cannot be loaded into a register directly. Every ARM instruction is exactly 32 bits wide, and the key features of the ARM instructions (barrel shifter, conditional commands, etc) all take up bits in each command, whether they are used in a given instance of a command or not. That leaves only 12 bits for an immediate operand: an 8-bit value and a 4-bit rotation. So in order to fit in a MOV command, a 32-bit value has to be "8-bit rotatable," meaning that it can be expressed as an 8-bit number rotated right by an even number of bits. Basically, if the 1s in the binary equivalent of the number you're trying to load are spread across more than 8 neighboring bits, it can't be done in one go.
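To make the rule concrete, here is a short list of immediates that should and should not be accepted, under the standard ARM-state encoding of an 8-bit value rotated right by an even amount. The rejected lines are commented out so that the block still assembles; the specific values are only illustrative.
<lang ARM Assembly>mov r0,#0xFF          ;fits: 8 bits, no rotation needed
mov r0,#0x3FC         ;fits: 0xFF rotated right by 30 bits
mov r0,#0x04000000    ;fits: 0x01 rotated right by 6 bits
mov r0,#0xF000000F    ;fits: 0xFF rotated right by 4 bits, wrapping around the top
;mov r0,#0x1FE        ;does not fit: 0xFF would need an odd rotation
;mov r0,#0x101        ;does not fit: the two 1s are 9 bits apart
;mov r0,#0x04000130   ;does not fit: the 1s span from bit 4 to bit 26
mvn r0,#0             ;MVN loads the bitwise inverse, so this gives 0xFFFFFFFF</lang>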
Most of the time, this isn't a huge deal, as the easiest way around this is to define a data block nearby containing the value you wish to load. Since each command on the ARM takes 4 bytes of storage, this is guaranteed to take equal or fewer bytes than loading the number into a register piece-by-piece.
<lang ARM Assembly>mov r0,#0x04000000
add r0,r0,#0x130

;compare to

TestDataAddr: .word 0x04000130
ldr r0,TestDataAddr</lang>
Thankfully, there's an even easier solution than this. The GNU Assembler saves the day with the following special notation.
<lang ARM Assembly>ldr r0,=value</lang>
This isn't actually a real ARM instruction, it's more of a built-in macro. Essentially, the value will be loaded in one go as an immediate if it can: the assembler turns the LDR into a MOV command. If not, the value will get placed nearby as a data block and loaded from there with a real LDR command. Basically you can take everything in the above paragraph and forget about it, since equals notation does the work for you.
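A minimal sketch of how this looks in practice with the GNU Assembler, where the pseudo-instruction is written with LDR and an equals sign. This block uses @ for comments because that is the comment character GNU as expects on ARM targets. The .ltorg directive tells the assembler where it may place the data block (the "literal pool"); the position chosen here, after the return, is an assumption for illustration, and the pool has to end up within about 4 KB of the LDR that uses it.
<lang ARM Assembly>    ldr r0,=0x000000FF   @ encodable as an immediate, so a MOV is emitted
    ldr r1,=0x04000130   @ not encodable, so the value goes into the literal pool
    bx lr                @ return, so execution never runs into the pool
    .ltorg               @ the assembler places the pooled 0x04000130 here</lang>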
Subcategories
This category has the following 3 subcategories, out of 3 total.
@
- ARM Assembly Implementations (empty)
- ARM Assembly User (16 P)
Pages in category "ARM Assembly"
The following 51 pages are in this category, out of 251 total.
S
- Sorting algorithms/Pancake sort
- Sorting algorithms/Patience sort
- Sorting algorithms/Permutation sort
- Sorting algorithms/Quicksort
- Sorting algorithms/Radix sort
- Sorting algorithms/Selection sort
- Sorting algorithms/Shell sort
- Split a character string based on change of character
- Stack
- String append
- String case
- String comparison
- String concatenation
- String interpolation (included)
- String length
- String matching
- Subleq
- Substitution cipher
- Substring
- Sum and product of an array
- System time
T
- Take notes on the command line
- Tau function
- Terminal control/Clear the screen
- Terminal control/Coloured text
- Terminal control/Cursor movement
- Terminal control/Cursor positioning
- Terminal control/Inverse video
- Time a function
- Tokenize a string
- Tonelli-Shanks algorithm
- Totient function
- Towers of Hanoi
- Tree traversal
- Two sum
- Two's complement