Category:ARM Assembly: Difference between revisions

 
Line 46:
MyCode:
adr R2,RAM_Address ;get the address of a nearby place to store values.
LDR R0,=0x12345678 ;the value to store. (It is too wide for a single MOV immediate, so the assembler's = notation is used.)
STR R0,[R2] ;store 0x12345678 into the first 32-bit slot.</lang>
 
Line 83:
Here, the offset is performed ''before'' the storage operation. What if you want to offset afterwards? That would be useful for reading in a data stream. Good news - you can do that simply by having the offset value or register '''outside''' the brackets. This is called "post-increment" or "post-indexing." Unlike pre-indexing, these changes to the pointer are not temporary.
 
<lang ARM Assembly>LDR R0,[R1],#4 ;load the 32-bit value stored at memory location R1 into R0, THEN add 4 to R1. This offset remains even after this
; instruction is finished.</lang>
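For readers more at home in C, post-indexing behaves much like the <code>*p++</code> idiom. The following is only a sketch: <code>sum_words</code> is a hypothetical helper, not something from this page, and a compiler is free to emit other instructions for it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sum `count` 32-bit words starting at `stream`. */
static uint32_t sum_words(const uint32_t *stream, size_t count) {
    uint32_t total = 0;
    while (count--) {
        /* Read the word, THEN advance the pointer by one word (4 bytes).
           This mirrors what LDR R0,[R1],#4 does to R1. */
        total += *stream++;
    }
    return total;
}
```

Because the pointer keeps its new value after every read, no separate <code>ADD</code> is needed to step through the data stream.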
 
Line 150:
===Call Stack===
Most processors, including the x86 family, will use the same hardware stack for function arguments, local variables, and return addresses. The ARM doesn't actually need to store a return address onto the stack until subroutines are nested (though ARM Assembly written by a compiler will most likely do so anyway.) This is because the <i>link register</i> or <code>r14</code> is responsible for holding the return address. <code>BL</code> is the equivalent of <code>CALL</code> on the x86 architecture, and instead of pushing the return address to the stack, it copies that address to the link register before the branch. Once the function is complete, execution returns by moving the value in the link register back into the program counter. For nested subroutines, the link register will need to be pushed onto the stack, as the link register can only "remember" the return address of the most recent <code>BL</code> instruction.
 
Actually using the stack to save registers and retrieve them has somewhat strange syntax. I'd recommend using the ''unified syntax'' option if your assembler has it - which lets you use the simple <code>PUSH</code> and <code>POP</code> commands to back up and restore register contents. Normally, these two instructions are only valid in THUMB mode, but with unified syntax you can use them in 32-bit ARM programming as well. Arguments for the <code>PUSH</code> and <code>POP</code> instructions are all enclosed in curly braces, and separated by dashes to specify a range of registers, or commas to separate individual registers. It doesn't matter what order you type them in - they all get pushed/popped in the same order regardless. Standard calling conventions dictate that the stack shall be aligned to 8 bytes whenever one function calls another - the simplest way to guarantee this is to always push/pop an even number of registers, even if you end up having to push/pop one more than necessary. It won't hurt anything if you do, as long as you put it back where you got it.
 
<lang ARM Assembly>PUSH {R4,R5,R6,R7} ;the contents of these registers are stored on the stack.
POP {R4,R5,R6,R7} ;you don't need to list these in reverse order like you would on x86 - the assembler takes care of that for you.</lang>
 
If you don't have unified syntax, you'll need to use the commands below for 32-bit ARM. (<code>PUSH</code> and <code>POP</code> are valid in THUMB mode even if you don't have unified syntax.)
<lang ARM Assembly>STMFD sp!,{r0-r12,lr} ;equivalent of PUSH {r0-r12,lr}
LDMFD sp!,{r0-r12,lr} ;equivalent of POP {r0-r12,lr}</lang>
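The order in which registers are listed doesn't matter because the register list is encoded as a set of bits, one per register, and the lowest-numbered register always ends up at the lowest address. Here is a rough C model of that behaviour. It is a toy: <code>ToyCpu</code>, <code>toy_push</code> and <code>toy_pop</code> are made-up names, the stack pointer is modelled as an array index rather than an address, and there is no overflow checking.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Toy model of a full-descending stack (all names here are hypothetical). */
typedef struct {
    uint32_t r[16];               /* register contents, r[0]..r[15] */
    uint32_t stack[STACK_WORDS];  /* stack memory, one 32-bit word per slot */
    unsigned sp;                  /* index of the last pushed word; STACK_WORDS when empty */
} ToyCpu;

/* Count how many registers are named in the bitmask. */
static unsigned count_regs(uint16_t mask) {
    unsigned n = 0;
    for (unsigned i = 0; i < 16; i++) n += (mask >> i) & 1u;
    return n;
}

/* Models STMFD sp!,{...}: bit i of `mask` set means "store r[i]". */
static void toy_push(ToyCpu *cpu, uint16_t mask) {
    cpu->sp -= count_regs(mask);          /* the stack grows downward */
    unsigned slot = cpu->sp;
    for (unsigned i = 0; i < 16; i++)     /* lowest register goes to the lowest slot */
        if ((mask >> i) & 1u) cpu->stack[slot++] = cpu->r[i];
}

/* Models LDMFD sp!,{...}: reloads the same registers in the same order. */
static void toy_pop(ToyCpu *cpu, uint16_t mask) {
    for (unsigned i = 0; i < 16; i++)
        if ((mask >> i) & 1u) cpu->r[i] = cpu->stack[cpu->sp++];
}
```

Since the list is just a bitmask, <code>{R5,R4}</code> and <code>{R4,R5}</code> produce the same mask and therefore the same stack layout.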
 
===Limitations of the ARM===
While the ARM has a rich set of features that other processors only dream of having, there are a few limitations.
The biggest one (which was more of an issue on earlier versions such as the ARM7TDMI CPU in the Game Boy Advance) is the limitation of the <code>MOV</code> command. Arguably the most important command any processor has (apart from <code>JMP</code>), the <code>MOV</code> command on the ARM is often limited in what can be loaded into a register in a single command. Depending on the pattern of bits, some immediate values cannot be loaded into a register directly. The key features of the ARM instructions (barrel shifter, conditional commands, etc) all take up bits in each 32-bit command, whether they are used in a given instance of a command or not, which leaves only 12 bits for the immediate value. So in order to store 32 bit numbers in a <code>MOV</code> command, the value has to be "8-bit rotatable," meaning that it can be expressed as an 8 bit number rotated by an even number of bit positions. Basically if the 1s in the binary equivalent of the number you're trying to load are spread across more than 8 consecutive bits, it can't be done in one go.
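The "8-bit rotatable" rule can be checked mechanically. The sketch below assumes the classic 32-bit ARM encoding (an 8-bit constant rotated right by 0, 2, 4, ... 30 bits); <code>is_arm_immediate</code> is a made-up helper name, not part of any assembler.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n is reduced to 0..31). */
static uint32_t rotl32(uint32_t v, unsigned n) {
    n &= 31u;
    return n ? (v << n) | (v >> (32u - n)) : v;
}

/* Hypothetical helper: returns 1 if `value` fits the classic ARM immediate
   form (an 8-bit constant rotated right by an even amount), otherwise 0. */
static int is_arm_immediate(uint32_t value) {
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Rotating left undoes a rotate-right by `rot`. If nothing is left
           above the low 8 bits, the value can be encoded with that rotation. */
        if ((rotl32(value, rot) & ~0xFFu) == 0) return 1;
    }
    return 0;
}
```

With this check, <code>0xFF</code>, <code>0xFF00</code> and <code>0x04000000</code> all pass, while <code>0xFFFF</code> and <code>0x12345678</code> fail. <code>0x102</code> also fails even though its 1s span only 8 bits, because it would need an odd rotation.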
 
Looking at the following in C and its ARM assembly equivalent (I've cut the stack twiddling and the return statement for clarity) we can see just what exactly happens:

<lang C>int main(){
return 0xFFFF;
}</lang>

<lang ARM Assembly>mov r0, #255 ;MOV R0,#0xFF
orr r0, r0, #65280 ;ORR R0,#0xFF00 (0xFF00|0x00FF = 0xFFFF)</lang>

Most of the time, this isn't a huge deal, as the easiest way around this is to define a data block nearby containing the value you wish to load. Since each command on the ARM takes 4 bytes of storage, this is guaranteed to take equal or fewer bytes than loading the number into a register piece-by-piece. It's very common to store "complicated" numbers into a nearby data block and just load from that data block with PC-relative addressing. These data blocks are usually placed after the nearest return statement so that they don't get executed as instructions.
<lang ARM Assembly>ldr r0,testData ;load 0xABCD1234 into R0
bx lr ;return
testData:
.long 0xABCD1234</lang>

For example, building 0x04000130 piece-by-piece takes two instructions, while the data block approach takes one instruction plus one word of data:
<lang ARM Assembly>mov r0,#0x04000000
add r0,r0,#0x130
;compare to:
TestDataAddr: .word 0x04000130
ldr r0,TestDataAddr</lang>
 
Thankfully, there's an even easier solution than this. The GNU Assembler saves the day with the following special notation.
Line 168 ⟶ 185:
 
This isn't actually valid ARM code; it's more of a built-in macro. Essentially, the value will be loaded in one go as an immediate if it can. If not, it will get placed nearby as a data block and the <code>MOV</code> will be changed to an <code>LDR</code> command. Basically you can take everything in the above paragraph and forget about it, since equals notation does the work for you.
 
===THUMB Mode===
THUMB Mode is a more limited version of the ARM instruction set. The advantage to using it is that each instruction only takes 16 bits to represent rather than 32. This makes it handy for programming on systems that have very little space to work with. It can do almost anything 32-bit ARM can do, but not as easily. There are a few key limitations:
* Immediate operands can only be 8-bit values, period. In other words, only numbers ranging from 0 to 255 are allowed.
* You can still use LDR and ADR to retrieve embedded constants; however they have to be "further along" in memory than the current value of the program counter. In THUMB mode the program counter offsets cannot be negative.
* Registers R0-R7 can do almost anything, but registers of a higher number are harder to use. For registers R8 and above, you can no longer store immediate values into them, for example - you have to load them from registers.
* In THUMB mode you cannot use the barrel shifter, nor can you conditionally set the flags. THUMB Mode works more like an x86 CPU, where each instruction affects the flags differently (or sometimes not at all), and you just have to know which instructions affect which flags.
* Operations you would normally use the barrel shifter for are now separate commands. (You might be used to using these even in 32-bit ARM mode thanks to unified syntax.)
* The stack can still be interacted with using <code>PUSH</code> and <code>POP</code> (again, you were likely doing this anyway.)
 
That being said, it's not all doom and gloom. The registers are still 32-bit, and you can still do most of what the 32-bit ARM can do. If you're coding in C or some other language that gets compiled to ARM Assembly, the compiler will decide whether to use THUMB or 32-bit ARM, but you can request one or the other with command line arguments.
 
[[Category:Assembly]]