Category:68000 Assembly
This programming language may be used to instruct a computer to perform a task.
68000 assembly is the assembly language used for the Motorola 68000, commonly known as the 68K. It should not be confused with the 6800 (which predates it). The Motorola 68000 is a big-endian processor with full 32-bit capabilities (despite most systems that use it being considered 16-bit). It was used in many computers such as the Amiga and the Canon Cat, as well as game consoles such as the Sega Genesis and Neo Geo.
Architecture Overview
Big-Endian
The 68000, unlike most processors of its era, is big-endian. This means that the bytes of a multi-byte value are stored in memory from left to right, with the most significant byte at the lowest address. The example below illustrates this concept:
<lang 68000devpac>MOVE.L #$12345678,$100000 ;store the hexadecimal value $12345678 at memory address $100000

;hexdump of $100000
;$100000 = $12
;$100001 = $34
;$100002 = $56
;$100003 = $78</lang>
On a little-endian processor such as in x86 Assembly, the order of the bytes would be reversed, i.e.:
<lang asm>;hexdump of $100000
;$100000 = $78
;$100001 = $56
;$100002 = $34
;$100003 = $12</lang>
In most situations this difference doesn't matter, so don't concern yourself too much. It matters far more in 6502 Assembly, where the registers are smaller than the address space.
Notation Conventions
How you write the source code depends on your assembler and the syntax it uses. This page is written using Motorola syntax, but there is also MIT syntax, which has different conventions.
How you go about defining numbers or text in your code varies wildly between assemblers. I'm using VASM and these are the rules I have to follow, but your assembler may be different.
- A number with a # in front represents a constant, literal value. For example, the 3 in MOVE.B #3,D0 represents the number 3.
- A number without a # in front represents a memory location. For example, the 3 in MOVE.B 3,D0 represents the byte stored at memory address $00000003.
- A number that doesn't have a $ or % prefix is a decimal (base 10) value.
- A number that begins with a $ is a hexadecimal value.
- A number that begins with a % is a binary value.
- Single and double quotes can be used for ASCII values. When you type MOVE.L #"EGGS",D0 for example, this is equivalent to MOVE.L #$45474753,D0.
- The first operand of BTST, BSET, BCLR, or BCHG represents a bit position, starting at the rightmost binary digit as 0 and counting up from right to left. For example, BCLR #3,D0 clears the same bit that you would clear by doing AND.B #%11110111,D0.
- The operand before the comma is the "source", and the operand after is the "destination." For example, MOVE.L D3,D2 takes the value in D3 and stores it into D2, not the other way around. This is the opposite of x86 and ARM, which have the source on the right and the destination on the left.
Data blocks, on the other hand, begin with DC.B, DC.W, or DC.L and each represents a constant numeric value. Strangely, you do NOT prefix these with # to signify them as constants (doing so will cause an error on most assemblers). However, you can use the $ or % modifiers to denote hexadecimal or binary.
Keep in mind that there is no requirement to use hexadecimal, decimal, or binary in your source code. It all gets converted to binary anyway. However, it is recommended to use the notation that is appropriate for how your data is meant to be interpreted, for readability purposes.
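As a rough illustration of the data-block rules above, the sketch below writes a few tables in different notations. The label names are made up for this example, and the syntax assumes a VASM-style Motorola-syntax assembler; yours may differ.

<lang 68000devpac>ByteTable: DC.B 16,$10,%00010000,0 ;the value 16 in decimal, hexadecimal, and binary, then a zero
Message: DC.B "EGGS" ;four ASCII bytes: $45,$47,$47,$53
WordTable: DC.W $1234,1000 ;two 16-bit values, one in hexadecimal and one in decimal
LongTable: DC.L $00FF0000,ByteTable ;two 32-bit values; a label can be used as a constant too</lang>

The first three entries of ByteTable assemble to identical bytes, which is why the choice of notation is purely a matter of readability.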
Data Registers
There are eight 32-bit data registers on the 68000, numbered D0-D7. As the name implies, these are designed to hold data. Much like in ARM Assembly, each one is identical in terms of which commands it can use. A command that can be used for D0 can be used for any other D-register.
<lang 68000devpac>MOVE.B #$FF,D0 ;move the hexadecimal value 0xFF into the bottom byte of D0.
ADD.W #$8000,D4 ;add hexadecimal 0x8000 to the value stored in D4.</lang>
Address Registers
There are eight of these as well, numbered A0-A7. A7 is reserved as the stack pointer, and is commonly referenced as SP in assemblers. The others are free to use for any purpose. Although these registers are 32-bit, the 68000's address space is 24-bit (ranges from 0x000000 to 0xFFFFFF), so the leftmost byte is ignored. You can do simple math involving these registers but more complicated commands like multiply or divide can only be used with data registers. Address registers are used to contain addresses and extract the values stored within.
Loading From Memory
<lang 68000devpac>MOVEA.L #$200000,A2 ;usually these are loaded from a label.

;The hex dump of address $200000:
;44 55 66 77

MOVE.L #$00000000,D0
MOVE.B (A2),D0 ;load the byte stored at $200000 into D0. D0 = #$00000044
MOVE.W (A2),D0 ;load the word stored at $200000 into D0. D0 = #$00004455
MOVE.L (A2),D0 ;load the long stored at $200000 into D0. D0 = #$44556677
MOVE.L D2,(A5) ;store the contents of D2 into the memory address pointed to by A5.</lang>
Note that it's also possible to transfer values to/from memory directly, without involving address registers at all. For constant memory locations, this is fine. However, the real strength of the address registers is in their pre-decrement and post-increment modes, which constant memory locations cannot use.
<lang 68000devpac>MOVE.L ($00FF0000),D0
MOVE.W D1,($00FFFFFE)
MOVE.W ($00FF0000),($00FF1000)</lang>
The use of parentheses is not required on most assemblers, but can be used as a reminder to someone reading your code that these represent the values stored at the specified memory locations rather than literal numbers.
Post-Increment
The post-increment mode is specified by adding a + to the end of parentheses. This means that after the command is done, the address stored in the address register (not the value stored at that address) is increased by the byte length of the command (1 for .B, 2 for .W, 4 for .L).
<lang 68000devpac>MOVEA.L #$00240000,A4 ;load the address $240000 into A4
MOVE.W (A4)+,D0 ;move the word stored at $240000 into D0, then increment to #$240002
MOVE.L (A4)+,D1 ;move the long stored at $240002 into D1, then increment to #$240006
MOVE.L (SP)+,D3 ;pop the top value of the stack into D3</lang>
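Post-increment is most useful inside a loop. The sketch below copies 16 words from one table to another. The labels Source, Dest, and CopyLoop are hypothetical: Source and Dest are assumed to be word-aligned labels defined elsewhere, with Dest in writable memory. DBRA is a loop command not otherwise covered on this page; some assemblers only accept its other spelling, DBF.

<lang 68000devpac>    LEA Source,A0 ;A0 points to the first word to copy
    LEA Dest,A1 ;A1 points to where the copy goes
    MOVE.W #16-1,D0 ;DBRA loops until the counter goes below zero, so load the count minus 1
CopyLoop:
    MOVE.W (A0)+,(A1)+ ;copy one word, then advance both pointers by 2
    DBRA D0,CopyLoop ;decrement D0 and loop until all 16 words are copied</lang>

Neither pointer has to be adjusted by hand, since each MOVE.W advances both of them as a side effect.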
Pre-Decrement
The pre-decrement mode is specified by typing a - before the parentheses. This means that before the command is done, the address stored in the address register is decreased by the byte length of the command.
<lang 68000devpac>MOVEA.L #$0024000A,A4 ;load the address $24000A into A4
MOVE.W -(A4),D0 ;move the word stored at $240008 into D0
MOVE.L -(A4),D1 ;move the long stored at $240004 into D1
MOVE.L D2,-(SP) ;push the contents of D2 onto the stack</lang>
Address Offsets
A memory address can be offset by a data register, a constant value, or both. If a data register is used, only its bottom 2 bytes are considered by default. In either case, the contents of the data register and/or the constant are added to the value stored in the address register, and the value is read from that address at the specified length. The offsets are applied during the calculation only; the actual contents of the address register after the move are unchanged. Post-increment and pre-decrement cannot be combined with an offset in the same operand.
<lang 68000devpac>MOVE.B (4,A0,D0),D1 ;The byte at A0+D0+4 is loaded into D1.</lang>
It's possible to use the same data register as the offset and the destination. This does not cause any problems whatsoever, as the data register offset is "locked in" before the move, and is only updated after the command fully executes. Using the same command again immediately afterwards will offset based on the new value of that register.
<lang 68000devpac>MOVE.W (6,A0,D0),D0 ;The word at A0+D0+6 is read, then loaded into D0.</lang>
A very important note is that when using this method with words and longs, the resulting address must be even! Otherwise the CPU raises an address error, which on most systems means a crash. For MOVE.B it doesn't matter.
Effective Address
A calculated offset can be saved to an address register with the LEA command, which stands for "Load Effective Address." x86 Assembly also has this command, and it serves the same purpose. The syntax for it can be a bit misleading depending on your assembler.
<lang 68000devpac>LEA myData,A0 ;load the effective address of myData into A0
LEA (4,A0),A1 ;load into A1 the effective address A0+4. This looks like a dereference operation but it is not!
MOVE.W (A1),D1 ;dereference A1, loading the value it points to into D1.</lang>
This can get confusing, especially if you have tables of pointers. Just remember that LEA cannot dereference an address.
If you don't want to store the effective address in an address register, you can use PEA (push effective address) to put it onto the stack instead.
<lang 68000devpac>LEA myData,A0 ;load the effective address of myData into A0
PEA (4,A0) ;store the effective address of A0+4 onto the stack.</lang>
The Stack
The 68000's stack pointer is commonly referred to as SP but it is also address register A7. This register is handled differently than the other address registers when pushing bytes onto the stack. A byte value pushed onto the stack will be padded to the right, so that it takes up a full word. The stack needs to pad byte-length data so that it can stay word-aligned at all times. Otherwise the CPU would crash as soon as you tried to use the stack for anything other than a byte!
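The short sketch below shows the padding in action. It assumes SP already points to valid, writable stack memory, which is normally the case once a program is running.

<lang 68000devpac>MOVE.B #$12,D0 ;put a test value in the bottom byte of D0
MOVE.B D0,-(SP) ;SP decreases by 2 rather than 1; $12 is stored at the new, even address in SP
MOVE.B (SP)+,D1 ;the bottom byte of D1 = $12, and SP increases by 2, back to where it started</lang>

With any other address register, the same two commands would only move the pointer by 1.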
Length
The 68000 can work with 8-bit, 16-bit, or 32-bit values. Some commands only work with specific lengths but most work with all three. To specify which length you are using, the command ends in a different suffix:
- .B for byte length (8 bit)
- .W for word length (16 bit)
- .L for long length (32 bit)
If you don't specify a length with your command, it usually defaults to word length, but ultimately it depends on the command you are using. (Some commands cannot be used at word length.)
Bytes and words moved into a register are always stored on the right-hand side. For example:
<lang 68000devpac>MOVE.L #$FFFFFFFF,D7
MOVE.B #$00,D7 ;D7 contains #$FFFFFF00
MOVE.W #$2222,D7 ;D7 contains #$FFFF2222</lang>
As you can see, the rest of the register is unchanged. (On the ARM, it would turn to zeroes.) This is very important to remember. If your code is doing something unexpected it might be due to the "old" value of the register corrupting another function.
If the given constant is smaller than the length provided, the value is padded to the left with zeroes.
<lang 68000devpac>MOVE.W #$FF,D3 ;D3 = #$xxxx00FF, where x is the previous value of D3.
MOVE.L #0,D3 ;D3 = #$00000000</lang>
Loading immediate values into address registers is different. You can only move words or longs into address registers, and if you move a word, the value is sign-extended. This means that if the top nibble of the word is 8 or greater, the value gets padded to the left with Fs, and is padded with zeroes if the top nibble is 7 or less. If you're adding a constant value of $7FFF or less to an address, it's usually safe to use the word length operation, which takes fewer bytes to encode than the long length version.
<lang 68000devpac>MOVEA.W #$8000,A4 ;A4 = #$FFFF8000. Remember the top byte is ignored so this is the same as #$00FF8000.
MOVEA.W #$7FFF,A3 ;A3 = #$00007FFF</lang>
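The same sign extension applies when adding a word to an address register, which is where it tends to cause surprises. The sketch below uses arbitrary example addresses to show a safe case and an unsafe one.

<lang 68000devpac>MOVEA.L #$00100000,A0
ADDA.W #$1000,A0 ;$1000 is extended to $00001000, so A0 = #$00101000 as expected

MOVEA.L #$00100000,A1
ADDA.W #$8000,A1 ;$8000 is extended to $FFFF8000, so A1 = #$000F8000. The address went down!
ADDA.L #$8000,A1 ;at long length nothing is extended, so A1 = #$00100000</lang>

When in doubt, the long length version always adds exactly the value written in the source code.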
The Flags
The flags are stored in the Condition Code Register, also known as the CCR. The 68000 has no built-in commands like CLC for clearing/setting individual flags. Rather, you can alter them directly with MOVE, AND, OR, and EOR. Unfortunately, this means you'll have to remember which bits represent which flags. Or, if your assembler supports macros, you can define a macro that handles this for you.
The flags update automatically after most operations, and take into consideration the operand sizes when doing so. Check out this example:
<lang 68000devpac>MOVE.L #$12FF,D0
ADD.B #1,D0</lang>
Since we used ADD.B, D0 now contains $1200, and the extend, carry, and zero flags are all set. Had we done ADD.W, we would get $1300 in D0 with none of those flags set. The flags are based on what the actual instruction "sees", not the entire register at all times.
- X: The eXtend flag is bit 4 of the CCR, and is similar to the carry flag. It gets set and cleared often for the same reasons and is used with the ADDX, SUBX, NEGX, ROXL, and ROXR commands. Unlike the carry flag, it is left untouched by commands such as MOVE, CMP, and TST, so it can carry a bit from one step of a multi-word calculation to the next.
- N: The negative flag is bit 3 of the CCR, and is set when the last operation resulted in a "negative" value. What constitutes a negative value depends on the size of the last operation: for .B instructions, $80-$FF; for .W instructions, $8000-$FFFF; and for .L instructions, $80000000-$FFFFFFFF.
- Z: The zero flag is bit 2 of the CCR and works like you would expect: it's set whenever an operation results in zero. Unlike x86 Assembly, this also includes moving 0 directly into a register, clearing a register or memory with CLR, etc.
- V: The overflow flag is bit 1 of the CCR. It is set whenever a math operation results in a value crossing the $7F-$80 boundary. (Wraparound from 00 to FF doesn't count as overflow, but it does set the carry flag.)
- C: The carry flag is bit 0 of the CCR. It is set when a math operation results in a carry or borrow. Rolling over from FF to 00, or a 1 getting "pushed out" via a bit shift or rotate, sets the carry flag. When using CMP, the carry flag determines the unsigned magnitude comparison. Carry set means the destination is less than the source; carry clear means it is greater than or equal.
In truth, the flags are a 16-bit register, of which the CCR is just the "low half". The SR (status register) is the full 16-bit register. There are a few additional flags in the upper half, which are used by the operating system. You can read these but for the most part you won't need to write to them.
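One way to avoid memorizing the bit positions listed above is to give them names. The constant names below are made up for this sketch, and EQU is assumed to be supported by your assembler (it is on most). The immediate forms ANDI, ORI, and EORI are written out here, though many assemblers will pick them automatically when given AND, OR, or EOR.

<lang 68000devpac>FLAG_C EQU %00000001 ;bit 0: carry
FLAG_V EQU %00000010 ;bit 1: overflow
FLAG_Z EQU %00000100 ;bit 2: zero
FLAG_N EQU %00001000 ;bit 3: negative
FLAG_X EQU %00010000 ;bit 4: extend

    ORI #FLAG_C,CCR ;set the carry flag, leaving the others alone
    ANDI #$FF-FLAG_C,CCR ;clear the carry flag, leaving the others alone
    EORI #FLAG_X,CCR ;flip the extend flag
    MOVE #0,CCR ;clear all five flags at once</lang>

Some assemblers expect a .B suffix on the first three commands, so check your assembler's documentation if these are rejected.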
The Vector Table
Certain memory locations have special meanings to the 68000 CPU. Most of these are used for system calls or exception handling. Sixteen of them can be accessed with the TRAP #? command, which takes an immediate constant ? ranging from 0 to 15 as an operand. This effectively equates to a JSR to the address stored at $00000080 + (4*?). For the most part, what each TRAP does depends on the system's firmware and so you'll need to read the documentation. Some machines allow the user to define their own TRAPs at $00000080 - $000000BF, others are hardcoded in. Still others take a "middle ground" and have the hard-coded trap read a function pointer from some other address that the user is allowed to write to, and jump to that.
The 68000's vector table contains more than this, but to keep things simple there will be a few things omitted.
- Address $00000000 contains the initial value of the stack pointer. You never need to MOVE.L (0),SP; the CPU does this automatically.
- Address $00000004 contains the program start. The second thing the CPU does (after loading the default stack pointer from $00000000) is JMP to the address stored in $00000004. (Note that it doesn't jump TO $00000004; it loads the 4 bytes stored there and makes that the new program counter value.)
- The next 8 longwords are the hardware traps. Each holds the address of a function or procedure meant to handle a hardware error. (The 68000 doesn't have segmentation faults, but it's a similar concept.) Among them are handlers for division by zero, signed overflow, etc.
- Interrupt requests and user-defined traps go here as well.
Programming your own trap is a lot like programming a typical subroutine. However, there are a few differences:
- The statement to return back to your regular program must be RTE rather than RTS. Otherwise, you'll most likely crash the CPU.
- It's best to push all registers (D0 thru D7 and A0-A6) onto the stack at the start and pop them off at the end. This is not required for the traps you call with the TRAP #? command, but for the others it's absolutely necessary, since you can't know in advance when they'll happen. You can leave out A7 when doing this since that's the stack pointer itself.
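Putting the two rules above together, a trap handler might be sketched as follows. The labels are hypothetical, and the first line assumes a machine where the vector at $00000080 sits in writable memory and user code is allowed to change it; on many systems that is not the case. MOVEM, which moves several registers in a single command, is not otherwise covered on this page.

<lang 68000devpac>    MOVE.L #MyTrapHandler,$00000080 ;install the handler as TRAP #0 (assumes this address is writable)
    TRAP #0 ;call the handler. Execution resumes on the next line after the RTE.
Forever:
    BRA Forever ;stop here so execution doesn't fall through into the handler

MyTrapHandler:
    MOVEM.L D0-D7/A0-A6,-(SP) ;push every register except the stack pointer
    ;the body of the handler goes here
    MOVEM.L (SP)+,D0-D7/A0-A6 ;pop them all back off
    RTE ;return from the trap. RTS would not work here.</lang>

If the handler is meant to hand a result back in a register, that register has to be left out of both MOVEM lists.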
Interrupts
The 68000 supports 7 different interrupts, often called IRQs or Interrupt Requests. Enabling interrupts is often a twofold process: first, the interrupt source must be enabled, which is usually an implementation-defined process that involves interacting with memory-mapped ports. Second, the status register must be set accordingly to allow the interrupt to occur, which can be achieved with MOVE #$2x00,SR where x is the desired interrupt level. x can range from 0 to 7, and for an interrupt to occur, its interrupt level (which is determined by the hardware implementation and the placement of the desired address in the vector table) must be greater than x or it will not happen (even if the source is enabled). The one exception is level 7, which cannot be masked.
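A typical pattern is to block interrupts while the hardware is being set up and allow them afterwards. The sketch below shows only the status register side; enabling the interrupt sources is machine-specific and is left as a comment. These commands are privileged, so they are assumed to run in supervisor mode.

<lang 68000devpac>    MOVE #$2700,SR ;x = 7: block interrupt levels 1 through 6
    ;enable the interrupt sources here (how to do this depends on the machine)
    MOVE #$2300,SR ;x = 3: levels 4 and up can now occur, levels 1-3 are still blocked
    MOVE #$2000,SR ;x = 0: every interrupt level can now occur</lang>

The 2 in each value keeps the CPU in supervisor mode; writing a 0 there instead would drop it into user mode.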
Alignment
The 68000 can only read or write words and longs at even addresses. Doing so at an odd address raises an address error, which usually means a crash. (Note that reading byte data will not cause a crash regardless of whether it's located at an odd or even address.) This isn't usually a problem, but it can be if the programmer is not careful with the way their data is organized. Consider the following example:
<lang 68000devpac>TestData:
DC.B $02
DC.W $0345

LEA TestData,A0 ;load effective address of TestData into A0.
MOVE.B (A0)+,D0 ;load $02 into D0, increment A0 by 1
MOVE.W (A0)+,D1 ;this crashes the CPU since A0 is now odd</lang>
How was it known that the address was odd at the second instruction? Simple. All instructions take an even number of bytes to encode. So there are only a few ways improper alignment can occur:
- An odd value is loaded into an address register.
- An addition or subtraction done to the value in an address register resulted in an odd memory address.
- An indirect read at byte length was performed using pre-decrement or post-increment, at an even address. (e.g. MOVE.B (A0)+,D0 or MOVE.B -(A0),D0 where A0 contained an even memory address.)
If the programmer is smart with the way they encode byte-length data they can avoid this problem entirely with little effort.
One way is to separate byte-length data into its own table. <lang 68000devpac>ByteData: DC.B $20,$40,$60,$80
WordData: DC.W $1000,$2000,$3000,$4000</lang>
Another way is to pad the data with an extra byte, so that there is an even number of entries in the table. This becomes impractical with large data tables, so the EVEN directive can be placed after a series of bytes. If the byte count is odd, EVEN will pad the data with an extra byte. If it's already even, the EVEN directive is ignored. This saves you the trouble of counting a long series of bytes, and you don't have to worry about wasting space.
<lang 68000devpac>MyString: DC.B "HELLO WORLD 12345678900000",0
EVEN ;some assemblers require this to be on its own line</lang>
A third way is to perform a "dummy read." This is when a value is read from an address using pre-decrement or post-increment, with the sole purpose of moving the pointer, and the value being read is of zero interest. This method lets you work with mixed data types in the same table, but it requires the programmer to know in advance where the byte-length data begins and ends.
<lang 68000devpac>TestData:
DC.B $02,$03,$04
DC.W $0345

LEA TestData,A0 ;load effective address of TestData into A0.
MOVE.B (A0)+,(A1)+ ;copy $02 to a new memory location
MOVE.B (A0)+,(A1)+ ;copy $03 to a new memory location
MOVE.B (A0)+,(A1)+ ;copy $04 to a new memory location

;if we did MOVE.W (A0)+,(A1)+ now we'd crash. First we need to adjust the pointers.

MOVE.B (A0)+,D7 ;dummy read to D7. Now A0 is word aligned.
MOVE.B (A1)+,D7 ;dummy read to D7. Now A1 is word aligned.
MOVE.W (A0)+,(A1)+ ;copy $0345 to a new memory location</lang>
Using ADDA.L #1,A0 and ADDA.L #1,A1 would have worked also, instead of the dummy read. The 68000 gives the programmer a lot of different ways to do a task.
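Both the dummy read and the ADDA approach require knowing that the pointer is odd at that moment. When that isn't known in advance, one possible approach (a sketch, not the only way) is to round the pointer up to the next even address. Note that this overwrites D0.

<lang 68000devpac>MOVE.L A0,D0 ;copy the pointer into a data register, since BCLR can't be used on an address register
ADDQ.L #1,D0 ;add 1, so that an odd address moves up to the next even one
BCLR #0,D0 ;clear bit 0, which forces the result to be even
MOVEA.L D0,A0 ;A0 is unchanged if it was already even, or 1 higher if it was odd</lang>

For example, $200003 becomes $200004, while $200004 stays as it is.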
Subroutines
Subroutines work exactly the same as they do in 6502 Assembly. Even the commands are the same: JSR and RTS. The only difference is that return spoofing doesn't require the return address to be decremented by 1. The 68000 also adds BSR for nearby subroutines. These still need to end in an RTS just the same, but a BSR saves CPU cycles compared to a JSR.
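A minimal sketch of both ways of calling a subroutine is shown below. The labels are made up for this example.

<lang 68000devpac>Main:
    MOVE.W #5,D0
    BSR DoubleIt ;D0 = 10 when this returns
    JSR DoubleIt ;D0 = 20 when this returns
Forever:
    BRA Forever ;stop here so execution doesn't fall through into the subroutine

DoubleIt:
    ADD.W D0,D0 ;double the bottom word of D0
    RTS ;pull the return address off the stack and jump back to the caller</lang>

The subroutine itself doesn't change depending on how it is called; the same RTS returns from either one.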
Citations
- 'Akuyou', Keith. Learn Multiplatform Assembly Programming with Chibiakumas! Las Vegas, NV. Self-published, 05 April 2021.
- [ChibiAkumas Motorola 68000 Tutorial]
- [68000 Tricks and Traps]