Category:8086 Assembly

This page is a stub. It needs more information! You can help Rosetta Code by filling it in!

8086 Assembly is the assembly language used by the Intel 8086 processor. This processor was used for the first time in the IBM PC, and in its various clones. The 8086 gave birth, starting with the 80186 processor, to the X86 family, that nowadays is the most used processor family in desktop computers. All the 32 and 64 bit processors from this family are able to operate in a 8086 compatibility mode, for backward compatibility with legacy software and running very low-level code (like the BIOS). For the evolution of this assembly implementation to 32 bits, see X86 assembly.

Architecture Overview

Segmented Memory

The 8086 uses a segmented memory model, similar to the Super Nintendo Entertainment System. Unlike banked memory models used in the Commodore 64 and late NES games, segment addresses are held in segment registers. These segment registers are 16 bit and get left-shifted by 4 and added to the pointer register of interest to determine the memory address to look up. The 8086 has four in total, but only the DS and ES registers can be used by the programmer. (The other two work with the stack pointer and instruction pointer, and are loaded for you.) On the 8086, you can only load segment registers with the value in a data register, or with the POP command. So first you must load a segment into a data register, THEN into a segment register.

<lang asm>;This is NOT valid code! mov ds, @data ;you're trying to move the data segment location directly into DS. You can't!

This is the proper way

mov ax, @data ;I chose AX but I could have used BX, CX, or DX. mov ds, ax ;load DS with the data segment.</lang>

It's important to remember the subtle distinction between a segment and a segment register. Your program might have labeled segments such as .data and .code, but in order to properly read from/write to these sections you'll need to load their memory locations into segment registers.

Data Registers

There are four data registers: AX, BX, CX, and DX. While most commands can use any of them, some only work with particular data registers. System calls in particular are very specific about which registers can be used for what. On MS-DOS, the AX register is used for selecting the desired interrupt to use with the INT command.

Each data register is 16-bit, but has two eight bit halves ending in H or L, e.g. AH, AL. Instructions can be executed using the whole register or just half of it. <lang asm>mov ax, 1000h ;move 1000h into AX, or equivalently, move 10h into AH and 00h into AL.</lang> Moving a value smaller than 16 bits into ?X is the same as moving it into ?L, and moving 0 into ?H. (? represents the data register of your choice. They are all the same in this regard.)

<lang asm>mov ax,0030h

is the same as

mov al, 30h mov ah, 0h</lang>

Commands that alter the contents of a single register will not affect the other half. <lang asm>mov ax,00FFh inc al</lang> If we had executed INC AX, then AX would have been incremented to 0x100. Since we only incremented AL, only AL was incremented in this case, and AH still equals 0x00!

Generally speaking, the 8086's registers serve the following purposes:

AX is the "Accumulator" and is used for advanced mathematics routines, as well as the source/destination for the STOSW and LODSW commands when loading/storing bytes from consecutive regions of memory.
BX can be used as a variable offset on the Source Index/Destination Index registers (more on those later).
CX is used as a loop counter. The JCXZ command jumps to the specified address, but only if CX = 0. In addition, CL is used to specify a shift amount when performing bit shifts and rotates. On later x86 CPUs, this can be any constant value, but on the original 8086 you can only specify 1 or CL.
DX contains the "high word" of a 32-bit product when multiplying AX with another 16-bit register. For example, if AX contains 0x2000 and BX contains 0x10, the command MUL BX will create the product 0x20000, with DX containing 0x0002 and AX containing 0x0000. In addition, DX is also used with the IN and OUT commands when selecting which ports to read from/write to external hardware.

Other Registers

When writing to or reading from consecutive sections of memory, it is helpful to apply an offset from a base value. The Base Pointer register BP, Source Index SI, and Destination Index DI can point to various regions of memory. Many commands that work with these registers can auto-increment or decrement them after each load or store. In addition, they can be optionally offset by a constant, the value stored in BX, or both at the same time. (BX cannot be added to BP in this manner, but it can be added to DI, and SI.) Unlike the data registers, BP,DI, and SI cannot be split in half and worked on separately. Only data registers allow you to work in 8-bit.

The syntax for offsetting an index register will vary depending on your assembler. Many assemblers will often accept multiple different ways of writing it, and it comes down to personal preference.

<lang asm>mov si, offset MyArray mov bx,2

mov al,[bx+si] ;loads decimal 30 into AL

MyArray: byte 10,20,30,40,50</lang>

The Stack

As with Z80 Assembly, you can't push 8-bit registers onto the stack. For data registers, you have to push/pop both halves. <lang asm>;This is valid code. push ax push bx

pop bx pop ax</lang>

<lang asm>;This is NOT valid code. push ah push al push bh push bl

pop bl pop bh pop al pop ah</lang>

As with all processors that use a stack, if you push one or more registers and want to restore the backed-up values correctly, you must pop them in the reverse order. You can pop them out of order on purpose to swap registers around. In fact, this is a quick way to move the segment from DS into ES, or vice-versa: <lang asm>push DS pop ES ;you can't do "mov es, ds" but you can do this!</lang>

The proper way to use the stack to preserve registers: <lang asm> call foo

mov ax,4C00h int 21h ;exit program and return to DOS. Instruction pointer cannot move beyond this except with a function call.

foo: push ax push bx push cx

pop cx pop bx pop ax ret</lang>

If one of the push/pop commands in the routine above were missing, the RET instruction would not properly return to where it came from. As long as you pop at the end the same number of registers you pushed at the start, the stack is "balanced" and your return instruction will return correctly. This is because RET is actually POP IP (IP being the instruction pointer, which you can think of as what "line" of code the CPU is on.) The CPU assumes the top of the stack is the correct place to return to, but has no way of actually verifying it. If the function you just wrote causes the CPU to crash or jump to a completely different part of the code, there's a good chance you might have forgotten to balance the stack properly.

Arithmetic

The 8086 has a lot of advanced mathematics commands. Like the 68000 it can multiply and divide. Although the 8086 doesn't work with 32 bit numbers in a single register, it has many more options for Binary Coded Decimal values, including commands for both "packed" (two digits per byte) and "unpacked" (one digit per byte.) By contrast, 68000 Assembly only has commands for the "packed" format.

The 8087 co-processor adds many different commands for floating-point mathematics. These instructions typically start with the letter "F" (e.g. FADD, FMUL, etc.) When the 8086 reads an instruction that is meant for the 8087, it passes that instruction to the 8087 which then executes it. The 8087's registers are more complicated than that of the 8086; the 8087 uses a stack-based register system.

While both the 8086 and the 8087 can read the same code, data, and memory, they cannot read the contents of each other's registers. So for example, if you wanted to use the 8087 to perform a calculation and then output the result to the screen, you'd need to give the 8087 the command to do the math and store the result into RAM. Then, the 8086 would read that RAM and output the result to the screen. The 8087 also doesn't have the robust indexed addressing modes of the 8086 - so the 8086 will often do the job of looking up a value from a table, then dumping that value into a temporary "loading zone" at a known location where the 8087 can more easily read from. In order to make sure that the 8086 and 8087 stay in sync, the 8086 can WAIT for the 8087 to finish its current instruction before executing more instructions. This is incredibly useful in the event that the 8086 needs to use a calculation that the 8087 did - it might end up reading from the "loading zone" before the 8087 actually has put the result of the calculation in it! (Most assemblers will handle this for you.)

One other caveat to mention: On early IBM PCs and compatibles, the 8087 was not built into the CPU like it is now. Back then, it was sold separately. So if you're programming on original hardware and the floating point commands don't work, it might be because you don't have the coprocessor installed!

Looping Constructs

The 8086 has a lot more of these than most CPUs. The most obvious one is LOOP, which will subtract 1 from CX, then jump back to a specified label if CX is nonzero after the subtraction. If CX becomes zero after subtracting 1, then no jump will occur and the instruction pointer simply moves to the next instruction.

<lang asm>mov cx,0100h ;set the loop counter's starting value. This must be outside the loop, otherwise you'll loop forever!

foo:

your code that you want to loop goes here

loop foo </lang>

It's important not to alter CX inside the loop. It's easy to make a mistake like this if you're in a hurry:

<lang asm>foobar: mov cx,1000h

your code goes here

loop foobar</lang>

It may not be obvious at first but the above loop will never end. CX decrements to 0x0FFF with the LOOP instruction but the mov cx,1000h was mistakenly placed inside the loop, resetting the loop counter back to 0x1000, which means no progress is really being made. The correct way to do this is to set the starting loop counter outside the loop, like so:

<lang asm>mov cx,1000h foobar:

your code goes here

loop foobar</lang>

Sometimes you'll have to change CX during a loop, like if you want to do bit shifts with a shift amount other than 1, for example. There's a simple fix - use the stack to stash and retrieve the loop counter. <lang asm>mov cx,0100h

foo: push cx

  mov cl,2
  ror ax,cl   ;starting with the 80186 you don't need CL for bit shifting.

pop cx loop foo</lang>

By using PUSH CX and POP CX, you can temporarily use CX for something else, as long as you restore it before the LOOP instruction.

You can use other registers as loop counters as well, but not with the LOOP instruction - it's hardcoded to only work with CX. But you can do this: <lang asm>mov dx,0100h baz:

your code goes here

dec dx jnz baz

A minor note for advanced programmers
If you're expecting the flags to be in a particular state based on the loop body, tough luck!
They'll only reflect the decrement of DX from 1 to 0 at this point in the code.
LOOP doesn't change the flags, which makes it more useful for loops meant to compare things.</lang>

Another frequently used looping construct is REP. REP can be combined with certain instructions to repeat that instruction until CX equals zero. However, unlike LOOP, which can be used to repeat a block of instructions, REP can only repeat one. It doesn't work on all instructions, only the "string" instructions which operate on a block of memory. Typically these include MOVSB, LODSB, STOSB, CMPSB, and SCASB (each has a variant that ends in W instead of B, for 16-bit data.) There are also REPZ and REPNZ, which stand for "Repeat if Zero" and "Repeat if Nonzero" respectively. These two only work properly with CMPSB and SCASB, as MOVSB, LODSB, STOSB do not affect the flags. (The CPU doesn't detect whether CX equals zero using the flags, as the flags never reflect this equality to zero like you would expect.)

Citations

'Akuyou', Keith. Learn Multiplatform Assembly Programming with Chibiakumas! Las Vegas, NV. Self-published, 05 April 2021.

Subcategories

This category has the following 3 subcategories, out of 3 total.

@

8086 Assembly examples needing attention‎ (empty)
8086 Assembly Implementations‎ (empty)
8086 Assembly User‎ (22 P)

Pages in category "8086 Assembly"

The following 123 pages are in this category, out of 123 total.

1

100 doors

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

Old lady swallowed a fly

P

R

S

T

U

V

W

Water collected between towers