# Pointers and references

In this task, the goal is to demonstrate common operations on pointers and references. These examples show pointer operations on the stack, which can be dangerous and are rarely done. Pointers and references are more commonly used together with memory allocation on the heap.

## 6502 Assembly

6502 Assembly differs from most other assembly languages in the way constants are specified. When you have a statement like `int x = 3` in a language like C or `mov eax,3` in x86 Assembly, the 3 in both cases is called a constant or immediate value, meaning that the value 3 is what gets stored. 6502 Assembly does not work this way. It uses what is called "Motorola syntax", where an immediate value needs a `#` in front; otherwise the number is interpreted as a memory address. For example:

```LDA 8  ;load the byte stored at memory address 0x0008 into the accumulator.
LDA #8 ;load the number 8 into the accumulator.```

Pointers in 6502 Assembly can be 8-bit or 16-bit. The 6502 has a 16-bit address space ranging from \$0000-\$FFFF. An 8-bit pointer assumes the high byte to be zero. These pointers are referred to as pointers to "zero-page RAM" and the CPU is much more efficient at working with them than with 16-bit pointers.

```LDX \$2000 ;load the byte at memory address \$2000 into X.
LDY \$75 ;load the byte at memory address \$0075 into Y. This instruction executes faster than the one above it.```

### Pointers on the stack

Works with: 65c02

The 6502's hardware stack is located at memory addresses \$0100-\$01FF and, unlike most CPU architectures, cannot be relocated to any arbitrary address outside this range. In addition, it uses the empty-stack convention, where the stack pointer points to free space rather than the value on top. Therefore, to create a pointer to the top item of the stack we can do the following:

```TSX
LDA \$0101,X ;load the byte most recently pushed on the stack into A without pulling it off the stack.```

Note that on entry to a subroutine, the value at `\$0101,x` is the low byte of the return address minus 1, because `JSR` pushes the address of its own last byte rather than that of the next instruction. It will be something else if you have pushed any registers onto the stack since entry. In this example, we'll look at a hand-written translation of the following C function (I'm ignoring 8-bit overflow just to keep things easier to write.)

```inline unsigned char foo (unsigned char a, unsigned char b, unsigned char c){
return a+b+c;
}
```
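```LDA #arg_C ;load whatever this value is into accumulator
PHA
LDA #arg_B
PHA
LDA #arg_A
PHA
JSR foo     ;pushes the return address, which accounts for the two PC bytes in the layout below
;the caller continues here with the sum in A (removing the three arguments from the stack is not shown)

foo:
PHX
TSX
;stack looks like this: __, X,LOW(PC),HIGH(PC),arg_A,arg_B,arg_C
;stack pointer points to __, as does \$100,X after a TSX
LDA \$0106,X ;LDA #arg_C
CLC
ADC \$0105,X ;ADC #arg_B
CLC
ADC \$0104,X ;ADC #arg_A
;we have the desired value in A, now return.
PLX
RTS```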

This can be done as well on the original 6502, but is much more difficult since you don't have `PHX/PLX`, which means that you can't preserve X or Y without destroying the value in the accumulator. To keep things simple I chose a 65c02 example.
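
The stack layout used above can be checked with a small model. The C sketch below is not 6502 code; it only simulates the pushes and the `TSX`-relative reads, assuming the empty-stack convention described earlier. The type `Cpu6502` and the helpers `push`, `peek_after_tsx` and `foo_via_stack` are hypothetical names introduced for this illustration, and the return-address and saved-X bytes are placeholder values.

```c
#include <stdint.h>

/* Hypothetical, minimal model of the 6502's memory and stack pointer. */
typedef struct {
    uint8_t ram[0x10000]; /* the full 64 KiB address space */
    uint8_t sp;           /* low byte of the stack address in page 1 (0x0100-0x01FF) */
} Cpu6502;

/* PHA, PHX and the pushes done by JSR: store first, then decrement
   (empty-stack convention, so sp always points at free space). */
static void push(Cpu6502 *cpu, uint8_t value) {
    cpu->ram[0x0100 + cpu->sp] = value;
    cpu->sp--;
}

/* Models TSX followed by LDA 0x0100+offset,X. */
static uint8_t peek_after_tsx(const Cpu6502 *cpu, uint8_t offset) {
    uint8_t x = cpu->sp; /* TSX */
    return cpu->ram[(uint16_t)(0x0100 + offset + x)];
}

/* Models the caller's three pushes, JSR foo, and the body of foo. */
static uint8_t foo_via_stack(Cpu6502 *cpu, uint8_t a, uint8_t b, uint8_t c) {
    push(cpu, c);
    push(cpu, b);
    push(cpu, a);
    push(cpu, 0x12); /* JSR: high byte of the return address (placeholder) */
    push(cpu, 0x34); /* JSR: low byte of the return address (placeholder) */
    push(cpu, 0x00); /* PHX: saved X (placeholder) */
    uint8_t sum = (uint8_t)(peek_after_tsx(cpu, 6)   /* arg_C */
                          + peek_after_tsx(cpu, 5)   /* arg_B */
                          + peek_after_tsx(cpu, 4)); /* arg_A */
    cpu->sp += 3; /* PLX and RTS: pop X and the return address */
    return sum;
}
```

Starting from `sp = 0xFF`, a call such as `foo_via_stack(&cpu, 1, 2, 3)` leaves the three arguments at `0x01FD` through `0x01FF`, which is why the offsets 4, 5 and 6 reach them once the return address and X sit on top.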

### Pointer arithmetic

The 6502 can offset pointers up to 255 bytes forwards.

```LDX #\$80
LDA \$2000,X ;load the byte stored at \$2080
LDY #\$FF
LDA \$2080,Y ;load the byte stored at \$217F```

Note that when using zero-page addressing, this offsetting is always modulo 256, meaning there is no "carry" when offsetting, unlike with absolute addressing.
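
The difference between the two indexed modes comes down to how wide the addition is. As a rough illustration, the C sketch below models the effective-address calculation. The function names `ea_absolute_indexed` and `ea_zero_page_indexed` are made up for this example and do not correspond to anything in a real assembler.

```c
#include <stdint.h>

/* Effective address for absolute indexed addressing, e.g. LDA 0x2000,X.
   The 8-bit index is added to the full 16-bit base, so a carry out of
   the low byte propagates into the high byte. */
static uint16_t ea_absolute_indexed(uint16_t base, uint8_t index) {
    return (uint16_t)(base + index);
}

/* Effective address for zero-page indexed addressing, e.g. LDA 0x80,X.
   The sum is truncated to 8 bits, so the result never leaves
   the range 0x0000-0x00FF. */
static uint16_t ea_zero_page_indexed(uint8_t base, uint8_t index) {
    return (uint8_t)(base + index);
}
```

With a base of `0x80` and an index of `0xC0`, the zero-page version yields `0x0040`, while the same sum on a 16-bit base would yield `0x0140`.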

```LDX #\$C0
LDA \$80,X ;load the byte stored at \$0040, not \$0140```

## 8086 Assembly

Works with: UASM

### Pointers

In most assemblers, an instruction or a numeric value can be given a label.

```.model small
.stack 1024
.data                    ; data segment begins here
UserRam byte 256 dup (0) ; the next 256 bytes have a value of zero. The address of the 0th of these bytes can be referenced as "UserRam"

tempByte equ UserRam     ;this variable's address is the same as that of the 0th byte of UserRam
tempWord equ UserRam+2   ;this variable's address is the address of UserRam + 2
.code
start:
mov ax, @data
mov ds, ax

mov ax, @code
mov es, ax
```

`equ` directives are flexible in that they can be intended as constants or as memory addresses. In this case, the `equ` directives involving `tempByte` and `tempWord` are used as memory addresses.

We can load to or from memory with the 8086 in a few different ways. In the example below, values are stored directly from registers into RAM; loading from RAM into a register works the same way with the operands reversed. Assume the code below takes place immediately after the setup above:

```mov ax,0FFFFh                 ;see note 1
mov word ptr [ds:tempWord],ax ;store hexadecimal value FFFF into tempWord

mov bl, 80h
mov byte ptr [ds:tempByte],bl ;the register size needs to match the data type

;Note 1:
;UASM doesn't like leading hex digits A-F so a 0 is placed in front.
;It doesn't change the storage type of the operand. (i.e. this is still a 16 bit value even though there are 5 digits)
```

8086 Assembly doesn't enforce data types that much. While your variables may be intended to be byte or word length, there's nothing actually stopping you from using the "wrong" pointer type, as long as the register you're moving to/from matches the size given by `byte ptr` or `word ptr`.

```mov ax, byte ptr [ds:tempByte] ;assembler will generate an error - operands don't match
mov al, word ptr [ds:tempWord] ;assembler will generate an error - operands don't match
mov ax, word ptr [ds:tempByte] ;assembler will allow this even though the intended data size is wrong
mov al, byte ptr [ds:tempWord] ;assembler will allow this even though the intended data size is wrong
```
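
What the two accepted but "wrong" accesses actually read follows from the 8086 storing words in little-endian order. The C sketch below models `UserRam` as a plain byte array to show this. The helper names are hypothetical, and the offsets 0 and 2 stand in for `tempByte` and `tempWord`.

```c
#include <stdint.h>

/* Like "mov byte ptr [ds:off], bl". */
static void write_byte(uint8_t *ram, unsigned off, uint8_t value) {
    ram[off] = value;
}

/* Like "mov word ptr [ds:off], ax": low byte first, then high byte. */
static void write_word(uint8_t *ram, unsigned off, uint16_t value) {
    ram[off]     = (uint8_t)(value & 0xFF);
    ram[off + 1] = (uint8_t)(value >> 8);
}

/* Like "mov al, byte ptr [ds:off]". */
static uint8_t read_byte(const uint8_t *ram, unsigned off) {
    return ram[off];
}

/* Like "mov ax, word ptr [ds:off]". */
static uint16_t read_word(const uint8_t *ram, unsigned off) {
    return (uint16_t)(ram[off] | (ram[off + 1] << 8));
}
```

After storing `0xFFFF` at offset 2 and `0x80` at offset 0 in a zeroed array, a word read at offset 0 gives `0x0080` (the intended byte plus its untouched neighbour), and a byte read at offset 2 gives `0xFF` (only the low half of the word).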

You can also define a full double word address like so: `UserRamPtr dword UserRam` Then this pointer can be loaded into a segment register and data register with a single command, like so:

```LDS bx,UserRamPtr ;loads ds:bx with the far pointer stored at UserRamPtr (offset into bx, segment into ds)
```

### Pointer Arithmetic

A pointer can be loaded into a register like any other value.

```.data
myString db "Hello World!",0 ;the zero is the null terminator
.code
mov bx, seg myString    ;load into bx the segment where myString is stored.
mov ds, bx              ;load this segment into the data segment register. On the 8086, segment registers can't be loaded with an immediate value, so it goes through bx.
mov bx, offset myString ;the memory address of the beginning of myString. The "H" is stored here.
```

Once we have the pointer to myString in bx, we can perform arithmetic on it like it was an ordinary numerical value. There is no distinction between pointer arithmetic and normal arithmetic in assembly. All arithmetic commands are available for use. If the programmer wishes to use that memory address as a number for some other purpose, that is perfectly legal. However this example will stick to the "proper" uses of pointer arithmetic, i.e. indexing and offsetting.

```add bx, 2 ;add 2 to bx. bx contains the memory address of the first "l" in "Hello"
mov al,[ds:bx] ;dereference the pointer and store the value it points to into al.
```
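
For comparison, the same offset-and-dereference sequence can be written in C, where the pointer type makes the intent explicit. This is only a sketch for readers more familiar with C; `load_at_offset` is a hypothetical helper, and each statement is annotated with the instruction it mirrors.

```c
#include <stddef.h>

static const char myString[] = "Hello World!";

/* Mirrors: mov bx, offset myString / add bx, off / mov al, [ds:bx] */
static char load_at_offset(const char *base, size_t off) {
    const char *p = base; /* mov bx, offset myString */
    p += off;             /* add bx, off */
    return *p;            /* mov al, [ds:bx] */
}
```

Calling `load_at_offset(myString, 2)` returns the first `'l'` in "Hello", matching the assembly example.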

## 68000 Assembly

### Pointers

In most assemblers, a line of code or a value in work RAM can be given a label.

```myVar equ \$100000    ; a section of user ram given a label
myData:              ; a data table given a label
dc.b \$80,\$81,\$82,\$83```

The `LEA` instruction can load the address of a given label. This isn't needed for values in RAM, but it is needed for loading from data tables in ROM.

`LEA myData,A0   ;address of myData is stored in A0`

The address registers `A0` through `A6` hold 24-bit addresses. While they can contain 32 bits of data, the top byte is ignored.

68000 Assembly doesn't care about what type of data is being pointed to. It makes no distinction between bytes, words, longs, or executable code. That is up to the opcodes that interact with the data. For the following, assume that the address `myData` shown above is loaded into `A0`, and that `D0` equaled 0x00000000 prior to the code below.

```MOVE.B (A0),D0 ;D0 = 0x00000080
MOVE.W (A0),D0 ;D0 = 0x00008081
MOVE.L (A0),D0 ;D0 = 0x80818283```
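
The three results follow from the 68000 being big-endian and from byte and word moves replacing only the low part of a data register. The C sketch below models that behaviour. The helper names `move_b`, `move_w` and `move_l` are hypothetical and simply take the previous value of `D0` as a parameter.

```c
#include <stdint.h>

/* MOVE.B (A0),D0: only the low 8 bits of D0 are replaced. */
static uint32_t move_b(const uint8_t *a0, uint32_t d0) {
    return (d0 & 0xFFFFFF00u) | a0[0];
}

/* MOVE.W (A0),D0: only the low 16 bits of D0 are replaced.
   The byte at the lower address is the more significant one. */
static uint32_t move_w(const uint8_t *a0, uint32_t d0) {
    return (d0 & 0xFFFF0000u) | ((uint32_t)a0[0] << 8) | a0[1];
}

/* MOVE.L (A0),D0: all 32 bits of D0 are replaced. */
static uint32_t move_l(const uint8_t *a0) {
    return ((uint32_t)a0[0] << 24) | ((uint32_t)a0[1] << 16)
         | ((uint32_t)a0[2] << 8)  |  (uint32_t)a0[3];
}
```

Applied to the bytes `0x80, 0x81, 0x82, 0x83` with `D0` starting at zero, the three helpers reproduce the values in the comments above.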

Putting a + after `(A0)` auto-increments the pointer by 1, 2, or 4 for `MOVE.B`, `MOVE.W`, and `MOVE.L` respectively. The increment happens after the move.

```LEA myData,A0  ;Assume for this example that data registers all equal 0.
MOVE.B (A0)+,D0  ;D0 = 00000080
MOVE.B (A0)+,D1  ;D1 = 00000081
MOVE.B (A0)+,D2  ;D2 = 00000082```

Putting a - before `(A0)` pre-decrements the pointer by 1, 2, or 4 for `MOVE.B`, `MOVE.W`, and `MOVE.L` respectively, before the move takes place.

```LEA myData+4,A0  ;Assume for this example that data registers all equal 0.
MOVE.B -(A0),D0  ;D0 = 00000083
MOVE.B -(A0),D1  ;D1 = 00000082
MOVE.B -(A0),D2  ;D2 = 00000081```
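
C programmers will recognise these two addressing modes as `*p++` and `*--p`. The sketch below shows the byte-sized case only. The helpers `move_b_postinc` and `move_b_predec` are hypothetical names, and the address register is modelled as a pointer passed by reference so that the update is visible to the caller.

```c
#include <stdint.h>

/* MOVE.B (An)+,Dn: read the byte, then advance the pointer by 1. */
static uint8_t move_b_postinc(const uint8_t **an) {
    return *(*an)++;
}

/* MOVE.B -(An),Dn: step the pointer back by 1, then read the byte. */
static uint8_t move_b_predec(const uint8_t **an) {
    return *--(*an);
}
```

Starting at the beginning of the table, three post-increment reads return `0x80`, `0x81`, `0x82`. Starting four bytes in, three pre-decrement reads return `0x83`, `0x82`, `0x81`.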

`LEA` can also be used to pre-calculate a complex offset to an address.

```LEA myData,A0
MOVE.W #\$0C,D0
LEA (4,A0,D0),A3 ;A3 = address of myData + 4 + \$0C```
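
The address computed by the last `LEA` can be modelled as plain integer arithmetic, since no memory is read. The C sketch below is an illustration under two stated assumptions: the index register is used at its default word size, in which case the 68000 sign-extends its low 16 bits, and the displacement is a signed 8-bit value. `lea_indexed` is a hypothetical name.

```c
#include <stdint.h>

/* Models LEA (disp,An,Dn.W),Ax: returns the computed address only. */
static uint32_t lea_indexed(uint32_t an, int8_t disp, uint32_t dn) {
    /* Only the low word of Dn takes part, interpreted as signed. */
    uint32_t low = dn & 0xFFFFu;
    int32_t index = (low & 0x8000u) ? (int32_t)low - 0x10000 : (int32_t)low;
    return an + (uint32_t)(int32_t)disp + (uint32_t)index;
}
```

For a base of `0x1000`, a displacement of 4 and an index of `0x0C`, the result is `0x1010`. An index whose low word is `0xFFFF` counts as -1 and gives `0x1003`.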

An address can be offset by an immediate value, a register, or both. A data register used as an offset is read at 16-bit word length by default. Although the source operand of the second `LEA` is in parentheses, it does NOT dereference the pointer. Let's look at the following example data:

```myPointers: dc.l myData, myData2
myData: dc.b \$80,\$81,\$82,\$83
myData2: dc.b \$84,\$85,\$86,\$87```

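These two code snippets are not the same. The first loads the address of `myData2` into `A1`; the second loads the address of `myPointers` plus 4, which is the location where the pointer to `myData2` is stored, not `myData2` itself: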

`LEA myData2,A1`
```LEA myPointers,A1
LEA (4,A1),A1```
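
In C terms this is the difference between the address of a table entry and the value stored in it. The sketch below mirrors the example data. `address_of_second_entry` and `second_entry` are hypothetical names, and the `+ 1` corresponds to the offset of 4 in the assembly because a pointer occupies four bytes on the 68000. Reading the entry itself would take a dereferencing instruction such as `MOVEA.L (4,A1),A1` rather than `LEA`.

```c
#include <stdint.h>

static const uint8_t myData[]  = {0x80, 0x81, 0x82, 0x83};
static const uint8_t myData2[] = {0x84, 0x85, 0x86, 0x87};
static const uint8_t *const myPointers[] = {myData, myData2};

/* Like LEA myPointers,A1 / LEA (4,A1),A1: pure address arithmetic.
   The result is where the second table entry lives. */
static const uint8_t *const *address_of_second_entry(void) {
    return myPointers + 1;
}

/* Like LEA myPointers,A1 / MOVEA.L (4,A1),A1: the entry is read
   from memory, which yields the address of myData2. */
static const uint8_t *second_entry(void) {
    return *(myPointers + 1);
}
```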

### Pointers on the stack

Like any other address register, the stack pointer can be dereferenced and loaded from with a `MOVE` instruction. The 68000 uses the full stack convention, which means that `MOVE.L (SP),D0` will load the 4 bytes most recently pushed onto the stack into D0, rather than 4 bytes of "empty space" (read: garbage data). If you execute `MOVE.L (SP),D0` immediately after calling a subroutine with `JSR/BSR`, you will load D0 with the return address, i.e. the address of the instruction that follows the call.

If you maintain the stack 32-bit aligned at all times, you can offset the stack by multiples of 4 to get the desired parameters, assuming you pushed them in the reverse order they were declared in your function prototype.

```;implements:
;uint32_t foo (uint16_t a,uint16_t b,uint16_t c){return a+b+c;}
MOVE.L #arg2,-(SP)
MOVE.L #arg1,-(SP)
MOVE.L #arg0,-(SP)
JSR foo
LEA (12,SP),SP  ;discard the three values pushed prior to the call.
RTS

foo:
;outputs to D0
MOVE.L (4,SP),D0