Compiler/virtual machine interpreter: Difference between revisions
Line 3,446:
=={{header|Icon}}==
{{trans|ObjectIcon}}
<lang icon># -*- Icon -*-
#
|
Revision as of 13:36, 15 February 2022
You are encouraged to solve this task according to the task description, using any language you may know.
Virtual Machine Interpreter
A virtual machine implements a computer in software.
Write a virtual machine interpreter. This interpreter should be able to run virtual assembly language programs created via the task. This is a byte-coded, 32-bit word stack based virtual machine.
The program should read input from a file and/or stdin, and write output to a file and/or stdout.
Input format:
Given the following program:
count = 1; while (count < 10) { print("count is: ", count, "\n"); count = count + 1; }
The output from the Code generator is a virtual assembly code program:
Output from gen, input to VM |
---|
Datasize: 1 Strings: 2 "count is: " "\n" 0 push 1 5 store [0] 10 fetch [0] 15 push 10 20 lt 21 jz (43) 65 26 push 0 31 prts 32 fetch [0] 37 prti 38 push 1 43 prts 44 fetch [0] 49 push 1 54 add 55 store [0] 60 jmp (-51) 10 65 halt |
The first line of the input specifies the datasize required and the number of constant strings, in the order that they are reference via the code.
The data can be stored in a separate array, or the data can be stored at the beginning of the stack. Data is addressed starting at 0. If there are 3 variables, the 3rd one if referenced at address 2.
If there are one or more constant strings, they come next. The code refers to these strings by their index. The index starts at 0. So if there are 3 strings, and the code wants to reference the 3rd string, 2 will be used.
Next comes the actual virtual assembly code. The first number is the code address of that instruction. After that is the instruction mnemonic, followed by optional operands, depending on the instruction.
Registers:
sp:
the stack pointer - points to the next top of stack. The stack is a 32-bit integer array.
pc:
the program counter - points to the current instruction to be performed. The code is an array of bytes.
Data:
data string pool
Instructions:
Each instruction is one byte. The following instructions also have a 32-bit integer operand:
fetch [index]
where index is an index into the data array.
store [index]
where index is an index into the data array.
push n
where value is a 32-bit integer that will be pushed onto the stack.
jmp (n) addr
where (n) is a 32-bit integer specifying the distance between the current location and the desired location. addr is an unsigned value of the actual code address.
jz (n) addr
where (n) is a 32-bit integer specifying the distance between the current location and the desired location. addr is an unsigned value of the actual code address.
The following instructions do not have an operand. They perform their operation directly against the stack:
For the following instructions, the operation is performed against the top two entries in the stack:
add sub mul div mod lt gt le ge eq ne and or
For the following instructions, the operation is performed against the top entry in the stack:
neg not
Print the word at stack top as a character.
prtc
Print the word at stack top as an integer.
prti
Stack top points to an index into the string pool. Print that entry.
prts
Unconditional stop.
halt
- A simple example virtual machine
<lang python>def run_vm(data_size)
int stack[data_size + 1000] set stack[0..data_size - 1] to 0 int pc = 0 while True: op = code[pc] pc += 1
if op == FETCH: stack.append(stack[bytes_to_int(code[pc:pc+word_size])[0]]); pc += word_size elif op == STORE: stack[bytes_to_int(code[pc:pc+word_size])[0]] = stack.pop(); pc += word_size elif op == PUSH: stack.append(bytes_to_int(code[pc:pc+word_size])[0]); pc += word_size elif op == ADD: stack[-2] += stack[-1]; stack.pop() elif op == SUB: stack[-2] -= stack[-1]; stack.pop() elif op == MUL: stack[-2] *= stack[-1]; stack.pop() elif op == DIV: stack[-2] /= stack[-1]; stack.pop() elif op == MOD: stack[-2] %= stack[-1]; stack.pop() elif op == LT: stack[-2] = stack[-2] < stack[-1]; stack.pop() elif op == GT: stack[-2] = stack[-2] > stack[-1]; stack.pop() elif op == LE: stack[-2] = stack[-2] <= stack[-1]; stack.pop() elif op == GE: stack[-2] = stack[-2] >= stack[-1]; stack.pop() elif op == EQ: stack[-2] = stack[-2] == stack[-1]; stack.pop() elif op == NE: stack[-2] = stack[-2] != stack[-1]; stack.pop() elif op == AND: stack[-2] = stack[-2] and stack[-1]; stack.pop() elif op == OR: stack[-2] = stack[-2] or stack[-1]; stack.pop() elif op == NEG: stack[-1] = -stack[-1] elif op == NOT: stack[-1] = not stack[-1] elif op == JMP: pc += bytes_to_int(code[pc:pc+word_size])[0] elif op == JZ: if stack.pop() then pc += word_size else pc += bytes_to_int(code[pc:pc+word_size])[0] elif op == PRTC: print stack[-1] as a character; stack.pop() elif op == PRTS: print the constant string referred to by stack[-1]; stack.pop() elif op == PRTI: print stack[-1] as an integer; stack.pop() elif op == HALT: break</lang>
- Additional examples
Your solution should pass all the test cases above and the additional tests found Here.
The C and Python versions can be considered reference implementations.
- Related Tasks
Aime
<lang>integer n, pc, sp; file f; text s; index code, Data; list l, stack, strings;
f.affix(argv(1));
f.list(l, 0);
n = atoi(l[-1]); while (n) {
f.lead(s); strings.append(erase(s, -1, 0)); n -= 1;
}
while (f.list(l, 0) ^ -1) {
code.put(atoi(lf_x_text(l)), l);
}
pc = sp = 0; while (1) {
l = code[pc]; s = l[0]; if (s == "jz") { if (lb_pick(stack)) { isk_greater(code, pc, pc); } else { pc = atoi(l[-1]); } } elif (s == "jmp") { pc = atoi(l[-1]); } else { if (s == "push") { lb_push(stack, atoi(l[1])); } elif (s == "fetch") { lb_push(stack, Data[atoi(erase(l[1], -1, 0))]); } elif (s == "neg") { stack[-1] = -stack[-1]; } elif (s == "not") { stack[-1] = !stack[-1]; } elif (s == "halt") { break; } else { n = lb_pick(stack); if (s == "store") { Data[atoi(erase(l[1], -1, 0))] = n; } elif (s == "add") { stack[-1] = stack[-1] + n; } elif (s == "sub") { stack[-1] = stack[-1] - n; } elif (s == "mul") { stack[-1] = stack[-1] * n; } elif (s == "div") { stack[-1] = stack[-1] / n; } elif (s == "mod") { stack[-1] = stack[-1] % n; } elif (s == "lt") { stack[-1] = stack[-1] < n; } elif (s == "gt") { stack[-1] = stack[-1] > n; } elif (s == "le") { stack[-1] = stack[-1] <= n; } elif (s == "ge") { stack[-1] = stack[-1] >= n; } elif (s == "eq") { stack[-1] = stack[-1] == n; } elif (s == "ne") { stack[-1] = stack[-1] != n; } elif (s == "and") { stack[-1] = stack[-1] && n; } elif (s == "or") { stack[-1] = stack[-1] || n; } elif (s == "prtc") { o_byte(n); } elif (s == "prti") { o_(n); } elif (s == "prts") { o_(strings[n]); } else { } }
isk_greater(code, pc, pc); }
}</lang>
ALGOL W
<lang algolw>begin % virtual machine interpreter %
% string literals % string(256) array stringValue ( 0 :: 256 ); integer array stringLength ( 0 :: 256 ); integer MAX_STRINGS; % op codes % integer oFetch, oStore, oPush , oAdd, oSub, oMul, oDiv, oMod, oLt, oGt, oLe, oGe, oEq, oNe , oAnd, oOr, oNeg, oNot, oJmp, oJz, oPrtc, oPrts, oPrti, oHalt ; string(6) array opName ( 1 :: 24 ); integer OP_MAX; % code % string(1) array byteCode ( 0 :: 4096 ); integer nextLocation, MAX_LOCATION; % data % integer array data ( 0 :: 4096 ); integer dataSize, MAX_DATA, MAX_STACK; % tracing % logical trace;
% reports an error and stops % procedure rtError( string(80) value message ); begin integer errorPos; write( s_w := 0, "**** Runtime error: " ); errorPos := 0; while errorPos < 80 and message( errorPos // 1 ) not = "." do begin writeon( s_w := 0, message( errorPos // 1 ) ); errorPos := errorPos + 1 end while_not_at_end_of_message ; writeon( s_w := 0, "." ); assert( false ) end genError ;
oFetch := 1; opName( oFetch ) := "fetch"; oStore := 2; opName( oStore ) := "store"; oPush := 3; opName( oPush ) := "push"; oAdd := 4; opName( oAdd ) := "add"; oSub := 5; opName( oSub ) := "sub"; oMul := 6; opName( oMul ) := "mul"; oDiv := 7; opName( oDiv ) := "div"; oMod := 8; opName( oMod ) := "mod"; oLt := 9; opName( oLt ) := "lt"; oGt := 10; opName( oGt ) := "gt"; oLe := 11; opName( oLe ) := "le"; oGe := 12; opName( oGe ) := "ge"; oEq := 13; opName( oEq ) := "eq"; oNe := 14; opName( oNe ) := "ne"; oAnd := 15; opName( oAnd ) := "and"; oOr := 16; opName( oOr ) := "or"; oNeg := 17; opName( oNeg ) := "neg"; oNot := 18; opName( oNot ) := "not"; oJmp := 19; opName( oJmp ) := "jmp"; oJz := 20; opName( oJz ) := "jz"; oPrtc := 21; opName( oPrtc ) := "prtc"; oPrts := 22; opName( oPrts ) := "prts"; oPrti := 23; opName( oPrti ) := "prti"; oHalt := 24; opName( oHalt ) := "halt"; OP_MAX := oHalt;
trace := false; MAX_STACK := 256; MAX_LOCATION := 4096; for pc := 0 until MAX_LOCATION do byteCode( pc ) := code( 0 ); MAX_DATA := 4096; for dPos := 0 until MAX_DATA do data( dPos ) := 0; MAX_STRINGS := 256; for sPos := 0 until MAX_STRINGS do begin stringValue( sPos ) := " "; stringLength( sPos ) := 0 end for_sPos ;
% load thge output from syntaxc analyser % begin % readCode %
% skips spaces on the source line % procedure skipSpaces ; begin while line( lPos // 1 ) = " " do lPos := lPos + 1 end skipSpaces ;
% parses a string from line and stores it in the string literals table % procedure readString ( integer value stringNumber ) ; begin string(256) str; integer sLen; str := " "; sLen := 0; lPos := lPos + 1; % skip the opening double-quote % while lPos <= 255 and line( lPos // 1 ) not = """" do begin str( sLen // 1 ) := line( lPos // 1 ); sLen := sLen + 1; lPos := lPos + 1 end while_more_string ; if lPos > 255 then rtError( "Unterminated String." ); % store the string % stringValue( stringNumber ) := str; stringLength( stringNumber ) := sLen end readString ;
% gets an integer from the line - checks for valid digits % integer procedure readInteger ; begin integer n; skipSpaces; n := 0; while line( lPos // 1 ) >= "0" and line( lPos // 1 ) <= "9" do begin n := ( n * 10 ) + ( decode( line( lPos // 1 ) ) - decode( "0" ) ); lPos := lPos + 1 end while_not_end_of_integer ; n end readInteger ;
% reads the next line from standard input % procedure readALine ; begin lPos := 0; readcard( line ); if trace then write( s_w := 0, ">> ", line( 0 // 32 ) ) end readALine ;
% loads an instruction from the current source line % procedure loadCodeFromLine ; begin integer pc, opCode, operand, oPos; string(256) op; logical haveOperand; % get the code location % pc := readInteger; if pc > MAX_LOCATION then rtError( "Code too large." ); % get the opCode % skipSpaces; oPos := 0; op := " "; while lPos <= 255 and line( lPos // 1 ) not = " " do begin op( oPos // 1 ) := line( lPos // 1 ); oPos := oPos + 1; lPos := lPos + 1 end while_more_opName ; % lookup the op code % opCode := 0; oPos := 1; while oPos <= OP_MAX and opCode = 0 do begin if opName( oPos ) = op then opCode := oPos else oPos := oPos + 1 end while_op_not_found ; if opCode = 0 then rtError( "Unknown op code." ); % get the operand if there is one % operand := 0; haveOperand := false; if opCode = oFetch or opCode = oStore then begin % fetch or store - operand is enclosed in square brackets % skipSpaces; if line( lPos // 1 ) not = "[" then rtError( """["" expected after fetch/store." ); lPos := lPos + 1; operand := readInteger; if operand > dataSize then rtError( "fetch/store address out of range." ); haveOperand := true end else if opCode = oPush then begin % push integer literal instruction % operand := readInteger; haveOperand := true end else if opCode = oJmp or opCode = oJz then begin % jump - the operand is the relative address enclosed in parenthesis % % followed by the absolute address - we use the absolute address so % % the opewrand will be >= 0 % skipSpaces; if line( lPos // 1 ) not = "(" then rtError( """("" expected after jmp/jz." ); lPos := lPos + 1; if line( lPos // 1 ) = "-" then % negative relative address % lPos := lPos + 1; operand := readInteger; if line( lPos // 1 ) not = ")" then rtError( """)"" expected after jmp/jz." ); lPos := lPos + 1; operand := readInteger; haveOperand := true end if_various_opcodes ; % store the code % byteCode( pc ) := code( opCode ); if haveOperand then begin % have an operand for the op code % if ( pc + 4 ) > MAX_LOCATION then rtError( "Code too large." ); for oPos := 1 until 4 do begin pc := pc + 1; byteCode( pc ) := code( operand rem 256 ); operand := operand div 256; end for_oPos end if_have_operand ; end loadCodeFromLine ;
string(256) line; string(16) name; integer lPos, tPos, stringCount;
% allow us to detect EOF % ENDFILE := EXCEPTION( false, 1, 0, false, "EOF" );
% first line should be "Datasize: d Strings: s" where d = number variables % % and s = number of strings % readALine; if line = "trace" then begin % extension - run in trace mode % trace := true; readALine end if_line_eq_trace ; if XCPNOTED(ENDFILE) then rtError( "Empty program file." ); if line( 0 // 10 ) not = "Datasize: " then rtError( "Header line missing." ); lPos := 10; dataSize := readInteger; if dataSize > MAX_DATA then rtError( "Datasize too large." ); skipSpaces; if line( lPos // 9 ) not = "Strings: " then rtError( """Strings: "" missing on header line." ); lPos := lPos + 9; stringCount := readInteger; if stringCount > MAX_STRINGS then rtError( "Too many strings." ); % read the string table % for stringNumber := 0 until stringCount - 1 do begin string(256) str; integer sLen, sPos; readALine; if XCPNOTED(ENDFILE) then rtError( "End-of-file in string table." ); if line( lPos // 1 ) not = """" then rtError( "String literal expected." ); str := " "; sLen := 0; lPos := lPos + 1; % skip the opening double-quote % while lPos <= 255 and line( lPos // 1 ) not = """" do begin str( sLen // 1 ) := line( lPos // 1 ); sLen := sLen + 1; lPos := lPos + 1 end while_more_string ; if lPos > 255 then rtError( "Unterminated String." ); % store the string % stringValue( stringNumber ) := str; stringLength( stringNumber ) := sLen end for_sPos ; % read the code % readALine; while not XCPNOTED(ENDFILE) do begin if line not = " " then loadCodeFromLine; readALine end while_not_eof end; % run the program % begin integer pc, opCode, operand, sp; integer array st ( 0 :: MAX_STACK ); logical halted; % prints a string from the string pool, escape sequences are interpreted % procedure writeOnString( integer value stringNumber ) ; begin integer cPos, sLen; string(256) text; if stringNumber < 0 or stringNumber > MAX_STRINGS then rtError( "Invalid string number." ); cPos := 0; sLen := stringLength( stringNumber ); text := stringValue( stringNumber ); while cPos < stringLength( stringNumber ) do begin string(1) ch; ch := text( cPos // 1 ); if ch not = "\" then writeon( s_w := 0, ch ) else begin % escaped character % cPos := cPos + 1; if cPos > sLen then rtError( "String terminates with ""\""." ); ch := text( cPos // 1 ); if ch = "n" then % newline % write() else writeon( s_w := 0, ch ) end; cPos := cPos + 1 end while_not_end_of_string end writeOnString ;
pc := 0; sp := -1; halted := false; while not halted do begin; % get the next op code and operand % opCode := decode( byteCode( pc ) ); pc := pc + 1; operand := 0; if opCode = oFetch or opCode = oStore or opCode = oPush or opCode = oJmp or opCode = oJz then begin % this opCode has an operand % pc := pc + 4; for bPos := 1 until 4 do begin operand := ( operand * 256 ) + decode( byteCode( pc - bPos ) ); end for_bPos end if_opCode_with_an_operand ; if trace then begin write( i_w:= 1, s_w := 0, pc, " op(", opCode, "): ", opName( opCode ), " ", operand ); write() end if_trace ; % interpret the instruction % if opCode = oFetch then begin sp := sp + 1; st( sp ) := data( operand ) end else if opCode = oStore then begin data( operand ) := st( sp ); sp := sp - 1 end else if opCode = oPush then begin sp := sp + 1; st( sp ) := operand end else if opCode = oHalt then halted := true else if opCode = oJmp then pc := operand else if oPCode = oJz then begin if st( sp ) = 0 then pc := operand; sp := sp - 1 end else if opCode = oPrtc then begin writeon( i_w := 1, s_w := 0, code( st( sp ) ) ); sp := sp - 1 end else if opCode = oPrti then begin writeon( i_w := 1, s_w := 0, st( sp ) ); sp := sp - 1 end else if opCode = oPrts then begin writeonString( st( sp ) ); sp := sp - 1 end else if opCode = oNeg then st( sp ) := - st( sp ) else if opCode = oNot then st( sp ) := ( if st( sp ) = 0 then 1 else 0 ) else begin operand := st( sp ); sp := sp - 1; if opCode = oAdd then st( sp ) := st( sp ) + operand else if opCode = oSub then st( sp ) := st( sp ) - operand else if opCode = oMul then st( sp ) := st( sp ) * operand else if opCode = oDiv then st( sp ) := st( sp ) div operand else if opCode = oMod then st( sp ) := st( sp ) rem operand else if opCode = oLt then st( sp ) := if st( sp ) < operand then 1 else 0 else if opCode = oGt then st( sp ) := if st( sp ) > operand then 1 else 0 else if opCode = oLe then st( sp ) := if st( sp ) <= operand then 1 else 0 else if opCode = oGe then st( sp ) := if st( sp ) >= operand then 1 else 0 else if opCode = oEq then st( sp ) := if st( sp ) = operand then 1 else 0 else if opCode = oNe then st( sp ) := if st( sp ) not = operand then 1 else 0 else if opCode = oAnd then st( sp ) := if st( sp ) not = 0 and operand not = 0 then 1 else 0 else if opCode = oOr then st( sp ) := if st( sp ) not = 0 or operand not = 0 then 1 else 0 else rtError( "Unknown opCode." ) end if_various_opCodes end while_not_halted end
end.</lang>
AWK
Tested with gawk 4.1.1 and mawk 1.3.4. <lang AWK> function error(msg) {
printf("%s\n", msg) exit(1)
}
function bytes_to_int(bstr, i, sum) {
sum = 0 for (i=word_size-1; i>=0; i--) { sum *= 256 sum += code[bstr+i] } return sum
}
function emit_byte(x) {
code[next_free_code_index++] = x
}
function emit_word(x, i) {
for (i=0; i<word_size; i++) { emit_byte(int(x)%256); x = int(x/256) }
}
function run_vm(data_size) {
sp = data_size + 1 pc = 0 while (1) { op = code[pc++] if (op == FETCH) { stack[sp++] = stack[bytes_to_int(pc)] pc += word_size } else if (op == STORE) { stack[bytes_to_int(pc)] = stack[--sp] pc += word_size } else if (op == PUSH) { stack[sp++] = bytes_to_int(pc) pc += word_size } else if (op == ADD ) { stack[sp-2] += stack[sp-1]; sp-- } else if (op == SUB ) { stack[sp-2] -= stack[sp-1]; sp-- } else if (op == MUL ) { stack[sp-2] *= stack[sp-1]; sp-- } else if (op == DIV ) { stack[sp-2] = int(stack[sp-2] / stack[sp-1]); sp-- } else if (op == MOD ) { stack[sp-2] %= stack[sp-1]; sp-- } else if (op == LT ) { stack[sp-2] = stack[sp-2] < stack[sp-1]; sp-- } else if (op == GT ) { stack[sp-2] = stack[sp-2] > stack[sp-1]; sp-- } else if (op == LE ) { stack[sp-2] = stack[sp-2] <= stack[sp-1]; sp-- } else if (op == GE ) { stack[sp-2] = stack[sp-2] >= stack[sp-1]; sp-- } else if (op == EQ ) { stack[sp-2] = stack[sp-2] == stack[sp-1]; sp-- } else if (op == NE ) { stack[sp-2] = stack[sp-2] != stack[sp-1]; sp-- } else if (op == AND ) { stack[sp-2] = stack[sp-2] && stack[sp-1]; sp-- } else if (op == OR ) { stack[sp-2] = stack[sp-2] || stack[sp-1]; sp-- } else if (op == NEG ) { stack[sp-1] = - stack[sp-1] } else if (op == NOT ) { stack[sp-1] = ! stack[sp-1] } else if (op == JMP ) { pc += bytes_to_int(pc) } else if (op == JZ ) { if (stack[--sp]) { pc += word_size } else { pc += bytes_to_int(pc) } } else if (op == PRTC) { printf("%c", stack[--sp]) } else if (op == PRTS) { printf("%s", string_pool[stack[--sp]]) } else if (op == PRTI) { printf("%d", stack[--sp]) } else if (op == HALT) { break } } # while
}
function str_trans(srce, dest, i) {
dest = "" for (i=1; i <= length(srce); ) { if (substr(srce, i, 1) == "\\" && i < length(srce)) { if (substr(srce, i+1, 1) == "n") { dest = dest "\n" i += 2 } else if (substr(srce, i+1, 1) == "\\") { dest = dest "\\" i += 2 } } else { dest = dest substr(srce, i, 1) i += 1 } } return dest
}
function load_code( n, i) {
getline line if (line == "") error("empty line") n=split(line, line_list) data_size = line_list[2] n_strings = line_list[4] for (i=0; i<n_strings; i++) { getline line gsub(/\n/, "", line) gsub(/"/ , "", line) string_pool[i] = str_trans(line) } while (getline) { offset = int($1) instr = $2 opcode = code_map[instr] if (opcode == "") error("Unknown instruction " instr " at " offset) emit_byte(opcode) if (opcode == JMP || opcode == JZ) { p = int($4) emit_word(p - (offset + 1)) } else if (opcode == PUSH) { value = int($3) emit_word(value) } else if (opcode == FETCH || opcode == STORE) { gsub(/\[/, "", $3) gsub(/\]/, "", $3) value = int($3) emit_word(value) } } return data_size
}
BEGIN {
code_map["fetch"] = FETCH = 1 code_map["store"] = STORE = 2 code_map["push" ] = PUSH = 3 code_map["add" ] = ADD = 4 code_map["sub" ] = SUB = 5 code_map["mul" ] = MUL = 6 code_map["div" ] = DIV = 7 code_map["mod" ] = MOD = 8 code_map["lt" ] = LT = 9 code_map["gt" ] = GT = 10 code_map["le" ] = LE = 11 code_map["ge" ] = GE = 12 code_map["eq" ] = EQ = 13 code_map["ne" ] = NE = 14 code_map["and" ] = AND = 15 code_map["or" ] = OR = 16 code_map["neg" ] = NEG = 17 code_map["not" ] = NOT = 18 code_map["jmp" ] = JMP = 19 code_map["jz" ] = JZ = 20 code_map["prtc" ] = PRTC = 21 code_map["prts" ] = PRTS = 22 code_map["prti" ] = PRTI = 23 code_map["halt" ] = HALT = 24
next_free_node_index = 1 next_free_code_index = 0 word_size = 4 input_file = "-" if (ARGC > 1) input_file = ARGV[1] data_size = load_code() run_vm(data_size)
} </lang>
- Output — count:
count is: 1 count is: 2 count is: 3 count is: 4 count is: 5 count is: 6 count is: 7 count is: 8 count is: 9
C
Tested with gcc 4.81 and later, compiles warning free with -Wall -Wextra <lang C>#include <stdio.h>
- include <stdlib.h>
- include <stdarg.h>
- include <string.h>
- include <stdint.h>
- include <ctype.h>
- define NELEMS(arr) (sizeof(arr) / sizeof(arr[0]))
- define da_dim(name, type) type *name = NULL; \
int _qy_ ## name ## _p = 0; \ int _qy_ ## name ## _max = 0
- define da_redim(name) do {if (_qy_ ## name ## _p >= _qy_ ## name ## _max) \
name = realloc(name, (_qy_ ## name ## _max += 32) * sizeof(name[0]));} while (0)
- define da_rewind(name) _qy_ ## name ## _p = 0
- define da_append(name, x) do {da_redim(name); name[_qy_ ## name ## _p++] = x;} while (0)
typedef unsigned char uchar; typedef uchar code;
typedef enum { FETCH, STORE, PUSH, ADD, SUB, MUL, DIV, MOD, LT, GT, LE, GE, EQ, NE, AND,
OR, NEG, NOT, JMP, JZ, PRTC, PRTS, PRTI, HALT
} Code_t;
typedef struct Code_map {
char *text; Code_t op;
} Code_map;
Code_map code_map[] = {
{"fetch", FETCH}, {"store", STORE}, {"push", PUSH }, {"add", ADD }, {"sub", SUB }, {"mul", MUL }, {"div", DIV }, {"mod", MOD }, {"lt", LT }, {"gt", GT }, {"le", LE }, {"ge", GE }, {"eq", EQ }, {"ne", NE }, {"and", AND }, {"or", OR }, {"neg", NEG }, {"not", NOT }, {"jmp", JMP }, {"jz", JZ }, {"prtc", PRTC }, {"prts", PRTS }, {"prti", PRTI }, {"halt", HALT },
};
FILE *source_fp; da_dim(object, code);
void error(const char *fmt, ... ) {
va_list ap; char buf[1000];
va_start(ap, fmt); vsprintf(buf, fmt, ap); va_end(ap); printf("error: %s\n", buf); exit(1);
}
/*** Virtual Machine interpreter ***/ void run_vm(const code obj[], int32_t data[], int g_size, char **string_pool) {
int32_t *sp = &data[g_size + 1]; const code *pc = obj;
again: switch (*pc++) { case FETCH: *sp++ = data[*(int32_t *)pc]; pc += sizeof(int32_t); goto again; case STORE: data[*(int32_t *)pc] = *--sp; pc += sizeof(int32_t); goto again; case PUSH: *sp++ = *(int32_t *)pc; pc += sizeof(int32_t); goto again; case ADD: sp[-2] += sp[-1]; --sp; goto again; case SUB: sp[-2] -= sp[-1]; --sp; goto again; case MUL: sp[-2] *= sp[-1]; --sp; goto again; case DIV: sp[-2] /= sp[-1]; --sp; goto again; case MOD: sp[-2] %= sp[-1]; --sp; goto again; case LT: sp[-2] = sp[-2] < sp[-1]; --sp; goto again; case GT: sp[-2] = sp[-2] > sp[-1]; --sp; goto again; case LE: sp[-2] = sp[-2] <= sp[-1]; --sp; goto again; case GE: sp[-2] = sp[-2] >= sp[-1]; --sp; goto again; case EQ: sp[-2] = sp[-2] == sp[-1]; --sp; goto again; case NE: sp[-2] = sp[-2] != sp[-1]; --sp; goto again; case AND: sp[-2] = sp[-2] && sp[-1]; --sp; goto again; case OR: sp[-2] = sp[-2] || sp[-1]; --sp; goto again; case NEG: sp[-1] = -sp[-1]; goto again; case NOT: sp[-1] = !sp[-1]; goto again; case JMP: pc += *(int32_t *)pc; goto again; case JZ: pc += (*--sp == 0) ? *(int32_t *)pc : (int32_t)sizeof(int32_t); goto again; case PRTC: printf("%c", sp[-1]); --sp; goto again; case PRTS: printf("%s", string_pool[sp[-1]]); --sp; goto again; case PRTI: printf("%d", sp[-1]); --sp; goto again; case HALT: break; default: error("Unknown opcode %d\n", *(pc - 1)); }
}
char *read_line(int *len) {
static char *text = NULL; static int textmax = 0;
for (*len = 0; ; (*len)++) { int ch = fgetc(source_fp); if (ch == EOF || ch == '\n') { if (*len == 0) return NULL; break; } if (*len + 1 >= textmax) { textmax = (textmax == 0 ? 128 : textmax * 2); text = realloc(text, textmax); } text[*len] = ch; } text[*len] = '\0'; return text;
}
char *rtrim(char *text, int *len) { // remove trailing spaces
for (; *len > 0 && isspace(text[*len - 1]); --(*len)) ;
text[*len] = '\0'; return text;
}
char *translate(char *st) {
char *p, *q; if (st[0] == '"') // skip leading " if there ++st; p = q = st;
while ((*p++ = *q++) != '\0') { if (q[-1] == '\\') { if (q[0] == 'n') { p[-1] = '\n'; ++q; } else if (q[0] == '\\') { ++q; } } if (q[0] == '"' && q[1] == '\0') // skip trialing " if there ++q; }
return st;
}
/* convert an opcode string into its byte value */ int findit(const char text[], int offset) {
for (size_t i = 0; i < sizeof(code_map) / sizeof(code_map[0]); i++) { if (strcmp(code_map[i].text, text) == 0) return code_map[i].op; } error("Unknown instruction %s at %d\n", text, offset); return -1;
}
void emit_byte(int c) {
da_append(object, (uchar)c);
}
void emit_int(int32_t n) {
union { int32_t n; unsigned char c[sizeof(int32_t)]; } x;
x.n = n;
for (size_t i = 0; i < sizeof(x.n); ++i) { emit_byte(x.c[i]); }
}
/* Datasize: 5 Strings: 3 " is prime\n" "Total primes found: " "\n"
154 jmp (-73) 82 164 jz (32) 197 175 push 0 159 fetch [4] 149 store [3] */
/* Load code into global array object, return the string pool and data size */ char **load_code(int *ds) {
int line_len, n_strings; char **string_pool; char *text = read_line(&line_len); text = rtrim(text, &line_len);
strtok(text, " "); // skip "Datasize:" *ds = atoi(strtok(NULL, " ")); // get actual data_size strtok(NULL, " "); // skip "Strings:" n_strings = atoi(strtok(NULL, " ")); // get number of strings
string_pool = malloc(n_strings * sizeof(char *)); for (int i = 0; i < n_strings; ++i) { text = read_line(&line_len); text = rtrim(text, &line_len); text = translate(text); string_pool[i] = strdup(text); }
for (;;) { int len;
text = read_line(&line_len); if (text == NULL) break; text = rtrim(text, &line_len);
int offset = atoi(strtok(text, " ")); // get the offset char *instr = strtok(NULL, " "); // get the instruction int opcode = findit(instr, offset); emit_byte(opcode); char *operand = strtok(NULL, " ");
switch (opcode) { case JMP: case JZ: operand++; // skip the '(' len = strlen(operand); operand[len - 1] = '\0'; // remove the ')' emit_int(atoi(operand)); break; case PUSH: emit_int(atoi(operand)); break; case FETCH: case STORE: operand++; // skip the '[' len = strlen(operand); operand[len - 1] = '\0'; // remove the ']' emit_int(atoi(operand)); break; } } return string_pool;
}
void init_io(FILE **fp, FILE *std, const char mode[], const char fn[]) {
if (fn[0] == '\0') *fp = std; else if ((*fp = fopen(fn, mode)) == NULL) error(0, 0, "Can't open %s\n", fn);
}
int main(int argc, char *argv[]) {
init_io(&source_fp, stdin, "r", argc > 1 ? argv[1] : ""); int data_size; char **string_pool = load_code(&data_size); int data[1000 + data_size]; run_vm(object, data, data_size, string_pool);
}</lang>
COBOL
Code by Steve Williams (with changes to work around code highlighting issues). Tested with GnuCOBOL 2.2.
<lang cobol> >>SOURCE FORMAT IS FREE identification division.
- > this code is dedicated to the public domain
- > (GnuCOBOL) 2.3-dev.0
program-id. vminterpreter. environment division. configuration section. repository. function all intrinsic. input-output section. file-control.
select input-file assign using input-name status is input-status organization is line sequential.
data division.
file section. fd input-file. 01 input-record pic x(64).
working-storage section. 01 program-name pic x(32). 01 input-name pic x(32). 01 input-status pic xx.
01 error-record pic x(64) value spaces global.
01 v-max pic 99. 01 parameters.
03 offset pic 999. 03 opcode pic x(8). 03 parm0 pic x(16). 03 parm1 pic x(16). 03 parm2 pic x(16).
01 opcodes.
03 opFETCH pic x value x'00'. 03 opSTORE pic x value x'01'. 03 opPUSH pic x value x'02'. 03 opADD pic x value x'03'. 03 opSUB pic x value x'04'. 03 opMUL pic x value x'05'. 03 opDIV pic x value x'06'. 03 opMOD pic x value x'07'. 03 opLT pic x value x'08'. 03 opGT pic x value x'09'. 03 opLE pic x value x'0A'. 03 opGE pic x value x'0B'. 03 opEQ pic x value x'0C'. 03 opNE pic x value x'0D'. 03 opAND pic x value x'0E'. 03 opOR pic x value x'0F'. 03 opNEG pic x value x'10'. 03 opNOT pic x value x'11'. 03 opJMP pic x value x'13'. 03 opJZ pic x value x'14'. 03 opPRTC pic x value x'15'. 03 opPRTS pic x value x'16'. 03 opPRTI pic x value x'17'. 03 opHALT pic x value x'18'.
01 filler.
03 s pic 99. 03 s-max pic 99 value 0. 03 s-lim pic 99 value 16. 03 filler occurs 16. 05 string-length pic 99. 05 string-entry pic x(48).
01 filler.
03 v pic 99. 03 v-lim pic 99 value 16. 03 variables occurs 16 usage binary-int.
01 generated-code global.
03 c pic 999 value 1. 03 pc pic 999. 03 c-lim pic 999 value 512. 03 kode pic x(512).
01 filler.
03 stack1 pic 999 value 2. 03 stack2 pic 999 value 1. 03 stack-lim pic 999 value 998. 03 stack occurs 998 usage binary-int.
01 display-definitions global.
03 ascii-character. 05 numeric-value usage binary-char. 03 display-integer pic -(9)9. 03 word-x. 05 word usage binary-int. 03 word-length pic 9. 03 string1 pic 99. 03 length1 pic 99. 03 count1 pic 99. 03 display-pending pic x.
procedure division. start-vminterpreter.
display 1 upon command-line *> get arg(1) accept program-name from argument-value move length(word) to word-length perform load-code perform run-code stop run .
run-code.
move 1 to pc perform until pc >= c evaluate kode(pc:1) when opFETCH perform push-stack move kode(pc + 1:word-length) to word-x add 1 to word *> convert offset to subscript move variables(word) to stack(stack1) add word-length to pc when opPUSH perform push-stack move kode(pc + 1:word-length) to word-x move word to stack(stack1) add word-length to pc when opNEG compute stack(stack1) = -stack(stack1) when opNOT if stack(stack1) = 0 move 1 to stack(stack1) else move 0 to stack(stack1) end-if when opJMP move kode(pc + 1:word-length) to word-x move word to pc when opHALT if display-pending = 'Y' display space end-if exit perform when opJZ if stack(stack1) = 0 move kode(pc + 1:word-length) to word-x move word to pc else add word-length to pc end-if perform pop-stack when opSTORE move kode(pc + 1:word-length) to word-x add 1 to word *> convert offset to subscript move stack(stack1) to variables(word) add word-length to pc perform pop-stack when opADD add stack(stack1) to stack(stack2) perform pop-stack when opSUB subtract stack(stack1) from stack(stack2) perform pop-stack when opMUL multiply stack(stack1) by stack(stack2) *>rounded mode nearest-toward-zero *> doesn't match python perform pop-stack when opDIV divide stack(stack1) into stack(stack2) *>rounded mode nearest-toward-zero *> doesn't match python perform pop-stack when opMOD move mod(stack(stack2),stack(stack1)) to stack(stack2) perform pop-stack when opLT if stack(stack2) < stack(stack1) move 1 to stack(stack2) else move 0 to stack(stack2) end-if perform pop-stack when opGT if stack(stack2) > stack(stack1) move 1 to stack(stack2) else move 0 to stack(stack2) end-if perform pop-stack when opLE if stack(stack2) <= stack(stack1) move 1 to stack(stack2) else move 0 to stack(stack2) end-if perform pop-stack when opGE if stack(stack2) >= stack(stack1) move 1 to stack(stack2) else move 0 to stack(stack2) end-if perform pop-stack when opEQ if stack(stack2) = stack(stack1) move 1 to stack(stack2) else move 0 to stack(stack2) end-if perform pop-stack when opNE if stack(stack2) <> stack(stack1) move 1 to stack(stack2) else move 0 to stack(stack2) end-if perform pop-stack when opAND call "CBL_AND" using stack(stack1) stack(stack2) by value word-length perform pop-stack when opOR call "CBL_OR" using stack(stack1) stack(stack2) by value word-length perform pop-stack when opPRTC move stack(stack1) to numeric-value if numeric-value = 10 display space move 'N' to display-pending else display ascii-character with no advancing move 'Y' to display-pending end-if perform pop-stack when opPRTS add 1 to word *> convert offset to subscript move 1 to string1 move string-length(word) to length1 perform until string1 > string-length(word) move 0 to count1 inspect string-entry(word)(string1:length1) tallying count1 for characters before initial '\' *> ' workaround code highlighter problem evaluate true when string-entry(word)(string1 + count1 + 1:1) = 'n' *> \n display string-entry(word)(string1:count1) move 'N' to display-pending compute string1 = string1 + 2 + count1 compute length1 = length1 - 2 - count1 when string-entry(word)(string1 + count1 + 1:1) = '\' *> ' \\ display string-entry(word)(string1:count1 + 1) with no advancing move 'Y' to display-pending compute string1 = string1 + 2 + count1 compute length1 = length1 - 2 - count1 when other display string-entry(word)(string1:count1) with no advancing move 'Y' to display-pending add count1 to string1 subtract count1 from length1 end-evaluate end-perform perform pop-stack when opPRTI move stack(stack1) to display-integer display trim(display-integer) with no advancing move 'Y' to display-pending perform pop-stack end-evaluate add 1 to pc end-perform .
push-stack.
if stack1 >= stack-lim string 'in vminterpreter at ' pc ' stack overflow at ' stack-lim into error-record perform report-error end-if add 1 to stack1 stack2 >>d display ' push at ' pc space stack1 space stack2 .
pop-stack.
if stack1 < 2 string 'in vminterpreter at ' pc ' stack underflow' into error-record perform report-error end-if >>d display ' pop at ' pc space stack1 space stack2 subtract 1 from stack1 stack2 .
load-code.
perform read-input if input-status <> '00' string 'in vminterpreter no input data' into error-record perform report-error end-if
unstring input-record delimited by all spaces into parm1 v-max parm2 s-max if v-max > v-lim string 'in vminterpreter datasize exceeds ' v-lim into error-record perform report-error end-if if s-max > s-lim string 'in vminterpreter number of strings exceeds ' s-lim into error-record perform report-error end-if
perform read-input perform varying s from 1 by 1 until s > s-max or input-status <> '00' compute string-length(s) string-length(word) = length(trim(input-record)) - 2 move input-record(2:string-length(word)) to string-entry(s) perform read-input end-perform if s <= s-max string 'in vminterpreter not all strings found' into error-record perform report-error end-if
perform until input-status <> '00' initialize parameters unstring input-record delimited by all spaces into parm0 offset opcode parm1 parm2 evaluate opcode when 'fetch' call 'emitbyte' using opFETCH call 'emitword' using parm1 when 'store' call 'emitbyte' using opSTORE call 'emitword' using parm1 when 'push' call 'emitbyte' using opPUSH call 'emitword' using parm1 when 'add' call 'emitbyte' using opADD when 'sub' call 'emitbyte' using opSUB when 'mul' call 'emitbyte' using opMUL when 'div' call 'emitbyte' using opDIV when 'mod' call 'emitbyte' using opMOD when 'lt' call 'emitbyte' using opLT when 'gt' call 'emitbyte' using opGT when 'le' call 'emitbyte' using opLE when 'ge' call 'emitbyte' using opGE when 'eq' call 'emitbyte' using opEQ when 'ne' call 'emitbyte' using opNE when 'and' call 'emitbyte' using opAND when 'or' call 'emitbyte' using opOR when 'not' call 'emitbyte' using opNOT when 'neg' call 'emitbyte' using opNEG when 'jmp' call 'emitbyte' using opJMP call 'emitword' using parm2 when 'jz' call 'emitbyte' using opJZ call 'emitword' using parm2 when 'prtc' call 'emitbyte' using opPRTC when 'prts' call 'emitbyte' using opPRTS when 'prti' call 'emitbyte' using opPRTI when 'halt' call 'emitbyte' using opHALT when other string 'in vminterpreter unknown opcode ' trim(opcode) ' at ' offset into error-record perform report-error end-evaluate perform read-input end-perform .
read-input.
if program-name = spaces move '00' to input-status accept input-record on exception move '10' to input-status end-accept exit paragraph end-if if input-name = spaces string program-name delimited by space '.gen' into input-name open input input-file if input-status <> '00' string 'in vminterpreter ' trim(input-name) ' file open status ' input-status into error-record perform report-error end-if end-if read input-file into input-record evaluate input-status when '00' continue when '10' close input-file when other string 'in vminterpreter unexpected input-status: ' input-status into error-record perform report-error end-evaluate .
report-error.
display error-record upon syserr stop run with error status -1 .
identification division. program-id. emitbyte. data division. linkage section. 01 opcode pic x. procedure division using opcode. start-emitbyte.
if c >= c-lim string 'in vminterpreter emitbyte c exceeds ' c-lim into error-record call 'reporterror' end-if move opcode to kode(c:1) add 1 to c .
end program emitbyte.
identification division. program-id. emitword. data division. working-storage section. 01 word-temp pic x(8). linkage section. 01 word-value any length. procedure division using word-value. start-emitword.
if c + word-length >= c-lim string 'in vminterpreter emitword c exceeds ' c-lim into error-record call 'reporterror' end-if move word-value to word-temp inspect word-temp converting '[' to ' ' inspect word-temp converting ']' to ' ' move numval(trim(word-temp)) to word move word-x to kode(c:word-length) add word-length to c .
end program emitword.
end program vminterpreter.</lang>
- Output — Count:
prompt$ ./lexer <testcases/Count | ./parser | ./generator | ./vminterpreter count is: 1 count is: 2 count is: 3 count is: 4 count is: 5 count is: 6 count is: 7 count is: 8 count is: 9
Forth
Tested with Gforth 0.7.3 <lang Forth>CREATE BUF 0 , \ single-character look-ahead buffer
- PEEK BUF @ 0= IF KEY BUF ! THEN BUF @ ;
- GETC PEEK 0 BUF ! ;
- SPACE? DUP BL = SWAP 9 14 WITHIN OR ;
- >SPACE BEGIN PEEK SPACE? WHILE GETC DROP REPEAT ;
- DIGIT? 48 58 WITHIN ;
- >INT ( -- n) >SPACE 0
BEGIN PEEK DIGIT? WHILE GETC [CHAR] 0 - SWAP 10 * + REPEAT ;
CREATE A 0 ,
- C@A ( -- c) A @ C@ ;
- C@A+ ( -- c) C@A 1 CHARS A +! ;
- C!A+ ( c --) A @ C! 1 CHARS A +! ;
- WORD ( -- c-addr) >SPACE PAD 1+ A !
BEGIN PEEK SPACE? INVERT WHILE GETC C!A+ REPEAT >SPACE PAD A @ OVER - 1- PAD C! ;
- >STRING ( -- c-addr) >SPACE GETC DROP PAD 1+ A !
BEGIN PEEK [CHAR] " <> WHILE GETC C!A+ REPEAT GETC DROP PAD A @ OVER - 1- PAD C! ;
- \INTERN ( c-addr -- c-addr) HERE >R A ! C@A+ DUP C,
BEGIN DUP WHILE C@A+ DUP [CHAR] \ = IF DROP -1 R@ +! C@A+ [CHAR] n = IF 10 ELSE [CHAR] \ THEN THEN C, 1- REPEAT DROP R> ;
- . 0 .R ;
CREATE DATA 0 , CREATE STRINGS 0 ,
- >DATA HERE DATA !
WORD DROP >INT 4 * BEGIN DUP WHILE 0 C, 1- REPEAT DROP ;
- >STRINGS HERE STRINGS !
WORD DROP >INT DUP >R CELLS ALLOT 0 BEGIN DUP R@ < WHILE DUP CELLS >STRING \INTERN STRINGS @ ROT + ! 1+ REPEAT R> DROP DROP ;
- >HEADER >DATA >STRINGS ;
- i32! ( n addr --)
OVER $FF AND OVER C! 1+ OVER 8 RSHIFT $FF AND OVER C! 1+ OVER 16 RSHIFT $FF AND OVER C! 1+ SWAP 24 RSHIFT $FF AND SWAP C! ;
- i32@ ( addr -- n) >R \ This is kinda slow... hmm
R@ C@ R@ 1 + C@ 8 LSHIFT OR R@ 2 + C@ 16 LSHIFT OR R> 3 + C@ 24 LSHIFT OR DUP $7FFFFFFF AND SWAP $80000000 AND - ; \ sign extend
- i32, ( n --) HERE 4 ALLOT i32! ;
- i32@+ ( -- n) A @ i32@ A @ 4 + A ! ;
CREATE BYTECODE 0 ,
- @fetch i32@+ 4 * DATA @ + i32@ ;
- @store i32@+ 4 * DATA @ + i32! ;
- @jmp i32@+ BYTECODE @ + A ! ;
- @jz IF 4 A +! ELSE @jmp THEN ;
- @prts CELLS STRINGS @ + @ COUNT TYPE ;
- @div >R S>D R> SM/REM SWAP DROP ;
CREATE OPS ' @fetch , ' @store , ' i32@+ , ' @jmp , ' @jz , ' EMIT , ' . , ' @prts , ' NEGATE , ' 0= , ' + , ' - , ' * , ' @div , ' MOD , ' < , ' > , ' <= , ' >= , ' = , ' <> , ' AND , ' OR , ' BYE , CREATE #OPS 0 ,
- OP: CREATE #OPS @ , 1 #OPS +! DOES> @ ;
OP: fetch OP: store OP: push OP: jmp OP: jz OP: prtc OP: prti OP: prts OP: neg OP: not OP: add OP: sub OP: mul OP: div OP: mod OP: lt OP: gt OP: le OP: ge OP: eq OP: ne OP: and OP: or OP: halt
- >OP WORD FIND
0= IF ." Unrecognized opcode" ABORT THEN EXECUTE ;
- >i32 >INT i32, ;
- >[i32] GETC DROP >i32 GETC DROP ;
- >OFFSET WORD DROP ( drop relative offset) >i32 ;
CREATE >PARAM ' >[i32] DUP , , ' >i32 , ' >OFFSET DUP , ,
- >BYTECODE HERE >R
BEGIN >INT DROP >OP >R R@ C, R@ 5 < IF R@ CELLS >PARAM + @ EXECUTE THEN R> halt = UNTIL R> BYTECODE ! ;
- RUN BYTECODE @ A !
BEGIN C@A+ CELLS OPS + @ EXECUTE AGAIN ;
>HEADER >BYTECODE RUN</lang>
Fortran
Fortran 2008/2018 code with some limited use of the C preprocessor. If you are on a platform with case-sensitive filenames, and call the source file vm.F90, then gfortran will know to use the C preprocessor. <lang fortran>module compiler_type_kinds
use, intrinsic :: iso_fortran_env, only: int32 use, intrinsic :: iso_fortran_env, only: int64
implicit none private
! Synonyms. integer, parameter, public :: size_kind = int64 integer, parameter, public :: length_kind = size_kind integer, parameter, public :: nk = size_kind
! Synonyms for character capable of storing a Unicode code point. integer, parameter, public :: unicode_char_kind = selected_char_kind ('ISO_10646') integer, parameter, public :: ck = unicode_char_kind
! Synonyms for integers capable of storing a Unicode code point. integer, parameter, public :: unicode_ichar_kind = int32 integer, parameter, public :: ick = unicode_ichar_kind
! Synonyms for integers in the virtual machine or the interpreter’s ! runtime. (The Rosetta Code task says integers in the virtual ! machine are 32-bit, but there is nothing in the task that prevents ! us using 64-bit integers in the compiler and interpreter.) integer, parameter, public :: runtime_int_kind = int64 integer, parameter, public :: rik = runtime_int_kind
end module compiler_type_kinds
module helpers
use, non_intrinsic :: compiler_type_kinds, only: nk, rik, ck
implicit none private
public :: new_storage_size public :: next_power_of_two
public :: isspace public :: quoted_string
public :: int32_to_vm_bytes public :: uint32_to_vm_bytes public :: int32_from_vm_bytes public :: uint32_from_vm_bytes
public :: bool2int
character(1, kind = ck), parameter, public :: horizontal_tab_char = char (9, kind = ck) character(1, kind = ck), parameter, public :: linefeed_char = char (10, kind = ck) character(1, kind = ck), parameter, public :: vertical_tab_char = char (11, kind = ck) character(1, kind = ck), parameter, public :: formfeed_char = char (12, kind = ck) character(1, kind = ck), parameter, public :: carriage_return_char = char (13, kind = ck) character(1, kind = ck), parameter, public :: space_char = ck_' '
! The following is correct for Unix and its relatives. character(1, kind = ck), parameter, public :: newline_char = linefeed_char
character(1, kind = ck), parameter, public :: backslash_char = char (92, kind = ck)
contains
elemental function new_storage_size (length_needed) result (size) integer(kind = nk), intent(in) :: length_needed integer(kind = nk) :: size
! Increase storage by orders of magnitude.
if (2_nk**32 < length_needed) then size = huge (1_nk) else size = next_power_of_two (length_needed) end if end function new_storage_size
elemental function next_power_of_two (x) result (y) integer(kind = nk), intent(in) :: x integer(kind = nk) :: y
! ! It is assumed that no more than 64 bits are used. ! ! The branch-free algorithm is that of ! https://archive.is/nKxAc#RoundUpPowerOf2 ! ! Fill in bits until one less than the desired power of two is ! reached, and then add one. !
y = x - 1 y = ior (y, ishft (y, -1)) y = ior (y, ishft (y, -2)) y = ior (y, ishft (y, -4)) y = ior (y, ishft (y, -8)) y = ior (y, ishft (y, -16)) y = ior (y, ishft (y, -32)) y = y + 1 end function next_power_of_two
elemental function isspace (ch) result (bool) character(1, kind = ck), intent(in) :: ch logical :: bool
bool = (ch == horizontal_tab_char) .or. & & (ch == linefeed_char) .or. & & (ch == vertical_tab_char) .or. & & (ch == formfeed_char) .or. & & (ch == carriage_return_char) .or. & & (ch == space_char) end function isspace
function quoted_string (str) result (qstr) character(*, kind = ck), intent(in) :: str character(:, kind = ck), allocatable :: qstr
integer(kind = nk) :: n, i, j
! Compute n = the size of qstr. n = 2_nk do i = 1_nk, len (str, kind = nk) select case (str(i:i)) case (newline_char, backslash_char) n = n + 2 case default n = n + 1 end select end do
allocate (character(n, kind = ck) :: qstr)
! Quote the string. qstr(1:1) = ck_'"' j = 2_nk do i = 1_nk, len (str, kind = nk) select case (str(i:i)) case (newline_char) qstr(j:j) = backslash_char qstr((j + 1):(j + 1)) = ck_'n' j = j + 2 case (backslash_char) qstr(j:j) = backslash_char qstr((j + 1):(j + 1)) = backslash_char j = j + 2 case default qstr(j:j) = str(i:i) j = j + 1 end select end do if (j /= n) error stop ! Check code correctness. qstr(n:n) = ck_'"' end function quoted_string
subroutine int32_to_vm_bytes (n, bytes, i) integer(kind = rik), intent(in) :: n character(1), intent(inout) :: bytes(0:*) integer(kind = rik), intent(in) :: i
! ! The virtual machine is presumed to be little-endian. Because I ! slightly prefer little-endian. !
bytes(i) = achar (ibits (n, 0, 8)) bytes(i + 1) = achar (ibits (n, 8, 8)) bytes(i + 2) = achar (ibits (n, 16, 8)) bytes(i + 3) = achar (ibits (n, 24, 8)) end subroutine int32_to_vm_bytes
subroutine uint32_to_vm_bytes (n, bytes, i) integer(kind = rik), intent(in) :: n character(1), intent(inout) :: bytes(0:*) integer(kind = rik), intent(in) :: i
call int32_to_vm_bytes (n, bytes, i) end subroutine uint32_to_vm_bytes
subroutine int32_from_vm_bytes (n, bytes, i) integer(kind = rik), intent(out) :: n character(1), intent(in) :: bytes(0:*) integer(kind = rik), intent(in) :: i
! ! The virtual machine is presumed to be little-endian. Because I ! slightly prefer little-endian. !
call uint32_from_vm_bytes (n, bytes, i) if (ibits (n, 31, 1) == 1) then ! Extend the sign bit. n = ior (n, not ((2_rik ** 32) - 1)) end if end subroutine int32_from_vm_bytes
subroutine uint32_from_vm_bytes (n, bytes, i) integer(kind = rik), intent(out) :: n character(1), intent(in) :: bytes(0:*) integer(kind = rik), intent(in) :: i
! ! The virtual machine is presumed to be little-endian. Because I ! slightly prefer little-endian. !
integer(kind = rik) :: n0, n1, n2, n3
n0 = iachar (bytes(i), kind = rik) n1 = ishft (iachar (bytes(i + 1), kind = rik), 8) n2 = ishft (iachar (bytes(i + 2), kind = rik), 16) n3 = ishft (iachar (bytes(i + 3), kind = rik), 24) n = ior (n0, ior (n1, ior (n2, n3))) end subroutine uint32_from_vm_bytes
elemental function bool2int (bool) result (int) logical, intent(in) :: bool integer(kind = rik) :: int
if (bool) then int = 1_rik else int = 0_rik end if end function bool2int
end module helpers
module string_buffers
use, intrinsic :: iso_fortran_env, only: error_unit use, intrinsic :: iso_fortran_env, only: int64 use, non_intrinsic :: compiler_type_kinds, only: nk, ck, ick use, non_intrinsic :: helpers
implicit none private
public :: strbuf_t public :: skip_whitespace public :: skip_non_whitespace public :: skip_whitespace_backwards public :: at_end_of_line
type :: strbuf_t integer(kind = nk), private :: len = 0 ! ! ‘chars’ is made public for efficient access to the individual ! characters. ! character(1, kind = ck), allocatable, public :: chars(:) contains procedure, pass, private :: ensure_storage => strbuf_t_ensure_storage procedure, pass :: to_unicode_full_string => strbuf_t_to_unicode_full_string procedure, pass :: to_unicode_substring => strbuf_t_to_unicode_substring procedure, pass :: length => strbuf_t_length procedure, pass :: set => strbuf_t_set procedure, pass :: append => strbuf_t_append generic :: to_unicode => to_unicode_full_string generic :: to_unicode => to_unicode_substring generic :: assignment(=) => set end type strbuf_t
contains
function strbuf_t_to_unicode_full_string (strbuf) result (s) class(strbuf_t), intent(in) :: strbuf character(:, kind = ck), allocatable :: s
! ! This does not actually ensure that the string is valid Unicode; ! any 31-bit ‘character’ is supported. !
integer(kind = nk) :: i
allocate (character(len = strbuf%len, kind = ck) :: s) do i = 1, strbuf%len s(i:i) = strbuf%chars(i) end do end function strbuf_t_to_unicode_full_string
function strbuf_t_to_unicode_substring (strbuf, i, j) result (s) ! ! ‘Extreme’ values of i and j are allowed, as shortcuts for ‘from ! the beginning’, ‘up to the end’, or ‘empty substring’. ! class(strbuf_t), intent(in) :: strbuf integer(kind = nk), intent(in) :: i, j character(:, kind = ck), allocatable :: s
! ! This does not actually ensure that the string is valid Unicode; ! any 31-bit ‘character’ is supported. !
integer(kind = nk) :: i1, j1 integer(kind = nk) :: n integer(kind = nk) :: k
i1 = max (1_nk, i) j1 = min (strbuf%len, j) n = max (0_nk, (j1 - i1) + 1_nk)
allocate (character(n, kind = ck) :: s) do k = 1, n s(k:k) = strbuf%chars(i1 + (k - 1_nk)) end do end function strbuf_t_to_unicode_substring
elemental function strbuf_t_length (strbuf) result (n) class(strbuf_t), intent(in) :: strbuf integer(kind = nk) :: n
n = strbuf%len end function strbuf_t_length
subroutine strbuf_t_ensure_storage (strbuf, length_needed) class(strbuf_t), intent(inout) :: strbuf integer(kind = nk), intent(in) :: length_needed
integer(kind = nk) :: len_needed integer(kind = nk) :: new_size type(strbuf_t) :: new_strbuf
len_needed = max (length_needed, 1_nk)
if (.not. allocated (strbuf%chars)) then ! Initialize a new strbuf%chars array. new_size = new_storage_size (len_needed) allocate (strbuf%chars(1:new_size)) else if (ubound (strbuf%chars, 1) < len_needed) then ! Allocate a new strbuf%chars array, larger than the current ! one, but containing the same characters. new_size = new_storage_size (len_needed) allocate (new_strbuf%chars(1:new_size)) new_strbuf%chars(1:strbuf%len) = strbuf%chars(1:strbuf%len) call move_alloc (new_strbuf%chars, strbuf%chars) end if end subroutine strbuf_t_ensure_storage
subroutine strbuf_t_set (dst, src) class(strbuf_t), intent(inout) :: dst class(*), intent(in) :: src
integer(kind = nk) :: n integer(kind = nk) :: i
select type (src) type is (character(*, kind = ck)) n = len (src, kind = nk) call dst%ensure_storage(n) do i = 1, n dst%chars(i) = src(i:i) end do dst%len = n type is (character(*)) n = len (src, kind = nk) call dst%ensure_storage(n) do i = 1, n dst%chars(i) = src(i:i) end do dst%len = n class is (strbuf_t) n = src%len call dst%ensure_storage(n) dst%chars(1:n) = src%chars(1:n) dst%len = n class default error stop end select end subroutine strbuf_t_set
subroutine strbuf_t_append (dst, src) class(strbuf_t), intent(inout) :: dst class(*), intent(in) :: src
integer(kind = nk) :: n_dst, n_src, n integer(kind = nk) :: i
select type (src) type is (character(*, kind = ck)) n_dst = dst%len n_src = len (src, kind = nk) n = n_dst + n_src call dst%ensure_storage(n) do i = 1, n_src dst%chars(n_dst + i) = src(i:i) end do dst%len = n type is (character(*)) n_dst = dst%len n_src = len (src, kind = nk) n = n_dst + n_src call dst%ensure_storage(n) do i = 1, n_src dst%chars(n_dst + i) = src(i:i) end do dst%len = n class is (strbuf_t) n_dst = dst%len n_src = src%len n = n_dst + n_src call dst%ensure_storage(n) dst%chars((n_dst + 1):n) = src%chars(1:n_src) dst%len = n class default error stop end select end subroutine strbuf_t_append
function skip_whitespace (strbuf, i) result (j) class(strbuf_t), intent(in) :: strbuf integer(kind = nk), intent(in) :: i integer(kind = nk) :: j
logical :: done
j = i done = .false. do while (.not. done) if (at_end_of_line (strbuf, j)) then done = .true. else if (.not. isspace (strbuf%chars(j))) then done = .true. else j = j + 1 end if end do end function skip_whitespace
function skip_non_whitespace (strbuf, i) result (j) class(strbuf_t), intent(in) :: strbuf integer(kind = nk), intent(in) :: i integer(kind = nk) :: j
logical :: done
j = i done = .false. do while (.not. done) if (at_end_of_line (strbuf, j)) then done = .true. else if (isspace (strbuf%chars(j))) then done = .true. else j = j + 1 end if end do end function skip_non_whitespace
function skip_whitespace_backwards (strbuf, i) result (j) class(strbuf_t), intent(in) :: strbuf integer(kind = nk), intent(in) :: i integer(kind = nk) :: j
logical :: done
j = i done = .false. do while (.not. done) if (j == -1) then done = .true. else if (.not. isspace (strbuf%chars(j))) then done = .true. else j = j - 1 end if end do end function skip_whitespace_backwards
function at_end_of_line (strbuf, i) result (bool) class(strbuf_t), intent(in) :: strbuf integer(kind = nk), intent(in) :: i logical :: bool
bool = (strbuf%length() < i) end function at_end_of_line
end module string_buffers
module reading_one_line_from_a_stream
use, intrinsic :: iso_fortran_env, only: input_unit use, intrinsic :: iso_fortran_env, only: error_unit use, non_intrinsic :: compiler_type_kinds, only: nk, ck, ick use, non_intrinsic :: string_buffers
implicit none private
! get_line_from_stream: read an entire input line from a stream into ! a strbuf_t. public :: get_line_from_stream
character(1, kind = ck), parameter :: linefeed_char = char (10, kind = ck)
! The following is correct for Unix and its relatives. character(1, kind = ck), parameter :: newline_char = linefeed_char
contains
subroutine get_line_from_stream (unit_no, eof, no_newline, strbuf) integer, intent(in) :: unit_no logical, intent(out) :: eof ! End of file? logical, intent(out) :: no_newline ! There is a line but it has no ! newline? (Thus eof also must ! be .true.) class(strbuf_t), intent(inout) :: strbuf
character(1, kind = ck) :: ch
strbuf = call get_ch (unit_no, eof, ch) do while (.not. eof .and. ch /= newline_char) call strbuf%append (ch) call get_ch (unit_no, eof, ch) end do no_newline = eof .and. (strbuf%length() /= 0) end subroutine get_line_from_stream
subroutine get_ch (unit_no, eof, ch) ! ! Read a single code point from the stream. ! ! Currently this procedure simply inputs ‘ASCII’ bytes rather than ! Unicode code points. ! integer, intent(in) :: unit_no logical, intent(out) :: eof character(1, kind = ck), intent(out) :: ch
integer :: stat character(1) :: c = '*'
eof = .false.
if (unit_no == input_unit) then call get_input_unit_char (c, stat) else read (unit = unit_no, iostat = stat) c end if
if (stat < 0) then ch = ck_'*' eof = .true. else if (0 < stat) then write (error_unit, '("Input error with status code ", I0)') stat stop 1 else ch = char (ichar (c, kind = ick), kind = ck) end if end subroutine get_ch
!!! !!! If you tell gfortran you want -std=f2008 or -std=f2018, you likely !!! will need to add also -fall-intrinsics or -U__GFORTRAN__ !!! !!! The first way, you get the FGETC intrinsic. The latter way, you !!! get the C interface code that uses getchar(3). !!!
- ifdef __GFORTRAN__
subroutine get_input_unit_char (c, stat) ! ! The following works if you are using gfortran. ! ! (FGETC is considered a feature for backwards compatibility with ! g77. However, I know of no way to reconfigure input_unit as a ! Fortran 2003 stream, for use with ordinary ‘read’.) ! character, intent(inout) :: c integer, intent(out) :: stat
call fgetc (input_unit, c, stat) end subroutine get_input_unit_char
- else
subroutine get_input_unit_char (c, stat) ! ! An alternative implementation of get_input_unit_char. This ! actually reads input from the C standard input, which might not ! be the same as input_unit. ! use, intrinsic :: iso_c_binding, only: c_int character, intent(inout) :: c integer, intent(out) :: stat
interface ! ! Use getchar(3) to read characters from standard input. This ! assumes there is actually such a function available, and that ! getchar(3) does not exist solely as a macro. (One could write ! one’s own getchar() if necessary, of course.) ! function getchar () result (c) bind (c, name = 'getchar') use, intrinsic :: iso_c_binding, only: c_int integer(kind = c_int) :: c end function getchar end interface
integer(kind = c_int) :: i_char
i_char = getchar () ! ! The C standard requires that EOF have a negative value. If the ! value returned by getchar(3) is not EOF, then it will be ! representable as an unsigned char. Therefore, to check for end ! of file, one need only test whether i_char is negative. ! if (i_char < 0) then stat = -1 else stat = 0 c = char (i_char) end if end subroutine get_input_unit_char
- endif
end module reading_one_line_from_a_stream
module vm_reader
use, intrinsic :: iso_fortran_env, only: error_unit use, non_intrinsic :: compiler_type_kinds use, non_intrinsic :: helpers use, non_intrinsic :: string_buffers use, non_intrinsic :: reading_one_line_from_a_stream
implicit none private
public :: vm_code_t public :: vm_t public :: read_vm
! ! Arbitrarily chosen opcodes. ! ! I think there should be a no-operation ‘nop’ opcode, to reserve ! space for later hand-patching. :) ! integer, parameter, public :: opcode_nop = 0 integer, parameter, public :: opcode_halt = 1 integer, parameter, public :: opcode_add = 2 integer, parameter, public :: opcode_sub = 3 integer, parameter, public :: opcode_mul = 4 integer, parameter, public :: opcode_div = 5 integer, parameter, public :: opcode_mod = 6 integer, parameter, public :: opcode_lt = 7 integer, parameter, public :: opcode_gt = 8 integer, parameter, public :: opcode_le = 9 integer, parameter, public :: opcode_ge = 10 integer, parameter, public :: opcode_eq = 11 integer, parameter, public :: opcode_ne = 12 integer, parameter, public :: opcode_and = 13 integer, parameter, public :: opcode_or = 14 integer, parameter, public :: opcode_neg = 15 integer, parameter, public :: opcode_not = 16 integer, parameter, public :: opcode_prtc = 17 integer, parameter, public :: opcode_prti = 18 integer, parameter, public :: opcode_prts = 19 integer, parameter, public :: opcode_fetch = 20 integer, parameter, public :: opcode_store = 21 integer, parameter, public :: opcode_push = 22 integer, parameter, public :: opcode_jmp = 23 integer, parameter, public :: opcode_jz = 24
character(8, kind = ck), parameter, public :: opcode_names(0:24) = & & (/ "nop ", & & "halt ", & & "add ", & & "sub ", & & "mul ", & & "div ", & & "mod ", & & "lt ", & & "gt ", & & "le ", & & "ge ", & & "eq ", & & "ne ", & & "and ", & & "or ", & & "neg ", & & "not ", & & "prtc ", & & "prti ", & & "prts ", & & "fetch ", & & "store ", & & "push ", & & "jmp ", & & "jz " /)
type :: vm_code_t integer(kind = rik), private :: len = 0_rik character(1), allocatable :: bytes(:) contains procedure, pass, private :: ensure_storage => vm_code_t_ensure_storage procedure, pass :: length => vm_code_t_length end type vm_code_t
type :: vm_t integer(kind = rik), allocatable :: string_boundaries(:) character(:, kind = ck), allocatable :: strings character(1), allocatable :: data(:) character(1), allocatable :: stack(:) type(vm_code_t) :: code integer(kind = rik) :: sp = 0_rik integer(kind = rik) :: pc = 0_rik end type vm_t
contains
subroutine vm_code_t_ensure_storage (code, length_needed) class(vm_code_t), intent(inout) :: code integer(kind = nk), intent(in) :: length_needed
integer(kind = nk) :: len_needed integer(kind = nk) :: new_size type(vm_code_t) :: new_code
len_needed = max (length_needed, 1_nk)
if (.not. allocated (code%bytes)) then ! Initialize a new code%bytes array. new_size = new_storage_size (len_needed) allocate (code%bytes(0:(new_size - 1))) else if (ubound (code%bytes, 1) < len_needed - 1) then ! Allocate a new code%bytes array, larger than the current one, ! but containing the same bytes. new_size = new_storage_size (len_needed) allocate (new_code%bytes(0:(new_size - 1))) new_code%bytes(0:(code%len - 1)) = code%bytes(0:(code%len - 1)) call move_alloc (new_code%bytes, code%bytes) end if end subroutine vm_code_t_ensure_storage
elemental function vm_code_t_length (code) result (len) class(vm_code_t), intent(in) :: code integer(kind = rik) :: len
len = code%len end function vm_code_t_length
subroutine read_vm (inp, strbuf, vm) integer, intent(in) :: inp type(strbuf_t), intent(inout) :: strbuf type(vm_t), intent(out) :: vm
integer(kind = rik) :: data_size integer(kind = rik) :: number_of_strings
! Read the header. call read_datasize_and_number_of_strings (inp, strbuf, data_size, number_of_strings)
! Allocate storage for data_size 32-bit numbers. Initialize them ! to zero, for no better reason than that C initializes global ! variables to zero. allocate (vm%data(0_rik:(4_rik * (data_size - 1))), source = achar (0))
! Allocate storage for indices/bounds of the strings to be loaded ! into the string storage space. allocate (vm%string_boundaries(0_rik:number_of_strings))
! Fill the strings storage and the string boundaries array. call read_strings (inp, strbuf, number_of_strings, vm)
! Read the program instructions. call read_code (inp, strbuf, vm)
! Allocate a stack. Let us say that the stack size must be a ! multiple of 4, and is fixed at 65536 = 4**8 bytes. Pushing a ! 32-bit integer increases the stack pointer by 4, popping ! decreases it by 4. allocate (vm%stack(0_rik:(4_rik ** 8))) end subroutine read_vm
subroutine read_datasize_and_number_of_strings (inp, strbuf, data_size, number_of_strings) integer, intent(in) :: inp type(strbuf_t), intent(inout) :: strbuf integer(kind = rik), intent(out) :: data_size integer(kind = rik), intent(out) :: number_of_strings
logical :: eof logical :: no_newline integer(kind = nk) :: i, j character(:, kind = ck), allocatable :: data_size_str character(:, kind = ck), allocatable :: number_of_strings_str integer :: stat
call get_line_from_stream (inp, eof, no_newline, strbuf) if (eof) call bad_vm_assembly
i = skip_whitespace (strbuf, 1_nk) i = skip_datasize_keyword (strbuf, i) i = skip_whitespace (strbuf, i) i = skip_specific_character (strbuf, i, ck_':') i = skip_whitespace (strbuf, i) j = skip_non_whitespace (strbuf, i) if (j == i) call bad_vm_assembly allocate (data_size_str, source = strbuf%to_unicode (i, j - 1))
i = skip_whitespace(strbuf, j) i = skip_strings_keyword (strbuf, i) i = skip_whitespace (strbuf, i) i = skip_specific_character (strbuf, i, ck_':') i = skip_whitespace (strbuf, i) j = skip_non_whitespace (strbuf, i) if (j == i) call bad_vm_assembly allocate (number_of_strings_str, source = strbuf%to_unicode (i, j - 1))
read (data_size_str, *, iostat = stat) data_size if (stat /= 0) call bad_vm_assembly read (number_of_strings_str, *, iostat = stat) number_of_strings if (stat /= 0) call bad_vm_assembly end subroutine read_datasize_and_number_of_strings
subroutine read_strings (inp, strbuf, number_of_strings, vm) integer, intent(in) :: inp type(strbuf_t), intent(inout) :: strbuf integer(kind = rik), intent(in) :: number_of_strings type(vm_t), intent(inout) :: vm
type(strbuf_t) :: strings_temporary integer(kind = rik) :: i
vm%string_boundaries(0) = 0_rik do i = 0_rik, number_of_strings - 1 call read_one_string (inp, strbuf, strings_temporary) vm%string_boundaries(i + 1) = strings_temporary%length() end do allocate (vm%strings, source = strings_temporary%to_unicode()) end subroutine read_strings
subroutine read_one_string (inp, strbuf, strings_temporary) integer, intent(in) :: inp type(strbuf_t), intent(inout) :: strbuf type(strbuf_t), intent(inout) :: strings_temporary
logical :: eof logical :: no_newline integer(kind = nk) :: i logical :: done
call get_line_from_stream (inp, eof, no_newline, strbuf) if (eof) call bad_vm_assembly i = skip_whitespace (strbuf, 1_nk) i = skip_specific_character (strbuf, i, ck_'"') done = .false. do while (.not. done) if (i == strbuf%length() + 1) call bad_vm_assembly if (strbuf%chars(i) == ck_'"') then done = .true. else if (strbuf%chars(i) == backslash_char) then if (i == strbuf%length()) call bad_vm_assembly select case (strbuf%chars(i + 1)) case (ck_'n') call strings_temporary%append(newline_char) case (backslash_char) call strings_temporary%append(backslash_char) case default call bad_vm_assembly end select i = i + 2 else call strings_temporary%append(strbuf%chars(i)) i = i + 1 end if end do end subroutine read_one_string
subroutine read_code (inp, strbuf, vm) integer, intent(in) :: inp type(strbuf_t), intent(inout) :: strbuf type(vm_t), intent(inout) :: vm
logical :: eof logical :: no_newline
call get_line_from_stream (inp, eof, no_newline, strbuf) do while (.not. eof) call parse_instruction (strbuf, vm%code) call get_line_from_stream (inp, eof, no_newline, strbuf) end do end subroutine read_code
subroutine parse_instruction (strbuf, code) type(strbuf_t), intent(in) :: strbuf type(vm_code_t), intent(inout) :: code
integer(kind = nk) :: i, j integer :: stat
integer :: opcode integer(kind = rik) :: i_vm integer(kind = rik) :: arg
character(8, kind = ck) :: opcode_name_str character(:, kind = ck), allocatable :: i_vm_str character(:, kind = ck), allocatable :: arg_str
i = skip_whitespace (strbuf, 1_nk) j = skip_non_whitespace (strbuf, i) if (j == i) call bad_vm_assembly allocate (i_vm_str, source = strbuf%to_unicode(i, j - 1)) read (i_vm_str, *, iostat = stat) i_vm if (stat /= 0) call bad_vm_assembly
i = skip_whitespace (strbuf, j) j = skip_non_whitespace (strbuf, i) opcode_name_str = ck_' ' opcode_name_str(1:(j - i)) = strbuf%to_unicode(i, j - 1) opcode = findloc (opcode_names, opcode_name_str, 1) - 1 if (opcode == -1) call bad_vm_assembly
select case (opcode)
case (opcode_push) call code%ensure_storage(i_vm + 5) code%bytes(i_vm) = achar (opcode) i = skip_whitespace (strbuf, j) j = skip_non_whitespace (strbuf, i) if (j == i) call bad_vm_assembly allocate (arg_str, source = strbuf%to_unicode(i, j - 1)) read (arg_str, *, iostat = stat) arg if (stat /= 0) call bad_vm_assembly call int32_to_vm_bytes (arg, code%bytes, i_vm + 1) code%len = max (code%len, i_vm + 5)
case (opcode_fetch, opcode_store) call code%ensure_storage(i_vm + 5) code%bytes(i_vm) = achar (opcode) i = skip_whitespace (strbuf, j) i = skip_specific_character (strbuf, i, ck_'[') i = skip_whitespace (strbuf, i) j = skip_non_whitespace (strbuf, i) if (j == i) call bad_vm_assembly if (strbuf%chars(j - 1) == ck_']') j = j - 1 allocate (arg_str, source = strbuf%to_unicode(i, j - 1)) read (arg_str, *, iostat = stat) arg if (stat /= 0) call bad_vm_assembly call uint32_to_vm_bytes (arg, code%bytes, i_vm + 1) code%len = max (code%len, i_vm + 5)
case (opcode_jmp, opcode_jz) call code%ensure_storage(i_vm + 5) code%bytes(i_vm) = achar (opcode) call code%ensure_storage(i_vm + 5) code%bytes(i_vm) = achar (opcode) i = skip_whitespace (strbuf, j) i = skip_specific_character (strbuf, i, ck_'(') i = skip_whitespace (strbuf, i) j = skip_non_whitespace (strbuf, i) if (j == i) call bad_vm_assembly if (strbuf%chars(j - 1) == ck_')') j = j - 1 allocate (arg_str, source = strbuf%to_unicode(i, j - 1)) read (arg_str, *, iostat = stat) arg if (stat /= 0) call bad_vm_assembly call int32_to_vm_bytes (arg, code%bytes, i_vm + 1) code%len = max (code%len, i_vm + 5)
case default call code%ensure_storage(i_vm + 1) code%bytes(i_vm) = achar (opcode) code%len = max (code%len, i_vm + 1) end select
end subroutine parse_instruction
function skip_datasize_keyword (strbuf, i) result (j) type(strbuf_t), intent(in) :: strbuf integer(kind = nk), intent(in) :: i integer(kind = nk) :: j
j = skip_specific_character (strbuf, i, ck_'D') j = skip_specific_character (strbuf, j, ck_'a') j = skip_specific_character (strbuf, j, ck_'t') j = skip_specific_character (strbuf, j, ck_'a') j = skip_specific_character (strbuf, j, ck_'s') j = skip_specific_character (strbuf, j, ck_'i') j = skip_specific_character (strbuf, j, ck_'z') j = skip_specific_character (strbuf, j, ck_'e') end function skip_datasize_keyword
function skip_strings_keyword (strbuf, i) result (j) type(strbuf_t), intent(in) :: strbuf integer(kind = nk), intent(in) :: i integer(kind = nk) :: j
j = skip_specific_character (strbuf, i, ck_'S') j = skip_specific_character (strbuf, j, ck_'t') j = skip_specific_character (strbuf, j, ck_'r') j = skip_specific_character (strbuf, j, ck_'i') j = skip_specific_character (strbuf, j, ck_'n') j = skip_specific_character (strbuf, j, ck_'g') j = skip_specific_character (strbuf, j, ck_'s') end function skip_strings_keyword
function skip_specific_character (strbuf, i, ch) result (j) type(strbuf_t), intent(in) :: strbuf integer(kind = nk), intent(in) :: i character(1, kind = ck), intent(in) :: ch integer(kind = nk) :: j
if (strbuf%length() < i) call bad_vm_assembly if (strbuf%chars(i) /= ch) call bad_vm_assembly j = i + 1 end function skip_specific_character
subroutine bad_vm_assembly write (error_unit, '("The input is not a correct virtual machine program.")') stop 1 end subroutine bad_vm_assembly
end module vm_reader
module vm_runner
use, intrinsic :: iso_fortran_env, only: error_unit use, non_intrinsic :: compiler_type_kinds use, non_intrinsic :: helpers use, non_intrinsic :: vm_reader
implicit none private
public :: run_vm
contains
subroutine run_vm (outp, vm) integer, intent(in) :: outp type(vm_t), intent(inout) :: vm
logical :: done integer :: opcode
vm%sp = 0 vm%pc = 0 done = .false. do while (.not. done) if (vm%pc < 0 .or. vm%code%length() <= vm%pc) call pc_error opcode = iachar (vm%code%bytes(vm%pc)) vm%pc = vm%pc + 1 select case (opcode) case (opcode_nop) continue case (opcode_halt) done = .true. case (opcode_add) call alu_add (vm) case (opcode_sub) call alu_sub (vm) case (opcode_mul) call alu_mul (vm) case (opcode_div) call alu_div (vm) case (opcode_mod) call alu_mod (vm) case (opcode_lt) call alu_lt (vm) case (opcode_gt) call alu_gt (vm) case (opcode_le) call alu_le (vm) case (opcode_ge) call alu_ge (vm) case (opcode_eq) call alu_eq (vm) case (opcode_ne) call alu_ne (vm) case (opcode_and) call alu_and (vm) case (opcode_or) call alu_or (vm) case (opcode_neg) call alu_neg (vm) case (opcode_not) call alu_not (vm) case (opcode_prtc) call prtc (outp, vm) case (opcode_prti) call prti (outp, vm) case (opcode_prts) call prts (outp, vm) case (opcode_fetch) call fetch_int32 (vm) case (opcode_store) call store_int32 (vm) case (opcode_push) call push_int32 (vm) case (opcode_jmp) call jmp (vm) case (opcode_jz) call jz (vm) case default write (error_unit, '("VM opcode unrecognized: ", I0)') opcode stop 1 end select end do end subroutine run_vm
subroutine push_int32 (vm) type(vm_t), intent(inout) :: vm
! ! Push the 32-bit integer data at pc to the stack, then increment ! pc by 4. !
if (ubound (vm%stack, 1) < vm%sp) then write (error_unit, '("VM stack overflow")') stop 1 end if if (vm%code%length() <= vm%pc + 4) call pc_error vm%stack(vm%sp:(vm%sp + 3)) = vm%code%bytes(vm%pc:(vm%pc + 3)) vm%sp = vm%sp + 4 vm%pc = vm%pc + 4 end subroutine push_int32
subroutine fetch_int32 (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: i integer(kind = rik) :: x
if (vm%code%length() <= vm%pc + 4) call pc_error call uint32_from_vm_bytes (i, vm%code%bytes, vm%pc) vm%pc = vm%pc + 4
if (ubound (vm%data, 1) < i * 4) then write (error_unit, '("VM data access error")') stop 1 end if call int32_from_vm_bytes (x, vm%data, i * 4)
if (ubound (vm%stack, 1) < vm%sp) then write (error_unit, '("VM stack overflow")') stop 1 end if call int32_to_vm_bytes (x, vm%stack, vm%sp) vm%sp = vm%sp + 4 end subroutine fetch_int32
subroutine store_int32 (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: i integer(kind = rik) :: x
if (vm%code%length() <= vm%pc + 4) call pc_error call uint32_from_vm_bytes (i, vm%code%bytes, vm%pc) vm%pc = vm%pc + 4
call ensure_there_is_enough_stack_data (vm, 4_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 4) vm%sp = vm%sp - 4
if (ubound (vm%data, 1) < i * 4) then write (error_unit, '("VM data access error")') stop 1 end if call int32_to_vm_bytes (x, vm%data, i * 4) end subroutine store_int32
subroutine jmp (vm) type(vm_t), intent(inout) :: vm
! ! Add the 32-bit data at pc to pc itself. !
integer(kind = rik) :: x
if (vm%code%length() <= vm%pc + 4) call pc_error call int32_from_vm_bytes (x, vm%code%bytes, vm%pc) vm%pc = vm%pc + x end subroutine jmp
subroutine jz (vm) type(vm_t), intent(inout) :: vm
! ! Conditionally add the 32-bit data at pc to pc itself. !
integer(kind = rik) :: x
call ensure_there_is_enough_stack_data (vm, 4_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 4) vm%sp = vm%sp - 4 if (x == 0) then if (vm%code%length() <= vm%pc + 4) call pc_error call int32_from_vm_bytes (x, vm%code%bytes, vm%pc) vm%pc = vm%pc + x else vm%pc = vm%pc + 4 end if end subroutine jz
subroutine alu_neg (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x
call ensure_there_is_enough_stack_data (vm, 4_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 4) x = -x call int32_to_vm_bytes (x, vm%stack, vm%sp - 4) end subroutine alu_neg
subroutine alu_not (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x
call ensure_there_is_enough_stack_data (vm, 4_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 4) x = bool2int (x == 0_rik) call int32_to_vm_bytes (x, vm%stack, vm%sp - 4) end subroutine alu_not
subroutine alu_add (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x, y, z
call ensure_there_is_enough_stack_data (vm, 8_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 8) call int32_from_vm_bytes (y, vm%stack, vm%sp - 4) z = x + y call int32_to_vm_bytes (z, vm%stack, vm%sp - 8) vm%sp = vm%sp - 4 end subroutine alu_add
subroutine alu_sub (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x, y, z
call ensure_there_is_enough_stack_data (vm, 8_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 8) call int32_from_vm_bytes (y, vm%stack, vm%sp - 4) z = x - y call int32_to_vm_bytes (z, vm%stack, vm%sp - 8) vm%sp = vm%sp - 4 end subroutine alu_sub
subroutine alu_mul (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x, y, z
call ensure_there_is_enough_stack_data (vm, 8_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 8) call int32_from_vm_bytes (y, vm%stack, vm%sp - 4) z = x * y call int32_to_vm_bytes (z, vm%stack, vm%sp - 8) vm%sp = vm%sp - 4 end subroutine alu_mul
subroutine alu_div (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x, y, z
call ensure_there_is_enough_stack_data (vm, 8_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 8) call int32_from_vm_bytes (y, vm%stack, vm%sp - 4) z = x / y ! This works like ‘/’ in C. call int32_to_vm_bytes (z, vm%stack, vm%sp - 8) vm%sp = vm%sp - 4 end subroutine alu_div
subroutine alu_mod (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x, y, z
call ensure_there_is_enough_stack_data (vm, 8_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 8) call int32_from_vm_bytes (y, vm%stack, vm%sp - 4) z = mod (x, y) ! This works like ‘%’ in C. call int32_to_vm_bytes (z, vm%stack, vm%sp - 8) vm%sp = vm%sp - 4 end subroutine alu_mod
subroutine alu_lt (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x, y, z
call ensure_there_is_enough_stack_data (vm, 8_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 8) call int32_from_vm_bytes (y, vm%stack, vm%sp - 4) z = bool2int (x < y) call int32_to_vm_bytes (z, vm%stack, vm%sp - 8) vm%sp = vm%sp - 4 end subroutine alu_lt
subroutine alu_gt (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x, y, z
call ensure_there_is_enough_stack_data (vm, 8_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 8) call int32_from_vm_bytes (y, vm%stack, vm%sp - 4) z = bool2int (x > y) call int32_to_vm_bytes (z, vm%stack, vm%sp - 8) vm%sp = vm%sp - 4 end subroutine alu_gt
subroutine alu_le (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x, y, z
call ensure_there_is_enough_stack_data (vm, 8_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 8) call int32_from_vm_bytes (y, vm%stack, vm%sp - 4) z = bool2int (x <= y) call int32_to_vm_bytes (z, vm%stack, vm%sp - 8) vm%sp = vm%sp - 4 end subroutine alu_le
subroutine alu_ge (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x, y, z
call ensure_there_is_enough_stack_data (vm, 8_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 8) call int32_from_vm_bytes (y, vm%stack, vm%sp - 4) z = bool2int (x >= y) call int32_to_vm_bytes (z, vm%stack, vm%sp - 8) vm%sp = vm%sp - 4 end subroutine alu_ge
subroutine alu_eq (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x, y, z
call ensure_there_is_enough_stack_data (vm, 8_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 8) call int32_from_vm_bytes (y, vm%stack, vm%sp - 4) z = bool2int (x == y) call int32_to_vm_bytes (z, vm%stack, vm%sp - 8) vm%sp = vm%sp - 4 end subroutine alu_eq
subroutine alu_ne (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x, y, z
call ensure_there_is_enough_stack_data (vm, 8_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 8) call int32_from_vm_bytes (y, vm%stack, vm%sp - 4) z = bool2int (x /= y) call int32_to_vm_bytes (z, vm%stack, vm%sp - 8) vm%sp = vm%sp - 4 end subroutine alu_ne
subroutine alu_and (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x, y, z
call ensure_there_is_enough_stack_data (vm, 8_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 8) call int32_from_vm_bytes (y, vm%stack, vm%sp - 4) z = bool2int (x /= 0 .and. y /= 0) call int32_to_vm_bytes (z, vm%stack, vm%sp - 8) vm%sp = vm%sp - 4 end subroutine alu_and
subroutine alu_or (vm) type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x, y, z
call ensure_there_is_enough_stack_data (vm, 8_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 8) call int32_from_vm_bytes (y, vm%stack, vm%sp - 4) z = bool2int (x /= 0 .or. y /= 0) call int32_to_vm_bytes (z, vm%stack, vm%sp - 8) vm%sp = vm%sp - 4 end subroutine alu_or
subroutine ensure_there_is_enough_stack_data (vm, n) type(vm_t), intent(in) :: vm integer(kind = rik), intent(in) :: n
if (vm%sp < n) then write (error_unit, '("VM stack underflow")') stop 1 end if end subroutine ensure_there_is_enough_stack_data
subroutine prtc (outp, vm) integer, intent(in) :: outp type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x
call ensure_there_is_enough_stack_data (vm, 4_rik) call uint32_from_vm_bytes (x, vm%stack, vm%sp - 4) write (outp, '(A1)', advance = 'no') char (x, kind = ck) vm%sp = vm%sp - 4 end subroutine prtc
subroutine prti (outp, vm) integer, intent(in) :: outp type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x
call ensure_there_is_enough_stack_data (vm, 4_rik) call int32_from_vm_bytes (x, vm%stack, vm%sp - 4) write (outp, '(I0)', advance = 'no') x vm%sp = vm%sp - 4 end subroutine prti
subroutine prts (outp, vm) integer, intent(in) :: outp type(vm_t), intent(inout) :: vm
integer(kind = rik) :: x integer(kind = rik) :: i, j
call ensure_there_is_enough_stack_data (vm, 4_rik) call uint32_from_vm_bytes (x, vm%stack, vm%sp - 4) if (ubound (vm%string_boundaries, 1) - 1 < x) then write (error_unit, '("VM string boundary error")') stop 1 end if i = vm%string_boundaries(x) j = vm%string_boundaries(x + 1) write (outp, '(A)', advance = 'no') vm%strings((i + 1):j) vm%sp = vm%sp - 4 end subroutine prts
subroutine pc_error write (error_unit, '("VM program counter error")') stop 1 end subroutine pc_error
end module vm_runner
program vm
use, intrinsic :: iso_fortran_env, only: input_unit use, intrinsic :: iso_fortran_env, only: output_unit use, intrinsic :: iso_fortran_env, only: error_unit use, non_intrinsic :: compiler_type_kinds use, non_intrinsic :: string_buffers use, non_intrinsic :: vm_reader use, non_intrinsic :: vm_runner
implicit none
integer, parameter :: inp_unit_no = 100 integer, parameter :: outp_unit_no = 101
integer :: arg_count character(200) :: arg integer :: inp integer :: outp
arg_count = command_argument_count () if (3 <= arg_count) then call print_usage else if (arg_count == 0) then inp = input_unit outp = output_unit else if (arg_count == 1) then call get_command_argument (1, arg) inp = open_for_input (trim (arg)) outp = output_unit else if (arg_count == 2) then call get_command_argument (1, arg) inp = open_for_input (trim (arg)) call get_command_argument (2, arg) outp = open_for_output (trim (arg)) end if
block type(strbuf_t) :: strbuf type(vm_t) :: vm
call read_vm (inp, strbuf, vm) call run_vm (outp, vm) end block end if
contains
function open_for_input (filename) result (unit_no) character(*), intent(in) :: filename integer :: unit_no
integer :: stat
open (unit = inp_unit_no, file = filename, status = 'old', & & action = 'read', access = 'stream', form = 'unformatted', & & iostat = stat) if (stat /= 0) then write (error_unit, '("Error: failed to open ", 1A, " for input")') filename stop 1 end if unit_no = inp_unit_no end function open_for_input
function open_for_output (filename) result (unit_no) character(*), intent(in) :: filename integer :: unit_no
integer :: stat
open (unit = outp_unit_no, file = filename, action = 'write', iostat = stat) if (stat /= 0) then write (error_unit, '("Error: failed to open ", 1A, " for output")') filename stop 1 end if unit_no = outp_unit_no end function open_for_output
subroutine print_usage character(200) :: progname
call get_command_argument (0, progname) write (output_unit, '("Usage: ", 1A, " [INPUT_FILE [OUTPUT_FILE]]")') & & trim (progname) end subroutine print_usage
end program vm</lang>
- Output:
$ gfortran -O3 -Wall -Wextra -fcheck=all -std=f2018 -U__GFORTRAN__ -g -o vm vm.F90 && ./lex count.t | ./parse | ./gen | ./vm
count is: 1 count is: 2 count is: 3 count is: 4 count is: 5 count is: 6 count is: 7 count is: 8 count is: 9
Go
<lang go>package main
import (
"bufio" "encoding/binary" "fmt" "log" "math" "os" "strconv" "strings"
)
type code = byte
const (
fetch code = iota store push add sub mul div mod lt gt le ge eq ne and or neg not jmp jz prtc prts prti halt
)
var codeMap = map[string]code{
"fetch": fetch, "store": store, "push": push, "add": add, "sub": sub, "mul": mul, "div": div, "mod": mod, "lt": lt, "gt": gt, "le": le, "ge": ge, "eq": eq, "ne": ne, "and": and, "or": or, "neg": neg, "not": not, "jmp": jmp, "jz": jz, "prtc": prtc, "prts": prts, "prti": prti, "halt": halt,
}
var (
err error scanner *bufio.Scanner object []code stringPool []string
)
func reportError(msg string) {
log.Fatalf("error : %s\n", msg)
}
func check(err error) {
if err != nil { log.Fatal(err) }
}
func btoi(b bool) int32 {
if b { return 1 } return 0
}
func itob(i int32) bool {
if i != 0 { return true } return false
}
func emitByte(c code) {
object = append(object, c)
}
func emitWord(n int) {
bs := make([]byte, 4) binary.LittleEndian.PutUint32(bs, uint32(n)) for _, b := range bs { emitByte(code(b)) }
}
/*** Virtual Machine interpreter ***/ func runVM(dataSize int) {
stack := make([]int32, dataSize+1) pc := int32(0) for { op := object[pc] pc++ switch op { case fetch: x := int32(binary.LittleEndian.Uint32(object[pc : pc+4])) stack = append(stack, stack[x]) pc += 4 case store: x := int32(binary.LittleEndian.Uint32(object[pc : pc+4])) ln := len(stack) stack[x] = stack[ln-1] stack = stack[:ln-1] pc += 4 case push: x := int32(binary.LittleEndian.Uint32(object[pc : pc+4])) stack = append(stack, x) pc += 4 case add: ln := len(stack) stack[ln-2] += stack[ln-1] stack = stack[:ln-1] case sub: ln := len(stack) stack[ln-2] -= stack[ln-1] stack = stack[:ln-1] case mul: ln := len(stack) stack[ln-2] *= stack[ln-1] stack = stack[:ln-1] case div: ln := len(stack) stack[ln-2] = int32(float64(stack[ln-2]) / float64(stack[ln-1])) stack = stack[:ln-1] case mod: ln := len(stack) stack[ln-2] = int32(math.Mod(float64(stack[ln-2]), float64(stack[ln-1]))) stack = stack[:ln-1] case lt: ln := len(stack) stack[ln-2] = btoi(stack[ln-2] < stack[ln-1]) stack = stack[:ln-1] case gt: ln := len(stack) stack[ln-2] = btoi(stack[ln-2] > stack[ln-1]) stack = stack[:ln-1] case le: ln := len(stack) stack[ln-2] = btoi(stack[ln-2] <= stack[ln-1]) stack = stack[:ln-1] case ge: ln := len(stack) stack[ln-2] = btoi(stack[ln-2] >= stack[ln-1]) stack = stack[:ln-1] case eq: ln := len(stack) stack[ln-2] = btoi(stack[ln-2] == stack[ln-1]) stack = stack[:ln-1] case ne: ln := len(stack) stack[ln-2] = btoi(stack[ln-2] != stack[ln-1]) stack = stack[:ln-1] case and: ln := len(stack) stack[ln-2] = btoi(itob(stack[ln-2]) && itob(stack[ln-1])) stack = stack[:ln-1] case or: ln := len(stack) stack[ln-2] = btoi(itob(stack[ln-2]) || itob(stack[ln-1])) stack = stack[:ln-1] case neg: ln := len(stack) stack[ln-1] = -stack[ln-1] case not: ln := len(stack) stack[ln-1] = btoi(!itob(stack[ln-1])) case jmp: x := int32(binary.LittleEndian.Uint32(object[pc : pc+4])) pc += x case jz: ln := len(stack) v := stack[ln-1] stack = stack[:ln-1] if v != 0 { pc += 4 } else { x := int32(binary.LittleEndian.Uint32(object[pc : pc+4])) pc += x } case prtc: ln := len(stack) fmt.Printf("%c", stack[ln-1]) stack = stack[:ln-1] case prts: ln := len(stack) fmt.Printf("%s", stringPool[stack[ln-1]]) stack = stack[:ln-1] case prti: ln := len(stack) fmt.Printf("%d", stack[ln-1]) stack = stack[:ln-1] case halt: return default: reportError(fmt.Sprintf("Unknown opcode %d\n", op)) } }
}
func translate(s string) string {
var d strings.Builder for i := 0; i < len(s); i++ { if s[i] == '\\' && (i+1) < len(s) { if s[i+1] == 'n' { d.WriteByte('\n') i++ } else if s[i+1] == '\\' { d.WriteByte('\\') i++ } } else { d.WriteByte(s[i]) } } return d.String()
}
func loadCode() int {
var dataSize int firstLine := true for scanner.Scan() { line := strings.TrimRight(scanner.Text(), " \t") if len(line) == 0 { if firstLine { reportError("empty line") } else { break } } lineList := strings.Fields(line) if firstLine { dataSize, err = strconv.Atoi(lineList[1]) check(err) nStrings, err := strconv.Atoi(lineList[3]) check(err) for i := 0; i < nStrings; i++ { scanner.Scan() s := strings.Trim(scanner.Text(), "\"\n") stringPool = append(stringPool, translate(s)) } firstLine = false continue } offset, err := strconv.Atoi(lineList[0]) check(err) instr := lineList[1] opCode, ok := codeMap[instr] if !ok { reportError(fmt.Sprintf("Unknown instruction %s at %d", instr, opCode)) } emitByte(opCode) switch opCode { case jmp, jz: p, err := strconv.Atoi(lineList[3]) check(err) emitWord(p - offset - 1) case push: value, err := strconv.Atoi(lineList[2]) check(err) emitWord(value) case fetch, store: value, err := strconv.Atoi(strings.Trim(lineList[2], "[]")) check(err) emitWord(value) } } check(scanner.Err()) return dataSize
}
func main() {
codeGen, err := os.Open("codegen.txt") check(err) defer codeGen.Close() scanner = bufio.NewScanner(codeGen) runVM(loadCode())
}</lang>
- Output:
Using the 'while count' example:
count is: 1 count is: 2 count is: 3 count is: 4 count is: 5 count is: 6 count is: 7 count is: 8 count is: 9
Icon
<lang icon># -*- Icon -*-
- The Rosetta Code virtual machine in Icon. Migrated from the
- ObjectIcon.
- See https://rosettacode.org/wiki/Compiler/virtual_machine_interpreter
record VirtualMachine(code, global_data, strings, stack, pc)
global opcode_names global opcode_values global op_halt global op_add global op_sub global op_mul global op_div global op_mod global op_lt global op_gt global op_le global op_ge global op_eq global op_ne global op_and global op_or global op_neg global op_not global op_prtc global op_prti global op_prts global op_fetch global op_store global op_push global op_jmp global op_jz
global whitespace_chars
procedure main(args)
local f_inp, f_out local vm
whitespace_chars := ' \t\n\r\f\v' initialize_opcodes()
if 3 <= *args then { write("Usage: ", &progname, " [INPUT_FILE [OUTPUT_FILE]]") exit(1) }
if 1 <= *args then { f_inp := open(args[1], "r") | { write(&errout, "Failed to open ", args[1], " for reading.") exit(1) } } else { f_inp := &input }
if 2 <= *args then { f_out := open(args[2], "w") | { write(&errout, "Failed to open ", args[2], " for writing.") exit(1) } } else { f_out := &output }
vm := VirtualMachine() read_assembly_code(f_inp, vm) run_vm(f_out, vm)
end
procedure initialize_opcodes()
local i
opcode_names := ["halt", "add", "sub", "mul", "div", "mod", "lt", "gt", "le", "ge", "eq", "ne", "and", "or", "neg", "not", "prtc", "prti", "prts", "fetch", "store", "push", "jmp", "jz"]
opcode_values := table() every i := 1 to *opcode_names do opcode_values[opcode_names[i]] := char(i)
op_halt := opcode_values["halt"] op_add := opcode_values["add"] op_sub := opcode_values["sub"] op_mul := opcode_values["mul"] op_div := opcode_values["div"] op_mod := opcode_values["mod"] op_lt := opcode_values["lt"] op_gt := opcode_values["gt"] op_le := opcode_values["le"] op_ge := opcode_values["ge"] op_eq := opcode_values["eq"] op_ne := opcode_values["ne"] op_and := opcode_values["and"] op_or := opcode_values["or"] op_neg := opcode_values["neg"] op_not := opcode_values["not"] op_prtc := opcode_values["prtc"] op_prti := opcode_values["prti"] op_prts := opcode_values["prts"] op_fetch := opcode_values["fetch"] op_store := opcode_values["store"] op_push := opcode_values["push"] op_jmp := opcode_values["jmp"] op_jz := opcode_values["jz"]
end
procedure int2bytes (n)
local bytes
# The VM is little-endian.
bytes := "****" bytes[1] := char (iand(n, 16rFF)) bytes[2] := char(iand(ishift(n, -8), 16rFF)) bytes[3] := char(iand(ishift(n, -16), 16rFF)) bytes[4] := char(iand(ishift(n, -24), 16rFF)) return bytes
end
procedure bytes2int(bytes, i)
local n0, n1, n2, n3, n
# The VM is little-endian.
n0 := ord(bytes[i]) n1 := ishift(ord(bytes[i + 1]), 8) n2 := ishift(ord(bytes[i + 2]), 16) n3 := ishift(ord(bytes[i + 3]), 24) n := ior (n0, ior (n1, ior (n2, n3)))
# Do not forget to extend the sign bit. return (if n3 <= 16r7F then n else ior(n, icom(16rFFFFFFFF)))
end
procedure read_assembly_code(f, vm)
local data_size, number_of_strings local line, ch local i local address local opcode
# Read the header line. line := read(f) | bad_vm() line ? { tab(many(whitespace_chars)) tab(match("Datasize")) | bad_vm() tab(many(whitespace_chars)) tab(any(':')) | bad_vm() tab(many(whitespace_chars)) data_size := integer(tab(many(&digits))) | bad_vm() tab(many(whitespace_chars)) tab(match("Strings")) | bad_vm() tab(many(whitespace_chars)) tab(any(':')) | bad_vm() tab(many(whitespace_chars)) number_of_strings := integer(tab(many(&digits))) | bad_vm() }
# Read the strings. vm.strings := list(number_of_strings) every i := 1 to number_of_strings do { vm.strings[i] := "" line := read(f) | bad_vm() line ? { tab(many(whitespace_chars)) tab(any('"')) | bad_vm() while ch := tab(any(~'"')) do { if ch == '\\' then { ch := tab(any('n\\')) | bad_vm() vm.strings[i] ||:= (if (ch == "n") then "\n" else "\\") } else { vm.strings[i] ||:= ch } } } }
# Read the code. vm.code := "" while line := read(f) do { line ? { tab(many(whitespace_chars)) address := integer(tab(many(&digits))) | bad_vm() tab(many(whitespace_chars)) opcode := tab(many(~whitespace_chars)) | bad_vm() vm.code ||:= opcode_values[opcode] case opcode of { "push": { tab(many(whitespace_chars)) vm.code ||:= int2bytes(integer(tab(many(&digits)))) | int2bytes(integer(tab(any('-')) || tab(many(&digits)))) | bad_vm() } "fetch" | "store": { tab(many(whitespace_chars)) tab(any('[')) | bad_vm() tab(many(whitespace_chars)) vm.code ||:= int2bytes(integer(tab(many(&digits)))) | bad_vm() tab(many(whitespace_chars)) tab(any(']')) | bad_vm() } "jmp" | "jz": { tab(many(whitespace_chars)) tab(any('(')) | bad_vm() tab(many(whitespace_chars)) vm.code ||:= int2bytes(integer(tab(many(&digits)))) | int2bytes(integer(tab(any('-')) || tab(many(&digits)))) | bad_vm() tab(many(whitespace_chars)) tab(any(')')) | bad_vm() tab(many(whitespace_chars)) tab(many(&digits)) | bad_vm() } default: { # Do nothing } } } }
# Create a global data area. vm.global_data := list(data_size, &null)
initialize_vm(vm)
end
procedure run_vm(f_out, vm)
initialize_vm(vm) continue_vm(f_out, vm)
end
procedure continue_vm(f_out, vm)
while vm.code[vm.pc] ~== op_halt do step_vm(f_out, vm)
end
procedure step_vm(f_out, vm)
local opcode
opcode := vm.code[vm.pc] vm.pc +:= 1 case opcode of { op_add: binop(vm, "+") op_sub: binop(vm, "-") op_mul: binop(vm, "*") op_div: binop(vm, "/") op_mod: binop(vm, "%") op_lt: comparison(vm, "<") op_gt: comparison(vm, ">") op_le: comparison(vm, "<=") op_ge: comparison(vm, ">=") op_eq: comparison(vm, "=") op_ne: comparison(vm, "~=") op_and: logical_and(vm) op_or: logical_or(vm) op_neg: negate(vm) op_not: logical_not(vm) op_prtc: printc(f_out, vm) op_prti: printi(f_out, vm) op_prts: prints(f_out, vm) op_fetch: fetch_global(vm) op_store: store_global(vm) op_push: push_argument(vm) op_jmp: jump(vm) op_jz: jump_if_zero(vm) default: bad_opcode() }
end
procedure negate(vm)
vm.stack[1] := -vm.stack[1]
end
procedure binop(vm, func)
vm.stack[2] := func(vm.stack[2], vm.stack[1]) pop(vm.stack)
end
procedure comparison(vm, func)
vm.stack[2] := (if func(vm.stack[2], vm.stack[1]) then 1 else 0) pop(vm.stack)
end
procedure logical_and(vm)
vm.stack[2] := (if vm.stack[2] ~= 0 & vm.stack[1] ~= 0 then 1 else 0) pop(vm.stack)
end
procedure logical_or(vm)
vm.stack[2] := (if vm.stack[2] ~= 0 | vm.stack[1] ~= 0 then 1 else 0) pop(vm.stack)
end
procedure logical_not(vm)
vm.stack[1] := (if vm.stack[1] ~= 0 then 0 else 1)
end
procedure printc(f_out, vm)
/f_out := &output writes(f_out, char(pop(vm.stack)))
end
procedure printi(f_out, vm)
/f_out := &output writes(f_out, pop(vm.stack))
end
procedure prints(f_out, vm)
/f_out := &output writes(f_out, vm.strings[pop(vm.stack) + 1])
end
procedure fetch_global(vm)
push(vm.stack, vm.global_data[get_argument(vm) + 1]) vm.pc +:= 4
end
procedure store_global(vm)
vm.global_data[get_argument(vm) + 1] := pop(vm.stack) vm.pc +:= 4
end
procedure push_argument(vm)
push(vm.stack, get_argument(vm)) vm.pc +:= 4
end
procedure jump(vm)
vm.pc +:= get_argument(vm)
end
procedure jump_if_zero(vm)
if pop(vm.stack) = 0 then vm.pc +:= get_argument(vm) else vm.pc +:= 4
end
procedure get_argument(vm)
return bytes2int(vm.code, vm.pc)
end
procedure initialize_vm(vm)
# The program counter starts at 1, for convenient indexing into # the code[] array. Icon indexing starts at 1 (for a *very* good # reason, but that’s a topic for another day). vm.pc := 1 vm.stack := []
end
procedure bad_vm()
write(&errout, "Bad VM.") exit(1)
end
procedure bad_opcode()
write(&errout, "Bad opcode.") exit(1)
end</lang>
J
Implementation: <lang J>(opcodes)=: opcodes=: ;:{{)n
fetch store push add sub mul div mod lt gt le ge eq ne and or neg not jmp jz prtc prts prti halt
}}-.LF
unpack=: {{
lines=. <;._2 y 'ds0 ds s0 s'=.;:0{::lines assert.'Datasize:Strings:'-:ds0,s0 vars=: (".ds)#0 strings=: rplc&('\\';'\';'\n';LF)L:0 '"'-.L:0~(1+i.".s){lines object=: ;xlate L:1 (;:'()[]') -.~L:1 ;:L:0 '-_' rplc~L:0 (1+".s)}.lines outbuf=: stack=: i.0
}}
xlate=: {{
if.2<#y do. (opcodes i. 1{y),(4#256)#:".2{::y else. opcodes i. 1{y end.
}}
NB. ensure we maintain 32 bit signed int representation signadj=: _2147483648+4294967296|2147483648+] getint=: signadj@(256 #. ])
PUSH=: Template:Stack=:stack,signadj y POP=: {{ (stack=: _1 }. stack) ] _1 { stack }} POP2=: {{ (stack=: _2 }. stack) ] _2 {. stack }} emit=:{{
outbuf=: outbuf,y if.LF e. outbuf do. ndx=. outbuf i:LF echo ndx{.outbuf outbuf=: }.ndx}.outbuf end.
}}
run_vm=: {{
unpack y stack=: i.pc=:0 lim=. <:#object while.do. pc=: pc+1 [ op=: (pc { object){opcodes i=. getint (lim<.pc+i.4) { object k=. 0 select.op case.fetch do. k=.4 [PUSH i{vars case.store do. k=.4 [vars=: (POP) i} vars case.push do. k=.4 [PUSH i case.add do. PUSH +/POP2 case.sub do. PUSH -/POP2 case.mul do. PUSH */POP2 case.div do. PUSH<.%/POP2 case.mod do. PUSH |~/POP2 case.lt do. PUSH </POP2 case.le do. PUSH <:/POP2 case.eq do. PUSH =/POP2 case.ne do. PUSH ~:/POP2 case.ge do. PUSH >:/POP2 case.gt do. PUSH >/POP2 case.and do. PUSH */0~:POP2 case.or do. PUSH +./0~:POP2 case.neg do. PUSH -POP case.not do. PUSH 0=POP case.jmp do. k=. i case.jz do. k=. (0=POP){4,i case.prtc do. emit u:POP case.prts do. emit (POP){::strings case.prti do. emit rplc&'_-'":POP case.halt do. if.#outbuf do.echo outbuf end.EMPTY return. end. pc=: pc+k end.
}}</lang>
Task example: <lang J>count=:{{)n count = 1; while (count < 10) {
print("count is: ", count, "\n"); count = count + 1;
} }}
run_vm gen syntax lex count
count is: 1 count is: 2 count is: 3 count is: 4 count is: 5 count is: 6 count is: 7 count is: 8 count is: 9 </lang>
Julia
<lang julia>mutable struct VM32
code::Vector{UInt8} stack::Vector{Int32} data::Vector{Int32} strings::Vector{String} offsets::Vector{Int32} lastargs::Vector{Int32} ip::Int32 VM32() = new(Vector{UInt8}(), Vector{Int32}(), Vector{Int32}(), Vector{String}(), Vector{Int32}(), Vector{Int32}(), 1)
end
halt, add, sub, mul, Div, mod, not, neg, and, or, lt, gt, le, ge, ne, eq, prts,
prti, prtc, store, Fetch, push, jmp, jz = UInt8.(collect(1:24))
function assemble(io)
vm = VM32() header = readline(io) datasize, nstrings = match(r"\w+:\s*(\d+)\s+\w+:\s*(\d+)", header).captures vm.data = zeros(Int32, parse(Int, datasize) + 4) for i in 1:parse(Int, nstrings) line = replace(strip(readline(io), ['"', '\n']), r"\\." => x -> x[end] == 'n' ? "\n" : string(x[end])) push!(vm.strings, line) end while !eof(io) line = readline(io) offset, op, arg1, arg2 = match(r"(\d+)\s+(\w+)\s*(\S+)?\s*(\S+)?", line).captures op = op in ["fetch", "div"] ? uppercasefirst(op) : op push!(vm.code, eval(Symbol(op))) if arg1 != nothing v = parse(Int32, strip(arg1, ['[', ']', '(', ')'])) foreach(x -> push!(vm.code, x), reinterpret(UInt8, [v])) end if arg2 != nothing push!(vm.lastargs, (x = tryparse(Int32, arg2)) == nothing ? 0 : x) end push!(vm.offsets, parse(Int32, offset)) end vm
end
function runvm(vm)
value() = (x = vm.ip; vm.ip += 4; reinterpret(Int32, vm.code[x:x+3])[1]) tobool(x) = (x != 0) ops = Dict( halt => () -> exit(), add => () -> begin vm.stack[end-1] += vm.stack[end]; pop!(vm.stack); vm.stack[end] end, sub => () -> begin vm.stack[end-1] -= vm.stack[end]; pop!(vm.stack); vm.stack[end] end, mul => () -> begin vm.stack[end-1] *= vm.stack[end]; pop!(vm.stack); vm.stack[end] end, Div => () -> begin vm.stack[end-1] /= vm.stack[end]; pop!(vm.stack); vm.stack[end] end, mod => () -> begin vm.stack[end-1] %= vm.stack[1]; pop!(vm.stack); vm.stack[end] end, not => () -> vm.stack[end] = vm.stack[end] ? 0 : 1, neg => () -> vm.stack[end] = -vm.stack[end], and => () -> begin vm.stack[end-1] = tobool(vm.stack[end-1]) && tobool(vm.stack[end]) ? 1 : 0; pop!(vm.stack); vm.stack[end] end, or => () -> begin vm.stack[end-1] = tobool(vm.stack[end-1]) || tobool(vm.stack[end]) ? 1 : 0; pop!(vm.stack); vm.stack[end] end, lt => () -> begin x = (vm.stack[end-1] < vm.stack[end] ? 1 : 0); pop!(vm.stack); vm.stack[end] = x end, gt => () -> begin x = (vm.stack[end-1] > vm.stack[end] ? 1 : 0); pop!(vm.stack); vm.stack[end] = x end, le => () -> begin x = (vm.stack[end-1] <= vm.stack[end] ? 1 : 0); pop!(vm.stack); vm.stack[end] = x end, ge => () -> begin x = (vm.stack[end-1] >= vm.stack[end] ? 1 : 0); pop!(vm.stack); vm.stack[end] = x end, ne => () -> begin x = (vm.stack[end-1] != vm.stack[end] ? 1 : 0); pop!(vm.stack); vm.stack[end] = x end, eq => () -> begin x = (vm.stack[end-1] == vm.stack[end] ? 1 : 0); pop!(vm.stack); vm.stack[end] = x end, prts => () -> print(vm.strings[pop!(vm.stack) + 1]), prti => () -> print(pop!(vm.stack)), prtc => () -> print(Char(pop!(vm.stack))), store => () -> vm.data[value() + 1] = pop!(vm.stack), Fetch => () -> push!(vm.stack, vm.data[value() + 1]), push => () -> push!(vm.stack, value()), jmp => () -> vm.ip += value(), jz => () -> if pop!(vm.stack) == 0 vm.ip += value() else vm.ip += 4 end) vm.ip = 1 while true op = vm.code[vm.ip] vm.ip += 1 ops[op]() end
end
const testasm = """ Datasize: 1 Strings: 2 "count is: " "\\n"
0 push 1 5 store [0] 10 fetch [0] 15 push 10 20 lt 21 jz (43) 65 26 push 0 31 prts 32 fetch [0] 37 prti 38 push 1 43 prts 44 fetch [0] 49 push 1 54 add 55 store [0] 60 jmp (-51) 10 65 halt """
const iob = IOBuffer(testasm) const vm = assemble(iob) runvm(vm)
</lang>
- Output:
count is: 1 count is: 2 count is: 3 count is: 4 count is: 5 count is: 6 count is: 7 count is: 8 count is: 9
M2000 Interpreter
Using Select Case
<lang M2000 Interpreter> Module Virtual_Machine_Interpreter (a$){ \\ function to extract string, replacing escape codes. Function GetString$(a$) { s=instr(a$, chr$(34)) m=rinstr(a$,chr$(34))-s if m>1 then \\ process escape codes =format$(mid$(a$, s+1, m-1)) else ="" end if } \\ module to print a string to console using codes, 13, 10, 9 Module printsrv (a$) { for i=1 to len(a$) select case chrcode(Mid$(a$,i,1)) case 13 cursor 0 case 10 cursor 0 : Print case 9 cursor ((pos+tab) div tab)*tab else case { m=pos :if pos>=width then Print : m=pos Print Mid$(a$,i,1); if m<=width then cursor m+1 } end select next i } const nl$=chr$(13)+chr$(10) \\ we can set starting value to any number n where 0<=n<=232 enum op { halt_=232, add_, sub_, mul_, div_, mod_, not_, neg_, and_, or_, lt_, gt_, le_, ge_, ne_, eq_, prts_, prti_, prtc_, store_, fetch_, push_, jmp_, jz_
}
Rem : Form 120, 60 ' change console width X height to run Ascii Mandlebrot examlpe Report "Virtual Assembly Code:"+{ }+a$ Print "Prepare Byte Code"
\\ get datasize a$=rightpart$(a$, "Datasize:") m=0 data_size=val(a$, "int", m) a$=mid$(a$, m) \\ make stack if data_size>0 then Buffer Clear stack_ as long*data_size \\ dim or redim buffer append 1000 long as is. Buffer stack_ as long*(1000+data_size) \\ get strings a$=rightpart$(a$, "Strings:") m=0 strings=val(a$, "int", m) a$=rightpart$(a$, nl$)
if strings>0 then Dim strings$(strings) for i=0 to strings-1 strings$(i)=GetString$(leftpart$(a$, nl$)) a$=rightpart$(a$, nl$) Next i End if buffer clear code_ as byte*1000 do m=0 offset=val(a$,"int", m) if m<0 then exit a$=mid$(a$,m) line$=trim$(leftpart$(a$,nl$)) if line$="" then line$=trim$(a$) else a$=trim$(rightpart$(a$, nl$)) op$=if$(instr(line$," ")>0->leftpart$(line$," "), line$) if not valid(eval(op$+"_")) then exit opc=eval(op$+"_") Return code_, offset:=opc if opc>=store_ then line$=rightpart$(line$," ") select case opc case store_, fetch_ Return code_, offset+1:=val(rightpart$(leftpart$(line$,"]"),"[")) as long : offset+=4 case push_ Return code_, offset+1:=uint(val(line$)) as long : offset+=4 case jz_, jmp_ Return code_, offset+1:=val(rightpart$(line$,")")) as long : offset+=4 end select end if Always Print "Press any key" : Push key$ : Drop \\ Prepare VM let pc=0, sp=len(stack_) div 4 do { func=eval(code_, pc) pc++ select case func case halt_ exit case push_ sp--:return stack_, sp:=eval(code_, pc as long):pc+=4 case jz_ sp++: if eval(stack_, sp-1)=0 then pc=eval(code_, pc as long) else pc+=4 case jmp_ pc=eval(code_, pc as long) case fetch_ sp--:Return stack_, sp:=eval(stack_, eval(code_, pc as long)):pc+=4 case store_ Return stack_, eval(code_, pc as long):=eval(stack_, sp):sp++:pc+=4 case add_ Return stack_, sp+1:=uint(sint(eval(stack_, sp+1))+sint(eval(stack_, sp))):sp++ case sub_ Return stack_, sp+1:=uint(sint(eval(stack_, sp+1))-sint(eval(stack_, sp))):sp++ case mul_ Return stack_, sp+1:=uint(sint(eval(stack_, sp+1))*sint(eval(stack_, sp))):sp++ case div_ Return stack_, sp+1:=uint(sint(eval(stack_, sp+1)) div sint(eval(stack_, sp))):sp++ case mod_ Return stack_, sp+1:=uint(sint(eval(stack_, sp+1)) mod sint(eval(stack_, sp))) :sp++ case not_ Return stack_, sp:=if(eval(stack_, sp)=0->uint(-1),0) case neg_ \\ we can use neg(sint(value))+1 or uint(-sint(value)) Return stack_, sp:=uint(-sint(eval(stack_, sp))) case and_ Return stack_, sp+1:=binary.and(eval(stack_, sp+1),eval(stack_, sp)):sp++ case or_ Return stack_, sp+1:=binary.or(eval(stack_, sp+1),eval(stack_, sp)):sp++ case lt_ Return stack_, sp+1:=uint(if(sint(eval(stack_, sp+1))<sint(eval(stack_, sp))->-1, 0)):sp++ case gt_ Return stack_, sp+1:=uint(if(sint(eval(stack_, sp+1))>sint(eval(stack_, sp))->-1, 0)):sp++ case le_ Return stack_, sp+1:=uint(if(sint(eval(stack_, sp+1))<=sint(eval(stack_, sp))->-1, 0)):sp++ case ge_ Return stack_, sp+1:=uint(if(sint(eval(stack_, sp+1))>=sint(eval(stack_, sp))->-1, 0)):sp++ case ne_ Return stack_, sp+1:=uint(if(eval(stack_, sp+1)<>eval(stack_, sp)->-1, 0)):sp++ case eq_ Return stack_, sp+1:=uint(if(eval(stack_, sp+1)=eval(stack_, sp)->-1, 0)):sp++ case prts_ printsrv strings$(eval(stack_,sp)):sp++ case prti_ printsrv str$(sint(eval(stack_,sp)),0):sp++ case prtc_ printsrv chrcode$(eval(stack_,sp)):sp++ else case Error "Unkown op "+str$(func) end select } always Print "done" } Virtual_Machine_Interpreter { Datasize: 1 Strings: 2 "count is: " "\n"
0 push 1 5 store [0] 10 fetch [0] 15 push 10 20 lt 21 jz (43) 65 26 push 0 31 prts 32 fetch [0] 37 prti 38 push 1 43 prts 44 fetch [0] 49 push 1 54 add 55 store [0] 60 jmp (-51) 10 65 halt
} </lang>
Using Lambda functions
A call local to function pass the current scope to function, so it's like a call to subroutine, but faster.
<lang M2000 Interpreter> Module Virtual_Machine_Interpreter (a$){ \\ function to extract string, replacing escape codes. Function GetString$(a$) { s=instr(a$, chr$(34)) m=rinstr(a$,chr$(34))-s if m>1 then \\ process escape codes =format$(mid$(a$, s+1, m-1)) else ="" end if } \\ module to print a string to console using codes, 13, 10, 9 Module printsrv (a$) { for i=1 to len(a$) select case chrcode(Mid$(a$,i,1)) case 13 cursor 0 case 10 cursor 0 : Print case 9 cursor ((pos+tab) div tab)*tab else case { m=pos :if pos>=width then Print : m=pos Print Mid$(a$,i,1); if m<=width then cursor m+1 } end select next i } const nl$=chr$(13)+chr$(10) \\ we can set starting value to any number n where 0<=n<=232 enum op { halt_=232, add_, sub_, mul_, div_, mod_, not_, neg_, and_, or_, lt_, gt_, le_, ge_, ne_, eq_, prts_, prti_, prtc_, store_, fetch_, push_, jmp_, jz_
} exit_now=false
Inventory func=halt_:=lambda->{exit_now=true} Append func, push_:=lambda->{sp--:return stack_, sp:=eval(code_, pc as long):pc+=4} Append func, jz_:=lambda->{ sp++: if eval(stack_, sp-1)=0 then pc=eval(code_, pc as long) else pc+=4 } Append func, jmp_:=lambda->{pc=eval(code_, pc as long)} Append func, fetch_:=lambda->{sp--:Return stack_, sp:=eval(stack_, eval(code_, pc as long)):pc+=4} Append func, store_:=lambda->{Return stack_, eval(code_, pc as long):=eval(stack_, sp):sp++:pc+=4} Append func, add_:=lambda->{Return stack_, sp+1:=uint(sint(eval(stack_, sp+1))+sint(eval(stack_, sp))):sp++} Append func, sub_:=lambda->{Return stack_, sp+1:=uint(sint(eval(stack_, sp+1))-sint(eval(stack_, sp))):sp++} Append func, mul_:=lambda->{Return stack_, sp+1:=uint(sint(eval(stack_, sp+1))*sint(eval(stack_, sp))):sp++} Append func, div_:=lambda->{Return stack_, sp+1:=uint(sint(eval(stack_, sp+1)) div sint(eval(stack_, sp))):sp++} Append func, mod_:=lambda->{Return stack_, sp+1:=uint(sint(eval(stack_, sp+1)) mod sint(eval(stack_, sp))) :sp++} Append func, not_:=lambda->{Return stack_, sp:=if(eval(stack_, sp)=0->uint(-1),0)} Append func, neg_:=lambda->{Return stack_, sp:=uint(-sint(eval(stack_, sp)))} Append func, and_:=lambda->{Return stack_, sp+1:=binary.and(eval(stack_, sp+1),eval(stack_, sp)):sp++ } Append func, or_:=lambda->{Return stack_, sp+1:=binary.or(eval(stack_, sp+1),eval(stack_, sp)):sp++ } Append func, lt_:=lambda->{Return stack_, sp+1:=uint(if(sint(eval(stack_, sp+1))<sint(eval(stack_, sp))->-1, 0)):sp++} Append func, gt_:=lambda->{Return stack_, sp+1:=uint(if(sint(eval(stack_, sp+1))>sint(eval(stack_, sp))->-1, 0)):sp++} Append func, le_:=lambda->{Return stack_, sp+1:=uint(if(sint(eval(stack_, sp+1))<=sint(eval(stack_, sp))->-1, 0)):sp++} Append func, ge_:=lambda->{Return stack_, sp+1:=uint(if(sint(eval(stack_, sp+1))>=sint(eval(stack_, sp))->-1, 0)):sp++} Append func, ne_:=lambda->{Return stack_, sp+1:=uint(if(eval(stack_, sp+1)<>eval(stack_, sp)->-1, 0)):sp++} Append func, eq_:=lambda->{Return stack_, sp+1:=uint(if(eval(stack_, sp+1)=eval(stack_, sp)->-1, 0)):sp++} Append func, prts_:=lambda->{printsrv strings$(eval(stack_,sp)):sp++} Append func, prti_:=lambda->{printsrv str$(sint(eval(stack_,sp)),0):sp++} Append func, prtc_:=lambda->{printsrv chrcode$(eval(stack_,sp)):sp++} Rem : Form 120, 60 ' change console width X height to run Ascii Mandlebrot examlpe Report "Virtual Assembly Code:"+{ }+a$ Print "Prepare Byte Code"
\\ get datasize a$=rightpart$(a$, "Datasize:") m=0 data_size=val(a$, "int", m) a$=mid$(a$, m) \\ make stack if data_size>0 then Buffer Clear stack_ as long*data_size \\ dim or redim buffer append 1000 long as is. Buffer stack_ as long*(1000+data_size) \\ get strings a$=rightpart$(a$, "Strings:") m=0 strings=val(a$, "int", m) a$=rightpart$(a$, nl$)
if strings>0 then Dim strings$(strings) for i=0 to strings-1 strings$(i)=GetString$(leftpart$(a$, nl$)) a$=rightpart$(a$, nl$) Next i End if buffer clear code_ as byte*1000 do m=0 offset=val(a$,"int", m) if m<0 then exit a$=mid$(a$,m) line$=trim$(leftpart$(a$,nl$)) if line$="" then line$=trim$(a$) else a$=trim$(rightpart$(a$, nl$)) op$=if$(instr(line$," ")>0->leftpart$(line$," "), line$) if not valid(eval(op$+"_")) then exit opc=eval(op$+"_") Return code_, offset:=opc if opc>=store_ then line$=rightpart$(line$," ") select case opc case store_, fetch_ Return code_, offset+1:=val(rightpart$(leftpart$(line$,"]"),"[")) as long : offset+=4 case push_ Return code_, offset+1:=uint(val(line$)) as long : offset+=4 case jz_, jmp_ Return code_, offset+1:=val(rightpart$(line$,")")) as long : offset+=4 end select end if Always Print "Press any key" : Push key$ : Drop \\ Prepare VM let pc=0, sp=len(stack_) div 4 do b=func(eval(code_, pc)) pc++ call local b() until exit_now Print "done" } Virtual_Machine_Interpreter { Datasize: 1 Strings: 2 "count is: " "\n"
0 push 1 5 store [0] 10 fetch [0] 15 push 10 20 lt 21 jz (43) 65 26 push 0 31 prts 32 fetch [0] 37 prti 38 push 1 43 prts 44 fetch [0] 49 push 1 54 add 55 store [0] 60 jmp (-51) 10 65 halt
} </lang>
Nim
<lang Nim>import os, parseutils, strutils, strscans, strformat
type
Value = int32 BytesValue = array[4, byte] Address = int32
OpCode = enum opFetch = "fetch" opStore = "store" opPush = "push" opJmp = "jmp" opJz = "jz" opAdd = "add" opSub = "sub" opMul = "mul" opDiv = "div" opMod = "mod" opLt = "lt" opgt = "gt" opLe = "le" opGe = "ge" opEq = "eq" opNe = "ne" opAnd = "and" opOr = "or" opNeg = "neg" opNot = "not" opPrtc = "prtc" opPrti = "prti" opPrts = "prts" opHalt = "halt" opInvalid = "invalid"
# Virtual machine description. VM = object stack: seq[Value] # Evaluation stack. memory: seq[byte] # Memory to store program. data: seq[Value] # Data storage. strings: seq[string] # String storage. pc: Address # Program counter.
# Exceptions. LoadingError = object of CatchableError RuntimeError = object of CatchableError
- Running program.
proc checkStackLength(vm: VM; minLength: int) {.inline.} =
## Check that evaluation stack contains at least "minLength" elements. if vm.stack.len < minLength: raise newException(RuntimeError, &"not enough operands on the stack (pc = {vm.pc}).")
- ---------------------------------------------------------------------------------------------------
proc getOperand(vm: var VM): Value =
## Get a 32 bits operand.
type Union {.union.} = object value: Value bytes: BytesValue
if vm.pc + 4 >= vm.memory.len: raise newException(RuntimeError, &"out of memory (pc = {vm.pc}).")
var aux: Union let address = vm.pc + 1 for idx in 0..3: aux.bytes[idx] = vm.memory[address + idx] result = aux.value
- ---------------------------------------------------------------------------------------------------
proc run(vm: var VM) =
## Run a program loaded in VM memory.
vm.pc = 0
while true:
if vm.pc notin 0..vm.memory.high: raise newException(RuntimeError, &"out of memory (pc = {vm.pc}).")
let opcode = OpCode(vm.memory[vm.pc]) case opcode
of opFetch, opStore: let index = vm.getOperand() if index notin 0..vm.data.high: raise newException(RuntimeError, &"wrong memory index (pc = {vm.pc}).") if opcode == opFetch: vm.stack.add(vm.data[index]) else: vm.checkStackLength(1) vm.data[index] = vm.stack.pop() inc vm.pc, 4
of opPush: let value = vm.getOperand() vm.stack.add(value) inc vm.pc, 4
of opJmp: let offset = vm.getOperand() inc vm.pc, offset
of opJz: let offset = vm.getOperand() vm.checkStackLength(1) let value = vm.stack.pop() inc vm.pc, if value == 0: offset else: 4
of opAdd..opOr: # Two operands instructions. vm.checkStackLength(2) let op2 = vm.stack.pop() let op1 = vm.stack.pop() case range[opAdd..opOr](opcode) of opAdd: vm.stack.add(op1 + op2) of opSub: vm.stack.add(op1 - op2) of opMul: vm.stack.add(op1 * op2) of opDiv: vm.stack.add(op1 div op2) of opMod: vm.stack.add(op1 mod op2) of opLt: vm.stack.add(Value(op1 < op2)) of opgt: vm.stack.add(Value(op1 > op2)) of opLe: vm.stack.add(Value(op1 <= op2)) of opGe: vm.stack.add(Value(op1 >= op2)) of opEq: vm.stack.add(Value(op1 == op2)) of opNe: vm.stack.add(Value(op1 != op2)) of opAnd: vm.stack.add(op1 and op2) of opOr: vm.stack.add(op1 or op2)
of opNeg..opPrts: # One operand instructions. vm.checkStackLength(1) let op = vm.stack.pop() case range[opNeg..opPrts](opcode) of opNeg: vm.stack.add(-op) of opNot: vm.stack.add(not op) of opPrtc: stdout.write(chr(op)) of opPrti: stdout.write(op) of opPrts: if op notin 0..vm.strings.high: raise newException(RuntimeError, &"wrong string index (pc = {vm.pc}).") stdout.write(vm.strings[op])
of opHalt: break
of opInvalid: discard # Not possible.
inc vm.pc
- Loading assembly file.
proc parseHeader(line: string): tuple[dataSize, stringCount: int] =
## Parse the header.
if not line.scanf("Datasize: $s$i $sStrings: $i", result.dataSize, result.stringCount): raise newException(LoadingError, "Wrong header in code.")
- ---------------------------------------------------------------------------------------------------
import re
proc parseString(line: string; linenum: int): string =
## Parse a string.
if not line.startsWith('"'): raise newException(LoadingError, "Line $1: incorrect string.".format(linenum)) # Can't use "unescape" as it is confused by "\\n" and "\n". result = line.replacef(re"([^\\])(\\n)", "$1\n").replace(r"\\", r"\").replace("\"", "")
- ---------------------------------------------------------------------------------------------------
proc parseValue(line: string; linenum: int; pos: var int; msg: string): int32 =
## Parse an int32 value.
var value: int
pos += line.skipWhitespace(pos) let parsed = line.parseInt(value, pos) if parsed == 0: raise newException(LoadingError, "Line $1: ".format(linenum) & msg) pos += parsed result = int32(value)
- ---------------------------------------------------------------------------------------------------
proc parseOpcode(line: string; linenum: int; pos: var int): OpCode =
## Parse an opcode.
var opstring: string
pos += line.skipWhitespace(pos) let parsed = line.parseIdent(opstring, pos) if parsed == 0: raise newException(LoadingError, "Line $1: opcode expected".format(linenum)) pos += parsed
result = parseEnum[OpCode](opstring, opInvalid) if result == opInvalid: raise newException(LoadingError, "Line $1: invalid opcode encountered".format(linenum))
- ---------------------------------------------------------------------------------------------------
proc parseMemoryIndex(line: string; linenum: int; pos: var int): int32 =
## Parse a memory index (int32 value between brackets).
var memIndex: int
pos += line.skipWhitespace(pos) let str = line.captureBetween('[', ']', pos) if str.parseInt(memIndex) == 0 or memIndex < 0: raise newException(LoadingError, "Line $1: invalid memory index".format(lineNum)) pos += str.len + 2 result = int32(memIndex)
- ---------------------------------------------------------------------------------------------------
proc parseOffset(line: string; linenum: int; pos: var int): int32 =
## Parse an offset (int32 value between parentheses).
var offset: int
pos += line.skipWhitespace(pos) let str = line.captureBetween('(', ')', pos) if str.parseInt(offset) == 0: raise newException(LoadingError, "Line $1: invalid offset".format(linenum)) pos += str.len + 2 result = int32(offset)
- ---------------------------------------------------------------------------------------------------
proc load(vm: var VM; code: string) =
## Load an assembly code into VM memory.
# Analyze header. let lines = code.splitlines() let (dataSize, stringCount) = parseHeader(lines[0]) vm.data.setLen(dataSize) vm.strings.setLen(stringCount)
# Load strings. for idx in 1..stringCount: vm.strings[idx - 1] = lines[idx].parseString(idx + 1)
# Load code. var pc: Address = 0 for idx in (stringCount + 1)..lines.high: var pos = 0 let line = lines[idx] if line.len == 0: continue
# Process address. let address = line.parseValue(idx + 1, pos, "address expected") if address != pc: raise newException(LoadingError, "Line $1: wrong address".format(idx + 1))
# Process opcode. let opcode = line.parseOpcode(idx + 1, pos) vm.memory.add(byte(opcode))
# Process operand. case opcode
of opFetch, opStore: # Find memory index. let memIndex = line.parseMemoryIndex(idx + 1, pos) vm.memory.add(cast[BytesValue](Value(memIndex))) inc pc, 5
of opJmp, opJz: # Find offset. let offset = line.parseOffset(idx + 1, pos) vm.memory.add(cast[BytesValue](Value(offset))) # Find and check branch address. let branchAddress = line.parseValue(idx + 1, pos, "branch address expected") if branchAddress != pc + offset + 1: raise newException(LoadingError, "Line $1: wrong branch address".format(idx + 1)) inc pc, 5
of opPush: # Find value. let value = line.parseValue(idx + 1, pos, "value expected") vm.memory.add(cast[BytesValue](Value(value))) inc pc, 5
else: inc pc
- ———————————————————————————————————————————————————————————————————————————————————————————————————
let code = if paramCount() == 0: stdin.readAll() else: paramStr(1).readFile() var vm: VM
vm.load(code) vm.run()</lang>
All tests passed.
ObjectIcon
<lang ObjectIcon># -*- ObjectIcon -*-
- The Rosetta Code virtual machine in Object Icon.
- See https://rosettacode.org/wiki/Compiler/virtual_machine_interpreter
import io
procedure main(args)
local f_inp, f_out local vm
if 3 <= *args then { write("Usage: ", &progname, " [INPUT_FILE [OUTPUT_FILE]]") exit(1) }
if 1 <= *args then { f_inp := FileStream(args[1], FileOpt.RDONLY) | stop (&why) } else { f_inp := FileStream.stdin } f_inp := BufferStream(f_inp)
if 2 <= *args then { f_out := FileStream(args[2], ior (FileOpt.WRONLY, FileOpt.TRUNC, FileOpt.CREAT)) | stop (&why) } else { f_out := FileStream.stdout }
vm := VirtualMachine() vm.read_assembly_code(f_inp) vm.run(f_out)
end
procedure int2bytes (n)
local bytes
# The VM is little-endian.
bytes := "****" bytes[1] := char (iand(n, 16rFF)) bytes[2] := char(iand(ishift(n, -8), 16rFF)) bytes[3] := char(iand(ishift(n, -16), 16rFF)) bytes[4] := char(iand(ishift(n, -24), 16rFF)) return bytes
end
procedure bytes2int(bytes, i)
local n0, n1, n2, n3, n
# The VM is little-endian.
n0 := ord(bytes[i]) n1 := ishift(ord(bytes[i + 1]), 8) n2 := ishift(ord(bytes[i + 2]), 16) n3 := ishift(ord(bytes[i + 3]), 24) n := ior (n0, ior (n1, ior (n2, n3)))
# Do not forget to extend the sign bit. return (if n3 <= 16r7F then n else ior(n, icom(16rFFFFFFFF)))
end
class OpcodeCollection()
public static const opcode_names public static const opcode_values
public static const op_halt public static const op_add public static const op_sub public static const op_mul public static const op_div public static const op_mod public static const op_lt public static const op_gt public static const op_le public static const op_ge public static const op_eq public static const op_ne public static const op_and public static const op_or public static const op_neg public static const op_not public static const op_prtc public static const op_prti public static const op_prts public static const op_fetch public static const op_store public static const op_push public static const op_jmp public static const op_jz
private static init() local i
opcode_names := ["halt", "add", "sub", "mul", "div", "mod", "lt", "gt", "le", "ge", "eq", "ne", "and", "or", "neg", "not", "prtc", "prti", "prts", "fetch", "store", "push", "jmp", "jz"]
opcode_values := table() every i := 1 to *opcode_names do opcode_values[opcode_names[i]] := char(i)
op_halt := opcode_values["halt"] op_add := opcode_values["add"] op_sub := opcode_values["sub"] op_mul := opcode_values["mul"] op_div := opcode_values["div"] op_mod := opcode_values["mod"] op_lt := opcode_values["lt"] op_gt := opcode_values["gt"] op_le := opcode_values["le"] op_ge := opcode_values["ge"] op_eq := opcode_values["eq"] op_ne := opcode_values["ne"] op_and := opcode_values["and"] op_or := opcode_values["or"] op_neg := opcode_values["neg"] op_not := opcode_values["not"] op_prtc := opcode_values["prtc"] op_prti := opcode_values["prti"] op_prts := opcode_values["prts"] op_fetch := opcode_values["fetch"] op_store := opcode_values["store"] op_push := opcode_values["push"] op_jmp := opcode_values["jmp"] op_jz := opcode_values["jz"]
return end
end
class VirtualMachine(OpcodeCollection)
public code public global_data public strings public stack public pc
private static const whitespace_chars
private static init() whitespace_chars := ' \t\n\r\f\v' return end
public read_assembly_code(f) local data_size, number_of_strings local line, ch local i local address local opcode
# Read the header line. line := f.read() | bad_vm() line ? { tab(many(whitespace_chars)) tab(match("Datasize")) | bad_vm() tab(many(whitespace_chars)) tab(any(':')) | bad_vm() tab(many(whitespace_chars)) data_size := integer(tab(many(&digits))) | bad_vm() tab(many(whitespace_chars)) tab(match("Strings")) | bad_vm() tab(many(whitespace_chars)) tab(any(':')) | bad_vm() tab(many(whitespace_chars)) number_of_strings := integer(tab(many(&digits))) | bad_vm() }
# Read the strings. strings := list(number_of_strings) every i := 1 to number_of_strings do { strings[i] := "" line := f.read() | bad_vm() line ? { tab(many(whitespace_chars)) tab(any('"')) | bad_vm() while ch := tab(any(~'"')) do { if ch == '\\' then { ch := tab(any('n\\')) | bad_vm() strings[i] ||:= (if (ch == "n") then "\n" else "\\") } else { strings[i] ||:= ch } } } }
# Read the code. code := "" while line := f.read() do { line ? { tab(many(whitespace_chars)) address := integer(tab(many(&digits))) | bad_vm() tab(many(whitespace_chars)) opcode := tab(many(~whitespace_chars)) | bad_vm() code ||:= opcode_values[opcode] case opcode of { "push": { tab(many(whitespace_chars)) code ||:= int2bytes(integer(tab(many(&digits)))) | int2bytes(integer(tab(any('-')) || tab(many(&digits)))) | bad_vm() } "fetch" | "store": { tab(many(whitespace_chars)) tab(any('[')) | bad_vm() tab(many(whitespace_chars)) code ||:= int2bytes(integer(tab(many(&digits)))) | bad_vm() tab(many(whitespace_chars)) tab(any(']')) | bad_vm() } "jmp" | "jz": { tab(many(whitespace_chars)) tab(any('(')) | bad_vm() tab(many(whitespace_chars)) code ||:= int2bytes(integer(tab(many(&digits)))) | int2bytes(integer(tab(any('-')) || tab(many(&digits)))) | bad_vm() tab(many(whitespace_chars)) tab(any(')')) | bad_vm() tab(many(whitespace_chars)) tab(many(&digits)) | bad_vm() } default: { # Do nothing } } } }
# Create a global data area. global_data := list(data_size, &null)
initialize()
return end
public run(f_out) initialize() continue(f_out) return end
public continue(f_out) while code[pc] ~== op_halt do step(f_out) end
public step(f_out) local opcode
opcode := code[pc] pc +:= 1 case opcode of { op_add: binop("+") op_sub: binop("-") op_mul: binop("*") op_div: binop("/") op_mod: binop("%") op_lt: comparison("<") op_gt: comparison(">") op_le: comparison("<=") op_ge: comparison(">=") op_eq: comparison("=") op_ne: comparison("~=") op_and: logical_and() op_or: logical_or() op_neg: negate() op_not: logical_not() op_prtc: printc(f_out) op_prti: printi(f_out) op_prts: prints(f_out) op_fetch: fetch_global() op_store: store_global() op_push: push_argument() op_jmp: jump() op_jz: jump_if_zero() default: bad_opcode() } end
private negate() stack[1] := -stack[1] return end
private binop(func) stack[2] := func(stack[2], stack[1]) pop(stack) return end
private comparison(func) stack[2] := (if func(stack[2], stack[1]) then 1 else 0) pop(stack) return end
private logical_and() stack[2] := (if stack[2] ~= 0 & stack[1] ~= 0 then 1 else 0) pop(stack) return end
private logical_or() stack[2] := (if stack[2] ~= 0 | stack[1] ~= 0 then 1 else 0) pop(stack) return end
private logical_not() stack[1] := (if stack[1] ~= 0 then 0 else 1) return end
private printc(f_out) /f_out := FileStream.stdout f_out.writes(char(pop(stack))) return end
private printi(f_out) /f_out := FileStream.stdout f_out.writes(pop(stack)) return end
private prints(f_out) /f_out := FileStream.stdout f_out.writes(strings[pop(stack) + 1]) return end
private fetch_global() push(stack, global_data[get_argument() + 1]) pc +:= 4 return end
private store_global() global_data[get_argument() + 1] := pop(stack) pc +:= 4 return end
private push_argument() push(stack, get_argument()) pc +:= 4 return end
private jump() pc +:= get_argument() return end
private jump_if_zero() if pop(stack) = 0 then pc +:= get_argument() else pc +:= 4 return end
private get_argument() return bytes2int(code, pc) end
public initialize() # The program counter starts at 1, for convenient indexing into # the code[] array. Icon indexing starts at 1 (for a *very* good # reason, but that’s a topic for another day). pc := 1 stack := [] return end
private bad_vm() write(FileStream.stderr, "Bad VM.") exit(1) end
private bad_opcode() write(FileStream.stderr, "Bad opcode.") exit(1) end
end</lang>
- Output:
$ oit vm-oi.icn && ./vm-oi compiler-tests/count.vm
count is: 1 count is: 2 count is: 3 count is: 4 count is: 5 count is: 6 count is: 7 count is: 8 count is: 9
Perl
Tested with perl v5.26.1 <lang Perl>#!/usr/bin/perl
use strict; # vm.pl - run rosetta code use warnings; use integer;
my ($binary, $pc, @stack, @data) = (, 0);
<> =~ /Strings: (\d+)/ or die "bad header"; my @strings = map <> =~ tr/\n""//dr =~ s/\\(.)/$1 eq 'n' ? "\n" : $1/ger, 1..$1;
sub value { unpack 'l', substr $binary, ($pc += 4) - 4, 4 }
my @ops = (
[ halt => sub { exit } ], [ add => sub { $stack[-2] += pop @stack } ], [ sub => sub { $stack[-2] -= pop @stack } ], [ mul => sub { $stack[-2] *= pop @stack } ], [ div => sub { $stack[-2] /= pop @stack } ], [ mod => sub { $stack[-2] %= pop @stack } ], [ not => sub { $stack[-1] = $stack[-1] ? 0 : 1 } ], [ neg => sub { $stack[-1] = - $stack[-1] } ], [ and => sub { $stack[-2] &&= $stack[-1]; pop @stack } ], [ or => sub { $stack[-2] ||= $stack[-1]; pop @stack } ], [ lt => sub { $stack[-1] = $stack[-2] < pop @stack ? 1 : 0 } ], [ gt => sub { $stack[-1] = $stack[-2] > pop @stack ? 1 : 0 } ], [ le => sub { $stack[-1] = $stack[-2] <= pop @stack ? 1 : 0 } ], [ ge => sub { $stack[-1] = $stack[-2] >= pop @stack ? 1 : 0 } ], [ ne => sub { $stack[-1] = $stack[-2] != pop @stack ? 1 : 0 } ], [ eq => sub { $stack[-1] = $stack[-2] == pop @stack ? 1 : 0 } ], [ prts => sub { print $strings[pop @stack] } ], [ prti => sub { print pop @stack } ], [ prtc => sub { print chr pop @stack } ], [ store => sub { $data[value()] = pop @stack } ], [ fetch => sub { push @stack, $data[value()] // 0 } ], [ push => sub { push @stack, value() } ], [ jmp => sub { $pc += value() - 4 } ], [ jz => sub { $pc += pop @stack ? 4 : value() - 4 } ], );
my %op2n = map { $ops[$_][0], $_ } 0..$#ops; # map name to op number
while(<>)
{ /^ *\d+ +(\w+)/ or die "bad line $_"; # format error $binary .= chr( $op2n{$1} // die "$1 not defined" ) . # op code (/\((-?\d+)\)|(\d+)]?$/ and pack 'l', $+); # 4 byte value }
$ops[vec($binary, $pc++, 8)][1]->() while 1; # run it</lang> Passes all tests.
Phix
Reusing cgen.e from the Code Generator task
-- -- demo\rosetta\Compiler\vm.exw -- ============================ -- -- Since we have generated executable machine code, the virtual machine, such as it is, is just -- the higher level implementations of printc/i/s, see setbuiltins() in cgen.e -- Otherwise the only difference between this and cgen.exw is call(code_mem) instead of decode(). -- -- A quick test (calculating fib(44) 10^6 times) suggests ~500 times faster than interp.exw - -- which is to be expected given that a single add instruction (1 clock) here is implemented as -- at least three (and quite possibly five!) resursive calls to interp() in the other. format PE32 --format ELF32 -- Note: cgen generates 32-bit machine code, which cannot be executed directly from a 64-bit interpreter. -- You can however, via the magic of either the above format directives, use a 64-bit version of -- Phix to compile this (just add a -c command line option) to a 32-bit executable, which can. -- It would not be particularly difficult to emit 32 or 64 bit code, but some source code files -- would, fairly obviously, then be very nearly twice as long, and a fair bit harder to read. without js -- (machine code!) include cgen.e procedure main(sequence cl) open_files(cl) toks = lex() object t = parse() code_gen(t) fixup() if machine_bits()=32 then -- ^ as per note above call(code_mem) end if free({var_mem,code_mem}) close_files() end procedure --main(command_line()) main({0,0,"count.c"})
- Output:
count is: 1 count is: 2 count is: 3 count is: 4 count is: 5 count is: 6 count is: 7 count is: 8 count is: 9
Python
Tested with Python 2.7 and 3.x <lang Python>from __future__ import print_function import sys, struct
FETCH, STORE, PUSH, ADD, SUB, MUL, DIV, MOD, LT, GT, LE, GE, EQ, NE, AND, OR, NEG, NOT, \ JMP, JZ, PRTC, PRTS, PRTI, HALT = range(24)
code_map = {
"fetch": FETCH, "store": STORE, "push": PUSH, "add": ADD, "sub": SUB, "mul": MUL, "div": DIV, "mod": MOD, "lt": LT, "gt": GT, "le": LE, "ge": GE, "eq": EQ, "ne": NE, "and": AND, "or": OR, "not": NOT, "neg": NEG, "jmp": JMP, "jz": JZ, "prtc": PRTC, "prts": PRTS, "prti": PRTI, "halt": HALT
}
input_file = None code = bytearray() string_pool = [] word_size = 4
- show error and exit
def error(msg):
print("%s" % (msg)) exit(1)
def int_to_bytes(val):
return struct.pack("<i", val)
def bytes_to_int(bstr):
return struct.unpack("<i", bstr)
def emit_byte(x):
code.append(x)
def emit_word(x):
s = int_to_bytes(x) for x in s: code.append(x)
def run_vm(data_size):
stack = [0 for i in range(data_size + 1)] pc = 0 while True: op = code[pc] pc += 1
if op == FETCH: stack.append(stack[bytes_to_int(code[pc:pc+word_size])[0]]); pc += word_size elif op == STORE: stack[bytes_to_int(code[pc:pc+word_size])[0]] = stack.pop(); pc += word_size elif op == PUSH: stack.append(bytes_to_int(code[pc:pc+word_size])[0]); pc += word_size elif op == ADD: stack[-2] += stack[-1]; stack.pop() elif op == SUB: stack[-2] -= stack[-1]; stack.pop() elif op == MUL: stack[-2] *= stack[-1]; stack.pop() # use C like division semantics elif op == DIV: stack[-2] = int(float(stack[-2]) / stack[-1]); stack.pop() elif op == MOD: stack[-2] = int(float(stack[-2]) % stack[-1]); stack.pop() elif op == LT: stack[-2] = stack[-2] < stack[-1]; stack.pop() elif op == GT: stack[-2] = stack[-2] > stack[-1]; stack.pop() elif op == LE: stack[-2] = stack[-2] <= stack[-1]; stack.pop() elif op == GE: stack[-2] = stack[-2] >= stack[-1]; stack.pop() elif op == EQ: stack[-2] = stack[-2] == stack[-1]; stack.pop() elif op == NE: stack[-2] = stack[-2] != stack[-1]; stack.pop() elif op == AND: stack[-2] = stack[-2] and stack[-1]; stack.pop() elif op == OR: stack[-2] = stack[-2] or stack[-1]; stack.pop() elif op == NEG: stack[-1] = -stack[-1] elif op == NOT: stack[-1] = not stack[-1] elif op == JMP: pc += bytes_to_int(code[pc:pc+word_size])[0] elif op == JZ: if stack.pop(): pc += word_size else: pc += bytes_to_int(code[pc:pc+word_size])[0] elif op == PRTC: print("%c" % (stack[-1]), end=); stack.pop() elif op == PRTS: print("%s" % (string_pool[stack[-1]]), end=); stack.pop() elif op == PRTI: print("%d" % (stack[-1]), end=); stack.pop() elif op == HALT: break
def str_trans(srce):
dest = "" i = 0 while i < len(srce): if srce[i] == '\\' and i + 1 < len(srce): if srce[i + 1] == 'n': dest += '\n' i += 2 elif srce[i + 1] == '\\': dest += '\\' i += 2 else: dest += srce[i] i += 1
return dest
def load_code():
global string_pool
line = input_file.readline() if len(line) == 0: error("empty line")
line_list = line.split() data_size = int(line_list[1]) n_strings = int(line_list[3])
for i in range(n_strings): string_pool.append(str_trans(input_file.readline().strip('"\n')))
while True: line = input_file.readline() if len(line) == 0: break line_list = line.split() offset = int(line_list[0]) instr = line_list[1] opcode = code_map.get(instr) if opcode == None: error("Unknown instruction %s at %d" % (instr, offset)) emit_byte(opcode) if opcode in [JMP, JZ]: p = int(line_list[3]) emit_word(p - (offset + 1)) elif opcode == PUSH: value = int(line_list[2]) emit_word(value) elif opcode in [FETCH, STORE]: value = int(line_list[2].strip('[]')) emit_word(value)
return data_size
- main driver
input_file = sys.stdin if len(sys.argv) > 1:
try: input_file = open(sys.argv[1], "r", 4096) except IOError as e: error(0, 0, "Can't open %s" % sys.argv[1])
data_size = load_code() run_vm(data_size)</lang>
Raku
(formerly Perl 6) Non-standard size of instructions (not byte-coded, 'Datasize' is 3 not 1) required adjusting the jump offsets.
NOTE: I don't think you are allowed to change the jump offsets. They come from another program from another company from another planet from another galaxy.
WIP: discovered 'P5pack' module, this may allow for completing the task properly, using correct offsets
<lang perl6>my @CODE = q:to/END/.lines; Datasize: 3 Strings: 2 "count is: " "\n"
0 push 1 5 store [0] 10 fetch [0] 15 push 10 20 lt 21 jz (68) 65 # jump value adjusted 26 push 0 31 prts 32 fetch [0] 37 prti 38 push 1 43 prts 44 fetch [0] 49 push 1 54 add 55 store [0] 60 jmp (-87) 10 # jump value adjusted 65 halt
END
my (@stack, @strings, @data, $memory); my $pc = 0;
(@CODE.shift) ~~ /'Datasize:' \s+ (\d+) \s+ 'Strings:' \s+ (\d+)/ or die "bad header"; my $w = $0; # 'wordsize' of op-codes and 'width' of data values @strings.push: (my $s = @CODE.shift) eq '"\n"' ?? "\n" !! $s.subst(/'"'/, , :g) for 1..$1;
sub value { substr($memory, ($pc += $w) - $w, $w).trim }
my %ops = (
'no-op' => sub { }, 'add' => sub { @stack[*-2] += @stack.pop }, 'sub' => sub { @stack[*-2] -= @stack.pop }, 'mul' => sub { @stack[*-2] *= @stack.pop }, 'div' => sub { @stack[*-2] /= @stack.pop }, 'mod' => sub { @stack[*-2] %= @stack.pop }, 'neg' => sub { @stack[*-1] = - @stack[*-1] }, 'and' => sub { @stack[*-2] &&= @stack[*-1]; @stack.pop }, 'or' => sub { @stack[*-2] ||= @stack[*-1]; @stack.pop }, 'not' => sub { @stack[*-1] = @stack[*-1] ?? 0 !! 1 }, 'lt' => sub { @stack[*-1] = @stack[*-2] < @stack.pop ?? 1 !! 0 }, 'gt' => sub { @stack[*-1] = @stack[*-2] > @stack.pop ?? 1 !! 0 }, 'le' => sub { @stack[*-1] = @stack[*-2] <= @stack.pop ?? 1 !! 0 }, 'ge' => sub { @stack[*-1] = @stack[*-2] >= @stack.pop ?? 1 !! 0 }, 'ne' => sub { @stack[*-1] = @stack[*-2] != @stack.pop ?? 1 !! 0 }, 'eq' => sub { @stack[*-1] = @stack[*-2] == @stack.pop ?? 1 !! 0 }, 'store' => sub { @data[&value] = @stack.pop }, 'fetch' => sub { @stack.push: @data[&value] // 0 }, 'push' => sub { @stack.push: value() }, 'jmp' => sub { $pc += value() - $w }, 'jz' => sub { $pc += @stack.pop ?? $w !! value() - $w }, 'prts' => sub { print @strings[@stack.pop] }, 'prti' => sub { print @stack.pop }, 'prtc' => sub { print chr @stack.pop }, 'halt' => sub { exit }
);
my %op2n = %ops.keys.sort Z=> 0..*; my %n2op = %op2n.invert; %n2op{} = 'no-op';
for @CODE -> $_ {
next unless /\w/; /^ \s* \d+ \s+ (\w+)/ or die "bad line $_"; $memory ~= %op2n{$0}.fmt("%{$w}d"); /'(' ('-'?\d+) ')' | (\d+) ']'? $/; $memory ~= $0 ?? $0.fmt("%{$w}d") !! ' ' x $w;
}
loop {
my $opcode = substr($memory, $pc, $w).trim; $pc += $w; %ops{%n2op{ $opcode }}();
}</lang>
- Output:
count is: 1 count is: 2 count is: 3 count is: 4 count is: 5 count is: 6 count is: 7 count is: 8 count is: 9
Scala
The complete implementation for the compiler tasks can be found in a GitHub repository at github.com/edadma/rosettacodeCompiler which includes full unit testing for the samples given in Compiler/Sample programs.
The following code implements a virtual machine for the output of the code generator.
<lang scala> package xyz.hyperreal.rosettacodeCompiler
import java.io.{BufferedReader, FileReader, Reader, StringReader}
import scala.collection.mutable import scala.collection.mutable.ArrayBuffer
object VirtualMachine {
private object Opcodes { val FETCH: Byte = 0 val STORE: Byte = 1 val PUSH: Byte = 2 val JMP: Byte = 3 val JZ: Byte = 4 val ADD: Byte = 5 val SUB: Byte = 6 val MUL: Byte = 7 val DIV: Byte = 8 val MOD: Byte = 9 val LT: Byte = 10 val GT: Byte = 11 val LE: Byte = 12 val GE: Byte = 13 val EQ: Byte = 14 val NE: Byte = 15 val AND: Byte = 16 val OR: Byte = 17 val NEG: Byte = 18 val NOT: Byte = 19 val PRTC: Byte = 20 val PRTI: Byte = 21 val PRTS: Byte = 22 val HALT: Byte = 23 }
import Opcodes._
private val HEADER_REGEX = "Datasize: ([0-9]+) Strings: ([0-9]+)" r private val STRING_REGEX = "\"([^\"]*)\"" r private val PUSH_REGEX = " *[0-9]+ push +([0-9]+|'(?:[^'\\n]|\\\\n|\\\\\\\\)')" r private val PRTS_REGEX = " *[0-9]+ prts" r private val PRTI_REGEX = " *[0-9]+ prti" r private val PRTC_REGEX = " *[0-9]+ prtc" r private val HALT_REGEX = " *[0-9]+ halt" r private val STORE_REGEX = " *[0-9]+ store +\\[([0-9]+)\\]" r private val FETCH_REGEX = " *[0-9]+ fetch +\\[([0-9]+)\\]" r private val LT_REGEX = " *[0-9]+ lt" r private val GT_REGEX = " *[0-9]+ gt" r private val LE_REGEX = " *[0-9]+ le" r private val GE_REGEX = " *[0-9]+ ge" r private val NE_REGEX = " *[0-9]+ ne" r private val EQ_REGEX = " *[0-9]+ eq" r private val JZ_REGEX = " *[0-9]+ jz +\\((-?[0-9]+)\\) [0-9]+" r private val ADD_REGEX = " *[0-9]+ add" r private val SUB_REGEX = " *[0-9]+ sub" r private val MUL_REGEX = " *[0-9]+ mul" r private val DIV_REGEX = " *[0-9]+ div" r private val MOD_REGEX = " *[0-9]+ mod" r private val AND_REGEX = " *[0-9]+ and" r private val OR_REGEX = " *[0-9]+ or" r private val NOT_REGEX = " *[0-9]+ not" r private val NEG_REGEX = " *[0-9]+ neg" r private val JMP_REGEX = " *[0-9]+ jmp +\\((-?[0-9]+)\\) [0-9]+" r
def fromStdin = fromReader(Console.in)
def fromFile(file: String) = fromReader(new FileReader(file))
def fromString(src: String) = fromReader(new StringReader(src))
def fromReader(r: Reader) = { val in = new BufferedReader(r) val vm = in.readLine match { case HEADER_REGEX(datasize, stringsize) => val strings = for (_ <- 1 to stringsize.toInt) yield in.readLine match { case STRING_REGEX(s) => unescape(s) case null => sys.error("expected string constant but encountered end of input") case s => sys.error(s"expected string constant: $s") } var line: String = null val code = new ArrayBuffer[Byte]
def addShort(a: Int) = { code += (a >> 8).toByte code += a.toByte }
def addInstIntOperand(opcode: Byte, operand: Int) = { code += opcode addShort(operand >> 16) addShort(operand) }
def addInst(opcode: Byte, operand: String) = addInstIntOperand(opcode, operand.toInt)
while ({ line = in.readLine; line ne null }) line match { case PUSH_REGEX(n) if n startsWith "'" => addInstIntOperand(PUSH, unescape(n.substring(1, n.length - 1)).head) case PUSH_REGEX(n) => addInst(PUSH, n) case PRTS_REGEX() => code += PRTS case PRTI_REGEX() => code += PRTI case PRTC_REGEX() => code += PRTC case HALT_REGEX() => code += HALT case STORE_REGEX(idx) => addInst(STORE, idx) case FETCH_REGEX(idx) => addInst(FETCH, idx) case LT_REGEX() => code += LT case GT_REGEX() => code += GT case LE_REGEX() => code += LE case GE_REGEX() => code += GE case NE_REGEX() => code += NE case EQ_REGEX() => code += EQ case JZ_REGEX(disp) => addInst(JZ, disp) case ADD_REGEX() => code += ADD case SUB_REGEX() => code += SUB case MUL_REGEX() => code += MUL case DIV_REGEX() => code += DIV case MOD_REGEX() => code += MOD case AND_REGEX() => code += AND case OR_REGEX() => code += OR case NOT_REGEX() => code += NOT case NEG_REGEX() => code += NEG case JMP_REGEX(disp) => addInst(JMP, disp) }
new VirtualMachine(code, datasize.toInt, strings) case _ => sys.error("expected header") }
in.close vm }
}
class VirtualMachine(code: IndexedSeq[Byte], datasize: Int, strings: IndexedSeq[String]) {
import VirtualMachine.Opcodes._
var pc = 0 val stack = new mutable.ArrayStack[Int] val data = new Array[Int](datasize) var running = false
def getByte = { val byte = code(pc) & 0xFF
pc += 1 byte }
def getShort = getByte << 8 | getByte
def getInt = getShort << 16 | getShort
def pushBoolean(b: Boolean) = stack push (if (b) 1 else 0)
def popBoolean = if (stack.pop != 0) true else false
def operator(f: (Int, Int) => Int) = { val y = stack.pop
stack.push(f(stack.pop, y)) }
def relation(r: (Int, Int) => Boolean) = { val y = stack.pop
pushBoolean(r(stack.pop, y)) }
def connective(c: (Boolean, Boolean) => Boolean) = pushBoolean(c(popBoolean, popBoolean))
def execute: Unit = getByte match { case FETCH => stack push data(getInt) case STORE => data(getInt) = stack.pop case PUSH => stack push getInt case JMP => pc = pc + getInt case JZ => if (stack.pop == 0) pc = pc + getInt else pc += 4 case ADD => operator(_ + _) case SUB => operator(_ - _) case MUL => operator(_ * _) case DIV => operator(_ / _) case MOD => operator(_ % _) case LT => relation(_ < _) case GT => relation(_ > _) case LE => relation(_ <= _) case GE => relation(_ >= _) case EQ => relation(_ == _) case NE => relation(_ != _) case AND => connective(_ && _) case OR => connective(_ || _) case NEG => stack push -stack.pop case NOT => pushBoolean(!popBoolean) case PRTC => print(stack.pop.toChar) case PRTI => print(stack.pop) case PRTS => print(strings(stack.pop)) case HALT => running = false }
def run = { pc = 0 stack.clear running = true
for (i <- data.indices) data(i) = 0
while (running) execute }
} </lang>
The above code depends on the function unescape() to perform string escape sequence translation. That function is defined in the following separate source file.
<lang scala> package xyz.hyperreal
import java.io.ByteArrayOutputStream
package object rosettacodeCompiler {
val escapes = "\\\\b|\\\\f|\\\\t|\\\\r|\\\\n|\\\\\\\\|\\\\\"" r
def unescape(s: String) = escapes.replaceAllIn(s, _.matched match { case "\\b" => "\b" case "\\f" => "\f" case "\\t" => "\t" case "\\r" => "\r" case "\\n" => "\n" case "\\\\" => "\\" case "\\\"" => "\"" })
def capture(thunk: => Unit) = { val buf = new ByteArrayOutputStream
Console.withOut(buf)(thunk) buf.toString }
} </lang>
Scheme
The interpreter uses recursion, representing the stack as a list; the stack pointer is the reference to the top of the list. This is a more natural solution in Scheme than a fixed stack array, and removes the danger of stack overflow. Operations on or returning booleans have been adapted to use integers, 0 for false and anything else for true.
All of the "Compiler/Sample programs" are correctly interpreted.
<lang scheme> (import (scheme base)
(scheme char) (scheme file) (scheme process-context) (scheme write) (only (srfi 13) string-contains string-delete string-filter string-replace string-tokenize))
(define *word-size* 4)
- Mappings from operation symbols to internal procedures.
- We define operations appropriate to virtual machine
- e.g. division must return an int, not a rational
- boolean values are treated as numbers
- 0 is false, other is true
(define *unary-ops*
(list (cons 'neg (lambda (a) (- a))) (cons 'not (lambda (a) (if (zero? a) 1 0)))))
(define *binary-ops*
(let ((number-comp (lambda (op) (lambda (a b) (if (op a b) 1 0))))) (list (cons 'add +) (cons 'sub -) (cons 'mul *) (cons 'div (lambda (a b) (truncate (/ a b)))) ; int division (cons 'mod modulo) (cons 'lt (number-comp <)) (cons 'gt (number-comp >)) (cons 'le (number-comp <=)) (cons 'ge (number-comp >=)) (cons 'eq (lambda (a b) (if (= a b) 1 0))) (cons 'ne (lambda (a b) (if (= a b) 0 1))) (cons 'and (lambda (a b) ; make "and" work on numbers (if (and (not (zero? a)) (not (zero? b))) 1 0))) (cons 'or (lambda (a b) ; make "or" work on numbers (if (or (not (zero? a)) (not (zero? b))) 1 0))))))
- read information from file, returning vectors for data and strings
- and a list of the code instructions
(define (read-code filename)
(define (setup-definitions str) (values ; return vectors for (data strings) of required size (make-vector (string->number (list-ref str 1)) #f) (make-vector (string->number (list-ref str 3)) #f))) (define (read-strings strings) ; read constant strings into data structure (define (replace-newlines chars) ; replace newlines, obeying \\n (cond ((< (length chars) 2) ; finished list chars) ((and (>= (length chars) 3) ; preserve \\n (char=? #\\ (car chars)) (char=? #\\ (cadr chars)) (char=? #\n (cadr (cdr chars)))) (cons (car chars) (cons (cadr chars) (cons (cadr (cdr chars)) (replace-newlines (cdr (cdr (cdr chars)))))))) ((and (char=? #\\ (car chars)) ; replace \n with newline (char=? #\n (cadr chars))) (cons #\newline (replace-newlines (cdr (cdr chars))))) (else ; keep char and look further (cons (car chars) (replace-newlines (cdr chars)))))) (define (tidy-string str) ; remove quotes, map newlines to actual newlines (list->string (replace-newlines (string->list (string-delete #\" str))))) ; " (needed to satisfy rosettacode's scheme syntax highlighter) ; (do ((i 0 (+ i 1))) ((= i (vector-length strings)) ) (vector-set! strings i (tidy-string (read-line))))) (define (read-code) (define (cleanup-code opn) ; tidy instructions, parsing numbers (let ((addr (string->number (car opn))) (instr (string->symbol (cadr opn)))) (cond ((= 2 (length opn)) (list addr instr)) ((= 3 (length opn)) (list addr instr (string->number (string-filter char-numeric? (list-ref opn 2))))) (else ; assume length 4, jump instructions (list addr instr (string->number (list-ref opn 3))))))) ; (let loop ((result '())) (let ((line (read-line))) (if (eof-object? line) (reverse (map cleanup-code result)) (loop (cons (string-tokenize line) result)))))) ; (with-input-from-file filename (lambda () (let-values (((data strings) (setup-definitions (string-tokenize (read-line))))) (read-strings strings) (values data strings (read-code))))))
- run the virtual machine
(define (run-program data strings code)
(define (get-instruction n) (if (assq n code) (cdr (assq n code)) (error "Could not find instruction"))) ; (let loop ((stack '()) (pc 0)) (let ((op (get-instruction pc))) (case (car op) ((fetch) (loop (cons (vector-ref data (cadr op)) stack) (+ pc 1 *word-size*))) ((store) (vector-set! data (cadr op) (car stack)) (loop (cdr stack) (+ pc 1 *word-size*))) ((push) (loop (cons (cadr op) stack) (+ pc 1 *word-size*))) ((add sub mul div mod lt gt le eq ne and or) (let ((instr (assq (car op) *binary-ops*))) (if instr (loop (cons ((cdr instr) (cadr stack) ; replace top two with result (car stack)) (cdr (cdr stack))) (+ pc 1)) (error "Unknown binary operation")))) ((neg not) (let ((instr (assq (car op) *unary-ops*))) (if instr (loop (cons ((cdr instr) (car stack)) ; replace top with result (cdr stack)) (+ pc 1)) (error "Unknown unary operation")))) ((jmp) (loop stack (cadr op))) ((jz) (loop (cdr stack) (if (zero? (car stack)) (cadr op) (+ pc 1 *word-size*)))) ((prtc) (display (integer->char (car stack))) (loop (cdr stack) (+ pc 1))) ((prti) (display (car stack)) (loop (cdr stack) (+ pc 1))) ((prts) (display (vector-ref strings (car stack))) (loop (cdr stack) (+ pc 1))) ((halt) #t)))))
- create and run virtual machine from filename passed on command line
(if (= 2 (length (command-line)))
(let-values (((data strings code) (read-code (cadr (command-line))))) (run-program data strings code)) (display "Error: pass a .asm filename\n"))
</lang>
Wren
<lang ecmascript>import "/dynamic" for Enum import "/crypto" for Bytes import "/fmt" for Conv import "/ioutil" for FileUtil
var codes = [
"fetch", "store", "push", "add", "sub", "mul", "div", "mod", "lt", "gt", "le", "ge", "eq", "ne", "and", "or", "neg", "not", "jmp", "jz", "prtc", "prts", "prti", "halt"
]
var Code = Enum.create("Code", codes)
var codeMap = {
"fetch": Code.fetch, "store": Code.store, "push": Code.push, "add": Code.add, "sub": Code.sub, "mul": Code.mul, "div": Code.div, "mod": Code.mod, "lt": Code.lt, "gt": Code.gt, "le": Code.le, "ge": Code.ge, "eq": Code.eq, "ne": Code.ne, "and": Code.and, "or": Code.or, "neg": Code.neg, "not": Code.not, "jmp": Code.jmp, "jz": Code.jz, "prtc": Code.prtc, "prts": Code.prts, "prti": Code.prti, "halt": Code.halt
}
var object = [] var stringPool = []
var reportError = Fn.new { |msg| Fiber.abort("error : %(msg)") }
var emitByte = Fn.new { |c| object.add(c) }
var emitWord = Fn.new { |n|
var bs = Bytes.fromIntLE(n) for (b in bs) emitByte.call(b)
}
// Converts the 4 bytes starting at object[pc] to an unsigned 32 bit integer // and thence to a signed 32 bit integer var toInt32LE = Fn.new { |pc|
var x = Bytes.toIntLE(object[pc...pc+4]) if (x >= 2.pow(31)) x = x - 2.pow(32) return x
}
/* Virtual Machine interpreter */
var runVM = Fn.new { |dataSize|
var stack = List.filled(dataSize + 1, 0) var pc = 0 while (true) { var op = object[pc] pc = pc + 1 if (op == Code.fetch) { var x = toInt32LE.call(pc) stack.add(stack[x]) pc = pc + 4 } else if (op == Code.store) { var x = toInt32LE.call(pc) var ln = stack.count stack[x] = stack[ln-1] stack = stack[0...ln-1] pc = pc + 4 } else if (op == Code.push) { var x = toInt32LE.call(pc) stack.add(x) pc = pc + 4 } else if (op == Code.add) { var ln = stack.count stack[ln-2] = stack[ln-2] + stack[ln-1] stack = stack[0...ln-1] } else if (op == Code.sub) { var ln = stack.count stack[ln-2] = stack[ln-2] - stack[ln-1] stack = stack[0...ln-1] } else if (op == Code.mul) { var ln = stack.count stack[ln-2] = stack[ln-2] * stack[ln-1] stack = stack[0...ln-1] } else if (op == Code.div) { var ln = stack.count stack[ln-2] = (stack[ln-2] / stack[ln-1]).truncate stack = stack[0...ln-1] } else if (op == Code.mod) { var ln = stack.count stack[ln-2] = stack[ln-2] % stack[ln-1] stack = stack[0...ln-1] } else if (op == Code.lt) { var ln = stack.count stack[ln-2] = Conv.btoi(stack[ln-2] < stack[ln-1]) stack = stack[0...ln-1] } else if (op == Code.gt) { var ln = stack.count stack[ln-2] = Conv.btoi(stack[ln-2] > stack[ln-1]) stack = stack[0...ln-1] } else if (op == Code.le) { var ln = stack.count stack[ln-2] = Conv.btoi(stack[ln-2] <= stack[ln-1]) stack = stack[0...ln-1] } else if (op == Code.ge) { var ln = stack.count stack[ln-2] = Conv.btoi(stack[ln-2] >= stack[ln-1]) stack = stack[0...ln-1] } else if (op == Code.eq) { var ln = stack.count stack[ln-2] = Conv.btoi(stack[ln-2] == stack[ln-1]) stack = stack[0...ln-1] } else if (op == Code.ne) { var ln = stack.count stack[ln-2] = Conv.btoi(stack[ln-2] != stack[ln-1]) stack = stack[0...ln-1] } else if (op == Code.and) { var ln = stack.count stack[ln-2] = Conv.btoi(Conv.itob(stack[ln-2]) && Conv.itob(stack[ln-1])) stack = stack[0...ln-1] } else if (op == Code.or) { var ln = stack.count stack[ln-2] = Conv.btoi(Conv.itob(stack[ln-2]) || Conv.itob(stack[ln-1])) stack = stack[0...ln-1] } else if (op == Code.neg) { var ln = stack.count stack[ln-1] = -stack[ln-1] } else if (op == Code.not) { var ln = stack.count stack[ln-1] = Conv.btoi(!Conv.itob(stack[ln-1])) } else if (op == Code.jmp) { var x = toInt32LE.call(pc) pc = pc + x } else if (op == Code.jz) { var ln = stack.count var v = stack[ln-1] stack = stack[0...ln-1] if (v != 0) { pc = pc + 4 } else { var x = toInt32LE.call(pc) pc = pc + x } } else if (op == Code.prtc) { var ln = stack.count System.write(String.fromByte(stack[ln-1])) stack = stack[0...ln-1] } else if (op == Code.prts) { var ln = stack.count System.write(stringPool[stack[ln-1]]) stack = stack[0...ln-1] } else if (op == Code.prti) { var ln = stack.count System.write(stack[ln-1]) stack = stack[0...ln-1] } else if (op == Code.halt) { return } else { reportError.call("Unknown opcode %(op)") } }
}
var translate = Fn.new { |s|
var d = "" var i = 0 while (i < s.count) { if (s[i] == "\\" && (i+1) < s.count) { if (s[i+1] == "n") { d = d + "\n" i = i + 1 } else if (s[i+1] == "\\") { d = d + "\\" i = i + 1 } } else { d = d + s[i] } i = i + 1 } return d
}
var lines = [] var lineCount = 0 var lineNum = 0
var loadCode = Fn.new {
var dataSize var firstLine = true while (lineNum < lineCount) { var line = lines[lineNum].trimEnd(" \t") lineNum = lineNum + 1 if (line.count == 0) { if (firstLine) { reportError.call("empty line") } else { break } } var lineList = line.split(" ").where { |s| s != "" }.toList if (firstLine) { dataSize = Num.fromString(lineList[1]) var nStrings = Num.fromString(lineList[3]) for (i in 0...nStrings) { var s = lines[lineNum].trim("\"\n") lineNum = lineNum + 1 stringPool.add(translate.call(s)) } firstLine = false continue } var offset = Num.fromString(lineList[0]) var instr = lineList[1] var opCode = codeMap[instr] if (!opCode) { reportError.call("Unknown instruction %(instr) at %(opCode)") } emitByte.call(opCode) if (opCode == Code.jmp || opCode == Code.jz) { var p = Num.fromString(lineList[3]) emitWord.call(p - offset - 1) } else if (opCode == Code.push) { var value = Num.fromString(lineList[2]) emitWord.call(value) } else if (opCode == Code.fetch || opCode == Code.store) { var value = Num.fromString(lineList[2].trim("[]")) emitWord.call(value) } } return dataSize
}
lines = FileUtil.readLines("codegen.txt") lineCount = lines.count runVM.call(loadCode.call())</lang>
- Output:
count is: 1 count is: 2 count is: 3 count is: 4 count is: 5 count is: 6 count is: 7 count is: 8 count is: 9
Zig
<lang zig> const std = @import("std");
pub const VirtualMachineError = error{OutOfMemory};
pub const VirtualMachine = struct {
allocator: std.mem.Allocator, stack: [stack_size]i32, program: std.ArrayList(u8), sp: usize, // stack pointer pc: usize, // program counter string_pool: std.ArrayList([]const u8), // all the strings in the program globals: std.ArrayList(i32), // all the variables in the program, they are global output: std.ArrayList(u8), // Instead of outputting to stdout, we do it here for better testing.
const Self = @This(); const stack_size = 32; // Can be arbitrarily increased/decreased as long as we have enough. const word_size = @sizeOf(i32);
pub fn init( allocator: std.mem.Allocator, program: std.ArrayList(u8), string_pool: std.ArrayList([]const u8), globals: std.ArrayList(i32), ) Self { return VirtualMachine{ .allocator = allocator, .stack = [_]i32{std.math.maxInt(i32)} ** stack_size, .program = program, .sp = 0, .pc = 0, .string_pool = string_pool, .globals = globals, .output = std.ArrayList(u8).init(allocator), }; }
pub fn interp(self: *Self) VirtualMachineError!void { while (true) : (self.pc += 1) { switch (@intToEnum(Op, self.program.items[self.pc])) { .push => self.push(self.unpackInt()), .store => self.globals.items[@intCast(usize, self.unpackInt())] = self.pop(), .fetch => self.push(self.globals.items[@intCast(usize, self.unpackInt())]), .jmp => self.pc = @intCast(usize, self.unpackInt() - 1), .jz => { if (self.pop() == 0) { // -1 because `while` increases it with every iteration. // This doesn't allow to jump to location 0 because we use `usize` for `pc`, // just arbitrary implementation limitation. self.pc = @intCast(usize, self.unpackInt() - 1); } else { self.pc += word_size; } }, .prts => try self.out("{s}", .{self.string_pool.items[@intCast(usize, self.pop())]}), .prti => try self.out("{d}", .{self.pop()}), .prtc => try self.out("{c}", .{@intCast(u8, self.pop())}), .lt => self.binOp(lt), .le => self.binOp(le), .gt => self.binOp(gt), .ge => self.binOp(ge), .eq => self.binOp(eq), .ne => self.binOp(ne), .add => self.binOp(add), .mul => self.binOp(mul), .sub => self.binOp(sub), .div => self.binOp(div), .mod => self.binOp(mod), .@"and" => self.binOp(@"and"), .@"or" => self.binOp(@"or"), .not => self.push(@boolToInt(self.pop() == 0)), .neg => self.push(-self.pop()), .halt => break, } } }
fn push(self: *Self, n: i32) void { self.sp += 1; self.stack[self.sp] = n; }
fn pop(self: *Self) i32 { std.debug.assert(self.sp != 0); self.sp -= 1; return self.stack[self.sp + 1]; }
fn unpackInt(self: *Self) i32 { const arg_ptr = @ptrCast(*[4]u8, self.program.items[self.pc + 1 .. self.pc + 1 + word_size]); self.pc += word_size; var arg_array = arg_ptr.*; const arg = @ptrCast(*i32, @alignCast(@alignOf(i32), &arg_array)); return arg.*; }
pub fn out(self: *Self, comptime format: []const u8, args: anytype) VirtualMachineError!void { try self.output.writer().print(format, args); }
fn binOp(self: *Self, func: fn (a: i32, b: i32) i32) void { const a = self.pop(); const b = self.pop(); // Note that arguments are in reversed order because this is how we interact with // push/pop operations of the stack. const result = func(b, a); self.push(result); }
fn lt(a: i32, b: i32) i32 { return @boolToInt(a < b); } fn le(a: i32, b: i32) i32 { return @boolToInt(a <= b); } fn gt(a: i32, b: i32) i32 { return @boolToInt(a > b); } fn ge(a: i32, b: i32) i32 { return @boolToInt(a >= b); } fn eq(a: i32, b: i32) i32 { return @boolToInt(a == b); } fn ne(a: i32, b: i32) i32 { return @boolToInt(a != b); } fn add(a: i32, b: i32) i32 { return a + b; } fn sub(a: i32, b: i32) i32 { return a - b; } fn mul(a: i32, b: i32) i32 { return a * b; } fn div(a: i32, b: i32) i32 { return @divTrunc(a, b); } fn mod(a: i32, b: i32) i32 { return @mod(a, b); } fn @"or"(a: i32, b: i32) i32 { return @boolToInt((a != 0) or (b != 0)); } fn @"and"(a: i32, b: i32) i32 { return @boolToInt((a != 0) and (b != 0)); }
};
pub fn main() !void {
var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator); defer arena.deinit(); const allocator = arena.allocator();
var arg_it = std.process.args(); _ = try arg_it.next(allocator) orelse unreachable; // program name const file_name = arg_it.next(allocator); // We accept both files and standard input. var file_handle = blk: { if (file_name) |file_name_delimited| { const fname: []const u8 = try file_name_delimited; break :blk try std.fs.cwd().openFile(fname, .{}); } else { break :blk std.io.getStdIn(); } }; defer file_handle.close(); const input_content = try file_handle.readToEndAlloc(allocator, std.math.maxInt(usize));
var string_pool = std.ArrayList([]const u8).init(allocator); var globals = std.ArrayList(i32).init(allocator); const bytecode = try loadBytecode(allocator, input_content, &string_pool, &globals); var vm = VirtualMachine.init(allocator, bytecode, string_pool, globals); try vm.interp(); const result: []const u8 = vm.output.items; _ = try std.io.getStdOut().write(result);
}
pub const Op = enum(u8) {
fetch, store, push, add, sub, mul, div, mod, lt, gt, le, ge, eq, ne, @"and", @"or", neg, not, jmp, jz, prtc, prts, prti, halt,
const from_string = std.ComptimeStringMap(Op, .{ .{ "fetch", .fetch }, .{ "store", .store }, .{ "push", .push }, .{ "add", .add }, .{ "sub", .sub }, .{ "mul", .mul }, .{ "div", .div }, .{ "mod", .mod }, .{ "lt", .lt }, .{ "gt", .gt }, .{ "le", .le }, .{ "ge", .ge }, .{ "eq", .eq }, .{ "ne", .ne }, .{ "and", .@"and" }, .{ "or", .@"or" }, .{ "neg", .neg }, .{ "not", .not }, .{ "jmp", .jmp }, .{ "jz", .jz }, .{ "prtc", .prtc }, .{ "prts", .prts }, .{ "prti", .prti }, .{ "halt", .halt }, });
pub fn fromString(str: []const u8) Op { return from_string.get(str).?; }
};
// 100 lines of code to load serialized bytecode, eh fn loadBytecode(
allocator: std.mem.Allocator, str: []const u8, string_pool: *std.ArrayList([]const u8), globals: *std.ArrayList(i32),
) !std.ArrayList(u8) {
var result = std.ArrayList(u8).init(allocator); var line_it = std.mem.split(u8, str, "\n"); while (line_it.next()) |line| { if (std.mem.indexOf(u8, line, "halt")) |_| { var tok_it = std.mem.tokenize(u8, line, " "); const size = try std.fmt.parseInt(usize, tok_it.next().?, 10); try result.resize(size + 1); break; } }
line_it.index = 0; const first_line = line_it.next().?; const strings_index = std.mem.indexOf(u8, first_line, " Strings: ").?; const globals_size = try std.fmt.parseInt(usize, first_line["Datasize: ".len..strings_index], 10); const string_pool_size = try std.fmt.parseInt(usize, first_line[strings_index + " Strings: ".len ..], 10); try globals.resize(globals_size); try string_pool.ensureTotalCapacity(string_pool_size); var string_cnt: usize = 0; while (string_cnt < string_pool_size) : (string_cnt += 1) { const line = line_it.next().?; var program_string = try std.ArrayList(u8).initCapacity(allocator, line.len); var escaped = false; // Skip double quotes for (line[1 .. line.len - 1]) |ch| { if (escaped) { escaped = false; switch (ch) { '\\' => try program_string.append('\\'), 'n' => try program_string.append('\n'), else => { std.debug.print("unknown escape sequence: {c}\n", .{ch}); std.os.exit(1); }, } } else { switch (ch) { '\\' => escaped = true, else => try program_string.append(ch), } } } try string_pool.append(program_string.items); } while (line_it.next()) |line| { if (line.len == 0) break;
var tok_it = std.mem.tokenize(u8, line, " "); const address = try std.fmt.parseInt(usize, tok_it.next().?, 10); const op = Op.fromString(tok_it.next().?); result.items[address] = @enumToInt(op); switch (op) { .fetch, .store => { const index_bracketed = tok_it.rest(); const index = try std.fmt.parseInt(i32, index_bracketed[1 .. index_bracketed.len - 1], 10); insertInt(&result, address + 1, index); }, .push => { insertInt(&result, address + 1, try std.fmt.parseInt(i32, tok_it.rest(), 10)); }, .jmp, .jz => { _ = tok_it.next(); insertInt(&result, address + 1, try std.fmt.parseInt(i32, tok_it.rest(), 10)); }, else => {}, } } return result;
}
fn insertInt(array: *std.ArrayList(u8), address: usize, n: i32) void {
const word_size = @sizeOf(i32); var i: usize = 0; var n_var = n; var n_bytes = @ptrCast(*[4]u8, &n_var); while (i < word_size) : (i += 1) { array.items[@intCast(usize, address + i)] = n_bytes[@intCast(usize, i)]; }
} </lang>
zkl
File rvm.zkl: <lang zkl>// This is a little endian machine const WORD_SIZE=4; const{ var _n=-1; var[proxy]N=fcn{ _n+=1 } } // enumerator const FETCH=N, STORE=N, PUSH=N, ADD=N, SUB=N, MUL=N, DIV=N, MOD=N,
LT=N, GT=N, LE=N, GE=N, EQ=N, NE=N, AND=N, OR=N, NEG=N, NOT=N, JMP=N, JZ=N, PRTC=N, PRTS=N, PRTI=N, HALT=N;
var [const]
bops=Dictionary(ADD,'+, SUB,'-, MUL,'*, DIV,'/, MOD,'%,
LT,'<, GT,'>, LE,'<=, GE,'>=, NE,'!=, EQ,'==, NE,'!=),
strings=List(); // filled in by the loader
// do a binary op
fcn bop(stack,op){ a,b:=stack.pop(),stack.pop(); stack.append(bops[op](b,a)) }
fcn run_vm(code,stackSz){
stack,pc := List.createLong(stackSz,0), 0; while(True){ op:=code[pc]; pc+=1; switch(op){ case(FETCH){
stack.append(stack[code.toLittleEndian(pc,WORD_SIZE,False)]);
pc+=WORD_SIZE;
} case(STORE){ stack[code.toLittleEndian(pc,WORD_SIZE)]=stack.pop(); pc+=WORD_SIZE; }
case(PUSH){
stack.append(code.toLittleEndian(pc,WORD_SIZE,False)); // signed pc+=WORD_SIZE; } case(ADD,SUB,MUL,DIV,MOD,LT,GT,LE,GE,EQ,NE) { bop(stack,op) } case(AND){ stack[-2] = stack[-2] and stack[-1]; stack.pop() } case(OR) { stack[-2] = stack[-2] or stack[-1]; stack.pop() } case(NEG){ stack[-1] = -stack[-1] } case(NOT){ stack[-1] = not stack[-1] } case(JMP){ pc+=code.toLittleEndian(pc,WORD_SIZE,False); } // signed case(JZ) { if(stack.pop()) pc+=WORD_SIZE; else pc+=code.toLittleEndian(pc,WORD_SIZE,False); } case(PRTC){ } // not implemented case(PRTS){ print(strings[stack.pop()]) } case(PRTI){ print(stack.pop()) } case(HALT){ break } else{ throw(Exception.AssertionError( "Bad op code (%d) @%d".fmt(op,pc-1))) }
} }
}
code:=File(vm.nthArg(0)).read(); // binary code file
// the string table is prepended to the code: // 66,1 byte len,text, no trailing '\0' needed
while(code[0]==66){ // read the string table
sz:=code[1]; strings.append(code[2,sz].text); code.del(0,sz+2);
} run_vm(code,1000);</lang> The binary code file code.bin:
- Output:
$ zkl hexDump code.bin 0: 42 0a 63 6f 75 6e 74 20 | 69 73 3a 20 42 01 0a 02 B.count is: B... 16: 01 00 00 00 01 00 00 00 | 00 00 00 00 00 00 02 0a ................ 32: 00 00 00 08 13 2b 00 00 | 00 02 00 00 00 00 15 00 .....+.......... 48: 00 00 00 00 16 02 01 00 | 00 00 15 00 00 00 00 00 ................ 64: 02 01 00 00 00 03 01 00 | 00 00 00 12 cd ff ff ff ................ 80: 17
- Output:
$ zkl rvm code.bin count is: 1 count is: 2 count is: 3 count is: 4 count is: 5 count is: 6 count is: 7 count is: 8 count is: 9