User:Eriksiers/BBC BASIC detokenizer: Difference between revisions
m (→Exception: changed phrasing; same intent as before) |
m (updated lang tag) |
||
(One intermediate revision by the same user not shown) | |||
Line 1:
This is to convert tokenized [[BBC BASIC for Windows]] files to a human-readable (plain text) format. Note that this will ''probably'' not handle tokenized files from other versions of [[:Category:BBC BASIC|BBC BASIC]] -- specifically, I know that the RISC OS version uses a different format (see [http://bb4w.wikispaces.com/Notes+for+RISC+OS+users this page] and [http://bb4w.wikispaces.com/Format this page] at the BB4W wiki).
Two pages that may be of interest at the BB4W wiki are [http://bb4w.wikispaces.com/Tokeniser Tokeniser] and [http://bb4w.wikispaces.com/Detokeniser Detokeniser], both of which discuss this topic using BBC BASIC code.
I believe that the previous issues with line numbers as the targets of GOTO and the like has been fixed, using a QB translation of code on the above-mentioned Detokeniser page at the BB4W wiki. (View the page history for details, if you're interested.)
The [[:Category:VBA|VBA]] code that generates the below program has been moved to [[User:Eriksiers/BBC BASIC detokenizer/Generator|its own page]].
==Licensing==
All source code on this page is Copyright ©2012 Erik Siers. The sources are dual-licensed under the terms of the following licenses:
* The [http://www.gnu.org/licenses/old-licenses/gpl-2.0.html GNU General Public License (GNU GPL)
* The [http://www.gnu.org/licenses/fdl-1.2.html GNU Free Documentation License (GNU FDL)
The two licenses are not 100% compatible; which license you use, I leave up to you.
===Exception===
Any source code that is generated by the
=Detoken.bas=
This is the code that actually does the work. I chose QBasic for a number of reasons, the most important of which are:
Line 100 ⟶ 23:
* ease of including in [[:Category:Batch File|batch files]]
Note that since I expect the code to run under QBasic, you don't specify the file(s) to be detokenized on the command line, but instead do it one file at a time, via the <
<
DATA 11,
DATA 133,
DATA 144,
DATA 154,
DATA 164,
DATA 174,
DATA 184,
DATA 194,
DATA 204,
DATA 214,
DATA 224,
DATA 234,
DATA 244,
DATA 254,
DIM
DIM tmp AS LONG, lineNumDec AS LONG
DIM keywords(255) AS STRING
DO
Line 130 ⟶ 54:
'skip first char
GET #2, , encoded
GET #2, ,
SEEK #2, SEEK(2) - 2
lineNum = (lineNum AND &HFFFF&)
IF (CHR$(0) = encoded) AND (&HFFFF& = lineNum) THEN
CLOSE : SYSTEM
ELSEIF lineNum <> 0 THEN
PRINT #3, LTRIM$(STR$(lineNum)); " ";
END IF
DO UNTIL EOF(2)
Line 147 ⟶ 73:
IF NOT EOF(2) THEN
GET #2, , encoded
GET #2, ,
lineNum = (lineNum AND
ELSEIF lineNum <> 0 THEN
PRINT #3, LTRIM$(STR$(lineNum)); " ";
END IF
END IF
CASE &H8D
'deal with encoded line numbers (GOTO targets)
IF NOT EOF(2) THEN
GET #2, , encoded
IF NOT EOF(2) THEN
GET #2, , lineNum
SEEK #2, SEEK(2) - 2
lineNum = (lineNum AND &HFFFF&)
ELSE
'should NEVER get here
lineNum = 0
END IF
lo& = ((ASC(encoded) * 4&) AND &HC0) XOR (lineNum AND 255&) 'LEFT SHIFT ASC(encoded), 2
hi& = ((ASC(encoded) * 16&) AND &HC0) XOR ((lineNum \ 256&) AND 255&) 'LEFT SHIFT ASC(encoded), 4
tmp = lo& + (hi& * 256&)
PRINT #3, LTRIM$(RTRIM$(STR$(tmp)));
END IF
CASE ELSE
Line 163 ⟶ 108:
LOOP
CLOSE
SYSTEM</
|
Latest revision as of 04:55, 1 September 2022
This is to convert tokenized BBC BASIC for Windows files to a human-readable (plain text) format. Note that this will probably not handle tokenized files from other versions of BBC BASIC -- specifically, I know that the RISC OS version uses a different format (see this page and this page at the BB4W wiki).
Two pages that may be of interest at the BB4W wiki are Tokeniser and Detokeniser, both of which discuss this topic using BBC BASIC code.
I believe that the previous issues with line numbers as the targets of GOTO and the like has been fixed, using a QB translation of code on the above-mentioned Detokeniser page at the BB4W wiki. (View the page history for details, if you're interested.)
The VBA code that generates the below program has been moved to its own page.
Licensing
All source code on this page is Copyright ©2012 Erik Siers. The sources are dual-licensed under the terms of the following licenses:
- The GNU General Public License (GNU GPL) v2 or, at your option, any later version.
- The GNU Free Documentation License (GNU FDL) v1.2 or, at your option, any later version.
The two licenses are not 100% compatible; which license you use, I leave up to you.
Exception
Any source code that is generated by the generated QBasic program (only) is specifically excluded from the above licensing terms, and is instead covered by whatever license(s) cover the original, tokenized sources.
Detoken.bas
This is the code that actually does the work. I chose QBasic for a number of reasons, the most important of which are:
- interpreter (no compile time)
- simple
- ease of including in batch files
Note that since I expect the code to run under QBasic, you don't specify the file(s) to be detokenized on the command line, but instead do it one file at a time, via the DETOKEN
environment variable. (This can easily be changed, but I don't care to.)
DATA 1,CIRCLE,2,ELLIPSE,3,FILLED,4,MOUSE,5,ORIGIN,6,QUIT,7,RECTANGLE,8,SWAP,9,SYS,10,TINT
DATA 11,WAIT,12,INSTALL,14,PRIVATE,15,BY,16,EXIT,128,AND,129,DIV,130,EOR,131,MOD,132,OR
DATA 133,ERROR,134,LINE,135,OFF,136,STEP,137,SPC,138,TAB(,139,ELSE,140,THEN,142,OPENIN,143,PTR
DATA 144,PAGE,145,TIME,146,LOMEM,147,HIMEM,148,ABS,149,ACS,150,ADVAL,151,ASC,152,ASN,153,ATN
DATA 154,BGET,155,COS,156,COUNT,157,DEG,158,ERL,159,ERR,160,EVAL,161,EXP,162,EXT,163,FALSE
DATA 164,FN,165,GET,166,INKEY,167,INSTR(,168,INT,169,LEN,170,LN,171,LOG,172,NOT,173,OPENUP
DATA 174,OPENOUT,175,PI,176,POINT,177,POS,178,RAD,179,RND,180,SGN,181,SIN,182,SQR,183,TAN
DATA 184,TO,185,TRUE,186,USR,187,VAL,188,RND,189,CHR$,190,GET$,191,INKEY$,192,LEFT$,193,MID$(
DATA 194,RIGHT$,195,STR$,196,STRING,197,EOF,198,SUM,199,WHILE,200,CASE,201,WHEN,202,OF,203,ENDCASE
DATA 204,OTHERWISE,205,ENDIF,206,ENDWHILE,207,PTR,208,PAGE,209,TIME,210,LOMEM,211,HIMEM,212,SOUND,213,BPUT
DATA 214,CALL,215,CHAIN,216,CLEAR,217,CLOSE,218,CLG,219,CLS,220,DATA,221,DEF,222,DIM,223,DRAW
DATA 224,END,225,ENDPROC,226,ENVELOPE,227,FOR,228,GOSUB,229,GOTO,230,GCOL,231,IF,232,INPUT,233,LET
DATA 234,LOCAL,235,MODE,236,MOVE,237,NEXT,238,ON,239,VDU,240,PLOT,241,PRINT,242,PROC,243,READ
DATA 244,REM,245,REPEAT,246,REPORT,247,RESTORE,248,RETURN,249,RUN,250,STOP,251,COLOUR,252,TRACE,253,UNTIL
DATA 254,WIDTH,255,OSCLI,0
DIM lineNum AS LONG, encoded AS STRING * 1
DIM tmp AS LONG, lineNumDec AS LONG
DIM keywords(255) AS STRING
DO
READ t%
IF 0 = t% THEN EXIT DO
READ keywords(t%)
LOOP
nm$ = ENVIRON$("DETOKEN")
OPEN nm$ + ".bbc" FOR BINARY AS 2
OPEN nm$ + ".bas" FOR OUTPUT AS 3
'skip first char
GET #2, , encoded
GET #2, , lineNum
SEEK #2, SEEK(2) - 2
lineNum = (lineNum AND &HFFFF&)
IF (CHR$(0) = encoded) AND (&HFFFF& = lineNum) THEN
CLOSE : SYSTEM
ELSEIF lineNum <> 0 THEN
PRINT #3, LTRIM$(STR$(lineNum)); " ";
END IF
DO UNTIL EOF(2)
GET #2, , encoded
i% = ASC(encoded)
SELECT CASE i%
CASE 0
'do nothing
CASE 13
PRINT #3,
'skip first char of next line
IF NOT EOF(2) THEN
GET #2, , encoded
GET #2, , lineNum
SEEK #2, SEEK(2) - 2
lineNum = (lineNum AND &HFFFF&)
IF (CHR$(0) = encoded) AND (&HFFFF& = lineNum) THEN
CLOSE : SYSTEM
ELSEIF lineNum <> 0 THEN
PRINT #3, LTRIM$(STR$(lineNum)); " ";
END IF
END IF
CASE &H8D
'deal with encoded line numbers (GOTO targets)
IF NOT EOF(2) THEN
GET #2, , encoded
IF NOT EOF(2) THEN
GET #2, , lineNum
SEEK #2, SEEK(2) - 2
lineNum = (lineNum AND &HFFFF&)
ELSE
'should NEVER get here
lineNum = 0
END IF
lo& = ((ASC(encoded) * 4&) AND &HC0) XOR (lineNum AND 255&) 'LEFT SHIFT ASC(encoded), 2
hi& = ((ASC(encoded) * 16&) AND &HC0) XOR ((lineNum \ 256&) AND 255&) 'LEFT SHIFT ASC(encoded), 4
tmp = lo& + (hi& * 256&)
PRINT #3, LTRIM$(RTRIM$(STR$(tmp)));
END IF
CASE ELSE
IF keywords(i%) <> "" THEN
PRINT #3, keywords(i%);
ELSE
PRINT #3, encoded;
END IF
END SELECT
LOOP
CLOSE
SYSTEM