User:Eriksiers/to OS

From Rosetta Code
Revision as of 19:28, 12 January 2010 by Eriksiers (talk | contribs) (→‎Shell Script: untested changes)

This is just a handful of programs that translate text file line endings between the conventions used by the systems I use: Unix-like systems (LF), Mac OS (CR), and DOS and Windows (CRLF).
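To check which of the three conventions a file currently uses, something like the following could serve as a rough sketch. The name eol_type is made up for illustration and is not part of the programs on this page. It only counts CR and LF bytes, so a file with equal numbers of lone CRs and lone LFs would be misreported as dos.

```shell
# eol_type FILE -> prints unix, mac, dos, mixed, or none
# Hypothetical helper; guesses from byte counts only.
eol_type() {
    # $(( )) strips the padding some wc implementations add
    cr=$(( $(tr -cd '\r' < "$1" | wc -c) ))
    lf=$(( $(tr -cd '\n' < "$1" | wc -c) ))
    if [ "$cr" -eq 0 ] && [ "$lf" -eq 0 ]; then
        echo none
    elif [ "$cr" -eq 0 ]; then
        echo unix
    elif [ "$lf" -eq 0 ]; then
        echo mac
    elif [ "$cr" -eq "$lf" ]; then
        echo dos      # assumes every CR is paired with an LF
    else
        echo mixed
    fi
}
```

For example, eol_type file.txt would print dos for a file saved by a typical Windows editor.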

FreeBASIC

This is done without using any libraries. (FB can use various regex libs, including PCRE.)

<lang qbasic>'$LANG: "fblite"

OPTION EXPLICIT
OPTION ESCAPE

#INCLUDE "file.bi"

CONST CR$ = "\r"
CONST LF$ = "\n"
CONST CRLF$ = "\r\n"

SUB replacer(fileIN AS STRING, tgtName AS STRING, target AS STRING)

   DIM incoming AS STRING, fileOUT AS STRING, inNum AS LONG, outNum AS LONG
   DIM tmpLng1 AS LONG, tmpLng2 AS LONG
   PRINT fileIN; " -> ";
   inNum = FREEFILE
   OPEN fileIN FOR INPUT AS inNum
   fileOUT = fileIN & "." & tgtName
   outNum = FREEFILE
   OPEN fileOUT FOR OUTPUT AS outNum
       DO UNTIL EOF(inNum)
           LINE INPUT #inNum, incoming
           ' lines are split by FreeBASIC at CRLF or just LF
           ' CR by itself is not considered EOL
           tmpLng2 = 1
           DO
               tmpLng1 = INSTR(tmpLng2, incoming, CR$)
               IF tmpLng1 < 1 THEN
                   EXIT DO
               ELSE
                   tmpLng2 = tmpLng1 + LEN(target)
                   incoming = LEFT$(incoming, tmpLng1 - 1) & target & MID$(incoming, tmpLng1 + 1)
               END IF
           LOOP
           PRINT #outNum, incoming & target;
       LOOP
   CLOSE
   PRINT fileOUT

END SUB


DIM f_IN AS STRING, tmpLng AS LONG, appName AS STRING, tgt AS STRING, nextArg AS LONG

tmpLng = INSTR(COMMAND$(0), ".")
IF tmpLng THEN
   appName = LEFT$(COMMAND$(0), tmpLng - 1)
ELSE
   appName = COMMAND$(0)
END IF

SELECT CASE LCASE$(appName)

   CASE "tonix"
       tgt = LF$
       nextArg = 1
   CASE "tomac"
       tgt = CR$
       nextArg = 1
   CASE "todos", "towin"
       tgt = CRLF$
       nextArg = 1
   CASE ELSE
       appName = LCASE$(COMMAND$(1))
       SELECT CASE appName
           CASE "unix", "posix", "linux", "bsd"
               tgt = LF$
           CASE "win", "dos", "windows"
               tgt = CRLF$
           CASE "mac", "macos", "osx"
               tgt = CR$
           CASE ELSE
               PRINT "This program must be called as one of these:"
               PRINT "- todos"
               PRINT "- tomac"
               PRINT "- tonix"
               PRINT "- towin"
               PRINT "- (anything else), with the first arg specifying the target system:"
               PRINT "  - unix, posix, linux, bsd"
               PRINT "  - win, dos, windows"
               PRINT "  - mac, macos, osx"
               PRINT "  For example, """; COMMAND$(0); " win file.txt"""
               END 1   ' error code
       END SELECT
       nextArg = 2

END SELECT

FOR tmpLng = nextArg TO (__FB_ARGC__ - 1)

   ' if no args, loop doesn't happen (i.e. exits immediately)
   f_IN = DIR$(TRIM$(COMMAND$(tmpLng)))  ' allows for wildcards
   WHILE LEN(f_IN)
       replacer f_IN, appName, tgt
       f_IN = DIR$
   WEND

NEXT</lang>

PowerBASIC

(This is partly a translation of the FreeBASIC version.)

Unlike FreeBASIC (and many other languages), PowerBASIC can't easily discover the name that the program was called as without making an API call, so this version always takes the target system from its first argument. (I just found out that PowerBASIC 9 has a new object called EXE which serves pretty much the same purpose as VB's App object. [shrug] Not gonna change this now.)

Also unlike FreeBASIC, when reading in a line from a text file via LINE INPUT #, only CRLF is recognized as EOL.

<lang powerbasic>$DIM ALL

SUB replacer(fileIN AS STRING, tgtName AS STRING, target AS STRING)

   DIM incoming AS STRING, fileOUT AS STRING, inNum AS LONG, outNum AS LONG
   DIM tmpLng1 AS LONG, tmpLng2 AS LONG

$IF %DEF(%PB_CC32)
   PRINT fileIN; " -> ";
$ENDIF

   inNum = FREEFILE
   OPEN fileIN FOR INPUT AS inNum
   fileOUT = fileIN & "." & tgtName
   outNum = FREEFILE
   OPEN fileOUT FOR OUTPUT AS outNum
       DO UNTIL EOF(inNum)
           LINE INPUT #inNum, incoming
           ' lines are split by PowerBASIC at CRLF
           ' CR or LF by itself is not considered EOL
           ' fold any stray LF into CR first, then turn every CR into the target
           REPLACE $LF WITH $CR IN incoming
           REPLACE $CR WITH target IN incoming
           PRINT #outNum, incoming & target;
       LOOP
   CLOSE

$IF %DEF(%PB_CC32)
   PRINT fileOUT
$ENDIF

END SUB

FUNCTION PBMAIN

   DIM f_IN AS STRING, tmpLng AS LONG, tgt AS STRING, nextArg AS LONG
   REDIM args(PARSECOUNT(COMMAND$, $SPC)) AS STRING
   PARSE COMMAND$, args(), $SPC
   SELECT CASE args(0)
       CASE "unix", "posix", "linux", "bsd"
           tgt = $LF
       CASE "win", "dos", "windows"
           tgt = $CRLF
       CASE "mac", "macos", "osx"
           tgt = $CR
       CASE ELSE
           tgt = "This program's first arg must specify the target system:" & $CRLF & _
                 "- unix, posix, linux, bsd" & $CRLF & "- win, dos, windows" & $CRLF & _
                 "- mac, macos, osx" & $CRLF & "For example, ""toOS win file.txt"""
           ? tgt   ' PRINT under PB/CC, MSGBOX under PB/WIN
           FUNCTION = 1   ' error code
           EXIT FUNCTION
   END SELECT
   FOR tmpLng = 1 TO UBOUND(args)
       f_IN = DIR$(TRIM$(args(tmpLng)))  ' allows for wildcards
       WHILE LEN(f_IN)
           replacer f_IN, args(0), tgt
           f_IN = DIR$
       WEND
   NEXT

$IF %DEF(%PB_WIN32)
   MSGBOX "Done."
$ENDIF

END FUNCTION</lang>

Shell Script

Neither of these is known to work as written. I'm terrible with regexes in general and sed in particular, so these are here to give someone a starting point. (If someone who's better than me at this sort of thing wants to fix them, I wouldn't complain.) I didn't try very hard because pretty much all modern Unix-like distros (including the ones I use) include dos2unix/unix2dos, fromdos/todos, or something similar.

This script decides what to do based on the script's name: source2target. (This is handled by case.) The translation is performed by sed.

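<lang bash># Untested sketch; assumes GNU sed (\r and \n escapes, plus -z).
# sed normally works line by line and never sees the \n itself,
# so the cases that have to match \n use -z to read the file as one record.
case "${0##*/}" in    # compare the name without its directory part
dos2unix)
 sed 's/\r$//' "$1" > "$1.unix"
 ;;
dos2mac)
 sed -z 's/\r\n/\r/g' "$1" > "$1.mac"
 ;;
mac2dos)
 sed 's/\r/\r\n/g' "$1" > "$1.dos"
 ;;
mac2unix)
 sed 's/\r/\n/g' "$1" > "$1.unix"
 ;;
unix2dos)
 sed 's/$/\r/' "$1" > "$1.dos"
 ;;
unix2mac)
 sed -z 's/\n/\r/g' "$1" > "$1.mac"
 ;;
esac</lang>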
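Dispatching on the script's name only works if the name is compared without its directory part, since $0 usually arrives as something like ./dos2unix or /usr/local/bin/dos2unix. A small sketch of that step follows. split_name is a hypothetical helper (not part of the script above) that splits a source2target name into its two halves.

```shell
# split_name PATH -> prints "SOURCE TARGET" for a name of the form source2target
# Hypothetical helper for illustration only.
split_name() {
    name=${1##*/}              # drop any leading directories
    case $name in
        ?*2?*) ;;              # needs something on both sides of the "2"
        *) return 1 ;;
    esac
    # text before the first "2", then text after the first "2"
    echo "${name%%2*} ${name#*2}"
}
```

With one copy of the script and one link per name (for example, ln -s dos2unix unix2mac), split_name "$0" would give the pair of systems to convert between.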

An alternative would be to make this rather generic. The script could be named from, and called like this:

from dos to unix file.txt

...which would write the output to "file.txt.unix".

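<lang bash># Untested sketch; assumes GNU sed: -z reads the whole file as one record so
# that \n can be matched, and -b keeps sed from translating line endings
# itself (this only matters on Windows ports).
case $1 in
dos)
 first='\r\n'
 ;;
mac)
 first='\r'
 ;;
unix)
 first='\n'
 ;;
esac

# the word "to" between the source and the target is optional
if [ "$2" = to ]; then
 shift
fi
target=$2
f=$3

case $target in
dos)
 second='\r\n'
 ;;
mac)
 second='\r'
 ;;
unix)
 second='\n'
 ;;
esac

if [ -z "$first" ] || [ -z "$second" ] || [ -z "$f" ]; then
 echo "usage: from {dos|mac|unix} [to] {dos|mac|unix} file" >&2
 exit 1
fi

sed -b -z "s/$first/$second/g" "$f" > "$f.$target"</lang>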
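The same from/to idea can also be sketched without sed at all, using only tr and awk. This is a rough sketch under a few assumptions: convert_eol is a made-up name, it works as a filter (stdin to stdout) instead of writing "file.txt.unix" itself, and it goes through LF as an intermediate form so that each system needs only one rule in each direction.

```shell
# convert_eol FROM TO: filter from stdin to stdout; FROM and TO are dos, mac, or unix.
# Hypothetical helper; uses only tr and awk.
convert_eol() {
    for os in "$1" "$2"; do
        case $os in
            dos|mac|unix) ;;
            *) echo "unknown system: $os" >&2; return 1 ;;
        esac
    done
    # First normalize the input to LF, then rewrite LF as the target's ending.
    case $1 in
        dos)  tr -d '\r' ;;        # drops every CR, paired or not
        mac)  tr '\r' '\n' ;;
        unix) cat ;;
    esac |
    case $2 in
        dos)  awk '{ printf "%s\r\n", $0 }' ;;
        mac)  tr '\n' '\r' ;;
        unix) cat ;;
    esac
}
```

It could be used as convert_eol dos unix < file.txt > file.txt.unix. Note that the awk step always ends the last line with CRLF, even if the input's last line had no line ending.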