Globally replace text in several files

Revision as of 12:53, 14 October 2012 by Edwin (talk | contribs) (Added Perl 6 solution.)

The task is to replace every occurrence of a piece of text in a group of text files with another piece of text. For this task we want to replace the text "Goodbye London!" with "Hello New York!" in a list of files.

Task
Globally replace text in several files
You are encouraged to solve this task according to the task description, using any language you may know.

Ada

<lang Ada>with Ada.Strings.Unbounded, Ada.Text_IO, Ada.Command_Line, Ada.Directories;

procedure Global_Replace is

  subtype U_String is Ada.Strings.Unbounded.Unbounded_String;
  function "+"(S: String) return U_String renames
    Ada.Strings.Unbounded.To_Unbounded_String;
  function "-"(U: U_String) return String renames
    Ada.Strings.Unbounded.To_String;
  procedure String_Replace(S: in out U_String; Pattern, Replacement: String) is
     -- example: if S is "Mary had a XX lamb", then String_Replace(S, "X", "little");
     --          will turn S into "Mary had a littlelittle lamb"
     --          and String_Replace(S, "Y", "small"); will not change S
     Index : Natural;
  begin
     loop
        Index := Ada.Strings.Unbounded.Index(Source => S, Pattern => Pattern);
        exit when Index = 0;
        Ada.Strings.Unbounded.Replace_Slice
          (Source => S, Low => Index, High => Index+Pattern'Length-1,
           By => Replacement);
     end loop;
  end String_Replace;
  procedure File_Replace(Filename: String; Pattern, Replacement: String) is
     -- applies String_Replace to each line in the file with the given Filename
     -- propagates any exceptions, when, e.g., the file does not exist 
     I_File, O_File: Ada.Text_IO.File_Type;
     Line: U_String;
     Tmp_Name: String := Filename & ".tmp"; 
        -- name of temporary file; if that file already exists, it will be overwritten
  begin
     Ada.Text_IO.Open(I_File, Ada.Text_IO.In_File, Filename);
     Ada.Text_IO.Create(O_File, Ada.Text_IO.Out_File, Tmp_Name);
     while not Ada.Text_IO.End_Of_File(I_File) loop
        Line := +Ada.Text_IO.Get_Line(I_File);
        String_Replace(Line, Pattern, Replacement);
        Ada.Text_IO.Put_Line(O_File, -Line);
     end loop;
     Ada.Text_IO.Close(I_File);
     Ada.Text_IO.Close(O_File);
     Ada.Directories.Delete_File(Filename);
     Ada.Directories.Rename(Old_Name => Tmp_Name, New_Name => Filename);
  end File_Replace;
  Pattern:     String := Ada.Command_Line.Argument(1);
  Replacement: String :=  Ada.Command_Line.Argument(2);

begin

  Ada.Text_IO.Put_Line("Replacing """ & Pattern
                         & """ by """ & Replacement & """ in"
                         & Integer'Image(Ada.Command_Line.Argument_Count - 2)
                         & " files.");
  for I in 3 .. Ada.Command_Line.Argument_Count loop
     File_Replace(Ada.Command_Line.Argument(I), Pattern, Replacement);
  end loop;

end Global_Replace;</lang>

Output:

> ls ?.txt
1.txt  2.txt  x.txt  y.txt

> more 2.txt
This is a text.
"Goodbye London!" 
"Goodbye London!" 
"Byebye London!" "Byebye London!" "Byebye London!" 

> ./global_replace "Goodbye London" "Hello New York" ?.txt
Replacing "Goodbye London" by "Hello New York" in 4 files.

> more 2.txt
This is a text.
"Hello New York!" 
"Hello New York!" 
"Byebye London!" "Byebye London!" "Byebye London!" 

AutoHotkey

<lang AutoHotkey>SetWorkingDir %A_ScriptDir%  ; Change the working directory to the script's location
listFiles := "a.txt|b.txt|c.txt" ; Define a list of files in the current working directory
loop, Parse, listFiles, |
{ ; The above parses the list based on the | character
	fileread, contents, %A_LoopField% ; Read the file
	fileDelete, %A_LoopField%  ; Delete the file
	stringReplace, contents, contents, Goodbye London!, Hello New York!, All ; replace all occurrences
	fileAppend, %contents%, %A_LoopField% ; Re-create the file with new contents
}</lang>

BASIC

Works with: FreeBASIC

Pass the files on the command line (e.g. global-replace *.txt).

<lang qbasic>CONST matchtext = "Goodbye London!"
CONST repltext = "Hello New York!"
CONST matchlen = LEN(matchtext)

DIM L0 AS INTEGER, x AS INTEGER, filespec AS STRING, linein AS STRING

L0 = 1
WHILE LEN(COMMAND$(L0))

   filespec = DIR$(COMMAND$(L0))
   WHILE LEN(filespec)
       OPEN filespec FOR BINARY AS 1
           linein = SPACE$(LOF(1))
           GET #1, 1, linein
           DO
               x = INSTR(linein, matchtext)
               IF x THEN
                   linein = LEFT$(linein, x - 1) & repltext & MID$(linein, x + matchlen)
                   ' If matchtext and repltext are of equal length (as in this example)
                   ' then you can replace the above line with this:
                   ' MID$(linein, x) = repltext
                   ' This is somewhat more efficient than having to rebuild the string.
               ELSE
                   EXIT DO
               END IF
           LOOP
       ' If matchtext and repltext are of equal length (as in this example), or repltext
       ' is longer than matchtext, you could just write back to the file while it's open
       ' in BINARY mode, like so:
       ' PUT #1, 1, linein
       ' But since there's no way to reduce the file size via BINARY and PUT, we do this:
       CLOSE
       OPEN filespec FOR OUTPUT AS 1
           PRINT #1, linein;
       CLOSE
       filespec = DIR$
   WEND
   L0 += 1

WEND</lang>

C

<lang C>#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <err.h>

char * find_match(char *buf, char * buf_end, char *pat, size_t len)
{
	ptrdiff_t i;
	char *start = buf;
	/* "<=" so that a match ending exactly at the end of the buffer is found */
	while (start + len <= buf_end) {
		for (i = 0; i < len; i++)
			if (start[i] != pat[i]) break;

		if (i == len) return start;
		start++;
	}
	return 0;
}

int replace(char *from, char *to, char *fname)
{
#define bail(msg) { warn(msg" '%s'", fname); goto done; }
	struct stat st;
	int ret = 0;
	char *buf = 0, *start, *end;
	size_t len = strlen(from), nlen = strlen(to);
	int fd = open(fname, O_RDWR);

	if (fd == -1) bail("Can't open");
	if (fstat(fd, &st) == -1) bail("Can't stat");
	if (!(buf = malloc(st.st_size))) bail("Can't alloc");
	if (read(fd, buf, st.st_size) != st.st_size) bail("Bad read");

	start = buf;
	end = find_match(start, buf + st.st_size, from, len);
	if (!end) goto done; /* no match found, don't change file */

	ftruncate(fd, 0);
	lseek(fd, 0, 0);
	do {
		write(fd, start, end - start);	/* write content before match */
		write(fd, to, nlen);		/* write replacement of match */
		start = end + len;		/* skip to end of match */
						/* find match again */
		end = find_match(start, buf + st.st_size, from, len);
	} while (end);

	/* write leftover after last match */
	if (start < buf + st.st_size)
		write(fd, start, buf + st.st_size - start);

done:
	if (fd != -1) close(fd);
	if (buf) free(buf);
	return ret;
}

int main()
{
	char *from = "Goodbye, London!";
	char *to   = "Hello, New York!";
	char * files[] = { "test1.txt", "test2.txt", "test3.txt" };
	int i;

	for (i = 0; i < sizeof(files)/sizeof(char*); i++)
		replace(from, to, files[i]);

	return 0;
}</lang>

C++

<lang cpp>#include <fstream>
#include <iterator>
#include <boost/regex.hpp>
#include <string>
#include <iostream>

int main( int argc , char *argv[ ] ) {
   boost::regex to_be_replaced( "Goodbye London\\s*!" ) ;
   std::string replacement( "Hello New York!" ) ;
   for ( int i = 1 ; i < argc ; i++ ) {
      std::ifstream infile ( argv[ i ] ) ;
      if ( infile ) {
         std::string filetext( (std::istreambuf_iterator<char>( infile )) ,
               std::istreambuf_iterator<char>( ) ) ;
         std::string changed ( boost::regex_replace( filetext , to_be_replaced , replacement )) ;
         infile.close( ) ;
         std::ofstream outfile( argv[ i ] , std::ios_base::out | std::ios_base::trunc ) ;
         if ( outfile.is_open( ) ) {
            outfile << changed ;
            outfile.close( ) ;
         }
      }
      else
         std::cout << "Can't find file " << argv[ i ] << " !\n" ;
   }
   return 0 ;
}</lang>

D

Works with: D version 2

<lang d>import std.file, std.array;

void main() {

   auto from = "Goodbye London!", to = "Hello New York!";
   foreach (fn; "a.txt b.txt c.txt".split()) {
       write(fn, replace(cast(string)read(fn), from, to));
   }

}</lang>

Go

<lang go>package main

import (

   "bytes"
   "io/ioutil"
   "log"
   "os"

)

func main() {

   gRepNFiles("Goodbye London!", "Hello New York!", []string{
       "a.txt",
       "b.txt",
       "c.txt",
   })

}

func gRepNFiles(olds, news string, files []string) {

   oldb := []byte(olds)
   newb := []byte(news)
   for _, fn := range files {
       if err := gRepFile(oldb, newb, fn); err != nil {
           log.Println(err)
       }
   }

}

func gRepFile(oldb, newb []byte, fn string) (err error) {

   var f *os.File
   if f, err = os.OpenFile(fn, os.O_RDWR, 0); err != nil {
       return
   }
   defer func() {
       if cErr := f.Close(); err == nil {
           err = cErr
       }
   }()
   var b []byte
   if b, err = ioutil.ReadAll(f); err != nil {
       return
   }
   if bytes.Index(b, oldb) < 0 {
       return
   }
   r := bytes.Replace(b, oldb, newb, -1)
   if err = f.Truncate(0); err != nil {
       return
   }
   _, err = f.WriteAt(r, 0)
   return

}</lang>

Icon and Unicon

This example uses the Unicon stat function. It can be rewritten for Icon to aggregate the file in a reads loop.

<lang Icon>procedure main()
globalrepl("Goodbye London","Hello New York","a.txt","b.txt") # variable args for files
end

procedure globalrepl(old,new,files[])

every fn := !files do

  if s := reads(f := open(fn,"bu"),stat(f).size) then {
     writes(seek(f,1),replace(s,old,new))
     close(f)
     }
  else write(&errout,"Unable to open ",fn)

end

link strings # for replace</lang>

strings.icn provides replace.

J

If files is a variable with the desired list of file names:

<lang j>require'strings'
(1!:2~rplc&('Goodbye London!';'Hello New York!')@(1!:1))"0 files</lang>


Liberty BASIC

<lang lb> nomainwin

file$( 1) ="data1.txt"
file$( 2) ="data2.txt"
file$( 3) ="data3.txt"


for i =1 to 3

   open file$( i) for input as #i
       orig$ =input$( #i, lof( #i))
   close #i
   dummy$ =FindReplace$( orig$, "Goodbye London!", "Hello New York!", 1)
   open "RC" +file$( i) for output as #o
       #o dummy$;
   close #o

next i

end

function FindReplace$( FindReplace$, find$, replace$, replaceAll) ' Target string, string to find, string to replace it with, flag 0/1 for 'replace all occurrences'.

   if ( ( FindReplace$ <>"") and ( find$ <>"") ) then
       fLen =len( find$)
       rLen =len( replace$)
       do
           fPos =instr( FindReplace$, find$, fPos)
           if not( fPos) then exit function
           pre$            =left$( FindReplace$, fPos -1)
           post$           =mid$( FindReplace$, fPos +fLen)
           FindReplace$    =pre$ +replace$ +post$
           fPos            =fPos +( rLen -fLen) +1
       loop while (replaceAll)
   end if

end function </lang>

Lua

<lang lua>filenames = { "f1.txt", "f2.txt" }

for _, fn in pairs( filenames ) do

   fp = io.open( fn, "r" )
   str = fp:read( "*all" )
   str = string.gsub( str, "Goodbye London!", "Hello New York!" )
   fp:close()
   fp = io.open( fn, "w+" )
   fp:write( str )
   fp:close()

end</lang>


OpenEdge/Progress

<lang progress>FUNCTION replaceText RETURNS LOGICAL (

  i_cfile_list   AS CHAR,
  i_cfrom        AS CHAR,
  i_cto          AS CHAR

):

  DEF VAR ii     AS INT.
  DEF VAR lcfile AS LONGCHAR.
  DO ii = 1 TO NUM-ENTRIES( i_cfile_list ):
     COPY-LOB FROM FILE ENTRY( ii, i_cfile_list ) TO lcfile.
     lcfile = REPLACE( lcfile, i_cfrom, i_cto ).
     COPY-LOB FROM lcfile TO FILE ENTRY( ii, i_cfile_list ).
  END.
  

END FUNCTION. /* replaceText */

replaceText(

  "a.txt,b.txt,c.txt",
  "Goodbye London!",
  "Hello New York!"

).</lang>

Pascal

Works with: Free_Pascal

<lang pascal>Program StringReplace;

uses

 Classes, StrUtils;

const

 fileName: array[1..3] of string = ('a.txt', 'b.txt', 'c.txt');
 matchText = 'Goodbye London!';
 replaceText = 'Hello New York!';

var

 AllText: TStringlist;
 i, j: integer;

begin

 for j := low(fileName) to high(fileName) do
 begin
  AllText := TStringlist.Create;
  AllText.LoadFromFile(fileName[j]);
  for i := 0 to AllText.Count-1 do
    AllText.Strings[i] := AnsiReplaceStr(AllText.Strings[i], matchText, replaceText);
  AllText.SaveToFile(fileName[j]);
  AllText.Destroy;
 end;

end.</lang>

Perl

<lang bash>perl -pi -e "s/Goodbye London\!/Hello New York\!/g;" a.txt b.txt c.txt</lang>

Perl 6

Current Perl 6 implementations do not yet support the -i flag for editing files in place, so we roll our own (rather unsafe) version:

<lang perl6>spurt $_, slurp($_).subst('Goodbye London!', 'Hello New York!', :g)

   for <a.txt b.txt c.txt>;</lang>
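The "unsafe" part of such a one-liner is that the original file is overwritten directly, so an interruption in the middle of the write can leave it truncated. A minimal sketch of the usual safer pattern follows, written in Python rather than Perl 6 purely for illustration: write the new contents to a temporary file in the same directory, then rename it over the original. The function name `replace_in_file` is hypothetical, and UTF-8 text files are assumed.

```python
import os
import tempfile


def replace_in_file(path, old, new):
    """Replace every occurrence of old with new in the file at path.

    Returns True if the file was rewritten, False if old did not occur.
    """
    # newline="" keeps the file's own line endings untouched.
    with open(path, encoding="utf-8", newline="") as src:
        text = src.read()
    if old not in text:
        return False  # nothing to replace; leave the file alone

    # Create the temporary file next to the original so the final rename
    # stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as dst:
            dst.write(text.replace(old, new))
        os.replace(tmp_path, path)  # swap the new file in over the original
    except BaseException:
        os.unlink(tmp_path)  # clean up the temporary file on any failure
        raise
    return True


if __name__ == "__main__":
    for name in ("a.txt", "b.txt", "c.txt"):
        if os.path.exists(name):
            replace_in_file(name, "Goodbye London!", "Hello New York!")
```

One limitation of this sketch: the temporary file is created with restrictive permissions, so the rewritten file does not inherit the original's mode or ownership.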

PicoLisp

<lang PicoLisp>(for File '(a.txt b.txt c.txt)

  (call 'mv File (tmp File))
  (out File
     (in (tmp File)
        (while (echo "Goodbye London!")
           (prin "Hello New York!") ) ) ) )</lang>

PowerBASIC

Translation of: BASIC

<lang powerbasic>$matchtext = "Goodbye London!"
$repltext = "Hello New York!"

FUNCTION PBMAIN () AS LONG

   DIM L0 AS INTEGER, filespec AS STRING, linein AS STRING
   L0 = 1
   WHILE LEN(COMMAND$(L0))
       filespec = DIR$(COMMAND$(L0))
       WHILE LEN(filespec)
           OPEN filespec FOR BINARY AS 1
               linein = SPACE$(LOF(1))
               GET #1, 1, linein
               ' No need to jump through FB's hoops here...
               REPLACE $matchtext WITH $repltext IN linein
               PUT #1, 1, linein
               SETEOF #1
           CLOSE
           filespec = DIR$
       WEND
       INCR L0
   WEND

END FUNCTION</lang>

PureBasic

<lang PureBasic>Procedure GRTISF(List File$(), Find$, Replace$)

 Protected Line$, Out$, OutFile$, i
 ForEach File$()
   fsize=FileSize(File$())
   If fsize<=0: Continue: EndIf
   If ReadFile(0, File$())
     i=0
     ;
     ; generate a temporary file in a safe way
     Repeat
       file$=GetTemporaryDirectory()+base$+"_"+Str(i)+".tmp"
       i+1
     Until FileSize(file$)=-1
     i=CreateFile(FileID, file$)
     If i
       ; Copy the infile to the outfile while replacing any needed text
       While Not Eof(0)
         Line$=ReadString(0)
         Out$=ReplaceString(Line$,Find$,Replace$)
         WriteString(1,Out$)
       Wend
       CloseFile(1)
     EndIf
     CloseFile(0)
     If i
       ; If we made a new file, copy it back.
       CopyFile(file$, File$())
       DeleteFile(file$)
     EndIf
   EndIf
 Next

EndProcedure</lang>

Implementation:

NewList Xyz$()
AddElement(Xyz$()): Xyz$()="C:\\a.txt"
AddElement(Xyz$()): Xyz$()="C:\\b.txt"
AddElement(Xyz$()): Xyz$()="D:\\c.txt"

GRTISF(Xyz$(), "Goodbye London", "Hello New York")

Python

From Python docs. (Note: in-place editing does not work for MS-DOS 8+3 filesystems.)

<lang python>import fileinput

for line in fileinput.input(inplace=True):
    print(line.replace('Goodbye London!', 'Hello New York!'), end='')

</lang>
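The snippet above takes its file names from the command line. As a hedged variation, the sketch below passes an explicit list of files and keeps a backup copy of each original; `files`, `inplace` and `backup` are documented parameters of `fileinput.input`, while the wrapper name `replace_in_files` and the `.bak` suffix are choices made here for illustration.

```python
import fileinput


def replace_in_files(paths, old, new, backup=".bak"):
    """Replace old with new in each file in paths, keeping path + backup copies."""
    paths = list(paths)
    if not paths:
        # An empty file list would make fileinput fall back to standard input.
        return
    with fileinput.input(files=paths, inplace=True, backup=backup) as lines:
        for line in lines:
            # With inplace=True, standard output is redirected into the
            # file currently being processed.
            print(line.replace(old, new), end="")


if __name__ == "__main__":
    import os

    names = [n for n in ("a.txt", "b.txt", "c.txt") if os.path.exists(n)]
    replace_in_files(names, "Goodbye London!", "Hello New York!")
```

Because the replacement is applied line by line, a search string that spans a line break is not matched by this approach.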

Ruby

Like Perl:

ruby -pi -e "gsub('Goodbye London!', 'Hello New York!')" a.txt b.txt c.txt

Run BASIC

<lang runbasic>file$(1) ="data1.txt"
file$(2) ="data2.txt"
file$(3) ="data3.txt"

for i = 1 to 3

   open file$(i) for input as #in
       fileBefore$ = input$( #in, lof( #in))
   close #in

   fileAfter$ = strRep$(fileBefore$, "Goodbye London!", "Hello New York!")
   open "new_" +  file$(i) for output as #out
       print #out,fileAfter$;
   close #out

next i
end

' --------------------------------
' string replace - rep str with
' --------------------------------
FUNCTION strRep$(str$,rep$,with$)
ln  = len(rep$)
ln1 = ln - 1
i   = 1
while i <= len(str$)
    if mid$(str$,i,ln) = rep$ then
        strRep$ = strRep$ + with$
        i = i + ln1
    else
        strRep$ = strRep$ + mid$(str$,i,1)
    end if
i = i + 1
WEND
END FUNCTION</lang>

Seed7

<lang seed7>$ include "seed7_05.s7i";

 include "getf.s7i";

const proc: main is func

 local
   var string: fileName is "";
   var string: content is "";
 begin
   for fileName range [] ("a.txt", "b.txt", "c.txt") do
     content := getf(fileName);
     content := replace(content, "Goodbye London!", "Hello New York!");
     putf(fileName, content);
   end for;
 end func;</lang>

Tcl

Library: Tcllib (Package: fileutil)

<lang tcl>package require Tcl 8.5
package require fileutil

# Parameters to the replacement
set from "Goodbye London!"
set to "Hello New York!"
# Which files to replace
set fileList [list a.txt b.txt c.txt]

# Make a command fragment that performs the replacement on a supplied string
set replacementCmd [list string map [list $from $to]]
# Apply the replacement to the contents of each file
foreach filename $fileList {
    fileutil::updateInPlace $filename $replacementCmd
}</lang>

TUSCRIPT

<lang tuscript>
$$ MODE TUSCRIPT
files="a.txt'b.txt'c.txt"

BUILD S_TABLE search = ":Goodbye London!:"

LOOP file=files

ERROR/STOP OPEN (file,WRITE,-std-)
ERROR/STOP CREATE ("scratch",FDF-o,-std-)
 ACCESS q: READ/STREAM/RECORDS/UTF8 $file s,aken+text/search+eken
 ACCESS s: WRITE/ERASE/STREAM/UTF8 "scratch" s,aken+text+eken
  LOOP
   READ/EXIT q
   IF (text.ct.search) SET text="Hello New York!"
   WRITE/ADJUST s
  ENDLOOP
 ENDACCESS/PRINT q
 ENDACCESS/PRINT s
ERROR/STOP COPY ("scratch",file)
ERROR/STOP CLOSE (file)

ENDLOOP
ERROR/STOP DELETE ("scratch")
</lang>

TXR

Another use of a screwdriver as a hammer.

The dummy empty output at the end serves a dual purpose. Firstly, without argument clauses following it, the @(next `!mv ...`) will not actually happen (lazy evaluation!). Secondly, if a txr script performs no output on standard output, the default action of dumping variable bindings kicks in.

<lang txr>@(next :args)
@(collect)
@file
@(next `@file`)
@(freeform)
@(coll :gap 0)@notmatch@{match /Goodbye, London!/}@(end)@*tail@/\n/
@(output `@file.tmp`)
@(rep)@{notmatch}Hello, New York!@(end)@tail
@(end)
@(next `!mv @file.tmp @file`)
@(output)
@(end)
@(end)</lang>

Run:

$ cat foo.txt
aaaGoodbye, London!aaa
Goodbye, London!
$ cat bar.txt
aaaGoodbye, London!aaa
Goodbye, London!
$ txr replace-files.txr foo.txt bar.txt
$ cat foo.txt
aaaHello, New York!aaa
Hello, New York!
$ cat bar.txt
aaaHello, New York!aaa
Hello, New York!

Run, with no directory permissions:

$ chmod a-w .
$ txr replace-files.txr foo.txt bar.txt
txr: unhandled exception of type file_error:
txr: could not open foo.txt.tmp (error 13/Permission denied)
false