Stream merge: Difference between revisions

Line 14:
=={{header|360 Assembly}}==
No use of tricks such as forbidden records in the streams.
<syntaxhighlight lang="360asm">* Stream Merge 07/02/2017
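(A "forbidden record" is a sentinel value that cannot occur in real data, appended so the merge loop never has to test for end-of-stream. As an illustrative sketch only, not derived from the assembly below, the same sentinel-free idea in Python: keep a one-record lookahead per stream and drop a stream once it is exhausted.)

```python
def merge2(a, b):
    """Merge two sorted iterables without sentinel records.

    Each stream keeps a one-item lookahead; a stream is dropped when
    exhausted. Assumes the streams never yield None, which is used
    here only as the internal exhaustion marker."""
    it1, it2 = iter(a), iter(b)
    out = []
    h1 = next(it1, None)
    h2 = next(it2, None)
    while h1 is not None and h2 is not None:
        if h1 <= h2:
            out.append(h1)
            h1 = next(it1, None)
        else:
            out.append(h2)
            h2 = next(it2, None)
    # drain whichever stream still has records
    while h1 is not None:
        out.append(h1)
        h1 = next(it1, None)
    while h2 is not None:
        out.append(h2)
        h2 = next(it2, None)
    return out

print(merge2([1, 3, 5], [2, 4, 6]))  # [1, 2, 3, 4, 5, 6]
```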
STRMERGE CSECT
USING STRMERGE,R13 base register
Line 130:
PG DS CL64
YREGS
END STRMERGE</syntaxhighlight>
{{in}}
<pre style="height:20ex">
Line 167:
 
=={{header|Ada}}==
<syntaxhighlight lang="ada">with Ada.Text_Io;
with Ada.Command_Line;
with Ada.Containers.Indefinite_Holders;
Line 238:
end loop;
 
end Stream_Merge;</syntaxhighlight>
 
=={{header|ALGOL 68}}==
NB, all the files (including the output files) must exist before running this. The output files are overwritten with the merged records.
<syntaxhighlight lang="algol68"># merge a number of input files to an output file #
PROC mergenf = ( []REF FILE inf, REF FILE out )VOID:
BEGIN
Line 344:
# test the file merge #
merge2( "in1.txt", "in2.txt", "out2.txt" );
mergen( ( "in1.txt", "in2.txt", "in3.txt", "in4.txt" ), "outn.txt" )</syntaxhighlight>
{{out}}
<pre>
Line 350:
 
=={{header|ATS}}==
<syntaxhighlight lang="ats">
(* ****** ****** *)
//
Line 539:
//
} (* end of [main0] *)
</syntaxhighlight>
 
=={{header|AWK}}==
<syntaxhighlight lang="awk">
# syntax: GAWK -f STREAM_MERGE.AWK filename(s) >output
# handles 1 .. N files
Line 608:
errors++
}
</syntaxhighlight>
 
=={{header|C}}==
<syntaxhighlight lang="c">/*
* Rosetta Code - stream merge in C.
*
Line 654:
return EXIT_SUCCESS;
}
</syntaxhighlight>
 
=={{header|C sharp|C#}}==
<syntaxhighlight lang="csharp">
using System;
using System.Collections.Generic;
Line 711:
}
}
}</syntaxhighlight>
{{out}}
<pre>1 2 4 5 7 8 10 11
Line 718:
=={{header|C++}}==
{{trans|C#}}
<syntaxhighlight lang="cpp">//#include <functional>
#include <iostream>
#include <vector>
Line 813:
mergeN(display, { v3, v2, v1 });
std::cout << '\n';
}</syntaxhighlight>
{{out}}
<pre>0 1 3 4 6 7
Line 820:
 
=={{header|D}}==
<syntaxhighlight lang="d">import std.range.primitives;
import std.stdio;
 
Line 892:
}
} while (!done);
}</syntaxhighlight>
 
{{out}}
Line 902:
 
=={{header|Elixir}}==
<syntaxhighlight lang="elixir">defmodule StreamMerge do
def merge2(file1, file2), do: mergeN([file1, file2])
Line 930:
StreamMerge.merge2("temp1.dat", "temp2.dat")
IO.puts "\nN-stream merge:"
StreamMerge.mergeN(filenames)</syntaxhighlight>
 
{{out}}
Line 980:
 
=={{header|Fortran}}==
This is a classic problem, but even so, Fortran does not supply a library routine for this. So...<syntaxhighlight lang="fortran"> SUBROUTINE FILEMERGE(N,INF,OUTF) !Merge multiple inputs into one output.
INTEGER N !The number of input files.
INTEGER INF(*) !Their unit numbers.
Line 1,047:
CALL FILEMERGE(MANY,FI,F) !E pluribus unum.
 
END !That was easy.</syntaxhighlight>
Obviously, there would be variations according to the nature of the data streams being merged, and whatever sort key was involved. For this example, input from disc files will do and the sort key is the entire record's text. This means there is no need to worry over the case where, having written a record from stream S and obtained the next record from stream S, it proves to have equal precedence with the waiting record for some other stream. Which now should take precedence? With entirely-equal records it obviously doesn't matter but if the sort key is only partial then different record content could be deemed equal and then a choice has an effect.
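The tie-breaking concern above can be made concrete with a short sketch (Python, illustrative only, not part of the Fortran entry): tagging each waiting record with its stream index makes the heap ordering total, so on equal keys the record from the lower-numbered stream is always emitted first, giving a deterministic, stable merge.

```python
import heapq

def merge_stable(streams):
    """N-way merge that breaks ties by stream index, so on equal keys
    the record from the earlier stream wins (a stable merge).
    Returns (key, stream_index) pairs to make the choice visible."""
    iters = [iter(s) for s in streams]
    heads = []
    for i, it in enumerate(iters):
        first = next(it, None)
        if first is not None:
            heapq.heappush(heads, (first, i))  # (key, stream index)
    out = []
    while heads:
        key, i = heapq.heappop(heads)
        out.append((key, i))
        nxt = next(iters[i], None)  # replenish from the same stream
        if nxt is not None:
            heapq.heappush(heads, (nxt, i))
    return out

# equal key "a" in both streams: stream 0's copy is emitted first
print(merge_stable([["a", "b"], ["a", "c"]]))
# [('a', 0), ('a', 1), ('b', 0), ('c', 1)]
```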
 
Line 1,057:
 
The source for subroutine GRAB is within subroutine FILEMERGE for the convenience in sharing and messing with variables important to both, but not to outsiders. This facility is standard in Algol-following languages but often omitted and was not added to Fortran until F90. In its absence, either more parameters are required for the separate routines, or there will be messing with COMMON storage areas.
 
=={{header|FreeBASIC}}==
{{trans|C++}}
<syntaxhighlight lang="vbnet">Sub Merge2(c1() As Integer, c2() As Integer)
Dim As Integer i1 = Lbound(c1)
Dim As Integer i2 = Lbound(c2)
While i1 <= Ubound(c1) And i2 <= Ubound(c2)
If c1(i1) <= c2(i2) Then
Print c1(i1);
i1 += 1
Else
Print c2(i2);
i2 += 1
End If
Wend
While i1 <= Ubound(c1)
Print c1(i1);
i1 += 1
Wend
While i2 <= Ubound(c2)
Print c2(i2);
i2 += 1
Wend
Print
End Sub
 
Sub MergeN(all() As Integer)
Dim As Integer i = Lbound(all)
While i <= Ubound(all)
Print all(i);
i += 1
Wend
Print
End Sub
 
Dim As Integer v1(2) = {0, 3, 6}
Dim As Integer v2(2) = {1, 4, 7}
Dim As Integer v3(2) = {2, 5, 8}
Merge2(v2(), v1())
MergeN(v1())
 
Dim As Integer all(8) = {v1(0), v2(0), v3(0), v1(1), v2(1), v3(1), v1(2), v2(2), v3(2)}
MergeN(all())
 
Sleep</syntaxhighlight>
{{out}}
<pre> 0 1 3 4 6 7
0 3 6
0 1 2 3 4 5 6 7 8</pre>
 
=={{header|Go}}==
'''Using standard library binary heap for mergeN:'''
<syntaxhighlight lang="go">package main
 
import (
Line 1,154 ⟶ 1,206:
}
}
}</syntaxhighlight>
{{out}}
<pre>
Line 1,161 ⟶ 1,213:
</pre>
'''MergeN using package from [[Fibonacci heap]] task:'''
<syntaxhighlight lang="go">package main
 
import (
Line 1,220 ⟶ 1,272:
}
}
}</syntaxhighlight>
{{out}}
<pre>
Line 1,232 ⟶ 1,284:
=== conduit ===
 
<syntaxhighlight lang="haskell">-- stack runhaskell --package=conduit-extra --package=conduit-merge
 
import Control.Monad.Trans.Resource (runResourceT)
Line 1,250 ⟶ 1,302:
runResourceT $ mergeSources inputs $$ sinkStdoutLn
where
sinkStdoutLn = Conduit.map (`BS.snoc` '\n') =$= sinkHandle stdout</syntaxhighlight>
 
See implementation in https://github.com/cblp/conduit-merge/blob/master/src/Data/Conduit/Merge.hs
Line 1,256 ⟶ 1,308:
=== pipes ===
 
<syntaxhighlight lang="haskell">-- stack runhaskell --package=pipes-safe --package=pipes-interleave
 
import Pipes (runEffect, (>->))
Line 1,270 ⟶ 1,322:
sourceFileNames <- getArgs
let sources = map readFile sourceFileNames
runSafeT . runEffect $ interleave compare sources >-> stdoutLn</syntaxhighlight>
 
See implementation in https://github.com/bgamari/pipes-interleave/blob/master/Pipes/Interleave.hs
 
=={{header|Java}}==
<syntaxhighlight lang="java">import java.util.Iterator;
import java.util.List;
import java.util.Objects;
Line 1,374 ⟶ 1,426:
System.out.flush();
}
}</syntaxhighlight>
{{out}}
<pre>1245781011
Line 1,382 ⟶ 1,434:
{{trans|C}}
The IOStream type in Julia encompasses any data stream, including file I/O and TCP/IP. The IOBuffer used here maps a stream to a buffer in memory, and so allows an easy simulation of two streams without opening files.
<syntaxhighlight lang="julia">
function merge(stream1, stream2, T=Char)
if !eof(stream1) && !eof(stream2)
Line 1,421 ⟶ 1,473:
println("\nDone.")
 
</syntaxhighlight>{{output}}<pre>
abcdefghijklmnopqrstuvwyxz
Done.
Line 1,428 ⟶ 1,480:
=={{header|Kotlin}}==
Uses the same data as the REXX entry. As Kotlin lacks a Heap class, when merging N files, we use a nullable MutableList instead. All comparisons are text based even when the files contain nothing but numbers.
<syntaxhighlight lang="scala">// version 1.2.21
 
import java.io.File
Line 1,487 ⟶ 1,539:
println(File("merged2.txt").readText())
println(File("mergedN.txt").readText())
}</syntaxhighlight>
 
{{out}}
Line 1,514 ⟶ 1,566:
Optimized for clarity and simplicity, not performance.
Assumes two files containing sorted integers separated by newlines.
<syntaxhighlight lang="nim">import streams, strutils
let
stream1 = newFileStream("file1")
Line 1,524 ⟶ 1,576:
echo line
for line in stream2.lines:
echo line</syntaxhighlight>
 
===Merge N streams===
Line 1,530 ⟶ 1,582:
Of course, as Phix and Nim are very different languages, the code is quite different, but as in Phix, we use a priority queue (provided by the standard module <code>heapqueue</code>). We work with files built from the “Data” constant, but we delete them after use. We have also put the whole merging code in a procedure.
 
<syntaxhighlight lang="nim">import heapqueue, os, sequtils, streams
 
type
Line 1,586 ⟶ 1,638:
# Clean-up: delete the files.
for name in Filenames:
removeFile(name)</syntaxhighlight>
 
{{out}}
Line 1,604 ⟶ 1,656:
=={{header|Perl}}==
We make use of an iterator interface which String::Tokenizer provides. Credit: we obtained all the sample text from http://www.lipsum.com/.
<syntaxhighlight lang="perl">use strict;
use warnings;
use English;
Line 1,729 ⟶ 1,781:
# At this point every iterator has been exhausted.
return;
}</syntaxhighlight>
{{out}}
<pre>Merge of 2 streams:
Line 1,739 ⟶ 1,791:
=={{header|Phix}}==
Using a priority queue
<!--<syntaxhighlight lang="phix">(notonline)-->
<span style="color: #008080;">without</span> <span style="color: #008080;">js</span> <span style="color: #000080;font-style:italic;">-- file i/o</span>
<span style="color: #008080;">include</span> <span style="color: #000000;">builtins</span><span style="color: #0000FF;">/</span><span style="color: #000000;">pqueue</span><span style="color: #0000FF;">.</span><span style="color: #000000;">e</span>
Line 1,787 ⟶ 1,839:
<span style="color: #0000FF;">{}</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">delete_file</span><span style="color: #0000FF;">(</span><span style="color: #000000;">filenames</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">])</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
<!--</syntaxhighlight>-->
{{out}}
<pre>
Line 1,805 ⟶ 1,857:
 
=={{header|PicoLisp}}==
<syntaxhighlight lang="picolisp">(de streamMerge @
(let Heap
(make
Line 1,818 ⟶ 1,870:
(if (in (cdar Heap) (read))
(set (car Heap) @)
(close (cdr (pop 'Heap))) ) ) ) ) )</syntaxhighlight>
<pre>$ cat a
3 14 15
Line 1,830 ⟶ 1,882:
2 3 5 7</pre>
Test:
<syntaxhighlight lang="picolisp">(test (2 3 14 15 17 18)
(streamMerge
(open "a")
Line 1,840 ⟶ 1,892:
(open "b")
(open "c")
(open "d") ) )</syntaxhighlight>
'streamMerge' works with non-numeric data as well, and also - instead of calling
'open' on a file or named pipe - with the results of 'connect' or 'listen' (i.e.
Line 1,851 ⟶ 1,903:
There exists a standard library function <code>heapq.merge</code> that takes any number of sorted stream iterators and merges them into one sorted iterator, using a [[heap]].
 
<syntaxhighlight lang="python">import heapq
import sys
 
sources = sys.argv[1:]
for item in heapq.merge(open(source) for source in sources):
print(item)</syntaxhighlight>
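Since <code>heapq.merge</code> accepts any sorted iterables, not just file objects, the same approach can be demonstrated without touching the filesystem (a small illustrative addition using in-memory lists):

```python
import heapq

# heapq.merge takes any number of sorted iterables and lazily yields
# a single sorted stream, keeping one head per input on a heap.
merged = list(heapq.merge([0, 3, 6], [1, 4, 7], [2, 5, 8]))
print(merged)  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```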
 
=={{header|Racket}}==
 
<syntaxhighlight lang="racket">;; This module produces a sequence that merges streams in order (by <)
#lang racket/base
(require racket/stream)
Line 1,932 ⟶ 1,984:
'(1 2 3 4 5 6 7 8 9 10))
(check-equal? (for/list ((i (merge-sequences/< '(2 4 6 7 8 9 10) '(1 3 5)))) i)
'(1 2 3 4 5 6 7 8 9 10)))</syntaxhighlight>
 
{{out}}
Line 1,948 ⟶ 2,000:
=={{header|REXX}}==
===version 1===
<syntaxhighlight lang="rexx">/* REXX ***************************************************************
* Merge 1.txt ... n.txt into all.txt
* 1.txt 2.txt 3.txt 4.txt
Line 2,027 ⟶ 2,079:
Return
 
o: Return lineout(oid,arg(1))</syntaxhighlight>
{{out}}
<pre>1
Line 2,050 ⟶ 2,102:
 
No &nbsp; ''heap'' &nbsp; is needed to keep track of which record was written, nor needs replenishing from its input file.
<syntaxhighlight lang="rexx">/*REXX pgm reads sorted files (1.TXT, 2.TXT, ···), and writes sorted data ───► ALL.TXT */
@.=copies('ff'x, 1e4); call lineout 'ALL.TXT',,1 /*no value should be larger than this. */
do n=1 until @.n==@.; call rdr n; end /*read any number of appropriate files.*/
Line 2,063 ⟶ 2,115:
end /*forever*/ /*keep reading/merging until exhausted.*/
/*──────────────────────────────────────────────────────────────────────────────────────*/
rdr: arg z; @.z= @.; f= z'.TXT'; if lines(f)\==0 then @.z= linein(f); return</syntaxhighlight>
{{out|output|text=&nbsp; is the same as the 1<sup>st</sup> REXX version when using identical input files, &nbsp; except the output file is named &nbsp; '''ALL.TXT'''}} <br><br>
 
Line 2,070 ⟶ 2,122:
{{works with|Rakudo|2018.02}}
 
<syntaxhighlight lang="raku" line>sub merge_streams ( @streams ) {
my @s = @streams.map({ hash( STREAM => $_, HEAD => .get ) })\
.grep({ .<HEAD>.defined });
Line 2,082 ⟶ 2,134:
}
 
say merge_streams([ @*ARGS».&open ]);</syntaxhighlight>
 
=={{header|Ruby}}==
<syntaxhighlight lang="ruby">def stream_merge(*files)
fio = files.map{|fname| open(fname)}
merge(fio.map{|io| [io, io.gets]})
Line 2,109 ⟶ 2,161:
puts "#{fname}: #{data}"
end
stream_merge(*files)</syntaxhighlight>
 
{{out}}
Line 2,139 ⟶ 2,191:
 
=={{header|Scala}}==
<syntaxhighlight lang="scala">def mergeN[A : Ordering](is: Iterator[A]*): Iterator[A] = is.reduce((a, b) => merge2(a, b))
 
def merge2[A : Ordering](i1: Iterator[A], i2: Iterator[A]): Iterator[A] = {
Line 2,158 ⟶ 2,210:
nextHead ++ merge2Buffered(i1, i2)
}
}</syntaxhighlight>
 
Example usage, demonstrating laziness:
 
<syntaxhighlight lang="scala">val i1 = Iterator.tabulate(5) { i =>
val x = i * 3
println(s"generating $x")
Line 2,185 ⟶ 2,237:
val x = merged.next
println(s"output: $x")
}</syntaxhighlight>
 
{{out}}
Line 2,221 ⟶ 2,273:
=={{header|Sidef}}==
{{trans|Raku}}
<syntaxhighlight lang="ruby">func merge_streams(streams) {
var s = streams.map { |stream|
Pair(stream, stream.readline)
Line 2,235 ⟶ 2,287:
}
 
say merge_streams(ARGV.map {|f| File(f).open_r }).join("\n")</syntaxhighlight>
 
=={{header|Tcl}}==
Line 2,242 ⟶ 2,294:
A careful reader will notice that '''$peeks''' is treated alternately as a dictionary ('''dict set''', '''dict get''') and as a list ('''lsort''', '''lassign'''), exploiting the fact that dictionaries are simply lists of even length. For large dictionaries this would not be recommended, as it causes [https://wiki.tcl.tk/3033 "shimmering"], but in this example the impact is too small to matter.
 
<syntaxhighlight lang="tcl">#!/usr/bin/env tclsh
proc merge {args} {
set peeks {}
Line 2,262 ⟶ 2,314:
 
merge {*}[lmap f $::argv {open $f r}]
</syntaxhighlight>
 
=={{header|UNIX Shell}}==
Line 2,274 ⟶ 2,326:
{{libheader|Wren-seq}}
No Heap class, so we use a List. Comparisons are text based even for numbers.
<syntaxhighlight lang="wren">import "io" for File
import "./ioutil" for FileUtil
import "./str" for Str
import "./seq" for Lst
 
var merge2 = Fn.new { |inputFile1, inputFile2, outputFile|
Line 2,325 ⟶ 2,377:
// check it worked
System.print(File.read("merged2.txt"))
System.print(File.read("mergedN.txt"))</syntaxhighlight>
 
{{out}}
Line 2,351 ⟶ 2,403:
=={{header|zkl}}==
This solution uses iterators, doesn't care where the streams originate, and only keeps the head of each stream on hand.
<syntaxhighlight lang="zkl">fcn mergeStreams(s1,s2,etc){ //-->Walker
streams:=vm.arglist.pump(List(),fcn(s){ // prime and prune
if( (w:=s.walker())._next() ) return(w);
Line 2,364 ⟶ 2,416:
v
}.fp(streams));
}</syntaxhighlight>
Using infinite streams:
<syntaxhighlight lang="zkl">w:=mergeStreams([0..],[2..*,2],[3..*,3],T(5));
w.walk(20).println();</syntaxhighlight>
{{out}}
<pre>
Line 2,373 ⟶ 2,425:
</pre>
Using files:
<syntaxhighlight lang="zkl">w:=mergeStreams(File("unixdict.txt"),File("2hkprimes.txt"),File("/dev/null"));
do(10){ w.read().print() }</syntaxhighlight>
{{out}}
<pre>
Line 2,389 ⟶ 2,441:
</pre>
Using the above example to squirt the merged stream to a file:
<syntaxhighlight lang="zkl">mergeStreams(File("unixdict.txt"),File("2hkprimes.txt"),File("/dev/null"))
.pump(File("foo.txt","w"));</syntaxhighlight>
{{out}}
<pre>