Separate the house number from the street name: Difference between revisions

no edit summary
(→‎Tcl: Added implementation)
No edit summary
Line 1:
{{Draft task}}
In Germany and the Netherlands, postal addresses take the form street name followed by house number, in accordance with the national standards DIN 5008 and NEN 5825, respectively. The difficulty is that some street names contain numbers (e.g. significant years) and some [http://en.wikipedia.org/wiki/House_numbering#Europe house numbers] carry characters as an extension. This is a real-life problem: in the Netherlands, some street names are a tribute to our liberators and contain the numbers 40 and 45, referring to the war years 1940 to 1945.
;Task:
 
Write code that correctly separates the house number from the street name and presents both. <span style="color:#FFFFFF; background:#FF69B4">Static data must not be shown, only processed data.</span>
* For bonus kudos: make the splitting valid for more countries as well.
The test set:
<pre>Plataanstraat 5
Straat 12
Line 37 ⟶ 38:
Marktplatz 31
Schmidener Weg 3
Karl-Weysser-Str. 6</pre>The Scala solution has the right separations.
 
=={{header|J}}==
{{incorrect|J}}
'''Solution''' (''native''):<lang j> din5008 =: split~ i.&1@:e.&'0123456789'</lang>
'''Solution''' (''regex''):<lang j> require'regex'
Line 227 ⟶ 147:
'''Notes''': I'm jumping on this task very early in its development; at the moment, it lacks explicit rules for identifying where the house number begins. Since I don't read German or Dutch, and pending more explicit rules, I'm assuming that the number starts at the first decimal digit in the string and continues to the end, and that everything before that point is the street name.
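For readers who do not know J, the rule described in the notes above can be sketched in Python. This is only an illustration of the stated assumption (split at the first decimal digit), not a translation of the J code; the function name <code>split_at_first_digit</code> is hypothetical, and unlike the J verb it trims surrounding whitespace.

```python
def split_at_first_digit(address):
    """Split at the first decimal digit: everything before it is treated
    as the street name, everything from it onwards as the house number.
    (Hypothetical helper illustrating the assumption in the notes.)"""
    for i, ch in enumerate(address):
        if ch in "0123456789":
            return address[:i].strip(), address[i:].strip()
    # No digit at all: the whole string is the street name.
    return address.strip(), ""


if __name__ == "__main__":
    for line in ["Plataanstraat 5", "Dr. J. Straat 12 a", "1e Kruisweg 36"]:
        print(line, "->", split_at_first_digit(line))
```

The rule works for simple addresses such as <code>Plataanstraat 5</code>, but any street name that itself contains a digit (for example <code>1e Kruisweg 36</code> or <code>Plein 1940 2</code>) is split too early, which is presumably why the solution is flagged.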
=={{header|Perl 6}}==
An unquestioning translation of the Scala example's regex to show how we lay out such regexes for readability in Perl 6, except that we take the liberty of leaving the space out of the house number. (Hard constants like 1940 and 1945 are a code smell, and the task should probably not require such constants unless there is a standard to point to that mandates them.) So expect this solution to change if the task is actually defined reasonably, such as by specifying that four-digit house numbers are excluded in Europe. (In contrast, four- and five-digit house numbers are not uncommon in places such as the U.S. where each block gets a hundred house numbers to play with, and there are cities with hundreds of blocks along a street.)
<lang perl6>say m[
( .*? )
Line 376 ⟶ 296:
 
=={{header|Python}}==
{{incorrect|Python}}
(See talk page)
<lang python>print('''\
Plataanstraat 5 split as (Plataanstraat, 5)
Straat 12 split as (Straat, 12)
Line 411 ⟶ 330:
Schmidener Weg 3 split as (Schmidener Weg, 3)
Karl-Weysser-Str. 6 split as (Karl-Weysser-Str., 6)''')</lang>
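The block above only prints static text, which the task forbids. As a hedged alternative, the following sketch reuses the regular expression from the Scala example unchanged and applies it with Python's <code>re</code> module. The names <code>HOUSE_NUMBER</code> and <code>split_address</code> are hypothetical, and the house number is stripped of its leading space, which the Scala version keeps. The hard-coded years 1940 and 1945 are inherited from that regex and are not a general rule.

```python
import re

# Regex copied from the Scala example: either "<space>digits[-/]digits"
# anywhere, or a trailing "<space>digits + optional suffix" that does not
# start with the years 1940 or 1945.
HOUSE_NUMBER = re.compile(r"(\s\d+[-/]\d+)|(\s(?!1940|1945)\d+[a-zI. /]*\d*)$")


def split_address(address):
    """Return (street name, house number); the number is '' if none is found."""
    match = HOUSE_NUMBER.search(address)
    if match is None:
        return address, ""
    # Mirror Scala's extractor.split(input).mkString: drop the matched part.
    street = address[:match.start()] + address[match.end():]
    return street, match.group().strip()


if __name__ == "__main__":
    test_set = [
        "Plataanstraat 5",
        "Dr. J. Straat 12-14",
        "Laan 1940 – 1945 37",
        "16 april 1944 Pad 1",
        "Laan ’40-’45",
        "Nieuwe gracht 20zw /2",
        "Karl-Weysser-Str. 6",
    ]
    for line in test_set:
        street, number = split_address(line)
        print(f"{line:<25} split as ({street}, {number})")
```

Only a few lines of the test set are listed here to keep the sketch short; the remaining addresses can be added to <code>test_set</code> in the same way.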
 
=={{header|Scala}}==
<lang Scala>import scala.io.Source.fromString
import scala.util.matching.Regex
 
object HouseNumber extends App {
def adressen =
"""Plataanstraat 5
|Straat 12
|Straat 12 II
|Dr. J. Straat 12
|Dr. J. Straat 12 a
|Dr. J. Straat 12-14
|Laan 1940 – 1945 37
|Plein 1940 2
|1213-laan 11
|16 april 1944 Pad 1
|1e Kruisweg 36
|Laan 1940-’45 66
|Laan ’40-’45
|Langeloërduinen 3 46
|Marienwaerdt 2e Dreef 2
|Provincialeweg N205 1
|Rivium 2e Straat 59.
|Nieuwe gracht 20rd
|Nieuwe gracht 20rd 2
|Nieuwe gracht 20zw /2
|Nieuwe gracht 20zw/3
|Nieuwe gracht 20 zw/4
|Bahnhofstr. 4
|Wertstr. 10
|Lindenhof 1
|Nordesch 20
|Weilstr. 6
|Harthauer Weg 2
|Mainaustr. 49
|August-Horch-Str. 3
|Marktplatz 31
|Schmidener Weg 3
|Karl-Weysser-Str. 6""".stripMargin
 
val extractor = new Regex("""(\s\d+[-/]\d+)|(\s(?!1940|1945)\d+[a-zI. /]*\d*)$""")
 
def splitsAdressen(input: String) = (extractor.split(input).mkString, extractor.findFirstIn(input).getOrElse(""))
 
// linesIterator avoids the clash with java.lang.String.lines on JDK 11 and later
adressen.linesIterator.foreach(s => println(f"$s%-25s split as ${splitsAdressen(s)}"))
}</lang>
{{out}}
<pre>Plataanstraat 5 split as (Plataanstraat, 5)
Straat 12 split as (Straat, 12)
Straat 12 II split as (Straat, 12 II)
Dr. J. Straat 12 split as (Dr. J. Straat , 12)
Dr. J. Straat 12 a split as (Dr. J. Straat, 12 a)
Dr. J. Straat 12-14 split as (Dr. J. Straat, 12-14)
Laan 1940 – 1945 37 split as (Laan 1940 – 1945, 37)
Plein 1940 2 split as (Plein 1940, 2)
1213-laan 11 split as (1213-laan, 11)
16 april 1944 Pad 1 split as (16 april 1944 Pad, 1)
1e Kruisweg 36 split as (1e Kruisweg, 36)
Laan 1940-’45 66 split as (Laan 1940-’45, 66)
Laan ’40-’45 split as (Laan ’40-’45,)
Langeloërduinen 3 46 split as (Langeloërduinen, 3 46)
Marienwaerdt 2e Dreef 2 split as (Marienwaerdt 2e Dreef, 2)
Provincialeweg N205 1 split as (Provincialeweg N205, 1)
Rivium 2e Straat 59. split as (Rivium 2e Straat, 59.)
Nieuwe gracht 20rd split as (Nieuwe gracht, 20rd)
Nieuwe gracht 20rd 2 split as (Nieuwe gracht, 20rd 2)
Nieuwe gracht 20zw /2 split as (Nieuwe gracht, 20zw /2)
Nieuwe gracht 20zw/3 split as (Nieuwe gracht, 20zw/3)
Nieuwe gracht 20 zw/4 split as (Nieuwe gracht, 20 zw/4)
Bahnhofstr. 4 split as (Bahnhofstr., 4)
Wertstr. 10 split as (Wertstr., 10)
Lindenhof 1 split as (Lindenhof, 1)
Nordesch 20 split as (Nordesch, 20)
Weilstr. 6 split as (Weilstr., 6)
Harthauer Weg 2 split as (Harthauer Weg, 2)
Mainaustr. 49 split as (Mainaustr., 49)
August-Horch-Str. 3 split as (August-Horch-Str., 3)
Marktplatz 31 split as (Marktplatz, 31)
Schmidener Weg 3 split as (Schmidener Weg, 3)
Karl-Weysser-Str. 6 split as (Karl-Weysser-Str., 6)</pre>
 
=={{header|Tcl}}==