Commatizing numbers: Difference between revisions
Revision as of 03:43, 9 June 2014
Commatizing numbers (as used here, a handy expedient made-up word) is the act of adding commas to a number (or string), or the numeric part of a larger string.
- Task
Write a function that takes a string as an argument with optional arguments or parameters (the format of parameters/options is left to the programmer) that in general, adds commas (or some other characters, including blanks or tabs) to the first numeric part of a string (if it's suitable for commatizing), and returns that newly commatized string.
Some of the commatizing rules (specified below) are arbitrary, but they are part of this task's requirements, if only to make the results consistent across national preferences and other disciplines.
The number may be part of a larger (non-numeric) string such as:
- «US$1744 millions» ──or──
- ±25000 motes.
The string may possibly not have a number suitable for commatizing, so it should be untouched and no error generated.
If any argument (option) is invalid, nothing is changed and no error need be generated (quiet execution, no fail execution). Error message generation is optional.
The exponent part of a number is never commatized. The following string isn't suitable for commatizing: 9.7e+12000
Leading zeroes are never commatized. The string 0000000005714.882 after commatization is: 0000000005,714.882
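One way to honor both of these rules is to search for the first run of digits that begins with a nonzero digit, which is the same pattern (`[1-9][0-9]*`) that the J solution further down relies on. The following is only a minimal sketch in Python; the names `TARGET` and `find_target` are made up for illustration and are not part of the task.

```python
import re

# First run of digits that starts with a nonzero digit. Leading zeroes are
# skipped, and because only the first match is used, an exponent that
# follows the number is never reached.
TARGET = re.compile(r"[1-9][0-9]*")

def find_target(text):
    """Return (start, end) of the digits to commatize, or None if there are none."""
    m = TARGET.search(text)
    return (m.start(), m.end()) if m else None

for s in ("0000000005714.882", "9.7e+12000", "no digits here"):
    span = find_target(s)
    print(repr(s), "->", s[span[0]:span[1]] if span else None)
```

For the first string the match is the digits 5714 (the leading zeroes are left alone); for the second it is the single digit 9, so nothing would be inserted and the exponent stays untouched.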
Any period in a number is assumed to be a decimal point.
The original string is never changed except by the addition of commas (or whatever is used for insertion), if at all.
Leading signs (+, -) are to be preserved (even superfluous signs).
Leading/trailing/embedded blanks, tabs, and other whitespace are to be preserved. E.g.: +1024 bottles of beer on the wall.
The case (upper/lower) of the exponent indicator is to be preserved. E.g.: 4.8903d-002
Any exponent character(s) should be supported:
- 1247e12
- 57256.1D-4
- 4444^60
- 7500∙10**35
- 8500x10**35
- +55000↑3
- 1000**100
- 2048²
- 4096<sup>32</sup>
- 10000pow(pi)
Numbers may be terminated with any non-digit character, including subscripts and/or superscripts. E.g.: 4142135624² or 7320509076<sub>(24)</sub>.
The character(s) to be used for the comma can be specified, and may contain blanks, tabs, and other whitespace characters, as well as multiple characters. The default is the comma (,) character.
The period length can be specified (sometimes referred to as "thousands"). The period length is the length (or number) of the digits between commas. The default period length is 3.
E.g.: in this example, the period length is five: 56789,12340,14148
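The grouping itself can be sketched as follows, assuming the digits are counted from the right so that only the leftmost group may be shorter than the period length. The function name `group_digits` and its parameter names are hypothetical, chosen just for this illustration.

```python
def group_digits(digits, period=3, sep=","):
    """Insert sep between groups of `period` digits, counting from the right."""
    if period < 1:
        return digits                        # invalid period: leave digits untouched
    head = len(digits) % period or period    # size of the leftmost group
    groups = [digits[:head]]
    groups += [digits[i:i + period] for i in range(head, len(digits), period)]
    return sep.join(groups)

print(group_digits("567891234014148", period=5))   # 56789,12340,14148
print(group_digits("1744"))                         # 1,744
```

The separator is passed through unchanged, so it may be a blank, a tab, or several characters.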
The position at which to start scanning for the target field (the numeric part) should be able to be specified. The default is 1 (one).
The (numeric?) strings below may be placed in a file (and read) or stored as simple strings within the program.
- Strings to be used as a minimum
The value of pi should be separated with blanks every 5 places past the decimal point,
the Zimbabwe dollar amount should use a decimal point for the "comma" separator:
- pi=3.14159265358979323846264338327950288419716939937510582097494459231
- The author has two Z$100000000000000 Zimbabwe notes (100 trillion).
- "-in Aus$+1411.8millions"
- ===US$0017440 millions=== (in 2000 dollars)
- 123.e8000 is pretty big.
- The land area of the earth is 57268900(29% of the surface) square miles.
- Ain't no numbers in this here words, nohow, no way, Jose.
- James was never known as 0000000007
- Arthur Eddington wrote: I believe there are 15747724136275002577605653961181555468044717914527116709366231425076185631031296 protons in the universe.
- ␢␢␢$-140000±100 millions.
- 6/9/1946 was a good year for some.
where the penultimate string has three leading blanks (real blanks are to be used).
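Putting the rules together, a complete function might look like the sketch below. It is written in Python under several assumptions that the task leaves open: the option names (`sep`, `period`, `start`) are invented here, the start position is taken to be 1-based, and "suitable for commatizing" is interpreted as "the first run of digits beginning with a nonzero digit". It is not the method of either solution that follows.

```python
import re

def commatize(text, sep=",", period=3, start=1):
    """Add sep to the first suitable number at or after 1-based position start."""
    # Quiet failure: invalid options leave the string untouched.
    if not (isinstance(period, int) and isinstance(start, int)):
        return text
    if period < 1 or start < 1:
        return text
    # First run of digits starting with a nonzero digit, at or after `start`.
    m = re.compile(r"[1-9][0-9]*").search(text, start - 1)
    if m is None:
        return text                          # nothing to commatize, no error
    digits = m.group()
    head = len(digits) % period or period    # size of the leftmost group
    groups = [digits[:head]]
    groups += [digits[i:i + period] for i in range(head, len(digits), period)]
    return text[:m.start()] + sep.join(groups) + text[m.end():]

pi = "pi=3.14159265358979323846264338327950288419716939937510582097494459231"
print(commatize(pi, sep=" ", period=5, start=6))
print(commatize("The author has two Z$100000000000000 Zimbabwe notes (100 trillion).", sep="."))
print(commatize("===US$0017440 millions=== (in 2000 dollars)"))
```

Because only text is inserted and everything outside the matched digits is copied verbatim, signs, whitespace, and the exponent indicator are preserved without any special handling.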
- Also see
- The Wiki entry: [http://en.wikipedia.org/wiki/Eddington_number Arthur Eddington's number of protons in the universe].
J
These rules are relatively baroque, which demands long names and minimally complex statements, thus:
<lang J>require'regex'
commatize=:3 :0"1 L:1 0
  (i.0) commatize y
:
  NB. deal with all those rules about options
  opts=. boxopen x
  char=. (#~ ' '&=@{.@(0&#)@>) opts
  num=. ;opts-.char
  delim=. 0 {:: char,<','
  'begin period'=. _1 0+2{.num,(#num)}.1 3

  NB. initialize
  prefix=. begin {.y
  text=. begin }. y

  NB. process
  'start len'=. ,'[1-9][0-9]*' rxmatch text
  if.0=len do. y return. end.
  number=. (start,:len) [;.0 text
  numb=. (>:period|<:#number){.number
  fixed=. numb,;delim&,each (-period)<\ (#numb)}.number
  prefix,(start{.text),fixed,(start+len)}.text
)</lang>
In use, this might look like:
<lang J>   (5;5;' ') commatize 'pi=3.14159265358979323846264338327950288419716939937510582097494459231'
pi=3.14159 26535 89793 23846 26433 83279 50288 41971 69399 37510 58209 74944 59231
   '.' commatize 'The author has two Z$100000000000000 Zimbabwe notes (100 trillion).'
The author has two Z$100.000.000.000.000 Zimbabwe notes (100 trillion).
   commatize '-in Aus$+1411.8millions'
-in Aus$+1,411.8millions
   commatize '===US$0017440 millions=== (in 2000 dollars)'
===US$0017,440 millions=== (in 2000 dollars)
   commatize '123.e8000 is pretty big.'
123.e8000 is pretty big.
   commatize 'The land area of the earth is 57268900(29% of the surface) square miles.'
The land area of the earth is 57,268,900(29% of the surface) square miles.
   commatize 'Ain''t no numbers in this here words, nohow, no way, Jose.'
Ain't no numbers in this here words, nohow, no way, Jose.
   commatize 'James was never known as 0000000007'
James was never known as 0000000007
   commatize 'Arthur Eddington wrote: I believe there are 15747724136275002577605653961181555468044717914527116709366231425076185631031296 protons in the universe.'
Arthur Eddington wrote: I believe there are 15,747,724,136,275,002,577,605,653,961,181,555,468,044,717,914,527,116,709,366,231,425,076,185,631,031,296 protons in the universe.
   commatize '   $-140000±100 millions.'
   $-140,000±100 millions.
   commatize '6/9/1946 was a good year for some.'
6/9/1946 was a good year for some.</lang>
REXX
<lang rexx>/*REXX program adds commas (or other chars) to a number within a string.*/
@.=
@.1="pi=3.14159265358979323846264338327950288419716939937510582097494459231"
@.2="The author has two Z$100000000000000 Zimbabwe notes (100 trillion)."
@.3="-in Aus$+1411.8millions"
@.4="===US$0017440 millions=== (in 2000 dollars)"
@.5="123.e8000 is pretty big."
@.6="The land area of the earth is 57268900(29% of the surface) square miles."
@.7="Ain't no numbers in this here words, nohow, no way, Jose."
@.8="James was never known as 0000000007"
@.9="Arthur Eddington wrote: I believe there are 15747724136275002577605653961181555468044717914527116709366231425076185631031296 protons in the universe."
@.10="   $-140000±100 millions."
@.11="6/9/1946 was a good year for some."

  do i=1  while  @.i\=='';   if i\==1  then say     /*process each string*/
  say 'before:'@.i                                   /*show the before str*/
  if i==1  then say ' after:'comma(@.i,'blank',5,,6) /*period=5, start=6. */
  if i==2  then say ' after:'comma(@.i,".")          /*comma=a decimal pt.*/
  if i>2   then say ' after:'comma(@.i)              /*use the defaults.  */
  end   /*i*/
exit                                   /*stick a fork in it, we're done.*/
/*──────────────────────────────────COMMA subroutine────────────────────*/
comma: procedure;  parse arg _,c,p,t,s;  arg ,u;  c=p(c ",")
if u=='BLANK'  then c=' '              /*special case for a "blank" sep.*/
o=p(p 3)                               /*get optional period length.    */
p=abs(o)                               /*get positive period length.    */
t=p(t 999999999)                       /*get max# of "commas" to insert.*/
s=p(s 1)                               /*get optional start position.   */
if \isInt(p)| \isInt(t)| \isInt(s)| t<1| s<1 | p==0| arg()>5  then return _
n=_'.9';   #=123456789;   k=0          /*define some handy-dandy vars.  */
if o<0  then do                        /*using a negative period length.*/
             b=verify(_,' ',,s)        /*position of 1st non-blank char.*/
             e=length(_)-verify(reverse(_),' ')+1-p
             end
        else do                        /*using a positive period length.*/
             b=verify(n,#,"M",s)       /*position of 1st useable digits.*/
             z=max(1,verify(n,#"0.",'M',s))
             e=verify(n,#'0',,max(1,verify(n,#"0.",'M',s)))-p-1
             end
if e>0 & b>0  then do j=e  to b  by -p  while k<t   /*commatize the digs*/
                   _=insert(c,_,j)                  /*comma spray ──► #.*/
                   k=k+1                            /*bump commatizing. */
                   end   /*j*/
return _
/*──────────────────────────────────one-liner subroutines───────────────*/
isInt: return datatype(arg(1),'W')     /*is the argument a whole number?*/
p:     return word(arg(1), 1)          /*return the first word found.   */</lang>
output when using the internal strings for input:
before:pi=3.14159265358979323846264338327950288419716939937510582097494459231
 after:pi=3.14159 26535 89793 23846 26433 83279 50288 41971 69399 37510 58209 74944 59231

before:The author has two Z$100000000000000 Zimbabwe notes (100 trillion).
 after:The author has two Z$100.000.000.000.000 Zimbabwe notes (100 trillion).

before:-in Aus$+1411.8millions
 after:-in Aus$+1,411.8millions

before:===US$0017440 millions=== (in 2000 dollars)
 after:===US$0017,440 millions=== (in 2000 dollars)

before:123.e8000 is pretty big.
 after:123.e8000 is pretty big.

before:The land area of the earth is 57268900(29% of the surface) square miles.
 after:The land area of the earth is 57,268,900(29% of the surface) square miles.

before:Ain't no numbers in this here words, nohow, no way, Jose.
 after:Ain't no numbers in this here words, nohow, no way, Jose.

before:James was never known as 0000000007
 after:James was never known as 0000000007

before:Arthur Eddington wrote: I believe there are 15747724136275002577605653961181555468044717914527116709366231425076185631031296 protons in the universe.
 after:Arthur Eddington wrote: I believe there are 15,747,724,136,275,002,577,605,653,961,181,555,468,044,717,914,527,116,709,366,231,425,076,185,631,031,296 protons in the universe.

before:   $-140000±100 millions.
 after:   $-140,000±100 millions.

before:6/9/1946 was a good year for some.
 after:6/9/1946 was a good year for some.