
Long literals, with continuations

From Rosetta Code
Long literals, with continuations is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

This task is about writing a computer program that has long literals   (character literals that may require specifying the words/tokens on more than one (source) line),   either with continuations or some other method, such as abutments or concatenations   (or some other mechanisms).


The literal is to be in the form of a "list",   a literal that contains many words (tokens) separated by a blank (space),   in this case   (so as to have a common list),   the (English) names of the chemical elements of the periodic table.


The list is to be in (ascending) order of the element's atomic number:

hydrogen helium lithium beryllium boron carbon nitrogen oxygen fluorine neon sodium aluminum silicon ...

... up to the last known (named) element   (at this time).


Do not include any of the   "unnamed"   element names such as:

ununennium unquadnilium triunhexium penthextrium penthexpentium septhexunium octenntrium ennennbium


To make computer programming languages comparable,   the statement widths should be restricted to less than   81   bytes (characters),   or less if a computer programming language has more restrictive limitations or standards.

Also mention what column the programming statements can start in if   not   in column one.


The list   may   have leading/embedded/trailing blanks during the declaration   (the actual program statements),   this is to allow the list to be more readable.   The "final" list shouldn't have any leading/trailing or superfluous blanks   (when stored in the program's "memory").
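
As a minimal sketch of the difference between the declared and the stored list, the Go fragment below collapses every run of blanks, tabs and newlines into a single blank. The helper name `normalize` and the four-element sample are assumptions made for illustration; `strings.Fields` and `strings.Join` are standard library functions.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize collapses every run of whitespace (spaces, tabs, newlines) into
// one space and drops leading and trailing whitespace.
func normalize(list string) string {
	return strings.Join(strings.Fields(list), " ")
}

func main() {
	// The declaration is laid out for readability, with extra blanks.
	declared := `
	    hydrogen   helium
	    lithium    beryllium
	`
	stored := normalize(declared)
	fmt.Printf("%q\n", stored) // prints "hydrogen helium lithium beryllium"
}
```

Because the layout of the declaration does not affect the stored value, a later editor can add names on new lines without worrying about alignment.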

This list should be written with the idea in mind that the program   will   be updated,   most likely by someone other than the original author,   as there will be newer (discovered) elements of the periodic table being added   (possibly in the near future).   These future updates should be one of the primary concerns in writing these programs and it should be "easy" for someone else to add chemical elements to the list   (within the computer program).

Attention should be paid so as to not exceed the   clause length   of continued or specified statements,   if there is such a restriction.   If the limit is greater than (say) 4,000 bytes or so,   it needn't be mentioned here.


Task
  •   Write a computer program (by whatever name) to contain a list of the known elements.
  •   The program should eventually contain a long literal of words   (the elements).
  •   The literal should show how one could create a long list of blank-delineated words.
  •   The "final" (stored) list should only have a single blank between elements.
  •   Try to use the most idiomatic approach(es) in creating the final list.
  •   Use continuation if possible, and/or show alternatives   (possibly using concatenation).
  •   Use a program comment to explain what the continuation character is if not obvious.
  •   The program should contain a variable that has the date of the last update/revision.
  •   The program, when run, should display with verbiage:
      •   The last update/revision date   (and should be unambiguous).
      •   The number of chemical elements in the list.
      •   The name of the highest (last) element name.


Show all output here, on this page.

Factor

The qw vocabulary provides Perl-ish syntax for arrays of strings. For instance, the literal
qw{ a    bc d }
expands to
{ "a" "bc" "d" }
during parse time. This is convenient to use when the strings that are stored contain no whitespace.


The convention in Factor is to limit lines to 64 characters wide if possible. This constraint is sometimes waived for large literals, but it was easy enough to accommodate here.

Works with: Factor version 0.99 2020-03-02
USING: formatting kernel qw sequences ;
 
qw{
hydrogen helium lithium beryllium
boron carbon nitrogen oxygen
fluorine neon sodium magnesium
aluminum silicon phosphorus sulfur
chlorine argon potassium calcium
scandium titanium vanadium chromium
manganese iron cobalt nickel
copper zinc gallium germanium
arsenic selenium bromine krypton
rubidium strontium yttrium zirconium
niobium molybdenum technetium ruthenium
rhodium palladium silver cadmium
indium tin antimony tellurium
iodine xenon cesium barium
lanthanum cerium praseodymium neodymium
promethium samarium europium gadolinium
terbium dysprosium holmium erbium
thulium ytterbium lutetium hafnium
tantalum tungsten rhenium osmium
iridium platinum gold mercury
thallium lead bismuth polonium
astatine radon francium radium
actinium thorium protactinium uranium
neptunium plutonium americium curium
berkelium californium einsteinium fermium
mendelevium nobelium lawrencium rutherfordium
dubnium seaborgium bohrium hassium
meitnerium darmstadtium roentgenium copernicium
nihonium flerovium moscovium livermorium
tennessine oganesson
}
 
"2020-03-23"  ! last revision date in YYYY-MM-DD format
 
"Last revision: %s\n" printf
[ length ] [ last ] bi
"Number of elements: %d\nLast element: %s\n" printf
Output:
Last revision: 2020-03-23
Number of elements: 118
Last element: oganesson

Go

package main
 
import (
	"fmt"
	"regexp"
	"strings"
)
 
// Uses a 'raw string literal' which is a character sequence enclosed in back quotes.
// Within the quotes any character (including new line) may appear except
// back quotes themselves.
var elements = `
hydrogen helium lithium beryllium
boron carbon nitrogen oxygen
fluorine neon sodium magnesium
aluminum silicon phosphorus sulfur
chlorine argon potassium calcium
scandium titanium vanadium chromium
manganese iron cobalt nickel
copper zinc gallium germanium
arsenic selenium bromine krypton
rubidium strontium yttrium zirconium
niobium molybdenum technetium ruthenium
rhodium palladium silver cadmium
indium tin antimony tellurium
iodine xenon cesium barium
lanthanum cerium praseodymium neodymium
promethium samarium europium gadolinium
terbium dysprosium holmium erbium
thulium ytterbium lutetium hafnium
tantalum tungsten rhenium osmium
iridium platinum gold mercury
thallium lead bismuth polonium
astatine radon francium radium
actinium thorium protactinium uranium
neptunium plutonium americium curium
berkelium californium einsteinium fermium
mendelevium nobelium lawrencium rutherfordium
dubnium seaborgium bohrium hassium
meitnerium darmstadtium roentgenium copernicium
nihonium flerovium moscovium livermorium
tennessine oganesson
`

 
func main() {
	lastRevDate := "March 24th, 2020"
	re := regexp.MustCompile(`\s+`) // split on one or more whitespace characters
	els := re.Split(strings.TrimSpace(elements), -1)
	numEls := len(els)
	// Recombine as a single string with elements separated by a single space.
	elements2 := strings.Join(els, " ")
	// Required output.
	fmt.Println("Last revision Date: ", lastRevDate)
	fmt.Println("Number of elements: ", numEls)
	// The compiler complains that 'elements2' is unused if we don't use
	// something like this to get the last element rather than just els[numEls-1].
	lix := strings.LastIndex(elements2, " ") // get index of last space
	fmt.Println("Last element      : ", elements2[lix+1:])
}
Output:
Last revision Date:  March 24th, 2020
Number of elements:  118
Last element      :  oganesson
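
The task also asks for alternatives such as concatenation. The sketch below is one possible way to do that in Go, not a replacement for the entry above: it builds the literal from interpreted string constants joined with `+`, and splits it with `strings.Fields` instead of a regular expression. The list is deliberately cut down to the first twelve elements to keep the example short, and the names `elementsConcat` and `summarize` are made up for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// A + at the end of a line lets the expression continue on the next line.
// Abbreviated to the first twelve elements to keep the sketch short.
const elementsConcat = "hydrogen helium lithium beryllium " +
	"boron carbon nitrogen oxygen " +
	"fluorine neon sodium magnesium"

// summarize returns the number of names in list and the last name.
// It returns 0 and "" for an empty list.
func summarize(list string) (int, string) {
	names := strings.Fields(list)
	if len(names) == 0 {
		return 0, ""
	}
	return len(names), names[len(names)-1]
}

func main() {
	n, last := summarize(elementsConcat)
	fmt.Println("Number of elements: ", n)
	fmt.Println("Last element      : ", last)
}
```

Each string piece must end with a blank (or the next must start with one), otherwise two names would run together; the raw string literal used above avoids that pitfall.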

Julia

As of the current revision, the task does not mention lower versus upper case, but the list below uses lower case per a request anyway. The task does ask which column code may start in: the start column does not matter to Julia, or to most modern language compilers, apart from Python, where indentation is significant for code (though not for string data).

using Dates
 
# FOR FUTURE EDITORS:
#
# Add to this list by adding more lines of text to this listing, placing the
# new words of text before the last """ below, with all entries separated by
# spaces.
#
const CHEMICAL_ELEMENTS = """
 
hydrogen helium lithium beryllium boron carbon nitrogen oxygen fluorine neon
sodium magnesium aluminum silicon phosphorus sulfur chlorine argon potassium
calcium scandium titanium vanadium chromium manganese iron cobalt nickel copper
zinc gallium germanium arsenic selenium bromine krypton rubidium strontium
yttrium zirconium niobium molybdenum technetium ruthenium rhodium palladium
silver cadmium indium tin antimony tellurium iodine xenon cesium barium
lanthanum cerium praseodymium neodymium promethium samarium europium gadolinium
terbium dysprosium holmium erbium thulium ytterbium lutetium hafnium tantalum
tungsten rhenium osmium iridium platinum gold mercury thallium lead bismuth
polonium astatine radon francium radium actinium thorium protactinium uranium
neptunium plutonium americium curium berkelium californium einsteinium fermium
mendelevium nobelium lawrencium rutherfordium dubnium seaborgium bohrium hassium
meitnerium darmstadtium roentgenium copernicium nihonium flerovium moscovium
livermorium tennessine oganesson
 
"""
#
# END OF ABOVE LISTING--DO NOT ADD ELEMENTS BELOW THIS LINE
#
 
const EXCLUDED = split(strip(
"ununennium unquadnilium triunhexium penthextrium penthexpentium " *
" septhexunium octenntrium ennennbium"), r"\s+")
 
function process_chemical_element_list(s = CHEMICAL_ELEMENTS)
    # remove leading and trailing whitespace
    s = strip(s)
    # return a list after splitting using whitespace between words as a separator
    return [element for element in split(s, r"\s+") if !(element in EXCLUDED)]
end

function report()
    filedate = Dates.unix2datetime(mtime(@__FILE__))
    element_list = process_chemical_element_list()
    element_count = length(element_list)
    last_element_in_list = element_list[end]

    println("File last revised (formatted as dateTtime): ", filedate, " GMT")
    println("Length of element list: ", element_count)
    println("last element in list: ", last_element_in_list)
end
 
report()
 
Output:
File last revised (formatted as dateTtime): 2020-03-24T02:48:55.421 GMT
Length of element list: 118
last element in list: oganesson

Phix

Back-ticks and triple-quotes permit multi-line strings. We first replace all/any cr/lf/tab characters with spaces, then split (by default on a single space), omitting empty elements. You could use spaced_elements = join(elements) to join them back up into a space-separated single string, if that's really what you want, and you could then, like Go, use rfind(' ',spaced_elements) to re-extract the last one. You could also, like Julia, use get_file_date(command_line()[2]) instead of the hand-written last_updated constant. Phix code is free-format: indent things however you like, and there is no specific maximum line length.

constant last_updated = "March 24th, 2020",
elements_text = `
hydrogen helium lithium beryllium
boron carbon nitrogen oxygen
fluorine neon sodium magnesium
aluminum silicon phosphorus sulfur
chlorine argon potassium calcium
scandium titanium vanadium chromium
manganese iron cobalt nickel
copper zinc gallium germanium
arsenic selenium bromine krypton
rubidium strontium yttrium zirconium
niobium molybdenum technetium ruthenium
rhodium palladium silver cadmium
indium tin antimony tellurium
iodine xenon cesium barium
lanthanum cerium praseodymium neodymium
promethium samarium europium gadolinium
terbium dysprosium holmium erbium
thulium ytterbium lutetium hafnium
tantalum tungsten rhenium osmium
iridium platinum gold mercury
thallium lead bismuth polonium
astatine radon francium radium
actinium thorium protactinium uranium
neptunium plutonium americium curium
berkelium californium einsteinium fermium
mendelevium nobelium lawrencium rutherfordium
dubnium seaborgium bohrium hassium
meitnerium darmstadtium roentgenium copernicium
nihonium flerovium moscovium livermorium
tennessine oganesson
`,
elements = split(substitute_all(elements_text,"\n\r\t"," "),no_empty:=true),
fmt = """
Last revision: %s
Number of elements: %d
The last of which is: `%s`
"""
printf(1,fmt,{last_updated,length(elements),elements[$]})
Output:
Last revision: March 24th, 2020
Number of elements: 118
The last of which is: `oganesson`

Raku

This example is incorrect. Please fix the code and remove this message.
Details: The task is to have a list of the names of the elements, not their atomic weight and chemical element symbol.
Also, the element names are not capitalized.

Incorrectly marked incorrect. Enforcing rules that exist only in somebody's head.

Works with: Rakudo version 2020.02

Not really sure I understand the point of this task. It seems to be: load some list into memory and manipulate it somehow. It is exceptionally boring to just read it in and then read it back out again, so this performs some more interesting manipulations. Uses the < > quoting construct for the literal string: unlimited (memory-limited) characters; spaces, new-lines and blank lines don't matter.

my %periodic;
%periodic<revision-date> = Date.new(2020,3,23);
%periodic<table> = |<
 
Hydrogen 1.0079 H Helium 4.0026 He
Lithium 6.941 Li Beryllium 9.0122 Be
Boron 10.811 B Carbon 12.0107 C
Nitrogen 14.0067 N Oxygen 15.9994 O
Fluorine 18.9984 F Neon 20.1797 Ne
Sodium 22.9897 Na Magnesium 24.305 Mg
Aluminum 26.9815 Al Silicon 28.0855 Si
Phosphorus 30.9738 P Sulfur 32.065 S
Chlorine 35.453 Cl Potassium 39.0983 K
Argon 39.948 Ar Calcium 40.078 Ca
Scandium 44.9559 Sc Titanium 47.867 Ti
Vanadium 50.9415 V Chromium 51.9961 Cr
Manganese 54.938 Mn Iron 55.845 Fe
Nickel 58.6934 Ni Cobalt 58.9332 Co
Copper 63.546 Cu Zinc 65.39 Zn
Gallium 69.723 Ga Germanium 72.64 Ge
Arsenic 74.9216 As Selenium 78.96 Se
Bromine 79.904 Br Krypton 83.8 Kr
Rubidium 85.4678 Rb Strontium 87.62 Sr
Yttrium 88.9059 Y Zirconium 91.224 Zr
Niobium 92.9064 Nb Molybdenum 95.94 Mo
Technetium 98 Tc Ruthenium 101.07 Ru
Rhodium 102.9055 Rh Palladium 106.42 Pd
Silver 107.8682 Ag Cadmium 112.411 Cd
Indium 114.818 In Tin 118.71 Sn
Antimony 121.76 Sb Iodine 126.9045 I
Tellurium 127.6 Te Xenon 131.293 Xe
Cesium 132.9055 Cs Barium 137.327 Ba
Lanthanum 138.9055 La Cerium 140.116 Ce
Praseodymium 140.9077 Pr Neodymium 144.24 Nd
Promethium 145 Pm Samarium 150.36 Sm
Europium 151.964 Eu Gadolinium 157.25 Gd
Terbium 158.9253 Tb Dysprosium 162.5 Dy
Holmium 164.9303 Ho Erbium 167.259 Er
Thulium 168.9342 Tm Ytterbium 173.04 Yb
Lutetium 174.967 Lu Hafnium 178.49 Hf
Tantalum 180.9479 Ta Tungsten 183.84 W
Rhenium 186.207 Re Osmium 190.23 Os
Iridium 192.217 Ir Platinum 195.078 Pt
Gold 196.9665 Au Mercury 200.59 Hg
Thallium 204.3833 Tl Lead 207.2 Pb
Bismuth 208.9804 Bi Polonium 209 Po
Astatine 210 At Radon 222 Rn
Francium 223 Fr Radium 226 Ra
Actinium 227 Ac Protactinium 231.0359 Pa
Thorium 232.0381 Th Neptunium 237 Np
Uranium 238.0289 U Americium 243 Am
Plutonium 244 Pu Curium 247 Cm
Berkelium 247 Bk Californium 251 Cf
Einsteinium 252 Es Fermium 257 Fm
Mendelevium 258 Md Nobelium 259 No
Rutherfordium 261 Rf Lawrencium 262 Lr
Dubnium 262 Db Bohrium 264 Bh
Seaborgium 266 Sg Meitnerium 268 Mt
Roentgenium 272 Rg Hassium 277 Hs
Darmstadtium ??? Ds Copernicium ??? Cn
Nihonium ??? Nh Flerovium ??? Fl
Moscovium ??? Mc Livermorium ??? Lv
Tennessine ??? Ts Oganesson ??? Og
 
>.words.map: { (:name($^a), :weight($^b), :symbol($^c)).hash };
 
put 'Revision date: ', %periodic<revision-date>;
put 'Last element by position (nominally by weight): ', %periodic<table>.tail.<name>;
put 'Total number of elements: ', %periodic<table>.elems;
put 'Last element sorted by full name: ', %periodic<table>.sort( *.<name> ).tail.<name>;
put 'Longest element name: ', %periodic<table>.sort( *.<name>.chars ).tail.<name>;
put 'Shortest element name: ', %periodic<table>.sort( -*.<name>.chars ).tail.<name>;
put 'Symbols for elements whose name starts with "P": ', %periodic<table>.grep( *.<name>.starts-with('P') )».<symbol>;
put "Elements with molecular weight between 20 & 40:\n ",%periodic<table>.grep( {+.<weight> ~~ Numeric and 20 < .<weight> < 40} )».<name>;
put "SCRN: ", %periodic<table>[87,17,92]».<symbol>.join.tclc;
Output:
Revision date: 2020-03-23
Last element by position (nominally by weight): Oganesson
Total number of elements: 118
Last element sorted by full name: Zirconium
Longest element name: Rutherfordium
Shortest element name: Tin
Symbols for elements whose name starts with "P": P K Pd Pr Pm Pt Po Pa Pu
Elements with molecular weight between 20 & 40:
 Neon Sodium Magnesium Aluminum Silicon Phosphorus Sulfur Chlorine Potassium Argon
SCRN: Raku

REXX

using continuations

This method will not work for some REXXes such as PC/REXX and Personal REXX as those two REXXes have a clause length limit of   1,024   bytes.

The   space   BIF is used to eliminate superfluous blanks from the list.

Most modern REXXes have no practical limit for a clause length.

/*REXX pgm illustrates how to code a list of words  (named chemical elements  */
/*──────────────────────── ordered by their atomic number) in a list format. */
 
$= 'hydrogen helium lithium beryllium boron carbon' ,
'nitrogen oxygen fluorine neon sodium magnesium' ,
'aluminum silicon phosphorus sulfur chlorine argon' ,
'potassium calcium scandium titanium vanadium chromium' ,
'manganese iron cobalt nickel copper zinc' ,
'gallium germanium arsenic selenium bromine krypton' ,
'rubidium strontium yttrium zirconium niobium molybdenum',
'technetium ruthenium rhodium palladium silver cadmium' ,
'indium tin antimony tellurium iodine xenon' ,
'cesium barium lanthanum cerium praseodymium neodymium' ,
'promethium samarium europium gadolinium terbium dysprosium',
'holmium erbium thulium ytterbium lutetium hafnium' ,
'tantalum tungsten rhenium osmium iridium platinum' ,
'gold mercury thallium lead bismuth polonium' ,
'astatine radon francium radium actinium thorium' ,
'protactinium uranium neptunium plutonium americium curium' ,
'berkelium californium einsteinium fermium mendelevium nobelium' ,
'lawrencium rutherfordium dubnium seaborgium bohrium hassium' ,
'meitnerium darmstadtium roentgenium copernicium nihonium flerovium' ,
'moscovium livermorium tennessine oganesson'
 
/* [↑] element list using continuation (commas).*/
 
updated= 'February 29th, 2020' /*date of the last revision of list.*/
say 'revision date of the list: ' updated /*show the date of the last update. */
elements= space($) /*elide excess blanks in the list*/
#= words(elements) /*the number of elements in the list*/
say 'number of elements in the list: ' # /*show the number of elements.      */
say 'the last element is: ' word($, #) /*stick a fork in it, we're all done*/
output   when using the default input:
revision date of the list:  February 29th, 2020
number of elements in the list:  118
the last element is:  oganesson

using concatenations

Note that at least one REXX has a maximum width for any one line, whether or not it is continued.
PC/REXX and Personal REXX have a limit of 250 characters   (which includes comments).

Also note that REXX comments are   not   totally ignored by the parser, they are kept around for   tracing   and
for the invocation of the   sourceline   BIF.


This REXX version uses concatenation (also called abutment) to build the list.

/*REXX pgm illustrates how to code a list of words  (named chemical elements  */
/*──────────────────────── ordered by their atomic number) in a list format. */
 
$= 'hydrogen helium lithium beryllium boron carbon'
$=$ 'nitrogen oxygen fluorine neon sodium magnesium'
$=$ 'aluminum silicon phosphorus sulfur chlorine argon'
$=$ 'potassium calcium scandium titanium vanadium chromium'
$=$ 'manganese iron cobalt nickel copper zinc'
$=$ 'gallium germanium arsenic selenium bromine krypton'
$=$ 'rubidium strontium yttrium zirconium niobium molybdenum'
$=$ 'technetium ruthenium rhodium palladium silver cadmium'
$=$ 'indium tin antimony tellurium iodine xenon'
$=$ 'cesium barium lanthanum cerium praseodymium neodymium'
$=$ 'promethium samarium europium gadolinium terbium dysprosium'
$=$ 'holmium erbium thulium ytterbium lutetium hafnium'
$=$ 'tantalum tungsten rhenium osmium iridium platinum'
$=$ 'gold mercury thallium lead bismuth polonium'
$=$ 'astatine radon francium radium actinium thorium'
$=$ 'protactinium uranium neptunium plutonium americium curium'
$=$ 'berkelium californium einsteinium fermium mendelevium nobelium'
$=$ 'lawrencium rutherfordium dubnium seaborgium bohrium hassium'
$=$ 'meitnerium darmstadtium roentgenium copernicium nihonium flerovium'
$=$ 'moscovium livermorium tennessine oganesson'
 
/* [↑] element list using abutments*/
 
update= '29Feb2020' /*date of the last revision of list.*/
say 'revision date of the list: ' update /*show the date of the last update. */
elements= space($) /*elide excess blanks in the list*/
#= words(elements) /*the number of elements in the list*/
say 'number of elements in the list: ' # /*show the number of elements.      */
say 'the last element is: ' word($, #) /*stick a fork in it, we're all done*/
output   when using the default input:
revision date of the list:  29Feb2020
number of elements in the list:  118
the last element is:  oganesson

zkl

This solution uses a "doc string", a chunk of text that the parser eats verbatim. It starts and ends with #<<<. If started with #<<<", a leading " is added to the text. The text is then parsed as one [long] line.

The string split method creates a list of items split at white space (by default). To turn that into one string with one space between each item, use: elements.concat(" ")

revisionDate:="2020-03-23";
elements:=
#<<<"
hydrogen helium lithium beryllium boron carbon
nitrogen oxygen fluorine neon sodium magnesium
aluminum silicon phosphorus sulfur chlorine argon
potassium calcium scandium titanium vanadium chromium
manganese iron cobalt nickel copper zinc
gallium germanium arsenic selenium bromine krypton
rubidium strontium yttrium zirconium niobium molybdenum
technetium ruthenium rhodium palladium silver cadmium
indium tin antimony tellurium iodine xenon
cesium barium lanthanum cerium praseodymium neodymium
promethium samarium europium gadolinium terbium dysprosium
holmium erbium thulium ytterbium lutetium hafnium
tantalum tungsten rhenium osmium iridium platinum
gold mercury thallium lead bismuth polonium
astatine radon francium radium actinium thorium
protactinium uranium neptunium plutonium americium curium
berkelium californium einsteinium fermium mendelevium nobelium
lawrencium rutherfordium dubnium seaborgium bohrium hassium
meitnerium darmstadtium roentgenium copernicium nihonium flerovium
moscovium livermorium tennessine oganesson"
.split();
#<<<
println("Revision date: ",revisionDate);
println(elements.len()," elements, the last being \"",elements[-1],"\"");
Output:
Revision date: 2020-03-23
118 elements, the last being "oganesson"