Tree datastructures: Difference between revisions
Line 471: | Line 471: | ||
# forestFromNestLevels :: [(Int, |
# forestFromNestLevels :: [(Int, a)] -> [Tree a] |
||
def forestFromNestLevels(tuples): |
def forestFromNestLevels(tuples): |
||
'''A list of trees derived from a list of |
'''A list of trees derived from a list of values paired |
||
with integers giving their levels of indentation. |
with integers giving their levels of indentation. |
||
''' |
''' |
||
def go(xs): |
def go(xs): |
||
if xs: |
if xs: |
||
(intIndent, |
(intIndent, v) = xs[0] |
||
(firstTreeLines, rest) = span( |
(firstTreeLines, rest) = span( |
||
lambda x: intIndent < x[0] |
lambda x: intIndent < x[0] |
||
)(xs[1:]) |
)(xs[1:]) |
||
return [Node( |
return [Node(v)(go(firstTreeLines))] + go(rest) |
||
else: |
else: |
||
return [] |
return [] |
||
Line 516: | Line 516: | ||
'Parse tree from outline text:\n', |
'Parse tree from outline text:\n', |
||
forestJSON(forestA), |
forestJSON(forestA), |
||
'\nNesting level list from tree:\n', |
'\nNesting level list from tree:\n', |
||
json.dumps(nestLevels, indent=2), |
json.dumps(nestLevels, indent=2), |
||
'\nTree rebuilt from nesting level list:\n', |
'\nTree rebuilt from nesting level list:\n', |
||
forestJSON(forestB), |
forestJSON(forestB), |
||
Line 564: | Line 566: | ||
return [node['root'], [go(x) for x in node['nest']]] |
return [node['root'], [go(x) for x in node['nest']]] |
||
return [go(x) for x in xs] |
return [go(x) for x in xs] |
||
Revision as of 09:04, 16 October 2019
The following shows a tree of data with nesting denoted by visual levels of indentation:
RosettaCode rocks code comparison wiki mocks golfing
A common datastructure for trees is to define node structures having a name and a, (possibly empty), list of child nodes. The nesting of nodes captures the indentation of the tree. Lets call this the nest form.
Another datastructure for trees is to construct from the root an ordered list of the nodes level of indentation and the name of that node. The indentation for the root node is zero; node 'rocks is indented by one level from the left, and so on. Lets call this the indent form.
0 RosettaCode 1 rocks 2 code ...
- Task
- Create/use a nest datastructure format and textual representation for arbitrary trees.
- Create/use an indent datastructure format and textual representation for arbitrary trees.
- Create methods/classes/proceedures/routines etc to:
- Change from a nest tree datastructure to an indent one.
- Change from an indent tree datastructure to a nest one
- Use the above to encode the example at the start into the nest format, and show it.
- transform the initial nest format to indent format and show it.
- transform the indent format to final nest format and show it.
- Compare initial and final nest formats which should be the same.
Show all output on this page.
JavaScript
Parses the initial tree from outline text, and writes out the flat and nested structures in a minimal JSON format: <lang javascript>(() => {
'use strict';
// main :: IO () const main = () => {
// (INDENT, STRING) PAIRS FROM OUTLINE ------------ const indentLevelTuplesA = indentLevelsFromLines( lines(strOutlineB) );
// LIST OF TREES FROM LIST OF (INDENT, STRING) PAIRS const forestA = forestFromIndentLevels( indentLevelTuplesA );
// (INDENT, STRING) PAIRS FROM LIST OF TREES ------ const indentLevelTuplesB = indentLevelsFromForest(forestA);
// LIST OF TREES FROM SECONDARY (INDENT, STRING) PAIRS const forestB = forestFromIndentLevels( indentLevelTuplesB );
// JSON OUTPUT OF FORESTS AND INDENT TUPLES -------
console.log('Tree structure from outline:\n') console.log(jsonFromForest(forestA));
console.log('\n\nIndent levels from tree structure:\n') console.log(jsonFromIndentLevels(indentLevelTuplesB));
console.log('\nTree structure from indent levels:\n') console.log(jsonFromForest(forestB)); };
// CONVERSIONS BETWEEN OUTLINES, TREES, AND (LEVEL, VALUE) PAIRS
// indentLevelsFromLines :: [String] -> [(Int, String)] const indentLevelsFromLines = xs => { const indentTextPairs = xs.map(compose( firstArrow(length), span(isSpace) )), indentUnit = minimum(indentTextPairs.flatMap(pair => { const w = fst(pair); return 0 < w ? [w] : []; })); return indentTextPairs.map( firstArrow(flip(div)(indentUnit)) ); };
// forestFromIndentLevels :: [(Int, String)] -> [Tree String] const forestFromIndentLevels = tuples => { const go = xs => 0 < xs.length ? (() => { const [n, s] = Array.from(xs[0]); // Lines indented under this line, // tupled with all the rest. const [firstTreeLines, rest] = Array.from( span(x => n < x[0])(xs.slice(1)) ); // This first tree, and then the rest. return [Node(s)(go(firstTreeLines))] .concat(go(rest)); })() : []; return go(tuples); };
// indentLevelsFromForest :: [Tree a] -> [(Int, a)] const indentLevelsFromForest = trees => { const go = n => node => [ [n, node.root] ] .concat(node.nest.flatMap(go(1 + n))) return trees.flatMap(go(0)); };
// JSON RENDERING OF NESTED LINES AND (LEVEL, VALUE) PAIRS
// jsonFromForest :: [Tree a] -> JSON String const jsonFromForest = trees => JSON.stringify( nestedListsFromForest(trees), null, 2 );
// nestedListsFromForest :: [Tree a] -> NestedList a const nestedListsFromForest = xs => { const go = node => [node.root, node.nest.map(go)]; return xs.map(go); };
// jsonFromIndentLevels :: [(Int, String)] -> JSON String const jsonFromIndentLevels = xs => JSON.stringify( xs.map(x => Array.from(x)), null, 2 );
// GENERIC FUNCTIONS ----------------------------
// Node :: a -> [Tree a] -> Tree a const Node = v => xs => ({ type: 'Node', root: v, // any type of value (consistent across tree) nest: xs || [] });
// Tuple (,) :: a -> b -> (a, b) const Tuple = a => b => ({ type: 'Tuple', '0': a, '1': b, length: 2 });
// compose (<<<) :: (b -> c) -> (a -> b) -> a -> c const compose = (...fs) => x => fs.reduceRight((a, f) => f(a), x);
// div :: Int -> Int -> Int const div = x => y => Math.floor(x / y);
// firstArrow :: (a -> b) -> ((a, c) -> (b, c)) const firstArrow = f => xy => Tuple(f(xy[0]))( xy[1] );
// flip :: (a -> b -> c) -> b -> a -> c const flip = f => 1 < f.length ? ( (a, b) => f(b, a) ) : (x => y => f(y)(x));
// foldl1 :: (a -> a -> a) -> [a] -> a const foldl1 = f => xs => 1 < xs.length ? xs.slice(1) .reduce(uncurry(f), xs[0]) : xs[0];
// fst :: (a, b) -> a const fst = tpl => tpl[0];
// isSpace :: Char -> Bool const isSpace = c => /\s/.test(c);
// Returns Infinity over objects without finite length. // This enables zip and zipWith to choose the shorter // argument when one is non-finite, like cycle, repeat etc
// length :: [a] -> Int const length = xs => (Array.isArray(xs) || 'string' === typeof xs) ? ( xs.length ) : Infinity;
// lines :: String -> [String] const lines = s => s.split(/[\r\n]/);
// minimum :: Ord a => [a] -> a const minimum = xs => 0 < xs.length ? ( foldl1(a => x => x < a ? x : a)(xs) ) : undefined;
// span :: (a -> Bool) -> [a] -> ([a], [a]) const span = p => xs => { const iLast = xs.length - 1; return splitAt( until(i => iLast < i || !p(xs[i]))( succ )(0) )(xs); };
// splitAt :: Int -> [a] -> ([a], [a]) const splitAt = n => xs => Tuple(xs.slice(0, n))( xs.slice(n) );
// succ :: Enum a => a -> a const succ = x => 1 + x;
// uncurry :: (a -> b -> c) -> ((a, b) -> c) const uncurry = f => (x, y) => f(x)(y);
// until :: (a -> Bool) -> (a -> a) -> a -> a const until = p => f => x => { let v = x; while (!p(v)) v = f(v); return v; };
// SAMPLE OUTLINES ------------------------------------
const strOutlineA = `Heilmeier catechism Objectives and benefits What are you trying to do? Articulate your objectives using absolutely no jargon. What are the problems you address ? How is it done today, and what are the limits of current practice? What is your solution ? What is new in your approach and why do you think it will be successful? Who cares? If you are successful, what difference will it make? Costs What are the risks? How much will it cost? How long will it take? Indicators What are the mid-term and final “exams” to check for success?`;
const strOutlineB = `Rosetta stone is a granodiorite stele engraved with Greek and Egyptian texts in different scripts. which shed new light on various homologies.`;
// MAIN --- return main();
})();</lang>
- Output:
Tree structure from outline: [ [ "Rosetta stone", [ [ "is a granodiorite stele", [ [ "engraved", [ [ "with Greek and Egyptian texts", [] ] ] ], [ "in different scripts.", [] ] ] ], [ "which shed new light", [ [ "on various homologies.", [] ] ] ] ] ] ] Indent levels from tree structure: [ [ 0, "Rosetta stone" ], [ 1, "is a granodiorite stele" ], [ 2, "engraved" ], [ 3, "with Greek and Egyptian texts" ], [ 2, "in different scripts." ], [ 1, "which shed new light" ], [ 2, "on various homologies." ] ] Tree structure from indent levels: [ [ "Rosetta stone", [ [ "is a granodiorite stele", [ [ "engraved", [ [ "with Greek and Egyptian texts", [] ] ] ], [ "in different scripts.", [] ] ] ], [ "which shed new light", [ [ "on various homologies.", [] ] ] ] ] ] ]
Python
Procedural
Just arranges the standard lists and tuples for the datastructures allowing pprint to show the different arrangement of storage.
<lang python>from pprint import pprint as pp from collections import namedtuple
def to_indent(node, depth=0, flat=None):
if flat is None: flat = [] if node: flat.append((depth, node[0])) for child in node[1]: to_indent(child, depth + 1, flat) return flat
def to_nest(lst, depth=0, level=None):
if level is None: level = [] while lst: d, name = lst[0] if d == depth: children = [] level.append((name, children)) lst.pop(0) elif d > depth: # down to_nest(lst, d, children) elif d < depth: # up return return level[0] if level else None
if __name__ == '__main__':
print('Start Nest format:') nest = ('RosettaCode', [('rocks', [('code', []), ('comparison', []), ('wiki', [])]), ('mocks', [('golfing', [])])]) pp(nest, width=25)
print('\n... To Indent format:') as_ind = to_indent(nest) pp(as_ind, width=25)
print('\n... To Nest format:') as_nest = to_nest(as_ind) pp(as_nest, width=25)
if nest != as_nest: print("Whoops round-trip issues")</lang>
- Output:
Start Nest format: ('RosettaCode', [('rocks', [('code', []), ('comparison', []), ('wiki', [])]), ('mocks', [('golfing', [])])]) ... To Indent format: [(0, 'RosettaCode'), (1, 'rocks'), (2, 'code'), (2, 'comparison'), (2, 'wiki'), (1, 'mocks'), (2, 'golfing')] ... To Nest format: ('RosettaCode', [('rocks', [('code', []), ('comparison', []), ('wiki', [])]), ('mocks', [('golfing', [])])])
Functional
Using a Node constructor with root and nest keys for the value and sub-forest of each tree node, and serialising both trees and nesting-level lists to JSON-compatible formats.
Functional composition, as an alternative to .append() and .pop() mutations.
(Initial tree constructed as the parse of an outline text)
<lang python>Tree data structures
from itertools import chain, takewhile import json
- Node :: a -> [Tree a] -> Tree a
def Node(v):
Constructor for a Tree node which connects a value of some kind to a list of zero or more child trees. return lambda xs: {'type': 'Tree', 'root': v, 'nest': xs}
- forestFromNestLevels :: [(Int, a)] -> [Tree a]
def forestFromNestLevels(tuples):
A list of trees derived from a list of values paired with integers giving their levels of indentation. def go(xs): if xs: (intIndent, v) = xs[0] (firstTreeLines, rest) = span( lambda x: intIndent < x[0] )(xs[1:]) return [Node(v)(go(firstTreeLines))] + go(rest) else: return [] return go(tuples)
- nestLevelsFromForest :: [Tree a] -> [(Int, a)]
def nestLevelsFromForest(xs):
A flat list of (nest level, value) tuples, representing a series of trees. def go(level): return lambda node: [(level, node['root'])] + concatMap( go(1 + level) )(node['nest']) return concatMap(go(0))(xs)
- TEST ----------------------------------------------------
- main :: IO ()
def main():
Conversion from trees to flat lists of nest levels, and back again, with each stage shown as a JSON string. forestA = forestFromNestLevels( indentLevelsFromLines(OUTLINE.splitlines()) ) nestLevels = nestLevelsFromForest(forestA) forestB = forestFromNestLevels(nestLevels)
for x in [ 'Parse tree from outline text:\n', forestJSON(forestA),
'\nNesting level list from tree:\n', json.dumps(nestLevels, indent=2),
'\nTree rebuilt from nesting level list:\n', forestJSON(forestB), ]: print(x)
- INITIAL TREE FROM PARSE OF OUTLINE TEXT -----------------
- indentLevelsFromLines :: [String] -> [(Int, String)]
def indentLevelsFromLines(xs):
Each input line stripped of leading white space, and tupled with a preceding integer giving its level of indentation from 0 upwards. indentTextPairs = [ (n, s[n:]) for (n, s) in ((len(list(takewhile(isSpace, x))), x) for x in xs) ] indentUnit = min(concatMap( lambda x: [x[0]] if x[0] else [] )(indentTextPairs)) return [ (x[0] // indentUnit, x[1]) for x in indentTextPairs ]
- JSON SERIALISATION --------------------------------------
- forestJSON :: [Tree a] -> JSON String
def forestJSON(trees):
A simple JSON serialisation of a list of trees, with each tree node represented as a [value, nodes] pair. return json.dumps( forestAsNestedPairs(trees), indent=2 )
- forestAsNestedPairs :: [Tree a] -> NestedPair [(a, [NestedPair])]
def forestAsNestedPairs(xs):
A simple nested pair representation of a tree. def go(node): return [node['root'], [go(x) for x in node['nest']]] return [go(x) for x in xs]
- GENERIC -------------------------------------------------
- concatMap :: (a -> [b]) -> [a] -> [b]
def concatMap(f):
A concatenated list or string over which a function f has been mapped. The list monad can be derived by using an (a -> [b]) function which wraps its output in a list (using an empty list to represent computational failure). return lambda xs: (.join if isinstance(xs, str) else list)( chain.from_iterable(map(f, xs)) )
- isSpace :: Char -> Bool
- isSpace :: String -> Bool
def isSpace(s):
True if s is not empty, and contains only white space. return s.isspace()
- span :: (a -> Bool) -> [a] -> ([a], [a])
def span(p):
The longest (possibly empty) prefix of xs that contains only elements satisfying p, tupled with the remainder of xs. span p xs is equivalent to (takeWhile p xs, dropWhile p xs). def go(xs): prefix = list(takewhile(p, xs)) return (prefix, xs[len(prefix):]) return lambda xs: go(xs)
- MAIN ---
if __name__ == '__main__':
OUTLINE = Rosetta stone is a granodiorite stele engraved with Greek and Egyptian texts in different scripts. which shed new light on various homologies.
main()</lang>
- Output:
Parse tree from outline text: [ [ "Rosetta stone", [ [ "is a granodiorite stele", [ [ "engraved", [ [ "with Greek and Egyptian texts", [] ] ] ], [ "in different scripts.", [] ] ] ], [ "which shed new light", [ [ "on various homologies.", [] ] ] ] ] ] ] Nesting level list from tree: [ [ 0, "Rosetta stone" ], [ 1, "is a granodiorite stele" ], [ 2, "engraved" ], [ 3, "with Greek and Egyptian texts" ], [ 2, "in different scripts." ], [ 1, "which shed new light" ], [ 2, "on various homologies." ] ] Tree rebuilt from nesting level list: [ [ "Rosetta stone", [ [ "is a granodiorite stele", [ [ "engraved", [ [ "with Greek and Egyptian texts", [] ] ] ], [ "in different scripts.", [] ] ] ], [ "which shed new light", [ [ "on various homologies.", [] ] ] ] ] ] ]