Array: Difference between revisions

Content added Content deleted
m (Minor tidying.)
Line 1: Line 1:
[[Category:Encyclopedia]]An '''array''' is a composite data type, meaning it can store multiple values, and so is in the [[Collections|collection]] category. The stored values are called '''elements''' of the array and are accessed by a sequence of indices. In a program, indices may be a mixture of constant values or variables which allow the elements accessed to vary under program control. The indices of an array are [http://en.wikipedia.org/wiki/Total_order totally ordered].
An '''array''' is a composite data type -- it can store multiple values, and so is in the [[Collections|collection]] category. The stored values are called '''elements''' of the array, and are accessed by a sequence of indices. In a program, indices may be a mixture of constant values or variables that allow the elements accessed to vary under program control. The indices of an array are [http://en.wikipedia.org/wiki/Total_order totally ordered].


The implementation details of an array give rise to an important distinction between '''arrays''' and '''associative arrays'''.


The implementation details of an array gives rise to an important distinction between '''arrays''' and '''associative arrays'''.
:The implementation of '''arrays''' is based on setting the bounds of indices of the array, the ''size'' of the array, normally by allocating a contiguous region of memory to hold the elements of the array, and using simple offset calculations on the indices from the origin of the memory to access memory elements. Some languages support extensions to allow such arrays to be resized, or re-shaped, in which the memory area is adjusted, but extent elements are retained.
:The implementation of '''arrays''' is based on setting the bounds of indices of the array, the ''size'' of the array, normally by allocating a contiguous region of memory to hold the elements of the array, and using simple offset calculations on the indices from the origin of the memory to access memory elements. Some languages support extensions to allow such arrays to be resized, or re-shaped, in which the memory area is adjusted, but extent elements are retained.


:By contrast an '''[[associative array]]''' maps the association between index "keys" and their associated values, generally using more complex [http://en.wikipedia.org/wiki/Hash_function hash functions] on the keys of the array to map them to their corresponding elements (by pointers, references or memory addresses of some sort). Associative arrays are referred to variously as "hashes" ([[Perl]]), "maps" or "mappings" ([[Lua]]), "dictionaries" ([[Python]]) as well as "associative arrays" ([[AWK]], [[ksh]] and others). The keys into associative arrays are normally not constrained to be integers (unlike arrays which generally required contiguous integer ranges). Different languages may impose various constraints on these keys. For example, in [[Perl]], keys must always be strings, so 1, "1", and 1.0, each of which stringifies to "1", are the same key, but "1.0" is distinct from all of these. In [[PHP]], keys must be strings or integers; floats and booleans get implicitly converted to an integer. Other languages (such as Python) may treat each type of object as distinct. (See [[associative array]] for further discussion).
:By contrast, an '''[[associative array]]''' maps the association between index "keys" and their associated values, generally using more complex [http://en.wikipedia.org/wiki/Hash_function hash functions] on the keys of the array to map them to their corresponding elements (by pointers, references or memory addresses of some sort). Associative arrays are referred to variously as "hashes" ([[Perl]]), "maps" or "mappings" ([[Lua]]), or "dictionaries" ([[Python]]), as well as "associative arrays" ([[AWK]], [[ksh]], and others). The keys into associative arrays are normally not constrained to be integers, unlike arrays, which generally required contiguous integer ranges. Different languages may impose various constraints on these keys. For example, in [[Perl]], keys must always be strings, so 1, "1", and 1.0, each of which stringifies to "1", are the same key, but "1.0" is distinct from all of these. In [[PHP]], keys must be strings or integers; floats and booleans get implicitly converted to an integer. Other languages (such as Python) may treat each type of object as distinct. (See [[associative array]] for further discussion.)


:Non-associative arrays may have speed and memory consumption advantages. Associative arrays have greater flexibility in types used for keys and generally obviate the need to implement searches through the collection (Each component on which one would search can be implemented as a different associative array of references to their corresponding values or records).
:Non-associative arrays may have speed and memory consumption advantages. Associative arrays have greater flexibility in types used for keys, and generally obviate the need to implement searches through the collection. (Each component on which one would search can be implemented as a different associative array of references to their corresponding values or records.)


Arrays with more than one index are called '''multidimensional''' arrays. For example, a matrix is a two-dimensional array.
Arrays with more than one index are called '''multidimensional''' arrays. For example, a matrix is a two-dimensional array.


Some languages (such as [[AWK]]) do not support true arrays and merely emulate them through their associative arrays. Similarly some languages emulate multi-dimensional arrays by concatenation of dimensional indices into keys (perhaps a peculiarity of [[AWK]]).
Some languages (such as [[AWK]]) do not support true arrays; they merely emulate them through their associative arrays. Similarly, some languages emulate multi-dimensional arrays by concatenation of dimensional indices into keys (perhaps a peculiarity of [[AWK]]).


Common operations defined on arrays include:
Common operations defined on arrays include:

* Indexing: accessing an array element by its indices. (There is a one to one mapping between an index and its corresponding element).
* Indexing: accessing an array element by its indices. (There is a one to one mapping between an index and its corresponding element).
* Slicing: producing a subarray by putting some constraint on the indices. For example, [[PL/1]] provides extracting of a row or a column of an array. In [[Ada]] any range of the index can be used in order to extract a subarray from a single-dimensional array. In [[Python]] slices can extract any contiguous subset of an array and extended slice notation can extract elements in reversed order and/or by traversing in a given "stride" --- for example ''a[100:0:-2]'' would return every odd element from 100 to the beginning of the list: a[99], a[97], ... a[1].
* Slicing: producing a subarray by putting some constraint on the indices. For example, [[PL/1]] provides extracting of a row or a column of an array. In [[Ada]] any range of the index can be used in order to extract a subarray from a single-dimensional array. In [[Python]] slices can extract any contiguous subset of an array and extended slice notation can extract elements in reversed order and/or by traversing in a given "stride" --- for example ''a[100:0:-2]'' would return every odd element from 100 to the beginning of the list: a[99], a[97], ... a[1].
Line 23: Line 24:
* Array programming languages provide operations applied to entire arrays, so programs in such languages often lack specific index reference (for example [[APL]]).
* Array programming languages provide operations applied to entire arrays, so programs in such languages often lack specific index reference (for example [[APL]]).


Multidimensional arrays in which the valid range of one index depends on the value of another are called '''ragged''' (also '''jagged'''). This term comes from a typical example of a ragged array, when a two-dimensional array is used to store strings of different length in its rows. When put on paper the right margin of the output become ''ragged''.
Multidimensional arrays in which the valid range of one index depends on the value of another are called '''ragged''' (also '''jagged'''). This term comes from a typical example of a ragged array, when a two-dimensional array is used to store strings of different length in its rows. When put on paper the right margin of the output become ''ragged''.


The lower bound of non-associative arrays in many [[:Category:Programming Languages|programming languages]] is commonly fixed at either 0 ([[C]] and relatives) or 1 (Old [[Fortran]] and relatives); or an arbitrary integer ([[Pascal]] and relatives, modern Fortran). In [[Ada]] any discrete type can used as an index. Zero-based indexing is best thought of in terms of the index being an offset from the beginning of the array. Thus the first element is located zero elements from this starting point. The alternative can be thought of as ordinal indexes referring to the first, second, ... and ''n''th elements of the array.
The lower bound of non-associative arrays in many [[:Category:Programming Languages|programming languages]] is commonly fixed at either 0 ([[C]] and relatives) or 1 (Old [[Fortran]] and relatives); or an arbitrary integer ([[Pascal]] and relatives, modern Fortran). In [[Ada]] any discrete type can used as an index. Zero-based indexing is best thought of in terms of the index being an offset from the beginning of the array. Thus the first element is located zero elements from this starting point. The alternative can be thought of as ordinal indexes referring to the first, second, ... and ''n''th elements of the array.


In most programming languages, arrays are accessed by using the array brackets <tt>[</tt> and <tt>]</tt>, e.g. in <tt>A[i]</tt>, but exceptions exist, including [[Rexx]] which instead uses the dot operator <tt>.</tt>, such as in <tt>A.i</tt>; [[Fortran]], [[Ada]] and [[BASIC]] which use round parentheses <tt>A(i)</tt>, and in [[LISP|lisp]] dialects which use constructions like <tt>(ELT A n)</tt> for accessing and <tt>(SETA A n new_val)</tt> for setting (Interlisp) or <tt>(vector-ref A n)</tt> for accessing and <tt>(vector-set! A n new_val)</tt> for setting (Scheme). No bracket indexing occurs in [[J]], an array language; instead, the normal syntax of function creation and function calling applies.
In most programming languages, arrays are accessed by using the array brackets <tt>[</tt> and <tt>]</tt>, e.g. in <tt>A[i]</tt>. However, exceptions exist, including [[Rexx]] which instead uses the dot operator <tt>.</tt>, such as in <tt>A.i</tt>; [[Fortran]], [[Ada]] and [[BASIC]] which use round parentheses <tt>A(i)</tt>, and in [[LISP|lisp]] dialects which use constructions like <tt>(ELT A n)</tt> for accessing and <tt>(SETA A n new_val)</tt> for setting (Interlisp) or <tt>(vector-ref A n)</tt> for accessing and <tt>(vector-set! A n new_val)</tt> for setting (Scheme). No bracket indexing occurs in [[J]], an array language; instead, the normal syntax of function creation and function calling applies.


==Computational metrics==
==Computational metrics==
Line 62: Line 63:
<lang c> FILE *any_text;
<lang c> FILE *any_text;
/* declare array */
/* declare array */
int frequency[26];
int frequency[26];
/* declare a computed index */
/* declare a computed index */
int ch;
int ch;


any_text = fopen ("a_text_file.txt", "rt");
any_text = fopen ("a_text_file.txt", "rt");


/* init the freq table: */
/* init the freq table: */
for (ch = 0; ch < 26; ch++)
for (ch = 0; ch < 26; ch++)
frequency[ch] = 0;
frequency[ch] = 0;


Line 158: Line 159:
lettercounts = countletters(sourcedata)
lettercounts = countletters(sourcedata)
for i in xrange(len(lettercounts)):
for i in xrange(len(lettercounts)):
print "%s=%d" % (chr(i + ord('a')), lettercounts[i]),
print "%s=%d" % (chr(i + ord('a')), lettercounts[i]),
</lang>
</lang>


Line 282: Line 283:


[[Category:Data Structures]]
[[Category:Data Structures]]
[[Category:Encyclopedia]]