Memory layout of a data structure

Without the ability to create data aggregates via TYPE statements, whereby a single variable might sprawl across memory as above, one instead prepared a collection of variables, usually with some systematic naming convention linking the names of the parts. These variables could be anywhere in memory and so had no particular memory layout in themselves. However, when they are stored as a record in a disc file, a data structure with a definite layout is created, and this is manifest in the READ and WRITE statements involved. Suppose the statement was <code>WRITE (F,REC = n) THIS,M,NAME</code>, meaning that the n'th record of file unit F is to be written. The types of the variables would be known, and so also their sizes: say <code>REAL*8 THIS</code>, <code>INTEGER*1 M</code>, and <code>CHARACTER*28 NAME</code>, for 8 + 1 + 28 = 37 bytes. Such a record could be read (or written) by <code>READ (F,REC = n) STUFF</code> given a declaration <code>CHARACTER*37 STUFF</code> (counting on fingers as necessary), and the various parts of the data aggregate could then be indexed within STUFF: THIS in <code>STUFF(1:8)</code>, M in <code>STUFF(9:9)</code> and NAME in <code>STUFF(10:37)</code>. However, the interpretation of the interior bytes of multi-byte items such as integer and floating-point variables is complicated by the endianness of the processor, a confounding nuisance.
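The record access just described can be tried with a small program. This is only a sketch under stated assumptions: the unit number 10, the scratch file and the sample values are invented for illustration, <code>real64</code> and <code>int8</code> from <code>iso_fortran_env</code> stand in for <code>REAL*8</code> and <code>INTEGER*1</code>, and the file is assumed to be unformatted direct-access. Because the units of RECL vary between compilers, the record length is obtained from <code>INQUIRE (IOLENGTH = ...)</code> and not written as a literal 37.

```fortran
program record_layout
  use iso_fortran_env, only: real64, int8
  implicit none
  integer, parameter :: f = 10        ! hypothetical unit number
  real(real64)  :: this, that         ! stands in for REAL*8
  integer(int8) :: m                  ! stands in for INTEGER*1
  character(28) :: name
  character(37) :: stuff              ! 8 + 1 + 28 bytes
  integer       :: reclen

  this = 3.5_real64                   ! sample values, invented for the demo
  m    = 7_int8
  name = 'Example'

  ! Ask the processor for the record length in its own RECL units.
  inquire (iolength=reclen) this, m, name
  open (f, status='scratch', access='direct', form='unformatted', recl=reclen)

  write (f, rec=1) this, m, name      ! store the aggregate as record 1
  read  (f, rec=1) stuff              ! fetch the same bytes as one string

  ! Pick the parts out of STUFF by byte position.
  that = transfer(stuff(1:8), that)   ! reinterpret eight bytes as a real
  print *, 'THIS = ', that
  print *, 'M    = ', ichar(stuff(9:9))
  print *, 'NAME = ', stuff(10:37)

  ! Self-checks: the parts sit where the byte count says they should.
  if (that /= this)           error stop 'THIS is not in bytes 1 to 8'
  if (ichar(stuff(9:9)) /= 7) error stop 'M is not in byte 9'
  if (stuff(10:37) /= name)   error stop 'NAME is not in bytes 10 to 37'
  close (f)
end program record_layout
```

Here <code>TRANSFER</code> recovers THIS correctly because the bytes were written and read on the same machine. A file produced on a processor of the other endianness would need its eight bytes reversed first.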
 
It is further possible to declare that STUFF is to occupy the same storage as the named variables. If the declaration was <code>CHARACTER*1 STUFF(37)</code>, then <code>EQUIVALENCE (STUFF(1),THIS),(STUFF(9),M),(STUFF(10),NAME)</code> would mean that STUFF occupied the same storage as those variables, or rather, that the variables occupied the same storage as STUFF. Indeed, a different choice of positions could make them overlay each other, which would be unlikely to be helpful. It could also mean that a floating-point or integer variable was ''not'' aligned to a word boundary, with the consequent penalty in access, for instance by having THIS start with STUFF(2). Some systems may not allow byte-based addressing, only word-based, so complications can arise. But this demonstrates precise knowledge of the memory layout of a data structure. The more modern compilers that allow the TYPE declaration typically do not allow variables of such types to appear in EQUIVALENCE statements, which prevents access to the memory layout of such data structures. Others offer a newer form of EQUIVALENCE (which the moderns deprecate) via the MAP statement, but this is not standard Fortran.
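A minimal sketch of this overlay follows, with invented sample values and with <code>real64</code> and <code>int8</code> again standing in for <code>REAL*8</code> and <code>INTEGER*1</code>. Mixing character and numeric items in one EQUIVALENCE set is assumed here to be accepted by the compiler. Many compilers accept it as an extension, but a strict standard-conformance mode may reject it.

```fortran
program overlay
  use iso_fortran_env, only: real64, int8
  implicit none
  real(real64)  :: this               ! stands in for REAL*8
  integer(int8) :: m                  ! stands in for INTEGER*1
  character(28) :: name
  character(1)  :: stuff(37)
  integer       :: i
  ! The three variables are laid over STUFF at bytes 1, 9 and 10.
  equivalence (stuff(1), this), (stuff(9), m), (stuff(10), name)

  this = 1.0_real64                   ! sample values, invented for the demo
  m    = 65_int8
  name = 'Overlaid'

  ! Assigning to the variables has changed STUFF, since they share storage.
  print *, 'STUFF(9) as a character: ', stuff(9)
  print *, 'First character of NAME: ', stuff(10)
  ! The order of these eight values depends on the processor's endianness.
  print *, 'Bytes of THIS: ', (ichar(stuff(i)), i = 1, 8)

  if (ichar(stuff(9)) /= 65) error stop 'M is not at byte 9'
  if (stuff(10) /= 'O')      error stop 'NAME does not start at byte 10'
end program overlay
```

Changing the first pairing to <code>(stuff(2), this)</code> would move THIS to an odd address and shift it onto M, which illustrates both the misalignment and the unhelpful overlay mentioned above.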
 
As stated earlier, there is no BIT facility, so packing is to byte boundaries. But if one is determined to store thousands of records with minimal storage use, it may seem worth the effort to engage in the arithmetic needed to pack, say, three bits followed by the thirty-two bits of a floating-point value, and so on, into a sequence of bytes which would then be written. In such a situation it may even be worth packing only a portion of the floating-point variable, if reduced precision is acceptable and one is certain of the usage of the bits within such a number. However, given the difficulty of access to the parts of such a packed aggregate, it is usually better to leave the byte/word packing and unpacking to the I/O system, as via <code>WRITE (F,REC = n) THIS,M,NAME</code>, and to manipulate the values as ordinary, conveniently aligned in-memory variables, repacking to the data structure form only with a subsequent WRITE statement.
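To give an idea of the arithmetic involved, here is one possible sketch of packing a three-bit field followed by a 32-bit floating-point value into five bytes and unpacking them again. The names, the sample values and the choice of a 64-bit work word are all assumptions made for the illustration. It relies on the bit intrinsics <code>MVBITS</code>, <code>IBITS</code>, <code>ISHFT</code> and <code>IOR</code>, and on <code>CHAR</code> and <code>ICHAR</code> handling all values from 0 to 255, which is common but not guaranteed.

```fortran
program bitpack
  use iso_fortran_env, only: int32, int64, real32
  implicit none
  integer(int32) :: flags, flags2, bits
  real(real32)   :: x, x2
  integer(int64) :: word, u
  character(1)   :: packed(5)         ! 35 bits need five bytes
  integer        :: i

  flags = 5                           ! three-bit field, range 0 to 7
  x     = 2.75_real32                 ! sample value, invented for the demo

  ! Pack: assemble the 35 bits in a 64-bit work word.
  bits = transfer(x, bits)            ! the bit pattern of X as an integer
  word = 0
  call mvbits(int(flags, int64), 0,  3, word, 0)   ! bits 0 to 2
  call mvbits(int(bits,  int64), 0, 32, word, 3)   ! bits 3 to 34
  do i = 1, 5                         ! peel off eight bits at a time
    packed(i) = char(int(ibits(word, 8*(i - 1), 8)))
  end do

  ! Unpack: rebuild the work word, then extract the two fields.
  word = 0
  do i = 1, 5
    word = ior(word, ishft(int(iand(ichar(packed(i)), 255), int64), 8*(i - 1)))
  end do
  flags2 = int(ibits(word, 0, 3), int32)
  u = ibits(word, 3, 32)                           ! 0 to 2**32 - 1
  if (u >= 2_int64**31) u = u - 2_int64**32        ! restore the int32 sign
  bits = int(u, int32)
  x2 = transfer(bits, x2)

  print *, 'flags = ', flags2, '  x = ', x2
  if (flags2 /= flags) error stop 'flag bits were damaged'
  if (x2 /= x)         error stop 'floating-point bits were damaged'
end program bitpack
```

A single pair such as this saves nothing, since one byte plus four bytes is also five. The gain appears only when several sub-byte fields share bytes, and the amount of code needed for even this small case suggests why the simpler WRITE is usually preferred.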