In view of the problems with FORTRAN 77 CHARACTER data noted in the previous section, what can we do to produce a useful set of CHARACTER primitives?

First, an alternative to CHAR() is needed that is guaranteed to handle every bit pattern that can be stored in a CHARACTER memory unit.

Second, an alternative to ICHAR() is needed that is guaranteed to return integer values in the range 0 .. (character set size - 1), where the character set size is 2**n for an n-bit CHARACTER memory unit.

Third, the primitives must be capable of dealing with strings of arbitrary length. This means that they must not use CHARACTER*n data, but instead arrays of n CHARACTER*1 elements. This in turn means that the LEN() function is of no use, and an explicit length must be provided in the argument list. The caller may choose to pass either type of data, since the Standard requires that both have exactly the same storage layout. For portability, however, code should be designed to use arrays for CHARACTER storage, and to limit string constants to no more than 128 characters.

The KARxxx Hollerith primitives required string offset arguments to handle characters aligned at other than word boundaries. FORTRAN 77 CHARACTER data, whether stored as arrays of CHARACTER*1 or as scalars of CHARACTER*n, is addressable in FORTRAN down to the individual character, so we can dispense with these extra arguments.

Finally, there is no null string in FORTRAN 77, and no agreement on a character set, as there is in the C and Ada languages, that would permit one character (NUL, all bits zero) to be assigned as a string terminator. The closest we can come is to choose "blank" as a fill character. The last non-blank character in an array or variable then defines the real end of the string, and a string that is entirely blank is regarded as a null string. This is not a great restriction, for three reasons.
First, FORTRAN I/O is record oriented, and padding blanks may be supplied or trimmed as required by the I/O system, so the FORTRAN language definition gives no way to recognize trailing blanks on external files. Second, blank space at the end of a line is invisible to the eye. Third, blank space may be viewed as the least significant of all printable characters, since generally only its presence between words matters, not its amount.

Varying-length strings can therefore be conveniently supported, subject to the twin burdens on the programmer of declaring adequate dimensions for the CHARACTER array variables and managing the length manually. Because FORTRAN has no data structuring facility, like C's struct, the STRUCTUREs of PL/1 and Cobol, or the RECORDs of Pascal, Modula/2, and Ada, no acceptable way of carrying the string length around transparently presents itself. Storing the length in a single character position is not acceptable, because this is one of the compiler limitations we are trying to remove, and storing it as an unpacked integer in several character positions is inefficient and unpleasant, and introduces an address space limitation that we wish to avoid.
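
The conventions described in this section (CHARACTER*1 arrays, an explicit length argument, blank as the fill character, and ICHAR() results confined to 0 .. 2**n - 1) might be sketched as follows. This is only an illustration under stated assumptions, not the library's own code: the names LASTNB and IORD are hypothetical, and IORD assumes an 8-bit CHARACTER memory unit (n = 8).

```fortran
C     Hypothetical helper (not a routine from this library):
C     effective length of a blank-padded string held in the first
C     N elements of a CHARACTER*1 array.  The result is the index
C     of the last non-blank character, or 0 for an all-blank
C     (null) string.  LEN() is not used; N is passed explicitly.
      INTEGER FUNCTION LASTNB (STR, N)
      INTEGER N
      CHARACTER*1 STR(*)
      INTEGER I
      DO 10 I = N, 1, -1
         IF (STR(I) .NE. ' ') GO TO 20
   10 CONTINUE
      I = 0
   20 LASTNB = I
      END

C     Hypothetical wrapper around ICHAR() that folds its result
C     into the range 0 .. 255.  ASSUMPTION: 8-bit CHARACTER units;
C     it guards against an ICHAR() that returns negative values
C     for character codes above 127.
      INTEGER FUNCTION IORD (C)
      CHARACTER*1 C
      IORD = MOD(ICHAR(C) + 256, 256)
      END
```

For an 8-element array holding 'A', blank, 'B', and five trailing blanks, LASTNB(S, 8) is 3, and for an array of eight blanks it is 0. The caller remains responsible for dimensioning the array and for passing a length that does not exceed that dimension.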