.PAGE SIZE 60,68
.RIGHT MARGIN 68
.LAYOUT 1,2
.CENTER; Record Definition Tutorial
.BLANK
.CENTER; B.#Z.#Lederman
.CENTER; System Resources Corp.
.CENTER; 128 Wheeler Rd.
.CENTER; Burlington, MA 01803-5139
.CENTER; (617) 270-9228
.NOTE Abstract
This session will attempt to go beyond the material in the Datatrieve manuals, and show how various types of problems may be solved using the options available within the record definition. Some introductory material will be given, and then the session will concentrate on showing how a record definition can be derived for a file of unknown layout.
.END NOTE
.TITLE Record Definition Tutorial
.SUBTITLE B.#Z.#Lederman
.PARAGRAPH
It is not unreasonable to state that the record definition is the foundation of any Datatrieve application, as it is the reference by which all data is stored and retrieved. Therefore, the first rule for any application is:
.BLANK.CENTER
KNOW YOUR APPLICATION
.BLANK
from which immediately follows:
.BLANK.CENTER
KNOW YOUR DATA
.BLANK
An often unsuspected benefit of taking an existing manual operation and implementing it on Datatrieve is that the people involved must sit down and figure out exactly what pieces of data they are dealing with, and in what manner: this is frequently the first time anyone has actually done so, and they are usually surprised by the amount of data involved.
.PARAGRAPH
Some applications move onto Datatrieve almost automatically. If you are using a pre-printed form (and almost every company has some sort of printed form for orders, absence reports, pencil requisitions, etc.), then you can simply copy the fields into the record definition: there are cases of new users moving applications like this in one day. If your records are not as well organized, then you must analyze them yourself. If you are using Datatrieve to read an existing file created by some other program, then it is necessary to obtain the file record layout and follow it.
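As a sketch of how directly a printed form can map to a record definition, here is a minimal definition for an imaginary absence-report form (the form itself and all of the field names and sizes are invented for this illustration):
.BLANK.NO JUSTIFY.NO FILL
DEFINE RECORD ABSENCE__RECORD
01 ABSENCE__REC.
   10 EMPLOYEE__NAME PIC X(30).
   10 DEPARTMENT PIC X(12).
   10 DATE__ABSENT USAGE DATE.
   10 REASON PIC X(40).
;
.BLANK.JUSTIFY.FILL
Each blank on the form becomes one field, sized to match the space the form allows for it.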
.PARAGRAPH
Keep in mind that, while it is nice to get a good record definition at the beginning, it is always possible to define a new domain and record and read the data from the old domain into the new one: so if you have 10,000 records stored, and find you need to add a field, don't panic.
.BLANK.TEST PAGE 3.CENTER
Data Types.
.PARAGRAPH
Although the data types are explained in the manual, a brief review is appropriate. One important factor to keep in mind is that there are no "Datatrieve-type" data types and no "Datatrieve-type" files. Datatrieve works with standard VMS (or PDP-11) data types and files, and all of these data types are found in programs written in other languages.
.BLANK.TEST PAGE 3
CHARACTER STRINGS store one ASCII character per byte. They are the only way to store character data in files.
.BLANK.TEST PAGE 6
DISPLAY (the default for numeric) stores one ASCII digit per byte. This mode is easy to transfer to other languages (including DCL), the file can be printed out directly on terminals and printers for examination, and it is easy to determine the space used in REDEFINES clauses. However, it uses more space in the record than other types, and math operations will often be slower.
.BLANK.TEST PAGE 7
COMP#/#INTEGER (BYTE, WORD, LONG, QUAD) is the most efficient method of storing whole numbers, using the least space and having the fastest math operations. It will transfer to most other languages. The file cannot be examined directly, and usually cannot be redefined. The field is 1, 2, 4, or 8 bytes depending upon the size of the PIC clause if it is declared as COMP or INTEGER, and this must be known if compatibility with other languages is required.
.BLANK.TEST PAGE 7
COMP__1#/#REAL is most efficient for real numbers (numbers with fractions), and will cover a wide range from very large to very small numbers without having to know the range when defining the record.
Math is almost as fast as with integers but, due to floating-point rounding, this type should usually not be used for sums of money. It usually cannot be redefined, and will transfer to most other languages (but not all). The field is 4 bytes by default.
.BLANK.TEST PAGE 4
COMP__2#/#DOUBLE is like REAL: it uses more space but has more precision (more decimal places of accuracy). It transfers to slightly fewer languages, and is 8 bytes in size.
.BLANK.TEST PAGE 8
COMP__3#/#PACKED is like DISPLAY but slightly more compact, having about two digits per byte: the formula in the manual must be used for the exact number. My recommendation is that this type be reserved for compatibility with other languages which require the use of packed data. It is also used by other computer systems, in which case it may be more easily transferred than integer or real.
.BLANK.TEST PAGE 3
COMP__5#/#ZONED is like PACKED with different sign handling. The same comments apply.
.BLANK.TEST PAGE 3
COMP__6 is a data type used by COBOL, and is 2, 4, 6, or 8 bytes long depending upon the PIC length. I recommend using this only to read existing files.
.BLANK.TEST PAGE 5
DATE is a special data type unique to DEC. It is an 8-byte integer, but the numeric value is converted to a date when displayed. Many problems with the DATE field type may be avoided by remembering that internally the date is always a number, and not a character string. DATE can be very useful, as it allows comparisons on date and time for record selection (such as all records after 14__AUG__1987#12:15:00.00), and allows sorting records on date and time.
.BLANK.TEST PAGE 3.CENTER
Getting Started: a text file with SYSGEN parameters.
.PARAGRAPH
The easiest file to read is usually one where all of the information is normal ASCII text, because such a file is very easy to examine (you just have to TYPE it on your terminal or EDIT it). If you're lucky, the fields will even be aligned.
.PARAGRAPH
An example of this is the file SYSGEN creates with a description of all of the system parameters in it. It is created with the following commands:
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 7
$ SET DEFAULT SYS$SYSTEM
$ MC SYSGEN
SYSGEN> USE CURRENT
SYSGEN> SET /OUTPUT = PARAMS.LOG
SYSGEN> SHOW /ALL
SYSGEN> EXIT
.BLANK
The beginning of this file looks like this:
.BLANK
Parameters in use: Active
Parameter Name    Current    Default    Minimum    Maximum  Unit      Dynamic
--------------    -------    -------    -------    -------  ----      -------
PFCDEFAULT             64         32          0        127  Pages        D
KFILSTCNT              16          4          2        255  Slots
GBLSECTIONS           400        128         20       4095  Sections
GBLPAGES            20500       4096        512         -1  Pages
GBLPAGFIL            2048       1024        128         -1  Pages
MAXPROCESSCNT         128         72         12       8192  Processes
PROCSECTCNT            32         32          5       1024  Sections
MINWSCNT               20         20         10         -1  Pages
PAGFILCNT               2          2          1         63  Files
.BLANK.JUSTIFY.FILL
.PARAGRAPH
Producing a record definition for this file is simple: all you have to do is count how many columns each text field occupies, paying attention to the actual data and not the header text at the top of the page (though the headers do suggest good names for your fields). The results should look like this:
.BLANK.NO JUSTIFY.NO FILL
REDEFINE RECORD TPARAMS__RECORD
01 TPARAMS__REC.
   10 NAME PIC X(16).
   10 CURRENT PIC X(18).
   10 SUBCUR REDEFINES CURRENT.
      20 FILLER PIC X(8).
      20 RCUR PIC X(10).
   10 DEFAULT PIC X(10).
   10 MINIMUM PIC X(10).
   10 MAXIMUM PIC X(10).
   10 FILLER PIC X.
   10 UNIT PIC X(12).
   10 DYNAMIC PIC X.
;
.BLANK.JUSTIFY.FILL.PARAGRAPH
FILLER is used to skip over those areas which just have blank spaces and no useful information.
.PARAGRAPH
This reads the file as text. Since some of the data is numeric, it would be nice to convert it to numbers. This is quite simple: create another domain with the fields defined as numeric:
.BLANK.NO JUSTIFY.NO FILL
REDEFINE RECORD PARAMS__RECORD
01 PARAMS__REC.
   10 NAME PIC X(16).
   10 CURRENT USAGE LONG EDIT__STRING ZZZ,ZZZ,ZZ9.
   10 DEFAULT USAGE LONG EDIT__STRING ZZZ,ZZZ,ZZ9.
   10 MINIMUM USAGE LONG EDIT__STRING ZZZ,ZZZ,ZZ9.
   10 MAXIMUM USAGE LONG EDIT__STRING ZZZ,ZZZ,ZZ9.
   10 UNIT PIC X(12).
   10 DYNAMIC PIC X.
;
.BLANK.JUSTIFY.FILL
and then read the data from the text file domain and write it to the numeric file. The error messages you would get by trying to convert fields which are text into numbers can be avoided by selecting only valid records:
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 3
FOR TPARAMS WITH UNIT NE "Ascii", "Unit Dynami", " "
   STORE PARAMS USING PARAMS__RECORD = TPARAMS__RECORD
.BLANK.JUSTIFY.FILL
Notice that if the field names match in the two records, then all you have to do is equate the two records: Datatrieve will automatically do the conversion from text to numbers.
.BLANK 2.TEST PAGE 5.CENTER
More than one record type in a file.
.PARAGRAPH
Unfortunately, not all text files have nicely aligned columns: the output of an INSTALL#LIST/FULL command is one such file which is not so easily processed. A portion of a typical listing looks like this:
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 7
DISK$VAXVMSRL4:.EXE
   A1;5           Open Shar Prv
      Entry access count = 50
      Current / Maximum shared = 0 / 7
      Global section count = 5
      Privileges = SYSNAM GRPNAM TMPMBX WORLD OPER NETMBX SYSPRV
.BLANK 2.TEST PAGE 5
DISK$VAXVMSRL4:.EXE
   ANALIMDMP;1    Prv
      Entry access count = 0
      Privileges = CMKRNL CMEXEC
.BLANK.TEST PAGE 4
   AUTHORIZE;1    Prv
      Entry access count = 4
      Privileges = CMKRNL
.BLANK.TEST PAGE 6
   BACKTRANS;1    Open Shar Prv
      Entry access count = 0
      Current / Maximum shared = 0 / 0
      Global section count = 3
      Privileges = GRPNAM PRMGBL
.BLANK.JUSTIFY.FILL
and so on. Fortunately, some things do occur in the same place in each record: for example, the image name always starts in column 4.
The goal is to get the information in the listing 'normalized' into something like this:
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 12
                                              MAX
DIRECTORY    IMAGE        QUAL     ENTRY    SHARED    GLOBAL
.BLANK
.EXE         A1           O  SP       50         7         5
.EXE         ANALIMDMP       P         0         0         0
.EXE         AUTHORIZE       P         4         0         0
.EXE         BACKTRANS    O  SP        0         0         3
.EXE         CDU          OH P         2         1         0
.EXE         CMS          O S          0         0         2
.EXE         COPY         OHS         20         2         1
.EXE         DCL          OHS L   16,859         5         1
.BLANK
The record definition for this domain is quite simple:
.BLANK
DEFINE RECORD INSTALL__P__RECORD OPTIMIZE
01 INS__P__REC.
   10 DIRECTORY PIC X(64) EDIT__STRING T(48).
   10 IMAGE PIC X(32) EDIT__STRING T(24).
   10 QUAL PIC X(8).
   10 QUALIFIERS REDEFINES QUAL.
      20 OPEN PIC X.        ! matches INSTALL abbreviations
      20 HDR PIC X.
      20 SHAR PIC X.
      20 PRV PIC X.
      20 PROT PIC X.
      20 LNKBL PIC X.
      20 CMODE PIC X.
      20 NOPURG PIC X.
   10 ENTRY PIC 9(6) EDIT__STRING ZZZ,ZZ9.
   10 MAX__SHARED PIC 9(6) EDIT__STRING ZZZ,ZZ9.
   10 GLOBAL PIC 9(6) EDIT__STRING ZZZ,ZZ9.
;
.JUSTIFY.FILL
The trick is to get from one format to the other. In this case, the work has to be done in the processing of the data rather than all in the record definition.
.BLANK
The input record definition looks like this:
.BLANK.NO JUSTIFY.NO FILL
DEFINE RECORD INSTALL__IN__RECORD OPTIMIZE
01 INS__IN__REC.
   10 TEXT__IN PIC X(80).
   10 PARSE__FIELDS REDEFINES TEXT__IN.
      20 DISK PIC XXX.
      20 IMAGE PIC X(77).
      20 PARSE__2 REDEFINES IMAGE.
         30 BLANK PIC X(5).
         30 TYPE PIC X(10).
;
.BLANK.JUSTIFY.FILL
The feature which is of most use here is REDEFINES, which allows a portion of the record to be handled in more than one way. For example, TEXT__IN makes the entire record one long text field, but PARSE__FIELDS allows it to be sub-divided into a section where the disk name will appear and one where the image name will appear. But since other information also shows up in the record under the image name, we further sub-divide IMAGE into a section where there should only be blank space, and a section where some text will appear which will tell us what kind of record we are reading.
.PARAGRAPH
If you print the IMAGE field of this record definition, you get this:
.BLANK.NO JUSTIFY.NO FILL
ET NOVERIFY
T/FULL
.BLANK
K$VMSRL5:.EXE SPELL;36 Open Hdr Shar        <--
Entry access count = 10
Current / Maximum shared = 0 / 2
Global section count = 1
.BLANK.JUSTIFY.FILL
And similarly the TYPE field looks like this:
.BLANK.NO JUSTIFY.NO FILL
VERIFY
L
.BLANK
RL5:
0
BEGIN                                          ! if there is one
                                               ! image name, no ";"
   T__IMAGE = FN$STR__EXTRACT(IMAGE, 1, (I - 1))
   IF FN$STR__LOC (IMAGE, "Open") > 0 THEN T__OPEN = "O"
   IF FN$STR__LOC (IMAGE, "Hdr") > 0 THEN T__HDR = "H"
   IF FN$STR__LOC (IMAGE, "Shar") > 0 THEN T__SHAR = "S"
   IF FN$STR__LOC (IMAGE, "Prv") > 0 THEN T__PRV = "P"
   IF FN$STR__LOC (IMAGE, "Prot") > 0 THEN T__PROT = "P"
   IF FN$STR__LOC (IMAGE, "Lnkbl") > 0 THEN T__LNKBL = "L"
   IF FN$STR__LOC (IMAGE, "Cmode") > 0 THEN T__CMODE = "C"
   IF FN$STR__LOC (IMAGE, "Nopurg") > 0 THEN T__NOPURG = "N"
END
END
END
END
.BLANK.JUSTIFY.FILL
The only 'trick' here, if there is one, is the use of FN$STR__LOC to find certain text strings no matter where they occur in a field, rather than the usual record definition layout where fields occur in specific positions; FN$STR__EXTRACT can then be used to extract something from the input record and put it into a fixed field in the output record.
.COMMENT ~~~~~
.BLANK.TEST PAGE 3.CENTER
A more difficult example: a non-text file.
.PARAGRAPH
When deriving a record definition for an existing data file, it helps of course to have some idea of what is in the file. A record definition, even one written for another language (COBOL, C, FORTRAN, etc.), would supply most, if not all, of what is needed. For this example, however, I'm going to assume that no starting information exists.
.PARAGRAPH
Even if no information exists, the operating system will tell you something about the file: for example, DIRECTORY/FULL.
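The listing which follows came from a command of roughly this form (the filespec is the one for the sample log file examined throughout this example):
.BLANK.NO JUSTIFY.NO FILL
$ DIRECTORY /FULL SYS$SYSROOT:[SYSERR]RPCSWL.LOG
.BLANK.JUSTIFY.FILL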
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 10
Directory SYS$SYSROOT:[SYSERR]
.BLANK
RPCSWL.LOG;1                  File ID:  (1206,174,0)
Size:    30/33                Owner:    [SYSTEM]
Created: 21-MAR-1990 09:43:22.71
Revised: 30-AUG-1990 09:50:46.22 (33)
File organization:  Sequential
Record format:      Variable length, maximum 269 bytes
Record attributes:  Carriage return carriage control
.BLANK.JUSTIFY.FILL
I've deleted a few of the listing lines to bring it down to the relevant information. The listing tells me that this is a sequential file, that records can vary in length, and gives the maximum record size. This last item is very useful for record definitions. Another very useful utility is ANALYZE/RMS__FILE/FDL, which produces a File Definition Language file with similar information.
.PARAGRAPH
If the file in question is an indexed file, then the ANALYZE utility will tell you how many indexed keys there are, their positions in the file, and their data types. This is very useful, because it gives you a known field within the file as a point of reference. It happens that the sample file I've chosen is sequential, so we'll have to derive all of the information for ourselves.
.BLANK.TEST PAGE 3.CENTER
Exploring with the Editor.
.PARAGRAPH
The next utility to try is a text editor. It's difficult to reproduce the editor screen on paper, but if I use the TPU editor on this file I get something which looks like this (after substituting the tilde ("~") character for all characters which won't print on my printer):
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 6
~~~~~~~~~~e~~s~ß~T~~~~6~~~~À~ È6(tT~~~~~~5DTS66$DUA0:[SYS0.SYSCOMMON
~~~~~~~~~~e~~s~ß~[~~~~6~~~~À~@¹×ê~@~~~~~~5DTS66$DUA0:[SYS0.SYSCOMMON
~~~~~~~~~~e~~s~ß~U~~~~6~~~~À~~ÃJÎ~@~~~~~~5DTS66$DUA0:[SYS0.SYSCOMMON
~~~~~~~~~~e~~s~ß~}~~~~6~~~~À~~f~J~@~~~~~~5DTS66$DUA0:[SYS0.SYSCOMMON
~~~~~~~~~~e~~s~ß~U~~~~6~~~~À~~f~%~@~~~~~~5DTS66$DUA0:[SYS0.SYSCOMMON
.BLANK.JUSTIFY.FILL
Since this wraps past 80 columns, I also shifted the data to the left until I had seen the entire record.
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 6
A0:[SYS0.SYSCOMMON.][SYSEXE]RPC$SWLPROC.EXE;2~~~~~~~~~~~~RPC$SWL~~~~
A0:[SYS0.SYSCOMMON.][SYSEXE]RPC$SWLPROC.EXE;2~~~~~~~~~~~~RPC$SWL~~~~
A0:[SYS0.SYSCOMMON.][SYSEXE]RPC$SWLPROC.EXE;2~~~~~~~~~~~~RPC$SWL~~~~
A0:[SYS0.SYSCOMMON.][SYSEXE]RPC$SWLPROC.EXE;2~~~~~~~~~~~~RPC$SWL~~~~
A0:[SYS0.SYSCOMMON.][SYSEXE]RPC$SWLPROC.EXE;2~~~~~~~~~~~~RPC$SWL~~~~
.BLANK.TEST PAGE 6
~~~~~~~RPC$SWL~~~~~~~~~SYSTEM ~~%RPC$__SWL-I-RESTART, RPC__SWL
~~~~~~~RPC$SWL~~~~~~~~~SYSTEM ~~%RPC$__SWL-I-RESTART, RPC__SWL
~~~~~~~RPC$SWL~~~~~~~~~SYSTEM ~~%RPC$__SWL-I-RESTART, RPC__SWL
~~~~~~~RPC$SWL~~~~~~~~~SYSTEM ~~%RPC$__SWL-I-RESTART, RPC__SWL
~~~~~~~RPC$SWL~~~~~~~~~SYSTEM ~~%RPC$__SWL-I-RESTART, RPC__SWL
.BLANK.TEST PAGE 6
ESTART, RPC__SWL started
ESTART, RPC__SWL started
ESTART, RPC__SWL started
ESTART, RPC__SWL started
ESTART, RPC__SWL started
.BLANK.JUSTIFY.FILL
There is actually more data than is shown here: many of the 'non-printing' characters I had to blank out do show up on a CRT screen: in particular, bytes which contain nulls print as an "NL" (in one character). This makes it easy to spot sections of records which are just null filler.
.PARAGRAPH
Now what can be learned from this? First, it's easy to see that the data 'lines up' in columns. This is good news: it means we can define fields at fixed positions in the record definition and they will match all of the records. With some other types of records, the data isn't always in fixed positions. Something else which can be seen is that some of the data is text, and it's easy to count to the columns where it starts and ends. Finally, at the beginning of the records we can see some characters which change from record to record and don't seem to make sense as text: these probably represent binary numerical data.
.BLANK.TEST PAGE 3.CENTER
The DUMP utility.
.PARAGRAPH
This utility is very helpful in looking at the data in a file.
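As a sketch of the commands involved (the qualifier spellings are those of the VMS DUMP utility; the default display is longword hexadecimal):
.BLANK.NO JUSTIFY.NO FILL
$ DUMP SYS$SYSROOT:[SYSERR]RPCSWL.LOG                ! longword hex, the default
$ DUMP /BYTE /DECIMAL SYS$SYSROOT:[SYSERR]RPCSWL.LOG
$ DUMP /WORD SYS$SYSROOT:[SYSERR]RPCSWL.LOG
.BLANK.JUSTIFY.FILL
Both of the dumps shown below were produced this way.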
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 10
Dump of file SYS$SYSROOT:[SYSERR]RPCSWL.LOG;1 on 18-OCT-1990
File ID (1206,174,0)   End of file block 30 / Allocated 33
.BLANK
Record number 1 (00000001), 269 (010D) bytes
.BLANK
DFF07300 00650000 00000000 00000000 ..........e..s.ß 000000
7E66A003 C0000000 0036FE00 00005501 .U....6....À..f~ 000010
24363653 54443500 01000400 93401825 %.@......5DTS66$ 000020
4F435359 532E3053 59535B3A 30415544 DUA0:[SYS0.SYSCO 000030
50525D45 58455359 535B5D2E 4E4F4D4D MMON.][SYSEXE]RP 000040
00323B45 58452E43 4F52504C 57532443 C$SWLPROC.EXE;2. 000050
53244350 52070000 00000000 00000000 ...........RPC$S 000060
45545359 530C0000 00000000 00004C57 WL.........SYSTE 000070
535F2443 50522584 01202020 2020204D M      ..%RPC$__S 000080
5052202C 54524154 5345522D 492D4C57 WL-I-RESTART, RP 000090
20202064 65747261 7473204C 57535F43 C__SWL started   0000A0
20202020 20202020 20202020 20202020                  0000B0
20202020 20202020 20202020 20202020                  0000C0
20202020 20202020 20202020 20202020                  0000D0
20202020 20202020 20202020 20202020                  0000E0
20202020 20202020 20202020 20202020                  0000F0
      20 20202020 20202020 20202020              ... 000100
.BLANK.JUSTIFY.FILL
I've only shown one record here to save space, but normally you would want to look at several records to find things which match up. You would also dump the same records with the /BYTE and /WORD qualifiers (the default shown here is /LONG) to see if data falls on byte or word boundaries, and you might also dump the data in /DECIMAL to see if there are clearly increasing numbers, or numeric data you recognize.
.PARAGRAPH
If you see text, you should also check the byte just before the text starts to see if its value is the same as the length of the field. For example, the same record above dumped with /DECIMAL and /BYTE looks like this:
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 8
  0   0   0   0   0   0   0   0  ........    0
-33 -16 115   0   0 101   0   0  ..e..s.ß    8
  0  54  -2   0   0   0  85   1  .U....6.   16
126 102 -96   3 -64   0   0   0  ...À..f~   24
  1   0   4   0-109  64  24  37  %.@.....   32
 36  54  54  83  84  68  53   0  .5DTS66$   40
 89  83  91  58  48  65  85  68  DUA0:[SY   48
 79  67  83  89  83  46  48  83  S0.SYSCO   56
 83  91  93  46  78  79  77  77  MMON.][S   64
 80  82  93  69  88  69  83  89  YSEXE]RP   72
 79  82  80  76  87  83  36  67  C$SWLPRO   80
  0  50  59  69  88  69  46  67  C.EXE;2.   88
  0   0   0   0   0   0   0   0  ........   96
 83  36  67  80  82   7   0   0  ...RPC$S  104
  0   0   0   0   0   0  76  87  WL......  112
 69  84  83  89  83  12   0   0  ...SYSTE  120
  1  32  32  32  32  32  32  77  M      .  128
 83  95  36  67  80  82  37-124  .%RPC$__S  136
 83  69  82  45  73  45  76  87  WL-I-RES  144
 80  82  32  44  84  82  65  84  TART, RP  152
116 115  32  76  87  83  95  67  C__SWL st  160
 32  32  32 100 101 116 114  97  arted     168
 32  32  32  32  32  32  32  32            176
....
.BLANK.JUSTIFY.FILL
In the second character of line '40' the byte has a value of 53, and the character string which follows, starting "DTS66", is 53 characters long. Similarly, the third byte in line '104' is 7 and the string "RPC$SWL" is seven characters long, and the third byte in line '120' is 12 and what follows is 12 characters long if you include the trailing blank spaces. One common type of data on VMS systems is the "counted ASCII string", where the first byte gives the length of the string, and these fields are probably of this data type: it is at least a reasonable first guess, even though all of the fields in this particular file end up being the same length.
.BLANK.TEST PAGE 3.CENTER
First record definition.
.PARAGRAPH
Since some of the fields are text, we can write a preliminary record definition and start using Datatrieve to look at the file.
.BLANK.NO JUSTIFY.NO FILL
REDEFINE RECORD RPC1__RECORD OPTIMIZE
01 RPC1__REC.
   10 FX.
      20 FILLER PIC X(41).
   10 HIDE1.
      20 FILLER PIC X(96).
   10 TEXT1 REDEFINES HIDE1.
      20 C1 USAGE BYTE.
      20 T1 PIC X(64).
      20 C2 USAGE BYTE.
      20 T2 PIC X(15).
      20 C3 USAGE BYTE.
      20 T3 PIC X(12).
      20 W2 USAGE WORD.
   10 STATUS__TEXT PIC X(132) EDIT__STRING T(48).
   10 PRINT1.
.TEST PAGE 4
      20 TASK__NAME COMPUTED BY
         FORMAT FN$STR__EXTRACT(T1, 1, C1) USING X(64)
         EDIT__STRING T(48).
.TEST PAGE 4
      20 PROCESS__NAME COMPUTED BY
         FORMAT FN$STR__EXTRACT(T2, 1, C2) USING X(15)
         EDIT__STRING T(15).
.TEST PAGE 4
      20 OWNER COMPUTED BY
         FORMAT FN$STR__EXTRACT(T3, 1, C3) USING X(12)
         EDIT__STRING T(12).
;
.BLANK.JUSTIFY.FILL
This record definition has a 'trick' in it. Since the text fields are potentially of variable length, and are padded with nulls, I want to convert them to fields which are text only. I also would like to have a record definition where only the fields I want print out by default. When there is a REDEFINES clause in a record definition, the first group is the one that prints by default. So by doing the following:
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 4
   10 HIDE1.
      20 FILLER PIC X(96).
   10 TEXT1 REDEFINES HIDE1.
.BLANK.JUSTIFY.FILL
it is the HIDE1 group which prints by default: but that group contains only FILLER, so it won't print at all. I then put the 'real' fields under the redefining TEXT1 header. The final COMPUTED#BY field definitions look a bit odd, but this is what is needed to override the built-in string length and edit string limitations of the FN$STR__EXTRACT function. This function extracts a string of a given length from an input string: here the strings are extracted starting at position 1, for the length given by the byte containing the string length in the record (C1, C2, or C3).
.PARAGRAPH
This record definition will give me the following data:
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 6
STATUS__TEXT   : %RPC$__SWL-I-RESTART, RPC__SWL started
TASK__NAME     : DTS66$DUA0:[SYS0.SYSCOMMON.][SYSEXE]RPC$SWLPROC.
                 EXE;2
PROCESS__NAME  : RPC$SWL
OWNER          : SYSTEM
.BLANK.JUSTIFY.FILL
.BLANK.TEST PAGE 3.CENTER
Decoding binary fields.
.PARAGRAPH
So far so good: but now I have to start looking at the binary fields. Usually the first step is to define a number of longword integer fields, to see if anything comes out making sense.
From looking at the file, the first 10 bytes always seem to be null, so I'm going to skip them, start at the "right" (the first text field I know comes out correctly), and work back.
.BLANK.NO JUSTIFY.NO FILL
REDEFINE RECORD RPC2__RECORD OPTIMIZE
01 RPC2__REC.
   10 FX.
      20 FILLER PIC X(41).
   10 BIN1 REDEFINES FX.
      20 FILLER PIC X(9).
      20 A USAGE LONG.
      20 B USAGE LONG.
      20 C USAGE LONG.
      20 D USAGE LONG.
      20 E USAGE LONG.
      20 F USAGE LONG.
      20 G USAGE LONG.
      20 H USAGE LONG.
   10 HIDE1.
      20 FILLER PIC X(96).
[remainder of record definition remains unchanged]
.BLANK.JUSTIFY.FILL
If I print out some of these new fields, I get this:
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 14
DTR> for rpc2 print e, f, g, h
.BLANK
        E            F          G        H
.BLANK
 62914560    629040800    9650200    65540
 62914560   1243440800    9650200    65540
 62914560   -833961088    9650202    65540
 62914560   -354961088    9650202    65540
 62914560   1547094048    9655412    65540
 62914560   1932194048    9655412    65540
 62914560  -1083305504    9657423    65540
 62914560   -577505504    9657423    65540
 62914560   1388511904    9661051    65540
.BLANK.JUSTIFY.FILL
Something else worth doing is to look at the hexadecimal values of some of these fields:
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 14
DTR> for rpc2 print e, f, g, h, fn$hex(h)
.BLANK
        E            F          G        H   FN$HEX
.BLANK
 62914560    629040800    9650200    65540    10004
 62914560   1243440800    9650200    65540    10004
 62914560   -833961088    9650202    65540    10004
 62914560   -354961088    9650202    65540    10004
 62914560   1547094048    9655412    65540    10004
 62914560   1932194048    9655412    65540    10004
 62914560  -1083305504    9657423    65540    10004
 62914560   -577505504    9657423    65540    10004
 62914560   1388511904    9661051    65540    10004
.BLANK.JUSTIFY.FILL
The next part depends somewhat on guesswork (or trial and error): the text fields have to do with starting a process, processes have UICs, and the value of field H looks like a valid UIC for the SYSTEM account which is starting this process, so I'm going to assign it that meaning.
One way to check this is to link in the FN$FAO function (which has been published in the Newsletter and included on the SIG tapes), as it can change a binary UIC to an identifier:
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 14
DTR> for rpc2 print e, f, g, h, fn$fao("!%I", h, 0,0,0,0,0,0,0)
.BLANK
        E            F          G        H   String
.BLANK
 62914560    629040800    9650200    65540   [SYSTEM]
 62914560   1243440800    9650200    65540   [SYSTEM]
 62914560   -833961088    9650202    65540   [SYSTEM]
 62914560   -354961088    9650202    65540   [SYSTEM]
 62914560   1547094048    9655412    65540   [SYSTEM]
 62914560   1932194048    9655412    65540   [SYSTEM]
 62914560  -1083305504    9657423    65540   [SYSTEM]
 62914560   -577505504    9657423    65540   [SYSTEM]
 62914560   1388511904    9661051    65540   [SYSTEM]
.BLANK.JUSTIFY.FILL
Next, let's look at the two fields just before the UIC. G is slowly increasing, while F takes on a large number of values which vary a lot. As previously mentioned, VMS stores a standard date and time value in a quadword, which is two longwords, and the low-order longword comes first. Since a date and time are likely to be stored in a log, it makes sense to try looking at these two fields as one date and time:
.BLANK.NO JUSTIFY.NO FILL
REDEFINE RECORD RPC3__RECORD OPTIMIZE
01 RPC3__REC.
   10 FX.
      20 FILLER PIC X(41).
   10 BIN1 REDEFINES FX.
      20 FILLER PIC X(9).
      20 A USAGE LONG.
      20 B USAGE LONG.
      20 C USAGE LONG.
      20 D USAGE LONG.
      20 E USAGE LONG.
      20 DATE USAGE DATE.
      20 UIC USAGE LONG.
   10 HIDE1.
[remainder of record is unchanged]
.BLANK.TEST PAGE 10
DTR> for rpc3 print date, fn$time(date)
.BLANK
       DATE        FN$TIME
.BLANK
21-Mar-1990    09:43:22.89
21-Mar-1990    09:44:24.33
21-Mar-1990    10:02:25.08
21-Mar-1990    10:03:12.98
16-Apr-1990    07:33:51.65
16-Apr-1990    07:34:30.16
26-Apr-1990    07:31:56.03
[records removed to save space]
17-Aug-1990    10:51:29.72
24-Aug-1990    08:19:59.68
24-Aug-1990    08:20:14.92
.BLANK.JUSTIFY.FILL
Since this field in all of the records comes out as a valid date which increases in time, and all of the dates fall between the file creation date and last update date, this probably is a date and time field.
.PARAGRAPH
When decoding a file in this manner, one hopes that at least the nature of some of the data fields will be known in advance. When I started this exercise, I did not know any of the data fields: I have since come across some information which confirms what I have shown so far, and which indicates the probable nature of fields D and E. I did not pursue this record definition any further because it was giving me the information I needed at the time; decoding other binary fields is largely a matter of trial and error, following the procedure shown above. I did "clean up" the formatting of some of the fields, however, so that the definition now looks like this:
.BLANK.NO JUSTIFY.NO FILL
REDEFINE RECORD RPCA__RECORD OPTIMIZE
01 RPCA__REC.
   10 FX.
      20 FILLER PIC X(29).
   10 BIN1 REDEFINES FX.
      20 FILLER PIC X(21).
      20 PC__LONGWORD USAGE LONG.
      20 PSL__LONGWORD USAGE LONG.
   10 PRINT1.
      20 DATE1 USAGE DATE.
      20 TIME1 COMPUTED BY FN$TIME(DATE1) EDIT__STRING T(11).
.TEST PAGE 3
      20 UIC COMPUTED BY
         FN$FAO("!%U", UIC__LONGWORD, 0,0,0,0,0,0,0)
         EDIT__STRING X(9).
.TEST PAGE 3
      20 ACNT COMPUTED BY
         FN$FAO("!%I", UIC__LONGWORD, 0,0,0,0,0,0,0)
         EDIT__STRING X(14).
      20 PC COMPUTED BY FN$HEX(PC__LONGWORD) EDIT__STRING T(8).
      20 PSL COMPUTED BY FN$HEX(PSL__LONGWORD) EDIT__STRING T(8).
   10 HIDE2.
      20 UIC__LONGWORD USAGE LONG.
      20 FILLER PIC X(96).
   10 TEXT2 REDEFINES HIDE2.
      20 X1 USAGE LONG.
      20 C1 USAGE BYTE.
      20 T1 PIC X(64).
      20 C2 USAGE BYTE.
      20 T2 PIC X(15).
      20 C3 USAGE BYTE.
      20 T3 PIC X(12).
      20 W2 USAGE WORD.
   10 STATUS__TEXT PIC X(132) EDIT__STRING T(48).
   10 PRINT2.
.TEST PAGE 4
      20 TASK__NAME COMPUTED BY
         FORMAT FN$STR__EXTRACT(T1, 1, C1) USING X(64)
         EDIT__STRING T(48).
.TEST PAGE 4
      20 PROCESS__NAME COMPUTED BY
         FORMAT FN$STR__EXTRACT(T2, 1, C2) USING X(15)
         EDIT__STRING T(15).
.TEST PAGE 4
      20 OWNER COMPUTED BY
         FORMAT FN$STR__EXTRACT(T3, 1, C3) USING X(12)
         EDIT__STRING T(12).
;
.BLANK.JUSTIFY.FILL
.BLANK.TEST PAGE 3.CENTER
Second example: the ANALYZE/DISK__STRUCTURE/USAGE file.
.PARAGRAPH
The ANALYZE utility will produce a data file containing statistics on all files on a volume: specifically, the owner of each file, and how much disk space is allocated to and used by the file. This can be very useful in determining which users are taking up disk space: but no utility is provided to summarize the data. Datatrieve will easily report the data once there is a record definition to match the data file.
.PARAGRAPH
When I first decoded this file, I don't think there was any information available on its format (or at least I didn't find any). It happens that there is now some format information available, in the form of a C language header file.
A portion of SYS$LIBRARY:USGDEF.H is shown below:
.BLANK.NO JUSTIFY.NO FILL
struct usgdef {
    unsigned char usg$b_type;           /* Record type */
    unsigned long int usg$l_serialnum;  /* Pack serial number */
    char usg$t_strucname [12];          /* Volume set name */
    char usg$t_volname [12];            /* Volume name */
    char usg$t_ownername [12];          /* Volume owner name */
    char usg$t_format [12];             /* Volume format type */
    unsigned int usg$q_time [2];        /* Time usage file created */
    } ;
struct usgdef1 {
    char usgdef$$_fill_1;
    unsigned long int usg$l_fileowner;  /* File owner UIC */
    unsigned long int usg$l_allocated;  /* Blocks allocated */
    unsigned long int usg$l_used;       /* Blocks used */
    unsigned short int usg$w_dir_len;   /* Length of directory */
                                        /* string */
    unsigned short int usg$w_spec_len;  /* Length of complete */
                                        /* file specification */
    char usg$t_filespec [406];          /* File spec */
                                        /* "[dir]nam.typ;ver" */
    } ;
.BLANK.JUSTIFY.FILL
If we dump a portion of the usage file and compare it with the above definitions, we see a good match. There are several problems which can occur with this type of data file, however. The first is that there are two different types of records in the same file: this is usually called "unnormalized" data. Fortunately, the REDEFINES clause will help us get around this: we can effectively define two types of record within the same file. Another problem is that the complete file specification is of variable size, and unlike the previous example is not blank or null filled: however, a similar COMPUTED#BY field can take care of this. A related problem is that there are not separate fields for the file directory and the remainder of the file specification, but a little effort with COMPUTED#BY fields will also solve this.
.PARAGRAPH
The record definition I finally derived looks like this:
.BLANK.NO JUSTIFY.NO FILL
REDEFINE RECORD DISK__USE__RECORD OPTIMIZE
01 DISK__USE__REC.
   10 TYPE USAGE BYTE.
   10 COVERUP.
      20 FILLER PIC X(120).
   10 IDENTIFICATION REDEFINES COVERUP.
      20 SERIAL__NUMBER USAGE LONG.
      20 STRUCTURE__NAME PIC X(12).
      20 VOLUME__NAME PIC X(12).
      20 OWNER__NAME PIC X(12).
      20 FORMAT PIC X(12).
      20 CREATION__DATE USAGE DATE.
      20 CREATION__TIME COMPUTED BY FN$TIME(CREATION__DATE).
   10 FILE__RECORD REDEFINES COVERUP.
      20 FILEOWNER USAGE LONG.
.TEST PAGE 3
      20 IDENTIFIER COMPUTED BY
         FN$FAO("!%I", FILEOWNER, 0,0,0,0,0,0,0)
         EDIT__STRING X(24).
      20 ALLOCATED USAGE LONG EDIT__STRING ZZZ,ZZZ,ZZ9.
      20 USED USAGE LONG EDIT__STRING ZZZ,ZZZ,ZZ9.
      20 DIR__LEN USAGE WORD.
      20 SPEC__LEN USAGE WORD.
      20 FILESPEC PIC X(96) EDIT__STRING T(76).
.TEST PAGE 3
      20 DIRECTORY COMPUTED BY
         FORMAT FN$STR__EXTRACT(FILESPEC, 1, DIR__LEN) USING X(64)
         EDIT__STRING T(64).
.TEST PAGE 4
      20 FILE__SPECIFICATION COMPUTED BY
         FORMAT FN$STR__EXTRACT(FILESPEC, (DIR__LEN + 1), SPEC__LEN)
         USING X(64) EDIT__STRING T(64).
;
.BLANK.JUSTIFY.FILL
Defining the record as all FILLER within group COVERUP and then defining the two groups IDENTIFICATION and FILE__RECORD solves the problem of having two different record types within the file: one can say FOR DISK__USAGE WITH TYPE = 2 PRINT FILE__RECORD, for example, to use the correct field layout on the appropriate records. The two fields DIRECTORY and FILE__SPECIFICATION use FN$STR__EXTRACT in a manner similar to the first example to remove the appropriate portions of the file specification field. Notice how DIR__LEN is used as the length of the DIRECTORY field, and then as the starting position of the FILE__SPECIFICATION field.
.BLANK.TEST PAGE 3.CENTER
Converting date formats and CHOICE.
.PARAGRAPH
A problem which is often encountered is reading a file containing a date and/or time which is not stored in the standard VMS date type. While Datatrieve can convert many kinds of date formats directly, there are some which are difficult to handle.
I discovered that many of the data files used by ALL-IN-1 have such date fields, which look like this:
.BLANK
1990070809101112
.BLANK
and work out to be the year, month, day, hour, minute, second, and fraction of a second, in ASCII characters. This format does allow sorting into date and time order while the data is in ASCII characters, but it does not allow indexed and range retrievals as easily as the VMS quadword date type does.
.PARAGRAPH
Converting this data to a standard VMS quadword date using Datatrieve would be quite easy if it weren't for the fact that ALL-IN-1 sometimes leaves the fields blank. When the fields are treated as numbers, blanks don't translate to anything reasonable. Fortunately, the CHOICE statement can be used within a record definition to assign one of several values to a COMPUTED#BY field depending on what data is encountered. What follows is part of one such record definition:
.BLANK.NO JUSTIFY.NO FILL
REDEFINE RECORD AI1__NETWORK__RECORD OPTIMIZE
01 AI1__NETWORK__REC.
   10 USER__NAME PIC X(30).
   10 NODE PIC X(6).
   10 DATE1.
      20 FILLER PIC X(16).
.TEST PAGE 6
      20 LAST__UPDATE COMPUTED BY
         CHOICE OF
            (IY1 EQ " ") THEN FN$DATE("18-NOV-1858 00:00:00.00")
            ELSE FN$DATE(ID1 | "-" | IM1 VIA MONTH__TABLE | "-" |
                 IY1 | " " | IH1 | ":" | II1 | ":" | IS1 | "." | IF1)
         END__CHOICE.
.TEST PAGE 3
      20 UPDATE__TIME COMPUTED BY FN$TIME(DATE1.LAST__UPDATE)
         EDIT__STRING X(11).
   10 DI1 REDEFINES DATE1.
      20 IY1 PIC XXXX.
      20 IM1 PIC XX.
      20 ID1 PIC XX.
      20 IH1 PIC XX.
      20 II1 PIC XX.
      20 IS1 PIC XX.
      20 IF1 PIC XX.
!
! other fields followed these.
!
.BLANK.JUSTIFY.FILL
The CHOICE statement gives one of two results, depending on whether the date field is blank or not: if it is, the arbitrary base date and time for the VMS quadword date is inserted; if not, the actual date and time is computed from the input fields.
Notice how, in the record definition, the date and time string is built up by concatenating all of the necessary data fields and delimiting characters, and that the month number is converted to a character string via a table. The table is quite simple:
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 8
REDEFINE TABLE MONTH__TABLE
01 : "JAN",
02 : "FEB",
03 : "MAR",
[ etcetera ]
12 : "DEC"
END__TABLE
.BLANK.JUSTIFY.FILL
It is important that the months be in upper case for this particular function to work.
.PARAGRAPH
The data file looks like this:
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 7
BROWN           DTS66   1990070809101112
CHAN            DTS66
JONES           DTS66   1990090800000000
LEDERMAN        DTS66   1990102910111213
SMITH           DTS66
WILLIAMS        DTS66   1989010100000000
.BLANK.JUSTIFY.FILL
but when processed with Datatrieve it looks like this:
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 10
USER                        LAST          UPDATE
NAME            NODE        UPDATE        TIME
.BLANK
BROWN           DTS66    8-Jul-1990   09:10:11.12
CHAN            DTS66   18-Nov-1858   00:00:00.00
JONES           DTS66    8-Sep-1990   00:00:00.00
LEDERMAN        DTS66   29-Oct-1990   10:11:12.13
SMITH           DTS66   18-Nov-1858   00:00:00.00
WILLIAMS        DTS66    1-Jan-1989   00:00:00.00
.BLANK.JUSTIFY.FILL
.COMMENT ~~~~
.BLANK.TEST PAGE 3.CENTER Possible alignment problems (if dealing with PDP-11s).
.PARAGRAPH
The length of each field is of interest when using the REDEFINES clause, when the data must be read by other programs, or when you are trying to match an existing file. There is a hidden problem which sometimes occurs that should be noted:
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 6
DEFINE RECORD GOTCHA
01 GOTCHA.
   03 A PIC X.
   03 B PIC 99 USAGE IS INTEGER.
;
.BLANK.JUSTIFY.FILL
It might be thought that this record is 3 bytes long, and on a VAX it is, but on a PDP-11 it isn't: it is 4 bytes long, because of something called word alignment. INTEGER, REAL, DOUBLE and DATE fields must start on a word boundary, which is an even number of bytes: if they don't, Datatrieve inserts a hidden byte to align the data; in this case, between fields A and B.
This byte takes up space in the record but can never be accessed unless you change the definition. This will also occur on the VAX if the ALIGN clause is used to force integers to fall on word boundaries, which is sometimes done for compatibility with other languages. I find that the best way to work out a record length is with a pencil and a piece of graph paper ruled in squares, or a printer form layout sheet, a CRT display form, or an old coding sheet. By marking off the fields, using one square per byte, the length of each field and its alignment is easily determined. This is especially important with the REDEFINES clause, which will be shown later.