.PAPER SIZE 54,60
.RIGHT MARGIN 60
.NO NUMBER
.TITLE Favorite RSX Problems
#
.BLANK 10
.CENTER;Favorite RSX Problems
.BLANK 7
.CENTER;- DA115 -
.BLANK 5
.CENTER;Thomas R. Wyant, III
.CENTER;E. I. DuPont de Nemours
.CENTER;Richmond, Virginia
.BLANK 5
.CENTER;December  9, 1991
.AUTOPARAGRAPH
.PAGE

This paper began its life with the suggestion that perhaps the
same questions came up repeatedly in the software clinic. A
symposium presentation was duly scheduled, and notes from
previous clinics were pulled and studied. The suspected
repeatability was found, but it turned out to be at the topic
level, not the question level.

The question then became, what to do with the information?  The
obvious thing was to order the clinic notes by topic and re-issue
them. That is the easy solution, but also probably the deathly
dull one. To improve on this, I have tried to write a few
paragraphs about each of the "hot" topic areas. The aim is not to
give comprehensive coverage of these topics, but to capture those
areas which, if the clinic is a guide, prove troublesome in
practice.

.HEADER LEVEL Family Differences

The RSX phylogenetic tree is unusually richly branched. This can
lead to some confusion, especially between RSX-11M+, RSX-11M, and
RSX-11S.

RSX-11M+ is unique in the RSX family in a number of respects:
.LIST "*"
.LE;It supports I_&D space.
.LE;It supports symmetrical multiprocessing, which would be a
moot point (there being no commercially available SMP hardware),
except that the configuration control functionality (CON ONL,
CON OFF) is a consequence of this.
.LE;It can load more than 124Kw at boot. This can cause
confusion, as you can BOO a large M or S system from M+, but not
from M; this is a (by all indications permanent) restriction of
the other systems.
.END LIST
RSX-11M+ requires a 22-bit PDP-11 on which to run. That is, there
is a way to GEN it for 18-bit systems, but you are on your own if
you do so. It does not technically require an I_&D space
processor, but you should expect to find yourself extremely short
on pool if you don't use one. Micro-RSX is just a variant of M+,
built from the same kit. Though M+ is larger than M, it may be
faster; some code paths have been optimized for speed, whereas in
RSX-11M they were optimized for size.

RSX-11M is rather mature now, and typically only gains new
features by the "trickle-down" effect. Unlike RSX-11M+, RSX-11M
is maintained in such a manner that it can run on any PDP-11 ever
sold, and this can also restrict the features available. There
are a lot more configuration options for RSX-11M than for M+;
this is partially offset by the fact that the RSX-11M SYSGEN is
cleaner than the M+ one. For example, you give SYSGEN a list of
the devices you want, and it asks only questions relevant to
those devices (as opposed to M+, which asks an endless series of
questions like "How many PC-11 paper tape readers do you have?").

RSX-11S is a "proper subset" of RSX-11M, designed to run in a
memory-only configuration, booted either from a local disk or
over the net. The major changes involve removal of F11ACP (i.e.,
no system support for Files-11), of multiuser protection, and of
command line support. Other system features (like memory-resident
overlays, parent-offspring tasking, send/receive, etc.) tend to be
there. Most MCR commands are also gone; the minimal system
includes TIM, RUN, and REM. There is no INS, but there is a task
called OTL ("Online Task Loader") which can install tasks off
RT-11 format floppies or DOS-11 tape. Since RSX-11S has a small
installed base and OTL is not shared with RSX-11M, OTL tends to
be a bit buggy, with the bugs generally being in the area of
header setup.

Independent of the RSX variant, there are two varieties of the
TT: driver -- the half-duplex driver and the full-duplex driver.
The full-duplex driver offers more functions, such as true
full-duplex operation (you can have a read and a write posted at
the same time), device-independent cursor positioning,
table-terminated reads, command timeout, and so on. It's also
bigger.

Users of the half-duplex driver need to be aware that a
control/U, though it cancels an MCR command, does NOT release the
command buffer (and therefore does NOT free the terminal for
I/O). You MUST follow the control/U with a return if that is your
intention.

Porting user-mode code among the variants of RSX is generally no
problem. Code written under RSX-11M generally runs unmodified
under M+, though some of the M+ features (such as named
directories) can be picked up by relinking. Similarly, code
linked under M+ generally runs under M, provided it uses no M+
specific features.

There are, however, some differences in task dispatching that may
need to be attended to:
.LIST "*"
.LE;Task _...xxx is never active under M+, though it is the first
of its kind to be activated under M.
.LE;The RPOI_$ directive is restricted to CLIs under M.
.LE;The inheritance of default directories is different between M
and M+, due to the way named directory support was done. Under
M+, a task inherits its default directory from the task that
spawned it, not from the terminal. This means command buffers and
such need to issue a SDIR_$ _.. SD.TI directive to re-acquire the
correct default before each spawn.
.END LIST

.HEADER LEVEL Overlays

Overlays are the canonical way to get five pounds of code in a
two pound bag. If done properly, they can get you a lot of
functionality in a little address space. If done improperly, they
can result in a task that won't run, or even a task that won't
taskbuild.

The best time to overlay a task is before it's coded. Plan to
segment functionality into several independent (or fairly
independent) modules, rather than one megamodule. Of course, this
is a good idea even if you don't need to overlay.

RSX requires you to organize overlays into one or more trees. The
root of the main tree is your main module - that is, the one that
first gets control when the task runs. RSX will (if you ask it)
autoload other segments as they are called, making overlays, if
not painless, at least not excruciating.

There ARE some more-or-less arcane rules for constructing overlay
trees, but they are all based on the following principle: Any
attempt to refer to data or code that is not in memory is doomed
to failure.

DEC has made it fairly difficult for code not to be in memory,
but you can force out a module you need if you're not careful of
your calling relationships. For example, in the following tree
.BLANK;.TEST PAGE 5;.LITERAL
           ROOT
            |
            A
           / \
          B   C
.END LITERAL;.BLANK
"A" can call either "B" or "C", and whichever one gets called
gets loaded in. But if the root calls "B", then "B" calls "A" and
"A" calls "C", "B" has been overlaid by "C", and all perdition
will break out when "A" tries to return to "B", landing instead
at some random location in "C".
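
For concreteness, the tree above might be described to the task
builder with an overlay description (ODL) file along the
following lines. This is a minimal sketch: the names are taken
from the diagram and assumed to be object modules, and since "A"
has no siblings it is simply concatenated with the root.
.BLANK;.TEST PAGE 5;.LITERAL
        .ROOT   ROOT-A-*(B,C)
        .END
.END LITERAL;.BLANK
The hyphen concatenates modules in memory, the comma inside the
parentheses says that "B" and "C" occupy the same addresses, and
the asterisk asks for autoload of both. Nothing in this
description keeps "B" from calling back into "A"; staying out of
the trap described above is up to the programmer.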

Another way to screw up would be for the root to declare "B" as
an AST, then call "C". The AST mechanism has no way to load overlays,
so when the AST fires, control goes to whatever random location
in "C" corresponds to the AST entry point in "B". Moral: Put AST
routines in the root, unless you intend to be VERY careful about
who calls who when.

Data is also susceptible to corruption by this process - in fact,
perhaps more so than code. Most problems of this sort occur when
a routine saves the address of some data, then uses it later.
FORTRAN ASSOCIATEVARIABLEs, for example, should probably be in
the root; at the very least, you MUST close the file before
overlaying the module the variable is in. I/O buffers and status
blocks would also be a problem if the I/O is asynchronous; in
synchronous I/O the system is done with the addresses before you
leave the module. One way to force variables into the root is to
declare them in commons (FORTRAN) or as "extern" ("C").

Problems with overlays can also occur when code modifies
variables initialized with DATA statements, then makes
assumptions about whether the modified value is preserved from
one call to the next. In brief, if the overlay was forced out and
reloaded, the variables got re-initialized. If not, they didn't.
The safe thing to do is not to modify variables initialized with
DATA statements.

All this can leave the poor programmer with a real dilemma about
where to put utility routines in the overlay tree. If they're put
in the root, they can't overlay each other, and the overlay tree
looks like a palm tree, with most of its height being trunk, and
branches bushing out at the very top. If they're put at the tips
of the branches, you need many copies of each, and the task image
becomes immense -- possibly to the point where the task builder
can't build it any more.

There is a third solution - to put the routines in a cotree.
Basically, instead of a single tree, the task becomes an orchard.
Any routine on one tree can call any routine in another tree, as
long as you avoid the cotree version of the "A-B-C" example
above. Routines in a cotree can call each other with the usual
restrictions.
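
In ODL terms, cotrees are written as a comma-separated list on
the _.ROOT line. The sketch below assumes a main tree with
branches "A" and "B", plus two utility modules "U1" and "U2" (all
hypothetical names); UTLRT is assumed to be a null cotree root,
declared with _.NAME, that contains no code of its own.
.BLANK;.TEST PAGE 6;.LITERAL
        .NAME   UTLRT
        .ROOT   MAIN-*(A,B),UTLRT-*(U1,U2)
        .END
.END LITERAL;.BLANK
Here "U1" and "U2" overlay each other, but not "A" or "B", so
either branch of the main tree can call either utility.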

Overlaying is a useful technique only when the overlaying modules
don't change, or when changes can be ignored. This makes it
(sometimes) great for code, but (generally) useless for data.
Those with a need to cram five pounds of data in a two pound bag
should look into files, virtual arrays (in FORTRAN - a BASIC
virtual array is a file), or perhaps memory-resident overlays
(which still have to be loaded by hand).


.HEADER LEVEL I_&D Space

So what is I_&D space anyway, and what does it buy me? The answer
requires a brief tour of the PDP-11's architecture. A PDP-11, any
PDP-11, can only address 64Kb of memory at a time. This
restriction is evaded somewhat by the Memory Management Unit; you
can still see only 64Kb at a time, but it can be the 64Kb of your
choice (sort of) out of the 4Mb the bus can address. This is good
for the executive, but still involves some penalties for the
application programmer: a memory-resident overlay, for example,
is still an overlay.

But on some PDP-11 models, the Memory Management Unit has twice
as many registers as you'd expect, and the capability to tell the
difference between an instruction access and a data access. The
last is pretty simple, really: any access through the PC register
(including fetching the offsets for address modes 6 and 7,
regardless of which register was specified) is an "instruction"
access; anything else is a "data" access.

Hence I_&D, for "Instruction and Data". Given an operating system
that supports it (M+, Micro-RSX, and Coprocessor/RSX do; M and S
don't), what it buys is simple. Instead of having a total of 64Kb
at its disposal, a task can have 64Kb of instructions PLUS 64Kb
of data. This is no big deal for tasks that are mostly code with
a little data or vice versa. But it can significantly flatten (or
eliminate!) overlays for tasks with about the same amount of
each.

Unfortunately, there's no such thing as a free lunch. Though the
CPU knows (by convention!) what's instructions and what's data,
the Task Builder doesn't; it can't afford to make that kind of
assumption, and relies on the compilers and assemblers for this
information, in the form of _.PSECT declarations. If the
information is there in the objects, all you have to do is apply
the _/ID switch, and you're there. If it isn't, you can't use
I_&D space for that particular task.

Whether the _.PSECT information is there depends on where the
object module came from. If it's a MACRO-11 module,
it's there if you put it there. If it's a high-level language
module, it's there if the language supports it and the compiler
is configured for it (NOT the same thing, and PDP-11 Fortran-77
is a prime example!). If it's a library module it reduces to one
of the above two cases, with the complication that if the _.PSECT
information ISN'T there, there may not be much you can do about
it.

As mentioned above, Fortran-77 supports I_&D space, but the
compiler must still be configured to emit the proper code. The
reason is that the FORTRAN OTS sometimes uses calls with linkage
registers, and linkage registers don't work in an I_&D space
environment. "Linkage registers" refers to subroutine calls like
the following:
.BLANK;.NOFILL;.LEFT MARGIN +5
JSR R4,FOOBAR
_.WORD DATA
.LEFT MARGIN -5;.FILL;.BLANK
In this example, R4 serves as the linkage register. The
subroutine refers to the in-line data with instructions like
.BLANK;.NOFILL;.LEFT MARGIN +5
MOV (R4)+,R0
.LEFT MARGIN -5;.FILL;.BLANK
and returns (after R4 has been properly positioned) with "RTS
R4". Without I_&D space, the technique of using linkage registers
and in-line data saves space. With I_&D, however, it doesn't work
- TKB puts the data in instruction space, but the CPU looks for
it in data space. So the Fortran-77 compiler MUST be configured
to produce code that doesn't use this technique. The bad news is
that you have to relink the compiler to do it (the type of code
emitted is determined by a GBLDEF in the taskbuild command file).
The good news is that you probably don't have to keep two
versions of the compiler around, as code configured for I_&D
space will still work when linked without the _/ID switch.

In MACRO-11 modules, you have to do all the sectioning off of
code and data yourself. Basically, this is done by placing
instructions in _.PSECTs with the "I" attribute, and data in
_.PSECTs with the "D" attribute. While you're at it, you may want
to divide readonly data from read/write data (using the "RO" and
"RW" _.PSECT attributes respectively), so the /MU taskbuilder
switch buys you space also. It will suffice to have three
_.PSECTs:
.BLANK;.NOFILL;.LEFT MARGIN +5
_.PSECT PURCOD RO,I,LCL,REL,CON  ; For Code
.BLANK
_.PSECT PURDAT RO,D,LCL,REL,CON  ; For RO data
.BLANK
_.PSECT IMPDAT RW,D,LCL,REL,CON  ; For RW data
.BLANK
.LEFT MARGIN -5;.FILL;.BLANK
All you need to do is to insert one of the above lines in front
of each section of code or data. Once you've defined a _.PSECT,
you don't need to repeat the attributes every time you use it
(though it won't hurt to do so). Just its name is enough for the
task builder, eg:
.BLANK;.NOFILL;.LEFT MARGIN +5
_.PSECT PURCOD
.LEFT MARGIN -5;.FILL;.BLANK
for all blocks of code after the first one.

You'll also have to find all uses of linkage registers, and
recode to eliminate them.
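
As a sketch of such a recoding (one approach among several, with
ARGBLK a hypothetical label), the in-line word can be moved to a
data _.PSECT and its address passed in a register:
.BLANK;.TEST PAGE 12;.LITERAL
        .PSECT  PURDAT,RO,D,LCL,REL,CON
ARGBLK: .WORD   DATA            ; was in line after JSR

        .PSECT  PURCOD,RO,I,LCL,REL,CON
        MOV     #ARGBLK,R4      ; R4 -> arg, now in D space
        JSR     PC,FOOBAR       ; ordinary call
        ; (rest of caller)

FOOBAR: MOV     (R4)+,R0        ; same fetch as before
        ; (rest of subroutine)
        RTS     PC              ; not RTS R4
.END LITERAL;.BLANK
The immediate operand of the first MOV is fetched through the PC,
so it comes from instruction space, where TKB put it; the fetch
through R4 is a data access and finds ARGBLK in data space. Note
that, unlike JSR R4, this sequence does not save and restore the
caller's R4.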

.HEADER LEVEL System Management 101

There is a miscellany of "problems" with managing an RSX system,
many of which come down to "what command do I issue to ...".
This kind of question appears to arise from a number of sources,
including: not realizing the nature of the problem, incomplete
documentation sets, or incompletely or inobviously documented
functionality. This kind of question has been grouped here under
the title "System Management 101".

.HEADER LEVEL +1 The DCL Interface

DCL is not native to RSX, but was layered on top of MCR as an
"alternate" command line interpreter. An RSX system will support
up to 15 alternate CLIs, with MCR, DCL, CCL, and so on
co-existing happily; users can switch between them with a simple
SET command.

Alternate CLIs were implemented with the RSX group's usual
minimalist economy; all the alternate CLI does is dequeue its
command, translate it to the equivalent MCR command, and
reissue the new command to MCR.

DCL was implemented using a fairly involved parser and a two-step
translation scheme. This evidently gave the developers some
trouble, as DCL has to this day the command SET DEBUG, which
causes DCL to display the MCR command it generated from your DCL
command, rather than passing it to MCR for execution. This can be
valuable when you can't figure out why DCL won't do what you
want. For example, there are at least three different FORTRAN
compilers available under RSX: _...FOR (FORTRAN IV), _...F4P
(FORTRAN 4+), and _...F77 (FORTRAN 77). The cheap way to get DCL
to invoke the one you want is to use SET DEBUG to find out what
DCL's default FORTRAN is, and install yours accordingly. You use
SET NODEBUG to get back to normal operation.

Many CLIs under RSX also implement the idea of a "catchall"; that
is, a task which gets the command if the current CLI can't make
heads or tails of it. DCL can be built so that MCR is its
catchall, and in fact it is distributed this way for M+. But
RSX-11M, the last time I looked, still distributed DCL with no
catchall. To turn on this capability, edit DCLBLD.CMD
(instructions are in the file) and relink.

.HEADER LEVEL Console Logging

Console logging is a facility that captures all output to CO: in
a file for your later reading enjoyment. It can be a real boon if
your console terminal is a CRT, or if overzealous neatniks tend
to trash critical console listings. All messages are time stamped
(though not date stamped) on the left. The drawback is that ONLY
output to CO: is in the log file; output directly to TT0: will
not appear, nor will output that bypasses the operating system
completely, such as crash notification. To turn it on, simply
.INDENT 5;.BREAK
SET /COLOG_=ON
.BREAK
You can start a new log file by
.INDENT 5;.BREAK
SET /LOGFILE_=
.BREAK
(no, there's nothing after the "_="). The log files end up in
LB:[1,6].

.HEADER LEVEL Pool Conservation

System managers are frequently looking for more pool. Use of LAT
terminals can gain you pool, but only if you end up with
fewer TT: units by so doing. Unused TT: ports should be SET
/SLAVE; likewise TT: ports connected to "odd" devices, as a
"streaming" line can deplete pool in a heartbeat. On an M+
system, the terminal should be slaved before it is CONed ONLINE.
On an M or S system, it must be done in VMR.

Occasionally, you can gain contiguous pool by SET /MAXPKT_=0,
followed by setting it back to the normal value. This works 
because I/O packets go into the MAXPKT queue when deallocated, 
and they were originally allocated from random locations in pool.
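
A session along the following lines does it. This is a sketch
only: "n" stands for whatever value your system normally uses,
and the first command is assumed to display the current setting
so you can make a note of it.
.BLANK;.TEST PAGE 5;.LITERAL
>SET /MAXPKT
>SET /MAXPKT=0
>SET /MAXPKT=n
.END LITERAL;.BLANK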

.HEADER LEVEL Getting a Crash Dump

Sooner or later (sooner if the pool problems aren't mastered),
the system manager needs to force a crash dump. The general
method is the same for any RSX system: force a transfer of
control to address 40 (octal), which contains a JMP to the crash
code. How the method is implemented, however, depends on the CPU
involved:
.LIST '*'
.LE;11s with a console subsystem:
.LIST
.LE;Type control/P to get the console prompt (_>_>_>). All
console subsystem commands are terminated with the _<Return_>
key.
.LE;Type "H" to halt the system.
.LE;Type "D/G 7 40" to deposit 40 in the program counter.
.LE;Type "C" to continue the processor.
.LE;If this fails, type "S 40"; this initializes the processor,
thus losing you the state of the memory management registers.
.END LIST
.LE;11s with console ODT:
.LIST
.LE;Type control/P to get the ODT prompt ("_@"). Console ODT
commands don't need the _<Return_> key unless explicitly stated.
.LE;Type "H" to halt the system.
.LE;Type "_$7/" to open the program counter; its contents will be
printed.
.LE;Type "40" (with carriage return) to deposit 40 in it.
.LE;Type "P" to proceed.
.LE;If this fails, type "40G", which initializes the system.
.END LIST
.LE;"Real" 11s, with toggle switches and blinking lights:
.BLANK
Regrettably, the author no longer has access to such equipment.
.END LIST

.HEADER LEVEL Disk Management

One of the realities of a modern file system is that disk space
WILL become fragmented. The DEC-approved way to deal with this is
to use BRU to back up and restore the disk. People with
single-disk systems are in a slightly awkward position, but can
get by with standalone BRU. As distributed, you can find
standalone BRU in LB:[1,64]BRU64K.SYS, which you can bring up
using "BOO". Tape-resident versions are available, or you can
make one yourself with BRU64K.SYS, BRU64K.STB, and a suitable copy of VMR
(you'll need the 11M version, as standalone BRU is based on
RSX-11S). You probably want to have the tape on hand even if you
normally use the disk-based version, just in case your disk gets
clobbered.

Another fact of life is that systems evolve. The addition of
named directory support to RSX-11M+ was NOT transparent, due to
the need not to break existing applications. This left old
applications unable to reference named directories. The fix for
this is in most cases very simple; re-link the application. If
the application relies on FCS, RMS, or CSI_$ to parse the file
spec, and if the application is not so tight that the expanded
functionality won't fit in the available address space, this is
all that is needed. Don't forget that this works not only on your
own code, but DEC's (eg: compilers) and DECUS's.

.HEADER LEVEL Emergency Entrance

Occasionally a system manager must gain access, on an emergency
basis, to a system for which he/she does not have a password. The
following procedure has worked on every RSX system the author has
tested:
.LIST
.LE;Reboot
.LE;As soon as the system identifies itself, start typing control/C
as fast as you can.
.LE;When you get an MCR_> prompt, enter "REM _...AT.". This will
cause the startup command file not to run (since there's no "@"
processor to run it), leaving you logged on to the system console
as a privileged user.
.END LIST
This works because ...SAV unslaves the system console on reboot -
so even if TT0: is slaved in VMR, there is a "window of
opportunity" between the time MCR becomes available and the time
[1,2]STARTUP.CMD is dispatched (to possibly slave TT0: again).

Lest the VMS people get too smug, a similar procedure is
available for the VAX:
.LIST
.LE;Do an interactive boot ("B/1" on a MicroVAX).
.LE;On receiving a SYSGEN prompt, enter 'SET UAFALTERNATE 1'. For
a VAXstation, you'll have to 'SET STARTUP__P1 "MIN"' also. Then
"CONTINUE"
.LE;After the system comes up, you should be able to log into
SYSTEM using any password of appropriate length. Don't forget to
run SYSGEN again to put the modified parameters back as they
were.
.END LIST
The above procedure can be defeated by actually providing an
alternate UAF. If that has been done, then at the SYSGEN prompt
enter 'SET/STARTUP=OPA0:'. This causes the system to attempt to
use OPA0: as the startup command file.

.HEADER LEVEL Miscellaneous Tweaks and Twiddles

Various RSX utilities, such as CMP, will sometimes give errors
such as "XXX -- Dynamic Storage Allocation Failure". These tasks
use available space in the task's partition, past the end of the
task, as their work area. For some reason known only to the
developers, they refuse to use the EXTK_$ directive to acquire more
space. However, you can manually increase the available memory by
installing these tasks /INC=nnnn. You will have to play around a
bit to find out what the maximum is; /INC=MAX was "Noted" at one
point, but may not have made it into RSX.
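
For CMP, for example, the sequence might look like this. The
increment is in words, and the figure shown is an arbitrary
starting point rather than a recommendation.
.BLANK;.TEST PAGE 4;.LITERAL
>REM ...CMP
>INS $CMP/INC=4096.
.END LITERAL;.BLANK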

.HEADER LEVEL Peering Under the Hood

One of the advantages of RSX is that you get the source. As the
code is its own most accurate documentation, reading it can
clarify points about how RSX really operates. Unfortunately, you
don't get the code to ALL the components, just the executive; and
typically you get nothing for layered products but an object
library. You can, however, disassemble the modules in the library
if you have a crying need to know what's in them. There are a
couple disassemblers available on the RSX tapes, including DOB
and ORC.

.HEADER LEVEL Strange but True

.LIST "*"
.LE;The output volume of a disk-to-disk BRU can have different
numbers of occupied blocks than the input volume, due to the
possibility of lost files (or files with multiple directory
entries), and differences in [0,0] (particularly BADBLK.SYS on
non-MSCP disks).
.LE;Any utility that uses the GCML_$ interface
(and most of them do) will accept any number of output file
specs, and ignore the extras. This came to our attention when
someone asked the function of the third output file of MAC. 
.END LIST

.HEADER LEVEL -1 Networking and Data Transfer

RSX systems are frequently found in data collection applications.
Getting the data into and out of the RSX system over a network
(DECnet or otherwise) then becomes one of the implementation
tasks.

DECnet is a surprisingly easy networking system to use (that is to
say, I learned on DECnet and have found other systems appallingly
difficult). Typically, you tell it little to nothing about what
you want it to do; you just describe the local configuration and
DECnet figures out the rest. Routing, for example, happens
automatically providing routers are available (and routers aren't
required if the nodes are adjacent and have only a single DECnet
line).

.HEADER LEVEL +1 Managing DECnet/RSX

Managing a DECnet/RSX system consists mainly of managing the node
table and the communications buffers.  Managing the node table
can be a problem in a large net, with nodes coming and going
frequently. The strategy of choice in this case seems to be to
set up a distribution tree, with one (or a few) nodes having the
master list, and others getting it from them. RSX has an NCP
command to support this:
.BLANK;.INDENT 5
_>NCP COPY KNOWN NODES USING source
.BLANK
where SOURCE:: is the node you intend to get the node list from.
There are a couple caveats here:
.LIST "*"
.LE;NCP does not update the static database; you'll have to
define a minimal database by hand, and issue the NCP copy command
every time you want an update.
.LE;DECnet keeps most of its information in private pool
(partition POOL..). This includes the node table. POOL.. does not
expand, nor does DECnet cease operation when it fills up;
instead, DECnet starts using primary pool. Unless POOL.. has a
LOT of freeboard, NCP COPY KNOWN NODES (or any other procedure
that adds a bunch of nodes to the table) can send you down the
tubes.
.END LIST

Managing the communications buffers consists mainly of being sure
there are enough. The critical quantities here (all defined as
SYSTEM parameters in CFE) are:
.LIST "*"
.LE;MAXIMUM LINKS
.BREAK
The maximum number of connections AND CONNECTION REQUESTS
outstanding at any point in time.
.LE;LDB (Large Data Buffers)
.BREAK
Used as data buffers by DECnet. According to the manual, you'll
need one per logical link plus at least one per line (more for
high-speed lines). These come out of POOL.. if space is
available.
.LE;SDB (Small data buffers)
.BREAK
Used for interrupt messages. The manual says to allocate them
similarly to LDBs (one per link plus at least one per line), plus
a couple more "for the pot". These also come out of POOL.. if
space is available.
.LE;CCB (communications control buffers)
.BREAK
Used to manage the LDBs and SDBs; you'll need one for each of the
above. These come out of primary pool.
.END LIST

There are two ways to know when you haven't got enough. One is
the NCP SHOW component COUNT command, which will show you if
you're getting allocation failures. These failures are
accompanied by performance degradation, which will be noticeable
if the failures are sufficiently frequent. The second symptom is
the failure to establish DECnet links due to insufficient DECnet
resources. NTD (which itself uses one to two DECnet links) and
the DECnet RMDemo pages (which you have to build into RMD
yourself using [200,200]UNSGEN.CMD) can give you an idea of what
links you have up at any given time. There's a caveat here,
hinted at under "MAXIMUM LINKS" above: connect requests are not
links (except for the purposes of the NTD display that counts
links outstanding), but they consume all the resources of a link.
One way to soak up any amount of DECnet resources works like
this:
.LIST
.LE;Write (as we all do) a DECnet task "A" that accepts a link
and then just sits around processing data, ignoring anything that
might come in on its mailbox LUN.
.LE;Have task "B" connect to task "A".
.LE;Have task "C" attempt to connect to task "A", retrying
periodically. The connect requests will remain queued to "A"
(though they time out to "C") until "A" either dequeues them or
exits.
.END LIST
There are several (not necessarily mutually exclusive) ways to
deal with this:
.LIST "*"
.LE;Wait for the symptoms to occur, and then track down the
culprit and shoot him/her/it.
.LE;Service the mailbox LUN periodically, if the logic of your
code allows it.
.LE;Use the SPA_$ macro (yes, you have to use MACRO-11) to fire an
AST on receipt of a mailbox message. The AST dequeues messages as
they arrive, remembering to reject any connect attempts so the
resources are deallocated.
.END LIST

.HEADER LEVEL Communicating with VMS

Networking between RSX and VMS systems is generally not the
problem it is perceived to be. The differences in machine
architecture are, as it turns out, no problem. Differences in
operating system can be (especially in the realm of user
authentication), but the problems are generally manageable. And
though the software interface to DECnet LOOKS vastly different
between the two systems, there's really no problem here either.

Most problems with RSX-to-VMS networking revolve around user
authentication. Modern versions of RSX-11M+ support proxy
logins, which is a big help (for one thing, the VMS people
understand it). The proxy database is maintained using
[5,54]PROXY.TSK. For proxies to work, however, a couple
conditions must be met on the RSX side: 
.LIST "*"
.LE;You must be running accounting (which implies M+).
.LE;If the RSX system initiates the link, it must be initiated
from a terminal that was logged in while accounting was active.
This generally excludes tasks run from the clock queue.
.END LIST
Other ways of getting the access control right include aliases on
the RSX side, and default accounts (possibly attached to DECnet
objects) on the VMS side. If explicit passwords need to be
exchanged over the net, remember that earlier versions of RSX
have an eight character maximum on passwords.

Once the formality of access control is taken care of, links are
initiated and accepted in the usual way. The RSX side calls
OPNNTW followed by BFMTn and CONNTW to initiate a connection, and
OPNNTW followed by GNDNTW and ACCNTW to accept a link. The VMS
side, if using transparent DECnet, opens '"TASK=name"' to
initiate a connection to the named task, and opens SYS_$NET to
accept a connection. A connection must be both initiated and
accepted to be successful. If, for example, an RSX task issues a
CONNTW call to a VMS command procedure which does not open
SYS_$NET (or run a program which does) the RSX task will
eventually get "no response from object".
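
From DCL, the transparent interface amounts to a pair of OPEN
commands. In the sketch below, RSXNOD and RECV are hypothetical
node and task names, and LINK is an arbitrary logical name. The
two commands belong to different procedures: one initiates a
connection, the other accepts one.
.BLANK;.TEST PAGE 6;.LITERAL
$! Initiating side:
$ OPEN/READ/WRITE LINK RSXNOD::"TASK=RECV"
$! Accepting side (procedure run for the incoming link):
$ OPEN/READ/WRITE LINK SYS$NET
.END LITERAL;.BLANK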

Almost any VMS program can create, open, read, and write files on
an RSX system just as though they were on a remote VMS system;
just put a node name on the front of the file spec. This works not
only for DIRECTORY, COPY, and TYPE, but for EDIT, BACKUP (save
sets only), SEARCH, and even ANALYZE/RMS. Similarly, most files
on a VMS system (except indexed prologue 2 and 3 files) can be
opened by an RSX task linked to RMS and DAPRES. This can be done
with a FORTRAN-77 task (at least under M+) by simply relinking
the task to the RMS FORTRAN run-time system, RMS itself, and
DAPRES. In addition to whatever commands you need to include the
RMS OTS, you'll need the following in the root of your task:
.BLANK;.LEFT MARGIN +5;.NOFILL
LB:[1,1]RMSLIB/LB:R0EXSY:R0IMPA
LB:[1,1]RMSDAP/LB:R0AULS
.BLANK;.LEFT MARGIN -5;.FILL
You'll also need the following options:
.BLANK;.LEFT MARGIN +5;.NOFILL
RESSUP _= LB:[3,54]RMSRES/SV:0
LIBR   _= DAPRES:RO
.LEFT MARGIN -5;.FILL

There is also a bug in versions of DECnet/RSX prior to M+ 3.0 B
and M 4.2 B, that prevents the RSX system from accepting
connections from more recent systems, either RSX or VMS. The
problem was overvalidation of the connect block, and it can be
fixed by ZAPping NOPs over the offending instructions. The exact
procedure is:
.LIST
.LE;Copy or back up [5,54]NETACP.TSK (JUST IN CASE!!!)
.LE;Dump it to a text file.
.LE;With your favorite editor, find consecutive words having the 
following octal values: 032705, 177774, 001xxx (where xxx doesn't really
matter). Make note of the block:byte number (in octal) of the location 
of the first of these three (eg: 14:130 -- yours may differ wildly from 
this!)
.LE;Execute the following ZAP commands:
.BREAK;.NOFILL;.LEFT MARGIN +5
_> zap
ZAP_> netacp.tsk/ab
__ block:byte/  ; ZAP should show you 032705
__ 240          ; NOPs the first word
__ <CR_>         ; should show you 177774
__ 240          ; NOPs the second word
__ <CR_>         ; should show you 001xxx
__ 240          ; NOPs the third word
__ _^Z           ; CTRL-Z out.
.FILL;.LEFT MARGIN -5
.LE;Now do whatever is necessary to restart DECnet with the new copy of 
NETACP. The minimum is probably:
.BREAK;.NOFILL;.LEFT MARGIN +5
_>NCP SET EXECUTOR STATE OFF
_>NCP CLEAR SYSTEM
_>REM NETACP
_>INS NETACP
_>NCP SET SYSTEM
_>NCP SET EXECUTOR STATE ON
.FILL;.LEFT MARGIN -5
but this procedure grows extra steps if you've VMRed NETACP or SAVed 
your system with DECnet running.
.END LIST

.HEADER LEVEL Communicating With Odd Devices/Systems

There are, of course, a large number of systems for which DECnet
is not the answer, either because the RSX system is too heavily
loaded, or because the remote system won't support it. In these
cases there are alternatives:

.LIST "*"
.LE;Non-DEC implementations of DECnet.
.LE;The DLX interface, which includes only the bottom two layers
of DECnet.
.LE;KERMIT, which is available for a large number of systems, and
from a number of sources within DECUS.
.LE;XMODEM.
.LE;The RSX communication drivers, available for a number of interfaces
(but not, unfortunately, all of them).
.LE;A no-protocol communication program such as DTE (SET HOST/DTE), PIP
(COPY), or TEM; some fiddling with device attributes (such as
/NOECHO) may be needed.
.LE;Your own code. If possible, use QIO or READ rather than
unsolicited-input ASTs, for efficiency.
.END LIST
This grades off into the topic of custom device I/O, which will
be taken up later.

.HEADER LEVEL Miscellaneous Tweaks and Oddities

The two remote terminal commands in general use under RSX, SET
/HOST and RMT, are NOT equivalent. SET /HOST uses the CTERM
protocol, which is DEC's standard remote terminal DECnet
protocol, whereas RMT uses an RSX-specific protocol (there are
other system-specific components available though unsupported:
RVT for VAXen, RST for RSTS). It would seem to be the obvious
move to ditch RMT in favor of the standard, except for the
pragmatic consideration that RMT, where applicable, is more
robust. One should beware, for example, of running ACNT over a
CTERM link, as the link may refuse to pass a lone escape
character, making it impossible to exit ACNT. Modern ACNTs have
been modified to work around this.

Certain versions of the remote terminal protocols have trouble
with large data buffers (eg: screen fills) on input, output, or
both. Symptoms can range from partial I/O up to system crashes.
This can be lived with (while you're waiting for your SPR to be
acted on) by calling GETLUN to determine your device name, and
using smaller buffers if the device name is RT or HT.
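
A minimal FORTRAN-77 sketch of that workaround follows. It
assumes the GETLUN calling sequence given in the Executive
Reference Manual, in which the first word of the six-word
buffer holds the device name in ASCII. LUN 5 and the two buffer
sizes are arbitrary choices for illustration.
.BLANK;.NOFILL
      INTEGER*2 LUNDAT(6), IDS, MAXBUF, RT, HT
      DATA RT /2HRT/, HT /2HHT/
C     Ask which device LUN 5 is assigned to.
      CALL GETLUN (5, LUNDAT, IDS)
      IF (IDS .NE. 1) STOP 'GETLUN FAILED'
C     Default size, reduced for remote terminals.
      MAXBUF _= 512
      IF (LUNDAT(1) .EQ. RT) MAXBUF _= 80
      IF (LUNDAT(1) .EQ. HT) MAXBUF _= 80
.FILL;.BLANK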

Certain communication devices are known to occasionally drop interrupt
enable. When this happens, everything comes to a screeching halt.
When such a device is identified, DECnet generally grows a few
lines of code to deal with the problem, so the solution is to upgrade.
Certain models of the DL, DZQ, and DELQA are known to have this
problem, but this list is far from exhaustive.

.HEADER LEVEL -1 Queue Manager

The queue manager is an odd and sometimes frustrating beast. It is
reasonably functional, but it lacks a callable interface for
anything more than the most rudimentary functions, and the
documentation tends to cover only plain vanilla uses. Anything
out of the ordinary drives one straight to the source, which is
in [25,10].

The source is actually pretty well commented. One finds that the
interface is similar in design to the VAX interface; one creates
jobs, adds files to them, and so on. But the interface is
strictly 13-word send-receive packets - typically one per
function, though some require two (eg: "Start of Job" and
"Son of Start of Job"). Interestingly, the interface requires
files to be identified RMS-style (by fully-qualified file spec),
rather than FCS-style (by device name and file ID) as generated
by PIP and FORTRAN (DISPOSE='PRINT' is equivalent to
DISPOSE='SAVE' under the RMS OTS, and for good reason). PRT...
gets the unenviable task of making the conversion, which it can
only do by reading the file name out of the directory file's
header. There's a trap here for those who rename directories. But
it is possible (though awkward), with the source in hand, to
write your own program interface to the queue manager. Similarly,
one who wants to know what's in the queue can/must open
LB:[1,7]QUEUE.SYS (shared, please!) and use the information in
the sources to help interpret what is found there.

Perusal of the source can also point the way to some useful
techniques that don't require a line of code. For example, it is
sometimes desirable to drain the queue for a stalled printer,
rather than printing the output somewhere else. Assigning a queue
to the null device would seem to be the logical way to do this,
but alas the Queue Manager won't let you assign a print
processor to a pseudodevice. It turns out, on reading QMGASS.MAC,
that you CAN create an "applications queue" to a pseudodevice,
and the only thing you lose is transparent spooling. The MCR
commands for setting up such a queue are:
.BLANK;.NOFILL;.LEFT MARGIN +5
_>INS _$LPP/TASK_=NL0  ! Install "Print" processor
_>QUE NL0:/CR/NM     ! Create queue.
_>QUE NL0:/SP/EX     ! Initialize applications proc.
.LEFT MARGIN -5;.FILL;.BLANK
The critical item is the "/EX" switch on the last QUE command.
Since LPP actually copies the output to NL0:, it takes a bit to
get rid of a large file. Someone REALLY interested in performance
could write their own despooler.

.HEADER LEVEL Fixed in Next Release

Once or twice per clinic, a problem comes in which is simply the
result of a bug in RSX. There is nothing to be done in such
instances but to recommend an upgrade. The more notable bugs
that we have come across in this context include:

.LIST "*"

.LE;M V4.1 through about M V4.2 "C" take the RX50 controller off
line when the drive door is opened, making it necessary to reboot
every time you want to use another floppy.

.LE;M V4.4 and earlier did not provide the NL: device by default
during a SYSGEN. This doesn't sound like much of a problem, until
you realize that SYSGEN requires NL:. You can always override
SYSGEN's default; the moral is not to fall asleep at the switch.

.LE;Micro-RSX V3.1 doesn't recognize a DHQ-11 if the programming
mode jumper is in DHU mode (which, odd-sounding as it is, is
where you really want it).

.LE;Under M+ V4.0 and 4.1, if a non-privileged task issues the
QIO function SF.SMC to a terminal other than its TI:, the system
crashes. Other amusing ways to crash a system include:
.LIST "-"
.LE;RUN TI: (about M+ 2.1)
.LE;PIP __/LI (about M+ 3.0)
.END LIST

.LE;M+ 3.0 "A" and "B" were unable to run "BAD" on an MSCP device
(RCT, the replacement control task, would get totally confused and
mark the disk offline).

.LE;M+ 3.0 "B" and M 4.2 "B" and earlier won't accept DECnet 
connect requests from more recent systems. This was discussed 
earlier under networking.

.END LIST

.HEADER LEVEL Driver Hacks

RSX practically invites people to do custom I/O, and many take up
the invitation. The "Guide to Writing an I/O Driver" is reasonably
clear, and covers the main cases. The main drawback is that
someone decided to remove the appendix on executive data structures.
This appendix is still in the "Crash Dump Analyzer" manual, so
the information is not lost; but neither is it all together in
one handy place.

For many, the most useful part of the manual is the appendix on
converting RSX-11M drivers to RSX-11M+. The procedure is pretty
mechanical, and not very difficult; the M to M+ conversion is one
of the few pieces of kernel-mode code I have done without sending
the system down the tubes somewhere along the line.
Unfortunately, the appendix does not cover ACPs; the best advice
here is to take F11ACP as your pattern, compare the M and M+
versions, and be guided by the differences. External header
support is guaranteed to be an issue (though a minor one, and
covered by the driver manual), and there may be others.

Drivers come in basically two flavors: programmed- (or
character-) interrupt, and DMA (or NPR). The former is covered
with complete examples (at least for those who need to write a
paper tape driver). The latter is a more complex job on Unibus
machines, due to the need to manage the mapping registers. This
is covered in section 7.3 of the last manual I looked at (earlier
ones relegated it to an appendix), but there seems to be no
single strategy on when to allocate and deallocate. The driver
writer should consider that RSX does basically the same thing when
it runs out of UMRs as when it runs out of pool; that is, nothing
good. If you only want a couple, for a high-data-rate
application, you may want to just grab a couple and keep them, to
avoid the overhead (small but non-zero) of continually
reallocating them. Under M+, the online entry point is a good
place to allocate them, since you can then give them back at the
offline entry point. If you
potentially need a bunch of UMRs, you'll probably need to obtain
them on the fly.

Sometimes you find yourself tinkering with DEC's drivers rather
than writing your own. One may, for example, wish to have more
than one copy of a given driver resident simultaneously, due (for
example) to unit or controller restrictions. If the driver is
loadable, it is simple to change the device name, though you need
to beware of the fact that some of the macros have
driver-name-specific code in them. If you need error logging,
you'll have to edit and recompile the error logger modules.
Failing that, you can rename the cloned devices back to the
original device names once they're loaded, provided you were
careful to avoid unit number overlap. The rename can be done at
the online or powerfail entry point, with a privileged task, or
even with OPEN.

Some older drivers come in resident versions only. Loadable
versions may be available from Colorado Springs (though your
mileage may vary). If you can't get it there, you should be able
to "roll your own"; the hardest part is probably getting the
database built the way you want it.

Per Bruce Mitchell, the TT: driver is "a mouse and a half";
surely it's a driver and a half, with a unique customization
capability called the Ancillary Control Driver. The legend about
the invention of ACDs is that the FMS engineers came to the RSX
engineers with the words "We have a problem." The RSX engineers
answered "We noticed," and invented a method to hang a back end
on the terminal driver to do custom processing. The ACD runs in
kernel mode (to avoid the overhead of context switching), and can
do just about anything it wants to the data stream; the main
restriction is that I/O requests can be neither created nor
destroyed. ACDs written during development included a null ACD, a
"backwards" ACD (which reversed all text on output), a "George
Orwell" ACD (to monitor the data stream on another terminal), and
a command line editor. DEC currently distributes some to do
terminal keyboard mapping; character set conversion (eg: ASCII to
EBCDIC) is also possible. If used with non-terminal devices, ACDs
can be used to generate and interpret communication protocols,
and to time stamp data; in the latter use, be sure you elevate
CPU priority to BR6, so the clock doesn't update while you're
copying its value.

In some applications it may be better to connect a task directly
to the controller, rather than going through a driver. RSX
provides the CINT_$ directive to do this. This directive sets up a
subroutine in the task to be mapped in kernel mode and called
whenever the specified interrupt occurs. Data can be shared with
the main task using a common area (that is, a data _.PSECT).
You'll have to put the code in the same _.PSECT, or at least be
REAL careful about _.PSECT naming, so that the code and data can
be mapped with the same APR. CINT_$ can handle DMA as well as
programmed I/O; the same UMR allocation mechanism used by the
drivers is available to your task. Just don't forget to
deallocate them when you're done.

.HEADER LEVEL Topics in Application Design

Application design seems often to be done these days without
either consideration or proper understanding of the platform on
which the application is to be implemented. This kind of thing
works more or less well, depending on what services the
application needs. In areas other than sequential file
operations, the industry has done remarkably poorly in providing
standard interfaces to similar system features, and even were
such an interface to exist (eg: the POSIX effort), the differing
performance of the underlying system services could still be an
issue. Two of the most idiosyncratic areas to be dealt with are
those of intertask communication and of privilege and protection.

.HEADER LEVEL +1 Intertask Communications

One thing nobody can accuse RSX of is a lack of communications
mechanisms. There is, in fact, an embarrassment of riches.
However, one of the things that seems to be lacking is a
multicast mechanism; that is, a way to have several tasks service
(in effect) the same send/receive queue. This deficit can be
remedied in software in at least the following ways:
.LIST "*"
.LE;Create a dispatcher task to forward the messages via the
"Send, Request, and Connect" directive, keeping track of which
tributary tasks are active, and either stalling or (preferably)
queuing messages internally if all tributaries are busy.
.LE;Install the receiver as _...xxx, and have it invoke the GIN_$
directive as soon as it becomes active to change its name. You
will still need to check the queue before exiting, as another
message may have become queued before the name change.
.END LIST

The opposite problem could be referred to as the "Data Logger
Problem"; that is, the configuration where many tasks funnel
messages to one task, which (among other things) logs them to
disk. The problem here is that the "Receive Data or Exit"
directive does just that, and performs no file cleanup if the
"Exit" function is taken. If the code closes and reopens files
for each message processed, performance will suffer. There are
several alternatives available to boost performance.
.LIST "*"
.LE;Process messages using the "Receive Data" directive until it
returns the "No More Messages" error, and only then close the
files and do "Receive Data or Exit".
.LE;Convert to "Receive Data or Stop" and hold the files open. If
necessary, you can "Crash Proof" your files by doing a flush
operation after the updates. The code for an F77 flush (using the
FCS OTS) looks like this:
.BLANK;.NOFILL
;         CALL FFLUSH(LUN)         ! FORTRAN call.
LUN _= 2                            ; Offset to LUN
     _.IIF NDF D.FDB, D.FDB _=: 14   ; FDB Offset.
FFLUSH::
     MOV  @LUN(R5),R2      ; Get LUN from FORTRAN
     CALL _$FCHNL           ; Map to FFDB address
     ADD  _#D.FDB,R0        ; Point to the REAL FDB.
     CALL _.FLUSH           ; Flush the file.
     RETURN                ; Done.
     _.END
.FILL;.BLANK
.END LIST
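
The first alternative can be sketched in FORTRAN-77 as
follows. This is an illustration under stated assumptions, not
tested code: it assumes the RECEIV and RECOEX calling sequences
from the Executive Reference Manual, treats any status other
than +1 as "nothing received", and uses a made-up unit number,
file name, and record format.
.BLANK;.NOFILL
      INTEGER*2 MSG(15), IDS
      LOGICAL OPENED
      OPENED _= .FALSE.
C     Take the next queued message, if there is one.
   10 CALL RECEIV (, MSG, , IDS)
      IF (IDS .EQ. 1) GO TO 20
C     Nothing queued: close up, then receive or exit.
      IF (OPENED) CLOSE (UNIT_=1)
      OPENED _= .FALSE.
      CALL RECOEX (, MSG, , IDS)
      IF (IDS .NE. 1) GO TO 10
C     Reopen the log only if it was closed.
   20 IF (.NOT. OPENED) OPEN (UNIT_=1, NAME_='LOG.DAT',
     1    STATUS_='UNKNOWN', ACCESS_='APPEND')
      OPENED _= .TRUE.
C     Words 1-2 are the sender's name; log the 13 data words.
      WRITE (1, 100) (MSG(I), I _= 3, 15)
  100 FORMAT (1X, 13O7)
      GO TO 10
      END
.FILL;.BLANK
The file is reopened only when a message arrives after the
queue has drained, so a burst of messages costs one open and
one close rather than one pair per message.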

Those who want to re-use the aforementioned log file should be
aware that a FORTRAN REWIND does not reset the end-of-file
position. There are (inevitably) several ways to do this:
.LIST "*"
.LE;PIP file/EOF:1:0
.LE;Issue a Truncate call after the rewind. You'll need the
address of the FDB (under FCS, found as above by calling _$FCHNL
and adding D.FDB to the result) or the RAB (under RMS, found
similarly, though the "magic number" to add is different).
.LE;Read the file header using an ACP QIO and reset the
end-of-file position. You'll have to re-open the file to get FCS
or RMS to realize what you've done. See the back of the I/O
Operations Guide for more information.
.END LIST

.HEADER LEVEL Privilege and Protection

It has been noted that RSX recognizes only two classes of user:
God and dirt. Installed tasks can allow "dirt" to become
"demigod", doing, under the control and (hopefully) restriction
of the task, things that would ordinarily not be possible.
Unfortunately, certain operations are difficult to restrict on
the task level, and premier among these is file access. A
privileged task gets "system" access to all files, and if that
task takes a file name as input, it becomes difficult to keep
people from gaining access to files not meant for them.
Communications programs (privileged so that they could manipulate
terminals other than their TI:) were particularly prone to this,
and the early releases of (at least) KERMIT and TEM were capable
of stealing the system account file.

Lacking the number of privilege bits that other systems may have,
the RSX application can "make do" by turning its privilege on and
off. The GIN_$ directive has support for this; the usual thing to
do is to turn privilege off before a file open operation, and
back on when the operation is complete. Lacking GIN_$, you can
build the task /PR:5 (accepting the restriction in size), and
flip the bit on and off yourself. Kernel mode is also a must for
finding out the characteristics of other tasks and terminals,
such as their UICs. Task characteristics may be tricky, as many
of them are stored in the header, which may not be resident.
[14,10]THPAGE.MAC, the RMD task header display code, is a good
template to follow on how to get task characteristics.