.NO PERIOD
.PAGE SIZE 58,70
.LEFT MARGIN 4
.RIGHT MARGIN 70
.RIGHT;VAX/VMS
.RIGHT;Shutdown/Reboot
.B 3.C;Software Documentation Memo No_. 43.3
.B.C;^&Shutting-down,&#Crashing&#and&#Rebooting&#a&#VAX\&
.B.C;Frank J_. Nagy
.FLAGS SUBSTITUTE
.B.C;$$MONTH#$$DAY,#$$YEAR
.NO FLAGS SUBSTITUTE
.FLAGS BREAK
.STYLE HEADERS 6,0,6
.HL1 Shutting-down VMS
.BR.P;
A VAX should normally be shut down gracefully if at all possible.
This permits the VMS operating system to dump the contents of its
disk caches onto the disks, thus ensuring the integrity of the file
structures and file contents. All processes are shut down and their
I/O streams run down. The shutdown process is initiated by logging in
on the system console as the system operator. The username for this
account is OPERATOR (it has no password). After the initial greeting
messages, a prompt of "OPR>" will be given. At this point, the system
operator command procedure is ready to accept input of certain VMS
commands or specially defined additional commands.
.BR.P;One of these additional commands is "SHUTDOWN" (may be
abbreviated to "SHUT"). This command will initiate the VMS shutdown
procedure after verifying that you want to shut down the system
(reply "Y" or "YES" if you do). The shutdown procedure itself will
ask a few more questions. The first such question is how many minutes
remain before the system should be shut down. This time allows
interactive users to finish their immediate projects and log off the
system. During this time, messages warning of the imminent system
shutdown are sent to all the terminals connected to the system.
Normally this time is selected to be about 5 minutes for the
Development System. The Operational System can usually be shut down
immediately (use a time of 0 minutes) as it has no interactive users.
.BR.P;You will also be asked for a reason for the shutdown. This
reason becomes part of the message sent to all the user terminals.
The reason is also recorded in the system operator log file on the
disk. You should supply meaningful information here, especially when
shutting-|down the Development System.
.BR.P;The third question asks whether the disks should be spun down.
The answer to this question is irrelevant, as our disks do not have
the hardware to allow the system to initiate their spindown. The next
question prompts for an estimate of when the system will be back up.
Responses to this question can range from "immediately" (if just
rebooting for a "fresh" system) to "3 hours" (if, for instance, you
are doing a standalone disk save).
.BR.P;The final question asks if an automatic reboot should be
performed. If you answer yes and the AUTO RESTART switch is in the ON
position, then after VMS has been completely shut down, the computer
will automatically reboot itself and bring VMS back up. After this
final question the shutdown procedure will start the timed delay and
send the first message to all the user terminals.
.BR.P;When the shutdown is complete, a message to that effect will be
typed on the system console. This message will also request that the
VAX-11 processor be halted; this must be done before a new program or
system can be booted. If an automatic reboot was requested, this
message is not printed; instead the system begins its reboot
procedure.
.BR.P;To halt the VAX-11 processor after a system shutdown, you must
first get the attention of the LSI-11 console subsystem by typing
Control-P (hold down the CTRL key and type P) on the console. The
LSI-11 will respond with the prompt ">>>". When you then type "H" (or
"HALT"), the VAX processor will halt and the LSI-11 will type a new
prompt. Note that for this to be possible, the rotary key on the VAX
CPU cabinet (upper right corner) must be in the LOCAL position (the
normal position is LOCAL DISABLE). Otherwise, the LSI-11 will ignore
the Control-P request.
The AUTO RESTART switch should be placed in the OFF position at this
point if the system is to be down for any length of time or if the
processor is to be powered down.
.NOTE Exiting the Console Program
If Control-P has been typed inadvertently, the console command "SET
TERMINAL PROGRAM" will exit the console program, restoring normal
timesharing terminal operation.
.END NOTE
.BR.P;
The following example illustrates the terminal output in doing a VMS
shutdown from the OPERATOR account. Input typed by the operator is
underlined; all other text is output from the system.
.B.C;Shutdown Example
.B.NO FILL.NO JUSTIFY
.BR;OPR>^&shutdown\&
.BR;Are you sure that you want to shutdown the system [Y/N]? &y
.B;#########System shutdown command procedure.
.B;How many minutes until shutdown [0]? ^&12\&
.BR;Reason? ^&Save system disk and install VMS V3.2\&
.BR;Do you want to spin down the disks [No]? &n
.BR;Expected uptime (<RET> if not known)? ^&1 1/2-2 or so hours\&
.BR;Enable automatic reboot [No]? &n
.B;__OPA0:,OPERATOR######11:02:30.42
.BR;System shutdown in 12 minutes; up 1 1/2-2 or so hours.
.BR;Save system disk and install VMS V3.2
.B;##Login quotas - Interactive limit=0, Current interactive value=2
.BR;########Non-operator logins are disabled.
.B;__OPA0:,OPERATOR######11:10:40.07
.BR;Batch and device queues have been stopped.
.B;__OPA0:,OPERATOR######11:14:46.41
.BR;System shutdown in 0 ... Logins are disabled; please log out.
.BR;Save system disk and install VMS V3.2
.BR;########Invoke installation dependent shutdown procedure.
.BR;########Stop all user processes.
.BR;########Remove installed images.
.BR;########Dismount all mounted volumes.
.BR;%OPCOM, 5-Mar-1983 11:15:21.55, messages from user OPERATOR
.BR;__OPA0:, Operator requested shutdown
.BR;%OPCOM, 5-Mar-1983 11:15:22.73, logfile closed by operator OPA0
.BR;########logfile was SYS$MANAGER:OPERATOR.LOG
.BR;########SYSTEM SHUTDOWN COMPLETE - USE CONSOLE TO HALT SYSTEM
.B;>>>&H
.B;#########HALTED AT 80007D3C
.B.FILL.JUSTIFY
.HL2 Stopping Only the ACNET Processes
.BR.P;
The OPERATOR account also provides the ACNET command which permits
just the ACNET processes to be stopped or restarted without stopping
or rebooting the entire VMS system. All the ACNET processes are
stopped by using the command:
.B.I+10;^&OPR>\&ACNET STOP
.B
A single, selected ACNET process may be stopped by appending its name
to this command as in:
.B.I+10;^&OPR>\&ACNET STOP name
.B
However, if the network process (ACNET) is to be shut down, then all
the ACNET processes must be stopped. The Operator HELP command can be
used to provide information on the ACNET command and to list the
recognized names of the ACNET processes.
.HL1 Rebooting
.BR.P;Once VMS has been shut down and the problem corrected or the
standalone program run, VMS must be rebooted. This is done by typing
"BOOT" (or just "B") in response to the LSI-11 prompt ">>>". This
will start a VMS bootstrap from the default disk. For the Development
VMS system the normal default device is DRA0_: (RM80 disk drive 0) if
running on the VAX1 CPU. The Operational VMS system, which is usually
running on the VAX2 CPU, is also booted from device DRA0_:. However,
in this case device DRA0_: is an RM03 disk drive. The local console
floppies normally used are set up for DRA0_: as the default boot
device for the standard case of running the Development system on
VAX1 and the Operational system on the VAX2 CPU. Once the VMS
bootstrap is started, the AUTO RESTART switch and the rotary key can
be returned to their normal positions of ON and LOCAL DISABLE.
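.BR.P;The halt and reboot rules described so far can be summarized in
a small Python sketch. It is only an illustration and not part of any
VAX or VMS software: the class name Console, its fields and the
strings it returns are assumptions made for this example, and the
real LSI-11 console is operated by hand from the console terminal.
.B.LITERAL
```python
from dataclasses import dataclass


@dataclass
class Console:
    """Toy model of the console rules (hypothetical names)."""
    key: str = "LOCAL DISABLE"  # rotary key position
    at_prompt: bool = False     # True when >>> is showing
    halted: bool = False        # True when the VAX is halted

    def control_p(self) -> bool:
        # The LSI-11 ignores Control-P unless the key is LOCAL.
        if self.key == "LOCAL":
            self.at_prompt = True
        return self.at_prompt

    def command(self, text: str) -> str:
        if not self.at_prompt:
            raise RuntimeError("no >>> prompt; type Control-P")
        word = text.strip().upper()
        if word in ("H", "HALT"):
            self.halted = True
            return "halted"
        if word in ("B", "BOOT"):
            # The processor must be halted before booting.
            if not self.halted:
                raise RuntimeError("halt the processor first")
            self.halted = False
            return "booting from default disk"
        if word == "SET TERMINAL PROGRAM":
            self.at_prompt = False
            return "back to normal terminal operation"
        raise ValueError("command not modelled: " + word)


console = Console()
print(console.control_p())   # key is LOCAL DISABLE: ignored
console.key = "LOCAL"
print(console.control_p())   # now the >>> prompt appears
print(console.command("H"))
print(console.command("BOOT"))
```
.END LITERAL
.B
Under these assumptions, Control-P has no effect until the key is
turned to LOCAL, and BOOT is refused until the processor has been
halted.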
.BR.P;The command "BOOT#tcu" is used to bootstrap the VAX from some
disk other than the default disk. To boot from disk DRA0_:, use the
command "BOOT#VA0". Similarly, to boot from disk DRA1_:, use the
command "BOOT#VA1" and so forth for disk drive units 2, 3, 4 and so
on (only drives 0, 1, and 2 exist currently on the A Massbus
Adapter). To boot from a disk attached to the B Massbus Adapter, use
the command "BOOT#VBn" where n is the disk drive unit number (only 0,
1, and 2 are currently in existence). The "BOOT#tcu" command will
locate the command procedure called "tcuBOO.CMD" on the console
floppy and execute it in the LSI-11. VcuBOO command files for all
existing disk drives are in place on the console floppies. Further
information about the contents of the local console floppy can be
found in ^&Software Documentation Memo No_. 50.1\&.
.BR.P;The VAX will proceed through its bootstrap procedure
unassisted. Unlike RSX-11M, the time and date need not be entered as
they are usually initialized from the built-|in day clock. However,
the SPEAR package (a system error analysis package installed and used
by Field Service) may request the operator to enter a reason for the
(previous) shutdown by prompting with:
.B.I+10;Please give the reason for shutdown:
.B
followed by a list of keywords (in caps) and descriptions. You should
enter the appropriate keyword (use OTHER if none of those listed
applies). This prompt times out after a short time to permit the VAX
to run unattended, including the ability to restart itself without
human intervention in the event of system software or power failures
(provided the AUTO RESTART switch is in the ON position).
.HL2 Restarting Only the ACNET Processes
.BR.P;
Just as an OPERATOR command exists to stop just the ACNET processes,
similar commands can restart all (or just one) of the ACNET
processes.
After all the ACNET processes have been stopped, they may all be
restarted by the command:
.B.I+10;^&OPR>\&ACNET BOOT
.B
or a single ACNET process may be restarted (without affecting the
others) after it alone has been stopped or has crashed on its own.
This is effected by the command:
.B.I+10;^&OPR>\&ACNET RESTART name
.B
It is quite a bit faster to stop and restart all the ACNET processes
than to shut down and reboot VMS, and doing so will usually correct
the problem, making a complete system reboot unnecessary. Note that
the ACNET processes are shut down before VMS is stopped and are
started as part of the VMS reboot.
.HL1 System Crashes
.BR.P;The ACNET VAX's are normally run with AUTO RESTART on. If there
is a power outage, the VAX will automatically restart itself when
power comes back on. Both VAX processors and the shared memory are
equipped with battery backup power to maintain the contents of the
MOS memory across short AC power outages. The RESTAR.CMD file on the
console floppy directs the LSI-11 to start the "Restart Referee" in
the VAX-11 processor to prepare to restart the system at the point
where it left off (and saved volatile status) due to the power fail
interrupt. If the "Restart Referee" detects that the memory has
become corrupted (due to batteries being drained or turned off), then
a reboot will be initiated to load a fresh copy of VMS into memory.
This reboot will proceed just as above in the case of a manual
reboot. With the present memory configuration of 2.5 Mbytes on a VAX,
the battery backup system will normally maintain memory contents for
at least 15 minutes.
.BR.P;
The detection of certain types of software problems (BUGCHECK's) will
cause VMS to crash itself. The machine will write the physical memory
contents to the system dump file and then begin a reboot (AUTO
RESTART on) automatically. This is one case in particular for which
the system dump should be saved and the analysis report printed.
Such materials may have to be sent to DEC with a Software Performance
Report.
.BR.P;The system operator can also bring VMS down with an emergency
shutdown procedure. After logging in as the system operator, the
command "OPCCRASH" (with two C's) is issued and a YES reply is given
to the crash request verification question. This will crash VMS in
much the same way as a BUGCHECK. If you cannot log in (usually due to
the system being hung in a high priority process), turn the keyswitch
to LOCAL and type Control-P on the console to get the LSI-11 prompt.
The console command "@CRASH" will then initiate a system crash from
the LSI-11. Either of these procedures should be used ^&very\&
sparingly and then only when the system cannot be shut down
gracefully.
.HL2 Saving the System Dump
.BR.P;
A few minutes after the system is up and running after a crash, a
batch job called SYSDMPJOB may request that the SYSDMP tape be
mounted on a tape drive. This batch job is used to save the system
memory image dump file. If the system was shut down on purpose (not
due to a software bug), there is no reason to save this system dump
or print the crash analysis report and the batch job will usually not
request the tape (it internally detects that the shutdown was
requested by the operator).
.BR.P;
When the system dump is to be saved, a request to mount the SYSDMP
tape on a tape drive will be made and printed on the system console a
few minutes after the system has been rebooted. If the tape request
and print job are to be aborted (that is, the dump is not to be saved
and the crash analysis is not to be printed; Development System
only), log in to the OPERATOR account and, after SYSDMPJOB makes its
tape mount request, abort it by typing a "NODUMP" command in response
to the "OPR>" prompt.
.BR.P;
If the shutdown was due to a software problem, the system dump should
be saved for further analysis.
In this case, you should mount the SYSDMP tape on the specified tape
drive (as identified in the request message, usually MTA0:). The
system will automatically detect that the tape has been mounted and
proceed accordingly. When the disk-|to-|tape copy has completed, the
tape will be rewound and unloaded and a completion message will be
printed on the console (the tape may then be removed and returned to
the rack atop the tape drive). If the tape is not mounted, VMS will
reprompt the operator every so often with the tape request. After 30
minutes, the SYSDMPJOB times out, kills the tape request and
terminates. The save of the system memory dump will then have to be
done manually by systems programming personnel.
.HL1 Switching VAX CPU's
.BR.P;ACNET has two separate VAX systems, one for Operational use
with the Accelerator Control System and the other for a backup (and
incidentally used as the Software Development System). The layout of
the two VAX's in the computer room is shown in Figure I (see below).
The actual hardware is identified by red "VAX1" and "VAX2" tags
attached to the CPU cabinets. In switching CPU's, you must also be
familiar with the position of the DT07 UNIBUS Switch cabinet and the
location of the disk drives.
.BR.P;
Sometimes the Operational System must be switched to the backup CPU
because of a failure in the machine it was running on or to (for
instance) allow the DEC engineers to troubleshoot a problem or
perform Preventive Maintenance. Normally, the VMS Operational System
should be running on the VAX2 CPU, the one nearest the door leading
to the Main Control Room (this machine has a single tape drive
available as opposed to the two tape drives on the VAX1 system).
Before you can switch machines, VMS must be shut down on both
machines as detailed above.
.BR.P;To interchange the identities of the Development and
Operational Systems, the following items must be done:
.LIST 1
.LE;The dual-|port disk drives must be switched to the other machine.
Figure III shows the location (in the cutout in the lower door of the
drive cabinet) of the dual-|port pushbuttons. These pushbuttons can
be in one of three states:
.LIST 1
.DISPLAY ELEMENTS "(",LL,")"
.LE;Drive is locked in A-|port-|only access if the A button is
lighted (pushed in), and the B button is unlighted (and not pushed
in).
.LE;Drive is locked in B-|port-|only access if the B button is
lighted (pushed in), and the A button is unlighted (and not pushed
in).
.LE;Drive is in dual-|port access if both buttons are pushed in (but
neither will be lighted; the light will come on when the drive is
actually accessed on a particular port).
.END LIST
Depressing a dual-|port pushbutton will either lock it into the
depressed (port is enabled) state or will release it (if already
depressed; disables the port). After setting the dual-|port
pushbuttons appropriately, you must then start the disk spin-down
procedure and then cancel it to actually switch the dual-|port logic.
This is done by depressing the START button (RM03 drive, see Figure
IV) or the RUN/STOP button (RM80 drive, see Figure V) once and then
again after a delay of a second or two. The light should switch from
the one port select pushbutton to the other (or both should go off if
enabling dual-|port access) when the disk drive has spun back up to
full speed. The control panels for the RM03 and RM80 disk drives are
illustrated in Figures IV and V (the "switch" labelled with the "2"
in each case shows the position of the disk drive unit plug). The
READY light of an RM03 drive will flash while the pack is being spun
up or down; both the READY and START lights go out when the pack is
fully spun down, and both are on when the pack is fully spun up.
Similarly, the READY light of an RM80 drive will go out until the
drive is spun up to operational speed (note that the procedure
described above will cause the RM80's to undergo a full cycle by
slowing down to a full stop before restarting the spin-|up
operation).
.NOTE Dual-port Enable
Those disk drives enabled for dual-|port access should normally be
left in that state when switching CPU's. In addition, it may be that
in the future all the disks will be enabled for dual-|port access and
this whole description of switching the dual-|port disk drives
rendered moot.
.END NOTE
.LE;The Switched UNIBUS (RL02 and RX02 disks, terminals, line
printers, and card reader) must be switched to the other machine. The
Switched UNIBUS is controlled by the DT07 UNIBUS Switch illustrated
in Figure II. The UNIBUS is switched by flipping the MANUAL CONNECT
switch (top row of switches) from ON to OFF for the currently
connected port (indicated by the light above the switch) and the
other one from OFF to ON. Note that only ports 0 and 1 of the DT07
are used currently and correspond, respectively, to VAX1 and VAX2.
The DT07 should be left in the MANUAL CONNECTION mode (bottom row of
switches).
.LE;The console floppies must be interchanged so that the default
boot devices are correct (standard boot devices are DRA0_:; default
boot devices are DRB0_: when switched to the other machine). The
console floppies are labelled to identify which disk they are set up
to use as the default boot device (by the device name, see Table I).
In addition, the standard ACNET console floppies are identified by
the system and CPU as shown:
.B.C;Standard ACNET Console Floppy Labels
.B
.BR.I+15;RX1 VAX-11/780 Local Console
.BR.I+15;For Development System on VAX1
.BR.I+15;Default boot device is DRA0:
.B2
.BR.I+15;RX1 VAX-11/780 Local Console
.BR.I+15;For Operational System on VAX2
.BR.I+15;Default boot device is DRA0:
.B
The standard ACNET console floppies for a CPU can be found in the
RX01 drive inside the CPU cabinet and in pockets attached to the
inside of the CPU cabinet doors.
.END LIST
The left-|hand VAX (CPU VAX1) is on disk port B and DT07 UNIBUS port
0. The right-|hand VAX (CPU VAX2) is on disk port A and DT07 UNIBUS
port 1.
Table I lists the device names of the disk drives (excluding the RL02
and RX02 drives) as seen from each CPU.
.TP15.B2.C;Table I
.C;Disk Drive Device Names
.B.LM+8
.TAB STOPS 14,30,40
.BR;^&Disk Drive	to VAX1	to VAX2\&
.B
.BR;RM80 drive _#2	DRB2_:	DRA2_:
.BR;RM80 drive _#1	DRB1_:	DRA1_:
.BR;RM80 drive _#0	DRA0_:	DRB0_:
.B
.BR;RM03 drive _#0	DRB0_:	DRA0_:
.BR;RM03 drive _#1	DRA1_:	DRB1_:
.BR;RM03 drive _#2	DRA2_:	DRB2_:
.B.LM-8
.HL1 Normal Operational Setup
.BR.P;
The standard setup is to be running the Development VMS system on the
VAX1 CPU and the Operational system on the VAX2 CPU. In this mode,
all three RM80 disks are set to port B access only and the DT07 is
set to enable port 0. RM03 disks 0 and 1 should be set to port A
access only (these are the system and user disks for the Operational
system). RM03 disk drive 2 is the backup for the Operational system
disks and does not need to have its port select changed (it is
normally kept in dual-|port access mode).
.BR.P;In the normal case, the local console floppies actually loaded
into the CPU floppy drive are those for which the default boot device
is DRA0_:. The local console floppies are labelled with the legend:
.B.I+10;RX1 VAX-11/780 Local Console
.BR.I+10;For <systemname> on VAXn
.BR.I+10;Default boot device is DRA0_:
.B;The <systemname> is either Operational or Development and is used
to define the system which this floppy's default boot device holds.
The default boot device is further identified on the third line of
the label. The floppies with default boot device DRB0_: are used when
the systems are being run on a CPU other than the one on which they
are normally executing.
.BR.P;
Spare console diskettes can be found in the blue plastic diskette box
kept inside the VAX1 CPU cabinet. This box contains one each of the
standard ACNET console floppies using DRA0_: as the default boot
device.
In addition, a set of floppies which provide for using the non-|zero
disk units as system disks (for the Operational System) can be found
in the box. Since the Operational System disk is an RM03 removable
disk pack, it need not always be placed in RM03 drive 0. In fact the
system startup command procedure also does not absolutely require the
Operational user disk (volume name OPUSR1) to be in RM03 drive 1. The
Operational disks may be shuffled among the RM03 drives (provided
that VMS has been shut down, of course) by following the guidelines:
.LIST
.LE;The system startup procedures expect the user disk to be on RM03
drive 1 and will attempt to software mount it from there. If this
fails then the software mount will be attempted on either RM03 drive
2 or 0 (in that order) depending upon which is known not to have the
system disk mounted. It should be noted that the startup procedure
requires noticeably longer to complete if the user disk is not where
first expected (the mount failure is really a timeout).
.LE;The Operational system and user disk drives should not be in
dual-|port mode but locked onto either port A or port B (depending
upon which CPU is being used to run the Operational system).
.LE;The console floppy should be replaced with one of the
non-|standard floppies from the blue plastic box. The appropriate
floppy is selected on the basis of which RM03 disk drive is being
used as the Operational system disk (see Table I).
.END LIST .PAGE .C;Figure I .C;VAX CPU and Disk Layout .TAB STOPS 12,20,28,36,44,52,60,68,76 .B .LITERAL /\ || South Wall +--------+--------+--------+ | | | | +---------------+ |RL02/01 | Magtape| Magtape| | DT07 Switch | | disk | drive | drive | | and RX02 | | drives | | | | cabinet | | | | | | | | | | | +---------------+ +--------+--------+--------+ | | | | | VAX1 CPU | | cabinets | +---------------+ | | | | | | | RM80 drive #2 | | DT07 port 0 | | | | | +---------------+ | Disk port B | | | | | | RM80 drive #1 | | | | | +---------------+ +---------------+ | MA780 | | | | shared | | RM80 drive #0 | | memory | | | | cabinet | +---------------+ +---------------+ | | | | | RM03 drive #0 | | | | | | VAX2 CPU | +---------------+ +-------+ | cabinets | | | | LA120 | | | | RM03 drive #1 | | for | | | | | | VAX1 | | DT07 port 1 | +---------------+ +-------+ | | | | | LA120 | | Disk port A | | RM03 drive #2 | | for | | | | | | VAX2 | | | +---------------+ +-------+ +---------------+ Main Control Room || \/ .END LITERAL .PAGE .C;Figure II .C;DT07 UNIBUS Switch Front Panel .B .LITERAL +-------------------------------------------------------+ | | | | | | | | | Ports | | 0 1 2 3 | | ON OFF STND on | | ___ BY +++ +++ +++ +++ | | / \ off| | | \ | | | \___/ manual| | +++ +++ +++ +++ | | program| +-------------------------------------------------------+ .END LITERAL .B4 .C;Figure III .C;Disk Drive Port Selection Pushbuttons .B .LITERAL --------------------------------------------------------+ | | | | | +-------------------+ | | | | | | | | | | | +---+ +---+ | | | | A | | B | | | | +---+ +---+ | | | | | | | | +-------------------+ | | | | | | | __ | / \ | \__/ | | | .END LITERAL .C;Lower door of disk drive cabinets .PAGE .C;Figure IV .C;RM03 Disk Drive Controls .B .LITERAL +-------------------------------------------+ | | | START READY FAULT PROTECT | | | | o o o o | | | | +---+ +---+ +---+ +---+ | | | | | 2 | | | | | | | +---+ +---+ +---+ +---+ | | | 
+-------------------------------------------+
.END LITERAL
.B4
.C;Figure V
.C;RM80 Disk Drive Controls
.B
.LITERAL
+-------------------------------------------------+
| +-----+ +-----+ +-----+ +-----+ +-----+ +-----+ |
| | RUN | |FAULT| |  2  | |WRITE| | STAT| | STAT| |
| | STOP| |     | |READY| | PROT| |  1  | |  2  | |
| +-----+ +-----+ +-----+ +-----+ +-----+ +-----+ |
+-------------------------------------------------+
.END LITERAL
.TEST PAGE 20.B 5
^&Distribution\&
.B.NO PERIOD.TAB STOPS 20
##Normal
.BR;##Operations Group
.BR;##E. Anderson
.BR;##B. C. Brown
.BR;##K. Eng MS 120
.BR;##L. Klein
.BR;##F. Mehring
.BR;##D. Quarrie MS 223
.BR;##D. Ritchie MS 120
.BR;##K. Schuh
.BR;##J. Tinsley
.BR;##3 copies for computer room (F. Nagy)
.BR;##file
.B4
fjn: USR$DISK1:[NAGY.DOC]VAXBOOT.MEM