WASD Hypertext Services - Technical Overview

8 - Mapping Rules

Mapping rules are used in four primary ways.

  1. To map a request path onto the VMS file system.

  2. To process a request path according to specified criteria resulting in an effective path that is different to that supplied with the request.

  3. To identify requests requiring script activation and to parse the script from the path portion of that request. The path portion is then independently re-mapped.

  4. To conditionally map to different end-results based on one or more criteria of the request.

Mapping is basically for server-internal purposes only. The only time the path information of the request itself is modified is when a script component is removed. At all other times the path information remains unchanged.

Hence, path authorization is always applied to the path supplied with the request, not to the results of any rule mapping that may have occurred. This means authorization paths may be administered without any ambiguity introduced by rule mapping.

By default, the system-table logical name HTTPD$MAP locates a common mapping rule file, unless an individual rule file is specified using a job-table logical name. The rules are changed simply by editing the mapping file. Comment lines may be included by prefixing them with the hash ("#") character. Although there is no fixed limit on the number of rules, scanning a large, linear database has processing implications.

Rules are given a basic consistency check when loaded (i.e. at server startup, map reload, etc.). If there is an obvious problem (unknown rule, missing component, path not absolute, etc.) a warning message is generated and the rule is not loaded into the database. This will not cause the server startup to fail. These warning messages may be found in the server process log.
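
The kind of check described above might be sketched in Python as follows. This is an illustration only, not the server's code: the function name check_rule, the set of directive names (taken from the rules described in 8.2 - Rules), the case-insensitive treatment of the directive and the wording of the returned warnings are all assumptions made for the example.

```python
# Hypothetical sketch of a load-time consistency check (not WASD source code).
KNOWN_RULES = {"map", "pass", "fail", "redirect",
               "exec", "exec+", "script", "script+"}
# Rules that section 8.2 describes as always having a result component.
NEEDS_RESULT = {"map", "redirect", "exec", "exec+", "script", "script+"}


def check_rule(line):
    """Return a warning string for an obviously bad rule, otherwise None."""
    text = line.strip()
    if not text or text.startswith("#"):
        return None                      # blank line or comment
    fields = text.split(None, 2)         # directive, template, rest of line
    directive = fields[0].lower()        # assumption: directive case is ignored
    if directive not in KNOWN_RULES:
        return "unknown rule"
    if len(fields) < 2:
        return "missing component"       # no template supplied
    if not fields[1].startswith("/"):
        return "path not absolute"
    if directive in NEEDS_RESULT and len(fields) < 3:
        return "missing component"       # no result supplied
    return None
```

A loader built along these lines would log the warning and skip the rule rather than abort startup, which is the behaviour the paragraph above describes.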

The server administration rule mapping facility allows arbitrary paths to be checked against the rule database in real-time. See 11.2 - HTTPd Server Reports.

Any changes to the mapping file may be (re)loaded into the running HTTPd server using the following command on the server system:

  $ HTTPD /DO=MAP

Also see 6 - Server Configuration for daemon configuration.

Server Mapping Rules

A server's currently loaded mapping rules may be interrogated. See 11 - Server Administration for further information.


8.1 - VMS File System Specifications

A VMS file system specification in a mapping rule is always assumed to begin with a device or concealed device logical name. Specifying the Master File Directory (MFD) component, [000000], is completely optional; it is always assumed to be implied. The mapping functions will insert one wherever it is required for correct file system syntax. That is, if the VMS file system mapping of a path results in a file in a top-level directory, an MFD is inserted if not explicitly present in the mapping. For example, both of the following paths

  /dka100/example.txt
  /dka100/000000/example.txt

would result in a mapping to

  DKA100:[000000]EXAMPLE.TXT

The MFD is completely optional both when specifying paths in mapping rules and when supplying paths in a request. Similarly, when supplying a path that includes directory components the MFD is completely optional, as in

  /dka100/dir1/dir2/example.txt
  /dka100/000000/dir1/dir2/example.txt

both mapping to

  DKA100:[DIR1.DIR2]EXAMPLE.TXT
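
The MFD handling in the examples above can be sketched in Python. This is a minimal illustration, not the server's implementation: the function name url_path_to_vms is hypothetical, the path is assumed to name a file (not a directory), and the result is upper-cased only to match the examples.

```python
def url_path_to_vms(path):
    """Convert a URL-format path (/device/dir/.../file) to a VMS file spec.

    Illustrative sketch only; assumes the last component is a file name.
    """
    parts = [p for p in path.split("/") if p]
    if len(parts) < 2:
        raise ValueError("expected at least /device/file")
    device, *dirs, filename = parts
    # An explicit MFD is optional, so drop it if it was supplied ...
    if dirs and dirs[0] == "000000":
        dirs = dirs[1:]
    # ... and insert it when the file is in the top-level directory.
    if not dirs:
        dirs = ["000000"]
    return f"{device}:[{'.'.join(dirs)}]{filename}".upper()
```

With this sketch both "/dka100/example.txt" and "/dka100/000000/example.txt" produce DKA100:[000000]EXAMPLE.TXT, and both directory forms produce DKA100:[DIR1.DIR2]EXAMPLE.TXT.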

Implication: logical names used in file system mappings must be usable as concealed devices; they cannot be logical equivalents of directory specifications. Concealed device logicals are created using the following syntax:

  $ DEFINE LOGICAL_NAME device:[dir1.dir2.]
  $ DEFINE LOGICAL_NAME /TRANSLATION=CONCEALED physical_device:[dir1.dir2.]


8.2 - Rules

MAP, PASS, FAIL Rules

  1. map template result

    If the URL path matches the template, substitute the result string for the path and use that for further rule processing. Both template and result paths must be absolute (i.e. begin with "/").

  2. pass template
    pass template result
    pass template "999 message text"

    If the URL path matches the template, substitute the result if present (if not, just use the original URL path), processing no further rules.

    The result should be either a physical VMS file system specification in URL format or an HTTP status-code message (see below). If there is a direct correspondence between the template and result the result may be omitted.

    The "PASS" directive is also used to reverse-map VMS file specifications to the URL path format equivalent.

    An HTTP status-code message can be provided as a result. The server then generates a response corresponding to that status code containing the supplied message. Status-code results should be enclosed in single quotes, double quotes or curly braces. See examples. A 3nn status results in a redirection response with the message text comprising the location. Codes 4nn and 5nn result in an error message. Other code ranges (e.g. 0, 1nn, 2nn, etc.) simply cause the connection to be immediately dropped with no indication of why, and can be used for that purpose.

  3. fail template

    If the URL path matches the template, prohibit access, processing no further rules. The template path must be absolute (i.e. begin with "/").

REDIRECT Rule

  1. redirect template result

    If the URL path matches the template, substitute the result string for the path. Process no further rules. The result must be a full URL (http://host/path), and is used to redirect requests to another server on a separate host.

EXEC and SCRIPT, Script Mapping Rules

EXEC+ and SCRIPT+

Also see 12 - Scripting.

The "EXEC" and "SCRIPT" directives have the variants "EXEC+" and "SCRIPT+". The variants behave in exactly the same fashion and simply mark the rule as representing a CGIplus script environment. See 12.7 - CGIplus Scripting. Caution! If changing rules involving these variants it is advised to restart the server rather than reload. Some conflict is possible when using new rules while existing CGIplus scripts are executing.

The "EXEC" rule maps CGI script directories.

The "SCRIPT" rule maps CGI script file names. It is a little different to the "EXEC" rule and is an extension to the CERN rules.

Both rules must have a template and result, and both must end in a wildcard asterisk. The placement of the wildcards and the subsequent functionality is slightly different however. Both template and result paths must be absolute (i.e. begin with "/").

  1. exec template result

    The "EXEC" rule requires the template's asterisk to immediately follow the slash terminating the directory specification containing the scripts. The script name follows immediately as part of the wildcard-matched string. For example:

      exec /htbin/* /ht_root/script/*
    

    If the URL path matches the template, the result, including the first slash-terminated part of the wildcard-matched section, becomes the URL format physical VMS file specification of the script to be executed. What remains of the original URL path is used to create the path information. Process no further rules.

    Hence, the "EXEC" rule will match multiple script specifications without further rules, the script name being supplied with the URL path. Any script (i.e. procedure, executable) in the specified directory is therefore accessible, a possible security concern if script management is distributed.

  2. script template result

    The "SCRIPT" rule requires the template's asterisk to immediately follow the unique string identifying the script in the URL path. The wildcard-matched string is the following path, and supplied to the script. For example:

      script /conan* /ht_root/script/conan*
    

    If the URL path matches the template, the result becomes the URL format physical VMS file specification for the DCL procedure of the script to be executed (the default file extension of ".COM" is not required). What remains of the original URL path is used to create the path information. Process no further rules.

    - NOTE -

    The wildcard asterisk is best located immediately after the unique script identifier. In this way there does not need to be any path supplied with the script. If even a slash follows the script identifier it may be mapped into a file specification that may or may not be meaningful to the script.

    Hence, the "SCRIPT" rule will match only the script specified in the result, making for finely-granular scripting at the expense of a rule for each script thus specified. It also implies that only the script name need precede any other path information.

    It may be thought of as a more efficient implementation of the equivalent functionality using two CERN rules, as illustrated in the following example:

      map /conan* /cgi-bin/conan*
      exec /cgi-bin/* /cgi-bin/*
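
The difference between the way "EXEC" and "SCRIPT" rules separate the script from the path information can be sketched in Python. This is an illustration under stated assumptions, not the server's code: the function names split_exec and split_script are hypothetical, each template and result is assumed to end in a single "*", and matching is done case-sensitively for simplicity.

```python
def split_exec(path, template, result):
    """EXEC: the first slash-terminated part of the wildcard match is the
    script name; whatever remains becomes the path information."""
    prefix = template[:-1]               # template without its trailing "*"
    if not path.startswith(prefix):
        return None
    matched = path[len(prefix):]
    name, slash, rest = matched.partition("/")
    return result[:-1] + name, slash + rest


def split_script(path, template, result):
    """SCRIPT: the template itself identifies the script; the whole
    wildcard match becomes the path information."""
    prefix = template[:-1]
    if not path.startswith(prefix):
        return None
    return result[:-1], path[len(prefix):]
```

For instance, split_exec("/htbin/ismap/web/example.conf", "/htbin/*", "/ht_root/script/*") gives the script "/ht_root/script/ismap" and the path "/web/example.conf", while split_script("/conan/web/example.hlb", "/conan*", "/ht_root/script/conan*") gives "/ht_root/script/conan" and "/web/example.hlb".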
    


8.3 - Rule Interpretation

The rules are scanned from first towards last until a pass, redirect, exec, script or fail rule is matched, at which point processing ceases and final substitution occurs. A matching map rule substitutes the result for the template and scanning continues with the next rule.
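
The scanning order can be sketched in Python as follows. This is a simplified model, not the server's algorithm: it handles only map, pass and fail rules, supports only a single trailing "*" wildcard, and the names substitute and apply_rules are hypothetical. The "unmatched" outcome is a placeholder, as what the server does when no rule matches is not described here.

```python
def substitute(path, template, result):
    """Match a template with an optional trailing "*" and build the result."""
    if template.endswith("*"):
        prefix = template[:-1]
        if not path.startswith(prefix):
            return None
        tail = path[len(prefix):]        # the wildcard-matched portion
        return result[:-1] + tail if result.endswith("*") else result
    return result if path == template else None


def apply_rules(rules, path):
    """Scan (kind, template, result) rules from first to last."""
    for kind, template, result in rules:
        mapped = substitute(path, template, result or template)
        if mapped is None:
            continue                     # template did not match, next rule
        if kind == "map":
            path = mapped                # substitute and keep scanning
        elif kind == "pass":
            return "pass", mapped        # processing ceases
        elif kind == "fail":
            return "fail", path          # processing ceases
    return "unmatched", path             # placeholder outcome (assumption)


# Illustrative rule set; the "pass /web/*" entry is invented for the example.
rules = [
    ("map",  "/web/unix/*",    "/web/software/unix/*"),
    ("fail", "/web/private/*", None),
    ("pass", "/web/rts/*",     "/user$rts/web/*"),
    ("pass", "/web/*",         None),
    ("fail", "/*",             None),
]
```

Here "/web/unix/shells/c" is first rewritten by the map rule and then passed by "pass /web/*" as "/web/software/unix/shells/c", showing that a map result is subject to the rules that follow it.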

Use of wildcards in template and result:


8.4 - Mapping Examples

The example mapping rule file for the WASD HTTP server can be viewed.

Example of Map Rule

The result string of these rules may or may not correspond to a VMS physical file system path. Either way the resulting path is further processed before passing or failing.

  1. The following example shows a path "/web/unix/shells/c" being mapped to "/web/software/unix/shells/c", with this being used to process further rules.
      map /web/unix/* /web/software/unix/*
    

Examples of Pass Rule

The result string of these rules should correspond to a VMS physical file path.

  1. This example shows a path "/web/rts/home.html" being mapped to "/user$rts/web/home.html", which is returned as the mapped path.
      pass /web/rts/* /user$rts/web/*
    

  2. This maps a path "/icon/bhts/dir.gif" to "/web/icon/bhts/dir.gif", which is returned as the mapped path.
      pass /icon/bhts/* /web/icon/bhts/*
    

  3. This example illustrates HTTP status code mapping. Each of these does basically the same thing, just using one of the three possible delimiters according to the characters required in the message. The server generates a 403 response which has as its text the message supplied. (Also see the conditional mapping examples.)
      pass /private/* "403 Can't go in there!"
      pass /private/* '403 "/private/" is off-limits!'
      pass /private/* {403 Can't go into "/private/"}
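
How a delimited status-code result breaks down, and the response class each code range leads to, can be sketched in Python. This is an illustration only: the function names parse_status_result and response_kind and the labels "redirect", "error" and "drop" are invented for the example and are not server terminology.

```python
def parse_status_result(result):
    """Split a quoted or braced result into (status code, message text)."""
    closers = {'"': '"', "'": "'", "{": "}"}   # the three delimiter styles
    result = result.strip()
    if len(result) < 2 or closers.get(result[0]) != result[-1]:
        raise ValueError("not a delimited status-code result")
    code_text, _, message = result[1:-1].partition(" ")
    return int(code_text), message


def response_kind(code):
    """Classify a status code as described for the PASS rule."""
    if 300 <= code <= 399:
        return "redirect"    # message text is used as the location
    if 400 <= code <= 599:
        return "error"       # message text is reported as the error
    return "drop"            # connection is dropped without explanation
```

All three example rules above yield the code 403 and so fall into the "error" class; only the delimiter, chosen to suit the characters in the message, differs.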
    

Examples of Fail Rule

  1. If a URL path "/web/private/home.html" is being mapped the path would immediately be failed.
      fail /web/private/*
    

  2. To ensure all access fails, other than that explicitly passed, this entry should be included as the last of the rules.
      fail /*
    

Examples of Exec and Script Rules

  1. If a URL path "/htbin/ismap/web/example.conf" is being mapped, the "/ht_root/script/" must be the URL format equivalent of the physical VMS specification for the directory locating the script DCL procedure. The "/web/example.conf" that followed the "/htbin/ismap" in the original URL becomes the translated path for the script. See 12 - Scripting for other information on scripting.
      exec /htbin/* /ht_root/script/*
    

  2. If a URL path "/conan/web/example.hlb" is being mapped, the "/ht_root/script/conan" must be the URL format equivalent of the physical VMS specification for the DCL procedure. The "/web/example.hlb" that followed the "/conan" in the original URL becomes the translated path for the script. See 12 - Scripting for other information on scripting.
      script /conan* /ht_root/script/conan*
    

Example of Redirect Rule

  1. If a URL path "/AnotherGroup/this/that/other.html" is being mapped the request would be redirected to "http://host/group/this/that/other.html".
      redirect /AnotherGroup/* http://host/group/*
    


8.5 - Conditional Mapping

The purpose of conditional mapping is to apply rules only after certain criteria other than the initial path match are met. These criteria serve to create conditional mapping rules, and were introduced in version 4.4.

THIS OFFERS A POWERFUL TOOL TO THE SERVER ADMINISTRATOR!

Conditional mapping can be applied on the following criteria:

Note that path authorization is always applied to the path supplied with the request, not to the results of any rule mapping that may have occurred. This means authorization paths may be administered without any ambiguity being implied by rule mapping.

Conditionals must follow the rule and are delimited by "[" and "]". Multiple, space-separated conditions may be included within one "[...]". This behaves as a logical OR (i.e. the condition only needs one matched to be true). Multiple "[...]" conditionals may be included against a rule. These act as a logical AND (i.e. all must have at least one condition matched). If a condition begins with a "!" it acts as a negation operator (i.e. matched strings result in a false condition, unmatched strings in a true condition). The result of an entire conditional may also be negated by prefixing the "[" with a "!".

If a conditional, or set of conditionals, is not met the rule is completely ignored.

Matching is done by simple, case-insensitive, string comparison, using the wildcards "*", matching one or more characters, and "%", matching any single character.

White-space (spaces and TABs), wildcards and the delimiting "[" and "]", are forbidden characters and cannot be used within condition matching strings, nor can they be encoded for inclusion in any way (for simplicity and speed of processing). These characters are uncommon in the information being matched against, but if one does occur then "match" it using a single character wildcard ("%").

While conditionals are powerful adjuncts to smart serving they do add significant overhead to rule mapping and should be used with this in mind.
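
The way conditions combine (OR within one "[...]", AND across several, "!" for negation) can be sketched in Python. This is a model for illustration, not the server's code: the function names wild_match and conditionals_met are hypothetical, the request is represented as a plain dictionary keyed by the two-letter keywords seen in the examples below (such as "ho" and "me"), and the conditionals are assumed to be on a single line.

```python
import re


def wild_match(pattern, value):
    """Case-insensitive match; "*" is one or more characters, "%" is one."""
    regex = "".join(
        ".+" if c == "*" else "." if c == "%" else re.escape(c)
        for c in pattern
    )
    return re.fullmatch(regex, value, re.IGNORECASE) is not None


def conditionals_met(text, request):
    """True if every "[...]" group in text is satisfied by the request."""
    for negated, body in re.findall(r"(!?)\[([^\]]*)\]", text):
        group = False
        for cond in body.split():            # space-separated: logical OR
            invert = cond.startswith("!")
            keyword, _, pattern = cond.lstrip("!").partition(":")
            matched = wild_match(pattern, request.get(keyword, ""))
            if matched != invert:            # "!" reverses a single condition
                group = True
                break
        if group == bool(negated):           # "![...]" reverses a whole group
            return False                     # groups combine as logical AND
    return True
```

For example, with the conditionals "[ho:*.fred.com ho:*.george.com] [!ho:you.fred.com]" a request from me.fred.com satisfies both groups, while one from you.fred.com satisfies the first group but not the second, so the rule would be ignored.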

Conditionals

Examples

NOTE: It is possible to spoof (impersonate) internet host addresses. Therefore any controls applied using host name/address information cannot be used for authorization purposes in the strictest sense of the term.

  1. The following example shows a rule being applied only if the client host is within a particular subnet. This is being used to provide a "private" home page to those in the subnet while others get a "public" page by the second rule.
      pass / /web/internal/ [ho:131.185.250.*]
      pass / /web/
    

  2. This is a similar example to the above, but showing multiple host specifications and specifically excluding one particular host using the negation operator "!". This could be read as pass if ((host OR host) AND (not host)).
      pass / /web/internal/ [ho:*.fred.com ho:*.george.com] [!ho:you.fred.com]
      pass / /web/
    

  3. The next example shows how to prevent browsing of a particular tree except from specified host addresses.
      pass /web/internal/* /web/SorryNoAccess.html [!ho:131.185.250.*]
      pass /web/internal/*
    

    This could be used to prevent browsing of the server configuration files (an alternative to this sort of approach is to use the authorization file, see 9 - Authentication and Authorization).

      pass /ht_root/local/* /web/SorryNoAccess.html [!ho:131.185.250.201]
    

  4. This example performs much the same task as the previous one, but uses whole conditional negation to prevent browsing of a particular tree except from specified addresses (as well as using the continuation character to provide a more easily comprehended layout ... note the trailing spaces as required). This could be read as pass if not (host OR host OR host).
      pass /web/internal/* /web/SorryNoAccess.html \
      ![\
      ho:131.185.250.* \
      ho:131.185.251.* \
      ho:131.185.45.1 \
      ho:ws2.wasd.dsto.gov.au\
      ]
      pass /web/internal/*
    

  5. This example demonstrates mapping pages according to geography or language preference (it's a bit contrived, but ...)
      pass /doc/* /web/doc/french/* [ho:*.fr al:fr]
      pass /doc/* /web/doc/swedish/* [ho:*.se al:sv]
      pass /doc/* /web/doc/english/*
    

  6. How to exclude specific browsers from your site (how many times have we seen this!)
      # I had to pick on a well-known acronym, no offence Bill!
      pass /* /web/NoThankYou.html [ua:*MSIE*]
    

  7. This example allows excluding certain requests from specific addresses. This could be read as pass if ((method is POST) AND (not host)).
      pass /* /web/NotAllowed.html [me:POST] [!ho:*.my.net]
    

  8. The following illustrates using the server name and/or server port to conditionally map servers executing on clustered nodes using the same configuration file, or for multi-homed/multi-ported hosts. Distinct home pages are maintained for each system, and on BETA two servers execute, one on port 8000 that may only be used by those within the specified network address range.
      pass / /web/welcome_to_Alpha.html [sn:alpha.*]
      pass / /web/welcome_to_Beta.html [sn:beta.*] [sp:80]
      pass /* /sorry_no_access.html [sn:beta.*] [sp:8000] [!ho:*.my.sub.net]
      pass / /web/welcome_to_Beta_private.html [sn:beta.*] [sp:8000]
    

  9. Each of these three does basically the same thing, just using one of the three possible delimiters according to the characters required in the message. The server generates a 403 response which has as its text the message supplied.
      pass /private/* "403 Can't go in there!" [!ho:my.host.name]
      pass /private/* '403 "/private/" is off-limits!' [!ho:my.host.name]
      pass /private/* {403 Can't go into "/private/"} [!ho:my.host.name]
    

Note that rule processing for any particular path may be checked using the server administration menu.


8.6 - Mapping User Directories (tilde character ("~"))

This server maps user directories using the same mechanisms as for any other path. No reference is made to SYSUAF.DAT; user support is accomplished via a combination of mapping rule and logical name. This approach relies on a correspondence between the username and the home directory name. Hence users are made known to the HTTPd using the name of their top-level directory. As naming home directories using the username is common practice this mechanism should suffice in the majority of cases. Where there is no such correspondence individual rules could be used for each user. User scripts can also be supported using WASD's DECnet scripting environment. See 12.8.4 - User Scripts for further detail.

The "PASS" rule provides a wildcard representation of users' directory paths. As part of this mapping a subdirectory specifically for the hypertext data should always be included. Never map users' top-level directories. For instance if a user's account home directory was located in the area USER$DISK:[DANIEL] the following rule would potentially allow the user DANIEL to provide web documents from the home subdirectory [.WWW] (if the user has created it) using the accompanying URL:

  pass /~*/* /user$disk/*/www/*

  http://host/~daniel/
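
The effect of the two wildcards in this rule can be sketched in Python. This is an illustration only: the function name map_wildcards is hypothetical, and the assumption that each "*" takes the shortest possible match and is copied into the result in order is made for the example, not taken from the server's code.

```python
import re


def map_wildcards(path, template, result):
    """Match template against path and copy each "*" match into result."""
    regex = "".join("(.*?)" if c == "*" else re.escape(c) for c in template)
    match = re.fullmatch(regex, path)
    if match is None:
        return None                      # the rule does not apply
    pieces = iter(match.groups())
    # Replace each "*" in the result with the next matched piece, in order.
    return "".join(next(pieces, "") if c == "*" else c for c in result)
```

With the rule above, "/~daniel/" becomes "/user$disk/daniel/www/" and "/~daniel/docs/a.html" becomes "/user$disk/daniel/www/docs/a.html". The username lands in the first wildcard, and the fixed "www" component keeps the user's top-level directory out of reach.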

It is recommended that a separate logical name be created for locating user directories. This helps hide the internal organisation of the file system. The following logical name definition and mapping rule illustrate this point.

  $ DEFINE /SYSTEM /EXEC /TRANSLATION=CONCEALED WWW_USER device:[USER.]

  pass /~*/* /www_user/*/www/*

Where users are grouped into different areas of the file system a logical search list may be defined.

  $ DEFINE /SYSTEM /EXEC /TRANSLATION=CONCEALED -
           WWW_USER -
           DISK1:[GROUP1.], -
           DISK1:[GROUP2.], -
           DISK2:[GROUP3.], -
           DISK2:[GROUP4.]
    
  pass /~*/* /www_user/*/www/*

As logical search lists have specific uses and some complications (e.g. when creating files) this is the only use for them recommended with this server, although it is specifically coded to allow for search lists in document specifications.

If only a subset of all users are to be provided with WWW publishing access either their account directories can be individually mapped (best used only with a small number) or a separate area of the file system be provided for this purpose and specifically mapped as user space.

Of course, user mapping is amenable to all other rule processing so it is a simple matter to redirect or otherwise process user paths. For instance, the published username does not need to, or need to continue to, correspond to any real user area, or the user's actual name or home area:

  redirect /~doej/* http://a.nother.host/~doej/*
  pass /~doej/* /www/messages/deceased.html
  pass /~danielm/* /special$www$area/danielm/*
  pass /~Mark.Daniel/* /user$disk/danielm/www/*
  pass /~*/* /www_user/*/www/*

A user directory is always presented as a top-level directory (i.e. no parent directory is shown), although any subdirectory tree is accessible by default.

