===========================================================================
 IMMUNIX CANARY PATCH TO GCC 2.7.2.2 -- A buffer overflow exploit detector
===========================================================================

This document finalized Monday December 15, 1997.

See "http://www.cse.ogi.edu/DISC/projects/immunix/StackGuard/" for details.

Send technical comments about this distribution to
immunix-request@cse.ogi.edu.

---------------
 PREREQUISITES
---------------

You also need the canary library package from the same place you got this
package (the compiler mods).  The compiler mods should be v1.1, so you
would be looking for canarylib version 1.1.  It's all one superpackage.
(Or the pair is the package, and this is a subpackage.)

I only tested on Redhat Linux 4.2 (Linux kernel 2.0.30) on Intel x86.  I
don't see much reason for it to care which Unix OS you have (it must have
signals, printf, syslog, and ... ?), but it definitely won't do anything
on non-x86 machines, as only x86 instructions are emitted for canaries.
The canary library is architecture independent, but might be more
sensitive to the OS.

----------
 OVERVIEW
----------

This is version 1.1 of the IMMUNIX CANARY patch to GCC version 2.7.2.2.
Only Intel x86 is supported at this time.

The idea is to detect typical buffer overflow exploits -- after the fact --
by placing a "canary" next to the return address on the stack.  A canary is
a "hard to predict" special value that is required to be the same upon
exiting a procedure as it was when it was laid down upon entering that
procedure.  This is an i386 instantiation, so stacks grow from high
addresses toward low addresses.  Presuming that a buffer overflow exploit
can only write contiguously into memory in an upward direction from its
"hole", and that it is attempting to overwrite the return address, it
cannot do so without also overwriting the canary word, "killing it".
(Welsh miners used to carry canaries in cages with them, which would
succumb to noxious fumes before humans would, giving a warning of the
presence of such fumes.)

A detection of a "dead canary" results in error messages being sent to both
stderr and syslog, and an illegal instruction signal being raised.

--------------
 INSTALLATION
--------------

On my Pentium Redhat Linux 4.2 system, the build area uses just under
86 meg, and the install area uses just under 13 meg.

(1) Unpack this tarball:

        tar xvvzf stackguard-gcc.tar.gz

    This creates a directory named "stackguard-gcc".

(2) Go to that directory:

        cd stackguard-gcc

(3) Configure it to your installation.  I use:

        mkdir ~/canary
        ./configure --prefix=~/canary

    which puts everything in a directory named "canary" in my home
    directory.  The whole installed compilation system ends up there.

(4) Build the first compiler:

        make LANGUAGES=c

(5) Grab bootstraps and build stage 1 compiler:

        make stage1
        make CC="stage1/xgcc -Bstage1/" CFLAGS="-g -O2"

    (The "make stage1" emits a lot of ugly looking error messages, but
    that's okay.)

(6) Pull up on bootstraps and build stage 2 compiler:

        make stage2
        make CC="stage2/xgcc -Bstage2/" CFLAGS="-g -O2"

    (The "make stage2" emits a lot of ugly looking error messages, but
    that's okay.)

(7) Check to see if levitating (i.e., stages 1 and 2 produced the same
    thing):

        make compare

(8) If this all worked (it worked for me), install the compiler:

        make install

Now you should be ready to go.

-------------------------------------
 COMPILING AND LINKING WITH CANARIES
-------------------------------------

First, build and install the stackguard compiler as per the above
instructions.

Next, retrieve the stackguard canary library -- see the web page URL at the
top of this document -- and follow its installation instructions.

Find or generate a piece of code that overflows a buffer far enough to
overwrite a return address.  Call the source file "foo.c".
Compile it:

        ~/canary/bin/gcc -c -fcanary-all-functions foo.c

Notice the new option "-fcanary-all-functions".  That causes *all*
functions in that particular compile to get canaries.

Now link it:

        ~/canary/bin/gcc -o foo foo.o -lcanary

You could use the stock gcc here, but it would not know to look in the
"~/canary/lib" directory for the "libcanary.a" file, so you would have to
specify a "-L" flag yourself.

The canary lib uses a gcc hook to specify that its initializer function
gets called before main().  This hook is the "__constructor__" attribute
for function declarations, which maybe only the gcc linker understands, or
maybe only ELF loaders, or I dunno.  It works on my Redhat 4.2 system, and
is a documented GCC feature (though it's not ANSI C, I'm sure).  Other
languages definitely need and use this hook (Modula-2 comes to mind).

---------------------------
 QUICK DESCRIPTION OF MODS
---------------------------

This compiler is GCC 2.7.2.2, straight from prep.ai.mit.edu (i.e., with
none of the Redhat 4.2 patches applied), with the following mods:

[All mods, except a few blank lines, have the string "IMMUNIX" embedded in
a comment on the same line.]

The file "./toplev.c" is modified to include a new option
"-fcanary-all-functions", which turns on the normally off option to
generate canaries.

The file "./function.c" is modified to include a declaration of the global
variable "canary_all_functions" as external so the new definition of the
FIRST_PARM_OFFSET macro can use it.  The macro should declare the extern
itself, but I gotta stop tweaking and release this version.
(Recompile/test cycles take a while.)

The file "./config/i386.h" is modified to have the FIRST_PARM_OFFSET macro
emit the normal int value "0" when canaries are turned off, and "4" when
they are turned on.

The file "./config/i386.c" has the bulk of the modifications.
Two globals are declared, one for counting how many procedures have been
canaried, and one for holding the index into the dynamically generated
table of canary values (see the canary library in a separate package).
Then, code is inserted in the function prologue code generator to insert a
canary onto the stack upon entering a function (that is, if canaries are
turned on).  Finally, code is inserted in the function epilogue code
generator to lay down a check for corrupted canaries immediately prior to
the return-from-procedure instructions.  If the canary is dead, the check
jumps around the return instruction to code that reports the problem.
Some of the rtx arrays are lengthened in the existing code; everything
else is insertion of code.

---------------------------
 KNOWN QUIRKS AND PROBLEMS
---------------------------

(1) The victim application sends itself a SIGILL (not SIGKILL) upon
    detecting a dead canary.  If the application traps this signal, then
    the process might not die (to be restarted in variant form by a daemon
    monitor).  What should happen here depends on the situation, and we
    would like feedback from users of this tool as to what you decided to
    do in your particular situations.

(2) Allegedly, GCC supports tail recursion.  I have not located this
    support, but the canary death detection code resides in the code
    generator for returning normally from a procedure.  That code cannot
    do tail recursion; its concern is laying down code to return from
    calls.  If tail calls get detected, then the matching of canary
    generation and checking may get screwed up.

(3) It may be the case that canaries only need to be laid down for
    procedures that have automatic variables that can be overflowed.  This
    feature is a main item on the agenda for the next version.

(4) No effort is made to reduce the number of error handler routines.
    Every return gets its very own.

(5) The canary death alarm includes the procedure name of the owner of the
    canary, but not other useful information like line number, source file
    name, etc.  What's appropriate here?

(6) It was more straightforward to emit assembly language to do canaries.
    As the analysis for what and when to lay down canaries gets more
    sophisticated, the representation of canary generation and checking
    may move to earlier places in the compilation pipeline, with the nice
    side effect of automatically achieving language and architecture
    independence.