Relay-Version: version B 2.10.3 alpha 4/15/85; site seismo.UUCP Posting-Version: version B 2.10.2 9/3/84; site genrad.UUCP Path: seismo!harvard!talcott!panda!genrad!sources-request From: sources-request@genrad.UUCP Newsgroups: mod.sources Subject: vstr - a dynamic string package Message-ID: <874@genrad.UUCP> Date: 4 Jun 85 13:52:24 GMT Sender: john@genrad.UUCP Lines: 1779 Approved: john@genrad.UUCP From: jordan@greipa (Jordan K. Hubbard) -------- cut here --------- # This is a shell archive. Remove anything before this line, # then unpack it by saving it in a file and typing "sh file". # # Wrapped by greipa!jordan on Fri May 31 02:15:19 PDT 1985 # Contents: Makefile OPERATION README examp.c v_alloc.c v_append.c v_clear.c # v_cursor.c v_debug.c v_delete.c v_find.c v_free.c v_get.c v_move.c # v_put.c v_rewind.c v_string.c v_utils.c vstr.h vstr.man echo x - Makefile sed 's/^@//' > "Makefile" <<'@//E*O*F Makefile//' # Makefile for vstr package, 04/16/85 Jordan K. Hubbard INC=/usr/include/local LIB=/usr/lib OBJS= v_put.o v_get.o v_alloc.o v_debug.o v_rewind.o v_append.o v_utils.o\ v_string.o v_delete.o v_move.o v_cursor.o v_free.o v_clear.o v_find.o INCS=vstr.h @.c.o: cc -O -c $< libvstr.a: $(OBJS) ar r libvstr.a $(OBJS) ranlib libvstr.a $(OBJS): $(INCS) install: libvstr.a cp libvstr.a $(LIB) cp $(INCS) $(INC) cp vstr.man /usr/man/manl/vstr.l ranlib $(LIB)/libvstr.a @//E*O*F Makefile// chmod u=rw,g=r,o=r Makefile echo x - OPERATION sed 's/^@//' > "OPERATION" <<'@//E*O*F OPERATION//' The operation of this package is fairly straightforward. A "virtual string" (actually, I should have called it a "dynamic" string , but it was too late as I didn't want to have to change all those 'v's to 'd's..) Is nothing more than a linked list of 'units', each containing a previous and next (unit) pointer, a data area and a length word. The header block allocated by new_vstr contains a head and tail pointer that points to the first and last units in the 'chain'. This makes it easy to to 'rewind' and 'append' the string. Also in the header block is the 'unit' size for that particular vstr. A variable unit size is necessary so that v-strings can be 'tuned' for various applications. A larger unit size can save an appreciable amount of space on machines where ints and pointers are 32 bits (here I have a 10 byte overhead for each unit). However, if massive editing is being done, or the expected data is small, smaller units pay off. Memory conscious folks should see the footnote at the end of this file. Finally, and probably most important, is a 'cursor'. The cursor is nothing more that a structure containing a unit pointer and an offset into it. The cursor 'points' nowhere when the string is first created, then moves as characters are added to the string. The cursor will always point to one of three locations when the string contains text: 1) In the 'rewound' state, the cursor points to the first unit in the chain with an offset of -1 (not on any character). 2) In the 'appended' state, the cursor will point at the last unit with an offset equal to the number of characters in the last unit. Since arrays start at zero, this puts it 'off the end' of the data area for that unit, not on any character. 3) After a 'get', 'put', 'find' etc. operation, the cursor will point to the character affected by the last operation. If one had just done a 'get' for instance, the cursor would point to the character just gotten. There are also other operations can make use of 'cursors' that the user allocates himself. They are described in the manual page. One might ask why I put a length word in each unit instead of just null terminating it. Well, in a word, it's a lot faster when calculating the length of the string (which I seem to do a lot) and memory is cheap on this machine. This is another area in which I welcome feedback. One last small thing, I got kind of carried away with the 'debugging' features. I found them rather useful for tracing the code as I debugged it. You might wish to look at some of the 'wrapper' defines in vstr.h and the code in v_debug.c (if the 'id' stack confuses you, just think about nested procedure calls). It's all kind of crude, but effective. It's also quite usable elsewhere. Please don't flame me for my handling of 'variable' arguments. A lot of people don't have varargs and I'm certainly not going to vax-kludge it (doesn't work on pyramids anyway). Footnote: * For people that DON'T have a lot of memory, you might want to consider the following (people wishing to create very large vstr's might also look into it). Look at the define called LENTYPE in vstr.h. As shipped, it is defined as a short (16 bits on my machine). This allows me to create and index fairly large units (useful when I slurp in something like the termcap file). Some people may find a limit of 127 adequate and can change it to a signed char (must be signed). Inversely, some may wish to make this an int. @//E*O*F OPERATION// chmod u=rw,g=r,o=r OPERATION echo x - README sed 's/^@//' > "README" <<'@//E*O*F README//' You might want to edit the makefile to put the library and include file somewhere else. Currently it's set for /usr/lib and /usr/include/local, respectivly. The manual page automatically tries to go in /usr/man/manl, you may want to change this as well. This program has not really been used at all, I'm releasing it mainly to get some feedback on it (this is not to say that it doesn't work, there are no KNOWN bugs..). There are two inconsistancies in the find and delete commands that might raise some ire, see the documentation. You will probably want to edit vstr.h to suit your tastes, read OPERATION for some more vaguely useful information. There's also a very crude test driver called examp.c.. It wasn't written, it just grew.. I include it only so that you may do some rudimentary testing if you wish to take the easy way out. No documentation, no support, if it was a child I'd disown it. You figure it out. Please mail any bugs/flames to me so that I may improve the code (the package, not examp.c). I'm also interested in any uses made of it, I haven't figured out any myself [ :-) ]... This package can be redistributed and used freely with appropriate credit. I am quite curious about the ways you may find to use it, any mail/interesting code describing this would be appreciated. Future plans include block operations, compaction on the fly, regular expression searches (currently not included due to the differences in SysV and 4.2 regex(p) code) and any speedups I can think of. Suggestions gratefully accepted. Jordan Hubbard May 30th, 1985. {decwrl, dual, pesnta}!greipa!jordan @//E*O*F README// chmod u=rw,g=r,o=r README echo x - examp.c sed 's/^@//' > "examp.c" <<'@//E*O*F examp.c//' #include "vstr.h" #include #include int i; eatnl() { register char ch; register int foundnum; while ((ch = getchar()) != '\n') if (isdigit(ch) || ch == '-') { ungetc(ch, stdin); scanf("%d", &i); foundnum = 1; } if (!foundnum) i = 0; } main() { register p_vstr testv = 0; register char *st, fname[40]; char ch; struct cursor save; int going = 1, mode = FWD, cnt, chr; register FILE *fp; while (going) { printf("Command> "); ch = getchar(); if (ch == '\n') { eatnl(); continue; } eatnl(); switch (ch) { case 'p': while ((ch = getchar()) != '~') { v_put(testv, ch, mode); if (mode == HERE) break; } eatnl(); break; case 'g': if (mode == INS) { printf("Now in FWD mode\n"); mode = FWD; } while ((ch = v_get(testv, mode)) != (mode == FWD ? ATEND : ATBEG)) { putchar(ch); if (mode == HERE) break; } printf("*END*\n"); break; case 's': v_savecursor(testv, &save); printf("saved\n"); break; case 'r': v_setcursor(testv, &save); printf("restored\n"); break; case 'S': v_show(testv); break; case 'm': if (i < 0) i = v_move(testv, BACK, -i); else i = v_move(testv, FWD, i); printf("Moved %d positions\n", i); break; case 'd': printf("Deleted %d chars\n", v_delete(testv, i)); break; case 'q': going = 0; break; case 'f': v_free(testv); testv = 0; break; case 'n': testv = new_vstr(i); printf("Address is %x\n", testv); break; case 'c': v_clear(testv, END); break; case 'R': v_rewind(testv); printf("Rewound\n"); break; case 'A': v_append(testv); printf("Appended\n"); break; case 'M': switch(i) { case 0: mode = FWD; printf("Forward\n"); break; case 1: mode = BACK; printf("Back\n"); break; case 2: mode = HERE; printf("Here\n"); break; case 3: mode = INS; printf("Insert\n"); break; default: printf("Unknown mode. Set to forward\n"); mode = FWD; break; } break; case '>': cnt = 0; printf("To file name: "); scanf("%s", fname); eatnl(); fp = fopen(fname, "w"); if (!fp) { printf("Can't open %s for writing\n", fname); continue; } while ((ch = v_get(testv, mode)) != (mode == FWD ? ATEND : ATBEG)) { fputc(ch, fp); cnt++; if (mode == HERE) break; } printf("%d chars written\n", cnt); fclose(fp); break; case '<': cnt = 0; printf("From file name: "); scanf("%s", fname); eatnl(); fp = fopen(fname, "r"); if (!fp) { printf("Can't open %s for writing\n", fname); continue; } while((chr = fgetc(fp)) != EOF) { cnt++; v_put(testv, chr, FWD); } fclose(fp); printf("%d chars read, length of vstr is now %d\n", cnt, v_length(testv, BEG)); break; case '/': /* look ma, no statement */ { register p_cursor cp; printf("Search for char: "); scanf("%c", &ch); if (cp = v_find(testv, ch)) { printf("Found.\n"); v_setcursor(testv, cp); } } /* yuk. So I'm too lazy to declare my locals earlier. this is just a lousy test program. What do you want from me? Style? */ break; default: printf("Unknown command: %c.\n", ch); } } } @//E*O*F examp.c// chmod u=rw,g=r,o=r examp.c echo x - v_alloc.c sed 's/^@//' > "v_alloc.c" <<'@//E*O*F v_alloc.c//' /* * V_ALLOC Version 1.0, 4/16/85 * Jordan K. Hubbard */ #include "vstr.h" IMPORT char *malloc(); ROUTINE p_unit new_unit(v) register p_vstr v; { register p_unit tmp; entry(new_unit) tmp = (p_unit)malloc(sizeof *tmp); if (!tmp) freak_out(M_NOSPACE); tmp->next = tmp->prev = NIL; tmp->len = 0; tmp->data = malloc(v->unit_size); if (!tmp->data) freak_out(M_NOSPACE); ret(tmp) } ROUTINE p_vstr new_vstr(i) register int i; { register p_vstr tmp; entry(new_vstr) tmp = (p_vstr)malloc(sizeof *tmp); if (!tmp) freak_out(M_NOSPACE); tmp->head = tmp->tail = tmp->cur.here = NIL; tmp->cur.u_pos = -1; tmp->unit_size = i; ret(tmp) } @//E*O*F v_alloc.c// chmod u=rw,g=r,o=r v_alloc.c echo x - v_append.c sed 's/^@//' > "v_append.c" <<'@//E*O*F v_append.c//' #include "vstr.h" ROUTINE void v_append(v) register p_vstr v; { entry(v_append) v->cur.here = v->tail; v->cur.u_pos = v->cur.here->len; end } @//E*O*F v_append.c// chmod u=rw,g=r,o=r v_append.c echo x - v_clear.c sed 's/^@//' > "v_clear.c" <<'@//E*O*F v_clear.c//' #include "vstr.h" ROUTINE void v_clear(v, how) register p_vstr v; register int how; { register p_unit temp; entry(v_clear) if (!v && !v->head) end if (how == END) { temp = v->cur.here; if (v->cur.u_pos) { temp->len = v->cur.u_pos; temp = temp->next; } } else temp = v->head; while (temp) { free(temp); temp = temp->next; } v->head = v->tail = v->cur.here = 0; v->cur.u_pos = -1; end } @//E*O*F v_clear.c// chmod u=rw,g=r,o=r v_clear.c echo x - v_cursor.c sed 's/^@//' > "v_cursor.c" <<'@//E*O*F v_cursor.c//' #include "vstr.h" ROUTINE p_cursor v_savecursor(v, c) register p_vstr v; register p_cursor c; { entry(v_savecursor) c->here = v->cur.here; c->u_pos = v->cur.u_pos; #ifdef DEBUG yelp("saving cursor at unit %x, pos %d\n", c->here, c->u_pos); #endif ret(c) } ROUTINE void v_setcursor(v, c) register p_vstr v; register p_cursor c; { entry(v_setcursor) v->cur.here = c->here; v->cur.u_pos = c->u_pos; #ifdef DEBUG yelp("setting cursor to unit %u, pos %d\n", c->here, c->u_pos); #endif end } ROUTINE void v_copycursor(c1, c2) register p_cursor c1, c2; { entry(v_copycursor) c1->here = c2->here; c1->u_pos = c2->u_pos; end } ROUTINE int v_span(c1, c2) register p_cursor c1, c2; { register int cnt = 0; register p_unit tmp = c1->here; entry(v_span) while (tmp && (tmp != c2->here)) { cnt += tmp->len; tmp = tmp->next; } if (tmp) cnt += c2->u_pos; ret(cnt) } @//E*O*F v_cursor.c// chmod u=rw,g=r,o=r v_cursor.c echo x - v_debug.c sed 's/^@//' > "v_debug.c" <<'@//E*O*F v_debug.c//' /* * V_DEBUG Version 1.0 4/16/85 * Jordan K. Hubbard */ #include "vstr.h" #include #define PUSH 1 #define POP 2 #define LOOK 3 #ifdef DEBUG char *id(); ROUTINE void yelp(fmt, p1, p2, p3, p4, p5, p6) register char *fmt; register unsigned char *p1, *p2, *p3, *p4, *p5, *p6; { fprintf(stderr, "DEBUG: [%s] ", id(0, LOOK)); fprintf(stderr, fmt, p1, p2, p3, p4, p5, p6); return; } ROUTINE char *id(s, mode) register char *s; register int mode; { /* if we get more than 50 subroutine levels deep, something's wrong */ /* with our program anyway (or we've taken modular programming too far!) */ static char *stack[50]; static int pointer = 0; switch (mode) { case PUSH: stack[pointer++] = s; if (pointer == 50) { yelp("stack overflow in id!\n"); exit(1); } break; case POP: if (pointer - 1 < 0) { yelp("stack underflow in id!\n"); exit(1); } return(stack[--pointer]); break; case LOOK: if (pointer - 1 < 0) { yelp("empty stack for 'look' in id!\n"); exit(1); } return(stack[pointer - 1]); break; } } ROUTINE void _end() { yelp("Exiting with no returned value\n"); id(0, POP); return; } ROUTINE void _entry(s) register char *s; { id(s, PUSH); yelp("Entering\n"); return; } ROUTINE void _ret(l) unsigned char *l; { yelp("Exiting with %u (hex: %x)\n", l, l); id(0, POP); return; } #else EXPORT char *_Curr_rtn; #endif /* Makes character 'visible' by turning it into a printable control */ /* character if it is one or its octal \nnn form if it is greater than */ /* the upper limit for printable ascii */ ROUTINE char *visible(ch) register char ch; { static char buf[5]; entry(visible) buf[0] = '\0'; if (ch < 32) { buf[0] = '^'; ch += 64; } if (ch > 31 && ch < 127) { if (buf[0]) { buf[1] = ch; buf[2] = '\0'; } else { buf[0] = ch; buf[1] = '\0'; } } else if (ch > 126) { sprintf(buf, "\\%03o", ch); buf[4] = '\0'; } ret(buf) } ROUTINE void v_show(v) register p_vstr v; { register p_unit temp; register int n_units = 0, n_chars = 0, l; register int n_funits = 0; entry(v_show) if (!v) { fprintf(stderr, "v_show: vstr is NIL\n"); end } fprintf(stderr, "Debugging information for vstr at %X\n", v); fprintf(stderr, "Head is at %X, Tail at %X\n", v->head, v->tail); fprintf(stderr, "Unit size is %d\n", v->unit_size); fprintf(stderr, "Cursor points to unit %X, position %d\n\n", v->cur.here, v->cur.u_pos); temp = v->head; while (temp) { fprintf(stderr, "prev unit next\n"); fprintf(stderr, "(%X) <= [%X] => (%X) ", temp->prev, temp, temp->next); if (temp == v->head) fprintf(stderr, "HEAD "); if (temp == v->tail) fprintf(stderr, "TAIL "); fprintf(stderr, "#%d\n", n_units); fprintf(stderr, "\tLength = %d", temp->len); if (temp->len) { n_chars += temp->len; fprintf(stderr, " Data = '"); for (l = 0; l < temp->len; l++) fprintf(stderr, "%s", visible(temp->data[l])); fprintf(stderr, "'\n\n"); if (temp->len == v->unit_size) n_funits++; } else fprintf(stderr, "\n\n"); temp = temp->next; n_units++; } fprintf(stderr, "\n\nTotal: %d characters in %d units. %d full, %d not.\n", n_chars, n_units, n_funits, n_units - n_funits); end } ROUTINE void freak_out(fmt, p1, p2, p3, p4, p5, p6) register char *fmt; register int p1, p2, p3, p4, p5, p6; { #ifdef DEBUG fprintf(stderr, "%s: ", id(0, LOOK)); #else fprintf(stderr, "%s: ", _Curr_rtn); #endif fprintf(stderr, fmt, p1, p2, p3, p4, p5, p6); exit(1); } @//E*O*F v_debug.c// chmod u=rw,g=r,o=r v_debug.c echo x - v_delete.c sed 's/^@//' > "v_delete.c" <<'@//E*O*F v_delete.c//' #include "vstr.h" ROUTINE int v_delete(v, cnt) register p_vstr v; register int cnt; { register p_unit temp; register int actual = 0, n; entry(v_delete) if (!cnt) end while (cnt) { temp = v->cur.here; n = temp->len - v->cur.u_pos; if (!n) { if (!temp->next) ret(actual) else { v->cur.here = temp->next; v->cur.u_pos = 0; } continue; } if (n <= cnt) { if (!v->cur.u_pos) { cnt -= temp->len; actual += temp->len; linkout(v, temp); if (v->cur.here == temp->prev) cnt = 0; } else { temp->len -= n; cnt -= n; actual += n; if (temp->next) { v->cur.here = temp->next; v->cur.u_pos = 0; } else cnt = 0; } } else { strncpy(temp->data + v->cur.u_pos, temp->data + v->cur.u_pos + cnt, n); temp->len -= cnt; actual += cnt; cnt = 0; } } ret(actual) } @//E*O*F v_delete.c// chmod u=rw,g=r,o=r v_delete.c echo x - v_find.c sed 's/^@//' > "v_find.c" <<'@//E*O*F v_find.c//' #include "vstr.h" ROUTINE p_cursor v_find(v, ch, dir) register p_vstr v; register char ch; register int dir; { static struct cursor found; struct cursor save; register int i; entry(v_find) if (dir != FWD && dir != BACK) freak_out("called with bad mode (not BACK or FWD)\n"); v_savecursor(v, &save); while ((i = v_get(v, dir)) != (dir == FWD ? ATEND : ATBEG)) if (i == ch) { v_savecursor(v, &found); v_setcursor(v, &save); ret(&found) } v_setcursor(v, &save); ret(0) } @//E*O*F v_find.c// chmod u=rw,g=r,o=r v_find.c echo x - v_free.c sed 's/^@//' > "v_free.c" <<'@//E*O*F v_free.c//' #include "vstr.h" ROUTINE void v_free(v) register p_vstr v; { register p_unit temp; entry(v_free) if (!v) freak_out(M_NILVSTR); temp = v->head; while (temp) { free(temp); temp = temp->next; } free(v); end } @//E*O*F v_free.c// chmod u=rw,g=r,o=r v_free.c echo x - v_get.c sed 's/^@//' > "v_get.c" <<'@//E*O*F v_get.c//' /* * V_GET Version 1.0, 4/16/85 * Jordan K. Hubbard */ #include "vstr.h" ROUTINE int v_get(v, mode) register p_vstr v; register int mode; { register int i; entry(v_get) #ifdef DEBUG yelp("called with vstr at %x, mode = %d\n", v, mode); #endif if (!v) freak_out(M_NILVSTR); if (!v->head) ret((mode == BACK) ? ATBEG : ATEND) switch(mode) { case FWD: if (v_move(v, FWD, 1) == ATEND) ret(ATEND) i = (int)v->cur.here->data[v->cur.u_pos]; break; case BACK: if (v_move(v, BACK, 1) == ATBEG) ret(ATBEG) i = (int)v->cur.here->data[v->cur.u_pos]; break; case HERE: if (IS_BACK(v)) ret(ATEND) i = (int)v->cur.here->data[v->cur.u_pos]; break; } ret(i) } @//E*O*F v_get.c// chmod u=rw,g=r,o=r v_get.c echo x - v_move.c sed 's/^@//' > "v_move.c" <<'@//E*O*F v_move.c//' #include "vstr.h" ROUTINE int v_move(v, dir, cnt) register p_vstr v; register int dir, cnt; { register int actual = 0; entry(v_move) #ifdef DEBUG yelp("called with vstr at %x, count of %d, direction = %d\n", v, cnt, dir); #endif if (!cnt) ret(0) switch (dir) { case FWD: while (cnt && !IS_BACK(v)) { if (++v->cur.u_pos >= v->cur.here->len) if (!(v->cur.here = v->cur.here->next)) { v->cur.here = v->tail; v->cur.u_pos = v->tail->len; break; } else v->cur.u_pos = 0; cnt--; actual++; } break; case BACK: while (cnt && !IS_FRONT(v)) { if (--v->cur.u_pos < 0) if (!(v->cur.here = v->cur.here->prev)) { v->cur.here = v->head; v->cur.u_pos = -1; break; } else v->cur.u_pos = v->cur.here->len - 1; actual++; cnt--; } break; } if (!actual) ret((dir == FWD) ? ATEND : ATBEG) else ret(actual) } @//E*O*F v_move.c// chmod u=rw,g=r,o=r v_move.c echo x - v_put.c sed 's/^@//' > "v_put.c" <<'@//E*O*F v_put.c//' /* * V_PUT Version 1.0, 4/16/85 * Jordan K. Hubbard */ #include "vstr.h" IMPORT p_unit linkin(); ROUTINE void v_put(v, c, mode) register p_vstr v; register char c; register int mode; { register int n; entry(v_put) #ifdef DEBUG yelp("called with vstr at %x, char = %c, mode = %d\n", v, c, mode); if (!v->head) yelp("vstr is cleared\n"); #endif if (!v) freak_out(M_NILVSTR); if (!v->head) { v->head = v->tail = v->cur.here = new_unit(v); v->cur.u_pos = -1; } /* * The awful nasty evil label here is so that some cases can mung * around with the text and then change the mode and goto. It's * crude, but it seems better that a recursive function call.. Look * at the 'END' or 'BEG' case and you'll see what I mean.. */ sw: switch(mode) { case FWD: if (v_move(v, FWD, 1) == ATEND) { if (v->cur.here->len < v->unit_size) { v->cur.here->data[v->cur.here->len] = c; v->cur.u_pos++; } else { v->cur.here = v->tail = linkin(v, FWD); *(v->cur.here->data) = c; v->cur.u_pos = 0; } } else v->cur.here->data[v->cur.u_pos] = c; if (v->cur.here->len <= v->cur.u_pos) v->cur.here->len++; break; case BACK: if (v_move(v, BACK, 1) == ATBEG) { if (v->cur.here->len < v->unit_size) { shift(v->cur.here, 0); *(v->cur.here->data) = c; v->cur.u_pos = 0; } else { v->cur.here = v->head = linkin(v, BACK); v->cur.here->len = 1; v->cur.u_pos = 0; *(v->cur.here->data) = c; } } else v->cur.here->data[v->cur.u_pos] = c; break; case HERE: if (IS_BACK(v)) end; v->cur.here->data[v->cur.u_pos] = c; break; case BEG: v_rewind(v); mode = BACK; goto sw; break; case END: v_append(v); mode = FWD; goto sw; break; case INS: /* * If we're at the end, then this isn't really an insert. * I know it's kludge, but I don't wanna rewrite the append * code! */ if (v_move(v, FWD, 1) == ATEND) { mode = FWD; goto sw; } if (v->cur.here->len < v->unit_size) { if (v->cur.u_pos < v->cur.here->len) shift(v->cur.here, v->cur.u_pos); else v->cur.here->len++; } else if (!ripple(v)) { linkin(v, FWD); ripple(v); } v->cur.here->data[v->cur.u_pos] = c; break; default: freak_out("Bad mode passed, mode = %d, char = %d\n", mode, c); break; } end } @//E*O*F v_put.c// chmod u=rw,g=r,o=r v_put.c echo x - v_rewind.c sed 's/^@//' > "v_rewind.c" <<'@//E*O*F v_rewind.c//' #include "vstr.h" ROUTINE void v_rewind(v) register p_vstr v; { entry(v_rewind) v->cur.here = v->head; v->cur.u_pos = -1; end } @//E*O*F v_rewind.c// chmod u=rw,g=r,o=r v_rewind.c echo x - v_string.c sed 's/^@//' > "v_string.c" <<'@//E*O*F v_string.c//' #include "vstr.h" IMPORT char *malloc(); ROUTINE void v_stov(s, v) register char *s; register p_vstr v; { register int i, l; entry(v_stov) l = strlen(s); #ifdef DEBUG yelp("Putting string at %x, length of %d into vstr at %x\n", s, l, v); #endif for (i = 0; i < l; i++) v_put(v, s[i], FWD); end } ROUTINE char *v_vtos(v) register p_vstr v; { register char *s; register i = 0; register char ch; entry(v_vtos) s = malloc(v_length(v, HERE) + 1); #ifdef DEBUG yelp("making string at %x, %d chars long for vstr at %x", s, v_length(v, HERE) + 1, v); #endif if (!s) freak_out(M_NOSPACE); while ((ch = v_get(v, FWD)) != ATEND) s[i++] = ch; s[i] = '\0'; ret(s) } ROUTINE char *v_nvtos(v, n) register p_vstr v; register int n; { register char *s; register i = 0; register char ch; entry(v_nvtos) s = malloc(n + 1); #ifdef DEBUG yelp("making string at %x, %d chars long for vstr at %x", s, n + 1, v); #endif if (!s) freak_out(M_NOSPACE); while ((ch = v_get(v, FWD)) != ATEND && n) { s[i++] = ch; n--; } s[i] = '\0'; ret(s) } @//E*O*F v_string.c// chmod u=rw,g=r,o=r v_string.c echo x - v_utils.c sed 's/^@//' > "v_utils.c" <<'@//E*O*F v_utils.c//' #include "vstr.h" #include ROUTINE p_unit linkin(v, mode) register p_vstr v; register int mode; { register p_unit temp, p1, u = v->cur.here; entry(linkin) #ifdef DEBUG yelp("called with vstr at %x, mode = %d\n", v, mode); #endif switch(mode) { case FWD: temp = new_unit(v); p1 = u->next; temp->prev = u; u->next = temp; temp->next = p1; if (p1) p1->prev = temp; break; case BACK: temp = new_unit(v); p1 = u->prev; temp->next = u; u->prev = temp; temp->prev = p1; if (p1) p1->next = temp; break; default: freak_out("passed weird mode %d\n", mode); end break; } ret(temp) } ROUTINE void linkout(v, u) register p_vstr v; register p_unit u; { register p_unit p1, p2; entry(linkout) #ifdef DEBUG yelp("linking out unit at %x from vstr at %x\n", u, v); #endif p1 = u->prev; p2 = u->next; if (p1) p1->next = p2; else v->head = p2; if (p2) p2->prev = p1; else v->tail = p1; if (p2) { v->cur.here = p2; v->cur.u_pos = 0; } else { v->cur.here = p1; v->cur.u_pos = (p1 ? p1->len : 0); } free(u); end } ROUTINE void shift(u, pos) register p_unit u; register int pos; { register int i = u->len; entry(shift) #ifdef DEBUG yelp("shifting from %d to %d in unit at %x\n", i, pos, u); #endif if (!i) freak_out("passed an empty unit at %u\n", u); while(i != pos) { u->data[i] = u->data[i - 1]; i--; } u->len++; end } ROUTINE boolean ripple(v) register p_vstr v; { struct cursor pos; register int i = COST, p; register p_unit tmp = v->cur.here; register p_unit dest_u = tmp; register int dest_p = v->cur.u_pos; entry(ripple) /* Scan for a hole */ pos.here = 0; while (i--) { if (!tmp->next) { v->tail = tmp->next = new_unit(v); tmp->next->prev = tmp; pos.here = tmp->next; pos.u_pos = 0; #ifdef DEBUG yelp("found end while scanning, linked in unit at %x\n", pos.here); #endif break; } else if (tmp->len < v->unit_size) { pos.u_pos = tmp->len; pos.here = tmp; #ifdef DEBUG yelp("Found hole at %x(%d)\n", pos.here, pos.u_pos); #endif break; } else tmp = tmp->next; } if (!pos.here) ret(0) pos.here->len++; #ifdef DEBUG yelp("rippling from %x(%d) to %x(%d)\n", pos.here, pos.u_pos, dest_u, dest_p); #endif while (!(pos.here == dest_u && pos.u_pos == dest_p)) { if (!pos.u_pos) { tmp = pos.here->prev; pos.u_pos = tmp->len - 1; *(pos.here->data) = tmp->data[pos.u_pos]; pos.here = tmp; } else { pos.here->data[pos.u_pos] = pos.here->data[pos.u_pos - 1]; pos.u_pos--; } } ret(1) } ROUTINE int v_length(v, how) register p_vstr v; register int how; { register int i = 0; register p_unit temp; entry(v_length) #ifdef DEBUG yelp("Getting length for vstr at %x, how = %d\n", v, how); #endif if (!v || !v->head) ret(0) switch(how) { case HERE: temp = v->head; while (temp) { i += temp->len; temp = temp->next; } break; case BEG: if (IS_FRONT(v)) break; temp = v->cur.here; if (v->cur.u_pos < temp->len) { i = v->cur.u_pos + 1; temp = temp->prev; } while(temp) { i += temp->len; temp = temp->prev; } break; case END: if (IS_BACK(v)) break; temp = v->cur.here; i = temp->len - v->cur.u_pos; while (temp) { i += temp->len; temp = temp->next; } break; } ret(i) } @//E*O*F v_utils.c// chmod u=rw,g=r,o=r v_utils.c echo x - vstr.h sed 's/^@//' > "vstr.h" <<'@//E*O*F vstr.h//' /* The number of units to scan ahead for holes when inserting characters */ /* (as opposed to just creating a new unit). Scanning isn't so */ /* much overhead as is copying characters to the hole, so it's best not */ /* to define COST too large unless you're using large units and/or doing */ /* a fair number of deletions. Perhaps some more clever way of doing */ /* insertions will be implemented some day. (I welcome any suggestions) */ #define COST 4 /* To turn on routine tracing and general annoying messages #define * DEBUG, as usual. Modify _end(), _entry() and _ret() for more advanced * debugging. */ /* #define DEBUG 1 */ /* * Define this to be the type of variable you want to store the length * and index variables in. Read OPERATION for more details.. */ #define LENTYPE short #ifdef DEBUG #define entry(s) _entry("s"); #define ret(s) { _ret(s); return(s); } #define end { _end(); return; } #else #define entry(s) _Curr_rtn = "s"; #define ret(s) return(s); #define end return; #endif /* * utility macros */ #define IMPORT extern #define EXPORT #define ROUTINE #define forever for(;;) #define IS_FRONT(v) (v->cur.here == v->head && v->cur.u_pos == -1) #define IS_BACK(v) (v->cur.here == v->tail \ && (v->cur.u_pos == v->cur.here->len)) typedef unsigned char boolean; /* * messages/error codes/constants */ #define M_NOSPACE "out of memory/arena corrupted\n" #define M_NILVSTR "null virtual string passed\n" #define ATEND -1 #define ATBEG -2 #define NIL 0 #define BACK 1 #define FWD 2 #define HERE 3 #define BEG 4 #define END 5 #define INS 6 /* * declarations */ struct unit { struct unit *prev, *next; LENTYPE len; char *data; }; typedef struct unit *p_unit; struct cursor { struct unit *here; LENTYPE u_pos; }; typedef struct cursor *p_cursor; struct vstr { p_unit head, tail; LENTYPE unit_size; struct cursor cur; }; typedef struct vstr *p_vstr; #ifndef DEBUG IMPORT char *_Curr_rtn; #endif IMPORT char *v_nvtos(); IMPORT char *v_vtos(); IMPORT int v_delete(); IMPORT int v_get(); IMPORT int v_length(); IMPORT int v_move(); IMPORT int v_span(); IMPORT p_cursor v_find(); IMPORT p_cursor v_savecursor(); IMPORT p_unit new_unit(); IMPORT p_vstr new_vstr(); IMPORT void freak_out(); IMPORT void v_append(); IMPORT void v_clear(); IMPORT void v_copycursor(); IMPORT void v_free(); IMPORT void v_put(); IMPORT void v_rewind(); IMPORT void v_setcursor(); IMPORT void v_show(); IMPORT void v_stov(); @//E*O*F vstr.h// chmod u=rw,g=r,o=r vstr.h echo x - vstr.man sed 's/^@//' > "vstr.man" <<'@//E*O*F vstr.man//' @.TH STRING 3-ucb @.SH NAME new_vstr, v_append, v_clear, v_copycursor, v_delete, v_find, v_free, v_get, v_length, v_move, v_vtos, v_nvtos, v_put, v_rewind, v_savecursor, v_setcursor, v_show, v_span, v_stov \-virtual string operations @.SH ORIGIN 4.2BSD @.SH SYNOPSIS @.nf @.B p_vstr new_vstr(unitsize) @.B int unitsize; @.PP @.B void v_append(v) @.B p_vstr v; @.PP @.B void v_clear(v, how) @.B p_vstr v; @.B int how; @.PP @.B void v_copycursor(c1, c2) @.B p_cursor c1, c2; @.PP @.B int v_delete(v, cnt) @.B p_vstr v; @.B int cnt; @.PP @.B int v_find(v, ch, dir) @.B p_vstr v; @.B char ch; @.B int dir; @.PP @.B void v_free(v) @.B p_vstr v; @.PP @.B int v_get(v, mode) @.B p_vstr v; @.B int mode; @.PP @.B int v_length(v, how) @.B p_vstr v; @.B int how; @.PP @.B char *vtos(v) @.B p_vstr v; @.PP @.B char *nvtos(v, n) @.B p_vstr v; @.B int n; @.PP @.B void v_put(v, ch, mode) @.B p_vstr v; @.B char ch; @.B int mode; @.PP @.B void v_rewind(v) @.B p_vstr v; @.PP @.B void v_setcursor(v, c) @.B p_vstr v; @.B p_cursor c; @.PP @.B p_cursor v_savecursor(v, c) @.B p_vstr v; @.B p_cursor c; @.PP @.B void v_show(v) @.B p_vstr v; @.PP @.B int v_span(c1, c2) @.B p_cursor c1, c2; @.PP @.B void v_stov(s, v) @.B char *s; @.B p_vstr v; @.PP @.fi @.SH DESCRIPTION These functions allow the creation and manipulation of 'virtual' strings. That is, strings that expand and contract as characters are added or deleted. Unlike conventional strings, virtual strings have 'cursors' that can be moved around in the string determining where operations will take place. @.PP @.I new_vstr allocates space for a new virtual string header and returns a pointer to it. Subsequent memory allocations will be in @.I unitsize byte chunks.. @.PP @.I v_append moves the cursor to the last (append) position in string @.I v. @.PP @.I v_clear deletes characters from string @.I v in one of two ways. If @.I how is set to the constant 'END' then text is deleted from the cursor position to the end of the string. If set to anything else, the entire string is deleted. @.PP @.I v_copycursor copies the contents of cursor @.I c2 to @.I c1. @.PP @.I v_delete deletes @.I cnt characters from @.I v starting at the @.I current cursor position until @.I cnt is exhausted or the end of the string is hit. Does not move the cursor (Characters slide over from the right). Returns the number of characters actually deleted. @.PP @.I v_find searches through string @.I v for char @.I ch in direction @.I dir until found, or end is hit. If found, a pointer to a cursor is returned containing the location of the found character. Since this cursor is declared as a local static in @.I v_find, care must be taken to save the contents in an auxiliary cursor if the contents are to be preserved across subsequent @.I v_find calls. @.PP @.I v_free deallocates all space used by string @.IR v. @.PP @.I v_get gets a character from @.I v in the following ways depending on @.I mode. If @.I mode is the constant 'HERE' the character at the cursor is returned. Otherwise a character is returned after moving cursor one position in direction @.I mode (BACK or FWD). Returns the constant ATBEG or ATEND if the cursor cannot be moved backwards or forward (the string is rewound or appended). @.PP @.I v_length returns number of characters in string @.I v in one of three ways, depending on the value of @.IR how. If @.I how is the constant 'HERE', then the total number of characters is returned. If @.I how is the constant 'BEG', then the number of characters from the current cursor position to the beginning of the string is returned. Finally, if @.I how is the constant 'END', then the number of characters from the current cursor position to the end of the string is returned. (easy enough..) @.PP @.I v_vtos slurps up characters from the current cursor position in string @.I v to the end. Returns a malloc'd string containing them. @.PP @.IR v_nvtos is like @.I vtos but only slurps (at most) @.I n characters. @.PP @.I v_put places character @.I ch in string @.I v depending on the value of @.I mode. If @.I mode is 'HERE', the character is placed at the cursor position without moving. If set to 'BACK' or 'FWD, it moves backwards or forward and places the character. If mode is 'INS', then the character is inserted after the cursor position and the cursor is moved forward onto the inserted character. @.PP @.I v_rewind moves the cursor in string to the beginning (rewound) position in string @.I v. @.PP @.I v_setcursor sets the cursor in @.I v to the contents of the cursor @.I c. @.PP @.I v_savecursor copies the cursor in @.I v to the cursor @.I c. The address of cursor @.I c is returned. @.PP @.I v_show is mostly for debugging. It displays the header contents of @.I v and shows the structure and contents of the linked list. Some types of list corruption are also detected and flagged. Kind of cute, but of limited use. @.PP @.I v_span returns the number of characters between cursor @.I c1 and @.I c2. The characters under the cursors are included. This function is useful in computing counts for @.I v_delete, v_move, v_nvtos ect. If one is clever, certain block operations between cursors can be done with the aid of this call. @.PP @.I v_stov puts string @.I s into @.I v at the current cursor position. String is added in a forward direction. @.PP @.SH BUGS @.I v_find returns the cursor position of the found character. If a subsequent @.I v_get or @.I v_put is done, you'll get the @.I next character since they both advance the cursor before returning a character. This is not really a bug, but potentially confusing. @.I v_delete also acts on the current cursor position, rather that moving then deleting. Another potential confuser. (Unless you're using it with @.I v_find) @.LP @//E*O*F vstr.man// chmod u=rw,g=r,o=r vstr.man echo Inspecting for damage in transit... temp=/tmp/shar$$; dtemp=/tmp/.shar$$ trap "rm -f $temp $dtemp; exit" 0 1 2 3 15 cat > $temp <<\!!! 23 55 472 Makefile 64 631 3461 OPERATION 40 273 1612 README 206 449 3670 examp.c 42 83 656 v_alloc.c 11 18 148 v_append.c 29 65 419 v_clear.c 56 125 884 v_cursor.c 183 538 3551 v_debug.c 55 129 877 v_delete.c 25 74 495 v_find.c 19 30 224 v_free.c 44 93 708 v_get.c 52 123 928 v_move.c 121 335 2405 v_put.c 11 18 134 v_rewind.c 66 185 1061 v_string.c 205 505 3360 v_utils.c 121 376 2464 vstr.h 263 1009 5594 vstr.man 1636 5114 33123 total !!! wc Makefile OPERATION README examp.c v_alloc.c v_append.c v_clear.c v_cursor.c v_debug.c v_delete.c v_find.c v_free.c v_get.c v_move.c v_put.c v_rewind.c v_string.c v_utils.c vstr.h vstr.man | sed 's=[^ ]*/==' | diff -b $temp - >$dtemp if [ -s $dtemp ] then echo "Ouch [diff of wc output]:" ; cat $dtemp else echo "No problems found." fi exit 0