From: Jim Kingdon Date: Thu, 20 May 1993 18:41:17 +0000 (+0000) Subject: * stabs.texinfo: Remove node Quick Reference and put its children X-Git-Url: http://git.efficios.com/?a=commitdiff_plain;h=8c59ee1150bc24e1c8ae0f2b0326450253ce63ad;p=deliverable%2Fbinutils-gdb.git * stabs.texinfo: Remove node Quick Reference and put its children directly under the main menu. * stabs.texinfo: Many more changes to bring it into line with AIX documentation and reality. I think it now has all the information from the AIX documentation, except that I burned out when I got to variant records (Pascal and Modula-2) and all the COBOL types. Oh well, we can add them later when we're worrying more about those languages. * stabs.texinfo (Automatic variables): Talk about what it means to omit the symbol descriptor. --- diff --git a/gdb/doc/ChangeLog b/gdb/doc/ChangeLog index 78bf3cd17a..64d9e7b61a 100644 --- a/gdb/doc/ChangeLog +++ b/gdb/doc/ChangeLog @@ -1,3 +1,18 @@ +Thu May 20 13:35:10 1993 Jim Kingdon (kingdon@lioth.cygnus.com) + + * stabs.texinfo: Remove node Quick Reference and put its children + directly under the main menu. + + * stabs.texinfo: Many more changes to bring it into line with + AIX documentation and reality. I think it now has all the + information from the AIX documentation, except that I burned + out when I got to variant records (Pascal and Modula-2) and + all the COBOL types. Oh well, we can add them later when we're + worrying more about those languages. + + * stabs.texinfo (Automatic variables): Talk about what it means + to omit the symbol descriptor. + Tue May 18 17:59:18 1993 Jim Kingdon (kingdon@lioth.cygnus.com) * stabs.texinfo (Parameters): Add "(sometimes)" when describing @@ -368,3 +383,10 @@ Thu Dec 5 22:46:12 1991 K. Richard Pixley (rich at rtl.cygnus.com) and shift gpl to v2. Added ChangeLog if it didn't exist. docdir and mandir now keyed off datadir by default. 
+ +Local Variables: +mode: indented-text +left-margin: 8 +fill-column: 74 +version-control: never +End: diff --git a/gdb/doc/stabs.texinfo b/gdb/doc/stabs.texinfo index 5656ea61dc..1aa2143f6e 100644 --- a/gdb/doc/stabs.texinfo +++ b/gdb/doc/stabs.texinfo @@ -10,7 +10,7 @@ END-INFO-DIR-ENTRY @end ifinfo @ifinfo -This document describes GNU stabs (debugging symbol tables) in a.out files. +This document describes the stabs debugging symbol tables. Copyright 1992 Free Software Foundation, Inc. Contributed by Cygnus Support. Written by Julia Menapace. @@ -68,17 +68,19 @@ This document describes the GNU stabs debugging format in a.out files. * Overview:: Overview of stabs * Program structure:: Encoding of the structure of the program * Constants:: Constants -* Simple types:: * Example:: A comprehensive example in C * Variables:: -* Aggregate Types:: +* Types:: Type definitions * Symbol tables:: Symbol information in symbol tables -* GNU Cplusplus stabs:: +* Cplusplus:: Appendixes: * Example2.c:: Source code for extended example * Example2.s:: Assembly code for extended example -* Quick reference:: Various refernce tables +* Stab types:: Table A: Symbol types from stabs +* Assembler types:: Table B: Symbol types from assembler and linker +* Symbol Descriptors:: Table C +* Type Descriptors:: Table D * Expanded reference:: Reference information by stab type * Questions:: Questions and anomolies * xcoff-differences:: Differences between GNU stabs in a.out @@ -98,9 +100,24 @@ to a debugger. This format was apparently invented by the University of California at Berkeley, for the @code{pdx} Pascal debugger; the format has spread widely since then. +This document is one of the few published sources of documentation on +stabs. It is believed to be completely comprehensive for stabs used by +C. The lists of symbol descriptors (@pxref{Symbol Descriptors}) and +type descriptors (@pxref{Type Descriptors}) are believed to be completely +comprehensive. 
There are known to be stabs for C++ and COBOL which are +poorly documented here. Stabs specific to other languages (e.g. Pascal, +Modula-2) are probably not as well documented as they should be. + +Other sources of information on stabs are @cite{dbx and dbxtool +interfaces}, 2nd edition, by Sun, circa 1988, and @cite{AIX Version 3.2 +Files Reference}, Fourth Edition, September 1992, "dbx Stabstring +Grammar" in the a.out section, page 2-31. This document is believed to +incorporate the information from those two sources except where it +explicitly directs you to them for more information. + @menu * Flow:: Overview of debugging information flow -* Stabs format:: Overview of stab format +* Stabs Format:: Overview of stab format * C example:: A simple example in C source * Assembly code:: The simple example at the assembly level @end menu @@ -134,7 +151,7 @@ files into one executable file, with one symbol table and one string table. Debuggers use the symbol and string tables in the executable as a source of debugging information about the program. -@node Stabs format +@node Stabs Format @section Overview of stab format There are three overall formats for stab assembler directives @@ -184,7 +201,7 @@ The overall format is of the @code{"@var{string}"} field is: @var{name} is the name of the symbol represented by the stab. @var{name} can be omitted, which means the stab represents an unnamed -object. For example, @code{":t10=*2"} defines type 10 as a pointer to +object. For example, @samp{:t10=*2} defines type 10 as a pointer to type 2, but does not give the type a name. Omitting the @var{name} field is supported by AIX dbx and GDB after about version 4.8, but not other debuggers. @@ -193,7 +210,7 @@ The @var{symbol_descriptor} following the @samp{:} is an alphabetic character that tells more specifically what kind of symbol the stab represents. If the @var{symbol_descriptor} is omitted, but type information follows, then the stab represents a local variable.
For a -list of symbol_descriptors, see @ref{Symbol descriptors,,Table C: Symbol +list of symbol descriptors, see @ref{Symbol Descriptors,,Table C: Symbol descriptors}. The @samp{c} symbol descriptor is an exception in that it is not @@ -220,15 +237,20 @@ Descriptors,,Table D: Type Descriptors}, for a list of There is an AIX extension for type attributes. Following the @samp{=} is any number of type attributes. Each one starts with @samp{@@} and ends with @samp{;}. Debuggers, including AIX's dbx, skip any type -attributes they do not recognize. The attributes are: +attributes they do not recognize. GDB 4.9 does not do this--it will +ignore the entire symbol containing a type attribute. Hopefully this +will be fixed in the next GDB release. Because of a conflict with C++ +(@pxref{Cplusplus}), new attributes should not be defined which begin +with a digit, @samp{(}, or @samp{-}; GDB may be unable to distinguish +those from the C++ type descriptor @samp{@@}. The attributes are: @table @code @item a@var{boundary} -@var{boundary} is an integer specifying the alignment. I assume that +@var{boundary} is an integer specifying the alignment. I assume it applies to all variables of this type. @item s@var{size} -Size in bits of a variabe of this type. +Size in bits of a variable of this type. @item p@var{integer} Pointer class (for checking). Not sure what this means, or how @@ -420,14 +442,15 @@ necessary), but the AIX documentation defines @samp{I}, @samp{P}, and These symbol descriptors are unusual in that they are not followed by type information. -After the symbol descriptor and the type information, there is -optionally a comma, followed by the name of the procedure, followed by a -comma, followed by a name specifying the scope. The first name is local -to the scope specified. I assume then that the name of the symbol -(before the @samp{:}), if specified, is some sort of global name. 
I -assume the name specifying the scope is the name of a function -specifying that scope. This feature is an AIX extension, and this -information is based on the manual; I haven't actually tried it. +For any of the above symbol descriptors, after the symbol descriptor and +the type information, there is optionally a comma, followed by the name +of the procedure, followed by a comma, followed by a name specifying the +scope. The first name is local to the scope specified. I assume then +that the name of the symbol (before the @samp{:}), if specified, is some +sort of global name. I assume the name specifying the scope is the name +of a function specifying that scope. This feature is an AIX extension, +and this information is based on the manual; I haven't actually tried +it. The stab representing a procedure is located immediately following the code of the procedure. This stab is in turn directly followed by a @@ -525,7 +548,7 @@ Character constant. @var{value} is the numeric value of the constant. @item e@var{type-information},@var{value} Enumeration constant. @var{type-information} is the type of the constant, as it would appear after a symbol descriptor -(@pxref{Overview}). @var{value} is the numeric value of the constant. +(@pxref{Stabs Format}). @var{value} is the numeric value of the constant. @item i@var{value} Integer constant. @var{value} is the numeric value. @@ -545,7 +568,7 @@ string are represented as @samp{\"}). @item S@var{type-information},@var{elements},@var{bits},@var{pattern} Set constant. @var{type-information} is the type of the constant, as it -would appear after a symbol descriptor (@pxref{Overview}). +would appear after a symbol descriptor (@pxref{Stabs Format}). @var{elements} is the number of elements in the set (is this just the number of bits set in @var{pattern}? Or redundant with the type? I don't get it), @var{bits} is the number of bits in the constant (meaning @@ -561,126 +584,6 @@ constants. This information is followed by @samp{;}. 
-@node Simple types -@chapter Simple types - -@menu -* Basic types:: Basic type definitions -* Range types:: Range types defined by min and max value -* Float "range" types:: Range type defined by size in bytes -@end menu - -@node Basic types -@section Basic type definitions - -@table @strong -@item Directive: -@code{.stabs} -@item Type: -@code{N_LSYM} -@item Symbol Descriptor: -@code{t} -@end table - -The basic types for the language are described using the @code{N_LSYM} stab -type. They are boilerplate and are emited by the compiler for each -compilation unit. Basic type definitions are not always a complete -description of the type and are sometimes circular. The debugger -recognizes the type anyway, and knows how to read bits as that type. - -Each language and compiler defines a slightly different set of basic -types. In this example we are looking at the basic types for C emited -by the GNU compiler targeting the Sun4. Here the basic types are -mostly defined as range types. - - -@node Range types -@section Range types defined by min and max value - -@table @strong -@item Type Descriptor: -@code{r} -@end table - -When defining a range type, if the number after the first semicolon is -smaller than the number after the second one, then the two numbers -represent the smallest and the largest values in the range. - -@example -4 .text -5 Ltext0: - -.stabs "@var{name}: - @var{descriptor} @r{(type)} - @var{type-def}= - @var{type-desc} - @var{type-ref}; - @var{low-bound}; - @var{high-bound}; - ", - N_LSYM, NIL, NIL, NIL - -6 .stabs "int:t1=r1;-2147483648;2147483647;",128,0,0,0 -7 .stabs "char:t2=r2;0;127;",128,0,0,0 -@end example - -Here the integer type (@code{1}) is defined as a range of the integer -type (@code{1}). Likewise @code{char} is a range of @code{char}. This -part of the definition is circular, but at least the high and low bound -values of the range hold more information about the type. 
- -Here short unsigned int is defined as type number 8 and described as a -range of type @code{int}, with a minimum value of 0 and a maximum of 65535. - -@example -13 .stabs "short unsigned int:t8=r1;0;65535;",128,0,0,0 -@end example - -@node Float "range" types -@section Range type defined by size in bytes - -@table @strong -@item Type Descriptor: -@code{r} -@end table - -In a range definition, if the first number after the semicolon is -positive and the second is zero, then the type being defined is a -floating point type, and the number after the first semicolon is the -number of bytes needed to represent the type. Note that this does not -provide a way to distinguish 8-byte real floating point types from -8-byte complex floating point types. - -@example -.stabs "@var{name}: - @var{desc} - @var{type-def}= - @var{type-desc} - @var{type-ref}; - @var{bit-count}; - 0; - ", - N_LSYM, NIL, NIL, NIL - -17 .stabs "float:t12=r1;4;0;",128,0,0,0 -18 .stabs "double:t13=r1;8;0;",128,0,0,0 -19 .stabs "long double:t14=r1;8;0;",128,0,0,0 -@end example - -Cosmically enough, the @code{void} type is defined directly in terms of -itself. - -@example -.stabs "@var{name}: - @var{symbol-desc} - @var{type-def}= - @var{type-ref} - ",N_LSYM,NIL,NIL,NIL - -20 .stabs "void:t15=15",128,0,0,0 -@end example - - @node Example @chapter A Comprehensive Example in C @@ -854,7 +757,6 @@ nesting is reflected in the nested bracketing stabs (@code{N_LBRAC}, none @end table - In addition to describing types, the @code{N_LSYM} stab type also describes locally scoped automatic variables. Refer again to the body of @code{main} in @file{example2.c}. It allocates two automatic @@ -882,7 +784,7 @@ scoped. @exdent @code{N_LSYM} (128): automatic variable, scoped locally to @code{main} .stabs "@var{name}: - @var{type-ref}", + @var{type information}", N_LSYM, NIL, NIL, @var{frame-pointer-offset} @@ -892,7 +794,7 @@ scoped. 
@exdent @code{N_LSYM} (128): automatic variable, scoped locally to the @code{for} loop .stabs "@var{name}: - @var{type-ref}", + @var{type information}", N_LSYM, NIL, NIL, @var{frame-pointer-offset} @@ -900,13 +802,13 @@ scoped. 101 .stabn 192,0,0,LBB3 ## begin `for' loop N_LBRAC @end example -Since the character in the string field following the colon is not a -letter, there is no symbol descriptor. This means that the stab -describes a local variable, and that the number after the colon is a -type reference. In this case it a a reference to the basic type @code{int}. -Notice also that the frame pointer offset is negative number for -automatic variables. - +The symbol descriptor is omitted for automatic variables. Since type +information should begin with a digit, @samp{-}, or @samp{(}, only +digits, @samp{-}, and @samp{(} are precluded from being used for symbol +descriptors by this fact. However, the Acorn RISC machine (ARM) is said +to get this wrong: it puts out a mere type definition here, without the +preceding @code{@var{typenumber}=}. This is a bad idea; there is no +guarantee that type descriptors are distinct from symbol descriptors. @node Global Variables @section Global Variables @@ -956,6 +858,8 @@ external symbol for the global variable. @node Register variables @section Register variables +@c According to an old version of this manual, AIX uses C_RPSYM instead +@c of C_RSYM. I am skeptical; this should be verified. Register variables have their own stab type, @code{N_RSYM}, and their own symbol descriptor, @code{r}. The stab's value field contains the number of the register where the variable data will be stored. @@ -964,7 +868,8 @@ The value is the register number. AIX defines a separate symbol descriptor @samp{d} for floating point registers. This seems incredibly stupid--why not just just give -floating point registers different register numbers. +floating point registers different register numbers?
I have not +verified whether the compiler actually uses @samp{d}. If the register is explicitly allocated to a global variable, but not initialized, as in @@ -1162,7 +1067,7 @@ The following are also said to go with @samp{N_PSYM}: @example "name" -> "param_name:#type" -> pP (<>) - -> pF (<>) + -> pF FORTRAN function parameter -> X (function result variable) -> b (based variable) @@ -1190,8 +1095,8 @@ The type definition of argv is interesting because it contains several type definitions. Type 21 is pointer to type 2 (char) and argv (type 20) is pointer to type 21. -@node Aggregate Types -@chapter Aggregate Types +@node Types +@chapter Type definitions Now let's look at some variable definitions involving complex types. This involves understanding better how types are described. In the @@ -1202,108 +1107,515 @@ the various other type descriptors that may follow the = sign in a type definition. @menu -* Arrays:: -* Enumerations:: -* Structure tags:: -* Typedefs:: +* Builtin types:: Integers, floating point, void, etc. +* Miscellaneous Types:: Pointers, sets, files, etc. +* Cross-references:: Referring to a type not yet defined. +* Subranges:: A type with a specific range. +* Arrays:: An aggregate type of same-typed elements. +* Strings:: Like an array but also has a length. +* Enumerations:: Like an integer but the values have names. +* Structures:: An aggregate type of different-typed elements. +* Typedefs:: Giving a type a name * Unions:: * Function types:: @end menu -@node Arrays -@section Array types +@node Builtin types +@section Builtin types -@table @strong -@item Directive: -@code{.stabs} -@item Types: -@code{N_GSYM}, @code{N_LSYM} -@item Symbol Descriptor: -@code{T} -@item Type Descriptor: -@code{a} +Certain types are built in (@code{int}, @code{short}, @code{void}, +@code{float}, etc.); the debugger recognizes these types and knows how +to handle them. 
Thus don't be surprised if some of the following ways +of specifying builtin types do not specify everything that a debugger +would need to know about the type---in some cases they merely specify +enough information to distinguish the type from other types. + +The traditional way to define builtin types is convoluted, so new ways +have been invented to describe them. Sun's ACC uses the @samp{b} and +@samp{R} type descriptors, and IBM uses negative type numbers. GDB can +accept all three, as of version 4.8; dbx just accepts the traditional +builtin types and perhaps one of the other two formats. + +@menu +* Traditional Builtin Types:: Put on your seatbelts and prepare for kludgery +* Builtin Type Descriptors:: Builtin types with special type descriptors +* Negative Type Numbers:: Builtin types using negative type numbers +@end menu + +@node Traditional Builtin Types +@subsection Traditional Builtin types + +Often types are defined as subranges of themselves. If the array bounds +can fit within an @code{int}, then they are given normally. For example: + +@example +.stabs "int:t1=r1;-2147483648;2147483647;",128,0,0,0 ; 128 is N_LSYM +.stabs "char:t2=r2;0;127;",128,0,0,0 +@end example + +Builtin types can also be described as subranges of @code{int}: + +@example +.stabs "unsigned short:t6=r1;0;65535;",128,0,0,0 +@end example + +If the upper bound of a subrange is -1, it means that the type is an +integral type whose bounds are too big to describe in an int. +Traditionally this is only used for @code{unsigned int} and +@code{unsigned long}; GCC also uses it for @code{long long} and +@code{unsigned long long}, and the only way to tell those types apart is +to look at their names. On other machines GCC puts out bounds in octal, +with a leading 0. In this case a negative bound consists of a number +which is a 1 bit followed by a bunch of 0 bits, and a positive bound is +one in which a bunch of bits are 1.
+ +@example +.stabs "unsigned int:t4=r1;0;-1;",128,0,0,0 +.stabs "long long int:t7=r1;0;-1;",128,0,0,0 +@end example + +If the upper bound of a subrange is 0, it means that this is a floating +point type, and the lower bound of the subrange indicates the number of +bytes in the type: + +@example +.stabs "float:t12=r1;4;0;",128,0,0,0 +.stabs "double:t13=r1;8;0;",128,0,0,0 +@end example + +However, GCC writes @code{long double} the same way it writes +@code{double}; the only way to distinguish them is by the name: + +@example +.stabs "long double:t14=r1;8;0;",128,0,0,0 +@end example + +Complex types are defined the same way as floating-point types; the only +way to distinguish a single-precision complex from a double-precision +floating-point type is by the name. + +The C @code{void} type is defined as itself: + +@example +.stabs "void:t15=15",128,0,0,0 +@end example + +I'm not sure how a boolean type is represented. + +@node Builtin Type Descriptors +@subsection Defining Builtin Types using Builtin Type Descriptors + +There are various type descriptors to define builtin types: + +@table @code +@c FIXME: clean up description of width and offset +@item b @var{signed} @var{char-flag} @var{width} ; @var{offset} ; @var{nbits} ; +Define an integral type. @var{signed} is @samp{u} for unsigned or +@samp{s} for signed. @var{char-flag} is @samp{c} which indicates this +is a character type, or is omitted. I assume this is to distinguish an +integral type from a character type of the same size, for example it +might make sense to set it for the C type @code{wchar_t} so the debugger +can print such variables differently (Solaris does not do this). Sun +sets it on the C types @code{signed char} and @code{unsigned char} which +arguably is wrong. @var{width} and @var{offset} appear to be for small +objects stored in larger ones, for example a @code{short} in an +@code{int} register. @var{width} is normally the number of bytes in the +type. @var{offset} seems to always be zero. 
@var{nbits} is the number +of bits in the type. + +Note that type descriptor @samp{b} used for builtin types conflicts with +its use for Pascal space types (@pxref{Miscellaneous Types}); they can +be distinguished because the character following the type descriptor +will be a digit, @samp{(}, or @samp{-} for a Pascal space type, or +@samp{u} or @samp{s} for a builtin type. + +@item w +Documented by AIX to define a wide character type, but their compiler +actually uses negative type numbers (@pxref{Negative Type Numbers}). + +@item R @var{details} ; @var{bytes} ; +@c FIXME: What does @var{details} mean? +Define a floating point type. @var{details} is a number which has +details about the type, for example whether it is complex. @var{bytes} +is the number of bytes occupied by the type. + +@item g @var{type-information} ; @var{nbits} +Documented by AIX to define a floating type, but their compiler actually +uses negative type numbers (@pxref{Negative Type Numbers}). + +@item c @var{type-information} ; @var{nbits} +Documented by AIX to define a complex type, but their compiler actually +uses negative type numbers (@pxref{Negative Type Numbers}). +@end table + +The C @code{void} type is defined as a signed integral type 0 bits long: +@example +.stabs "void:t19=bs0;0;0",128,0,0,0 +@end example + +I'm not sure how a boolean type is represented. + +@node Negative Type Numbers +@subsection Negative Type numbers + +Since the debugger knows about the builtin types anyway, the idea of +negative type numbers is simply to give a special type number which +indicates the built in type. There is no stab defining these types. + +I'm not sure whether anyone has tried to define what this means if +@code{int} can be other than 32 bits (or other types can be other than +their customary size). If @code{int} has exactly one size for each +architecture, then it can be handled easily enough, but if the size of +@code{int} can vary according to the compiler options, then it gets hairy.
+I guess the consistent way to do this would be to define separate +negative type numbers for 16-bit @code{int} and 32-bit @code{int}; +therefore I have indicated below the customary size (and other format +information) for each type. The information below is currently correct +because AIX on the RS6000 is the only system which uses these type +numbers. If these type numbers start to get used on other systems, I +suspect the correct thing to do is to define a new number in cases where +a type does not have the size and format indicated below. + +@table @code +@item -1 +@code{int}, 32 bit signed integral type. + +@item -2 +@code{char}, 8 bit type holding a character. Both GDB and dbx on AIX +treat this as signed. GCC uses this type whether @code{char} is signed +or not, which seems like a bad idea. The AIX compiler (xlc) seems to +avoid this type; it uses -5 instead for @code{char}. + +@item -3 +@code{short}, 16 bit signed integral type. + +@item -4 +@code{long}, 32 bit signed integral type. + +@item -5 +@code{unsigned char}, 8 bit unsigned integral type. + +@item -6 +@code{signed char}, 8 bit signed integral type. + +@item -7 +@code{unsigned short}, 16 bit unsigned integral type. + +@item -8 +@code{unsigned int}, 32 bit unsigned integral type. + +@item -9 +@code{unsigned}, 32 bit unsigned integral type. + +@item -10 +@code{unsigned long}, 32 bit unsigned integral type. + +@item -11 +@code{void}, type indicating the lack of a value. + +@item -12 +@code{float}, IEEE single precision. + +@item -13 +@code{double}, IEEE double precision. + +@item -14 +@code{long double}, IEEE extended, RS6000 format. + +@item -15 +@code{integer}. Pascal, I assume. 32 bit signed integral type. + +@item -16 +Boolean. Only one bit is used, not sure about the actual size of the +type. + +@item -17 +@code{short real}. Pascal, I assume. IEEE single precision. + +@item -18 +@code{real}. Pascal, I assume. IEEE double precision. + +@item -19 +A Pascal Stringptr. @xref{Strings}. 
+ +@item -20 +@code{character}, 8 bit unsigned type. + +@item -21 +@code{logical*1}, 8 bit unsigned integral type. + +@item -22 +@code{logical*2}, 16 bit unsigned integral type. + +@item -23 +@code{logical*4}, 32 bit unsigned integral type. + +@item -24 +@code{logical}, 32 bit unsigned integral type. + +@item -25 +A complex type consisting of two IEEE single-precision floating point values. + +@item -26 +A complex type consisting of two IEEE double-precision floating point values. + +@item -27 +@code{integer*1}, 8 bit signed integral type. + +@item -28 +@code{integer*2}, 16 bit signed integral type. + +@item -29 +@code{integer*4}, 32 bit signed integral type. + +@item -30 +Wide character. AIX appears not to use this for the C type +@code{wchar_t}; instead it uses an integral type of the appropriate +size. +@end table + +@node Miscellaneous Types +@section Miscellaneous Types + +@table @code +@item b @var{type-information} ; @var{bytes} +Pascal space type. This is documented by IBM; what does it mean? + +Note that this use of the @samp{b} type descriptor can be distinguished +from its use for builtin integral types (@pxref{Builtin Type +Descriptors}) because the character following the type descriptor is +always a digit, @samp{(}, or @samp{-}. + +@item B @var{type-information} +A volatile-qualified version of @var{type-information}. This is a Sun +extension. A volatile-qualified type means that references and stores +to a variable of that type must not be optimized or cached; they must +occur as the user specifies them. + +@item d @var{type-information} +File of type @var{type-information}. As far as I know this is only used +by Pascal. + +@item k @var{type-information} +A const-qualified version of @var{type-information}. This is a Sun +extension. A const-qualified type means that a variable of this type +cannot be modified. + +@item M @var{type-information} ; @var{length} +Multiple instance type. 
The type seems to be composed of @var{length} +repetitions of @var{type-information}, for example @code{character*3} is +represented by @samp{M-2;3}, where @samp{-2} is a reference to a +character type (@pxref{Negative Type Numbers}). I'm not sure how this +differs from an array. This appears to be a FORTRAN feature. +@var{length} is a bound, like those in range types, @xref{Subranges}. + +@item S @var{type-information} +Pascal set type. @var{type-information} must be a small type such as an +enumeration or a subrange, and the type is a bitmask whose length is +specified by the number of elements in @var{type-information}. + +@item * @var{type-information} +Pointer to @var{type-information}. @end table -As an example of an array type consider the global variable below. +@node Cross-references +@section Cross-references to other types + +If a type is used before it is defined, one common way to deal with this +is just to use a type reference to a type which has not yet been +defined. The debugger is expected to be able to deal with this. + +Another way is with the @samp{x} type descriptor, which is followed by +@samp{s} for a structure tag, @samp{u} for a union tag, or @samp{e} for +an enumerator tag, followed by the name of the tag, followed by @samp{:}. +For example, the following C declarations: @example -15 char char_vec[3] = @{'a','b','c'@}; +struct foo; +struct foo *bar; @end example -Since the array is a global variable, it is described by the N_GSYM -stab type. The symbol descriptor G, following the colon in stab's -string field, also says the array is a global variable. Following the -G is a definition for type (19) as shown by the equals sign after the -type number. +produce + +@example +.stabs "bar:G16=*17=xsfoo:",32,0,0,0 +@end example + +Not all debuggers support the @samp{x} type descriptor, so on some +machines GCC does not use it.
I believe that for the above example it +would just emit a reference to type 17 and never define it, but I +haven't verified that. + +Modula-2 imported types, at least on AIX, use the @samp{i} type +descriptor, which is followed by the name of the module from which the +type is imported, followed by @samp{:}, followed by the name of the +type. There is then optionally a comma followed by type information for +the type (this differs from merely naming the type (@pxref{Typedefs}) in +that it identifies the module; I don't understand whether the name of +the type given here is always just the same as the name we are giving +it, or whether this type descriptor is used with a nameless stab +(@pxref{Stabs Format}), or what). The symbol ends with @samp{;}. -After the equals sign is a type descriptor, a, which says that the type -being defined is an array. Following the type descriptor for an array -is the type of the index, a semicolon, and the type of the array elements. +@node Subranges +@section Subrange types + +The @samp{r} type descriptor defines a type as a subrange of another +type. It is followed by type information for the type which it is a +subrange of, a semicolon, an integral lower bound, a semicolon, an +integral upper bound, and a semicolon. The AIX documentation does not +specify the trailing semicolon; I believe it is confused. + +AIX allows the bounds to be one of the following instead of an integer: + +@table @code +@item A @var{offset} +The bound is passed by reference on the stack at offset @var{offset} +from the argument list. @xref{Parameters}, for more information on such +offsets. + +@item T @var{offset} +The bound is passed by value on the stack at offset @var{offset} from +the argument list. + +@item a @var{register-number} +The bound is passed by reference in register number +@var{register-number}. + +@item t @var{register-number} +The bound is passed by value in register number @var{register-number}. + +@item J +There is no bound.
+@end table + +Subranges are also used for builtin types, @xref{Traditional Builtin Types}. + +@node Arrays +@section Array types + +Arrays use the @samp{a} type descriptor. Following the type descriptor +is the type of the index and the type of the array elements. The two +types are not separated by any sort of delimiter; if the type of +the index does not end in a semicolon I don't know what is supposed to +happen. IBM documents a semicolon between the two types. For the +common case (a range type), this ends up as being the same since IBM +documents a range type as not ending in a semicolon, but the latter does +not accord with common practice, in which range types do end with +semicolons. The type of the index is often a range type, expressed as the letter r -and some parameters. It defines the size of the array. In in the -example below, the range @code{r1;0;2;} defines an index type which is -a subrange of type 1 (integer), with a lower bound of 0 and an upper -bound of 2. This defines the valid range of subscripts of a -three-element C array. +and some parameters. It defines the size of the array. In the example +below, the range @code{r1;0;2;} defines an index type which is a +subrange of type 1 (integer), with a lower bound of 0 and an upper bound +of 2. This defines the valid range of subscripts of a three-element C +array. -The array definition above generates the assembly language that -follows.
+For example, the definition @example -@exdent <32> N_GSYM - global variable -@exdent .stabs "name:sym_desc(global)type_def(19)=type_desc(array) -@exdent index_type_ref(range of int from 0 to 2);element_type_ref(char)"; -@exdent N_GSYM, NIL, NIL, NIL +char char_vec[3] = @{'a','b','c'@}; +@end example -32 .stabs "char_vec:G19=ar1;0;2;2",32,0,0,0 -33 .global _char_vec -34 .align 4 -35 _char_vec: -36 .byte 97 -37 .byte 98 -38 .byte 99 +@noindent +produces the output + +@example +.stabs "char_vec:G19=ar1;0;2;2",32,0,0,0 + .global _char_vec + .align 4 +_char_vec: + .byte 97 + .byte 98 + .byte 99 +@end example + +If an array is @dfn{packed}, it means that the elements are spaced more +closely than normal, saving memory at the expense of speed. For +example, an array of 3-byte objects might, if unpacked, have each +element aligned on a 4-byte boundary, but if packed, have no padding. +One way to specify that something is packed is with type attributes +(@pxref{Stabs Format}); in the case of arrays, another is to use the +@samp{P} type descriptor instead of @samp{a}. Other than specifying a +packed array, @samp{P} is identical to @samp{a}. + +@c FIXME-what is it? A pointer? +An open array is represented by the @samp{A} type descriptor followed by +type information specifying the type of the array elements. + +@c FIXME: what is the format of this type? A pointer to a vector of pointers? +An N-dimensional dynamic array is represented by + +@example +D @var{dimensions} ; @var{type-information} +@end example + +@c Does dimensions really have this meaning? The AIX documentation +@c doesn't say. +@var{dimensions} is the number of dimensions; @var{type-information} +specifies the type of the array elements. + +@c FIXME: what is the format of this type? A pointer to some offsets in +@c another array? +A subarray of an N-dimensional array is represented by + +@example +E @var{dimensions} ; @var{type-information} @end example +@c Does dimensions really have this meaning?
The AIX documentation +@c doesn't say. +@var{dimensions} is the number of dimensions; @var{type-information} +specifies the type of the array elements. + +@node Strings +@section Strings + +Some languages, like C or the original Pascal, do not have string types; +they just have related things like arrays of characters. But most +Pascals and various other languages have string types, which are +indicated as follows: + +@table @code +@item n @var{type-information} ; @var{bytes} +@var{bytes} is the maximum length. I'm not sure what +@var{type-information} is; I suspect that it means that this is a string +of @var{type-information} (thus allowing a string of integers, a string +of wide characters, etc., as well as a string of characters). Not sure +what the format of this type is. This is an AIX feature. + +@item z @var{type-information} ; @var{bytes} +Just like @samp{n} except that this is a gstring, not an ordinary +string. I don't know the difference. + +@item N +Pascal Stringptr. What is this? This is an AIX feature. +@end table + @node Enumerations @section Enumerations -@table @strong -@item Directive: -@code{.stabs} -@item Type: -@code{N_LSYM} -@item Symbol Descriptor: -@code{T} -@item Type Descriptor: -@code{e} -@end table +Enumerations are defined with the @samp{e} type descriptor. +@c FIXME: Where does this information properly go? Perhaps it is +@c redundant with something we already explain. The source line below declares an enumeration type. It is defined at file scope between the bodies of main and s_proc in example2.c. -Because the N_LSYM is located after the N_RBRAC that marks the end of +The type definition is located after the N_RBRAC that marks the end of the previous procedure's block scope, and before the N_FUN that marks -the beginning of the next procedure's block scope, the N_LSYM does not -describe a block local symbol, but a file local one. The source line: +the beginning of the next procedure's block scope.
Therefore it does not +describe a block local symbol, but a file local one. + +The source line: @example -29 enum e_places @{first,second=3,last@}; +enum e_places @{first,second=3,last@}; @end example @noindent -generates the following stab, located just after the N_RBRAC (close -brace stab) for main. The type definition is in an N_LSYM stab -because type definitions are file scope not global scope. - -@display - <128> N_LSYM - local symbol - .stab "name:sym_dec(type)type_def(22)=sym_desc(enum) - enum_name:value(0),enum_name:value(3),enum_name:value(4),;", - N_LSYM, NIL, NIL, NIL -@end display +generates the following stab @example -104 .stabs "e_places:T22=efirst:0,second:3,last:4,;",128,0,0,0 +.stabs "e_places:T22=efirst:0,second:3,last:4,;",128,0,0,0 @end example The symbol descriptor (T) says that the stab describes a structure, @@ -1312,14 +1624,20 @@ the type definition narrows it down to an enumeration type. Following the e is a list of the elements of the enumeration. The format is name:value,. The list of elements ends with a ;. -@node Structure tags -@section Structure Tags +There is no standard way to specify the size of an enumeration type; it +is determined by the architecture (normally all enumeration types are +32 bits). There should be a way to specify an enumeration type of +another size; type attributes would be one way to do this (@pxref{Stabs +Format}). + +@node Structures +@section Structures @table @strong @item Directive: @code{.stabs} @item Type: -@code{N_LSYM} +@code{N_LSYM} or @code{C_DECL} @item Symbol Descriptor: @code{T} @item Type Descriptor: @@ -1384,61 +1702,35 @@ element of. So the definition of structure type 16 contains an type definition for an element which is a pointer to type 16. @node Typedefs -@section Typedefs - -@table @strong -@item Directive: -@code{.stabs} -@item Type: -@code{N_LSYM} -@item Symbol Descriptor: -@code{t} -@end table - -Here is the stab for the typedef equating the structure tag with a -type.
+@section Giving a type a name -@display - <128> N_LSYM - type definition - .stabs "name:sym_desc(type name)type_ref(struct_tag)",N_LSYM,NIL,NIL,NIL -@end display +To give a type a name, use the @samp{t} symbol descriptor. For example, @example -31 .stabs "s_typedef:t16",128,0,0,0 +.stabs "s_typedef:t16",128,0,0,0 @end example -And here is the code generated for the structure variable. - -@display - <32> N_GSYM - global symbol - .stabs "name:sym_desc(global)type_ref(struct_tag)",N_GSYM,NIL,NIL,NIL -@end display - -@example -136 .stabs "g_an_s:G16",32,0,0,0 -137 .common _g_an_s,20,"bss" -@end example +specifies that @code{s_typedef} refers to type number 16. Such stabs +have symbol type @code{N_LSYM} or @code{C_DECL}. -Notice that the structure tag has the same type number as the typedef -for the structure tag. It is impossible to distinguish between a -variable of the struct type and one of its typedef by looking at the -debugging information. +If, instead, you are giving a name to a tag for a structure, union, or +enumeration, use the @samp{T} symbol descriptor. I believe C is +the only language with this feature. +If the type is an opaque type (I believe this is a Modula-2 feature), +AIX provides a type descriptor to specify it. The type descriptor is +@samp{o} and is followed by a name. I don't know what the name +means---is it always the same as the name of the type, or is this type +descriptor used with a nameless stab (@pxref{Stabs Format})? There +optionally follows a comma followed by type information which defines +the type of this type. If omitted, a semicolon is used in place of the +comma and the type information, and the type is much like a generic +pointer type---it has a known size but little else about it is +specified. @node Unions @section Unions -@table @strong -@item Directive: -@code{.stabs} -@item Type: -@code{N_LSYM} -@item Symbol Descriptor: -@code{T} -@item Type Descriptor: -@code{u} -@end table - Next let's look at unions.
In example2 this union type is declared locally to a procedure and an instance of the union is defined. @@ -1501,37 +1793,46 @@ pointer offset for local variables is negative. @node Function types @section Function types -@display -type descriptor f -@end display +There are various types for function variables. These types are not +used in defining functions (for that, see symbol descriptor @samp{f}); they are +used for things like pointers to functions. -The last type descriptor in C which remains to be described is used -for function types. Consider the following source line defining a -global function pointer. +The simple, traditional type is type descriptor @samp{f}, which is followed by +type information for the return type of the function, followed by a +semicolon. + +This does not deal with functions the number and type of whose +parameters are part of their type, as found in Modula-2 or ANSI C. AIX +provides extensions to specify these, using the @samp{f}, @samp{F}, +@samp{p}, and @samp{R} type descriptors. + +First comes the type descriptor. Then, if it is @samp{f} or @samp{F}, +this is a function, and the type information for the return type of the +function follows, followed by a comma. Then comes the number of +parameters to the function and a semicolon. Then, for each parameter, +there is the name of the parameter followed by a colon (this is only +present for type descriptors @samp{R} and @samp{F} which represent +Pascal function or procedure parameters), type information for the +parameter, a comma, @samp{0} if passed by reference or @samp{1} if +passed by value, and a semicolon. The type definition ends with a +semicolon. + +For example, @example -4 int (*g_pf)(); +int (*g_pf)(); @end example -It generates the following code. Since the variable is not -initialized, the code is located in the common area at the end of the -file.
- -@display - <32> N_GSYM - global variable - .stabs "name:sym_desc(global)type_def(24)=ptr_to(25)= - type_def(func)type_ref(int) -@end display +@noindent +generates the following code: @example -134 .stabs "g_pf:G24=*25=f1",32,0,0,0 -135 .common _g_pf,4,"bss" +.stabs "g_pf:G24=*25=f1",32,0,0,0 + .common _g_pf,4,"bss" @end example -Since the variable is global, the stab type is N_GSYM and the symbol -descriptor is G. The variable defines a new type, 24, which is a -pointer to another new type, 25, which is defined as a function -returning int. +The variable defines a new type, 24, which is a pointer to another new +type, 25, which is defined as a function returning int. @node Symbol tables @chapter Symbol information in symbol tables @@ -1652,7 +1953,7 @@ entry now holds an absolute address. 215 0000e008 D _g_foo @end example -@node GNU Cplusplus stabs +@node Cplusplus @chapter GNU C++ stabs @menu @@ -1668,24 +1969,25 @@ entry now holds an absolute address. * Static Members:: @end menu - -@subsection Symbol descriptors added for C++ descriptions: - -@display -P - register parameter. -@end display - @subsection type descriptors added for C++ descriptions @table @code @item # method type (two ## if minimal debug) -@item xs -cross-reference +@item @@ +Member (class and variable) type. It is followed by type information +for the offset basetype, a comma, and type information for the type of +the field being pointed to. (FIXME: this is acknowledged to be +gibberish. Can anyone say what really goes here?). + +Note that there is a conflict between this and type attributes +(@pxref{Stabs Format}); both use type descriptor @samp{@@}. +Fortunately, the @samp{@@} type descriptor used in this C++ sense always +will be followed by a digit, @samp{(}, or @samp{-}, and type attributes +never start with those things. @end table - @node Basic Cplusplus types @section Basic types for C++ @@ -2468,19 +2770,8 @@ description in the class stab shows this ordering. 
137 .common _g_an_s,20,"bss" @end example - -@node Quick reference -@appendix Quick reference - -@menu -* Stab types:: Table A: Symbol types from stabs -* Assembler types:: Table B: Symbol types from assembler and linker -* Symbol descriptors:: Table C -* Type Descriptors:: Table D -@end menu - @node Stab types -@section Table A: Symbol types from stabs +@appendix Table A: Symbol types from stabs Table A lists stab types sorted by type number. Stab type numbers are 32 and greater. This is the full list of stab numbers, including stab @@ -2542,7 +2833,7 @@ dec hex name source program feature @end smallexample @node Assembler types -@section Table B: Symbol types from assembler and linker +@appendix Table B: Symbol types from assembler and linker Table B shows the types of symbol table entries that hold assembler and linker symbols. @@ -2570,12 +2861,14 @@ n_type n_type name used to describe 31 0x1f N_FN file name of a .o file @end smallexample -@node Symbol descriptors -@section Table C: Symbol descriptors +@node Symbol Descriptors +@appendix Table C: Symbol descriptors @c Please keep this alphabetical @table @code -@item (empty) +@item @var{(digit)} +@itemx ( +@itemx - Local variable, @xref{Automatic variables}. @item a @@ -2585,7 +2878,10 @@ Parameter passed by reference in register, @xref{Parameters}. Constant, @xref{Constants}. @item C -Conformant array bound, @xref{Parameters}. +Conformant array bound (Pascal, maybe other languages), +@xref{Parameters}. Name of a caught exception (GNU C++). These can be +distinguished because the latter uses N_CATCH and the former uses +another symbol type. @item d Floating point register variable, @xref{Register variables}. @@ -2618,17 +2914,20 @@ Label name (documented by AIX, no further information known). Module, @xref{Procedures}. @item p -Argument list parameter @xref{Parameters}. +Argument list parameter, @xref{Parameters}. @item pP @xref{Parameters}. @item pF -@xref{Parameters}. 
+FORTRAN Function parameter, @xref{Parameters}. @item P -Global Procedure (AIX), @xref{Procedures}. -Register parameter (GNU), @xref{Parameters}. +Global Procedure (AIX), @xref{Procedures}. Register parameter (GNU), +@xref{Parameters}. These two uses can be distinguished because a +register parameter uses N_PSYM and a procedure uses some other symbol +type. Prototype of function referenced by this file (Sun acc) (have not +yet investigated this conflict. FIXME). @item Q Static Procedure, @xref{Procedures}. @@ -2647,10 +2946,10 @@ Static file scope variable @xref{Initialized statics}, Type name, @xref{Typedefs}. @item T -enumeration, struct or union tag, @xref{Unions}. +enumeration, struct or union tag, @xref{Typedefs}. @item v -Call by reference, @xref{Parameters}. +Parameter passed by reference, @xref{Parameters}. @item V Static procedure scope variable @xref{Initialized statics}, @@ -2664,42 +2963,131 @@ Function return variable, @xref{Parameters}. @end table @node Type Descriptors -@section Table D: Type Descriptors +@appendix Table D: Type Descriptors @table @code -@item (digits) -Type reference, @xref{Overview}. +@item @var{digit} +@itemx ( +Type reference, @xref{Stabs Format}. + +@item - +Reference to builtin type, @xref{Negative Type Numbers}. + +@item # +Method (C++), @xref{Cplusplus}. @item * -Pointer type. +Pointer, @xref{Miscellaneous Types}. + +@item & +Reference (C++). @item @@ -Type Attributes (AIX), @xref{Overview}. -Some C++ thing (GNU). +Type Attributes (AIX), @xref{Stabs Format}. Member (class and variable) +type (GNU C++), @xref{Cplusplus}. @item a -Array type. +Array, @xref{Arrays}. + +@item A +Open array, @xref{Arrays}. + +@item b +Pascal space type (AIX), @xref{Miscellaneous Types}. Builtin integer +type (Sun), @xref{Builtin Type Descriptors}. + +@item B +Volatile-qualified type, @xref{Miscellaneous Types}. + +@item c +Complex builtin type, @xref{Builtin Type Descriptors}. + +@item C +COBOL Picture type. See AIX documentation for details.
+ +@item d +File type, @xref{Miscellaneous Types}. + +@item D +N-dimensional dynamic array, @xref{Arrays}. @item e -Enumeration type. +Enumeration type, @xref{Enumerations}. + +@item E +N-dimensional subarray, @xref{Arrays}. @item f -Function type. +Function type, @xref{Function types}. + +@item g +Builtin floating point type, @xref{Builtin Type Descriptors}. + +@item G +COBOL Group. See AIX documentation for details. + +@item i +Imported type, @xref{Cross-references}. + +@item k +Const-qualified type, @xref{Miscellaneous Types}. + +@item K +COBOL File Descriptor. See AIX documentation for details. + +@item n +String type, @xref{Strings}. + +@item N +Stringptr, @xref{Strings}. + +@item M +Multiple instance type, @xref{Miscellaneous Types}. + +@item o +Opaque type, @xref{Typedefs}. + +@item P +Packed array, @xref{Arrays}. @item r -Range type. +Range type, @xref{Subranges}. + +@item R +Builtin floating type, @xref{Builtin Type Descriptors}. @item s -Structure type. +Structure type, @xref{Structures}. + +@item S +Set type, @xref{Miscellaneous Types}. @item u -Union specifications. +Union, @xref{Unions}. + +@item v +Variant record. This is a Pascal and Modula-2 feature which is like a +union within a struct in C. See AIX documentation for details. + +@item w +Wide character, @xref{Builtin Type Descriptors}. + +@item x +Cross-reference, @xref{Cross-references}. +@item z +gstring, @xref{Strings}. @end table @node Expanded reference @appendix Expanded reference by stab type. +@c FIXME: For most types this should be much shorter and much sweeter, +@c see N_PSYM for an example. For stuff like N_SO where the stab type +@c really is the important thing, the information can stay here. + +@c FIXME: It probably should be merged with Tables A and B. + Format of an entry: The first line is the symbol type expressed in decimal, hexadecimal, @@ -3272,23 +3660,6 @@ because in xcoff N_STSYM and N_LCSYM must be emited in a named static block. 
Begin the block with .bs s[RW] data_section_name for N_STSYM or .bs s bss_section_name for N_LCSYM. End the block with .es -@item -xcoff stabs describing tags and typedefs use the N_DECL (0x8c)instead -of N_LSYM stab type. - -@item -xcoff uses N_RPSYM (0x8e) instead of the N_RSYM stab type for register -variables. If the register variable is also a value parameter, then -use R instead of P for the symbol descriptor. - -6. -xcoff uses negative numbers as type references to the basic types. -There are no boilerplate type definitions emited for these basic -types. << make table of basic types and type numbers for C >> - -@item -xcoff .stabx sometimes don't have the name part of the string field. - @item xcoff uses a .file stab type to represent the source file name. There is no stab for the path to the source file.
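The stab strings quoted in the array, enumeration, typedef and function-pointer examples of this patch can be decoded mechanically from the rules the new text states. The following is only a minimal sketch in Python, not code from GDB or any real tool: the names `parse_stab` and `parse_type` and the dictionary layout are hypothetical, and it handles just the descriptors exercised by those four examples (`r`, `a`, `e`, `*`, `f`). It assumes plain numeric type numbers and names without colons, and it does not handle negative type numbers, the `(` form of type reference, or any of the AIX extensions.

```python
import re

_NUM = re.compile(r"\d+")
_BOUNDS = re.compile(r";(-?\d+);(-?\d+);")       # ;lower;upper;
_MEMBER = re.compile(r"([^:;,]+):(-?\d+),")      # name:value,


def parse_type(s, pos):
    """Parse type information starting at s[pos]; return (tree, new_pos)."""
    m = _NUM.match(s, pos)
    if m:                                        # type number
        num, pos = int(m.group()), m.end()
        if pos < len(s) and s[pos] == "=":       # NUM=... defines the type
            body, pos = parse_type(s, pos + 1)
            return {"num": num, "def": body}, pos
        return {"ref": num}, pos                 # bare NUM is a reference
    if pos >= len(s):
        raise ValueError("unexpected end of stab string")
    d, pos = s[pos], pos + 1
    if d == "*":                                 # pointer to a type
        target, pos = parse_type(s, pos)
        return {"pointer": target}, pos
    if d == "f":                                 # function returning a type
        ret, pos = parse_type(s, pos)
        if pos < len(s) and s[pos] == ";":       # semicolon is absent in the example
            pos += 1
        return {"function": ret}, pos
    if d == "r":                                 # subrange: type;lower;upper;
        base, pos = parse_type(s, pos)
        m = _BOUNDS.match(s, pos)
        if not m:
            raise ValueError("only integer subrange bounds are handled")
        return {"subrange": base, "low": int(m.group(1)),
                "high": int(m.group(2))}, m.end()
    if d == "a":                                 # array: index type, element type
        index, pos = parse_type(s, pos)
        if pos < len(s) and s[pos] == ";":       # separator as IBM documents it
            pos += 1
        elem, pos = parse_type(s, pos)
        return {"array": elem, "index": index}, pos
    if d == "e":                                 # enumeration: name:value,...;
        members = {}
        while pos < len(s) and s[pos] != ";":
            m = _MEMBER.match(s, pos)
            if not m:
                raise ValueError("malformed enumeration member")
            members[m.group(1)] = int(m.group(2))
            pos = m.end()
        return {"enum": members}, pos + 1
    raise ValueError("unhandled type descriptor %r" % d)


def parse_stab(string):
    """Split 'name:descriptor type-info' and parse the type information."""
    name, rest = string.split(":", 1)
    desc, pos = "", 0
    # A leading digit, '(' or '-' means the symbol descriptor was omitted.
    if rest and not (rest[0].isdigit() or rest[0] in "(-"):
        desc, pos = rest[0], 1
    tree, _ = parse_type(rest, pos)
    return {"name": name, "descriptor": desc, "type": tree}


if __name__ == "__main__":
    for text in ("char_vec:G19=ar1;0;2;2",
                 "e_places:T22=efirst:0,second:3,last:4,;",
                 "s_typedef:t16",
                 "g_pf:G24=*25=f1"):
        print(parse_stab(text))
```

A recursive-descent shape was chosen because type information nests: an array contains an index type and an element type, each of which may itself be a definition. The optional semicolon after the array index type reflects the disagreement the text describes between the IBM documentation and common practice.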