1\input texinfo
2@parindent=0pt
3@setfilename gld
4@c @@setchapternewpage odd
5@settitle GLD, The GNU linker
6@titlepage
7@title{gld}
8@subtitle{The GNU loader}
9@sp 1
10@subtitle Second Edition---gld version 2.0
11@subtitle January 1991
12@vskip 0pt plus 1filll
13Copyright @copyright{} 1991 Free Software Foundation, Inc.
14
15Permission is granted to make and distribute verbatim copies of
16this manual provided the copyright notice and this permission notice
17are preserved on all copies.
18
19Permission is granted to copy and distribute modified versions of this
20manual under the conditions for verbatim copying, provided also that
21the entire resulting derived work is distributed under the terms of a
22permission notice identical to this one.
23
24Permission is granted to copy and distribute translations of this manual
25into another language, under the above conditions for modified versions.
26
27@author {Steve Chamberlain}
28@author {Cygnus Support}
29@author {steve@@cygnus.com}
30@end titlepage
31
32@node Top,,,
33@comment node-name, next, previous, up
34@ifinfo
35This file documents the GNU linker gld.
36@end ifinfo
37
38@c chapter What does a linker do ?
39@c chapter Command Language
40@noindent
41@chapter Overview
42
43
44The @code{gld} command combines a number of object and archive files,
45relocates their data and ties up symbol references. Often the last
46step in building a new compiled program to run is a call to @code{gld}.
47
48The @code{gld} command accepts Linker Command Language files in
49a superset of AT&T's Link Editor Command Language syntax,
50to provide explicit and total control over the linking process.
51
52This version of @code{gld} uses the general purpose @code{bfd} libraries
53to operate on object files. This allows @code{gld} to read and
54write any of the formats supported by @code{bfd}. Different
55formats may be linked together to produce any supported kind of object file.
56
57Supported formats:
58@itemize @bullet
59@item
60Sun3 68k a.out
61@item
62IEEE-695 68k Object Module Format
63@item
64Oasys 68k Binary Relocatable Object File Format
65@item
66Sun4 sparc a.out
67@item
6888k bcs coff
69@item
70i960 coff little endian
71@item
72i960 coff big endian
73@item
74i960 b.out little endian
75@item
76i960 b.out big endian
77@item
78s-records
79@end itemize
80
81When linking similar formats, @code{gld} maintains all debugging
82information.
83
84@chapter Command line options
85
86@example
87 gld [ -Bstatic ] [ -D @var{datasize} ]
88 [ -c @var{commandfile} ]
89 [ -d ] | [ -dc ] | [ -dp ]
90 [ -i ]
91 [ -e @var{entry} ] [ -l @var{arch} ] [ -L @var{searchdir} ] [ -M ]
92 [ -N | -n | -z ] [ -noinhibit-exec ] [ -r ] [ -S ] [ -s ]
93 [ -f @var{fill} ]
94 [ -T @var{textorg} ] [ -Tdata @var{dataorg} ] [ -t ] [ -u @var{sym}]
95 [ -X ] [ -x ]
96 [-o @var{output} ] @var{objfiles}@dots{}
97@end example
98
99Command-line options to GNU @code{gld} may be specified in any order, and
100may be repeated at will. For the most part, repeating an option with a
101different argument will either have no further effect, or override prior
102occurrences (those further to the left on the command line) of an
103option.
104
105The exceptions which may meaningfully be present several times
106are @code{-L}, @code{-l}, and @code{-u}.
107
108@var{objfiles} may follow, precede, or be mixed in with
109command-line options; save that an @var{objfiles} argument may not be
110placed between an option flag and its argument.
111
112Option arguments must follow the option letter without intervening
113whitespace, or be given as separate arguments immediately following the
114option that requires them.
115
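As a small sketch of the two forms, the following invocations should be
equivalent; the file names and the directory @code{/usr/local/lib} are
made up for illustration:

@example
$ gld -o output -L /usr/local/lib /lib/crt0.o hello.o -lc
$ gld -ooutput -L/usr/local/lib /lib/crt0.o hello.o -lc
@end example
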
116@table @code
117@item @var{objfiles}@dots{}
118The object files @var{objfiles} to be linked; at least one must be specified.
119
120@item -Bstatic
121This flag is accepted for command-line compatibility with the SunOS linker,
122but has no effect on @code{gld}.
123
124@item -c @var{commandfile}
125Directs @code{gld} to read linkage commands from the file @var{commandfile}.
126
127@item -D @var{datasize}
128Use this option to specify a target size for the @code{data} segment of
129your linked program. The option is only obeyed if @var{datasize} is
130larger than the natural size of the program's @code{data} segment.
131
132@var{datasize} must be an integer specified in hexadecimal.
133
134@code{gld} will simply increase the size of the @code{data} segment,
135padding the created gap with zeros, and reduce the size of the
136@code{bss} segment to match.
137
138@item -d
139Force @code{gld} to assign space to common symbols
140even if a relocatable output file is specified (@code{-r}).
141
142@item -dc | -dp
143These flags are accepted for command-line compatibility with the SunOS linker,
144but have no effect on @code{gld}.
145
146@item -e @var{entry}
147Use @var{entry} as the explicit symbol for beginning execution of your
148program, rather than the default entry point. If this option is
149not given, the symbol @code{start} is used as the entry address.
150If there is no symbol called @code{start}, then the entry address
151is set to the first address in the first output section
152(usually the @samp{text} section).
153
154@item -f @var{fill}
155Sets the default fill pattern for ``holes'' in the output file to
156the lowest two bytes of the expression specified.
157
158@item -i
159Produce an incremental link (same as option @code{-r}).
160
161@item -l @var{arch}
162Add an archive file @var{arch} to the list of files to link. This
163option may be used any number of times. @code{gld} will search its
164path-list for occurrences of @code{lib@var{arch}.a} for every @var{arch}
165specified.
166
167@c This also has a side effect of using the "c++ demangler" if we happen
168@c to specify -llibg++. Document? pesch@@cygnus.com, 24jan91
169
170@item -L @var{searchdir}
171This command adds path @var{searchdir} to the
172list of paths that @code{gld} will search for archive libraries. You
173may use this option any number of times.
174
175@c Should we make any attempt to list the standard paths searched
176@c without listing? When hacking on a new system I often want to know
177@c this, but this may not be the place... it's not constant across
178@c systems, of course, which is what makes it interesting.
179@c pesch@@cygnus.com, 24jan91.
180
181@item -M
182@itemx -m
183Print (to the standard output file) a link map---diagnostic information
184about where symbols are mapped by @code{gld}, and information on global
185common storage allocation.
186
187@item -N
188Makes the @code{text} and @code{data} sections both readable and writable. If
189the output format supports Unix-style magic numbers, then @code{OMAGIC} is set.
190
191@item -n
192Sets the @code{text} segment to be read-only, and writes @code{NMAGIC}
193if possible.
194
195@item -o @var{output}
196@var{output} is a name for the program produced by @code{gld}; if this
197option is not specified, the name @samp{a.out} is used by default.
198
199@item -r
200Generates relocatable output---i.e., generate an output file that can in
201turn serve as input to @code{gld}. As a side effect, this option also
202sets the output file's magic number to @code{OMAGIC}; see @samp{-N}. If this
203option is not specified, an absolute file is produced.
204
205@item -S
206Omits debugger symbol information (but not all symbols) from the output file.
207
208@item -s
209Omits all symbol information from the output file.
210
211@item -T @var{textorg}
212@itemx -Ttext @var{textorg}
213Use @var{textorg} as the starting address for the @code{text} segment of the
214output file. Both forms of this option are equivalent. The option
215argument must be a hexadecimal integer.
216
217@item -Tdata @var{dataorg}
218Use @var{dataorg} as the starting address for the @code{data} segment of
219the output file. The option argument must be a hexadecimal integer.
220
221@item -t
222Prints names of input files as @code{gld} processes them.
223
224@item -u @var{sym}
225Forces @var{sym} to be entered in the output file as an undefined symbol.
226This may, for example, trigger linking of additional modules from
227standard libraries. @code{-u} may be repeated with different option
228arguments to enter additional undefined symbols. This option is equivalent
229to the @code{EXTERN} linker command.
230
231@item -X
232If @code{-s} or @code{-S} is also specified, delete only local symbols
233beginning with @samp{L}.
234
235@item -z
236@code{-z} sets @code{ZMAGIC}, the default: the @code{text} segment is
237read-only, demand pageable, and shared.
238
239Specifying a relocatable output file (@code{-r}) will also set the magic
240number to @code{OMAGIC}.
241
242See description of @samp{-N}.
243
244
245@end table
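
For illustration, a hypothetical command line combining several of the
options above (the file, library, symbol and directory names are invented
for this sketch):

@example
$ gld -o prog -M -L /usr/local/lib -u _debug_hook /lib/crt0.o main.o -lfoo -lc
@end example

Assuming the descriptions above, this should add @code{/usr/local/lib} to
the search path, look for @code{libfoo.a} and @code{libc.a}, enter
@code{_debug_hook} as an undefined symbol, print a link map on the
standard output, and write the result to @code{prog}.
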
246@chapter Command Language
247
248
249The command language allows explicit control over the linkage process, allowing
250specification of:
251@itemize @bullet
252@item input files
253@item file formats
254@item output file format
255@item addresses of sections
256@item placement of common blocks
257@item and more
258@end itemize
259
260A command file may be supplied to the linker, either explicitly through the
261@code{-c} option, or implicitly as an ordinary file. If the linker opens
262a file which does not have a reasonable object or archive format, it tries
263to read the file as if it were a command file.
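
As an illustrative sketch, assuming a command file named @code{link.cmd}
(a made-up name), either of the following should cause it to be read:

@example
$ gld -c link.cmd -o output a.o b.o
$ gld -o output a.o b.o link.cmd
@end example

In the second form @code{link.cmd} is opened like any other input file
and, not being recognized as an object or archive file, is treated as a
command file.
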
264@section Structure
265To be added
266
267@section Expressions
268The syntax for expressions in the command language is identical to that of
269C expressions, with the following features:
270@itemize @bullet
271@item All expressions are evaluated as integers and
272are of ``long'' or ``unsigned long'' type.
273@item All constants are integers.
274@item All of the C arithmetic operators are provided.
275@item Global variables may be referenced, defined and created.
276@item Built-in functions may be called.
277@end itemize
278
279@section Evaluation of Expressions
280
281The linker uses ``lazy evaluation'' for expressions; it only
282calculates an expression when absolutely necessary. For instance,
283when the linker reads in the command file it has to know the values
284of the start address and the length of the memory regions for linkage
285to continue, so these values are worked out. Other values (such as symbol
286values) are not known or needed until after storage allocation.
287They are evaluated later, when the other
288information, such as the sizes of output sections, is available for use in
289the symbol assignment expression.
290
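The following sketch illustrates the idea; the region and symbol names
are arbitrary. The origin and length of @code{ram} are needed as soon as
the command file is read, whereas the assignment to @code{data_size}
cannot be evaluated until the size of @code{.data} is known, after
storage allocation:

@example
 MEMORY
   @{
   ram : ORIGIN = 0x40000000, LENGTH = 4M
   @}
 SECTIONS
   @{
   .data :
     @{
     *(.data)
     @}
   @}
 data_size = SIZEOF(.data);
@end example
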
291When a linker expression is evaluated and assigned to a variable it is given
292either an absolute or a relocatable type. An absolute expression type
293is one in which the symbol contains the value that it will have in the
294output file; a relocatable expression type is one in which the value
295is expressed as a fixed offset from the base of a section.
296
297The type of the expression is controlled by its position in the script
298file. A symbol assigned within a @code{SECTIONS} specification is
299created relative to the base of the section; a symbol assigned in any
300other place is created as an absolute symbol. Since a symbol created
301within a @code{SECTIONS} specification is relative to the base of the
302section, it will remain relocatable if relocatable output is requested.
303A symbol may be created with an absolute value even when assigned
304within a @code{SECTIONS} specification by using the absolute assignment
305function @code{ABSOLUTE}. For example, to create an absolute symbol
306whose address is the last byte of the output section @code{.data}:
307@example
308.data :
309 @{
310 *(.data)
311 _edata = ABSOLUTE(.) ;
312 @}
313@end example
314
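As a sketch of the difference, the two symbols below are assigned at the
same point; the names are arbitrary. Under the rules above,
@code{_edata_rel} should be created relative to the base of @code{.data},
while @code{_edata_abs} is absolute:

@example
.data :
  @{
    *(.data)
    _edata_rel = . ;
    _edata_abs = ABSOLUTE(.) ;
  @}
@end example
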
315Unless quoted, symbol names start with a letter, underscore, point or
316minus sign and may include any letters, underscores, digits, points,
317and minus signs. Unquoted symbol names must not conflict with any
318keywords. To specify a symbol which contains odd characters or has
319the same name as a keyword, surround it in double quotes:
320@example
321 "SECTION" = 9;
322 "with a space" = "also with a space" + 10;
323@end example
324
325@subsection Integers
326An octal integer is @samp{0} followed by zero or more of the octal
327digits (@samp{01234567}).
328
329A decimal integer starts with a non-zero digit followed by zero or
330more digits (@samp{0123456789}).
331
332A hexadecimal integer is @samp{0x} or @samp{0X} followed by one or
333more hexadecimal digits chosen from @samp{0123456789abcdefABCDEF}.
334
335Integers have the usual values. To denote a negative integer, use
336the unary operator @samp{-} discussed under expressions.
337
338Additionally the suffixes @code{K} and @code{M} may be used to multiply the
339previous constant by 1024 or @code{1024*1024} respectively.
344
345@example
346 _as_decimal = 57005;
347 _as_hex = 0xdead;
348 _as_octal = 0157255;
349
350 _4k_1 = 4K;
351 _4k_2 = 4096;
352 _4k_3 = 0x1000;
353@end example
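
By the same rule, and assuming the @code{M} suffix behaves as described
above, the following three assignments should give identical values (the
symbol names are arbitrary):

@example
 _1m_1 = 1M;
 _1m_2 = 1048576;
 _1m_3 = 0x100000;
@end example
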
354@subsection Operators
355The linker provides the standard C set of arithmetic operators, with
356the standard bindings and precedence levels:
357@example
358
359@end example
360@tex
361
362\vbox{\offinterlineskip
363\hrule
364\halign
365{\vrule#&\hfil#\hfil&\vrule#&\hfil#\hfil&\vrule#&\hfil#\hfil&\vrule#\cr
366height2pt&&&&&\cr
367&Level&& associativity &&Operators&\cr
368height2pt&&&&&\cr
369\noalign{\hrule}
370height2pt&&&&&\cr
371&highest&&&&&\cr
372&1&&left&&$ ! - \sim$&\cr
373height2pt&&&&&\cr
374&2&&left&&* / \%&\cr
375height2pt&&&&&\cr
376&3&&left&&+ -&\cr
377height2pt&&&&&\cr
378&4&&left&&$>> <<$&\cr
379height2pt&&&&&\cr
380&5&&left&&$== != > < <= >=$&\cr
381height2pt&&&&&\cr
382&6&&left&&\&&\cr
383height2pt&&&&&\cr
384&7&&left&&|&\cr
385height2pt&&&&&\cr
386&8&&left&&{\&\&}&\cr
387height2pt&&&&&\cr
388&9&&left&&||&\cr
389height2pt&&&&&\cr
390&10&&right&&? :&\cr
391height2pt&&&&&\cr
392&11&&right&&{\&= += -= *= /=}&\cr
393&lowest&&&&&\cr
394height2pt&&&&&\cr}
395\hrule}
396@end tex
397
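A few sample assignments, worked out by hand for illustration; the symbol
names are arbitrary:

@example
 a = 2 + 3 * 4;
 b = 1 << 2 + 1;
 c = 6 & 3 == 3;
 d = a > 10 ? 1 : 0;
@end example

Assuming the precedence levels follow C as the table states, @code{a}
should evaluate to 14, @code{b} to 8 (since @code{+} binds more tightly
than @code{<<}), @code{c} to 0 (since @code{==} binds more tightly than
@code{&}) and @code{d} to 1.
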
398@section Built in Functions
399The command language provides built in functions for use in
400expressions in linkage scripts.
401@table @bullet
402@item @code{ALIGN(@var{exp})}
403returns the result of the current location counter (@code{dot})
404aligned to the next @var{exp} boundary, where @var{exp} is a power of
405two. This is equivalent to @code{(. + @var{exp} -1) & ~(@var{exp}-1)}.
406As an example, to align the output @code{.data} section to the
407next 0x2000 byte boundary after the preceding section and to set a
408variable within the section to the next 0x8000 boundary after the
409input sections:
410@example
411 .data ALIGN(0x2000) :@{
412 *(.data)
413 variable = ALIGN(0x8000);
414 @}
415@end example
416
417@item @code{ADDR(@var{section name})}
418returns the absolute address of the named section if the section has
419already been bound. In the following example, @code{symbol_1} and
420@code{symbol_2} are assigned identical values:
421@example
422 .output1:
423 @{
424 start_of_output_1 = .;
425 ...
426 @}
427 .output:
428 @{
429 symbol_1 = ADDR(.output1);
430 symbol_2 = start_of_output_1;
431 @}
432@end example
433
434@item @code{SIZEOF(@var{section name})}
435returns the size in bytes of the named section, if the section has
436been allocated. In the following example the @code{symbol_1} and
437@code{symbol_2} are assigned identical values:
438@example
439 .output : @{
440 .start = . ;
441 ...
442 .end = .;
443 @}
444 symbol_1 = .end - .start;
445 symbol_2 = SIZEOF(.output);
446@end example
447
448@item @code{DEFINED(@var{symbol name})}
449Returns 1 if the symbol is in the linker global symbol table and is
450defined, otherwise it returns 0. This example shows the setting of a
451global symbol @code{begin} to the first location in the @code{.text}
452section, only if there is no other symbol
453called @code{begin} already:
454@example
455 .text: @{
456 begin = DEFINED(begin) ? begin : . ;
457 ...
458 @}
459@end example
460@end table
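
As a worked illustration of the @code{ALIGN} equivalence given above, the
two assignments in this sketch should yield the same value (the symbol
names are arbitrary). If the location counter were @code{0x2345} at that
point, both would evaluate to @code{(0x2345 + 0xfff) & ~0xfff}, that is
@code{0x3000}:

@example
 .data :
   @{
   *(.data)
   aligned_1 = ALIGN(0x1000);
   aligned_2 = (. + 0x1000 - 1) & ~(0x1000 - 1);
   @}
@end example
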
461@page
462@section MEMORY Directive
463The linker's default configuration is for all memory to be
464allocatable. This state may be overridden by using the @code{MEMORY}
465directive. The @code{MEMORY} directive describes the location and
466size of blocks of memory in the target. Careful use can describe
467memory regions which may or may not be used by the linker. The linker
468does not shuffle sections to fit into the available regions, but does
469move the requested sections into the correct regions and issue errors
470when the regions become too full. The syntax is:
471
472@example
473 MEMORY
474 @{
475@tex
476 $\bigl\lbrace {\it name_1} ({\it attr_1}):$ ORIGIN = ${\it origin_1},$ LENGTH $= {\it len_1} \bigr\rbrace $
477@end tex
478
479 @}
480@end example
481@table @code
482@item @var{name}
483is a name used internally by the linker to refer to the region. Any
484symbol name may be used. The region names are stored in a separate
485name space, and will not conflict with symbols, filenames or section
486names.
487@item @var{attr}
488is an optional list of attributes, parsed for compatibility with the
489AT&T linker
490but ignored by both the AT&T and the GNU linker.
491@item @var{origin}
492is the start address of the region in physical memory expressed as a
493standard linker expression which must evaluate to a constant before
494memory allocation is performed. The keyword @code{ORIGIN} may be
495abbreviated to @code{org} or @code{o}.
496@item @var{len}
497is the size in bytes of the region as a standard linker expression.
498The keyword @code{LENGTH} may be abbreviated to @code{len} or @code{l}.
499@end table
500
501For example, to specify that memory has two regions available for
502allocation, one starting at 0 for 256K and the other starting at
5030x40000000 for four megabytes:
504
505@example
506 MEMORY
507 @{
508 rom : ORIGIN= 0, LENGTH = 256K
509 ram : ORIGIN= 0x40000000, LENGTH = 4M
510 @}
511
512@end example
513
514If the combined output sections directed to a region are too big for
515the region the linker will emit an error message.
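
For illustration, the same two regions written with the abbreviated
keywords described above; this is a sketch that assumes the abbreviations
are accepted exactly as listed:

@example
 MEMORY
   @{
   rom : org = 0, len = 256K
   ram : o = 0x40000000, l = 4M
   @}
@end example
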
516@page
517@section SECTIONS Directive
518The @code{SECTIONS} directive
519controls exactly where input sections are placed into output sections, their
520order and to which output sections they are allocated.
521
522When no @code{SECTIONS} directives are specified, the default action
523of the linker is to place each input section into an identically named
524output section in the order that the sections appear in the first
525file, and then the order of the files.
526
527The syntax of the @code{SECTIONS} directive is:
528
529@example
530 SECTIONS
531 @{
532@tex
533 $\bigl\lbrace {\it name_n}\bigl[options\bigr]\colon$ $\bigl\lbrace {\it statements_n} \bigr\rbrace \bigl[ = {\it fill expression } \bigr] \bigl[ > mem spec \bigr] \bigr\rbrace $
534@end tex
535 @}
536@end example
537
538@table @code
539@item @var{name}
540controls the name of the output section. In formats which only support
541a limited number of sections, such as @code{a.out}, the name must be
542one of the names supported by the format (in the case of a.out,
543@code{.text}, @code{.data} or @code{.bss}). If the output format
544supports any number of sections, but with numbers and not names (in
545the case of IEEE), the name should be supplied as a quoted numeric
546string. A section name may consist of any sequence of characters, but
547any name which does not conform to the standard @code{gld} symbol name
548syntax must be quoted. To copy sections 1 through 4 from an Oasys file
549into the @code{.text} section of an @code{a.out} file, and sections 13
550and 14 into the @code{.data} section:
551@example
552
553 SECTIONS @{
554 .text :@{
555 *("1" "2" "3" "4")
556 @}
557
558 .data :@{
559 *("13" "14")
560 @}
561 @}
562@end example
563
564@item @var{fill expression}
565If present this
566expression sets the fill value. Any unallocated holes in the current output
567section when written to the output file will
568be filled with the two least significant bytes of the value, repeated as
569necessary.
570@page
571@item @var{options}
572The @var{options} parameter is a list of optional arguments specifying
573attributes of the output section. They may be taken from the following
574list:
575@table @bullet{}
576@item @var{addr expression}
577forces the output section to be loaded at a specified address. The
578address is specified as a standard linker expression. The following
579example generates section @var{output} at location
580@code{0x40000000}:
581@example
582 SECTIONS @{
583 output 0x40000000: @{
584 ...
585 @}
586 @}
587@end example
588Since the built in function @code{ALIGN} references the location
589counter implicitly, a section may be located on a certain boundary by
590using the @code{ALIGN} function in the expression. For example, to
591locate the @code{.data} section on the next 8k boundary after the end
592of the @code{.text} section:
593@example
594 SECTIONS @{
595 .text : @{
596 ...
597 @}
598 .data ALIGN(8K) : @{
599 ...
600 @}
601 @}
602@end example
603@end table
604@item @var{statements}
605is a list of file names, input sections and assignments. These statements control what is placed into the
606output section.
607The syntax of a single @var{statement} is one of:
608@table @bullet
609
610@item @var{symbol} [ = | += | -= | *= | /= ] @var{expression} @code{;}
611
612Global symbols may be created and have their values (addresses)
613altered using the assignment statement. The linker tries to put off
614the evaluation of an assignment until all the terms in the source
615expression are known; for instance the sizes of sections cannot be
616known until after allocation, so assignments dependent upon these are
617not performed until after allocation. Some expressions, such as those
618depending upon the location counter @code{dot} (@samp{.}), must be
619evaluated during allocation. If the result of an expression is
620required, but the value is not available, then an error results. For example:
621@example
622 SECTIONS @{
623 text 9+this_isnt_constant:
624 @{
625 @}
626 @}
627 testscript:21: Non constant expression for initial address
628@end example
629
630@item @code{CREATE_OBJECT_SYMBOLS}
631causes the linker to create a symbol for each input file and place it
632into the specified section, set to the address of the first byte of
633data written from the input file. For instance, with @code{a.out}
634files it is conventional to have a symbol for each input file.
635@example
636 SECTIONS @{
637 .text 0x2020 :
638 @{
639 CREATE_OBJECT_SYMBOLS
640 *(.text)
641 _etext = ALIGN(0x2000);
642 @}
643 @}
644@end example
645Supplied with four object files, @code{a.o}, @code{b.o}, @code{c.o},
646and @code{d.o} a run of
647@code{gld} could create a map:
648@example
649From functions like :
650a.c:
651 afunction() @{ @}
652 int adata=1;
653 int abss;
654
65500000000 A __DYNAMIC
65600004020 B _abss
65700004000 D _adata
65800002020 T _afunction
65900004024 B _bbss
66000004008 D _bdata
66100002038 T _bfunction
66200004028 B _cbss
66300004010 D _cdata
66400002050 T _cfunction
6650000402c B _dbss
66600004018 D _ddata
66700002068 T _dfunction
66800004020 D _edata
66900004030 B _end
67000004000 T _etext
67100002020 t a.o
67200002038 t b.o
67300002050 t c.o
67400002068 t d.o
675
676@end example
677
678@item @var{filename} @code{(} @var{section name list} @code{)}
679This command allocates all the named sections from the input object
680file supplied into the output section at the current point. Sections
681are written in the order they appear in the list so:
682@example
683 SECTIONS @{
684 .text 0x2020 :
685 @{
686 a.o(.data)
687 b.o(.data)
688 *(.text)
689 @}
690 .data :
691 @{
692 *(.data)
693 @}
694 .bss :
695 @{
696 *(.bss)
697 COMMON
698 @}
699 @}
700@end example
701will produce a map:
702@example
703
704 insert here
705@end example
706@item @code{* (} @var{section name list} @code{)}
707This command causes all of the named sections from all input files which
708have not yet been assigned output sections to be assigned to the current output
709section.
710
711@item @var{filename} @code{[COMMON]}
712This allocates all the common symbols from the specified file and places
713them into the current output section.
714
715@item @code{* [COMMON]}
716This allocates all the common symbols from the files which have not
717yet had their common symbols allocated and places them into the current
718output section.
719
720@item @var{filename}
721A filename alone within a @code{SECTIONS} statement will cause all the
722input sections from the file to be placed into the current output
723section at the current location. If the file name has been mentioned
724before with a section name list then only those
725sections which have not yet been allocated are placed.
726
727The following example reads all of the sections from file all.o and
728places them at the start of output section @code{outputa} which starts
729at location @code{0x10000}. All of the data from section @code{.input1} from
730file foo.o is placed next into the same output section. All of
731section @code{.input2} is read from foo.o and placed into output
732section @code{outputb}. Next all of section @code{.input1} is read
733from foo1.o. All of the remaining @code{.input1} and @code{.input2}
734sections from any files are written to output section @code{outputc}.
735
736@example
737 SECTIONS
738 @{
739 outputa 0x10000 :
740 @{
741 all.o
742 foo.o (.input1)
743 @}
744 outputb :
745 @{
746 foo.o (.input2)
747 foo1.o (.input1)
748 @}
749 outputc :
750 @{
751 *(.input1)
752 *(.input2)
753 @}
754 @}
755
756@end example
757@end table
758@end table
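
As a sketch of the common-symbol statements described above (assuming the
bracketed syntax shown in their headings, and with @code{a.o} as a made-up
file name), the common symbols of @code{a.o} are placed first, followed
by those of every other input file:

@example
 SECTIONS @{
   .bss :
     @{
     *(.bss)
     a.o [COMMON]
     * [COMMON]
     @}
   @}
@end example
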
759@section Using the Location Counter
760The special linker variable @code{dot} (@samp{.}) always contains the
761current output location counter. Since the @code{dot} always refers to
762a location in an output section, it must always appear in an
763expression within a @code{SECTIONS} directive. The @code{dot} symbol
764may appear anywhere that an ordinary symbol may appear in an
765expression, but its assignments have a side effect. Assigning a value
766to the @code{dot} symbol will cause the location counter to be moved.
767This may be used to create holes in the output section. The location
768counter may never be moved backwards.
769@example
770 SECTIONS
771 @{
772 output :
773 @{
774 file1(.text)
775 . = . + 1000;
776 file2(.text)
777 . += 1000;
778 file3(.text)
779 . -= 32;
780 file4(.text)
781 @} = 0x1234;
782 @}
783@end example
784In the previous example, @code{file1} is located at the beginning of
785the output section, then there is a 1000 byte gap, filled with 0x1234.
786Then @code{file2} appears, also with a 1000 byte gap following before
787@code{file3} is loaded. Then the first 32 bytes of @code{file4} are
788placed over the last 32 bytes of @code{file3}.
789@section Command Language Syntax
790@section The Entry Point
791The linker chooses the first executable instruction in an output file from a list
792of possibilities, in order:
793@itemize @bullet
794@item
795The value of the symbol provided to the command line with the @code{-e} option, when
796present.
797@item
798The value of the symbol provided in the @code{ENTRY} directive,
799if present.
800@item
801The value of the symbol @code{start}, if present.
802@item
803The value of the symbol @code{_main}, if present.
804@item
805The address of the first byte of the @code{.text} section, if present.
806@item
807The value 0.
808@end itemize
809If the symbol @code{start} is not defined within the set of input
810files to a link, it may be generated by a simple assignment
811expression, for example:
812@example
813 start = 0x2020;
814@end example
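
Alternatively, the entry symbol can be named on the command line. A
sketch, in which @code{begin} stands for any symbol defined in the input
files:

@example
$ gld -e begin -o output /lib/crt0.o hello.o -lc
@end example

Since the @code{-e} option heads the list above, it should take
precedence over any of the other possibilities.
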
815@section Section Attributes
816@section Allocation of Sections into Memory
817@section Defining Symbols
818@chapter Examples of operation
819The simplest case is linking standard Unix object files on a standard
820Unix system supported by the linker. To link a file hello.o:
821@example
822$ gld -o output /lib/crt0.o hello.o -lc
823@end example
824This tells @code{gld} to produce a file called @code{output} after linking
825the file @code{/lib/crt0.o} with @code{hello.o} and the library
826@code{libc.a} which will come from the standard search directories.
827@chapter Partial Linking
828Specifying the @code{-r} option on the command line causes @code{gld} to
829perform a partial link.
830
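A sketch of a two-stage link, using made-up file names: the first command
combines two object files into a relocatable file, which then serves as
input to a final link:

@example
$ gld -r -o combined.o a.o b.o
$ gld -o output /lib/crt0.o combined.o -lc
@end example
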
831
832@chapter BFD
833
834The linker accesses object and archive files using the @code{bfd}
835libraries. These libraries allow the linker to use the same routines
836to operate on object files whatever the object file format.
837
838A different object file format can be supported simply by creating a
839new @code{bfd} back end and adding it to the library.
840
841Formats currently supported:
842@itemize @bullet
843@item
844Sun3 68k a.out
845@item
846IEEE-695 68k Object Module Format
847@item
848Oasys 68k Binary Relocatable Object File Format
849@item
850Sun4 sparc a.out
851@item
85288k bcs coff
853@item
854i960 coff little endian
855@item
856i960 coff big endian
857@item
858i960 b.out little endian
859@item
860i960 b.out big endian
861@end itemize
862
863As with most implementations, @code{bfd} is a compromise between
864several conflicting requirements. The major factor influencing
865@code{bfd} design was efficiency: any time used converting between
866formats is time which would not have been spent had @code{bfd} not
867been involved. This is partly offset by abstraction payback; since
868@code{bfd} simplifies applications and back ends, more time and care
869may be spent optimizing algorithms for greater speed.
870
871One minor artifact of the @code{bfd} solution which the
872user should be aware of is information loss.
873There are two places where useful information can be lost using the
874@code{bfd} mechanism: during conversion and during output.
875
876@section How it works
877When an object file is opened, @code{bfd}
878tries to automatically determine the format of the input object file. A
879descriptor is built in memory with pointers to routines to access
880elements of the object file's data structures.
881
882As different information from the object files is required,
883@code{bfd} reads from different sections of the file and processes
884them. For example a very common operation for the linker is processing
885symbol tables. Each @code{bfd} back end provides a routine for
886converting between the object file's representation of symbols and an
887internal canonical format. When the linker asks for the symbol table
888of an object file, it calls through the memory pointer to the relevant
889@code{bfd} back end routine which reads and converts the table into
890the canonical form. The linker then operates upon the common form. When
891the link is finished and the linker writes the symbol table of the
892output file, another @code{bfd} back end routine is called which takes
893the newly created symbol table and converts it into the output format.
894
895@section Information Leaks
896@table @bullet{}
897@item Information lost during output.
898The output formats supported by @code{bfd} do not provide identical
899facilities, and information which may be described in one form
900has nowhere to go in another format. One example of this would be
901alignment information in @code{b.out}. There is nowhere in an @code{a.out}
902format file to store alignment information on the contained data, so when
903a file is linked from @code{b.out} and an @code{a.out} image is produced,
904alignment information is lost. (Note that in this case the linker has the
905alignment information internally, so the link is performed correctly).
906
907Another example is COFF section names. COFF files may contain an
908unlimited number of sections, each one with a textual section name. If
909the target of the link is a format which does not have many sections
910(e.g. @code{a.out}) or has sections without names (e.g. the Oasys format)
911the link cannot be done simply. It is possible to circumvent this
912problem by describing the desired input section to output section
913mapping with the command language.
914
915@item Information lost during canonicalization.
916The @code{bfd}
917internal canonical form of the external formats is not exhaustive;
918there are structures in input formats for which there is no direct
919representation internally. This means that the @code{bfd} back ends
920cannot maintain all the data richness through the transformation
921from external to internal and back to external formats.
922
923This limitation is only a problem when using the linker to read one
924format and write another. Each @code{bfd} back end is responsible for
925maintaining as much data as possible, and the internal @code{bfd}
926canonical form has structures which are opaque to the @code{bfd} core,
927and exported only to the back ends. When a file is read in one format,
928the canonical form is generated for @code{bfd} and the linker. At the
929same time, the back end saves away any information which may otherwise
930be lost. If the data is then written back to the same back end, the
931back end routine will be able to use the canonical form provided by
932the @code{bfd} core as well as the information it prepared earlier.
933Since there is a great deal of commonality between back ends, this
934mechanism is very useful. There is no information lost when linking
935big endian COFF to little endian COFF, or from @code{a.out} to @code{b.out}. When a
936mixture of formats is linked, information is lost only from the
937files whose format differs from that of the destination.
938@end table
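
The way a back end can keep private data alongside the canonical form may
be sketched as follows. This is an illustration under stated assumptions
and not the real @code{bfd} layout: @code{struct canon_symbol}, its
@code{owner} and @code{private_data} fields, @code{struct toy_private},
@code{toy_type_for_output} and the two back ends are all hypothetical
names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical back end identity; a real one would carry routines too. */
struct backend {
    const char *format_name;
};

/* Hypothetical canonical symbol carrying data the core never interprets. */
struct canon_symbol {
    const char *name;
    unsigned long value;
    const struct backend *owner;   /* back end that read this symbol */
    const void *private_data;      /* opaque to the core, owned by `owner` */
};

/* Private record kept by a toy format that stores a type string which
   the canonical form has no field for. */
struct toy_private {
    const char *type_string;
};

/* Called when `writer` emits `sym`.  Returns the type string if it
   survived, or NULL if the symbol came from a different back end and
   the extra information is therefore unavailable. */
const char *toy_type_for_output(const struct backend *writer,
                                const struct canon_symbol *sym)
{
    assert(writer != NULL && sym != NULL);
    if (sym->owner != writer || sym->private_data == NULL)
        return NULL;
    return ((const struct toy_private *)sym->private_data)->type_string;
}

const struct backend toy_a = { "toy-a" };
const struct backend toy_b = { "toy-b" };
```

A symbol read by @code{toy_a} and written by @code{toy_a} keeps its type
string. The same symbol written by @code{toy_b} yields @code{NULL}, which
mirrors the loss described above when the input and output formats differ.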
939@section Mechanism
940The least information is preserved when there
941is only a small intersection between the information provided by the source
942format, that stored by the canonical format and the information needed
943by the destination format. A brief description of the canonical form
944will help the user appreciate what can be maintained
945between conversions.
946
947@table @bullet
948@item file level
949Information on the target machine architecture, particular implementation
950and format type is stored on a per file basis. Other information includes
951a demand pageable bit and a write protected bit. Note that information
952like Unix magic numbers is not stored here, only the magic number's
953meaning, so a ZMAGIC file would have both the demand pageable bit and
954the write protected text bit set.
955
956The byte order of the target is stored on a per file basis, so that
957both big and little endian object files may be linked together at the
958same time.
959@item section level
960Each section in the input file contains the name of the section, the
961original address in the object file, various flags, size and alignment
962information and pointers into other @code{bfd} data structures.
963@item symbol level
964Each symbol contains a pointer to the object file which originally
965defined it, its name, value and various flag bits. When a symbol
966table is read in, all symbols are relocated to make them relative to
967the base of the section they were defined in, so each symbol points to
968the containing section. Each symbol also has a varying amount of
969hidden data to contain private data for the back end. Since the symbol
970points to the original file, the symbol private data format is
971accessible. Operations may be done to a list of symbols of wildly
972different formats without problems.
973
974Normal global and simple local symbols are maintained on output, so an
975output file, no matter the format, will retain symbols pointing to
976functions, globals, statics and commons. Some symbol information is
977not worth retaining; in @code{a.out}, type information is stored in the
978symbol table as long symbol names. This information would be useless
979to most COFF debuggers and may be thrown away with appropriate command
980line switches. (Note that gdb does support stabs in COFF.)
981
982There is one word of type information within the symbol, so if the
983format supports symbol type information within symbols (e.g. COFF,
984IEEE, Oasys) and the type is simple enough to fit within one word
985(nearly everything but aggregates), the information will be preserved.
986
987@item relocation level
988Each canonical relocation record contains a pointer to the symbol to
989relocate to, the offset of the data to relocate, the section the data
990is in and a pointer to a relocation type descriptor. Relocation is
991performed, in effect, by message passing through the relocation type
992descriptor and symbol pointer. This allows relocations to be performed
993on output data using a relocation method only available in one of the
994input formats. For instance, Oasys provides a byte relocation format.
995A relocation record requesting this relocation type would point
996indirectly to a routine to perform this, so the relocation may be
997performed on a byte being written to a COFF file, even though 68k COFF
998has no such relocation type.
999
1000@item line numbers
1001Line numbers have to be relocated along with the symbol information.
1002Each symbol with an associated list of line number records points to
1003the first record of the list. The head of a line number list consists
1004of a pointer to the symbol, which allows determination of the address of
1005the function whose line numbers are being described. The rest of the
1006list is tuples of offsets into the section and line indexes. Any format
1007which can simply derive this information can pass it without loss
1008between formats (COFF, IEEE and Oasys).
1009@end table
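
The relocation level entry above can be illustrated with a short C sketch.
It is not the real @code{bfd} declaration: the structure and function names
(@code{canon_reloc}, @code{reloc_type}, @code{byte_reloc},
@code{perform_relocs}) are hypothetical, and the byte relocation shown
simply adds the symbol value to one byte, which is an assumption made for
the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical canonical records; field names are illustrative only. */
struct canon_symbol {
    const char *name;
    unsigned long value;
};

struct canon_section {
    const char *name;
    unsigned char *contents;
    size_t size;
};

struct canon_reloc;

/* Relocation type descriptor: carries the routine that knows how to
   perform this kind of relocation, whatever format supplied it. */
struct reloc_type {
    const char *name;
    void (*apply)(const struct canon_reloc *rel);
};

/* Canonical relocation record: symbol, offset, section, type. */
struct canon_reloc {
    const struct canon_symbol *symbol;
    size_t offset;
    struct canon_section *section;
    const struct reloc_type *type;
};

/* A byte relocation as one input format might define it (assumed
   semantics: add the low byte of the symbol value to one byte). */
static void apply_byte(const struct canon_reloc *rel)
{
    unsigned char *p;

    assert(rel->offset < rel->section->size);
    p = &rel->section->contents[rel->offset];
    *p = (unsigned char)(*p + rel->symbol->value);
}

const struct reloc_type byte_reloc = { "byte", apply_byte };

/* The caller dispatches through the descriptor and never needs to know
   which relocation kinds the output format supports natively. */
void perform_relocs(const struct canon_reloc *relocs, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        relocs[i].type->apply(&relocs[i]);
}
```

Because the routine travels with the record, the section contents can be
patched before being written in any output format, including one with no
equivalent relocation type of its own.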
1010
1011
1012@bye
1013
1014