\input texinfo
-@c Copyright 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1998,
-@c 2000
-@c Free Software Foundation, Inc.
+@c Copyright (C) 1988-2018 Free Software Foundation, Inc.
@setfilename bfdint.info
@settitle BFD Internals
@page
@end iftex
+@copying
+This file documents the internals of the BFD library.
+
+Copyright @copyright{} 1988-2018 Free Software Foundation, Inc.
+Contributed by Cygnus Support.
+
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.1 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being ``GNU General Public License'' and ``Funding
+Free Software'', the Front-Cover texts being (a) (see below), and with
+the Back-Cover Texts being (b) (see below). A copy of the license is
+included in the section entitled ``GNU Free Documentation License''.
+
+(a) The FSF's Front-Cover Text is:
+
+ A GNU Manual
+
+(b) The FSF's Back-Cover Text is:
+
+ You have freedom to copy and modify this GNU Manual, like GNU
+ software. Copies published by the Free Software Foundation raise
+ funds for GNU development.
+@end copying
+
@node Top
@top BFD Internals
@raisesections
In some cases there is also implicit information which BFD can not
represent. For example, the MIPS processor distinguishes small and
-large symbols, and requires that all small symbls be within 32K of the
+large symbols, and requires that all small symbols be within 32K of the
GP register. This means that the MIPS assembler must be able to mark
variables as either small or large, and the MIPS linker must know to put
small symbols within range of the GP register. Since BFD can not
Avoid global variables. We ideally want BFD to be fully reentrant, so
that it can be used in multiple threads. All uses of global or static
variables interfere with that. Initialized constant variables are OK,
-and they should be explicitly marked with const. Instead of global
+and they should be explicitly marked with @samp{const}. Instead of global
variables, use data attached to a BFD or to a linker hash table.
@item
ELF.
@item bfd_target_ieee_flavour
IEEE-695.
-@item bfd_target_nlm_flavour
-NLM.
@item bfd_target_oasys_flavour
OASYS.
@item bfd_target_tekhex_flavour
Intel hex format.
@item bfd_target_som_flavour
SOM (used on HP/UX).
+@item bfd_target_verilog_flavour
+Verilog memory hex dump format.
@item bfd_target_os9k_flavour
os9000.
@item bfd_target_versados_flavour
MS-DOS.
@item bfd_target_evax_flavour
openVMS.
+@item bfd_target_mmo_flavour
+Donald Knuth's MMIXware object format.
@end table
@item byteorder
@node BFD target vector swap
@subsection Swapping functions
-Every target vector has fuction pointers used for swapping information
+Every target vector has function pointers used for swapping information
in and out of the target representation. There are two sets of
functions: one for data information, and one for header information.
Each set has three sizes: 64-bit, 32-bit, and 16-bit. Each size has
Get the contents of a section. This is called from
@samp{bfd_get_section_contents}. Most targets set this to
@samp{_bfd_generic_get_section_contents}, which does a @samp{bfd_seek}
-based on the section's @samp{filepos} field and a @samp{bfd_read}. The
+based on the section's @samp{filepos} field and a @samp{bfd_bread}. The
corresponding field in the target vector is named
@samp{_bfd_get_section_contents}.
target vector is named @samp{_bfd_truncate_arname}.
@item _write_armap
-Write out the archive symbol table using calls to @samp{bfd_write}.
+Write out the archive symbol table using calls to @samp{bfd_bwrite}.
This is normally called from the archive @samp{write_contents} routine.
The corresponding field in the target vector is named @samp{write_armap}
(no leading underscore).
@samp{bfd_get_symtab_upper_bound}. The corresponding field in the
target vector is named @samp{_bfd_get_symtab_upper_bound}.
-@item _get_symtab
+@item _canonicalize_symtab
Read in the symbol table. This is called via
@samp{bfd_canonicalize_symtab}. The corresponding field in the target
vector is named @samp{_bfd_canonicalize_symtab}.
@item _bfd_get_relocated_section_contents
Read the contents of a section and apply the relocation information.
-This handles both a final link and a relocateable link; in the latter
+This handles both a final link and a relocatable link; in the latter
case, it adjust the relocation information as well. This is called via
@samp{bfd_get_relocated_section_contents}. Most targets implement it by
calling @samp{bfd_generic_get_relocated_section_contents}.
you configure with the @samp{--enable-maintainer-mode} option. Some
files live in the object directory---the directory from which you run
configure---and some live in the source directory. All files that live
-in the source directory are checked into the CVS repository.
+in the source directory are checked into the git repository.
@table @file
@item bfd.h
Like @file{elfcode.h}, but for functions that are specific to ELF core
files. This is included only by @file{elfcode.h}.
-@item elflink.h
-@cindex @file{elflink.h}
-Like @file{elfcode.h}, but for functions used by the ELF linker. This
-is included only by @file{elfcode.h}.
-
@item elfxx-target.h
@cindex @file{elfxx-target.h}
This file is the source for the generated files @file{elf32-target.h}
Like @file{freebsd.h}, except that there are several files which include
it.
-@item nlm-target.h
-@cindex @file{nlm-target.h}
-Defines the target vector for a standard NLM target.
-
-@item nlmcode.h
-@cindex @file{nlmcode.h}
-Like @file{elfcode.h}, but for NLM targets. This is only included by
-@file{nlm32.c} and @file{nlm64.c}, both of which define the macro
-@samp{ARCH_SIZE} to an appropriate value. There are no 64 bit NLM
-targets anyhow, so this is sort of useless.
-
-@item nlmswap.h
-@cindex @file{nlmswap.h}
-Like @file{coffswap.h}, but for NLM targets. This is included by each
-NLM target, but I think it winds up compiling to the exact same code for
-every target, and as such is fairly useless.
-
@item peicode.h
@cindex @file{peicode.h}
Provides swapping routines and other hooks for PE targets.
section as the value to store. In the IEEE object file format,
relocations may involve arbitrary expressions.
-When doing a relocateable link, the linker may or may not have to do
+When doing a relocatable link, the linker may or may not have to do
anything with a relocation, depending upon the definition of the
relocation. Simple relocations generally do not require any special
action.
@itemize @bullet
@item
Make sure you clearly understand what the contents of the section should
-look like after assembly, after a relocateable link, and after a final
+look like after assembly, after a relocatable link, and after a final
link. Make sure you clearly understand the operations the linker must
-perform during a relocateable link and during a final link.
+perform during a relocatable link and during a final link.
@item
Write a howto structure for the relocation. The howto structure is
able to handle that. You may need to set the @samp{special_function}
field to handle assembly correctly. Be careful to ensure that any code
you write to handle the assembler will also work correctly when doing a
-relocateable link. For example, see @samp{bfd_elf_generic_reloc}.
+relocatable link. For example, see @samp{bfd_elf_generic_reloc}.
@item
Test the assembler. Consider the cases of relocation against an
required handling to the target specific relocation function. In simple
cases this will just involve a call to @samp{_bfd_final_link_relocate}
or @samp{_bfd_relocate_contents}, depending upon the definition of the
-relocation and whether the link is relocateable or not.
+relocation and whether the link is relocatable or not.
@item
Test the linker. Test the case of a final link. If the relocation can
overflow, use a linker script to force an overflow and make sure the
-error is reported correctly. Test a relocateable link, whether the
-symbol is defined or undefined in the relocateable output. For both the
-final and relocateable link, test the case when the symbol is a common
+error is reported correctly. Test a relocatable link, whether the
+symbol is defined or undefined in the relocatable output. For both the
+final and relocatable link, test the case when the symbol is a common
symbol, when the symbol looked like a common symbol but became a defined
symbol, when the symbol is defined in a different object file, and when
the symbol is defined in the same object file.
doing a link in which the output object file format is S-records.
@item
-Using the linker to generate relocateable output in a different object
+Using the linker to generate relocatable output in a different object
file format is impossible in the general case, so you generally don't
-have to worry about that. Linking input files of different object file
-formats together is quite unusual, but if you're really dedicated you
-may want to consider testing this case, both when the output object file
-format is the same as your format, and when it is different.
+have to worry about that. The GNU linker makes sure to stop that from
+happening when an input file in a different format has relocations.
+
+Linking input files of different object file formats together is quite
+unusual, but if you're really dedicated you may want to consider testing
+this case, both when the output object file format is the same as your
+format, and when it is different.
@end itemize
@node BFD relocation codes
of howto structure was being used by a particular format.
The new howto structure would clearly define the relocation behaviour in
-the case of an assembly, a relocateable link, and a final link. At
+the case of an assembly, a relocatable link, and a final link. At
least one special function would be defined as an escape, and it might
make sense to define more.
@subsection ELF sections and segments
The ELF ABI permits a file to have either sections or segments or both.
-Relocateable object files conventionally have only sections.
+Relocatable object files conventionally have only sections.
Executables conventionally have both. Core files conventionally have
only program segments.
@file{elfcode.h} includes functions to swap the ELF structures in and
out of external form, as well as a few more complex functions.
-Linker support is found in @file{elflink.c} and @file{elflink.h}. The
-latter file is compiled twice, for both 32 and 64 bit support. The
+Linker support is found in @file{elflink.c}. The
linker support is only used if the processor specific file defines
@samp{elf_backend_relocate_section}, which is required to relocate the
section contents. If that macro is not defined, the generic linker code
Define either @samp{TARGET_BIG_SYM} or @samp{TARGET_LITTLE_SYM}, or
both, to a unique C name to use for the target vector. This name should
appear in the list of target vectors in @file{targets.c}, and will also
-have to appear in @file{config.bfd} and @file{configure.in}. Define
+have to appear in @file{config.bfd} and @file{configure.ac}. Define
@samp{TARGET_BIG_SYM} for a big-endian processor,
@samp{TARGET_LITTLE_SYM} for a little-endian processor, and define both
for a bi-endian processor.
@item
Define @samp{ELF_MACHINE_CODE} to the magic number which should appear
in the @samp{e_machine} field of the ELF header. As of this writing,
-these magic numbers are assigned by SCO; if you want to get a magic
+these magic numbers are assigned by Caldera; if you want to get a magic
number for a particular processor, try sending a note to
-@email{registry@@sco.com}. In the BFD sources, the magic numbers are
+@email{registry@@caldera.com}. In the BFD sources, the magic numbers are
found in @file{include/elf/common.h}; they have names beginning with
@samp{EM_}.
@item
files (but not in executables, except when using dynamic linking).
However, this is outweighed by the simplicity of addend handling when
using @samp{Rela} relocations. With @samp{Rel} relocations, the addend
-must be stored in the section contents, which makes relocateable links
+must be stored in the section contents, which makes relocatable links
more complex.
For example, consider C code like @code{i = a[1000];} where @samp{a} is
If you are adding support for a RISC chip which uses two or more
instructions to load an address, then the addend may not fit in a single
instruction, and will have to be somehow split among the instructions.
-This makes linking awkward, particularly when doing a relocateable link
+This makes linking awkward, particularly when doing a relocatable link
in which the addend may have to be updated. It can be done---the MIPS
ELF support does it---but it should be avoided when possible.
information which may be needed outside of the BFD code. In particular
it should use the @samp{START_RELOC_NUMBERS}, @samp{RELOC_NUMBER},
@samp{FAKE_RELOC}, @samp{EMPTY_RELOC} and @samp{END_RELOC_NUMBERS}
-macros to create a table mapping the number used to indentify a
+macros to create a table mapping the number used to identify a
relocation to a name describing that relocation.
While not a BFD component, you probably also want to make the binutils
information. In simple cases, this is little more than a loop over the
relocations which computes the value of each relocation and calls
@samp{_bfd_final_link_relocate}. The function must check for a
-relocateable link, and in that case normally needs to do nothing other
+relocatable link, and in that case normally needs to do nothing other
than adjust the addend for relocations against a section symbol.
The complex cases generally have to do with dynamic linker support. GOT
The processor function hooks and constants are ad hoc and need better
documentation.
-When a linker script uses @samp{SIZEOF_HEADERS}, the ELF backend must
-guess at the number of program segments which will be required, in
-@samp{get_program_header_size}. This is because the linker calls
-@samp{bfd_sizeof_headers} before it knows all the section addresses and
-sizes. The ELF backend may later discover, when creating program
-segments, that more program segments are required. This is currently
-reported as an error in @samp{assign_file_positions_for_segments}.
-
-In practice this makes it difficult to use @samp{SIZEOF_HEADERS} except
-with a carefully defined linker script. Unfortunately,
-@samp{SIZEOF_HEADERS} is required for fast program loading on a native
-system, since it permits the initial code section to appear on the same
-page as the program segments, saving a page read when the program starts
-running. Fortunately, native systems permit careful definition of the
-linker script. Still, ideally it would be possible to use relaxation to
-compute the number of program segments.
-
@node BFD glossary
@section BFD glossary
@cindex glossary for bfd
set of functions which appear in a particular target vector.
@item BFD
-The BFD library itself. Also, each object file, archive, or exectable
+The BFD library itself. Also, each object file, archive, or executable
opened by the BFD library has the type @samp{bfd *}, and is sometimes
referred to as a bfd.
Load Memory Address. This is the address at which a section will be
loaded. Compare with VMA, below.
-@item NLM
-NetWare Loadable Module. Used to describe the format of an object which
-be loaded into NetWare, which is some kind of PC based network server
-program.
-
@item object file
A binary file including machine instructions, symbols, and relocation
information. Normally produced by an assembler.