Commit | Line | Data |
---|---|---|
252b5132 | 1 | \input texinfo |
6f2750fe | 2 | @c Copyright (C) 1991-2016 Free Software Foundation, Inc. |
252b5132 RH |
3 | @setfilename internals.info |
4 | @node Top | |
5 | @top Assembler Internals | |
6 | @raisesections | |
7 | @cindex internals | |
8 | ||
9 | This chapter describes the internals of the assembler. It is incomplete, but | |
10 | it may help a bit. | |
11 | ||
43da67e8 | 12 | This chapter is not updated regularly, and it may be out of date. |
252b5132 RH |
13 | |
14 | @menu | |
252b5132 RH |
15 | * Data types:: Data types |
16 | * GAS processing:: What GAS does when it runs | |
17 | * Porting GAS:: Porting GAS | |
18 | * Relaxation:: Relaxation | |
19 | * Broken words:: Broken words | |
20 | * Internal functions:: Internal functions | |
21 | * Test suite:: Test suite | |
22 | @end menu | |
23 | ||
252b5132 RH |
24 | @node Data types |
25 | @section Data types | |
26 | @cindex internals, data types | |
27 | ||
28 | This section describes some fundamental GAS data types. | |
29 | ||
30 | @menu | |
31 | * Symbols:: The symbolS structure | |
32 | * Expressions:: The expressionS structure | |
33 | * Fixups:: The fixS structure | |
34 | * Frags:: The fragS structure | |
35 | @end menu | |
36 | ||
37 | @node Symbols | |
38 | @subsection Symbols | |
39 | @cindex internals, symbols | |
40 | @cindex symbols, internal | |
41 | @cindex symbolS structure | |
42 | ||
b4013713 ILT |
43 | The definition for the symbol structure, @code{symbolS}, is located in |
44 | @file{struc-symbol.h}. | |
45 | ||
46 | In general, the fields of this structure may not be referred to directly. | |
47 | Instead, you must use one of the accessor functions defined in @file{symbol.h}. | |
48 | These accessor functions should work for any GAS version. | |
49 | ||
50 | Symbol structures contain the following fields: | |
252b5132 RH |
51 | |
52 | @table @code | |
53 | @item sy_value | |
54 | This is an @code{expressionS} that describes the value of the symbol. It might | |
55 | refer to one or more other symbols; if so, its true value may not be known | |
6386f3a7 AM |
56 | until @code{resolve_symbol_value} is called with @var{finalize_syms} non-zero |
57 | in @code{write_object_file}. | |
252b5132 RH |
58 | |
59 | The expression is often simply a constant. Before @code{resolve_symbol_value} | |
6386f3a7 AM |
60 | is called with @var{finalize_syms} set, the value is the offset from the frag |
61 | (@pxref{Frags}). Afterward, the frag address has been added in. | |
252b5132 RH |
62 | |
63 | @item sy_resolved | |
64 | This field is non-zero if the symbol's value has been completely resolved. It | |
65 | is used during the final pass over the symbol table. | |
66 | ||
67 | @item sy_resolving | |
68 | This field is used to detect loops while resolving the symbol's value. | |
69 | ||
70 | @item sy_used_in_reloc | |
71 | This field is non-zero if the symbol is used by a relocation entry. If a local | |
72 | symbol is used in a relocation entry, it must be possible to redirect those | |
73 | relocations to other symbols, or this symbol cannot be removed from the final | |
74 | symbol list. | |
75 | ||
76 | @item sy_next | |
77 | @itemx sy_previous | |
829c3ed3 AM |
78 | These pointers to other @code{symbolS} structures describe a doubly |
79 | linked list. These fields should be accessed with | |
252b5132 RH |
80 | the @code{symbol_next} and @code{symbol_previous} macros. |
81 | ||
82 | @item sy_frag | |
83 | This points to the frag (@pxref{Frags}) that this symbol is attached to. | |
84 | ||
85 | @item sy_used | |
86 | Whether the symbol is used as an operand or in an expression. Note: Not all of | |
87 | the backends keep this information accurate; backends which use this bit are | |
88 | responsible for setting it when a symbol is used in backend routines. | |
89 | ||
90 | @item sy_mri_common | |
91 | Whether the symbol is an MRI common symbol created by the @code{COMMON} | |
92 | pseudo-op when assembling in MRI mode. | |
93 | ||
92757bc9 JB |
94 | @item sy_volatile |
95 | Whether the symbol can be re-defined. | |
96 | ||
97 | @item sy_forward_ref | |
98 | Whether the symbol's value must only be evaluated upon use. | |
99 | ||
06e77878 AO |
100 | @item sy_weakrefr |
101 | Whether the symbol is a @code{weakref} alias to another symbol. | |
102 | ||
103 | @item sy_weakrefd | |
104 | Whether the symbol is or was referenced by one or more @code{weakref} aliases, | |
105 | and has not had any direct references. | |
106 | ||
252b5132 | 107 | @item bsym |
829c3ed3 | 108 | This points to the BFD @code{asymbol} that |
252b5132 RH |
109 | will be used in writing the object file. |
110 | ||
252b5132 RH |
111 | @item sy_obj |
112 | This format-specific data is of type @code{OBJ_SYMFIELD_TYPE}. If no macro by | |
113 | that name is defined in @file{obj-format.h}, this field is not defined. | |
114 | ||
115 | @item sy_tc | |
116 | This processor-specific data is of type @code{TC_SYMFIELD_TYPE}. If no macro | |
117 | by that name is defined in @file{targ-cpu.h}, this field is not defined. | |
118 | ||
252b5132 RH |
119 | @end table |
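The roles of @code{sy_resolved} and @code{sy_resolving} can be illustrated with a simplified model. The sketch below is not GAS code: @code{toy_symbol} and @code{toy_resolve} are hypothetical names, and a toy symbol's value is assumed to be a constant plus at most one other symbol.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for symbolS: the value is a
   constant plus, optionally, the value of one other symbol.  */
struct toy_symbol
{
  long addend;                 /* constant part of the value */
  struct toy_symbol *depends;  /* symbol the value refers to, or NULL */
  unsigned resolved : 1;       /* plays the role of sy_resolved */
  unsigned resolving : 1;      /* plays the role of sy_resolving */
};

/* Resolve SYM and store its final value in *VALUE.  Return 0 on
   success, or -1 if the definition refers back to itself.  */
int
toy_resolve (struct toy_symbol *sym, long *value)
{
  long base = 0;

  if (sym->resolved)
    {
      *value = sym->addend;
      return 0;
    }

  /* Reaching a symbol that is already being resolved means there is
     a loop in the definitions.  */
  if (sym->resolving)
    return -1;

  sym->resolving = 1;
  if (sym->depends != NULL && toy_resolve (sym->depends, &base) != 0)
    {
      sym->resolving = 0;
      return -1;
    }
  sym->resolving = 0;

  /* Fold the dependency into a plain constant and remember that.  */
  sym->addend += base;
  sym->depends = NULL;
  sym->resolved = 1;
  *value = sym->addend;
  return 0;
}
```

In this model, a symbol defined as another symbol plus 10 resolves to that symbol's value plus 10, while two symbols defined in terms of each other make @code{toy_resolve} return -1 instead of recursing forever.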
120 | ||
b4013713 ILT |
121 | Here is a description of the accessor functions. These should be used rather |
122 | than referring to the fields of @code{symbolS} directly. | |
252b5132 RH |
123 | |
124 | @table @code | |
125 | @item S_SET_VALUE | |
126 | @cindex S_SET_VALUE | |
127 | Set the symbol's value. | |
128 | ||
129 | @item S_GET_VALUE | |
130 | @cindex S_GET_VALUE | |
131 | Get the symbol's value. This will cause @code{resolve_symbol_value} to be | |
6386f3a7 | 132 | called if necessary. |
252b5132 RH |
133 | |
134 | @item S_SET_SEGMENT | |
135 | @cindex S_SET_SEGMENT | |
136 | Set the section of the symbol. | |
137 | ||
138 | @item S_GET_SEGMENT | |
139 | @cindex S_GET_SEGMENT | |
140 | Get the symbol's section. | |
141 | ||
142 | @item S_GET_NAME | |
143 | @cindex S_GET_NAME | |
144 | Get the name of the symbol. | |
145 | ||
146 | @item S_SET_NAME | |
147 | @cindex S_SET_NAME | |
148 | Set the name of the symbol. | |
149 | ||
150 | @item S_IS_EXTERNAL | |
151 | @cindex S_IS_EXTERNAL | |
152 | Return non-zero if the symbol is externally visible. | |
153 | ||
154 | @item S_IS_EXTERN | |
155 | @cindex S_IS_EXTERN | |
156 | A synonym for @code{S_IS_EXTERNAL}. Don't use it. | |
157 | ||
158 | @item S_IS_WEAK | |
159 | @cindex S_IS_WEAK | |
06e77878 AO |
160 | Return non-zero if the symbol is weak, or if it is a @code{weakref} alias or |
161 | symbol that has not been strongly referenced. | |
162 | ||
163 | @item S_IS_WEAKREFR | |
164 | @cindex S_IS_WEAKREFR | |
165 | Return non-zero if the symbol is a @code{weakref} alias. | |
166 | ||
167 | @item S_IS_WEAKREFD | |
168 | @cindex S_IS_WEAKREFD | |
169 | Return non-zero if the symbol was aliased by a @code{weakref} alias and has not | |
170 | had any strong references. | |
252b5132 | 171 | |
92757bc9 JB |
172 | @item S_IS_VOLATILE |
173 | @cindex S_IS_VOLATILE | |
174 | Return non-zero if the symbol may be re-defined. Such symbols get created by | |
175 | the @code{=} operator, @code{equ}, or @code{set}. | |
176 | ||
177 | @item S_IS_FORWARD_REF | |
178 | @cindex S_IS_FORWARD_REF | |
179 | Return non-zero if the symbol is a forward reference, that is, its value must | |
180 | only be determined upon use. | |
181 | ||
252b5132 RH |
182 | @item S_IS_COMMON |
183 | @cindex S_IS_COMMON | |
184 | Return non-zero if this is a common symbol. Common symbols are sometimes | |
185 | represented as undefined symbols with a value, in which case this function will | |
186 | not be reliable. | |
187 | ||
188 | @item S_IS_DEFINED | |
189 | @cindex S_IS_DEFINED | |
190 | Return non-zero if this symbol is defined. This function is not reliable when | |
191 | called on a common symbol. | |
192 | ||
193 | @item S_IS_DEBUG | |
194 | @cindex S_IS_DEBUG | |
195 | Return non-zero if this is a debugging symbol. | |
196 | ||
197 | @item S_IS_LOCAL | |
198 | @cindex S_IS_LOCAL | |
199 | Return non-zero if this is a local assembler symbol which should not be | |
200 | included in the final symbol table. Note that this is not the opposite of | |
201 | @code{S_IS_EXTERNAL}. The @samp{-L} assembler option affects the return value | |
202 | of this function. | |
203 | ||
204 | @item S_SET_EXTERNAL | |
205 | @cindex S_SET_EXTERNAL | |
206 | Mark the symbol as externally visible. | |
207 | ||
208 | @item S_CLEAR_EXTERNAL | |
209 | @cindex S_CLEAR_EXTERNAL | |
210 | Mark the symbol as not externally visible. | |
211 | ||
212 | @item S_SET_WEAK | |
213 | @cindex S_SET_WEAK | |
214 | Mark the symbol as weak. | |
215 | ||
06e77878 AO |
216 | @item S_SET_WEAKREFR |
217 | @cindex S_SET_WEAKREFR | |
218 | Mark the symbol as the referrer in a @code{weakref} directive. The symbol it | |
219 | aliases must have been set to the value expression before this point. If the | |
220 | alias has already been used, the symbol is marked as used too. | |
221 | ||
222 | @item S_CLEAR_WEAKREFR | |
223 | @cindex S_CLEAR_WEAKREFR | |
224 | Clear the @code{weakref} alias status of a symbol. This is implicitly called | |
225 | whenever a symbol is defined or set to a new expression. | |
226 | ||
227 | @item S_SET_WEAKREFD | |
228 | @cindex S_SET_WEAKREFD | |
229 | Mark the symbol as the referred symbol in a @code{weakref} directive. | |
230 | Implicitly marks the symbol as weak, but see below. It should only be called | |
231 | if the referenced symbol has just been added to the symbol table. | |
232 | ||
233 | @item S_CLEAR_WEAKREFD | |
234 | @cindex S_CLEAR_WEAKREFD | |
235 | Clear the @code{weakref} aliased status of a symbol. This is implicitly called | |
236 | whenever the symbol is looked up, as part of a direct reference or a | |
237 | definition, but not as part of a @code{weakref} directive. | |
238 | ||
92757bc9 JB |
239 | @item S_SET_VOLATILE |
240 | @cindex S_SET_VOLATILE | |
241 | Indicate that the symbol may be re-defined. | |
242 | ||
243 | @item S_CLEAR_VOLATILE | |
244 | @cindex S_CLEAR_VOLATILE | |
245 | Indicate that the symbol may no longer be re-defined. | |
246 | ||
247 | @item S_SET_FORWARD_REF | |
248 | @cindex S_SET_FORWARD_REF | |
249 | Indicate that the symbol is a forward reference, that is, its value must only | |
250 | be determined upon use. | |
251 | ||
252b5132 | 252 | @item S_GET_TYPE |
1f9bb1ca AS |
253 | @itemx S_GET_DESC |
254 | @itemx S_GET_OTHER | |
252b5132 RH |
255 | @cindex S_GET_TYPE |
256 | @cindex S_GET_DESC | |
257 | @cindex S_GET_OTHER | |
258 | Get the @code{type}, @code{desc}, and @code{other} fields of the symbol. These | |
259 | are only defined for object file formats for which they make sense (primarily | |
260 | a.out). | |
261 | ||
262 | @item S_SET_TYPE | |
1f9bb1ca AS |
263 | @itemx S_SET_DESC |
264 | @itemx S_SET_OTHER | |
252b5132 RH |
265 | @cindex S_SET_TYPE |
266 | @cindex S_SET_DESC | |
267 | @cindex S_SET_OTHER | |
268 | Set the @code{type}, @code{desc}, and @code{other} fields of the symbol. These | |
269 | are only defined for object file formats for which they make sense (primarily | |
270 | a.out). | |
271 | ||
272 | @item S_GET_SIZE | |
273 | @cindex S_GET_SIZE | |
274 | Get the size of a symbol. This is only defined for object file formats for | |
275 | which it makes sense (primarily ELF). | |
276 | ||
277 | @item S_SET_SIZE | |
278 | @cindex S_SET_SIZE | |
279 | Set the size of a symbol. This is only defined for object file formats for | |
280 | which it makes sense (primarily ELF). | |
b4013713 ILT |
281 | |
282 | @item symbol_get_value_expression | |
283 | @cindex symbol_get_value_expression | |
284 | Get a pointer to an @code{expressionS} structure which represents the value of | |
285 | the symbol as an expression. | |
286 | ||
287 | @item symbol_set_value_expression | |
288 | @cindex symbol_set_value_expression | |
289 | Set the value of a symbol to an expression. | |
290 | ||
291 | @item symbol_set_frag | |
292 | @cindex symbol_set_frag | |
293 | Set the frag where a symbol is defined. | |
294 | ||
295 | @item symbol_get_frag | |
296 | @cindex symbol_get_frag | |
297 | Get the frag where a symbol is defined. | |
298 | ||
299 | @item symbol_mark_used | |
300 | @cindex symbol_mark_used | |
301 | Mark a symbol as having been used in an expression. | |
302 | ||
303 | @item symbol_clear_used | |
304 | @cindex symbol_clear_used | |
305 | Clear the mark indicating that a symbol was used in an expression. | |
306 | ||
307 | @item symbol_used_p | |
308 | @cindex symbol_used_p | |
309 | Return whether a symbol was used in an expression. | |
310 | ||
311 | @item symbol_mark_used_in_reloc | |
312 | @cindex symbol_mark_used_in_reloc | |
313 | Mark a symbol as having been used by a relocation. | |
314 | ||
315 | @item symbol_clear_used_in_reloc | |
316 | @cindex symbol_clear_used_in_reloc | |
317 | Clear the mark indicating that a symbol was used in a relocation. | |
318 | ||
319 | @item symbol_used_in_reloc_p | |
320 | @cindex symbol_used_in_reloc_p | |
321 | Return whether a symbol was used in a relocation. | |
322 | ||
323 | @item symbol_mark_mri_common | |
324 | @cindex symbol_mark_mri_common | |
325 | Mark a symbol as an MRI common symbol. | |
326 | ||
327 | @item symbol_clear_mri_common | |
328 | @cindex symbol_clear_mri_common | |
329 | Clear the mark indicating that a symbol is an MRI common symbol. | |
330 | ||
331 | @item symbol_mri_common_p | |
332 | @cindex symbol_mri_common_p | |
333 | Return whether a symbol is an MRI common symbol. | |
334 | ||
335 | @item symbol_mark_written | |
336 | @cindex symbol_mark_written | |
337 | Mark a symbol as having been written. | |
338 | ||
339 | @item symbol_clear_written | |
340 | @cindex symbol_clear_written | |
341 | Clear the mark indicating that a symbol was written. | |
342 | ||
343 | @item symbol_written_p | |
344 | @cindex symbol_written_p | |
345 | Return whether a symbol was written. | |
346 | ||
347 | @item symbol_mark_resolved | |
348 | @cindex symbol_mark_resolved | |
349 | Mark a symbol as having been resolved. | |
350 | ||
351 | @item symbol_resolved_p | |
352 | @cindex symbol_resolved_p | |
353 | Return whether a symbol has been resolved. | |
354 | ||
355 | @item symbol_section_p | |
356 | @cindex symbol_section_p | |
357 | Return whether a symbol is a section symbol. | |
358 | ||
359 | @item symbol_equated_p | |
360 | @cindex symbol_equated_p | |
361 | Return whether a symbol is equated to another symbol. | |
362 | ||
363 | @item symbol_constant_p | |
364 | @cindex symbol_constant_p | |
365 | Return whether a symbol has a constant value, including being an offset within | |
366 | some frag. | |
367 | ||
368 | @item symbol_get_bfdsym | |
369 | @cindex symbol_get_bfdsym | |
370 | Return the BFD symbol associated with a symbol. | |
371 | ||
372 | @item symbol_set_bfdsym | |
373 | @cindex symbol_set_bfdsym | |
374 | Set the BFD symbol associated with a symbol. | |
375 | ||
376 | @item symbol_get_obj | |
377 | @cindex symbol_get_obj | |
378 | Return a pointer to the @code{OBJ_SYMFIELD_TYPE} field of a symbol. | |
379 | ||
380 | @item symbol_set_obj | |
381 | @cindex symbol_set_obj | |
382 | Set the @code{OBJ_SYMFIELD_TYPE} field of a symbol. | |
383 | ||
384 | @item symbol_get_tc | |
385 | @cindex symbol_get_tc | |
386 | Return a pointer to the @code{TC_SYMFIELD_TYPE} field of a symbol. | |
387 | ||
388 | @item symbol_set_tc | |
389 | @cindex symbol_set_tc | |
390 | Set the @code{TC_SYMFIELD_TYPE} field of a symbol. | |
391 | ||
252b5132 RH |
392 | @end table |
393 | ||
829c3ed3 | 394 | GAS attempts to store local |
b4013713 ILT |
395 | symbols--symbols which will not be written to the output file--using a |
396 | different structure, @code{struct local_symbol}. This structure can only | |
397 | represent symbols whose value is an offset within a frag. | |
398 | ||
399 | Code outside of the symbol handler will always deal with @code{symbolS} | |
400 | structures and use the accessor functions. The accessor functions correctly | |
401 | deal with local symbols. @code{struct local_symbol} is much smaller than | |
402 | @code{symbolS} (which also automatically creates a BFD @code{asymbol} | |
403 | structure), so this saves space when assembling large files. | |
404 | ||
405 | The first field of @code{symbolS} is @code{bsym}, the pointer to the BFD | |
406 | symbol. The first field of @code{struct local_symbol} is a pointer which is | |
407 | always set to NULL. This is how the symbol accessor functions can distinguish | |
408 | local symbols from ordinary symbols. The symbol accessor functions | |
409 | automatically convert a local symbol into an ordinary symbol when necessary. | |
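The NULL-first-field convention described above can be sketched in isolation. The structures below are hypothetical stand-ins, not the real definitions: the sketch assumes only that both layouts begin with a pointer of the same type, and @code{toy_is_local} is an illustrative name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the BFD symbol.  */
struct toy_bfd_symbol
{
  const char *name;
};

/* Stand-in for symbolS: the first field points to a BFD symbol.  */
struct toy_full_symbol
{
  struct toy_bfd_symbol *bsym;
  long value;
};

/* Stand-in for struct local_symbol: the first field is always NULL.  */
struct toy_local_symbol
{
  struct toy_bfd_symbol *always_null;
  const char *name;
  long frag_offset;
};

/* Both layouts begin with a pointer of the same type, so an accessor
   can read that first pointer to decide which layout it was given.  */
int
toy_is_local (const void *sym)
{
  struct toy_bfd_symbol *first;

  memcpy (&first, sym, sizeof first);
  return first == NULL;
}
```

An accessor built this way can test the symbol first and then take either the compact path or the full path, which is why callers never need to know which structure they hold.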
410 | ||
252b5132 RH |
411 | @node Expressions |
412 | @subsection Expressions | |
413 | @cindex internals, expressions | |
414 | @cindex expressions, internal | |
415 | @cindex expressionS structure | |
416 | ||
417 | Expressions are stored in an @code{expressionS} structure. The structure is | |
418 | defined in @file{expr.h}. | |
419 | ||
420 | @cindex expression | |
421 | The macro @code{expression} will create an @code{expressionS} structure based | |
422 | on the text found at the global variable @code{input_line_pointer}. | |
423 | ||
424 | @cindex make_expr_symbol | |
425 | @cindex expr_symbol_where | |
426 | A single @code{expressionS} structure can represent a single operation. | |
427 | Complex expressions are formed by creating @dfn{expression symbols} and | |
428 | combining them in @code{expressionS} structures. An expression symbol is | |
429 | created by calling @code{make_expr_symbol}. An expression symbol should | |
430 | naturally never appear in a symbol table, and the implementation of | |
431 | @code{S_IS_LOCAL} (@pxref{Symbols}) reflects that. The function | |
432 | @code{expr_symbol_where} returns non-zero if a symbol is an expression symbol, | |
433 | and also returns the file and line for the expression which caused it to be | |
434 | created. | |
435 | ||
436 | The @code{expressionS} structure has two symbol fields, a number field, an | |
437 | operator field, and a field indicating whether the number is unsigned. | |
438 | ||
439 | The operator field is of type @code{operatorT}, and describes how to interpret | |
440 | the other fields; see the definition in @file{expr.h} for the possibilities. | |
441 | ||
442 | An @code{operatorT} value of @code{O_big} indicates either a floating point | |
443 | number, stored in the global variable @code{generic_floating_point_number}, or | |
623aa224 | 444 | an integer too large to store in an @code{offsetT} type, stored in the global |
252b5132 RH |
445 | array @code{generic_bignum}. This rather inflexible approach makes it |
446 | impossible to use floating point numbers or large integers in complex | |
447 | expressions. | |
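As an illustration of how single-operation structures combine through expression symbols, here is a small model. It does not use the real @code{expressionS} layout or @code{operatorT} values: every name is hypothetical, and adding the number field to the result of each operation is a convention chosen for this sketch.

```c
#include <stddef.h>

/* Hypothetical operator codes for this sketch.  */
enum toy_op { TOY_CONSTANT, TOY_SYMBOL, TOY_ADD, TOY_MULTIPLY };

struct toy_expr_symbol;

/* One operation: two symbol fields, a number, and an operator.  */
struct toy_expr
{
  enum toy_op op;
  struct toy_expr_symbol *left;
  struct toy_expr_symbol *right;
  long number;
};

/* A symbol whose value is itself an expression.  A symbol created
   only to hold a subexpression plays the role of an expression
   symbol.  */
struct toy_expr_symbol
{
  struct toy_expr value;
};

/* Evaluate E by recursing through the symbols it refers to.  */
long
toy_eval (const struct toy_expr *e)
{
  switch (e->op)
    {
    case TOY_CONSTANT:
      return e->number;
    case TOY_SYMBOL:
      return toy_eval (&e->left->value) + e->number;
    case TOY_ADD:
      return (toy_eval (&e->left->value)
              + toy_eval (&e->right->value) + e->number);
    case TOY_MULTIPLY:
      return (toy_eval (&e->left->value)
              * toy_eval (&e->right->value) + e->number);
    }
  return 0;
}
```

To represent (a + b) * c in this model, the sum is stored as the value of an extra symbol, and the product is a second structure whose left operand is that symbol.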
448 | ||
449 | @node Fixups | |
450 | @subsection Fixups | |
451 | @cindex internals, fixups | |
452 | @cindex fixups | |
453 | @cindex fixS structure | |
454 | ||
455 | A @dfn{fixup} is basically anything which cannot be resolved in the first | |
456 | pass. Sometimes a fixup can be resolved by the end of the assembly; if not, | |
457 | the fixup becomes a relocation entry in the object file. | |
458 | ||
459 | @cindex fix_new | |
460 | @cindex fix_new_exp | |
461 | A fixup is created by a call to @code{fix_new} or @code{fix_new_exp}. Both | |
462 | take a frag (@pxref{Frags}), a position within the frag, a size, an indication | |
829c3ed3 AM |
463 | of whether the fixup is PC relative, and a type. |
464 | The type is nominally a @code{bfd_reloc_code_real_type}, but several | |
252b5132 RH |
465 | targets use other type codes to represent fixups that cannot be described as |
466 | relocations. | |
467 | ||
468 | The @code{fixS} structure has a number of fields, several of which are obsolete | |
469 | or are only used by a particular target. The important fields are: | |
470 | ||
471 | @table @code | |
472 | @item fx_frag | |
473 | The frag (@pxref{Frags}) this fixup is in. | |
474 | ||
475 | @item fx_where | |
476 | The location within the frag where the fixup occurs. | |
477 | ||
478 | @item fx_addsy | |
479 | The symbol this fixup is against. Typically, the value of this symbol is added | |
480 | into the object contents. This may be NULL. | |
481 | ||
482 | @item fx_subsy | |
483 | The value of this symbol is subtracted from the object contents. This is | |
484 | normally NULL. | |
485 | ||
486 | @item fx_offset | |
487 | A number which is added into the fixup. | |
488 | ||
489 | @item fx_addnumber | |
490 | Some CPU backends use this field to convey information between | |
55cf6793 | 491 | @code{md_apply_fix} and @code{tc_gen_reloc}. The machine independent code does |
252b5132 RH |
492 | not use it. |
493 | ||
494 | @item fx_next | |
495 | The next fixup in the section. | |
496 | ||
497 | @item fx_r_type | |
829c3ed3 | 498 | The type of the fixup. |
252b5132 RH |
499 | |
500 | @item fx_size | |
501 | The size of the fixup. This is mostly used for error checking. | |
502 | ||
503 | @item fx_pcrel | |
504 | Whether the fixup is PC relative. | |
505 | ||
506 | @item fx_done | |
507 | Non-zero if the fixup has been applied, and no relocation entry needs to be | |
508 | generated. | |
509 | ||
510 | @item fx_file | |
511 | @itemx fx_line | |
512 | The file and line where the fixup was created. | |
513 | ||
514 | @item tc_fix_data | |
515 | This has the type @code{TC_FIX_TYPE}, and is only defined if the target defines | |
516 | that macro. | |
517 | @end table | |
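The way these fields combine can be sketched with a toy model. This is not how GAS itself applies fixups, which also involves sections and target hooks; @code{toy_fix}, @code{toy_sym} and @code{toy_apply_fix} are hypothetical, and the PC-relative convention used here (relative to the fixup location) is an assumption of the sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a symbol: a value plus a flag saying
   whether the value is known at assembly time.  */
struct toy_sym
{
  long value;
  int known;
};

/* Hypothetical, simplified stand-in for fixS.  */
struct toy_fix
{
  long where;              /* location being patched, like fx_where */
  struct toy_sym *addsy;   /* like fx_addsy, may be NULL */
  struct toy_sym *subsy;   /* like fx_subsy, normally NULL */
  long offset;             /* like fx_offset */
  int pcrel;               /* like fx_pcrel */
  int done;                /* like fx_done */
};

/* Try to compute the value to store for FIX.  On success set
   FIX->done and return the value; otherwise leave FIX->done clear,
   meaning a relocation entry would be needed, and return 0.  */
long
toy_apply_fix (struct toy_fix *fix)
{
  long value = fix->offset;

  fix->done = 0;
  if ((fix->addsy != NULL && !fix->addsy->known)
      || (fix->subsy != NULL && !fix->subsy->known))
    return 0;

  if (fix->addsy != NULL)
    value += fix->addsy->value;
  if (fix->subsy != NULL)
    value -= fix->subsy->value;
  if (fix->pcrel)
    value -= fix->where;   /* toy convention: relative to the fixup */

  fix->done = 1;
  return value;
}
```

The point of the model is the split in outcomes: a fixup whose symbols are all known is applied and marked done, while one that refers to an unknown symbol is left for a relocation entry.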
518 | ||
519 | @node Frags | |
520 | @subsection Frags | |
521 | @cindex internals, frags | |
522 | @cindex frags | |
523 | @cindex fragS structure | |
524 | ||
525 | The @code{fragS} structure is defined in @file{as.h}. Each frag represents a | |
526 | portion of the final object file. As GAS reads the source file, it creates | |
527 | frags to hold the data that it reads. At the end of the assembly the frags and | |
528 | fixups are processed to produce the final contents. | |
529 | ||
530 | @table @code | |
531 | @item fr_address | |
532 | The address of the frag. This is not set until the assembler rescans the list | |
533 | of all frags after the entire input file is parsed. The function | |
534 | @code{relax_segment} fills in this field. | |
535 | ||
536 | @item fr_next | |
537 | Pointer to the next frag in this (sub)section. | |
538 | ||
539 | @item fr_fix | |
540 | Fixed number of characters we know we're going to emit to the output file. May | |
541 | be zero. | |
542 | ||
543 | @item fr_var | |
544 | Variable number of characters we may output, after the initial @code{fr_fix} | |
545 | characters. May be zero. | |
546 | ||
547 | @item fr_offset | |
548 | The interpretation of this field is controlled by @code{fr_type}. Generally, | |
549 | if @code{fr_var} is non-zero, this is a repeat count: the @code{fr_var} | |
550 | characters are output @code{fr_offset} times. | |
551 | ||
552 | @item line | |
553 | Holds line number info when an assembler listing was requested. | |
554 | ||
555 | @item fr_type | |
556 | Relaxation state. This field indicates the interpretation of @code{fr_offset}, | |
557 | @code{fr_symbol} and the variable-length tail of the frag, as well as the | |
558 | treatment it gets in various phases of processing. It does not affect the | |
559 | initial @code{fr_fix} characters; they are always supposed to be output | |
560 | verbatim (fixups aside). See below for specific values this field can have. | |
561 | ||
562 | @item fr_subtype | |
563 | Relaxation substate. If the macro @code{md_relax_frag} isn't defined, this is | |
564 | assumed to be an index into @code{TC_GENERIC_RELAX_TABLE} for the generic | |
565 | relaxation code to process (@pxref{Relaxation}). If @code{md_relax_frag} is | |
566 | defined, this field is available for any use by the CPU-specific code. | |
567 | ||
568 | @item fr_symbol | |
569 | This normally indicates the symbol to use when relaxing the frag according to | |
570 | @code{fr_type}. | |
571 | ||
572 | @item fr_opcode | |
573 | Points to the lowest-addressed byte of the opcode, for use in relaxation. | |
574 | ||
575 | @item tc_frag_data | |
576 | Target-specific fragment data of type @code{TC_FRAG_TYPE}. | |
577 | Only present if @code{TC_FRAG_TYPE} is defined. | |
578 | ||
579 | @item fr_file | |
580 | @itemx fr_line | |
581 | The file and line where this frag was last modified. | |
582 | ||
583 | @item fr_literal | |
584 | Declared as a one-character array, this last field grows arbitrarily large to | |
585 | hold the actual contents of the frag. | |
586 | @end table | |
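For a frag in which @code{fr_offset} acts as a repeat count, the relationship between @code{fr_fix}, @code{fr_var} and @code{fr_offset} can be sketched as follows. The structure is a hypothetical stand-in that keeps only the fields needed for the arithmetic; the function names are illustrative.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a fill-type fragS.  LITERAL
   holds FIX fixed bytes followed by VAR bytes of repeat pattern.  */
struct toy_frag
{
  size_t fix;                  /* like fr_fix */
  size_t var;                  /* like fr_var */
  size_t repeat;               /* like fr_offset for a fill frag */
  const unsigned char *literal;
};

/* Number of bytes the frag contributes to the output.  */
size_t
toy_frag_size (const struct toy_frag *frag)
{
  return frag->fix + frag->var * frag->repeat;
}

/* Write the frag's bytes to OUT, which must have room for
   toy_frag_size (FRAG) bytes.  Return the number of bytes written.  */
size_t
toy_frag_emit (const struct toy_frag *frag, unsigned char *out)
{
  size_t i;
  unsigned char *p = out;

  /* The fixed part is copied verbatim.  */
  memcpy (p, frag->literal, frag->fix);
  p += frag->fix;

  /* The variable part is repeated REPEAT times.  */
  for (i = 0; i < frag->repeat; i++)
    {
      memcpy (p, frag->literal + frag->fix, frag->var);
      p += frag->var;
    }
  return (size_t) (p - out);
}
```

With 2 fixed bytes, a 2-byte pattern and a repeat count of 3, the frag contributes 2 + 2 * 3 = 8 bytes.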
587 | ||
588 | These are the possible relaxation states, provided in the enumeration type | |
589 | @code{relax_stateT}, and the interpretations they represent for the other | |
590 | fields: | |
591 | ||
592 | @table @code | |
593 | @item rs_align | |
594 | @itemx rs_align_code | |
595 | The start of the following frag should be aligned on some boundary. In this | |
596 | frag, @code{fr_offset} is the logarithm (base 2) of the alignment in bytes. | |
597 | (For example, if alignment on an 8-byte boundary were desired, @code{fr_offset} | |
598 | would have a value of 3.) The variable characters indicate the fill pattern to | |
599 | be used. The @code{fr_subtype} field holds the maximum number of bytes to skip | |
600 | when doing this alignment. If more bytes are needed, the alignment is not | |
601 | done. An @code{fr_subtype} value of 0 means no maximum, which is the normal | |
602 | case. Target backends can use @code{rs_align_code} to handle certain types of | |
603 | alignment differently. | |
604 | ||
605 | @item rs_broken_word | |
606 | This indicates that ``broken word'' processing should be done (@pxref{Broken | |
607 | words}). If broken word processing is not necessary on the target machine, | |
608 | this enumerator value will not be defined. | |
609 | ||
610 | @item rs_cfa | |
611 | This state is used to implement exception frame optimizations. The | |
612 | @code{fr_symbol} is an expression symbol for the subtraction which may be | |
613 | relaxed. The @code{fr_opcode} field holds the frag for the preceding command | |
614 | byte. The @code{fr_offset} field holds the offset within that frag. The | |
615 | @code{fr_subtype} field is used during relaxation to hold the current size of | |
616 | the frag. | |
617 | ||
618 | @item rs_fill | |
619 | The variable characters are to be repeated @code{fr_offset} times. If | |
620 | @code{fr_offset} is 0, this frag has a length of @code{fr_fix}. Most frags | |
621 | have this type. | |
622 | ||
623 | @item rs_leb128 | |
58a77e41 | 624 | This state is used to implement the DWARF ``little endian base 128'' |
252b5132 RH |
625 | variable length number format. The @code{fr_symbol} is always an expression |
626 | symbol, as constant expressions are emitted directly. The @code{fr_offset} | |
627 | field is used during relaxation to hold the previous size of the number so | |
628 | that we can determine if the fragment changed size. | |
629 | ||
630 | @item rs_machine_dependent | |
631 | Displacement relaxation is to be done on this frag. The target is indicated by | |
632 | @code{fr_symbol} and @code{fr_offset}, and @code{fr_subtype} indicates the | |
633 | particular machine-specific addressing mode desired. @xref{Relaxation}. | |
634 | ||
635 | @item rs_org | |
636 | The start of the following frag should be pushed back to some specific offset | |
637 | within the section. (Some assemblers use the value as an absolute address; GAS | |
638 | does not handle final absolute addresses, but rather requires that the linker | |
639 | set them.) The offset is given by @code{fr_symbol} and @code{fr_offset}; one | |
640 | character from the variable-length tail is used as the fill character. | |
641 | @end table | |
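The alignment arithmetic described for @code{rs_align} can be written out as a small function. This is a sketch of the calculation only, with a hypothetical name and parameters; it assumes the logarithm is smaller than the width of @code{unsigned long}.

```c
/* Return the number of padding bytes needed to move ADDRESS up to a
   multiple of 2**LOG2_ALIGN.  MAX_SKIP plays the role of fr_subtype:
   0 means no limit; otherwise, if more than MAX_SKIP bytes would be
   needed, no alignment is done and 0 is returned.  */
unsigned long
toy_align_padding (unsigned long address, unsigned int log2_align,
                   unsigned long max_skip)
{
  unsigned long boundary = 1UL << log2_align;
  unsigned long mask = boundary - 1;
  unsigned long padding = (boundary - (address & mask)) & mask;

  if (max_skip != 0 && padding > max_skip)
    return 0;
  return padding;
}
```

For example, aligning address 0x13 to an 8-byte boundary (a logarithm of 3) needs 5 bytes of padding; with a maximum skip of 4 the alignment is abandoned and the function returns 0.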
642 | ||
643 | @cindex frchainS structure | |
644 | A chain of frags is built up for each subsection. The data structure | |
645 | describing a chain is called a @code{frchainS}, and contains the following | |
646 | fields: | |
647 | ||
648 | @table @code | |
649 | @item frch_root | |
650 | Points to the first frag in the chain. May be NULL if there are no frags in | |
651 | this chain. | |
652 | @item frch_last | |
653 | Points to the last frag in the chain, or NULL if there are none. | |
654 | @item frch_next | |
655 | Next in the list of @code{frchainS} structures. | |
656 | @item frch_seg | |
657 | Indicates the section this frag chain belongs to. | |
658 | @item frch_subseg | |
659 | Subsection (subsegment) number of this frag chain. | |
660 | @item fix_root, fix_tail | |
829c3ed3 | 661 | Point to first and last @code{fixS} structures associated with this subsection. |
252b5132 RH |
662 | @item frch_obstack |
663 | Not currently used. Intended to be used for frag allocation for this | |
664 | subsection. This should reduce frag generation caused by switching sections. | |
665 | @item frch_frag_now | |
666 | The current frag for this subsegment. | |
667 | @end table | |
668 | ||
669 | A @code{frchainS} corresponds to a subsection; each section has a list of | |
670 | @code{frchainS} records associated with it. In most cases, only one subsection | |
671 | of each section is used, so the list will only be one element long, but any | |
672 | processing of frag chains should be prepared to deal with multiple chains per | |
673 | section. | |
674 | ||
675 | After the input files have been completely processed, and no more frags are to | |
676 | be generated, the frag chains are joined into one per section for further | |
677 | processing. After this point, it is safe to operate on one chain per section. | |
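Joining the per-subsection chains into one chain per section amounts to concatenating linked lists. The following sketch uses hypothetical reduced structures that keep only the link fields; it is an illustration of the idea, not the routine GAS uses.

```c
#include <stddef.h>

/* Hypothetical stand-in for fragS, reduced to its link field.  */
struct toy_chain_frag
{
  struct toy_chain_frag *next;   /* like fr_next */
};

/* Hypothetical stand-in for frchainS.  */
struct toy_chain
{
  struct toy_chain_frag *root;   /* like frch_root, may be NULL */
  struct toy_chain_frag *last;   /* like frch_last, may be NULL */
  struct toy_chain *next;        /* like frch_next */
};

/* Append the frags of every chain after FIRST to FIRST itself,
   leaving the later chains empty.  */
void
toy_join_chains (struct toy_chain *first)
{
  struct toy_chain *c;

  for (c = first->next; c != NULL; c = c->next)
    {
      if (c->root == NULL)
        continue;               /* nothing to move from this chain */
      if (first->root == NULL)
        first->root = c->root;
      else
        first->last->next = c->root;
      first->last = c->last;
      c->root = NULL;
      c->last = NULL;
    }
}
```

After the call, walking from the root of the first chain through the frag links visits every frag of the section in subsection order.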
678 | ||
679 | The assembler always has a current frag, named @code{frag_now}. More space is | |
680 | allocated for the current frag using the @code{frag_more} function; this | |
0530d30a RS |
681 | returns a pointer to the start of the requested space. The function |
682 | @code{frag_room} says by how much the current frag can be extended. | |
683 | Relaxing is done using variant frags allocated by @code{frag_var} | |
684 | or @code{frag_variant} (@pxref{Relaxation}). | |
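The division of labour between @code{frag_more} and @code{frag_room} can be modelled with a fixed-size buffer. The names and the capacity below are hypothetical, and the toy version simply returns NULL when space runs out; how the real functions handle that case is outside this sketch.

```c
#include <stddef.h>

#define TOY_FRAG_CAPACITY 16

/* Hypothetical stand-in for the current frag: a fixed buffer plus a
   count of the bytes already handed out.  */
struct toy_cur_frag
{
  size_t used;
  char literal[TOY_FRAG_CAPACITY];
};

/* How many more bytes the frag can hold, in the spirit of frag_room.  */
size_t
toy_frag_room (const struct toy_cur_frag *frag)
{
  return TOY_FRAG_CAPACITY - frag->used;
}

/* Reserve NCHARS bytes and return a pointer to the start of them, in
   the spirit of frag_more.  Return NULL if they do not fit.  */
char *
toy_frag_more (struct toy_cur_frag *frag, size_t nchars)
{
  char *where;

  if (nchars > toy_frag_room (frag))
    return NULL;
  where = frag->literal + frag->used;
  frag->used += nchars;
  return where;
}
```

A caller emits data by asking for the number of bytes it needs and writing through the returned pointer; successive requests return adjacent regions of the same buffer.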
252b5132 RH |
685 | |
686 | @node GAS processing | |
687 | @section What GAS does when it runs | |
688 | @cindex internals, overview | |
689 | ||
690 | This is a quick look at what an assembler run looks like. | |
691 | ||
692 | @itemize @bullet | |
693 | @item | |
694 | The assembler initializes itself by calling various init routines. | |
695 | ||
696 | @item | |
697 | For each source file, the @code{read_a_source_file} function reads in the file | |
698 | and parses it. The global variable @code{input_line_pointer} points to the | |
699 | current text; it is guaranteed to be correct up to the end of the line, but not | |
700 | farther. | |
701 | ||
702 | @item | |
703 | For each line, the assembler passes labels to the @code{colon} function, and | |
704 | isolates the first word. If it looks like a pseudo-op, the word is looked up | |
705 | in the pseudo-op hash table @code{po_hash} and dispatched to a pseudo-op | |
706 | routine. Otherwise, the target dependent @code{md_assemble} routine is called | |
707 | to parse the instruction. | |
708 | ||
709 | @item | |
710 | When pseudo-ops or instructions output data, they add it to a frag, calling | |
711 | @code{frag_more} to get space to store it in. | |
712 | ||
713 | @item | |
714 | Pseudo-ops and instructions can also output fixups created by @code{fix_new} or | |
715 | @code{fix_new_exp}. | |
716 | ||
717 | @item | |
718 | For certain targets, instructions can create variant frags which are used to | |
719 | store relaxation information (@pxref{Relaxation}). | |
720 | ||
721 | @item | |
722 | When the input file is finished, the @code{write_object_file} routine is | |
723 | called. It assigns addresses to all the frags (@code{relax_segment}), resolves | |
724 | all the fixups (@code{fixup_segment}), resolves all the symbol values (using | |
829c3ed3 | 725 | @code{resolve_symbol_value}), and finally writes out the file. |
252b5132 RH |
726 | @end itemize |
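
The per-line dispatch described above can be sketched as a small standalone
model. This is only an illustration, not code from GAS: every
@code{model_}-prefixed name is hypothetical, the linear table stands in for
the @code{po_hash} lookup, and the stand-in routines merely count how often
they are reached.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the real GAS routines; each one only counts
   how often it was reached.  */
static int labels_seen, pseudo_ops_seen, insns_seen;

static void model_colon (const char *name) { (void) name; labels_seen++; }
static void model_md_assemble (const char *insn) { (void) insn; insns_seen++; }
static void model_s_text (void) { pseudo_ops_seen++; }

/* A tiny linear table standing in for the po_hash lookup.  */
struct model_pseudo_op
{
  const char *name;
  void (*handler) (void);
};

static const struct model_pseudo_op model_pseudo_ops[] =
{
  { "text", model_s_text },
  { NULL, NULL }
};

/* Process one line: peel off a leading label, then dispatch the rest.  */
static void
model_process_line (const char *line)
{
  const char *end;
  const struct model_pseudo_op *op;

  assert (line != NULL);

  /* Isolate the first word.  */
  for (end = line; *end != '\0' && *end != ' ' && *end != ':'; end++)
    ;

  if (*end == ':')
    {
      char name[64];
      size_t len = (size_t) (end - line);

      if (len >= sizeof name)
        len = sizeof name - 1;
      memcpy (name, line, len);
      name[len] = '\0';
      model_colon (name);

      /* Continue with whatever follows the label.  */
      for (line = end + 1; *line == ' '; line++)
        ;
      for (end = line; *end != '\0' && *end != ' '; end++)
        ;
    }

  if (*line == '\0')
    return;

  if (*line == '.')
    {
      size_t len = (size_t) (end - (line + 1));

      for (op = model_pseudo_ops; op->name != NULL; op++)
        if (strlen (op->name) == len
            && strncmp (op->name, line + 1, len) == 0)
          {
            op->handler ();
            return;
          }
      return;  /* Unknown pseudo-op: the model silently ignores it.  */
    }

  /* Anything else is treated as an instruction.  */
  model_md_assemble (line);
}
```

In the real assembler the instruction branch is where the target dependent
code takes over; the model keeps only the order of the decisions.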
727 | ||
728 | @node Porting GAS | |
729 | @section Porting GAS | |
730 | @cindex porting | |
731 | ||
732 | Each GAS target specifies two main things: the CPU file and the object format | |
1110793a | 733 | file. Two main switches in the @file{configure.ac} file handle this. The |
252b5132 RH |
734 | first switches on CPU type to set the shell variable @code{cpu_type}. The |
735 | second switches on the entire target to set the shell variable @code{fmt}. | |
736 | ||
737 | The configure script uses the value of @code{cpu_type} to select two files in | |
738 | the @file{config} directory: @file{tc-@var{CPU}.c} and @file{tc-@var{CPU}.h}. | |
739 | The configuration process will create a file named @file{targ-cpu.h} in the | |
740 | build directory which includes @file{tc-@var{CPU}.h}. | |
741 | ||
742 | The configure script also uses the value of @code{fmt} to select two files: | |
743 | @file{obj-@var{fmt}.c} and @file{obj-@var{fmt}.h}. The configuration process | |
744 | will create a file named @file{obj-format.h} in the build directory which | |
745 | includes @file{obj-@var{fmt}.h}. | |
746 | ||
747 | You can also set the emulation in the configure script by setting the @code{em} | |
748 | variable. Normally the default value of @samp{generic} is fine. The | |
749 | configuration process will create a file named @file{targ-env.h} in the build | |
750 | directory which includes @file{te-@var{em}.h}. | |
751 | ||
56385375 L |
752 | There is a special case for COFF. For historical reasons, the GNU COFF |
753 | assembler doesn't follow the documented behavior on certain debug symbols, for | |
754 | compatibility with other COFF assemblers. A port can define | |
755 | @code{STRICTCOFF} in the configure script to make the GNU COFF assembler | |
756 | follow the documented behavior. | |
757 | ||
252b5132 RH |
758 | Porting GAS to a new CPU requires writing the @file{tc-@var{CPU}} files. |
759 | Porting GAS to a new object file format requires writing the | |
760 | @file{obj-@var{fmt}} files. There is sometimes some interaction between these | |
761 | two files, but it is normally minimal. | |
762 | ||
763 | The best approach is, of course, to copy existing files. The documentation | |
764 | below assumes that you are looking at existing files to see usage details. | |
765 | ||
766 | These interfaces have grown over time, and have never been carefully thought | |
767 | out or designed. Nothing about the interfaces described here is cast in stone. | |
768 | It is possible that they will change from one version of the assembler to the | |
769 | next. Also, new macros are added all the time as they are needed. | |
770 | ||
771 | @menu | |
772 | * CPU backend:: Writing a CPU backend | |
773 | * Object format backend:: Writing an object format backend | |
774 | * Emulations:: Writing emulation files | |
775 | @end menu | |
776 | ||
777 | @node CPU backend | |
778 | @subsection Writing a CPU backend | |
779 | @cindex CPU backend | |
780 | @cindex @file{tc-@var{CPU}} | |
781 | ||
782 | The CPU backend files are the heart of the assembler. They are the only parts | |
783 | of the assembler which actually know anything about the instruction set of the | |
784 | processor. | |
785 | ||
786 | You must define a reasonably small list of macros and functions in the CPU | |
787 | backend files. You may define a large number of additional macros in the CPU | |
788 | backend files, not all of which are documented here. You must, of course, | |
789 | define macros in the @file{.h} file, which is included by every assembler | |
790 | source file. You may define the functions as macros in the @file{.h} file, or | |
791 | as functions in the @file{.c} file. | |
792 | ||
793 | @table @code | |
794 | @item TC_@var{CPU} | |
795 | @cindex TC_@var{CPU} | |
796 | By convention, you should define this macro in the @file{.h} file. For | |
797 | example, @file{tc-m68k.h} defines @code{TC_M68K}. You might have to use this | |
798 | if it is necessary to add CPU specific code to the object format file. | |
799 | ||
800 | @item TARGET_FORMAT | |
801 | This macro is the BFD target name to use when creating the output file. This | |
802 | will normally depend upon the @code{OBJ_@var{FMT}} macro. | |
803 | ||
804 | @item TARGET_ARCH | |
805 | This macro is the BFD architecture to pass to @code{bfd_set_arch_mach}. | |
806 | ||
807 | @item TARGET_MACH | |
808 | This macro is the BFD machine number to pass to @code{bfd_set_arch_mach}. If | |
809 | it is not defined, GAS will use 0. | |
810 | ||
811 | @item TARGET_BYTES_BIG_ENDIAN | |
812 | You should define this macro to be non-zero if the target is big endian, and | |
813 | zero if the target is little endian. | |
814 | ||
815 | @item md_shortopts | |
816 | @itemx md_longopts | |
817 | @itemx md_longopts_size | |
818 | @itemx md_parse_option | |
819 | @itemx md_show_usage | |
acebd4ce | 820 | @itemx md_after_parse_args |
252b5132 RH |
821 | @cindex md_shortopts |
822 | @cindex md_longopts | |
823 | @cindex md_longopts_size | |
824 | @cindex md_parse_option | |
825 | @cindex md_show_usage | |
acebd4ce | 826 | @cindex md_after_parse_args |
252b5132 RH |
827 | GAS uses these variables and functions during option processing. |
828 | @code{md_shortopts} is a @code{const char *} which GAS adds to the machine | |
829 | independent string passed to @code{getopt}. @code{md_longopts} is a | |
830 | @code{struct option []} which GAS adds to the machine independent long options | |
831 | passed to @code{getopt}; you may use @code{OPTION_MD_BASE}, defined in | |
832 | @file{as.h}, as the start of a set of long option indices, if necessary. | |
833 | @code{md_longopts_size} is a @code{size_t} holding the size of @code{md_longopts}. | |
329e276d | 834 | |
252b5132 RH |
835 | GAS will call @code{md_parse_option} whenever @code{getopt} returns an |
836 | unrecognized code, presumably indicating a special code value which appears in | |
329e276d NC |
837 | @code{md_longopts}. This function should return non-zero if it handled the |
838 | option and zero otherwise. There is no need to print a message about an option | |
b45619c0 | 839 | not being recognized. This will be handled by the generic code. |
329e276d NC |
840 | |
841 | GAS will call @code{md_show_usage} when a usage message is printed; it should | |
842 | print a description of the machine specific options. @code{md_after_parse_args}, | |
843 | if defined, is called after all options are processed, to let the backend | |
844 | override settings done by the generic option parsing. | |
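
A minimal sketch of how a backend might supply these variables follows. It
is an illustration under stated assumptions: the numeric value given to
@code{OPTION_MD_BASE} is a stand-in for the one in @file{as.h}, the option
names and the variables @code{target_big_endian} and @code{selected_cpu} are
hypothetical, and the exact prototype of @code{md_parse_option} may differ
between GAS versions.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Stand-in for the definition in as.h; the number is an assumption.  */
#define OPTION_MD_BASE 290

/* Hypothetical long-option codes for this example backend.  */
#define OPTION_BIG    (OPTION_MD_BASE + 0)
#define OPTION_LITTLE (OPTION_MD_BASE + 1)

/* Short options appended to the machine independent getopt string.  */
const char *md_shortopts = "m:";

/* Long options appended to the machine independent long options.  */
struct option md_longopts[] =
{
  { "EB", no_argument, NULL, OPTION_BIG },
  { "EL", no_argument, NULL, OPTION_LITTLE },
  { NULL, no_argument, NULL, 0 }
};

size_t md_longopts_size = sizeof (md_longopts);

/* Hypothetical backend state changed by the options.  */
static int target_big_endian = 1;
static const char *selected_cpu = "generic";

/* Return non-zero if the option was handled, zero otherwise.  No message
   is printed for an unknown option; the caller deals with that.  */
int
md_parse_option (int c, const char *arg)
{
  switch (c)
    {
    case 'm':
      assert (arg != NULL);   /* "m:" means an argument is required.  */
      selected_cpu = arg;
      return 1;
    case OPTION_BIG:
      target_big_endian = 1;
      return 1;
    case OPTION_LITTLE:
      target_big_endian = 0;
      return 1;
    default:
      return 0;
    }
}
```

Basing the long option codes on @code{OPTION_MD_BASE} keeps them clear of
the codes used by the machine independent options.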
252b5132 RH |
845 | |
846 | @item md_begin | |
847 | @cindex md_begin | |
848 | GAS will call this function at the start of the assembly, after the command | |
849 | line arguments have been parsed and all the machine independent initializations | |
850 | have been completed. | |
851 | ||
852 | @item md_cleanup | |
853 | @cindex md_cleanup | |
854 | If you define this macro, GAS will call it at the end of each input file. | |
855 | ||
856 | @item md_assemble | |
857 | @cindex md_assemble | |
858 | GAS will call this function for each input line which does not contain a | |
859 | pseudo-op. The argument is a null terminated string. The function should | |
860 | assemble the string as an instruction with operands. Normally | |
861 | @code{md_assemble} will do this by calling @code{frag_more} and writing out | |
862 | some bytes (@pxref{Frags}). @code{md_assemble} will call @code{fix_new} to | |
863 | create fixups as needed (@pxref{Fixups}). Targets which need to do special | |
864 | purpose relaxation will call @code{frag_var}. | |
865 | ||
866 | @item md_pseudo_table | |
867 | @cindex md_pseudo_table | |
868 | This is a const array of type @code{pseudo_typeS}. It is a mapping from | |
869 | pseudo-op names to functions. You should use this table to implement | |
870 | pseudo-ops which are specific to the CPU. | |
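
As a rough illustration, such a table might look like the following. The
sketch declares its own copy of the @code{pseudo_typeS} layout so that it
compiles alone; the handler @code{model_cons}, the lookup helper
@code{model_run_pseudo}, and the pseudo-op names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local copy of the shape of pseudo_typeS so the sketch stands alone;
   in GAS the type comes from the assembler's own headers.  */
typedef struct
{
  const char *poc_name;        /* Pseudo-op name, without the leading dot.  */
  void (*poc_handler) (int);   /* Routine that handles the pseudo-op.  */
  int poc_val;                 /* Value passed to the handler.  */
} pseudo_typeS;

static int last_cons_size;

/* Hypothetical handler: one routine serves several pseudo-ops, which it
   tells apart by the value stored in the table.  */
static void
model_cons (int nbytes)
{
  last_cons_size = nbytes;
}

/* The table ends with an entry whose name is NULL.  */
const pseudo_typeS md_pseudo_table[] =
{
  { "half",  model_cons, 2 },
  { "dword", model_cons, 8 },
  { NULL,    NULL,       0 }
};

/* Hypothetical lookup standing in for the generic dispatch code.
   Returns 1 if NAME was found and its handler was called.  */
static int
model_run_pseudo (const char *name)
{
  const pseudo_typeS *p;

  assert (name != NULL);
  for (p = md_pseudo_table; p->poc_name != NULL; p++)
    if (strcmp (p->poc_name, name) == 0)
      {
        p->poc_handler (p->poc_val);
        return 1;
      }
  return 0;
}
```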
871 | ||
872 | @item tc_conditional_pseudoop | |
873 | @cindex tc_conditional_pseudoop | |
874 | If this macro is defined, GAS will call it with a @code{pseudo_typeS} argument. | |
875 | It should return non-zero if the pseudo-op is a conditional which controls | |
876 | whether code is assembled, such as @samp{.if}. GAS knows about the normal | |
8108ad8e | 877 | conditional pseudo-ops, and you should normally not have to define this macro. |
252b5132 RH |
878 | |
879 | @item comment_chars | |
880 | @cindex comment_chars | |
881 | This is a null terminated @code{const char} array of characters which start a | |
882 | comment. | |
883 | ||
884 | @item tc_comment_chars | |
885 | @cindex tc_comment_chars | |
886 | If this macro is defined, GAS will use it instead of @code{comment_chars}. | |
2e6976a8 DG |
887 | This has the advantage that this macro does not have to refer to a constant |
888 | array. | |
252b5132 RH |
889 | |
890 | @item tc_symbol_chars | |
891 | @cindex tc_symbol_chars | |
892 | If this macro is defined, it is a pointer to a null terminated list of | |
893 | characters which may appear in an operand. GAS already assumes that all | |
b45619c0 | 894 | alphanumeric characters, and @samp{$}, @samp{.}, and @samp{_} may appear in an |
252b5132 RH |
895 | operand (see @samp{symbol_chars} in @file{app.c}). This macro may be defined |
896 | to treat additional characters as appearing in an operand. This affects the | |
897 | way in which GAS removes whitespace before passing the string to | |
898 | @samp{md_assemble}. | |
899 | ||
900 | @item line_comment_chars | |
901 | @cindex line_comment_chars | |
902 | This is a null terminated @code{const char} array of characters which start a | |
903 | comment when they appear at the start of a line. | |
904 | ||
905 | @item line_separator_chars | |
906 | @cindex line_separator_chars | |
907 | This is a null terminated @code{const char} array of characters which separate | |
63a0b638 | 908 | lines (null and newline are such characters by default, and need not be |
65fd87bc ILT |
909 | listed in this array). Note that line_separator_chars do not separate lines |
910 | if found in a comment, such as after a character in line_comment_chars or | |
911 | comment_chars. | |
252b5132 | 912 | |
2e6976a8 DG |
913 | @item tc_line_separator_chars |
914 | @cindex tc_line_separator_chars | |
915 | If this macro is defined, GAS will use it instead of | |
916 | @code{line_separator_chars}. This has the advantage that this macro does not | |
917 | have to refer to a constant array. | |
918 | ||
919 | ||
252b5132 RH |
920 | @item EXP_CHARS |
921 | @cindex EXP_CHARS | |
922 | This is a null terminated @code{const char} array of characters which may be | |
923 | used as the exponent character in a floating point number. This is normally | |
924 | @code{"eE"}. | |
925 | ||
926 | @item FLT_CHARS | |
927 | @cindex FLT_CHARS | |
928 | This is a null terminated @code{const char} array of characters which may be | |
929 | used to indicate a floating point constant. A zero followed by one of these | |
930 | characters is assumed to be followed by a floating point number; thus they | |
931 | operate the way that @code{0x} is used to indicate a hexadecimal constant. | |
932 | Usually this includes @samp{r} and @samp{f}. | |
933 | ||
934 | @item LEX_AT | |
935 | @cindex LEX_AT | |
65fd87bc | 936 | You may define this macro to the lexical type of the @kbd{@@} character. The |
252b5132 RH |
937 | default is zero. |
938 | ||
939 | Lexical types are a combination of @code{LEX_NAME} and @code{LEX_BEGIN_NAME}, | |
940 | both defined in @file{read.h}. @code{LEX_NAME} indicates that the character | |
941 | may appear in a name. @code{LEX_BEGIN_NAME} indicates that the character may | |
65fd87bc | 942 | appear at the beginning of a name. |
252b5132 RH |
943 | |
944 | @item LEX_BR | |
945 | @cindex LEX_BR | |
946 | You may define this macro to the lexical type of the brace characters @kbd{@{}, | |
947 | @kbd{@}}, @kbd{[}, and @kbd{]}. The default value is zero. | |
948 | ||
949 | @item LEX_PCT | |
950 | @cindex LEX_PCT | |
951 | You may define this macro to the lexical type of the @kbd{%} character. The | |
952 | default value is zero. | |
953 | ||
954 | @item LEX_QM | |
955 | @cindex LEX_QM | |
956 | You may define this macro to the lexical type of the @kbd{?} character. The | |
957 | default value is zero. | |
958 | ||
959 | @item LEX_DOLLAR | |
960 | @cindex LEX_DOLLAR | |
961 | You may define this macro to the lexical type of the @kbd{$} character. The | |
962 | default value is @code{LEX_NAME | LEX_BEGIN_NAME}. | |
963 | ||
f805106c TW |
964 | @item NUMBERS_WITH_SUFFIX |
965 | @cindex NUMBERS_WITH_SUFFIX | |
966 | When this macro is defined to be non-zero, the parser allows the radix of a | |
58a77e41 | 967 | constant to be indicated with a suffix. Valid suffixes are binary (B), |
f805106c TW |
968 | octal (Q), and hexadecimal (H). Case is not significant. |
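
The suffix convention can be illustrated with a small parser. This is a
sketch of the notation only, not the parser GAS uses;
@code{model_parse_suffixed} is a hypothetical name, and treating an
unsuffixed number as decimal is an assumption of the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Parse S as a number whose radix is given by an optional trailing
   B (binary), Q (octal) or H (hexadecimal), in either case.  Without a
   suffix the model assumes decimal.  Returns 1 and stores the value in
   *OUT on success, 0 if S is not a valid number.  */
static int
model_parse_suffixed (const char *s, unsigned long *out)
{
  size_t len, i;
  unsigned long value = 0;
  int radix = 10;

  assert (s != NULL && out != NULL);
  len = strlen (s);
  if (len == 0)
    return 0;

  switch (tolower ((unsigned char) s[len - 1]))
    {
    case 'b': radix = 2;  len--; break;
    case 'q': radix = 8;  len--; break;
    case 'h': radix = 16; len--; break;
    default: break;
    }
  if (len == 0)
    return 0;

  for (i = 0; i < len; i++)
    {
      int c = tolower ((unsigned char) s[i]);
      int digit;

      if (isdigit (c))
        digit = c - '0';
      else if (c >= 'a' && c <= 'f')
        digit = c - 'a' + 10;
      else
        return 0;
      if (digit >= radix)
        return 0;
      value = value * (unsigned long) radix + (unsigned long) digit;
    }

  *out = value;
  return 1;
}
```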
969 | ||
252b5132 RH |
970 | @item SINGLE_QUOTE_STRINGS |
971 | @cindex SINGLE_QUOTE_STRINGS | |
972 | If you define this macro, GAS will treat single quotes as string delimiters. | |
973 | Normally only double quotes are accepted as string delimiters. | |
974 | ||
975 | @item NO_STRING_ESCAPES | |
976 | @cindex NO_STRING_ESCAPES | |
977 | If you define this macro, GAS will not permit escape sequences in a string. | |
978 | ||
979 | @item ONLY_STANDARD_ESCAPES | |
980 | @cindex ONLY_STANDARD_ESCAPES | |
981 | If you define this macro, GAS will warn about the use of nonstandard escape | |
982 | sequences in a string. | |
983 | ||
984 | @item md_start_line_hook | |
985 | @cindex md_start_line_hook | |
986 | If you define this macro, GAS will call it at the start of each line. | |
987 | ||
988 | @item LABELS_WITHOUT_COLONS | |
989 | @cindex LABELS_WITHOUT_COLONS | |
990 | If you define this macro, GAS will assume that any text at the start of a line | |
991 | is a label, even if it does not have a colon. | |
992 | ||
993 | @item TC_START_LABEL | |
39bec121 | 994 | @itemx TC_START_LABEL_WITHOUT_COLON |
252b5132 RH |
995 | @cindex TC_START_LABEL |
996 | You may define this macro to control what GAS considers to be a label. The | |
997 | default definition is to accept any name followed by a colon character. | |
998 | ||
f28e8eb3 TW |
999 | @item TC_START_LABEL_WITHOUT_COLON |
1000 | @cindex TC_START_LABEL_WITHOUT_COLON | |
1001 | Same as @code{TC_START_LABEL}, but should be used instead of @code{TC_START_LABEL} when | |
58a77e41 | 1002 | @code{LABELS_WITHOUT_COLONS} is defined. |
f28e8eb3 | 1003 | |
c9cd7160 L |
1004 | @item TC_FAKE_LABEL |
1005 | @cindex TC_FAKE_LABEL | |
1006 | You may define this macro to control what GAS considers to be a fake | |
1007 | label. The default fake label is FAKE_LABEL_NAME. | |
1008 | ||
252b5132 RH |
1009 | @item NO_PSEUDO_DOT |
1010 | @cindex NO_PSEUDO_DOT | |
1011 | If you define this macro, GAS will not require pseudo-ops to start with a | |
1012 | @kbd{.} character. | |
1013 | ||
ee3c9814 CM |
1014 | @item TC_EQUAL_IN_INSN |
1015 | @cindex TC_EQUAL_IN_INSN | |
1016 | If you define this macro, it should return nonzero if the instruction is | |
1017 | permitted to contain an @kbd{=} character. GAS will call it with two | |
1018 | arguments, the character before the @kbd{=} character, and the value of | |
1019 | the string preceding the equal sign. GAS uses this macro to decide whether an | |
1020 | @kbd{=} marks an assignment or is part of an instruction. | |
1021 | ||
252b5132 RH |
1022 | @item TC_EOL_IN_INSN |
1023 | @cindex TC_EOL_IN_INSN | |
1024 | If you define this macro, it should return nonzero if the current input line | |
1025 | pointer should be treated as the end of a line. | |
1026 | ||
a8a3b3b2 NS |
1027 | @item TC_CASE_SENSITIVE |
1028 | @cindex TC_CASE_SENSITIVE | |
1029 | Define this macro if instruction mnemonics and pseudos are case sensitive. | |
1030 | The default is to have it undefined giving case insensitive names. | |
1031 | ||
252b5132 RH |
1032 | @item md_parse_name |
1033 | @cindex md_parse_name | |
1034 | If this macro is defined, GAS will call it for any symbol found in an | |
1035 | expression. You can define this to handle special symbols in a special way. | |
1036 | If a symbol always has a certain value, you should normally enter it in the | |
1037 | symbol table, perhaps using @code{reg_section}. | |
1038 | ||
1039 | @item md_undefined_symbol | |
1040 | @cindex md_undefined_symbol | |
1041 | GAS will call this function when a symbol table lookup fails, before it | |
1042 | creates a new symbol. Typically this would be used to supply symbols whose | |
1043 | name or value changes dynamically, possibly in a context sensitive way. | |
1044 | Predefined symbols with fixed values, such as register names or condition | |
1045 | codes, are typically entered directly into the symbol table when @code{md_begin} | |
65fd87bc | 1046 | is called. One argument is passed, a @code{char *} for the symbol. |
252b5132 RH |
1047 | |
1048 | @item md_operand | |
1049 | @cindex md_operand | |
65fd87bc ILT |
1050 | GAS will call this function with one argument, an @code{expressionS} |
1051 | pointer, for any expression that can not be recognized. When the function | |
1052 | is called, @code{input_line_pointer} will point to the start of the | |
1053 | expression. | |
252b5132 | 1054 | |
5a918ce7 JB |
1055 | @item md_register_arithmetic |
1056 | @cindex md_register_arithmetic | |
1057 | If this macro is defined and evaluates to zero then GAS will not fold | |
1058 | expressions that add or subtract a constant to/from a register to give | |
1059 | another register. For example GAS's default behaviour is to fold the | |
1060 | expression "r8 + 1" into "r9", which is probably not the result | |
1061 | intended by the programmer. The default is to allow such folding, | |
1062 | since this maintains backwards compatibility with earlier releases of | |
1063 | GAS. | |
1064 | ||
252b5132 RH |
1065 | @item tc_unrecognized_line |
1066 | @cindex tc_unrecognized_line | |
1067 | If you define this macro, GAS will call it when it finds a line that it can not | |
1068 | parse. | |
1069 | ||
1070 | @item md_do_align | |
1071 | @cindex md_do_align | |
1072 | You may define this macro to handle an alignment directive. GAS will call it | |
1073 | when the directive is seen in the input file. For example, the i386 backend | |
1074 | uses this to generate efficient nop instructions of varying lengths, depending | |
1075 | upon the number of bytes that the alignment will skip. | |
1076 | ||
1077 | @item HANDLE_ALIGN | |
1078 | @cindex HANDLE_ALIGN | |
1079 | You may define this macro to do special handling for an alignment directive. | |
1080 | GAS will call it at the end of the assembly. | |
1081 | ||
8684e216 HPN |
1082 | @item TC_IMPLICIT_LCOMM_ALIGNMENT (@var{size}, @var{p2var}) |
1083 | @cindex TC_IMPLICIT_LCOMM_ALIGNMENT | |
1084 | An @code{.lcomm} directive with no explicit alignment parameter will use this | |
1085 | macro to set @var{p2var} to the alignment that a request for @var{size} bytes | |
1086 | will have. The alignment is expressed as a power of two. If no alignment | |
1087 | should take place, the macro definition should do nothing. Some targets define | |
1088 | a @code{.bss} directive that is also affected by this macro. The default | |
1089 | definition will set @var{p2var} to the truncated power of two of sizes up to | |
1090 | eight bytes. | |
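
A definition consistent with the default behaviour described above might
look like this. It is a sketch under that description rather than a copy of
the GAS source; the macro and the wrapper function carry a @code{MODEL_} or
@code{model_} prefix to mark them as hypothetical.

```c
#include <assert.h>

/* Set P2VAR to the truncated power of two of SIZE, capped at 3
   (that is, eight bytes).  */
#define MODEL_IMPLICIT_LCOMM_ALIGNMENT(SIZE, P2VAR)  \
  do                                                 \
    {                                                \
      if ((SIZE) >= 8)                               \
        (P2VAR) = 3;                                 \
      else if ((SIZE) >= 4)                          \
        (P2VAR) = 2;                                 \
      else if ((SIZE) >= 2)                          \
        (P2VAR) = 1;                                 \
      else                                           \
        (P2VAR) = 0;                                 \
    }                                                \
  while (0)

/* Wrapper so that the macro can be exercised like a function.  */
static int
model_lcomm_alignment (unsigned long size)
{
  int p2 = 0;

  MODEL_IMPLICIT_LCOMM_ALIGNMENT (size, p2);
  assert (p2 >= 0 && p2 <= 3);
  return p2;
}
```

A request for five bytes therefore gets an alignment of 2 (four bytes), and
any request of eight bytes or more gets 3.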
1091 | ||
252b5132 RH |
1092 | @item md_flush_pending_output |
1093 | @cindex md_flush_pending_output | |
1094 | If you define this macro, GAS will call it each time it skips any space because of a | |
1095 | space filling or alignment or data allocation pseudo-op. | |
1096 | ||
1097 | @item TC_PARSE_CONS_EXPRESSION | |
1098 | @cindex TC_PARSE_CONS_EXPRESSION | |
1099 | You may define this macro to parse an expression used in a data allocation | |
1100 | pseudo-op such as @code{.word}. You can use this to recognize relocation | |
1101 | directives that may appear in such directives. | |
1102 | ||
1103 | @item BITFIELD_CONS_EXPRESSION | |
1104 | @cindex BITFIELD_CONS_EXPRESSION | |
1105 | If you define this macro, GAS will recognize bitfield instructions in data | |
1106 | allocation pseudo-ops, as used on the i960. | |
1107 | ||
1108 | @item REPEAT_CONS_EXPRESSION | |
1109 | @cindex REPEAT_CONS_EXPRESSION | |
1110 | If you define this macro, GAS will recognize repeat counts in data allocation | |
1111 | pseudo-ops, as used on the MIPS. | |
1112 | ||
1113 | @item md_cons_align | |
1114 | @cindex md_cons_align | |
1115 | You may define this macro to do any special alignment before a data allocation | |
1116 | pseudo-op. | |
1117 | ||
1118 | @item TC_CONS_FIX_NEW | |
1119 | @cindex TC_CONS_FIX_NEW | |
1120 | You may define this macro to generate a fixup for a data allocation pseudo-op. | |
1121 | ||
cc1bc22a AM |
1122 | @item TC_ADDRESS_BYTES |
1123 | @cindex TC_ADDRESS_BYTES | |
1124 | Define this macro to specify the number of bytes used to store an address. | |
1125 | Used to implement @code{dc.a}. The target must have a reloc for this size. | |
1126 | ||
252b5132 RH |
1127 | @item TC_INIT_FIX_DATA (@var{fixp}) |
1128 | @cindex TC_INIT_FIX_DATA | |
1129 | A C statement to initialize the target specific fields of fixup @var{fixp}. | |
1130 | These fields are defined with the @code{TC_FIX_TYPE} macro. | |
1131 | ||
1132 | @item TC_FIX_DATA_PRINT (@var{stream}, @var{fixp}) | |
1133 | @cindex TC_FIX_DATA_PRINT | |
1134 | A C statement to output target specific debugging information for | |
1135 | fixup @var{fixp} to @var{stream}. This macro is called by @code{print_fixup}. | |
1136 | ||
1137 | @item TC_FRAG_INIT (@var{fragp}) | |
1138 | @cindex TC_FRAG_INIT | |
1139 | A C statement to initialize the target specific fields of frag @var{fragp}. | |
1140 | These fields are defined with the @code{TC_FRAG_TYPE} macro. | |
1141 | ||
1142 | @item md_number_to_chars | |
1143 | @cindex md_number_to_chars | |
1144 | This should just call either @code{number_to_chars_bigendian} or | |
1145 | @code{number_to_chars_littleendian}, whichever is appropriate. On targets like | |
1146 | the MIPS which support options to change the endianness, which function to call | |
1147 | is a runtime decision. On other targets, @code{md_number_to_chars} can be a | |
1148 | simple macro. | |
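
The runtime choice mentioned above can be sketched as follows. The
@code{model_} functions are standalone stand-ins written for this example,
not the GAS routines themselves, and @code{model_target_big_endian} is a
hypothetical flag that a command line option would set.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long model_valueT;

/* Store the low N bytes of VAL in BUF, most significant byte first.  */
static void
model_number_to_chars_bigendian (char *buf, model_valueT val, int n)
{
  assert (buf != NULL && n >= 0);
  while (n-- > 0)
    {
      buf[n] = (char) (val & 0xff);
      val >>= 8;
    }
}

/* Store the low N bytes of VAL in BUF, least significant byte first.  */
static void
model_number_to_chars_littleendian (char *buf, model_valueT val, int n)
{
  int i;

  assert (buf != NULL && n >= 0);
  for (i = 0; i < n; i++)
    {
      buf[i] = (char) (val & 0xff);
      val >>= 8;
    }
}

/* Hypothetical flag set by a command line option.  */
static int model_target_big_endian = 1;

/* The target hook only has to pick one of the two helpers.  */
static void
model_md_number_to_chars (char *buf, model_valueT val, int n)
{
  if (model_target_big_endian)
    model_number_to_chars_bigendian (buf, val, n);
  else
    model_number_to_chars_littleendian (buf, val, n);
}
```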
1149 | ||
dd9b19ab NC |
1150 | @item md_atof (@var{type},@var{litP},@var{sizeP}) |
1151 | @cindex md_atof | |
1152 | This function is called to convert an ASCII string into a floating point value | |
1153 | in the format used by the CPU. It takes three arguments. The first is @var{type} | |
499ac353 | 1154 | which is a byte describing the type of floating point number to be created. It |
adfd7328 | 1155 | is one of the characters defined in the @code{FLT_CHARS} macro. Possible |
499ac353 NC |
1156 | values are @var{'f'} or @var{'s'} for single precision, @var{'d'} or @var{'r'} |
1157 | for double precision and @var{'x'} or @var{'p'} for extended precision. Either | |
1158 | lower or upper case versions of these letters can be used. Note: some targets | |
1159 | do not support all of these types, and some targets may also support other | |
1160 | types not mentioned here. | |
dd9b19ab NC |
1161 | |
1162 | The second parameter is @var{litP} which is a pointer to a byte array where the | |
499ac353 NC |
1163 | converted value should be stored. The value is converted into LITTLENUMs and |
1164 | is stored in the target's endian-ness order. (@var{LITTLENUM} is defined in | |
1165 | gas/bignum.h). Single precision values occupy 2 littlenums. Double precision | |
1166 | values occupy 4 littlenums and extended precision values occupy either 5 or 6 | |
1167 | littlenums, depending upon the target. | |
1168 | ||
1169 | The third argument is @var{sizeP}, which is a pointer to an integer that should | |
1170 | be filled in with the number of chars emitted into the byte array. | |
1171 | ||
1172 | The function should return NULL upon success or an error string upon failure. | |
dd9b19ab | 1173 | |
580a832e RS |
1174 | @item TC_LARGEST_EXPONENT_IS_NORMAL (@var{precision}) |
1175 | @cindex TC_LARGEST_EXPONENT_IS_NORMAL | |
1176 | This macro is used only by @file{atof-ieee.c}. It should evaluate to true | |
1177 | if floats of the given precision use the largest exponent for normal numbers | |
1178 | instead of NaNs and infinities. @var{precision} is @samp{F_PRECISION} for | |
1179 | single precision, @samp{D_PRECISION} for double precision, or | |
1180 | @samp{X_PRECISION} for extended double precision. | |
1181 | ||
1182 | The macro has a default definition which returns 0 for all cases. | |
1183 | ||
252b5132 RH |
1184 | @item WORKING_DOT_WORD |
1185 | @itemx md_short_jump_size | |
1186 | @itemx md_long_jump_size | |
1187 | @itemx md_create_short_jump | |
1188 | @itemx md_create_long_jump | |
e30e5a6a | 1189 | @itemx TC_CHECK_ADJUSTED_BROKEN_DOT_WORD |
252b5132 RH |
1190 | @cindex WORKING_DOT_WORD |
1191 | @cindex md_short_jump_size | |
1192 | @cindex md_long_jump_size | |
1193 | @cindex md_create_short_jump | |
1194 | @cindex md_create_long_jump | |
e30e5a6a | 1195 | @cindex TC_CHECK_ADJUSTED_BROKEN_DOT_WORD |
252b5132 RH |
1196 | If @code{WORKING_DOT_WORD} is defined, GAS will not do broken word processing |
1197 | (@pxref{Broken words}). Otherwise, you should set @code{md_short_jump_size} to | |
65fd87bc ILT |
1198 | the size of a short jump (a jump that is just long enough to jump around a |
1199 | number of long jumps) and @code{md_long_jump_size} to the size of a long jump | |
1200 | (a jump that can go anywhere in the function). You should define | |
1201 | @code{md_create_short_jump} to create a short jump around a number of long | |
1202 | jumps, and define @code{md_create_long_jump} to create a long jump. | |
e30e5a6a HPN |
1203 | If defined, the macro @code{TC_CHECK_ADJUSTED_BROKEN_DOT_WORD} will be called for each |
1204 | adjusted word just before the word is output. The macro takes two arguments, | |
1205 | an @code{addressT} with the adjusted word and a pointer to the current | |
1206 | @code{struct broken_word}. | |
252b5132 RH |
1207 | |
1208 | @item md_estimate_size_before_relax | |
1209 | @cindex md_estimate_size_before_relax | |
1210 | This function returns an estimate of the size of a @code{rs_machine_dependent} | |
1211 | frag before any relaxing is done. It may also create any necessary | |
1212 | relocations. | |
1213 | ||
1214 | @item md_relax_frag | |
1215 | @cindex md_relax_frag | |
c842b53a ILT |
1216 | This macro may be defined to relax a frag. GAS will call this with the |
1217 | segment, the frag, and the change in size of all previous frags; | |
1218 | @code{md_relax_frag} should return the change in size of the frag. | |
1219 | @xref{Relaxation}. | |
252b5132 RH |
1220 | |
1221 | @item TC_GENERIC_RELAX_TABLE | |
1222 | @cindex TC_GENERIC_RELAX_TABLE | |
1223 | If you do not define @code{md_relax_frag}, you may define | |
1224 | @code{TC_GENERIC_RELAX_TABLE} as a table of @code{relax_typeS} structures. The | |
1225 | machine independent code knows how to use such a table to relax PC relative | |
1226 | references. See @file{tc-m68k.c} for an example. @xref{Relaxation}. | |
1227 | ||
1228 | @item md_prepare_relax_scan | |
1229 | @cindex md_prepare_relax_scan | |
1230 | If defined, it is a C statement that is invoked prior to scanning | |
1231 | the relax table. | |
1232 | ||
1233 | @item LINKER_RELAXING_SHRINKS_ONLY | |
1234 | @cindex LINKER_RELAXING_SHRINKS_ONLY | |
1235 | If you define this macro, and the global variable @samp{linkrelax} is set | |
1236 | (because of a command line option, or unconditionally in @code{md_begin}), a | |
1237 | @samp{.align} directive will cause extra space to be allocated. The linker can | |
1238 | then discard this space when relaxing the section. | |
1239 | ||
8108ad8e | 1240 | @item TC_LINKRELAX_FIXUP (@var{segT}) |
58a77e41 EC |
1241 | @cindex TC_LINKRELAX_FIXUP |
1242 | If defined, this macro allows control over whether fixups for a | |
1243 | given section will be processed when the @var{linkrelax} variable is | |
1244 | set. The macro is given the N_TYPE bits for the section in its | |
1245 | @var{segT} argument. If the macro evaluates to a non-zero value | |
1246 | then the fixups will be converted into relocs, otherwise they will | |
55cf6793 | 1247 | be passed to @code{md_apply_fix} as normal. |
58a77e41 | 1248 | |
252b5132 RH |
1249 | @item md_convert_frag |
1250 | @cindex md_convert_frag | |
1251 | GAS will call this for each @code{rs_machine_dependent} fragment. | |
1252 | The instruction is completed using the data from the relaxation pass. | |
1253 | It may also create any necessary relocations. | |
1254 | @xref{Relaxation}. | |
1255 | ||
87548816 NC |
1256 | @item TC_FINALIZE_SYMS_BEFORE_SIZE_SEG |
1257 | @cindex TC_FINALIZE_SYMS_BEFORE_SIZE_SEG | |
1258 | Specifies the value to be assigned to @code{finalize_syms} before the function | |
1259 | @code{size_segs} is called. Since @code{size_segs} calls @code{cvt_frag_to_fill} | |
34bca508 | 1260 | which can call @code{md_convert_frag}, this constant governs whether the symbols |
87548816 NC |
1261 | accessed in @code{md_convert_frag} will be fully resolved. In particular it |
1262 | governs whether local symbols will have been resolved, and had their frag | |
1263 | information removed. Depending upon the processing performed by | |
1264 | @code{md_convert_frag} the frag information may or may not be necessary, as may | |
1265 | the resolved values of the symbols. The default value is 1. | |
1266 | ||
a161fe53 AM |
1267 | @item TC_VALIDATE_FIX (@var{fixP}, @var{seg}, @var{skip}) |
1268 | @cindex TC_VALIDATE_FIX | |
1269 | This macro is evaluated for each fixup (when @var{linkrelax} is not set). | |
1270 | It may be used to change the fixup in @code{struct fix *@var{fixP}} before | |
1271 | the generic code sees it, or to fully process the fixup. In the latter case, | |
1272 | a @code{goto @var{skip}} will bypass the generic code. | |
252b5132 | 1273 | |
55cf6793 ZW |
1274 | @item md_apply_fix (@var{fixP}, @var{valP}, @var{seg}) |
1275 | @cindex md_apply_fix | |
a161fe53 AM |
1276 | GAS will call this for each fixup that passes the @code{TC_VALIDATE_FIX} test |
1277 | when @var{linkrelax} is not set. It should store the correct value in the | |
55cf6793 | 1278 | object file. @code{struct fix *@var{fixP}} is the fixup @code{md_apply_fix} |
a161fe53 AM |
1279 | is operating on. @code{valueT *@var{valP}} is the value to store into the |
1280 | object files, or at least is the generic code's best guess. Specifically, | |
1281 | *@var{valP} is the value of the fixup symbol, perhaps modified by | |
1282 | @code{MD_APPLY_SYM_VALUE}, plus @code{@var{fixP}->fx_offset} (symbol addend), | |
1283 | less @code{MD_PCREL_FROM_SECTION} for pc-relative fixups. | |
1284 | @code{segT @var{seg}} is the section the fix is in. | |
1285 | @code{fixup_segment} performs a generic overflow check on *@var{valP} after | |
55cf6793 ZW |
1286 | @code{md_apply_fix} returns. If the overflow check is relevant for the target |
1287 | machine, then @code{md_apply_fix} should modify *@var{valP}, typically to the | |
a161fe53 AM |
1288 | value stored in the object file. |
1289 | ||
1290 | @item TC_FORCE_RELOCATION (@var{fix}) | |
1291 | @cindex TC_FORCE_RELOCATION | |
1292 | If this macro returns non-zero, it guarantees that a relocation will be emitted | |
1293 | even when the value can be resolved locally, as @code{fixup_segment} tries to | |
1294 | reduce the number of relocations emitted. For example, a fixup expression | |
1295 | against an absolute symbol will normally not require a reloc. If undefined, | |
1296 | a default of @w{@code{(S_FORCE_RELOC ((@var{fix})->fx_addsy))}} is used. | |
1297 | ||
1298 | @item TC_FORCE_RELOCATION_ABS (@var{fix}) | |
1299 | @cindex TC_FORCE_RELOCATION_ABS | |
1300 | Like @code{TC_FORCE_RELOCATION}, but used only for fixup expressions against an | |
1301 | absolute symbol. If undefined, @code{TC_FORCE_RELOCATION} will be used. | |
1302 | ||
1303 | @item TC_FORCE_RELOCATION_LOCAL (@var{fix}) | |
1304 | @cindex TC_FORCE_RELOCATION_LOCAL | |
1305 | Like @code{TC_FORCE_RELOCATION}, but used only for fixup expressions against a | |
1306 | symbol in the current section. If undefined, fixups that are not | |
20ee54e8 | 1307 | @code{fx_pcrel} or for which @code{TC_FORCE_RELOCATION} |
a161fe53 AM |
1308 | returns non-zero will emit relocs. | |
1309 | ||
1310 | @item TC_FORCE_RELOCATION_SUB_SAME (@var{fix}, @var{seg}) | |
ae6063d4 | 1311 | @cindex TC_FORCE_RELOCATION_SUB_SAME |
a161fe53 AM |
1312 | This macro controls resolution of fixup expressions involving the |
1313 | difference of two symbols in the same section. If this macro returns zero, | |
1314 | the subtrahend will be resolved and @code{fx_subsy} set to @code{NULL} for | |
55cf6793 | 1315 | @code{md_apply_fix}. If undefined, the default of |
621e3db6 | 1316 | @w{@code{! SEG_NORMAL (@var{seg})}} will be used. |
a161fe53 | 1317 | |
adfd7328 | 1318 | @item TC_FORCE_RELOCATION_SUB_ABS (@var{fix}, @var{seg}) |
a161fe53 AM |
1319 | @cindex TC_FORCE_RELOCATION_SUB_ABS |
1320 | Like @code{TC_FORCE_RELOCATION_SUB_SAME}, but used when the subtrahend is an | |
4f3cafa2 | 1321 | absolute symbol. If the macro is undefined, a default of @code{0} is used. |
a161fe53 | 1322 | |
adfd7328 | 1323 | @item TC_FORCE_RELOCATION_SUB_LOCAL (@var{fix}, @var{seg}) |
a161fe53 AM |
1324 | @cindex TC_FORCE_RELOCATION_SUB_LOCAL |
1325 | Like @code{TC_FORCE_RELOCATION_SUB_ABS}, but the subtrahend is a symbol in the | |
1326 | same section as the fixup. | |
1327 | ||
5db484ff | 1328 | @item TC_VALIDATE_FIX_SUB (@var{fix}, @var{seg}) |
a161fe53 AM |
1329 | @cindex TC_VALIDATE_FIX_SUB |
1330 | This macro is evaluated for any fixup with a @code{fx_subsy} that | |
1331 | @code{fixup_segment} cannot reduce to a number. If the macro returns | |
1332 | @code{false} an error will be reported. | |
1333 | ||
97c4f2d9 L |
1334 | @item TC_GLOBAL_REGISTER_SYMBOL_OK |
1335 | @cindex TC_GLOBAL_REGISTER_SYMBOL_OK | |
1336 | Define this macro if global register symbols are supported. The default | |
1337 | is to disallow global register symbols. | |
1338 | ||
a161fe53 AM |
1339 | @item MD_APPLY_SYM_VALUE (@var{fix}) |
1340 | @cindex MD_APPLY_SYM_VALUE | |
1341 | This macro controls whether the symbol value becomes part of the value passed | |
55cf6793 | 1342 | to @code{md_apply_fix}. If the macro is undefined, or returns non-zero, the |
a161fe53 AM |
1343 | symbol value will be included. For ELF, a suitable definition might simply be |
1344 | @code{0}, because ELF relocations don't include the symbol value in the addend. | |
1345 | ||
ae6063d4 | 1346 | @item S_FORCE_RELOC (@var{sym}, @var{strict}) |
a161fe53 | 1347 | @cindex S_FORCE_RELOC |
829c3ed3 | 1348 | This function returns true for symbols |
a161fe53 AM |
1349 | that should not be reduced to section symbols or eliminated from expressions, |
1350 | because they may be overridden by the linker, i.e., for symbols that are | |
ae6063d4 AM |
1351 | undefined or common, and when @var{strict} is set, weak, or global (for ELF |
1352 | assemblers that support ELF shared library linking semantics). | |
a161fe53 AM |
1353 | |
1354 | @item EXTERN_FORCE_RELOC | |
1355 | @cindex EXTERN_FORCE_RELOC | |
1356 | This macro controls whether @code{S_FORCE_RELOC} returns true for global | |
1357 | symbols. If undefined, the default is @code{true} for ELF assemblers, and | |
1358 | @code{false} for non-ELF. | |
252b5132 RH |
1359 | |
1360 | @item tc_gen_reloc | |
1361 | @cindex tc_gen_reloc | |
829c3ed3 | 1362 | GAS will call this to generate a reloc. GAS will pass |
252b5132 RH |
1363 | the resulting reloc to @code{bfd_install_relocation}. This currently works |
1364 | poorly, as @code{bfd_install_relocation} often does the wrong thing, and | |
1365 | instances of @code{tc_gen_reloc} have been written to work around the problems, | |
1366 | which in turn makes it difficult to fix @code{bfd_install_relocation}. | |
1367 | ||
1368 | @item RELOC_EXPANSION_POSSIBLE | |
1369 | @cindex RELOC_EXPANSION_POSSIBLE | |
1370 | If you define this macro, it means that @code{tc_gen_reloc} may return multiple | |
1371 | relocation entries for a single fixup. In this case, the return value of | |
1372 | @code{tc_gen_reloc} is a pointer to a null terminated array. | |
1373 | ||
1374 | @item MAX_RELOC_EXPANSION | |
1375 | @cindex MAX_RELOC_EXPANSION | |
1376 | You must define this if @code{RELOC_EXPANSION_POSSIBLE} is defined; it | |
1377 | indicates the largest number of relocs which @code{tc_gen_reloc} may return for | |
1378 | a single fixup. | |
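
As an illustration of the null terminated array convention, the following sketch walks such an array and checks it against a caller-supplied limit. The @code{toy_reloc} type and @code{toy_count_relocs} are hypothetical names; the real array holds BFD relocation entries:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a relocation entry.  */
struct toy_reloc
{
  unsigned long address;
  int howto;
};

/* Count the entries of a null terminated reloc array, checking that
   the count never exceeds max_expansion (which plays the role of
   MAX_RELOC_EXPANSION).  */
static size_t
toy_count_relocs (struct toy_reloc *const *relocs, size_t max_expansion)
{
  size_t n = 0;

  assert (relocs != NULL);
  while (relocs[n] != NULL)
    {
      n++;
      assert (n <= max_expansion);
    }
  return n;
}
```

A caller that receives two relocs for one fixup would see an array of three pointers, the last of which is null.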
1379 | ||
1380 | @item tc_fix_adjustable | |
1381 | @cindex tc_fix_adjustable | |
1382 | You may define this macro to indicate whether a fixup against a locally defined | |
1383 | symbol should be adjusted to be against the section symbol. It should return a | |
1384 | non-zero value if the adjustment is acceptable. | |
1385 | ||
1262d520 | 1386 | @item MD_PCREL_FROM_SECTION (@var{fixp}, @var{section}) |
252b5132 | 1387 | @cindex MD_PCREL_FROM_SECTION |
1262d520 JR |
1388 | If you define this macro, it should return the position from which the PC |
1389 | relative adjustment for a PC relative fixup should be made. On many | |
1390 | processors, the base of a PC relative instruction is the next instruction, | |
1391 | so this macro would return the length of an instruction, plus the address of | |
1392 | the PC relative fixup. The latter can be calculated as | |
1393 | @code{@var{fixp}->fx_where + @var{fixp}->fx_frag->fr_address}. | |
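
The calculation can be sketched as follows. The @code{toy_frag} and @code{toy_fixup} types are cut-down, hypothetical versions of the real structures, and the instruction length is passed in as a parameter because it depends on the processor:

```c
#include <assert.h>

/* Hypothetical, cut-down versions of the frag and fixup structures.  */
struct toy_frag
{
  unsigned long fr_address;   /* address of the frag in its section */
};

struct toy_fixup
{
  unsigned long fx_where;           /* offset of the fixup in its frag */
  const struct toy_frag *fx_frag;   /* frag containing the fixup */
};

/* Return the base for a PC relative adjustment on a processor where
   the base is the instruction following the fixup.  */
static unsigned long
toy_pcrel_from (const struct toy_fixup *fixp, unsigned long insn_length)
{
  assert (fixp != 0 && fixp->fx_frag != 0);
  return fixp->fx_where + fixp->fx_frag->fr_address + insn_length;
}
```

For a fixup 4 bytes into a frag at address 0x100, with 4-byte instructions, the result is 0x108.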
252b5132 RH |
1394 | |
1395 | @item md_pcrel_from | |
1396 | @cindex md_pcrel_from | |
1397 | This is the default value of @code{MD_PCREL_FROM_SECTION}. The difference is | |
1398 | that @code{md_pcrel_from} does not take a section argument. | |
1399 | ||
1400 | @item tc_frob_label | |
1401 | @cindex tc_frob_label | |
1402 | If you define this macro, GAS will call it each time a label is defined. | |
1403 | ||
a1facbec MR |
1404 | @item tc_new_dot_label |
1405 | @cindex tc_new_dot_label | |
1406 | If you define this macro, GAS will call it each time a fake label is created | |
1407 | off the special dot symbol. | |
1408 | ||
252b5132 RH |
1409 | @item md_section_align |
1410 | @cindex md_section_align | |
1411 | GAS will call this function for each section at the end of the assembly, to | |
65fd87bc ILT |
1412 | permit the CPU backend to adjust the alignment of a section. The function |
1413 | must take two arguments, a @code{segT} for the section and a @code{valueT} | |
1414 | for the size of the section, and return a @code{valueT} for the rounded | |
1415 | size. | |
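
A common choice is to round the size up to the alignment of the section. The sketch below assumes the alignment is available as a power of two; the function name and the plain integer types are illustrative stand-ins for @code{segT} and @code{valueT}:

```c
#include <assert.h>
#include <limits.h>

/* Round size up to a multiple of 2**align_power.  */
static unsigned long
toy_section_align (unsigned long size, unsigned int align_power)
{
  unsigned long mask;

  assert (align_power < sizeof (unsigned long) * CHAR_BIT);
  mask = (1UL << align_power) - 1;
  return (size + mask) & ~mask;
}
```

A 13 byte section with 4 byte alignment (@code{align_power} of 2) is rounded to 16 bytes, while a size that is already aligned is returned unchanged.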
252b5132 | 1416 | |
9f10757c TW |
1417 | @item md_macro_start |
1418 | @cindex md_macro_start | |
1419 | If defined, GAS will call this macro when it starts to include a macro | |
1420 | expansion. @code{macro_nest} indicates the current macro nesting level, which | |
58a77e41 | 1421 | includes the one being expanded. |
9f10757c TW |
1422 | |
1423 | @item md_macro_info | |
1424 | @cindex md_macro_info | |
1425 | If defined, GAS will call this macro after the macro expansion has been | |
1426 | included in the input and after parsing the macro arguments. The single | |
1427 | argument is a pointer to the macro processing's internal representation of the | |
1428 | macro (@code{macro_entry *}), which includes expansion of the formal arguments. | |
1429 | ||
1430 | @item md_macro_end | |
1431 | @cindex md_macro_end | |
1432 | Complement to @code{md_macro_start}. If defined, it is called when finished | |
58a77e41 | 1433 | processing an inserted macro expansion, just before decrementing @code{macro_nest}. |
9f10757c | 1434 | |
f28e8eb3 TW |
1435 | @item DOUBLEBAR_PARALLEL |
1436 | @cindex DOUBLEBAR_PARALLEL | |
1437 | Affects the preprocessor so that lines containing '||' don't have their | |
1438 | whitespace stripped following the double bar. This is useful for targets that | |
1439 | implement parallel instructions. | |
1440 | ||
1441 | @item KEEP_WHITE_AROUND_COLON | |
1442 | @cindex KEEP_WHITE_AROUND_COLON | |
1443 | Normally, whitespace is compressed and removed when, in the presence of the | |
1444 | colon, the adjoining tokens can be distinguished. This option affects the | |
1445 | preprocessor so that whitespace around colons is preserved. This is useful | |
1446 | when colons might be removed from the input after preprocessing but before | |
1447 | assembling, so that adjoining tokens can still be distinguished if there is | |
062b7c0c | 1448 | whitespace, or concatenated if there is not. |
f28e8eb3 | 1449 | |
252b5132 RH |
1450 | @item tc_frob_section |
1451 | @cindex tc_frob_section | |
829c3ed3 | 1452 | If you define this macro, GAS will call it for each |
252b5132 RH |
1453 | section at the end of the assembly. |
1454 | ||
1455 | @item tc_frob_file_before_adjust | |
1456 | @cindex tc_frob_file_before_adjust | |
1457 | If you define this macro, GAS will call it after the symbol values are | |
1458 | resolved, but before the fixups have been changed from local symbols to section | |
1459 | symbols. | |
1460 | ||
1461 | @item tc_frob_symbol | |
1462 | @cindex tc_frob_symbol | |
1463 | If you define this macro, GAS will call it for each symbol. You can indicate | |
062b7c0c | 1464 | that the symbol should not be included in the object file by defining this |
252b5132 RH |
1465 | macro to set its second argument to a non-zero value. |
1466 | ||
1467 | @item tc_frob_file | |
1468 | @cindex tc_frob_file | |
1469 | If you define this macro, GAS will call it after the symbol table has been | |
1470 | completed, but before the relocations have been generated. | |
1471 | ||
1472 | @item tc_frob_file_after_relocs | |
1473 | If you define this macro, GAS will call it after the relocs have been | |
1474 | generated. | |
1475 | ||
2f0c68f2 CM |
1476 | @item tc_cfi_reloc_for_encoding |
1477 | @cindex tc_cfi_reloc_for_encoding | |
1478 | This macro is used to indicate whether a cfi encoding requires a relocation. | |
1479 | It should return the required relocation type. Defining this macro implies | |
1480 | that Compact EH is supported. | |
1481 | ||
e0001a05 NC |
1482 | @item md_post_relax_hook |
1483 | If you define this macro, GAS will call it after relaxing and sizing the | |
1484 | segments. | |
1485 | ||
252b5132 RH |
1486 | @item LISTING_HEADER |
1487 | A string to use on the header line of a listing. The default value is simply | |
1488 | @code{"GAS LISTING"}. | |
1489 | ||
1490 | @item LISTING_WORD_SIZE | |
1491 | The number of bytes to put into a word in a listing. This affects the way the | |
1492 | bytes are clumped together in the listing. For example, a value of 2 might | |
1493 | print @samp{1234 5678} where a value of 1 would print @samp{12 34 56 78}. The | |
1494 | default value is 4. | |
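
The grouping can be pictured with the following sketch, which is not the listing code itself. @code{toy_format_listing} is a hypothetical helper that prints bytes in hexadecimal and inserts a space after every @code{word_size} bytes:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format count bytes as hex digits into out, separating groups of
   word_size bytes with a single space.  */
static void
toy_format_listing (const unsigned char *bytes, size_t count,
                    size_t word_size, char *out, size_t out_size)
{
  size_t pos = 0;
  size_t i;

  assert (word_size > 0);
  /* Two digits per byte, at most one space per byte, one terminator.  */
  assert (out_size >= 3 * count + 1);
  out[0] = '\0';
  for (i = 0; i < count; i++)
    {
      if (i > 0 && i % word_size == 0)
        out[pos++] = ' ';
      pos += (size_t) snprintf (out + pos, out_size - pos, "%02X", bytes[i]);
    }
  assert (strlen (out) == pos);
}
```

For the bytes 0x12, 0x34, 0x56, 0x78 a word size of 2 gives @samp{1234 5678} and a word size of 1 gives @samp{12 34 56 78}, matching the description above.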
1495 | ||
1496 | @item LISTING_LHS_WIDTH | |
1497 | The number of words of data to print on the first line of a listing for a | |
1498 | particular source line, where each word is @code{LISTING_WORD_SIZE} bytes. The | |
1499 | default value is 1. | |
1500 | ||
1501 | @item LISTING_LHS_WIDTH_SECOND | |
1502 | Like @code{LISTING_LHS_WIDTH}, but applying to the second and subsequent lines | |
1503 | of the data printed for a particular source line. The default value is 1. | |
1504 | ||
1505 | @item LISTING_LHS_CONT_LINES | |
1506 | The maximum number of continuation lines to print in a listing for a particular | |
1507 | source line. The default value is 4. | |
1508 | ||
1509 | @item LISTING_RHS_WIDTH | |
1510 | The maximum number of characters to print from one line of the input file. The | |
1511 | default value is 100. | |
b8a9dcab NC |
1512 | |
1513 | @item TC_COFF_SECTION_DEFAULT_ATTRIBUTES | |
1514 | @cindex TC_COFF_SECTION_DEFAULT_ATTRIBUTES | |
1515 | The COFF @code{.section} directive will use the value of this macro to set | |
1516 | a new section's attributes when a directive has no valid flags or when the | |
1517 | flag is @code{w}. The default value of the macro is @code{SEC_LOAD | SEC_DATA}. | |
1518 | ||
c3c36456 | 1519 | @item DWARF2_FORMAT (@var{sec}) |
14e777e0 KB |
1520 | @cindex DWARF2_FORMAT |
1521 | If you define this, it should return one of @code{dwarf2_format_32bit}, | |
1522 | @code{dwarf2_format_64bit}, or @code{dwarf2_format_64bit_irix} to indicate | |
1523 | the size of internal DWARF section offsets and the format of the DWARF initial | |
1524 | length fields. When @code{dwarf2_format_32bit} is returned, the initial | |
1525 | length field will be 4 bytes long and section offsets are 32 bits in size. | |
1526 | For @code{dwarf2_format_64bit} and @code{dwarf2_format_64bit_irix}, section | |
1527 | offsets are 64 bits in size, but the initial length field differs. An 8 byte | |
1528 | initial length is indicated by @code{dwarf2_format_64bit_irix} and | |
1529 | @code{dwarf2_format_64bit} indicates a 12 byte initial length field in | |
1530 | which the first four bytes are 0xffffffff and the next 8 bytes are | |
1531 | the section's length. | |
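
The three initial length layouts can be compared with a small sketch. The enumerators and functions below are hypothetical names chosen for the example, and little-endian byte order is an assumption made only to keep it short:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the three formats described above.  */
enum toy_dwarf2_format
{
  TOY_FORMAT_32BIT,
  TOY_FORMAT_64BIT,
  TOY_FORMAT_64BIT_IRIX
};

/* Store the low nbytes bytes of value at p, least significant first.  */
static void
toy_put_le (unsigned char *p, uint64_t value, size_t nbytes)
{
  size_t i;

  for (i = 0; i < nbytes; i++)
    p[i] = (unsigned char) ((value >> (8 * i)) & 0xff);
}

/* Write an initial length field into out (at least 12 bytes) and
   return the number of bytes written.  */
static size_t
toy_emit_initial_length (enum toy_dwarf2_format format, uint64_t length,
                         unsigned char *out)
{
  assert (out != NULL);
  switch (format)
    {
    case TOY_FORMAT_32BIT:
      /* DWARF reserves 0xfffffff0 and above as escape codes.  */
      assert (length < 0xfffffff0u);
      toy_put_le (out, length, 4);
      return 4;
    case TOY_FORMAT_64BIT_IRIX:
      toy_put_le (out, length, 8);
      return 8;
    case TOY_FORMAT_64BIT:
      toy_put_le (out, 0xffffffffu, 4);   /* escape marker */
      toy_put_le (out + 4, length, 8);
      return 12;
    }
  return 0;
}
```

The function returns 4, 12 and 8 bytes respectively for the three formats, which are the field sizes given above.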
1532 | ||
1533 | If you don't define this, @code{dwarf2_format_32bit} will be used as | |
1534 | the default. | |
1535 | ||
c3c36456 | 1536 | This define only affects debug |
14e777e0 KB |
1537 | sections generated by the assembler. DWARF 2 sections generated by |
1538 | other tools will be unaffected by this setting. | |
1539 | ||
9605f328 AO |
1540 | @item DWARF2_ADDR_SIZE (@var{bfd}) |
1541 | @cindex DWARF2_ADDR_SIZE | |
1542 | It should return the size of an address, as it should be represented in | |
1543 | debugging info. If you don't define this macro, the default definition uses | |
1544 | the number of bits per address, as defined in @var{bfd}, divided by 8. | |
1545 | ||
329e276d NC |
1546 | @item MD_DEBUG_FORMAT_SELECTOR |
1547 | @cindex MD_DEBUG_FORMAT_SELECTOR | |
1548 | If defined, this macro is the name of a function to be called when the | |
1549 | @samp{--gen-debug} switch is detected on the assembler's command line. The | |
1550 | prototype for the function looks like this: | |
1551 | ||
1552 | @smallexample | |
1553 | enum debug_info_type MD_DEBUG_FORMAT_SELECTOR (int * use_gnu_extensions) | |
1554 | @end smallexample | |
1555 | ||
1556 | The function should return the debug format that is preferred by the CPU | |
1557 | backend. This format will be used when generating assembler specific debug | |
1558 | information. | |
1559 | ||
bfff1642 NC |
1560 | @item md_allow_local_subtract (@var{left}, @var{right}, @var{section}) |
1561 | If defined, GAS will call this macro when evaluating an expression which is the | |
1562 | difference of two symbols defined in the same section. It takes three | |
1563 | arguments: @code{expressionS * @var{left}} which is the symbolic expression on | |
1564 | the left hand side of the subtraction operation, @code{expressionS * | |
1565 | @var{right}} which is the symbolic expression on the right hand side of the | |
1566 | subtraction, and @code{segT @var{section}} which is the section containing the two | |
1567 | symbols. The macro should return a non-zero value if the expression should be | |
1568 | evaluated. Targets which implement link time relaxation which may change the | |
1569 | position of the two symbols relative to each other should ensure that this | |
1570 | macro returns zero in situations where this can occur. | |
1571 | ||
8c750480 NC |
1572 | @item md_allow_eh_opt |
1573 | If defined, GAS will check this macro before performing any optimizations on | |
1574 | the DWARF call frame debug information that is emitted. Targets which | |
1575 | implement link time relaxation may need to define this macro and set it to zero | |
1576 | if it is possible to change the size of a function's prologue. | |
252b5132 RH |
1577 | @end table |
1578 | ||
1579 | @node Object format backend | |
1580 | @subsection Writing an object format backend | |
1581 | @cindex object format backend | |
1582 | @cindex @file{obj-@var{fmt}} | |
1583 | ||
1584 | As with the CPU backend, the object format backend must define a few things, | |
1585 | and may define some other things. The interface to the object format backend | |
1586 | is generally simpler; most of the support for an object file format consists of | |
1587 | defining a number of pseudo-ops. | |
1588 | ||
1589 | The object format @file{.h} file must include @file{targ-cpu.h}. | |
1590 | ||
252b5132 RH |
1591 | @table @code |
1592 | @item OBJ_@var{format} | |
1593 | @cindex OBJ_@var{format} | |
1594 | By convention, you should define this macro in the @file{.h} file. For | |
1595 | example, @file{obj-elf.h} defines @code{OBJ_ELF}. You might have to use this | |
1596 | if it is necessary to add object file format specific code to the CPU file. | |
1597 | ||
1598 | @item obj_begin | |
1599 | If you define this macro, GAS will call it at the start of the assembly, after | |
1600 | the command line arguments have been parsed and all the machine independent | |
1601 | initializations have been completed. | |
1602 | ||
1603 | @item obj_app_file | |
1604 | @cindex obj_app_file | |
1605 | If you define this macro, GAS will invoke it when it sees a @code{.file} | |
1606 | pseudo-op or a @samp{#} line as used by the C preprocessor. | |
1607 | ||
1608 | @item OBJ_COPY_SYMBOL_ATTRIBUTES | |
1609 | @cindex OBJ_COPY_SYMBOL_ATTRIBUTES | |
1610 | You should define this macro to copy object format specific information from | |
1611 | one symbol to another. GAS will call it when one symbol is equated to | |
1612 | another. | |
1613 | ||
252b5132 RH |
1614 | @item obj_sec_sym_ok_for_reloc |
1615 | @cindex obj_sec_sym_ok_for_reloc | |
1616 | You may define this macro to indicate that it is OK to use a section symbol in | |
062b7c0c | 1617 | a relocation entry. If it is not, GAS will define a new symbol at the start |
252b5132 RH |
1618 | of a section. |
1619 | ||
1620 | @item EMIT_SECTION_SYMBOLS | |
1621 | @cindex EMIT_SECTION_SYMBOLS | |
1622 | You should define this macro with a zero value if you do not want to include | |
1623 | section symbols in the output symbol table. The default value for this macro | |
1624 | is one. | |
1625 | ||
1626 | @item obj_adjust_symtab | |
1627 | @cindex obj_adjust_symtab | |
1628 | If you define this macro, GAS will invoke it just before setting the symbol | |
1629 | table of the output BFD. For example, the COFF support uses this macro to | |
1630 | generate a @code{.file} symbol if none was generated previously. | |
1631 | ||
1632 | @item SEPARATE_STAB_SECTIONS | |
1633 | @cindex SEPARATE_STAB_SECTIONS | |
0aa5d426 HPN |
1634 | You may define this macro to a nonzero value to indicate that stabs should be |
1635 | placed in separate sections, as in ELF. | |
252b5132 RH |
1636 | |
1637 | @item INIT_STAB_SECTION | |
1638 | @cindex INIT_STAB_SECTION | |
1639 | You may define this macro to initialize the stabs section in the output file. | |
1640 | ||
1641 | @item OBJ_PROCESS_STAB | |
1642 | @cindex OBJ_PROCESS_STAB | |
1643 | You may define this macro to do specific processing on a stabs entry. | |
1644 | ||
1645 | @item obj_frob_section | |
1646 | @cindex obj_frob_section | |
1647 | If you define this macro, GAS will call it for each section at the end of the | |
1648 | assembly. | |
1649 | ||
1650 | @item obj_frob_file_before_adjust | |
1651 | @cindex obj_frob_file_before_adjust | |
1652 | If you define this macro, GAS will call it after the symbol values are | |
1653 | resolved, but before the fixups have been changed from local symbols to section | |
1654 | symbols. | |
1655 | ||
1656 | @item obj_frob_symbol | |
1657 | @cindex obj_frob_symbol | |
1658 | If you define this macro, GAS will call it for each symbol. You can indicate | |
062b7c0c | 1659 | that the symbol should not be included in the object file by defining this |
252b5132 RH |
1660 | macro to set its second argument to a non-zero value. |
1661 | ||
06e77878 AO |
1662 | @item obj_set_weak_hook |
1663 | @cindex obj_set_weak_hook | |
1664 | If you define this macro, @code{S_SET_WEAK} will call it before modifying the | |
1665 | symbol's flags. | |
1666 | ||
1667 | @item obj_clear_weak_hook | |
1668 | @cindex obj_clear_weak_hook | |
b45619c0 | 1669 | If you define this macro, @code{S_CLEAR_WEAKREFD} will call it after clearing |
06e77878 AO |
1670 | the @code{weakrefd} flag, but before modifying any other flags. |
1671 | ||
252b5132 RH |
1672 | @item obj_frob_file |
1673 | @cindex obj_frob_file | |
1674 | If you define this macro, GAS will call it after the symbol table has been | |
1675 | completed, but before the relocations have been generated. | |
1676 | ||
1677 | @item obj_frob_file_after_relocs | |
1678 | If you define this macro, GAS will call it after the relocs have been | |
1679 | generated. | |
945a1a6b ILT |
1680 | |
1681 | @item SET_SECTION_RELOCS (@var{sec}, @var{relocs}, @var{n}) | |
1682 | @cindex SET_SECTION_RELOCS | |
1683 | If you define this, it will be called after the relocations have been set for | |
1684 | the section @var{sec}. The list of relocations is in @var{relocs}, and the | |
829c3ed3 | 1685 | number of relocations is in @var{n}. |
252b5132 RH |
1686 | @end table |
1687 | ||
1688 | @node Emulations | |
1689 | @subsection Writing emulation files | |
1690 | ||
1691 | Normally you do not have to write an emulation file. You can just use | |
1692 | @file{te-generic.h}. | |
1693 | ||
1694 | If you do write your own emulation file, it must include @file{obj-format.h}. | |
1695 | ||
1696 | An emulation file will often define @code{TE_@var{EM}}; this may then be used | |
1697 | in other files to change the output. | |
1698 | ||
1699 | @node Relaxation | |
1700 | @section Relaxation | |
1701 | @cindex relaxation | |
1702 | ||
1703 | @dfn{Relaxation} is a generic term used when the size of some instruction or | |
1704 | data depends upon the value of some symbol or other data. | |
1705 | ||
1706 | GAS knows how to relax a particular type of PC relative relocation using a table. | |
1707 | You can also define arbitrarily complex forms of relaxation yourself. | |
1708 | ||
1709 | @menu | |
1710 | * Relaxing with a table:: Relaxing with a table | |
1711 | * General relaxing:: General relaxing | |
1712 | @end menu | |
1713 | ||
1714 | @node Relaxing with a table | |
1715 | @subsection Relaxing with a table | |
1716 | ||
1717 | If you do not define @code{md_relax_frag}, and you do define | |
1718 | @code{TC_GENERIC_RELAX_TABLE}, GAS will relax @code{rs_machine_dependent} frags | |
1719 | based on the frag subtype and the displacement to some specified target | |
1720 | address. The basic idea is that several machines have different addressing | |
1721 | modes for instructions that can specify different ranges of values, with | |
1722 | successive modes able to access wider ranges, including the entirety of the | |
1723 | previous range. Smaller ranges are assumed to be more desirable (perhaps the | |
1724 | instruction requires one word instead of two or three); if this is not the | |
1725 | case, don't describe the smaller-range, inferior mode. | |
1726 | ||
1727 | The @code{fr_subtype} field of a frag is an index into a CPU-specific | |
1728 | relaxation table. That table entry indicates the range of values that can be | |
1729 | stored, the number of bytes that will have to be added to the frag to | |
062b7c0c | 1730 | accommodate the addressing mode, and the index of the next entry to examine if |
252b5132 RH |
1731 | the value to be stored is outside the range accessible by the current |
1732 | addressing mode. The @code{fr_symbol} field of the frag indicates what symbol | |
1733 | is to be accessed; the @code{fr_offset} field is added in. | |
1734 | ||
1735 | If the @code{TC_PCREL_ADJUST} macro is defined, which currently should only happen | |
1736 | for the NS32k family, the @code{TC_PCREL_ADJUST} macro is called on the frag to | |
1737 | compute an adjustment to be made to the displacement. | |
1738 | ||
1739 | The value fitted by the relaxation code is always assumed to be a displacement | |
1740 | from the current frag. (More specifically, from @code{fr_fix} bytes into the | |
1741 | frag.) | |
1742 | @ignore | |
1743 | This seems kinda silly. What about fitting small absolute values? I suppose | |
1744 | @code{md_assemble} is supposed to take care of that, but if the operand is a | |
1745 | difference between symbols, it might not be able to, if the difference was not | |
1746 | computable yet. | |
1747 | @end ignore | |
1748 | ||
1749 | The end of the relaxation sequence is indicated by a ``next'' value of 0. This | |
1750 | means that the first entry in the table can't be used. | |
1751 | ||
1752 | For some configurations, the linker can do relaxing within a section of an | |
1753 | object file. If call instructions of various sizes exist, the linker can | |
1754 | determine which should be used in each instance, when a symbol's value is | |
1755 | resolved. In order for the linker to avoid wasting space and having to insert | |
1756 | no-op instructions, it must be able to expand or shrink the section contents | |
1757 | while still preserving intra-section references and meeting alignment | |
1758 | requirements. | |
1759 | ||
1760 | For the i960 using b.out format, no expansion is done; instead, each | |
1761 | @samp{.align} directive causes extra space to be allocated, enough that when | |
1762 | the linker is relaxing a section and removing unneeded space, it can discard | |
1763 | some or all of this extra padding and cause the following data to be correctly | |
1764 | aligned. | |
1765 | ||
1766 | For the H8/300, I think the linker expands calls that can't reach, and doesn't | |
1767 | worry about alignment issues; the cpu probably never needs any significant | |
1768 | alignment beyond the instruction size. | |
1769 | ||
1770 | The relaxation table type contains these fields: | |
1771 | ||
1772 | @table @code | |
1773 | @item long rlx_forward | |
1774 | Forward reach, must be non-negative. | |
1775 | @item long rlx_backward | |
1776 | Backward reach, must be zero or negative. | |
1777 | @item rlx_length | |
1778 | Length in bytes of this addressing mode. | |
1779 | @item rlx_more | |
1780 | Index of the next-longer relax state, or zero if there is no next relax state. | |
1781 | @end table | |
1782 | ||
1783 | The relaxation is done in @code{relax_segment} in @file{write.c}. The | |
1784 | difference in the length fields between the original mode and the one finally | |
1785 | chosen by the relaxing code is taken as the size by which the current frag will | |
1786 | be increased in size. For example, if the initial relaxing mode has a length | |
1787 | of 2 bytes, and because of the size of the displacement, it gets upgraded to a | |
1788 | mode with a size of 6 bytes, it is assumed that the frag will grow by 4 bytes. | |
1789 | (The initial two bytes should have been part of the fixed portion of the frag, | |
1790 | since it is already known that they will be output.) This growth must be | |
1791 | effected by @code{md_convert_frag}; it should increase the @code{fr_fix} field | |
1792 | by the appropriate size, and fill in the appropriate bytes of the frag. | |
1793 | (Enough space for the maximum growth should have been allocated in the call to | |
1794 | @code{frag_var} as the second argument.) | |
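
The table walk and the resulting growth can be sketched as below. This is a simplified model rather than the code in @code{relax_segment}: the @code{toy_} names, the field types and the three addressing modes in the table are assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical copy of the relaxation table entry layout.  */
struct toy_relax_type
{
  long rlx_forward;          /* forward reach, non-negative */
  long rlx_backward;         /* backward reach, zero or negative */
  unsigned char rlx_length;  /* length in bytes of this mode */
  unsigned int rlx_more;     /* next-longer state, 0 if none */
};

static const struct toy_relax_type toy_table[] = {
  { 0, 0, 0, 0 },                        /* 0: unusable, 0 ends a chain */
  { 127, -128, 2, 2 },                   /* 1: 8-bit displacement */
  { 32767, -32768, 4, 3 },               /* 2: 16-bit displacement */
  { 0x7fffffffL, -0x7fffffffL - 1, 6, 0 } /* 3: 32-bit displacement */
};

/* Follow rlx_more links until the displacement fits or the chain ends.  */
static unsigned int
toy_relax_state (const struct toy_relax_type *table, unsigned int state,
                 long displacement)
{
  assert (state != 0);
  while (table[state].rlx_more != 0
         && (displacement > table[state].rlx_forward
             || displacement < table[state].rlx_backward))
    state = table[state].rlx_more;
  return state;
}

/* Number of bytes by which the frag grows between two states.  */
static int
toy_growth (const struct toy_relax_type *table, unsigned int initial,
            unsigned int final)
{
  return table[final].rlx_length - table[initial].rlx_length;
}
```

Starting in state 1, a displacement of 40000 ends in state 3, and the frag grows by 6 - 2 = 4 bytes, as in the example above.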
1795 | ||
1796 | If relocation records are needed, they should be emitted by | |
1797 | @code{md_estimate_size_before_relax}. This function should examine the target | |
1798 | symbol of the supplied frag and correct the @code{fr_subtype} of the frag if | |
1799 | needed. When this function is called, if the symbol has not yet been defined, | |
1800 | it will not become defined later; however, its value may still change if the | |
1801 | section it is in gets relaxed. | |
1802 | ||
1803 | Usually, if the symbol is in the same section as the frag (given by the | |
1804 | @var{sec} argument), the narrowest likely relaxation mode is stored in | |
1805 | @code{fr_subtype}, and that's that. | |
1806 | ||
60493797 | 1807 | If the symbol is undefined, or in a different section (and therefore movable |
252b5132 RH |
1808 | to an arbitrarily large distance), the largest available relaxation mode is |
1809 | specified, @code{fix_new} is called to produce the relocation record, | |
1810 | @code{fr_fix} is increased to include the relocated field (remember, this | |
1811 | storage was allocated when @code{frag_var} was called), and @code{frag_wane} is | |
1812 | called to convert the frag to an @code{rs_fill} frag with no variant part. | |
1813 | Sometimes changing addressing modes may also require rewriting the instruction. | |
1814 | It can be accessed via @code{fr_opcode} or @code{fr_fix}. | |
1815 | ||
67db5ab4 HPN |
1816 | If you generate frags separately for the basic insn opcode and any relaxable |
1817 | operands, do not call @code{fix_new} thinking you can emit fixups for the | |
062b7c0c | 1818 | opcode field from the relaxable frag. It is not guaranteed to be the same frag. |
67db5ab4 HPN |
1819 | If you need to emit fixups for the opcode field from inspection of the |
1820 | relaxable frag, then you need to generate a common frag for both the basic | |
1821 | opcode and relaxable fields, or you need to provide the frag for the opcode to | |
1822 | pass to @code{fix_new}. The latter can be done for example by defining | |
1823 | @code{TC_FRAG_TYPE} to include a pointer to it and defining @code{TC_FRAG_INIT} | |
1824 | to set the pointer. | |
1825 | ||
252b5132 RH |
1826 | Sometimes @code{fr_var} is increased instead, and @code{frag_wane} is not |
1827 | called. I'm not sure, but I think this is to keep @code{fr_fix} referring to | |
1828 | an earlier byte, and @code{fr_type} set to @code{rs_machine_dependent} so | |
1829 | that @code{md_convert_frag} will get called. | |
1830 | ||
1831 | @node General relaxing | |
1832 | @subsection General relaxing | |
1833 | ||
1834 | If using a simple table is not suitable, you may implement arbitrarily complex | |
1835 | relaxation semantics yourself. For example, the MIPS backend uses this to emit | |
1836 | different instruction sequences depending upon the size of the symbol being | |
1837 | accessed. | |
1838 | ||
1839 | When you assemble an instruction that may need relaxation, you should allocate | |
1840 | a frag using @code{frag_var} or @code{frag_variant} with a type of | |
1841 | @code{rs_machine_dependent}. You should store some sort of information in the | |
1842 | @code{fr_subtype} field so that you can figure out what to do with the frag | |
1843 | later. | |
1844 | ||
1845 | When GAS reaches the end of the input file, it will look through the frags and | |
1846 | work out their final sizes. | |
1847 | ||
1848 | GAS will first call @code{md_estimate_size_before_relax} on each | |
1849 | @code{rs_machine_dependent} frag. This function must return an estimated size | |
1850 | for the frag. | |
1851 | ||
1852 | GAS will then loop over the frags, calling @code{md_relax_frag} on each | |
1853 | @code{rs_machine_dependent} frag. This function should return the change in | |
1854 | size of the frag. GAS will keep looping over the frags until none of the frags | |
1855 | changes size. | |
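
The fixed-point loop can be modelled with a toy example. Everything here is hypothetical: each @code{toy_frag} is treated as a forward branch that is 2 bytes long while the frags it jumps over total at most 8 bytes, and 6 bytes long otherwise.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SHORT 2   /* size of the short branch form */
#define TOY_LONG  6   /* size of the long branch form */
#define TOY_REACH 8   /* largest distance the short form can span */

struct toy_frag
{
  long size;       /* current size of the frag */
  size_t target;   /* index of the frag branched to (after this one) */
};

/* Like md_relax_frag, return the change in size of frag i.  */
static long
toy_relax_frag (struct toy_frag *frags, size_t i)
{
  long distance = 0;
  size_t j;

  for (j = i + 1; j < frags[i].target; j++)
    distance += frags[j].size;
  if (frags[i].size == TOY_SHORT && distance > TOY_REACH)
    {
      frags[i].size = TOY_LONG;
      return TOY_LONG - TOY_SHORT;
    }
  return 0;
}

/* Keep relaxing until a whole pass changes nothing; return the number
   of passes made.  Frags only ever grow, so the loop terminates.  */
static int
toy_relax_all (struct toy_frag *frags, size_t count)
{
  int passes = 0;
  int changed;
  size_t i;

  assert (frags != NULL || count == 0);
  do
    {
      changed = 0;
      for (i = 0; i < count; i++)
        if (toy_relax_frag (frags, i) != 0)
          changed = 1;
      passes++;
    }
  while (changed);
  return passes;
}
```

The reason more than one pass may be needed is that growing one frag can push another frag's target out of reach, so a frag that fitted on the first pass may have to grow on the second.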
1856 | ||
1857 | @node Broken words | |
1858 | @section Broken words | |
1859 | @cindex internals, broken words | |
1860 | @cindex broken words | |
1861 | ||
1862 | Some compilers, including GCC, will sometimes emit switch tables specifying | |
1863 | 16-bit @code{.word} displacements to branch targets, and branch instructions | |
1864 | that load entries from that table to compute the target address. If this is | |
1865 | done on a 32-bit machine, there is a chance (at least with really large | |
1866 | functions) that the displacement will not fit in 16 bits. The assembler | |
1867 | handles this using a concept called @dfn{broken words}. This idea is well | |
1868 | named, since there is an implied promise that the 16-bit field will in fact | |
1869 | hold the specified displacement. | |
1870 | ||
1871 | If broken word processing is enabled, and a situation like this is encountered, | |
1872 | the assembler will insert a jump instruction into the instruction stream, close | |
1873 | enough to be reached with the 16-bit displacement. This jump instruction will | |
1874 | transfer to the real desired target address. Thus, as long as the @code{.word} | |
1875 | value really is used as a displacement to compute an address to jump to, the | |
1876 | net effect will be correct (at a very small cost in efficiency). If | |
1877 | @code{.word} directives with label differences for values are used for other | |
1878 | purposes, however, things may not work properly. For targets which use broken | |
1879 | words, the @samp{-K} option will warn when a broken word is discovered. | |
1880 | ||
1881 | Defining the @code{WORKING_DOT_WORD} macro turns the broken word code off. It | |
1882 | isn't needed if @code{.word} emits a value large enough to contain an address | |
1883 | (or, more correctly, any possible difference between two addresses). | |
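
As a rough illustration of when a word counts as broken, the following C sketch tests whether a label difference fits in a 16-bit field. The helper names are hypothetical, and treating the @code{.word} as a signed 16-bit quantity is an assumption of the sketch; the real check lives in GAS itself and may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed range for this sketch: a signed 16-bit .word. */
#define WORD16_MIN (-32768L)
#define WORD16_MAX 32767L

/* Returns 1 if the displacement can be stored in the 16-bit field. */
static int fits_in_word16(long displacement)
{
    return displacement >= WORD16_MIN && displacement <= WORD16_MAX;
}

/* Hypothetical helper: count the switch-table entries whose displacement
   from table_base does not fit, i.e. the "broken" words that would need
   a nearby jump instruction. */
static size_t count_broken_words(const long *targets, size_t n,
                                 long table_base)
{
    size_t i, broken = 0;

    assert(targets != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (!fits_in_word16(targets[i] - table_base))
            broken++;
    return broken;
}
```

When @code{.word} is wide enough to hold any difference between two addresses, a check of this kind can never fail, which is the case in which the broken word code is unnecessary.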
1884 | ||
1885 | @node Internal functions | |
1886 | @section Internal functions | |
1887 | ||
1888 | This section describes basic internal functions used by GAS. | |
1889 | ||
1890 | @menu | |
1891 | * Warning and error messages:: Warning and error messages | |
1892 | * Hash tables:: Hash tables | |
1893 | @end menu | |
1894 | ||
1895 | @node Warning and error messages | |
1896 | @subsection Warning and error messages | |
1897 | ||
1898 | @deftypefun @{@} int had_warnings (void) | |
1899 | @deftypefunx @{@} int had_errors (void) | |
1900 | Returns non-zero if any warnings or errors, respectively, have been printed | |
1901 | during this invocation. | |
1902 | @end deftypefun | |
1903 | ||
252b5132 RH |
1904 | @deftypefun @{@} void as_tsktsk (const char *@var{format}, ...) |
1905 | @deftypefunx @{@} void as_warn (const char *@var{format}, ...) | |
1906 | @deftypefunx @{@} void as_bad (const char *@var{format}, ...) | |
1907 | @deftypefunx @{@} void as_fatal (const char *@var{format}, ...) | |
1908 | These functions display messages about something amiss with the input file, or | |
1909 | internal problems in the assembler itself. The current file name and line | |
1910 | number are printed, followed by the supplied message, formatted using | |
1911 | @code{vfprintf}, and a final newline. | |
1912 | ||
1913 | An error indicated by @code{as_bad} will result in a non-zero exit status when | |
1914 | the assembler has finished. Calling @code{as_fatal} will result in immediate | |
1915 | termination of the assembler process. | |
1916 | @end deftypefun | |
1917 | ||
1918 | @deftypefun @{@} void as_warn_where (char *@var{file}, unsigned int @var{line}, const char *@var{format}, ...) | |
1919 | @deftypefunx @{@} void as_bad_where (char *@var{file}, unsigned int @var{line}, const char *@var{format}, ...) | |
1920 | These variants take the file name and line number explicitly. They are used | |
1921 | when a problem is detected while reprocessing information that was saved | |
1922 | during an earlier part of the file. For example, fixups are processed | |
1923 | after all input has been read, but messages about fixups should refer to the | |
1924 | file name and line number where the fixup originated. | |
1925 | @end deftypefun | |
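
The reporting conventions described above can be modeled with a small self-contained C sketch. It is not the GAS source: the @code{toy_} names, the message layout, and the use of @code{stderr} are assumptions made for illustration. It shows a warning and an error variant that take an explicit file and line, format the message with @code{vfprintf}, and make the exit status depend only on errors.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

static int toy_warning_count;   /* plays the role of had_warnings () */
static int toy_error_count;     /* plays the role of had_errors () */

/* Print "FILE:LINE: KIND: message" followed by a final newline. */
static void toy_vreport(const char *file, unsigned int line,
                        const char *kind, const char *format, va_list args)
{
    assert(file != NULL && format != NULL);
    fprintf(stderr, "%s:%u: %s: ", file, line, kind);
    vfprintf(stderr, format, args);
    fputc('\n', stderr);
}

/* Warning with an explicit location; does not affect the exit status. */
static void toy_warn_where(const char *file, unsigned int line,
                           const char *format, ...)
{
    va_list args;

    toy_warning_count++;
    va_start(args, format);
    toy_vreport(file, line, "Warning", format, args);
    va_end(args);
}

/* Error with an explicit location; processing continues, but the final
   exit status becomes non-zero. */
static void toy_bad_where(const char *file, unsigned int line,
                          const char *format, ...)
{
    va_list args;

    toy_error_count++;
    va_start(args, format);
    toy_vreport(file, line, "Error", format, args);
    va_end(args);
}

/* Exit status to use once all input has been processed. */
static int toy_exit_status(void)
{
    return toy_error_count != 0;
}
```

A fixup pass would keep the file name and line number recorded when the fixup was created and hand them to the @code{_where} variant, so the message points at the original source line rather than at the end of the input.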
1926 | ||
87c245cc BE |
1927 | @deftypefun @{@} void sprint_value (char *@var{buf}, valueT @var{val}) |
1928 | This function is helpful for converting a @code{valueT} value into printable | |
252b5132 RH |
1929 | format, in case it's wider than modes that @code{*printf} can handle. If the |
1930 | type is narrow enough, a decimal number will be produced; otherwise, it will be | |
1931 | in hexadecimal. The value itself is not examined to make this determination. | |
1932 | @end deftypefun | |
1933 | ||
1934 | @node Hash tables | |
1935 | @subsection Hash tables | |
1936 | @cindex hash tables | |
1937 | ||
1938 | @deftypefun @{@} @{struct hash_control *@} hash_new (void) | |
1939 | Creates the hash table control structure. | |
1940 | @end deftypefun | |
1941 | ||
1942 | @deftypefun @{@} void hash_die (struct hash_control *) | |
1943 | Destroys a hash table. | |
1944 | @end deftypefun | |
1945 | ||
5a49b8ac | 1946 | @deftypefun @{@} void *hash_delete (struct hash_control *, const char *, int) |
818236e5 AM |
1947 | Deletes an entry from the hash table and returns the value it had. If the last |
1948 | argument is non-zero, frees the memory allocated for this entry and for all | |
1949 | entries allocated more recently than this entry. | |
252b5132 RH |
1950 | @end deftypefun |
1951 | ||
5a49b8ac | 1952 | @deftypefun @{@} void *hash_replace (struct hash_control *, const char *, void *) |
252b5132 RH |
1953 | Updates the value for an entry already in the table, returning the old value. |
1954 | If no entry was found, just returns NULL. | |
1955 | @end deftypefun | |
1956 | ||
5a49b8ac | 1957 | @deftypefun @{@} @{const char *@} hash_insert (struct hash_control *, const char *, void *) |
252b5132 RH |
1958 | Inserts a new entry; inserting a key that is already in the table is an error. |
1959 | Returns an error message on failure, or NULL on success. | |
1960 | @end deftypefun | |
1961 | ||
5a49b8ac | 1962 | @deftypefun @{@} @{const char *@} hash_jam (struct hash_control *, const char *, void *) |
252b5132 RH |
1963 | Inserts the entry if the key isn't already present; updates its value if it is. |
1964 | @end deftypefun | |
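
The difference between inserting, replacing, and jamming is easiest to see side by side. The following is a minimal chained hash table written for this illustration only; the @code{toy_} names and the table layout are hypothetical and do not reflect how @code{struct hash_control} is implemented. Only the return-value conventions described above are modeled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BUCKETS 31

struct toy_entry {
    struct toy_entry *next;
    const char *key;    /* not copied; the caller keeps it alive */
    void *value;
};

struct toy_table {
    struct toy_entry *buckets[TOY_BUCKETS];
};

static unsigned toy_hash(const char *key)
{
    unsigned h = 0;

    while (*key)
        h = h * 31u + (unsigned char) *key++;
    return h % TOY_BUCKETS;
}

static struct toy_entry *toy_find(struct toy_table *t, const char *key)
{
    struct toy_entry *e;

    assert(t != NULL && key != NULL);
    for (e = t->buckets[toy_hash(key)]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e;
    return NULL;
}

/* Insert: an existing key is an error.  Returns an error message,
   or NULL on success. */
static const char *toy_insert(struct toy_table *t, const char *key,
                              void *value)
{
    struct toy_entry *e;
    unsigned h;

    if (toy_find(t, key) != NULL)
        return "exists";
    e = malloc(sizeof *e);
    if (e == NULL)
        return "out of memory";
    h = toy_hash(key);
    e->key = key;
    e->value = value;
    e->next = t->buckets[h];
    t->buckets[h] = e;
    return NULL;
}

/* Replace: update an existing entry and return the old value, or
   return NULL if the key is absent. */
static void *toy_replace(struct toy_table *t, const char *key, void *value)
{
    struct toy_entry *e = toy_find(t, key);
    void *old;

    if (e == NULL)
        return NULL;
    old = e->value;
    e->value = value;
    return old;
}

/* Jam: insert if the key is absent, update the value if it is present. */
static const char *toy_jam(struct toy_table *t, const char *key, void *value)
{
    struct toy_entry *e = toy_find(t, key);

    if (e != NULL) {
        e->value = value;
        return NULL;
    }
    return toy_insert(t, key, value);
}
```

In this sketch a second insert of the same key fails with a message, a replace of a missing key quietly returns NULL, and a jam succeeds in both situations.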
1965 | ||
1966 | @node Test suite | |
1967 | @section Test suite | |
1968 | @cindex test suite | |
1969 | ||
1970 | The test suite is minimal for most processors. Often it only checks to | |
1971 | see if a couple of files can be assembled without the assembler reporting any | |
1972 | errors. For more complete testing, write a test which either examines the | |
1973 | assembler listing, or runs @code{objdump} and examines its output. For the | |
1974 | latter, the TCL procedure @code{run_dump_test} may come in handy. It takes the | |
1975 | base name of a file, and looks for @file{@var{file}.d}. This file should | |
1976 | contain as its initial lines a set of variable settings in @samp{#} comments, | |
1977 | in the form: | |
1978 | ||
1979 | @example | |
1980 | #@var{varname}: @var{value} | |
1981 | @end example | |
1982 | ||
1983 | The @var{varname} may be @code{objdump}, @code{nm}, or @code{as}, in which case | |
1984 | it specifies the options to be passed to that program. Exactly one | |
1985 | of @code{objdump} or @code{nm} must be specified, as that also specifies which | |
1986 | program to run after the assembler has finished. If @var{varname} is | |
1987 | @code{source}, it specifies the name of the source file; otherwise, | |
1988 | @file{@var{file}.s} is used. If @var{varname} is @code{name}, it specifies the | |
1989 | name of the test to be used in the @code{pass} or @code{fail} messages. | |
1990 | ||
1991 | The non-commented parts of the file are interpreted as regular expressions, one | |
1992 | per line. Blank lines in the @code{objdump} or @code{nm} output are skipped, | |
1993 | as are blank lines in the @code{.d} file; the other lines are tested to see if | |
1994 | the regular expression matches the program output. If it does not, the test | |
1995 | fails. | |
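
The comparison step can be sketched in C using POSIX @code{regcomp} and @code{regexec}. This is an approximation under stated assumptions, not the test harness itself: the real procedure is written in TCL and uses TCL regular expressions, and the function name @code{toy_dump_matches}, the one-pattern-per-output-line pairing, and the unanchored match are choices made for this sketch.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if the line holds nothing but spaces and tabs. */
static int toy_is_blank(const char *s)
{
    while (*s == ' ' || *s == '\t')
        s++;
    return *s == '\0';
}

/* Compare expected patterns against program output, one pattern per
   output line, skipping blank lines on both sides.  Returns 1 if every
   pattern matches its line and neither side has lines left over. */
static int toy_dump_matches(const char *const *patterns, size_t np,
                            const char *const *output, size_t no)
{
    size_t p = 0, o = 0;

    assert(patterns != NULL && output != NULL);
    for (;;) {
        regex_t re;
        int rc;

        while (p < np && toy_is_blank(patterns[p]))
            p++;
        while (o < no && toy_is_blank(output[o]))
            o++;
        if (p == np || o == no)
            return p == np && o == no;
        /* A pattern that does not compile counts as a failure. */
        if (regcomp(&re, patterns[p], REG_EXTENDED | REG_NOSUB) != 0)
            return 0;
        rc = regexec(&re, output[o], 0, NULL, 0);
        regfree(&re);
        if (rc != 0)
            return 0;
        p++;
        o++;
    }
}
```

Because each pattern is tied to one line of output, any change to the layout of that output makes the comparison fail even when the assembled code is unchanged.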
1996 | ||
1997 | Note that this means the tests must be modified if the @code{objdump} output | |
1998 | style is changed. | |
1999 | ||
2000 | @bye | |
2001 | @c Local Variables: | |
2002 | @c fill-column: 79 | |
2003 | @c End: |