2011-01-07 Michael Snyder <msnyder@vmware.com>
[deliverable/binutils-gdb.git] / gas / doc / c-i386.texi
CommitLineData
2da5c037 1@c Copyright 1991, 1992, 1993, 1994, 1995, 1997, 1998, 1999, 2000,
aa820537 2@c 2001, 2003, 2004, 2005, 2006, 2007, 2008, 2009
f7e42eb4 3@c Free Software Foundation, Inc.
252b5132
RH
4@c This is part of the GAS manual.
5@c For copying conditions, see the file as.texinfo.
731caf76
L
6@c man end
7
252b5132
RH
8@ifset GENERIC
9@page
10@node i386-Dependent
11@chapter 80386 Dependent Features
12@end ifset
13@ifclear GENERIC
14@node Machine Dependencies
15@chapter 80386 Dependent Features
16@end ifclear
17
18@cindex i386 support
b6169b20 19@cindex i80386 support
55b62671
AJ
20@cindex x86-64 support
21
22The i386 version @code{@value{AS}} supports both the original Intel 386
23architecture in both 16 and 32-bit mode as well as AMD x86-64 architecture
24extending the Intel architecture to 64-bits.
25
252b5132
RH
26@menu
27* i386-Options:: Options
a6c24e68 28* i386-Directives:: X86 specific directives
252b5132
RH
29* i386-Syntax:: AT&T Syntax versus Intel Syntax
30* i386-Mnemonics:: Instruction Naming
31* i386-Regs:: Register Naming
32* i386-Prefixes:: Instruction Prefixes
33* i386-Memory:: Memory References
fddf5b5b 34* i386-Jumps:: Handling of Jump Instructions
252b5132
RH
35* i386-Float:: Floating Point
36* i386-SIMD:: Intel's MMX and AMD's 3DNow! SIMD Operations
f88c9eb0 37* i386-LWP:: AMD's Lightweight Profiling Instructions
252b5132 38* i386-16bit:: Writing 16-bit Code
e413e4e9 39* i386-Arch:: Specifying an x86 CPU architecture
252b5132
RH
40* i386-Bugs:: AT&T Syntax bugs
41* i386-Notes:: Notes
42@end menu
43
44@node i386-Options
45@section Options
46
55b62671
AJ
47@cindex options for i386
48@cindex options for x86-64
49@cindex i386 options
50@cindex x86-64 options
51
52The i386 version of @code{@value{AS}} has a few machine
53dependent options:
54
731caf76
L
55@c man begin OPTIONS
56@table @gcctabopt
55b62671
AJ
57@cindex @samp{--32} option, i386
58@cindex @samp{--32} option, x86-64
351f65ca
L
59@cindex @samp{--n32} option, i386
60@cindex @samp{--n32} option, x86-64
55b62671
AJ
61@cindex @samp{--64} option, i386
62@cindex @samp{--64} option, x86-64
351f65ca 63@item --32 | --n32 | --64
35cc6a0b
L
64Select the word size, either 32 bits or 64 bits. @samp{--32}
65implies Intel i386 architecture, while @samp{--n32} and @samp{--64}
66imply AMD x86-64 architecture with 32-bit or 64-bit word-size
67respectively.
55b62671
AJ
68
69These options are only available with the ELF object file format, and
70require that the necessary BFD support has been included (on a 32-bit
71platform you have to add --enable-64-bit-bfd to configure enable 64-bit
72usage and use x86-64 as target platform).
12b55ccc
L
73
74@item -n
75By default, x86 GAS replaces multiple nop instructions used for
76alignment within code sections with multi-byte nop instructions such
77as leal 0(%esi,1),%esi. This switch disables the optimization.
b3b91714
AM
78
79@cindex @samp{--divide} option, i386
80@item --divide
81On SVR4-derived platforms, the character @samp{/} is treated as a comment
82character, which means that it cannot be used in expressions. The
83@samp{--divide} option turns @samp{/} into a normal character. This does
84not disable @samp{/} at the beginning of a line starting a comment, or
85affect using @samp{#} for starting a comment.
86
9103f4f4
L
87@cindex @samp{-march=} option, i386
88@cindex @samp{-march=} option, x86-64
6305a203
L
89@item -march=@var{CPU}[+@var{EXTENSION}@dots{}]
90This option specifies the target processor. The assembler will
91issue an error message if an attempt is made to assemble an instruction
92which will not execute on the target processor. The following
93processor names are recognized:
9103f4f4
L
94@code{i8086},
95@code{i186},
96@code{i286},
97@code{i386},
98@code{i486},
99@code{i586},
100@code{i686},
101@code{pentium},
102@code{pentiumpro},
103@code{pentiumii},
104@code{pentiumiii},
105@code{pentium4},
106@code{prescott},
107@code{nocona},
ef05d495
L
108@code{core},
109@code{core2},
bd5295b2 110@code{corei7},
8a9036a4 111@code{l1om},
9103f4f4
L
112@code{k6},
113@code{k6_2},
114@code{athlon},
9103f4f4
L
115@code{opteron},
116@code{k8},
1ceab344 117@code{amdfam10},
68339fdf 118@code{bdver1},
9103f4f4
L
119@code{generic32} and
120@code{generic64}.
121
6305a203
L
122In addition to the basic instruction set, the assembler can be told to
123accept various extension mnemonics. For example,
124@code{-march=i686+sse4+vmx} extends @var{i686} with @var{sse4} and
125@var{vmx}. The following extensions are currently supported:
309d3373
JB
126@code{8087},
127@code{287},
128@code{387},
129@code{no87},
6305a203 130@code{mmx},
309d3373 131@code{nommx},
6305a203
L
132@code{sse},
133@code{sse2},
134@code{sse3},
135@code{ssse3},
136@code{sse4.1},
137@code{sse4.2},
138@code{sse4},
309d3373 139@code{nosse},
c0f3af97 140@code{avx},
309d3373 141@code{noavx},
6305a203
L
142@code{vmx},
143@code{smx},
f03fe4c1 144@code{xsave},
c7b8aa3a 145@code{xsaveopt},
c0f3af97 146@code{aes},
594ab6a3 147@code{pclmul},
c7b8aa3a
L
148@code{fsgsbase},
149@code{rdrnd},
150@code{f16c},
c0f3af97 151@code{fma},
f1f8f695
L
152@code{movbe},
153@code{ept},
bd5295b2 154@code{clflush},
f88c9eb0 155@code{lwp},
5dd85c99
SP
156@code{fma4},
157@code{xop},
bd5295b2 158@code{syscall},
1b7f3fb0 159@code{rdtscp},
6305a203
L
160@code{3dnow},
161@code{3dnowa},
162@code{sse4a},
163@code{sse5},
164@code{svme},
165@code{abm} and
166@code{padlock}.
309d3373
JB
167Note that rather than extending a basic instruction set, the extension
168mnemonics starting with @code{no} revoke the respective functionality.
6305a203
L
169
170When the @code{.arch} directive is used with @option{-march}, the
9103f4f4
L
171@code{.arch} directive will take precedent.
172
173@cindex @samp{-mtune=} option, i386
174@cindex @samp{-mtune=} option, x86-64
175@item -mtune=@var{CPU}
176This option specifies a processor to optimize for. When used in
177conjunction with the @option{-march} option, only instructions
178of the processor specified by the @option{-march} option will be
179generated.
180
6305a203
L
181Valid @var{CPU} values are identical to the processor list of
182@option{-march=@var{CPU}}.
9103f4f4 183
c0f3af97
L
184@cindex @samp{-msse2avx} option, i386
185@cindex @samp{-msse2avx} option, x86-64
186@item -msse2avx
187This option specifies that the assembler should encode SSE instructions
188with VEX prefix.
189
daf50ae7
L
190@cindex @samp{-msse-check=} option, i386
191@cindex @samp{-msse-check=} option, x86-64
192@item -msse-check=@var{none}
1f9bb1ca
AS
193@itemx -msse-check=@var{warning}
194@itemx -msse-check=@var{error}
daf50ae7
L
195These options control if the assembler should check SSE intructions.
196@option{-msse-check=@var{none}} will make the assembler not to check SSE
197instructions, which is the default. @option{-msse-check=@var{warning}}
198will make the assembler issue a warning for any SSE intruction.
199@option{-msse-check=@var{error}} will make the assembler issue an error
200for any SSE intruction.
201
539f890d
L
202@cindex @samp{-mavxscalar=} option, i386
203@cindex @samp{-mavxscalar=} option, x86-64
204@item -mavxscalar=@var{128}
1f9bb1ca 205@itemx -mavxscalar=@var{256}
539f890d
L
206This options control how the assembler should encode scalar AVX
207instructions. @option{-mavxscalar=@var{128}} will encode scalar
208AVX instructions with 128bit vector length, which is the default.
209@option{-mavxscalar=@var{256}} will encode scalar AVX instructions
210with 256bit vector length.
211
1efbbeb4
L
212@cindex @samp{-mmnemonic=} option, i386
213@cindex @samp{-mmnemonic=} option, x86-64
214@item -mmnemonic=@var{att}
1f9bb1ca 215@itemx -mmnemonic=@var{intel}
1efbbeb4
L
216This option specifies instruction mnemonic for matching instructions.
217The @code{.att_mnemonic} and @code{.intel_mnemonic} directives will
218take precedent.
219
220@cindex @samp{-msyntax=} option, i386
221@cindex @samp{-msyntax=} option, x86-64
222@item -msyntax=@var{att}
1f9bb1ca 223@itemx -msyntax=@var{intel}
1efbbeb4
L
224This option specifies instruction syntax when processing instructions.
225The @code{.att_syntax} and @code{.intel_syntax} directives will
226take precedent.
227
228@cindex @samp{-mnaked-reg} option, i386
229@cindex @samp{-mnaked-reg} option, x86-64
230@item -mnaked-reg
231This opetion specifies that registers don't require a @samp{%} prefix.
e1d4d893 232The @code{.att_syntax} and @code{.intel_syntax} directives will take precedent.
1efbbeb4 233
55b62671 234@end table
731caf76 235@c man end
e413e4e9 236
a6c24e68
NC
237@node i386-Directives
238@section x86 specific Directives
239
240@cindex machine directives, x86
241@cindex x86 machine directives
242@table @code
243
244@cindex @code{lcomm} directive, COFF
245@item .lcomm @var{symbol} , @var{length}[, @var{alignment}]
246Reserve @var{length} (an absolute expression) bytes for a local common
247denoted by @var{symbol}. The section and value of @var{symbol} are
248those of the new local common. The addresses are allocated in the bss
704209c0
NC
249section, so that at run-time the bytes start off zeroed. Since
250@var{symbol} is not declared global, it is normally not visible to
251@code{@value{LD}}. The optional third parameter, @var{alignment},
252specifies the desired alignment of the symbol in the bss section.
a6c24e68
NC
253
254This directive is only available for COFF based x86 targets.
255
256@c FIXME: Document other x86 specific directives ? Eg: .code16gcc,
257@c .largecomm
258
259@end table
260
252b5132
RH
261@node i386-Syntax
262@section AT&T Syntax versus Intel Syntax
263
e413e4e9
AM
264@cindex i386 intel_syntax pseudo op
265@cindex intel_syntax pseudo op, i386
266@cindex i386 att_syntax pseudo op
267@cindex att_syntax pseudo op, i386
252b5132
RH
268@cindex i386 syntax compatibility
269@cindex syntax compatibility, i386
55b62671
AJ
270@cindex x86-64 intel_syntax pseudo op
271@cindex intel_syntax pseudo op, x86-64
272@cindex x86-64 att_syntax pseudo op
273@cindex att_syntax pseudo op, x86-64
274@cindex x86-64 syntax compatibility
275@cindex syntax compatibility, x86-64
e413e4e9
AM
276
277@code{@value{AS}} now supports assembly using Intel assembler syntax.
278@code{.intel_syntax} selects Intel mode, and @code{.att_syntax} switches
279back to the usual AT&T mode for compatibility with the output of
280@code{@value{GCC}}. Either of these directives may have an optional
281argument, @code{prefix}, or @code{noprefix} specifying whether registers
282require a @samp{%} prefix. AT&T System V/386 assembler syntax is quite
252b5132
RH
283different from Intel syntax. We mention these differences because
284almost all 80386 documents use Intel syntax. Notable differences
285between the two syntaxes are:
286
287@cindex immediate operands, i386
288@cindex i386 immediate operands
289@cindex register operands, i386
290@cindex i386 register operands
291@cindex jump/call operands, i386
292@cindex i386 jump/call operands
293@cindex operand delimiters, i386
55b62671
AJ
294
295@cindex immediate operands, x86-64
296@cindex x86-64 immediate operands
297@cindex register operands, x86-64
298@cindex x86-64 register operands
299@cindex jump/call operands, x86-64
300@cindex x86-64 jump/call operands
301@cindex operand delimiters, x86-64
252b5132
RH
302@itemize @bullet
303@item
304AT&T immediate operands are preceded by @samp{$}; Intel immediate
305operands are undelimited (Intel @samp{push 4} is AT&T @samp{pushl $4}).
306AT&T register operands are preceded by @samp{%}; Intel register operands
307are undelimited. AT&T absolute (as opposed to PC relative) jump/call
308operands are prefixed by @samp{*}; they are undelimited in Intel syntax.
309
310@cindex i386 source, destination operands
311@cindex source, destination operands; i386
55b62671
AJ
312@cindex x86-64 source, destination operands
313@cindex source, destination operands; x86-64
252b5132
RH
314@item
315AT&T and Intel syntax use the opposite order for source and destination
316operands. Intel @samp{add eax, 4} is @samp{addl $4, %eax}. The
317@samp{source, dest} convention is maintained for compatibility with
96ef6e0f
L
318previous Unix assemblers. Note that @samp{bound}, @samp{invlpga}, and
319instructions with 2 immediate operands, such as the @samp{enter}
320instruction, do @emph{not} have reversed order. @ref{i386-Bugs}.
252b5132
RH
321
322@cindex mnemonic suffixes, i386
323@cindex sizes operands, i386
324@cindex i386 size suffixes
55b62671
AJ
325@cindex mnemonic suffixes, x86-64
326@cindex sizes operands, x86-64
327@cindex x86-64 size suffixes
252b5132
RH
328@item
329In AT&T syntax the size of memory operands is determined from the last
330character of the instruction mnemonic. Mnemonic suffixes of @samp{b},
55b62671
AJ
331@samp{w}, @samp{l} and @samp{q} specify byte (8-bit), word (16-bit), long
332(32-bit) and quadruple word (64-bit) memory references. Intel syntax accomplishes
333this by prefixing memory operands (@emph{not} the instruction mnemonics) with
334@samp{byte ptr}, @samp{word ptr}, @samp{dword ptr} and @samp{qword ptr}. Thus,
335Intel @samp{mov al, byte ptr @var{foo}} is @samp{movb @var{foo}, %al} in AT&T
336syntax.
252b5132 337
4b06377f
L
338In 64-bit code, @samp{movabs} can be used to encode the @samp{mov}
339instruction with the 64-bit displacement or immediate operand.
340
252b5132
RH
341@cindex return instructions, i386
342@cindex i386 jump, call, return
55b62671
AJ
343@cindex return instructions, x86-64
344@cindex x86-64 jump, call, return
252b5132
RH
345@item
346Immediate form long jumps and calls are
347@samp{lcall/ljmp $@var{section}, $@var{offset}} in AT&T syntax; the
348Intel syntax is
349@samp{call/jmp far @var{section}:@var{offset}}. Also, the far return
350instruction
351is @samp{lret $@var{stack-adjust}} in AT&T syntax; Intel syntax is
352@samp{ret far @var{stack-adjust}}.
353
354@cindex sections, i386
355@cindex i386 sections
55b62671
AJ
356@cindex sections, x86-64
357@cindex x86-64 sections
252b5132
RH
358@item
359The AT&T assembler does not provide support for multiple section
360programs. Unix style systems expect all programs to be single sections.
361@end itemize
362
363@node i386-Mnemonics
364@section Instruction Naming
365
366@cindex i386 instruction naming
367@cindex instruction naming, i386
55b62671
AJ
368@cindex x86-64 instruction naming
369@cindex instruction naming, x86-64
370
252b5132 371Instruction mnemonics are suffixed with one character modifiers which
55b62671
AJ
372specify the size of operands. The letters @samp{b}, @samp{w}, @samp{l}
373and @samp{q} specify byte, word, long and quadruple word operands. If
374no suffix is specified by an instruction then @code{@value{AS}} tries to
375fill in the missing suffix based on the destination register operand
376(the last one by convention). Thus, @samp{mov %ax, %bx} is equivalent
377to @samp{movw %ax, %bx}; also, @samp{mov $1, %bx} is equivalent to
378@samp{movw $1, bx}. Note that this is incompatible with the AT&T Unix
379assembler which assumes that a missing mnemonic suffix implies long
380operand size. (This incompatibility does not affect compiler output
381since compilers always explicitly specify the mnemonic suffix.)
252b5132
RH
382
383Almost all instructions have the same names in AT&T and Intel format.
384There are a few exceptions. The sign extend and zero extend
385instructions need two sizes to specify them. They need a size to
386sign/zero extend @emph{from} and a size to zero extend @emph{to}. This
387is accomplished by using two instruction mnemonic suffixes in AT&T
388syntax. Base names for sign extend and zero extend are
389@samp{movs@dots{}} and @samp{movz@dots{}} in AT&T syntax (@samp{movsx}
390and @samp{movzx} in Intel syntax). The instruction mnemonic suffixes
391are tacked on to this base name, the @emph{from} suffix before the
392@emph{to} suffix. Thus, @samp{movsbl %al, %edx} is AT&T syntax for
393``move sign extend @emph{from} %al @emph{to} %edx.'' Possible suffixes,
394thus, are @samp{bl} (from byte to long), @samp{bw} (from byte to word),
55b62671
AJ
395@samp{wl} (from word to long), @samp{bq} (from byte to quadruple word),
396@samp{wq} (from word to quadruple word), and @samp{lq} (from long to
397quadruple word).
252b5132 398
b6169b20
L
399@cindex encoding options, i386
400@cindex encoding options, x86-64
401
402Different encoding options can be specified via optional mnemonic
403suffix. @samp{.s} suffix swaps 2 register operands in encoding when
f8a5c266
L
404moving from one register to another. @samp{.d32} suffix forces 32bit
405displacement in encoding.
b6169b20 406
252b5132
RH
407@cindex conversion instructions, i386
408@cindex i386 conversion instructions
55b62671
AJ
409@cindex conversion instructions, x86-64
410@cindex x86-64 conversion instructions
252b5132
RH
411The Intel-syntax conversion instructions
412
413@itemize @bullet
414@item
415@samp{cbw} --- sign-extend byte in @samp{%al} to word in @samp{%ax},
416
417@item
418@samp{cwde} --- sign-extend word in @samp{%ax} to long in @samp{%eax},
419
420@item
421@samp{cwd} --- sign-extend word in @samp{%ax} to long in @samp{%dx:%ax},
422
423@item
424@samp{cdq} --- sign-extend dword in @samp{%eax} to quad in @samp{%edx:%eax},
55b62671
AJ
425
426@item
427@samp{cdqe} --- sign-extend dword in @samp{%eax} to quad in @samp{%rax}
428(x86-64 only),
429
430@item
d5f0cf92 431@samp{cqo} --- sign-extend quad in @samp{%rax} to octuple in
55b62671 432@samp{%rdx:%rax} (x86-64 only),
252b5132
RH
433@end itemize
434
435@noindent
55b62671
AJ
436are called @samp{cbtw}, @samp{cwtl}, @samp{cwtd}, @samp{cltd}, @samp{cltq}, and
437@samp{cqto} in AT&T naming. @code{@value{AS}} accepts either naming for these
438instructions.
252b5132
RH
439
440@cindex jump instructions, i386
441@cindex call instructions, i386
55b62671
AJ
442@cindex jump instructions, x86-64
443@cindex call instructions, x86-64
252b5132
RH
444Far call/jump instructions are @samp{lcall} and @samp{ljmp} in
445AT&T syntax, but are @samp{call far} and @samp{jump far} in Intel
446convention.
447
1efbbeb4
L
448@section AT&T Mnemonic versus Intel Mnemonic
449
450@cindex i386 mnemonic compatibility
451@cindex mnemonic compatibility, i386
452
453@code{@value{AS}} supports assembly using Intel mnemonic.
454@code{.intel_mnemonic} selects Intel mnemonic with Intel syntax, and
455@code{.att_mnemonic} switches back to the usual AT&T mnemonic with AT&T
456syntax for compatibility with the output of @code{@value{GCC}}.
1efbbeb4
L
457Several x87 instructions, @samp{fadd}, @samp{fdiv}, @samp{fdivp},
458@samp{fdivr}, @samp{fdivrp}, @samp{fmul}, @samp{fsub}, @samp{fsubp},
459@samp{fsubr} and @samp{fsubrp}, are implemented in AT&T System V/386
460assembler with different mnemonics from those in Intel IA32 specification.
461@code{@value{GCC}} generates those instructions with AT&T mnemonic.
462
252b5132
RH
463@node i386-Regs
464@section Register Naming
465
466@cindex i386 registers
467@cindex registers, i386
55b62671
AJ
468@cindex x86-64 registers
469@cindex registers, x86-64
252b5132
RH
470Register operands are always prefixed with @samp{%}. The 80386 registers
471consist of
472
473@itemize @bullet
474@item
475the 8 32-bit registers @samp{%eax} (the accumulator), @samp{%ebx},
476@samp{%ecx}, @samp{%edx}, @samp{%edi}, @samp{%esi}, @samp{%ebp} (the
477frame pointer), and @samp{%esp} (the stack pointer).
478
479@item
480the 8 16-bit low-ends of these: @samp{%ax}, @samp{%bx}, @samp{%cx},
481@samp{%dx}, @samp{%di}, @samp{%si}, @samp{%bp}, and @samp{%sp}.
482
483@item
484the 8 8-bit registers: @samp{%ah}, @samp{%al}, @samp{%bh},
485@samp{%bl}, @samp{%ch}, @samp{%cl}, @samp{%dh}, and @samp{%dl} (These
486are the high-bytes and low-bytes of @samp{%ax}, @samp{%bx},
487@samp{%cx}, and @samp{%dx})
488
489@item
490the 6 section registers @samp{%cs} (code section), @samp{%ds}
491(data section), @samp{%ss} (stack section), @samp{%es}, @samp{%fs},
492and @samp{%gs}.
493
494@item
495the 3 processor control registers @samp{%cr0}, @samp{%cr2}, and
496@samp{%cr3}.
497
498@item
499the 6 debug registers @samp{%db0}, @samp{%db1}, @samp{%db2},
500@samp{%db3}, @samp{%db6}, and @samp{%db7}.
501
502@item
503the 2 test registers @samp{%tr6} and @samp{%tr7}.
504
505@item
506the 8 floating point register stack @samp{%st} or equivalently
507@samp{%st(0)}, @samp{%st(1)}, @samp{%st(2)}, @samp{%st(3)},
508@samp{%st(4)}, @samp{%st(5)}, @samp{%st(6)}, and @samp{%st(7)}.
55b62671
AJ
509These registers are overloaded by 8 MMX registers @samp{%mm0},
510@samp{%mm1}, @samp{%mm2}, @samp{%mm3}, @samp{%mm4}, @samp{%mm5},
511@samp{%mm6} and @samp{%mm7}.
512
513@item
514the 8 SSE registers registers @samp{%xmm0}, @samp{%xmm1}, @samp{%xmm2},
515@samp{%xmm3}, @samp{%xmm4}, @samp{%xmm5}, @samp{%xmm6} and @samp{%xmm7}.
516@end itemize
517
518The AMD x86-64 architecture extends the register set by:
519
520@itemize @bullet
521@item
522enhancing the 8 32-bit registers to 64-bit: @samp{%rax} (the
523accumulator), @samp{%rbx}, @samp{%rcx}, @samp{%rdx}, @samp{%rdi},
524@samp{%rsi}, @samp{%rbp} (the frame pointer), @samp{%rsp} (the stack
525pointer)
526
527@item
528the 8 extended registers @samp{%r8}--@samp{%r15}.
529
530@item
531the 8 32-bit low ends of the extended registers: @samp{%r8d}--@samp{%r15d}
532
533@item
534the 8 16-bit low ends of the extended registers: @samp{%r8w}--@samp{%r15w}
535
536@item
537the 8 8-bit low ends of the extended registers: @samp{%r8b}--@samp{%r15b}
538
539@item
540the 4 8-bit registers: @samp{%sil}, @samp{%dil}, @samp{%bpl}, @samp{%spl}.
541
542@item
543the 8 debug registers: @samp{%db8}--@samp{%db15}.
544
545@item
546the 8 SSE registers: @samp{%xmm8}--@samp{%xmm15}.
252b5132
RH
547@end itemize
548
549@node i386-Prefixes
550@section Instruction Prefixes
551
552@cindex i386 instruction prefixes
553@cindex instruction prefixes, i386
554@cindex prefixes, i386
555Instruction prefixes are used to modify the following instruction. They
556are used to repeat string instructions, to provide section overrides, to
557perform bus lock operations, and to change operand and address sizes.
558(Most instructions that normally operate on 32-bit operands will use
55916-bit operands if the instruction has an ``operand size'' prefix.)
560Instruction prefixes are best written on the same line as the instruction
561they act upon. For example, the @samp{scas} (scan string) instruction is
562repeated with:
563
564@smallexample
565 repne scas %es:(%edi),%al
566@end smallexample
567
568You may also place prefixes on the lines immediately preceding the
569instruction, but this circumvents checks that @code{@value{AS}} does
570with prefixes, and will not work with all prefixes.
571
572Here is a list of instruction prefixes:
573
574@cindex section override prefixes, i386
575@itemize @bullet
576@item
577Section override prefixes @samp{cs}, @samp{ds}, @samp{ss}, @samp{es},
578@samp{fs}, @samp{gs}. These are automatically added by specifying
579using the @var{section}:@var{memory-operand} form for memory references.
580
581@cindex size prefixes, i386
582@item
583Operand/Address size prefixes @samp{data16} and @samp{addr16}
584change 32-bit operands/addresses into 16-bit operands/addresses,
585while @samp{data32} and @samp{addr32} change 16-bit ones (in a
586@code{.code16} section) into 32-bit operands/addresses. These prefixes
587@emph{must} appear on the same line of code as the instruction they
588modify. For example, in a 16-bit @code{.code16} section, you might
589write:
590
591@smallexample
592 addr32 jmpl *(%ebx)
593@end smallexample
594
595@cindex bus lock prefixes, i386
596@cindex inhibiting interrupts, i386
597@item
598The bus lock prefix @samp{lock} inhibits interrupts during execution of
599the instruction it precedes. (This is only valid with certain
600instructions; see a 80386 manual for details).
601
602@cindex coprocessor wait, i386
603@item
604The wait for coprocessor prefix @samp{wait} waits for the coprocessor to
605complete the current instruction. This should never be needed for the
60680386/80387 combination.
607
608@cindex repeat prefixes, i386
609@item
610The @samp{rep}, @samp{repe}, and @samp{repne} prefixes are added
611to string instructions to make them repeat @samp{%ecx} times (@samp{%cx}
612times if the current address size is 16-bits).
55b62671
AJ
613@cindex REX prefixes, i386
614@item
615The @samp{rex} family of prefixes is used by x86-64 to encode
616extensions to i386 instruction set. The @samp{rex} prefix has four
617bits --- an operand size overwrite (@code{64}) used to change operand size
618from 32-bit to 64-bit and X, Y and Z extensions bits used to extend the
619register set.
620
621You may write the @samp{rex} prefixes directly. The @samp{rex64xyz}
622instruction emits @samp{rex} prefix with all the bits set. By omitting
623the @code{64}, @code{x}, @code{y} or @code{z} you may write other
624prefixes as well. Normally, there is no need to write the prefixes
625explicitly, since gas will automatically generate them based on the
626instruction operands.
252b5132
RH
627@end itemize
628
629@node i386-Memory
630@section Memory References
631
632@cindex i386 memory references
633@cindex memory references, i386
55b62671
AJ
634@cindex x86-64 memory references
635@cindex memory references, x86-64
252b5132
RH
636An Intel syntax indirect memory reference of the form
637
638@smallexample
639@var{section}:[@var{base} + @var{index}*@var{scale} + @var{disp}]
640@end smallexample
641
642@noindent
643is translated into the AT&T syntax
644
645@smallexample
646@var{section}:@var{disp}(@var{base}, @var{index}, @var{scale})
647@end smallexample
648
649@noindent
650where @var{base} and @var{index} are the optional 32-bit base and
651index registers, @var{disp} is the optional displacement, and
652@var{scale}, taking the values 1, 2, 4, and 8, multiplies @var{index}
653to calculate the address of the operand. If no @var{scale} is
654specified, @var{scale} is taken to be 1. @var{section} specifies the
655optional section register for the memory operand, and may override the
656default section register (see a 80386 manual for section register
657defaults). Note that section overrides in AT&T syntax @emph{must}
658be preceded by a @samp{%}. If you specify a section override which
659coincides with the default section register, @code{@value{AS}} does @emph{not}
660output any section register override prefixes to assemble the given
661instruction. Thus, section overrides can be specified to emphasize which
662section register is used for a given memory operand.
663
664Here are some examples of Intel and AT&T style memory references:
665
666@table @asis
667@item AT&T: @samp{-4(%ebp)}, Intel: @samp{[ebp - 4]}
668@var{base} is @samp{%ebp}; @var{disp} is @samp{-4}. @var{section} is
669missing, and the default section is used (@samp{%ss} for addressing with
670@samp{%ebp} as the base register). @var{index}, @var{scale} are both missing.
671
672@item AT&T: @samp{foo(,%eax,4)}, Intel: @samp{[foo + eax*4]}
673@var{index} is @samp{%eax} (scaled by a @var{scale} 4); @var{disp} is
674@samp{foo}. All other fields are missing. The section register here
675defaults to @samp{%ds}.
676
677@item AT&T: @samp{foo(,1)}; Intel @samp{[foo]}
678This uses the value pointed to by @samp{foo} as a memory operand.
679Note that @var{base} and @var{index} are both missing, but there is only
680@emph{one} @samp{,}. This is a syntactic exception.
681
682@item AT&T: @samp{%gs:foo}; Intel @samp{gs:foo}
683This selects the contents of the variable @samp{foo} with section
684register @var{section} being @samp{%gs}.
685@end table
686
687Absolute (as opposed to PC relative) call and jump operands must be
688prefixed with @samp{*}. If no @samp{*} is specified, @code{@value{AS}}
689always chooses PC relative addressing for jump/call labels.
690
691Any instruction that has a memory operand, but no register operand,
55b62671
AJ
692@emph{must} specify its size (byte, word, long, or quadruple) with an
693instruction mnemonic suffix (@samp{b}, @samp{w}, @samp{l} or @samp{q},
694respectively).
695
696The x86-64 architecture adds an RIP (instruction pointer relative)
697addressing. This addressing mode is specified by using @samp{rip} as a
698base register. Only constant offsets are valid. For example:
699
700@table @asis
701@item AT&T: @samp{1234(%rip)}, Intel: @samp{[rip + 1234]}
702Points to the address 1234 bytes past the end of the current
703instruction.
704
705@item AT&T: @samp{symbol(%rip)}, Intel: @samp{[rip + symbol]}
706Points to the @code{symbol} in RIP relative way, this is shorter than
707the default absolute addressing.
708@end table
709
710Other addressing modes remain unchanged in x86-64 architecture, except
711registers used are 64-bit instead of 32-bit.
252b5132 712
fddf5b5b 713@node i386-Jumps
252b5132
RH
714@section Handling of Jump Instructions
715
716@cindex jump optimization, i386
717@cindex i386 jump optimization
55b62671
AJ
718@cindex jump optimization, x86-64
719@cindex x86-64 jump optimization
252b5132
RH
720Jump instructions are always optimized to use the smallest possible
721displacements. This is accomplished by using byte (8-bit) displacement
722jumps whenever the target is sufficiently close. If a byte displacement
fddf5b5b 723is insufficient a long displacement is used. We do not support
252b5132
RH
724word (16-bit) displacement jumps in 32-bit mode (i.e. prefixing the jump
725instruction with the @samp{data16} instruction prefix), since the 80386
726insists upon masking @samp{%eip} to 16 bits after the word displacement
fddf5b5b 727is added. (See also @pxref{i386-Arch})
252b5132
RH
728
729Note that the @samp{jcxz}, @samp{jecxz}, @samp{loop}, @samp{loopz},
730@samp{loope}, @samp{loopnz} and @samp{loopne} instructions only come in byte
731displacements, so that if you use these instructions (@code{@value{GCC}} does
732not use them) you may get an error message (and incorrect code). The AT&T
73380386 assembler tries to get around this problem by expanding @samp{jcxz foo}
734to
735
736@smallexample
737 jcxz cx_zero
738 jmp cx_nonzero
739cx_zero: jmp foo
740cx_nonzero:
741@end smallexample
742
743@node i386-Float
744@section Floating Point
745
746@cindex i386 floating point
747@cindex floating point, i386
55b62671
AJ
748@cindex x86-64 floating point
749@cindex floating point, x86-64
252b5132
RH
750All 80387 floating point types except packed BCD are supported.
751(BCD support may be added without much difficulty). These data
752types are 16-, 32-, and 64- bit integers, and single (32-bit),
753double (64-bit), and extended (80-bit) precision floating point.
754Each supported type has an instruction mnemonic suffix and a constructor
755associated with it. Instruction mnemonic suffixes specify the operand's
756data type. Constructors build these data types into memory.
757
758@cindex @code{float} directive, i386
759@cindex @code{single} directive, i386
760@cindex @code{double} directive, i386
761@cindex @code{tfloat} directive, i386
55b62671
AJ
762@cindex @code{float} directive, x86-64
763@cindex @code{single} directive, x86-64
764@cindex @code{double} directive, x86-64
765@cindex @code{tfloat} directive, x86-64
252b5132
RH
766@itemize @bullet
767@item
768Floating point constructors are @samp{.float} or @samp{.single},
769@samp{.double}, and @samp{.tfloat} for 32-, 64-, and 80-bit formats.
770These correspond to instruction mnemonic suffixes @samp{s}, @samp{l},
771and @samp{t}. @samp{t} stands for 80-bit (ten byte) real. The 80387
772only supports this format via the @samp{fldt} (load 80-bit real to stack
773top) and @samp{fstpt} (store 80-bit real and pop stack) instructions.
774
775@cindex @code{word} directive, i386
776@cindex @code{long} directive, i386
777@cindex @code{int} directive, i386
778@cindex @code{quad} directive, i386
55b62671
AJ
779@cindex @code{word} directive, x86-64
780@cindex @code{long} directive, x86-64
781@cindex @code{int} directive, x86-64
782@cindex @code{quad} directive, x86-64
252b5132
RH
783@item
784Integer constructors are @samp{.word}, @samp{.long} or @samp{.int}, and
785@samp{.quad} for the 16-, 32-, and 64-bit integer formats. The
786corresponding instruction mnemonic suffixes are @samp{s} (single),
787@samp{l} (long), and @samp{q} (quad). As with the 80-bit real format,
788the 64-bit @samp{q} format is only present in the @samp{fildq} (load
789quad integer to stack top) and @samp{fistpq} (store quad integer and pop
790stack) instructions.
791@end itemize
792
793Register to register operations should not use instruction mnemonic suffixes.
794@samp{fstl %st, %st(1)} will give a warning, and be assembled as if you
795wrote @samp{fst %st, %st(1)}, since all register to register operations
796use 80-bit floating point operands. (Contrast this with @samp{fstl %st, mem},
797which converts @samp{%st} from 80-bit to 64-bit floating point format,
798then stores the result in the 4 byte location @samp{mem})
799
800@node i386-SIMD
801@section Intel's MMX and AMD's 3DNow! SIMD Operations
802
803@cindex MMX, i386
804@cindex 3DNow!, i386
805@cindex SIMD, i386
55b62671
AJ
806@cindex MMX, x86-64
807@cindex 3DNow!, x86-64
808@cindex SIMD, x86-64
252b5132
RH
809
810@code{@value{AS}} supports Intel's MMX instruction set (SIMD
811instructions for integer data), available on Intel's Pentium MMX
812processors and Pentium II processors, AMD's K6 and K6-2 processors,
b45619c0 813Cyrix' M2 processor, and probably others. It also supports AMD's 3DNow!@:
252b5132
RH
814instruction set (SIMD instructions for 32-bit floating point data)
815available on AMD's K6-2 processor and possibly others in the future.
816
817Currently, @code{@value{AS}} does not support Intel's floating point
818SIMD, Katmai (KNI).
819
820The eight 64-bit MMX operands, also used by 3DNow!, are called @samp{%mm0},
821@samp{%mm1}, ... @samp{%mm7}. They contain eight 8-bit integers, four
82216-bit integers, two 32-bit integers, one 64-bit integer, or two 32-bit
823floating point values. The MMX registers cannot be used at the same time
824as the floating point stack.
825
826See Intel and AMD documentation, keeping in mind that the operand order in
827instructions is reversed from the Intel syntax.
828
f88c9eb0
SP
829@node i386-LWP
830@section AMD's Lightweight Profiling Instructions
831
832@cindex LWP, i386
833@cindex LWP, x86-64
834
835@code{@value{AS}} supports AMD's Lightweight Profiling (LWP)
836instruction set, available on AMD's Family 15h (Orochi) processors.
837
838LWP enables applications to collect and manage performance data, and
839react to performance events. The collection of performance data
840requires no context switches. LWP runs in the context of a thread and
841so several counters can be used independently across multiple threads.
842LWP can be used in both 64-bit and legacy 32-bit modes.
843
844For detailed information on the LWP instruction set, see the
845@cite{AMD Lightweight Profiling Specification} available at
846@uref{http://developer.amd.com/cpu/LWP,Lightweight Profiling Specification}.
847
252b5132
RH
848@node i386-16bit
849@section Writing 16-bit Code
850
851@cindex i386 16-bit code
852@cindex 16-bit code, i386
853@cindex real-mode code, i386
eecb386c 854@cindex @code{code16gcc} directive, i386
252b5132
RH
855@cindex @code{code16} directive, i386
856@cindex @code{code32} directive, i386
55b62671
AJ
857@cindex @code{code64} directive, i386
858@cindex @code{code64} directive, x86-64
859While @code{@value{AS}} normally writes only ``pure'' 32-bit i386 code
860or 64-bit x86-64 code depending on the default configuration,
252b5132 861it also supports writing code to run in real mode or in 16-bit protected
eecb386c
AM
862mode code segments. To do this, put a @samp{.code16} or
863@samp{.code16gcc} directive before the assembly language instructions to
995cef8c
L
864be run in 16-bit mode. You can switch @code{@value{AS}} to writing
86532-bit code with the @samp{.code32} directive or 64-bit code with the
866@samp{.code64} directive.
eecb386c
AM
867
868@samp{.code16gcc} provides experimental support for generating 16-bit
869code from gcc, and differs from @samp{.code16} in that @samp{call},
870@samp{ret}, @samp{enter}, @samp{leave}, @samp{push}, @samp{pop},
871@samp{pusha}, @samp{popa}, @samp{pushf}, and @samp{popf} instructions
872default to 32-bit size. This is so that the stack pointer is
873manipulated in the same way over function calls, allowing access to
874function parameters at the same stack offsets as in 32-bit mode.
875@samp{.code16gcc} also automatically adds address size prefixes where
876necessary to use the 32-bit addressing modes that gcc generates.
252b5132
RH
877
878The code which @code{@value{AS}} generates in 16-bit mode will not
879necessarily run on a 16-bit pre-80386 processor. To write code that
880runs on such a processor, you must refrain from using @emph{any} 32-bit
881constructs which require @code{@value{AS}} to output address or operand
882size prefixes.
883
884Note that writing 16-bit code instructions by explicitly specifying a
885prefix or an instruction mnemonic suffix within a 32-bit code section
886generates different machine instructions than those generated for a
88716-bit code segment. In a 32-bit code section, the following code
888generates the machine opcode bytes @samp{66 6a 04}, which pushes the
889value @samp{4} onto the stack, decrementing @samp{%esp} by 2.
890
891@smallexample
892 pushw $4
893@end smallexample
894
895The same code in a 16-bit code section would generate the machine
b45619c0 896opcode bytes @samp{6a 04} (i.e., without the operand size prefix), which
252b5132
RH
897is correct since the processor default operand size is assumed to be 16
898bits in a 16-bit code section.
899
900@node i386-Bugs
901@section AT&T Syntax bugs
902
903The UnixWare assembler, and probably other AT&T derived ix86 Unix
904assemblers, generate floating point instructions with reversed source
905and destination registers in certain cases. Unfortunately, gcc and
906possibly many other programs use this reversed syntax, so we're stuck
907with it.
908
909For example
910
911@smallexample
912 fsub %st,%st(3)
913@end smallexample
914@noindent
915results in @samp{%st(3)} being updated to @samp{%st - %st(3)} rather
916than the expected @samp{%st(3) - %st}. This happens with all the
917non-commutative arithmetic floating point operations with two register
918operands where the source register is @samp{%st} and the destination
919register is @samp{%st(i)}.
920
e413e4e9
AM
921@node i386-Arch
922@section Specifying CPU Architecture
923
924@cindex arch directive, i386
925@cindex i386 arch directive
55b62671
AJ
926@cindex arch directive, x86-64
927@cindex x86-64 arch directive
e413e4e9
AM
928
929@code{@value{AS}} may be told to assemble for a particular CPU
5c6af06e 930(sub-)architecture with the @code{.arch @var{cpu_type}} directive. This
e413e4e9
AM
931directive enables a warning when gas detects an instruction that is not
932supported on the CPU specified. The choices for @var{cpu_type} are:
933
934@multitable @columnfractions .20 .20 .20 .20
935@item @samp{i8086} @tab @samp{i186} @tab @samp{i286} @tab @samp{i386}
936@item @samp{i486} @tab @samp{i586} @tab @samp{i686} @tab @samp{pentium}
5c6af06e 937@item @samp{pentiumpro} @tab @samp{pentiumii} @tab @samp{pentiumiii} @tab @samp{pentium4}
ef05d495 938@item @samp{prescott} @tab @samp{nocona} @tab @samp{core} @tab @samp{core2}
8a9036a4 939@item @samp{corei7} @tab @samp{l1om}
1543849b 940@item @samp{k6} @tab @samp{k6_2} @tab @samp{athlon} @tab @samp{k8}
68339fdf 941@item @samp{amdfam10} @tab @samp{bdver1}
1ceab344 942@item @samp{generic32} @tab @samp{generic64}
9103f4f4 943@item @samp{.mmx} @tab @samp{.sse} @tab @samp{.sse2} @tab @samp{.sse3}
d76f7bc1 944@item @samp{.ssse3} @tab @samp{.sse4.1} @tab @samp{.sse4.2} @tab @samp{.sse4}
c7b8aa3a
L
945@item @samp{.avx} @tab @samp{.vmx} @tab @samp{.smx} @tab @samp{.ept}
946@item @samp{.clflush} @tab @samp{.movbe} @tab @samp{.xsave} @tab @samp{.xsaveopt}
947@item @samp{.aes} @tab @samp{.pclmul} @tab @samp{.fma} @tab @samp{.fsgsbase}
948@item @samp{.rdrnd} @tab @samp{.f16c}
1ceab344 949@item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5}
f72d7f29 950@item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme} @tab @samp{.abm}
f0ae4a24 951@item @samp{.lwp} @tab @samp{.fma4} @tab @samp{.xop}
1ceab344 952@item @samp{.padlock}
e413e4e9
AM
953@end multitable
954
fddf5b5b
AM
955Apart from the warning, there are only two other effects on
956@code{@value{AS}} operation; Firstly, if you specify a CPU other than
e413e4e9
AM
957@samp{i486}, then shift by one instructions such as @samp{sarl $1, %eax}
958will automatically use a two byte opcode sequence. The larger three
959byte opcode sequence is used on the 486 (and when no architecture is
960specified) because it executes faster on the 486. Note that you can
961explicitly request the two byte opcode by writing @samp{sarl %eax}.
fddf5b5b
AM
962Secondly, if you specify @samp{i8086}, @samp{i186}, or @samp{i286},
963@emph{and} @samp{.code16} or @samp{.code16gcc} then byte offset
964conditional jumps will be promoted when necessary to a two instruction
965sequence consisting of a conditional jump of the opposite sense around
966an unconditional jump to the target.
967
5c6af06e
JB
968Following the CPU architecture (but not a sub-architecture, which are those
969starting with a dot), you may specify @samp{jumps} or @samp{nojumps} to
970control automatic promotion of conditional jumps. @samp{jumps} is the
971default, and enables jump promotion; All external jumps will be of the long
972variety, and file-local jumps will be promoted as necessary.
973(@pxref{i386-Jumps}) @samp{nojumps} leaves external conditional jumps as
974byte offset jumps, and warns about file-local conditional jumps that
975@code{@value{AS}} promotes.
fddf5b5b
AM
976Unconditional jumps are treated as for @samp{jumps}.
977
978For example
979
980@smallexample
981 .arch i8086,nojumps
982@end smallexample
e413e4e9 983
252b5132
RH
984@node i386-Notes
985@section Notes
986
987@cindex i386 @code{mul}, @code{imul} instructions
988@cindex @code{mul} instruction, i386
989@cindex @code{imul} instruction, i386
55b62671
AJ
990@cindex @code{mul} instruction, x86-64
991@cindex @code{imul} instruction, x86-64
252b5132 992There is some trickery concerning the @samp{mul} and @samp{imul}
55b62671 993instructions that deserves mention. The 16-, 32-, 64- and 128-bit expanding
252b5132
RH
994multiplies (base opcode @samp{0xf6}; extension 4 for @samp{mul} and 5
995for @samp{imul}) can be output only in the one operand form. Thus,
996@samp{imul %ebx, %eax} does @emph{not} select the expanding multiply;
997the expanding multiply would clobber the @samp{%edx} register, and this
998would confuse @code{@value{GCC}} output. Use @samp{imul %ebx} to get the
99964-bit product in @samp{%edx:%eax}.
1000
1001We have added a two operand form of @samp{imul} when the first operand
1002is an immediate mode expression and the second operand is a register.
1003This is just a shorthand, so that, multiplying @samp{%eax} by 69, for
1004example, can be done with @samp{imul $69, %eax} rather than @samp{imul
1005$69, %eax, %eax}.
1006
This page took 0.667808 seconds and 4 git commands to generate.