Referencing the contents of a memory location. (x86 addressing modes)







I have a memory location that contains a character that I want to compare with another character (and it's not at the top of the stack so I can't just pop it). How do I reference the contents of a memory location so I can compare it?





Basically, how do I do it syntactically?




1 Answer



See also: table of AT&T(GNU) syntax vs. NASM syntax for different addressing modes, including indirect jumps / calls.



Also see the collection of links at the bottom of this answer.



Suggestions welcome, esp. on which parts were useful/interesting, and which parts weren't.



x86 (32 and 64bit) has several addressing modes to choose from. They're all of the form:


[base_reg + index_reg*scale + displacement] ; or a subset of this
[RIP + displacement] ; or RIP-relative: 64bit only. No index reg is allowed



(where scale is 1, 2, 4, or 8, and displacement is a signed 32bit constant). All the other forms (except RIP-relative) are subsets of this that leave out one or more components. This means you don't need a zeroed index_reg to access [rsi] for example. In asm source code, it doesn't matter what order you write things: [5 + rax + rsp + 15*4 + MY_ASSEMBLER_MACRO*2] works fine. (All the math on constants happens at assemble time, resulting in a single constant displacement.)
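To make the general form concrete, here is a minimal Python model of the address calculation. It is only a sketch: the helper name effective_address and its parameters are made up for illustration, and register contents are passed in as plain integers.

```python
def effective_address(base=0, index=0, scale=1, disp=0, addr_bits=32):
    """Model of [base + index*scale + disp] (hypothetical helper, not a real API).

    base and index are register values, disp is the signed displacement.
    The sum wraps around at the address size.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    if not -2**31 <= disp < 2**31:
        raise ValueError("displacement must fit in a signed 32-bit constant")
    return (base + index * scale + disp) & ((1 << addr_bits) - 1)


rsi = 0x1000

# A component that is left out is simply absent: [rsi] is base only.
print(hex(effective_address(base=rsi)))                     # 0x1000

# Constants are folded before encoding: 5 + 15*4 is one displacement of 65.
print(hex(effective_address(base=rsi, index=3, scale=4, disp=5 + 15 * 4)))  # 0x104d
```

The order of the terms in the source does not matter because the result is a plain sum; only the split into registers, scale, and one constant is encoded.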





The registers all have to be the same size as the mode you're in, unless you use an alternate address-size, requiring an extra prefix byte. Narrow pointers are rarely useful outside of the x32 ABI (ILP32 in long mode).



If you want to use al as an array index, for example, you need to zero- or sign-extend it to pointer width. (Having the upper bits of rax already zeroed before messing around with byte registers is sometimes possible, and is a good way to accomplish this.)
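As a rough illustration of why the extension matters, the following Python sketch models zero- and sign-extending an 8-bit value to a wider register. The function names are made up for this example; they stand in for what movzx and movsx do.

```python
def zero_extend8(value):
    """movzx-style: keep the low 8 bits, all upper bits become 0."""
    return value & 0xFF


def sign_extend8(value, width=32):
    """movsx-style: copy bit 7 into all upper bits of a width-bit register."""
    low = value & 0xFF
    if low & 0x80:
        low -= 0x100                      # treat the byte as negative
    return low & ((1 << width) - 1)


eax = 0x12345690                          # al = 0x90, upper bits are stale
print(hex(zero_extend8(eax)))             # 0x90
print(hex(sign_extend8(eax)))             # 0xffffff90
print(hex(eax))                           # 0x12345690: using eax as-is gives the wrong index
```

Which extension is right depends on whether the byte is meant as an unsigned or a signed index.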





Every possible subset of the general case is encodable, except ones using e/rsp*scale (obviously useless in "normal" code that always keeps a pointer to stack memory in esp).





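Normally, the code-size of the encodings is:



1 byte (the mod/rm byte) for one-register modes.



2 bytes (mod/rm + SIB byte) for two-register modes.



A displacement can take 0, 1, or 4 bytes (sign-extended to the address size). Displacements in [-128 to +127] can use the more compact disp8 encoding, saving 3 bytes vs. disp32.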



code-size exceptions:



[reg*scale] by itself can only be encoded with a 32bit displacement. Smart assemblers work around that by encoding lea eax, [rdx*2] as lea eax, [rdx + rdx], but that trick only works for scaling by 2.





It's impossible to encode e/rbp or r13 as the base register without a displacement byte, so [ebp] is encoded as [ebp + byte 0]. The no-displacement encodings with ebp as a base register instead mean there's no base register (e.g. for [disp + reg*scale]).





[e/rsp] requires a SIB byte even if there's no index register (whether or not there's a displacement). The mod/rm encoding that would specify [rsp] instead means that there's a SIB byte.





See Table 2-5 in Intel's ref manual, and the surrounding section, for the details on the special cases. (They're the same in 32 and 64bit mode. Adding RIP-relative encoding didn't conflict with any other encoding, even without a REX prefix.)
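The size rules and special cases above can be sketched as a small Python function. This is an illustrative model, not an assembler: the name addressing_bytes is made up, it counts only the mod/rm, SIB, and displacement bytes (no opcode, prefixes, or immediate), and it leaves out RIP-relative and absolute [disp32] addressing. It also includes r12, which shares the low encoding bits of rsp and so needs a SIB byte as well; that detail is not stated in the text above.

```python
NEEDS_SIB = {"esp", "rsp", "r12"}    # this base encoding means "SIB byte follows"
NEEDS_DISP = {"ebp", "rbp", "r13"}   # this base encoding without disp means "no base"


def addressing_bytes(base=None, index=None, disp=0):
    """Bytes of mod/rm + SIB + displacement for 32/64-bit address size.

    Hypothetical helper. The scale factor doesn't change the size,
    so it is not a parameter.
    """
    if base is None and index is None:
        raise ValueError("this sketch needs at least one register")
    if index in ("esp", "rsp"):
        raise ValueError("esp/rsp can't be used as an index register")
    size = 1                                   # mod/rm byte is always present
    if index is not None or base in NEEDS_SIB:
        size += 1                              # SIB byte
    if base is None:
        return size + 4                        # [index*scale] forces a disp32
    if disp == 0 and base not in NEEDS_DISP:
        return size                            # no displacement needed
    if -128 <= disp <= 127:
        return size + 1                        # disp8
    return size + 4                            # disp32


print(addressing_bytes(base="esi"))                # 1
print(addressing_bytes(base="ebp"))                # 2: [ebp + byte 0]
print(addressing_bytes(base="esp"))                # 2: SIB byte required
print(addressing_bytes(index="rdx"))               # 6: [rdx*2] with disp32
print(addressing_bytes(base="rdx", index="rdx"))   # 2: [rdx + rdx]
```

The last two lines show why an assembler might prefer [rdx + rdx] over [rdx*2]: same address, four fewer bytes.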



For performance, it's typically not worth it to spend an extra instruction just to get smaller x86 machine code. On Intel CPUs with a uop cache, that cache is smaller than the L1 I$ and a more precious resource. Minimizing fused-domain uops is typically more important.



16bit address size can't use a SIB byte, so all the one- and two-register addressing modes are encoded into the single mod/rm byte. In [reg1 + reg2], reg1 can be BX or BP, and reg2 can be SI or DI (or you can use any of those 4 registers by itself). Scaling is not available. 16bit code is obsolete for a lot of reasons, including this one, and not worth learning if you don't have to.





Note that the 16bit restrictions apply in 32bit code when the address-size prefix is used, so 16bit LEA-math is highly restrictive. However, you can work around that: lea eax, [edx + ecx*2] sets ax = dx + cx*2, because garbage in the upper bits of the source registers has no effect.
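The claim that garbage in the upper bits doesn't matter can be checked with a few lines of Python. This is a sketch with made-up register values; the lea function here just models the arithmetic, truncated to the destination width.

```python
def lea(base, index, scale, disp=0, bits=32):
    """Result of lea reg, [base + index*scale + disp], truncated to `bits`."""
    return (base + index * scale + disp) & ((1 << bits) - 1)


edx = 0xDEAD0003        # dx = 0x0003, upper half is garbage
ecx = 0xBEEF0010        # cx = 0x0010, upper half is garbage

eax = lea(edx, ecx, 2)                        # lea eax, [edx + ecx*2]
ax = eax & 0xFFFF                             # low 16 bits of the result
expected = (0x0003 + 0x0010 * 2) & 0xFFFF     # dx + cx*2 done in 16 bits

print(hex(ax), hex(expected))                 # 0x23 0x23
```

Carries in addition and left shifts only propagate upward, so the low 16 bits of the sum depend only on the low 16 bits of the inputs.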





This table doesn't exactly match the hardware encodings of possible addressing modes, since I'm distinguishing between using a label (for e.g. global or static data) vs. using a small constant displacement. So I'm covering hardware addressing modes + linker support for symbols.



If you have a pointer to a char array in esi,



mov al, esi: invalid, won't assemble. Without square brackets, it's not a load at all. It's an error because the registers aren't the same size.





mov al, [esi] loads the byte pointed to.





mov al, [esi + ecx] loads array[ecx].





mov al, [esi + 10] loads array[10].





mov al, [esi + ecx*8 + 200] loads array[ecx*8 + 200].



mov al, [global_array + 10] loads from global_array[10]. In 64bit mode, this can be a RIP-relative address. Using DEFAULT REL is recommended, to generate RIP-relative addresses by default instead of having to always use [rel global_array + 10]. There is no way to use an index register with a RIP-relative address directly. The normal method is lea rax, [global_array] followed by mov al, [rax + rcx*8 + 10], or similar.



mov al, [global_array + ecx + edx*2 + 10] loads from global_array[ecx + edx*2 + 10]. Obviously you can index a static/global array with a single register. Even a 2D array using two separate registers is possible (pre-scaling one with an extra instruction, for scale factors other than 2, 4, or 8). Note that the global_array + 10 math is done at link time. The object file (assembler output, linker input) informs the linker of the +10 to add to the final absolute address, to put the right 4-byte displacement into the executable (linker output). This is why you can't use arbitrary expressions on link-time constants that aren't assemble-time constants (e.g. symbol addresses).



mov al, 0ABh is not a load at all, but instead an immediate constant stored inside the instruction. (Note that you need to prefix a 0 so the assembler knows it's a constant, not a symbol. Some assemblers will also accept 0xAB.) You can use a symbol as the immediate constant, to get an address into a register:



NASM: mov esi, global_array assembles to a mov esi, imm32 that puts the address into esi.



MASM: mov esi, OFFSET global_array is needed to do the same thing.



MASM: mov esi, global_array assembles to a load: mov esi, dword [global_array].
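To tie the list above back to the original question (comparing a character that lives in memory), here is a small Python model. It is a sketch under assumptions: memory is a bytearray with a made-up layout, load_byte is a hypothetical helper, and the comparison stands in for what a cmp instruction would test.

```python
memory = bytearray(64)              # stand-in for RAM (made-up layout)
memory[16:21] = b"hello"            # a char array stored at address 16


def load_byte(mem, base=0, index=0, scale=1, disp=0):
    """Model of mov al, [base + index*scale + disp]."""
    return mem[base + index * scale + disp]


esi = 16                            # esi holds the ADDRESS of the array
ecx = 4

al = load_byte(memory, base=esi)                # mov al, [esi]
print(chr(al))                                  # h

al = load_byte(memory, base=esi, index=ecx)     # mov al, [esi + ecx]
print(chr(al))                                  # o

print(al == ord("o"))       # True: comparing the loaded byte, like cmp al, 'o'
print(esi == ord("h"))      # False: without brackets you compare the address itself
```

The square brackets are what turn a register value from "the number to use" into "the address to read from".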





In 64bit mode, addressing global symbols is usually done with RIP-relative addressing, which your assembler will do by default with the DEFAULT REL directive, or with mov al, [rel global_array + 10]. No index register can be used with RIP-relative addresses, only constant displacements. You can still do absolute addressing, and there's even a special form of mov that can load from a 64bit absolute address (rather than the usual 32bit sign-extended.) AT&T syntax calls that opcode movabs (also used for mov r64, imm64), while Intel/NASM syntax still calls it a form of mov.





Use lea rsi, [rel global_array] to get rip-relative addresses into registers, since mov reg, imm would hard-code a non-relative address into the instruction bytes.





Note that OS X loads all code at an address outside the low 32 bits, so 32-bit absolute addressing is unusable. Position-independent code isn't required for executables, but you might as well because 64-bit absolute addressing is less efficient than RIP-relative. The macho64 object file format doesn't support relocations for 32-bit absolute addresses the way Linux ELF does. Make sure not to use a label name as a compile-time constant anywhere, except in an effective-address like [global_array + constant], because that can be assembled to a RIP-relative addressing mode. e.g. [global_array + rcx] is not allowed, because RIP can't be used with any other registers, so it would have to be assembled with the absolute address of global_array hard-coded as the 32bit displacement (which will be sign-extended to 64b).
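A short Python sketch of the arithmetic behind this: a 32-bit absolute address has to fit in a sign-extended 32-bit field, while a RIP-relative displacement only has to cover the distance from the end of the instruction to the target. The addresses and function names below are made up for illustration.

```python
def fits_signed32(value):
    """True if value can be encoded as a sign-extended 32-bit field."""
    return -2**31 <= value < 2**31


def rel32_for(target, next_insn_addr):
    """Displacement for [rip + rel32]; RIP is the address of the NEXT instruction."""
    rel = target - next_insn_addr
    if not fits_signed32(rel):
        raise ValueError("target is more than 2 GiB away from the code")
    return rel


# Made-up addresses, both above the low 4 GiB.
code_end = 0x100001007          # address just past the instruction that uses the symbol
global_array = 0x100002000

print(fits_signed32(global_array + 10))          # False: no 32-bit absolute form
print(rel32_for(global_array + 10, code_end))    # 4099
```

Because only the distance is encoded, the same machine code works wherever the image is loaded, as long as code and data move together.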





Any and all of these addressing modes can be used with LEA to do integer math with a bonus of not affecting flags, regardless of whether it's a valid address. [esi*4 + 10] is usually only useful with LEA (unless the displacement is a symbol, instead of a small constant). In machine code, there is no encoding for scaled-register alone, so [esi*4] has to assemble to [esi*4 + 0], with 4 bytes of zeros for a 32-bit displacement. It's still often worth it to copy+shift in one instruction instead of a shorter mov + shl, because usually uop throughput is more of a bottleneck than code size, especially on CPUs with a decoded-uop cache.





You can specify segment-overrides like mov al, fs:[esi]. A segment-override just adds a prefix-byte in front of the usual encoding. Everything else stays the same, with the same syntax.





You can even use segment overrides with RIP-relative addressing. 32-bit absolute addressing takes one more byte to encode than RIP-relative, so mov eax, fs:[0] can most efficiently be encoded using a relative displacement that produces a known absolute address. i.e. choose rel32 so RIP+rel32 = 0. YASM will do this with mov ecx, [fs: rel 0], but NASM always uses disp32 absolute addressing, ignoring the rel specifier. I haven't tested MASM or gas.





If the operand-size is ambiguous (e.g. in an instruction with an immediate and a memory operand), use byte / word / dword / qword / xmmword / ymmword to specify:




mov dword [rsi + 10], 0xAB ; NASM
mov dword ptr [rsi + 10], 0xAB ; MASM and GNU .intel_syntax noprefix
movl $0xAB, 10(%rsi) # GNU(AT&T): operand size from insn suffix
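To see why the size keyword matters, this Python sketch models a store of the same immediate with different operand sizes. The helper store_imm and the SIZES table are made up for illustration; x86 stores are little-endian, which the model follows.

```python
SIZES = {"byte": 1, "word": 2, "dword": 4, "qword": 8}


def store_imm(mem, addr, size_keyword, value):
    """Model of mov SIZE [addr], value: writes that many bytes, little-endian."""
    n = SIZES[size_keyword]
    mem[addr:addr + n] = (value & ((1 << (8 * n)) - 1)).to_bytes(n, "little")


mem = bytearray(b"\xff" * 8)
store_imm(mem, 0, "byte", 0xAB)
print(mem.hex())        # abffffffffffffff

mem = bytearray(b"\xff" * 8)
store_imm(mem, 0, "dword", 0xAB)
print(mem.hex())        # ab000000ffffffff
```

The immediate 0xAB alone doesn't say how many bytes to overwrite, which is why the assembler needs the size keyword (or, in AT&T syntax, the instruction suffix).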



See the yasm docs for NASM-syntax effective addresses, and/or the wikipedia x86 entry's section on addressing modes. The wiki page says what's allowed in 16bit mode. Here's another "cheat sheet" for 32bit addressing modes.



There's also a more detailed guide to addressing modes, for 16bit. 16bit has the same kinds of addressing modes as 32bit (minus scaling, and with fewer register choices), so if you're finding addressing modes confusing, read it anyway.



Also see the x86 wiki page for links.





16-bit code is still around though. And the user did tag this as DOS, so an explanation of the 16-bit restrictions would probably be reasonable for anyone stumbling on this question and answer. The best rule of thumb I have seen that is reasonably easy to understand and remember can be found in Section 1.2.7 An Easy Way to Remember the 8086 Memory Addressing Modes of this document. I find it a better description than the Wiki article you linked to.
– Michael Petch
Dec 3 '15 at 6:02






No, that isn't the Wiki article. The wiki article doesn't offer up much of an explanation of how you mix and match those rows and columns. I helped someone last year on this site, they didn't understand the Wiki version, but clued in with the other one.
– Michael Petch
Dec 3 '15 at 6:05






It may be worth mentioning that [OSX does not allow global_array + [10]](stackoverflow.com/questions/26927278/…).
– Z boson
Dec 3 '15 at 12:27






@BeeOnRope: Right, I was assuming the default address size. Adding that line about registers being the same size as the address-size opened the door to discussing that, too, I guess. I guess if you write a function that takes a 32-bit integer that you want to use as an index into a static table, you can save an instruction to zero or sign extend it if you just use a 32-bit address-size. It's pretty obscure, since you can't use it with arbitrary pointers except in the x32 ABI (ILP32 in long mode).
– Peter Cordes
Oct 5 '16 at 21:19





@BeeOnRope: No, I mean use LEA with its default operand size (32-bit), to write a 32-bit result zero-extended to 64. e.g. (bad) lea rax, [ecx + ebx*4] always gives the same result as (good) lea eax, [rcx + rbx*4], but takes two extra prefix bytes. 32-bit address-size for LEA is never useful because you can always get the same result without it. High bits of inputs registers can't affect the low bits of the result for addition or left shift.
– Peter Cordes
Mar 21 '17 at 16:30








