8086 Processor Registers

Contents
CPU Registers.
AX - Accumulator Register.
BX - Base Register.
CX - Counter Register.
DX - Data Register.
BP - Base Pointer.
SI - Source Index.
DI - Destination Index.
SP - Stack Pointer.
CS - Code Segment.
DS - Data Segment.
ES - Extra Segment.
SS - Stack Segment.
Instruction Operand Order.
Segment Overrides.
Register Addressing.
Register Addressing Semantics.
Little-Endians.
Newer Processors.
Intel 32-Bit CPU Registers.

The ubiquitous Intel 8086 was introduced in 1978. An enormous customer base has ensured that software is still written to the 8086 standard, although many of the tens of millions of original PCs that started it all are today working as plant stands.

CPU Registers

Programmers unanimously agree: while Intel 8086 family processors are well designed and versatile, there are just not enough CPU registers (a remnant of their '70s heritage). This real estate crunch is more of a problem for high-level language compilers, which are constantly reloading registers in the middle of things.

This great register shortage is not enough to thwart the dedicated Assembly language programmer, who takes the time to understand and make the most efficient use of our limited resources. Each register has something it does better than every other register -- sometimes tricks that no other register can do at all.

Here are the 16-bit registers available on all Intel 8086-compatible microprocessors:

General Purpose 8086 Registers
Reg   High  Low   Name
AX    AH    AL    Accumulator
BX    BH    BL    Base
CX    CH    CL    Counter
DX    DH    DL    Data

16-Bit Pointer Registers
Reg   Name
SI    Source Index
DI    Destination Index
BP    Stack Frame Base Pointer
SP    Stack Pointer

Segment Registers
Reg   Name
CS    Code Segment
DS    Data Segment
ES    Extra Segment
SS    Stack Segment

The four general purpose registers, AX, BX, CX and DX, work as 16-bit registers, or as two 8-bit registers; you could load a value in AL and a different one in AH, then access them together as AX. The pointer and segment registers always hold 16-bit values.
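
A minimal sketch of the split-register idea; the values 12h and 34h are arbitrary, chosen only for illustration:

```asm
  mov   ah, 12h         ; Load the high byte
  mov   al, 34h         ; Load a different value in the low byte
  mov   bx, ax          ; Read both halves together: BX = 1234h
```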

Note: While the CPU flags are considered to be a register, they are never used in general purpose calculations.

AX - Accumulator Register

The workhorse, AX is constantly changing throughout any program. Other registers are pointers or counters or constants, that seem to be there just to service AX, as information flows through this most useful register.

The 8086 has several hard-wired connections between AX or AL and other registers:

Register  Unique Relationship With AX
BX        The xlat (translation) instruction uses AL.
DX        Division, multiplication, port manipulation.
SI, DI    Copy memory using string instructions and AX or AL.

There are also instructions to copy part of the flags register to or from AH (lahf/sahf), but we rarely use them.

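Tip: When practical, use AX instead of other registers. Several instructions have a shorter encoding when used with AX or AL (arithmetic with an immediate value, and moves to or from a direct memory address, for example), and on the 8086 a shorter instruction is often a faster one because there are fewer bytes to fetch.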

Note: When Assembly language functions return an integer or string descriptor to Basic, they always place the value in AX.

BX - Base Register

In addition to general storage, you can use BX for accessing memory; it is the only general purpose register that also works as a pointer. Unlike SI and DI, BX doesn't work with the rep prefix or the fancy string instructions. By default, memory operations using BX assume the DS register. You can use a segment override. This example loads AL with a byte from the current code segment:

  mov  al, BYTE PTR cs:[bx]

Note: In simple routines that have limited direct access of memory, use BX instead of SI or DI. This is because high-level 16-bit DOS programming languages never require you to preserve BX.

One powerful yet seldom used instruction is xlat. Place a pointer to a translation table in BX and a character in the AL register, then execute the xlat instruction. Xlat uses the value in AL as an index into the table, replacing the contents of AL with the byte found at that position. If AL=0, it will be replaced with the first item in the table, AL=1 means the second item, and so forth. This example translates a string using a table pointed to by the BX register; it assumes DS:SI points to the source string, ES:DI points to the output string, and CX holds the number of characters:

  mov  bx, OFFSET Table ; Translation table
@@:
  lodsb                 ; Read a character to AL
  xlatb                 ; Translate
  stosb                 ; Write to output string
  loop @B               ; Until CX == 0
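
As a smaller illustration of the indexing rule, here is a sketch that converts a value from 0 to 15 into its ASCII hex digit. HexDigits is a hypothetical table name, and the code assumes the table lives in the default data segment (DS):

```asm
.DATA
HexDigits DB "0123456789ABCDEF" ; Hypothetical 16-entry table
.CODE
  mov   bx, OFFSET HexDigits    ; DS:BX points to the table
  mov   al, 0Bh                 ; Index 11
  xlatb                         ; AL = byte at DS:[BX+AL], the character "B"
```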

Gotcha: EBX is reserved
Most 32-bit compilers treat EBX as a preserved register: a routine that changes it must restore it before returning. Get into the habit of preserving the BX register on the stack in subroutines. (32-bit versions of Microsoft Visual C++ for Windows reserve EBX.)

CX - Counter Register

In addition to general storage, CX (and to a lesser extent CL) is used as an iteration counter. The rep prefix, loop instruction, and shift/rotate instructions all take counts passed in CX or CL.

This example computes the sum of integer values in an array. At entry, CX is the number of integers, and DS:BX points to the array.

  xor   ax, ax          ; Zero-out the accumulator
  jcxz  NoLoop          ; If CX == 0, don't loop
@@:
  add   ax, [bx]        ; Add the current integer to the sum
  add   bx, 2           ; Move the pointer forward
  loop  @B              ; DEC CX  /  JNZ @B
NoLoop:

Gotcha: Make Sure CX != 0
Always check that CX is non-zero before entering a loop. We did this in the code fragment above with the jcxz (jump if CX==0) instruction.

The CL half of CX is used for the count for shift and rotate instructions. Intel 80186 and later processors (including the NEC V-20) can shift or rotate more than one bit at a time with an immediate count, without using CL. This example shifts AX left five times using CL:

  mov   cl, 5           ; Shift 5 times
  shl   ax, cl

Gotcha: CX Use is Inconsistent
There are several inconsistencies in how the CPU treats loop counters, CX and the rep prefix:

Operation    CX Usage
loop         Decrements CX each time through the loop. If CX==0 on the first iteration, it will loop 65536 times.
rep          Repeats the instruction for the number of iterations specified in CX. If CX==0 when the instruction begins, the instruction does nothing. When the instruction is completed, CX==0.
repe, repne  Used to search or compare strings. CX will be adjusted by the number of bytes tested.
shl, shr     Shifts or rotates a register the number of bits specified in CL. Reads only CL, CH is ignored. If CL==0, the instruction does nothing. The instruction does not change the contents of CL.
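
The repne entry in the table above is what makes the classic string-length idiom work. This sketch assumes StrPtr is a hypothetical far pointer variable to a zero-terminated string shorter than 64K:

```asm
  les   di, StrPtr      ; ES:DI points to the string (StrPtr is hypothetical)
  mov   cx, 0FFFFh      ; Largest count, so the zero byte is found first
  xor   al, al          ; Search for the terminating zero byte
  cld                   ; Scan forward
  repne scasb           ; CX drops by one for every byte tested
  not   cx              ; CX = bytes tested, including the zero byte
  dec   cx              ; CX = length of the string
```

Because CX started at 0FFFFh, inverting it afterward yields the number of bytes scasb examined; subtracting one removes the terminator from the count.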

DX - Data Register

While DX is the least versatile register, that may be its greatest strength; we often use DX as truly a data storage register, without concern about shuttling data back and forth. DX also holds the high word of the dividend in 16-bit division and the high word of the product in 16-bit multiplication. After 16-bit division, the remainder is placed in DX. The in and out instructions use DX to specify the port number, with AL or AX carrying the data.

We use DX along with AX to return long integers to high-level languages. The most significant word is in DX, the low word is in AX. When returning far pointers, the segment is in DX, while AX contains the offset.

Note: When a function returns a short integer, it is a good idea to clear DX to zero, or to sign-extend the results into DX (the cwd instruction). This makes it easy to declare the function as a long integer or integer (or both -- with an ALIAS) without any additional code changes.

Tip: You can zero-out DX with the cwd (Convert Word to Double) instruction as long as the high-bit (the sign bit) of AX is zero. cwd sign-extends the sign bit of AX into all bits of DX; if the high bit of AX is zero, all of DX will become zero, but if AX is negative, DX will become -1. If you use this trick, be sure to explain it in a source code comment.

An interesting side effect is that you can return zero if AX is negative, or -1 if AX is zero or positive, without doing a conditional, using these instructions:

  cwd                   ; Sign extend AX into DX
  mov   ax, dx          ; Move the word back into AX
  not   ax              ; Toggle all bits of AX

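Gotcha: Set DX=0 Before Division
Make sure DX==0 (zero) before doing unsigned 16-bit integer division, because div uses DX:AX as a 32-bit dividend. (For signed division with idiv, sign-extend AX into DX with cwd instead.) Otherwise, if the quotient is too large to fit in AX, the instruction will cause a divide error, the same interrupt raised by division by zero.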
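
A sketch of both cases, dividing a 16-bit value by a 16-bit divisor. The operand values are arbitrary; the point is how DX is prepared before each instruction:

```asm
  mov   ax, 1000        ; Unsigned dividend (low word)
  xor   dx, dx          ; High word of the dividend = 0
  mov   bx, 7           ; Divisor
  div   bx              ; AX = 142 (quotient), DX = 6 (remainder)

  mov   ax, -1000       ; Signed dividend
  cwd                   ; Sign-extend AX into DX (DX = 0FFFFh)
  mov   bx, 7           ; Divisor
  idiv  bx              ; AX = -142, DX = -6 (remainder keeps the dividend's sign)
```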

BP - Base Pointer

As BX pointers are assumed to be relative to the DS segment, BP pointers are relative to the stack segment (SS). You can use a DS or ES segment override with BP (such as mov AX, ES:[BP]), but that misses the true power of BP: quickly addressing stack data.

Most routines set up a stack frame to access data passed on the stack (the [BP+6] lines we see scattered through our medium model code).

  push  bp              ; Save original contents of BP
  mov   bp, sp          ; Set BP to the stack pointer

When Assembly language routines exit, they restore BP from the stack:

  pop   bp              ; Restore original BP contents

Arguments passed to your routine on the stack are addressed as positive offsets from BP, which holds the value the stack pointer had just after BP was saved. Let's look at the relationship between offsets from BP and data passed to our routines again. Say your routine requires two integer arguments; the Basic declaration looks something like this:

DECLARE SUB Sandwich(Arg1%, Arg2%)

Following Basic calling conventions, arguments are pushed on the stack from left to right. The Basic code to call the routine looks something like:

  push  Arg1             ; 2 bytes on the stack
  push  Arg2             ; 2 more bytes
  call  FAR PTR Sandwich ; 4 additional bytes for ret

When the routine gains control, it sets up a stack frame:

  push  bp              ; 2 more bytes
  mov   bp, sp          ; BP = stack pointer

Now we have BP pointing to the value of the stack pointer after we saved BP.

Once again, on the way to the routine the program pushes data onto the stack from left to right, then the call instruction pushes the far return address (segment and offset), which is where execution resumes when the routine ends. Finally, when the routine begins it saves the BP register on the stack. Data on the stack looks like this:

The Stack Frame
Value              Offset  Location
Arg1               [BP+8]  Higher address
Arg2               [BP+6]
Return segment     [BP+4]
Return offset      [BP+2]
Saved BP register  [BP+0]  Lower address
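
Putting the frame to work, here is a sketch of what a complete Sandwich routine could look like. The body (adding Arg1 to Arg2) is purely hypothetical. It also assumes two things the table does not show: that Basic passes each integer as a near address in DGROUP (the Microsoft Basic default when BYVAL is not used), and that the called routine removes the arguments from the stack when it returns:

```asm
Sandwich PROC FAR
  push  bp              ; Save original contents of BP
  mov   bp, sp          ; Set BP to the stack pointer
  mov   bx, [bp+8]      ; BX = near address of Arg1
  mov   ax, [bx]        ; AX = value of Arg1
  mov   bx, [bp+6]      ; BX = near address of Arg2
  add   [bx], ax        ; Arg2 = Arg2 + Arg1 (hypothetical work)
  pop   bp              ; Restore original BP contents
  retf  4               ; Far return, discarding 4 bytes of arguments
Sandwich ENDP
```

Only BX and AX are changed, so nothing besides BP needs to be preserved.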

Gotcha: Don't Assume The Stack Is In DGROUP
In the medium model, the stack is in DGROUP (usually abbreviated as SS=DS), though you shouldn't make this assumption for compatibility with future compilers and operating systems. While BP is assumed to be SS (Stack Segment) relative, you can use a segment override to point BP at DGROUP.

When you've read everything passed on the stack and you don't have any local stack variables, you can use BP as a scratch register. You can also use BP like BX, to address memory, though remember that the default segment used with BP is the stack segment, so you will frequently need to add a segment override. This example reads a byte from the extra segment to AL using BP:

  mov   al, BYTE PTR es:[bp]

Gotcha: Preserve BP If You Change It
BP is reserved by Microsoft high-level languages. If your procedure will change the BP register, you must preserve the register when your routine begins (usually on the stack), then restore the register when your routine exits.

SI - Source Index

In addition to use as a general purpose register, SI is also the source pointer for string and memory operations. lodsb and lodsw assume that DS:SI points to the source data, while movsb and movsw assume DS:SI points to the source and ES:DI points to the destination.

There are no instructions for accessing just the high or low byte of SI.

Gotcha: Preserve SI
If your procedure changes the SI register, you must preserve the register when your routine begins (usually on the stack), then restore the register when your routine exits.

DI - Destination Index

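In addition to use as a general purpose register, DI is the destination pointer for string and memory operations. In string instructions, DI is assumed to be ES (Extra Segment) relative: scasb, stosb and similar instructions take ES:DI as the destination.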

This example shows the relationship between DS:SI and ES:DI.

  mov   cx, StrLen      ; Number of bytes
  les   di, DestPtr     ; ES:DI == Destination pointer
  lds   si, SourcePtr   ; DS:SI == Source pointer (load DS last)
  cld                   ; Copy forward (clear the direction flag)
  rep   movsb           ; Copy the string

Gotcha: Preserve DI If You Change It
If your procedure changes the DI register, you must preserve the register when your routine begins (usually on the stack), then restore the register when your routine exits.
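
Putting the copy and the preservation rules together, the body of a routine might look like this sketch. It assumes the same StrLen, SourcePtr and DestPtr variables as the example above, all stored in the default data segment, which is why DS is loaded last and restored afterward:

```asm
  push  si              ; Preserve the registers the caller expects intact
  push  di
  push  ds
  mov   cx, StrLen      ; Number of bytes
  les   di, DestPtr     ; ES:DI == Destination pointer
  lds   si, SourcePtr   ; DS:SI == Source pointer (changes DS)
  cld                   ; Copy forward
  rep   movsb           ; Copy the string
  pop   ds              ; Restore in reverse order
  pop   di
  pop   si
```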

SP - Stack Pointer

Everything we do affects the stack. The stack pointer register keeps track of data and return addresses, and keeps data from colliding as we jump between modules.

Once the program's startup code has set up the stack pointer, the pointer is maintained automatically as we work. Every time you push or pop a register, the stack pointer is adjusted for the data. Whenever you call a routine, the processor pushes the return address on the stack, then pops the return address into the instruction pointer when you exit the routine with a ret instruction.

You cannot directly address memory with the SP register in real mode. As we've already seen (and we'll show again), we use the BP register to make stack addressing easier.

The following example places a near pointer (the address of the next statement) on the stack with the call instruction. The second instruction, add SP,2, adjusts the stack pointer to eliminate the return address. There is no return address left on the stack, so there's nowhere to return to.

  call  NEAR PTR @F     ; A near call
@@:
  add   sp, 2           ; Discard the near return address

If you've worked with non-stack oriented processors and operating systems (some HP machines, for instance, use a fixed-size hardware return stack), this dependence on a user-defined stack sounds at the very least foolhardy. It works because everyone follows one simple rule: if you change the stack pointer, put it back when you're through. And to ensure that worlds don't collide, programs reserve enough extra space on the stack for any likely contingency -- then add a bit more just to be sure.

Simple programs allocate about 2K for the stack; large processes add quite a bit extra for the fudge factor. (Heavy on the fudge, Microsoft Windows NT 3.x, for instance, uses at least a 128K stack, and later operating systems may use a virtual stack, addressing a megabyte or more.)

Note: If you adjust the stack pointer, you can make room for local data on the stack. We'll show you how in the "Local Data" section, in the real book.

CS - Code Segment

The CS register always points to the segment where our code lives -- if program control passed successfully to our routine, we can assume the CS register is valid. We never load CS manually; it's loaded automatically by every far call and every far return instruction.

There is never any reason to pop CS; we use push CS to save the value on the stack, usually to load the value into another register. This example loads the ES register with the code segment:

  push  cs              ; Place code segment on the stack
  pop   es              ; Copy code segment to ES register

While it appears that there is no way to change the CS register, we do it all the time. Whenever we do a far call, the current CS value is saved on the stack, and when the routine ends, the retf instruction restores the CS register.

In the same way the stack segment (SS register) is associated with the stack pointer (SP), the CS register is associated with the instruction pointer, or IP register. The instruction pointer is the memory address of the next instruction the processor will load and execute.

We rarely concern ourselves with the instruction pointer directly; there are few instructions that work with the instruction pointer. One way to read the IP register is with a near call instruction, which places the value of the instruction pointer on the stack. The matching near ret instruction pops the value from the stack into IP. A far call and far ret push and pop both the CS register and the instruction pointer.
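
A minimal sketch of that technique, reading the instruction pointer into AX. Instead of returning, it pops the return address that the near call pushed:

```asm
  call  NEAR PTR @F     ; Push the offset of the next instruction
@@:
  pop   ax              ; AX = offset of this pop instruction
```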

DS - Data Segment

When the high-level language passes control to your procedure, DS points to the default data segment. More accurately, DS points to DGROUP, which includes all near data segments. DS is reserved by all high-level languages, and must be restored when your procedure exits.

We can't load a constant directly into a segment register. If you need to, for instance, set the data segment register equal to the video segment, load the segment to another register (usually AX), then move the number into DS.
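
For instance, here is a sketch that points DS at the video buffer and back again. It assumes a color text mode, whose buffer lives at segment B800h (monochrome adapters use B000h):

```asm
  push  ds              ; Preserve the caller's data segment
  mov   ax, 0B800h      ; Color text-mode video segment (an assumption)
  mov   ds, ax          ; DS = video segment
  xor   bx, bx          ; Offset 0 is the top-left character cell
  mov   al, [bx]        ; AL = character at the top-left of the screen
  pop   ds              ; Restore DS before returning to the caller
```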

ES - Extra Segment

This is truly the extra segment register. By default, it holds no particular value, and Microsoft high-level languages never assume ES is preserved. As with DS, you can't load the ES register directly with a constant.

ES is usually used with DI, though you can tack-on "ES:" as a segment override and use ES with SI, BX or BP. ES is useful to read a few bytes from far memory -- such as the ROM BIOS version from the F000 segment, or the number of lines on the screen from the 0000 segment.
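
A sketch of the screen-lines case, assuming an EGA or VGA adapter, whose BIOS keeps the number of screen rows minus one in the byte at 0000:0484h (older adapters do not maintain this value):

```asm
  xor   ax, ax          ; AX = 0
  mov   es, ax          ; ES = segment 0000h
  mov   bx, 0484h       ; Offset of the rows-minus-one byte (EGA/VGA)
  mov   al, es:[bx]     ; Read it through an ES segment override
  inc   al              ; AL = number of text rows on the screen
```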

Gotcha: Don't Use ES as Scratch
Avoid using ES as a scratch register. Even in real mode, where anything is fair-game, loading segment registers is slower than general purpose registers. At the worst, it will make it difficult to update the code to run in protected mode, where seemingly innocent looking code crashes the program.

SS - Stack Segment

As the DS register controls the data segment, the SS register works with the SP register to maintain the stack segment.

Never change the SS register within a procedure called from a high-level language unless you have allocated an internal stack for an exception handler or other task that requires extra stack space. Remember to change SS and SP at the same time, and be sure to disable interrupts when you do, or any event that uses even a bit of stack space, such as the clock tick interrupt, will assume the stack pointer is valid, and can cause unfortunate things to happen to whoever owns the memory.

Gotcha: Disable Interrupts When Changing The Stack Segment
You must disable interrupts before you change the stack segment. Since functions called from high-level languages never alter the stack segment, we won't discuss the procedure. It is not necessary to disable interrupts when you change the stack pointer (SP) to, for instance, allocate local stack data.

Instruction Operand Order

If you've been saddled to a high-level language for a while, one of the first Assembly language nuances that goes is operand order. Assigning values with mov is backward from the rest of the world. In your mind, replace mov with "LET." For instance:

  mov   ax, bx

Will always be easier to read if you pretend it says:

  let   ax = bx

The cmp instruction is often confusing because it does only half the job -- it's followed by a conditional jump instruction. Here are the two instructions:

  cmp   ax, bx
  ja    @F

Instead, think of a comparison and conditional jump as one large meta-instruction:

  if ax > bx then goto @F
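
One caution when making this translation: ja compares the operands as unsigned numbers, while jg compares them as signed numbers. A sketch with arbitrary values (the label names are hypothetical):

```asm
  mov   ax, 0FFFFh      ; 65535 unsigned, or -1 signed
  mov   bx, 1
  cmp   ax, bx
  ja    IsAbove         ; Taken: 65535 > 1 as unsigned values
  nop                   ; (skipped)
IsAbove:
  cmp   ax, bx
  jg    IsGreater       ; Not taken: -1 < 1 as signed values
  nop                   ; (executed, the jump falls through)
IsGreater:
```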

Note: MASM 6.0 brought the power of high-level .IF structures to Assembly language. We'll discuss this modern Assembly language programming style in the book.

Segment Overrides

Most processor memory instructions assume that we are manipulating data in the default data segment. You can change this assumption for a single instruction by adding a segment override. It looks something like this:

  mov   ax, WORD PTR cs:[bx]

Also called a segment override prefix, because a one-byte prefix is written to the program before the mov instruction, segment overrides make it easy to address just a bit of data in, for instance, the code segment (read-only, please!).

Gotcha: Don't Assume ES With DI
One common mistake is to assume that since string operations that work with DI assume the ES segment, DI is always relative to ES. This is not the case. For normal mov operations, as opposed to string instructions like movsb, movsw, cmps and scasb, DI is assumed to be relative to DS. If you need to use DI with the ES register, explicitly specify ES:

  mov   ax, WORD PTR ES:[DI]

Register Addressing

Addressing rules, the hard-wired connections between registers and memory, define how data moves through the processor. The following table shows available addressing modes on Intel 8086 processors:

Example            Source                Result
mov ax, 1234h      Immediate Constant    Copy an immediate (constant) value to a register. You cannot move immediate values to segment registers.
mov ax, Variable   Absolute Address      Copy the contents of a variable at an absolute address to a register. You can copy from memory to segment registers.
mov ax, bx         Register              Copy the contents of one register to another. You can copy between any two registers except segment registers; to copy from one segment register to another, use push or pop, or mov to a regular register.
mov ax, [bx]       Indirect              Copy from the near address pointed to by the BX register to AX. In 16-bit programs, you can use only base (BX or BP) or index (SI or DI) registers to address memory.
mov ax, [bx+2]     Base Relative         Copy from an indirect address plus an offset. The memory address is the sum of the register plus the immediate value.
mov ax, [si+bx]    Base, Index           Address calculations can use either base register (BX or BP) plus an index register (SI or DI). You cannot combine two base registers or two index registers.
mov ax, [si+bx+2]  Base, Index Relative  Copy from memory using a base, index and offset value.
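
The restriction on segment registers in the table above leaves two ways to copy one segment register to another. A short sketch of both, copying DS to ES:

```asm
  mov   ax, ds          ; Through a general purpose register
  mov   es, ax          ; ES = DS

  push  ds              ; Or through the stack
  pop   es              ; ES = DS
```

The push/pop form leaves AX untouched, at the cost of two stack accesses.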

Register Addressing Semantics

Copying between registers is straightforward; for semantic clarity (of which that phrase is not an example), we specify the source and destinations directly. When registers are used as pointers, Intel notation encloses the register name in brackets:

  mov   ax, bx          ; Load AX with the value of BX
  mov   ax, [bx]        ; Load AX with memory, pointed by BX

For every instruction that stores data in memory or a register, an inverse function reads data in the other direction. The beauty of this orthogonal design is that if you can copy from a source to a destination, you can also copy it back. Memory addresses, whether absolute or relative to a register, can include segment overrides:

  mov   ax, cs:Variable     ; Copy data from the code segment
  mov   ax, es:[bx]         ; The data is relative to ES segment
  mov   ax, es:[si+bx+2]    ; Copy data from base+index+offset

In the other direction, we swap the source and destination arguments:

  mov   cs:Variable, ax     ; Copy data to the code segment
  mov   es:[bx], ax         ; Write to an ES-relative address
  mov   es:[si+bx+2], ax    ; Copy from AX to base+index+offset

Note: Only base and index registers can be used for addressing memory in real mode. You cannot use AX, CX, or DX to address memory.

Little-Endians

The 8086 processors are little-endians, which means the least significant bytes of any number fall at lower memory addresses. The effect is memory "rolling-into" the CPU; lower memory addresses go into the lower bytes of registers. An integer, for instance, is stored in memory with the least significant (low) byte at the lowest memory address. Consider a long integer declared with DD (define four-byte data):

Stuff DD 12345678h

In memory, the low byte lies first, followed by the second, and so forth. Notice that the two nibbles in each byte are not reversed:

 Low address   78   56   34   12   High address

For a long integer, like the one above, the low word comes first and the high word follows. Here's one way to load the long integer into DX:AX (note that we used the WORD PTR operator):

  mov   ax, WORD PTR Stuff     ; The low two bytes
  mov   dx, WORD PTR Stuff + 2 ; The high two bytes

In this next example, we will read the first two bytes of a string to the AX register; segments are declared using MASM directives:

.DATA                          ; Open the data segment
Hello DB "Hello, world!"       ; A string constant
.CODE                          ; Open the code segment
  mov   ax, WORD PTR Hello     ; Load the first two bytes

We now have the first two bytes in AX. The little-endian Intel processor placed the low byte ("H") in AL, and the high byte ("e") in AH, so AX="eH". To copy the bytes back into memory, you could use something like:

  mov   ax, "eH"
  mov   WORD PTR Hello, ax

Or write an immediate value, without assigning it to a register:

  mov   WORD PTR Hello, "eH"

Notice that we had to explicitly tell MASM to treat the two bytes as a word, otherwise it would fail because Hello is declared as an array of chars (to use C parlance), while AX expects to read a word at a time.

The little-endian effect is more prominent with 32-bit registers, where loading four bytes into, for instance, EAX places the first byte in the low (far right) end of the register, and the highest addressed byte at the high (far left) end of the register.

When you declare a DW (integer) or DD (long integer), MASM automatically reverses the order of the bytes when it writes the object file. It writes strings to the object file without scrambling them.

Note: The table on page 137 lists MASM data types and their mnemonic names. We often try to read or write two bytes at a time, because memory access is slow on processors before the 80486, and lodsb and stosb take the same amount of time as lodsw and stosw. Remember, though, that reading or writing text two bytes at a time results in everything being byte-reversed while it's in the CPU, requiring a little thought to keep it all sorted out.

Newer Processors

IBM's PC/AT of 1984 was powered by the then-new 80286; the computer took off, but the 286 never met with much joy from programmers. The 286 had a few interesting new instructions, but its main strength was the ability to address beyond the 1MB limit of the 8086. This is why the only 286-specific programs you're likely to see are memory managers. After this boring step, Intel got serious with the 80386; a sigh of relief echoed through the programming community when they saw the beloved 32-bit 386.

While the 80386 has been with us since 1986, it was expensive, and its features were under-exploited until 1990, when Microsoft introduced Windows 3.0 -- about the same time, 80386 prices started to fall.

Programming the 80386 and newer processors is superficially the same as coding for the 8086. From a 16-bit programming standpoint, we usually consider a 486 to be a fast 386, and Pentiums to be mega-fast 386s. This means that we can freely use 386-specific instructions, which we'll call 32-bit instructions, but we avoid 486 and newer instructions. Contrary to what we read in magazines, the 386 is not dead -- it's living inside our Pentium; there are few advantages to the 486, or even the Pentium Pro, for the programmer. (There are advantages in using MMX, 3DNow!, Streaming SIMD, and a host of other hardware enhancements that weren't around when this was written.)

The sad irony for me is that now that we have 32-bit processors that are easier to program, everyone has moved to high-level languages, never to see the grace and beauty of the Intel 32-bit instruction set.

Intel 32-Bit CPU Registers

Intel 32-bit processors widen the standard registers to 32 bits, add two new segment registers, and add a host of useful new general-purpose instructions.

General Purpose 32-Bit Registers
Reg-32  Reg-16  High  Low   Name
EAX     AX      AH    AL    Accumulator
EBX     BX      BH    BL    Base
ECX     CX      CH    CL    Counter
EDX     DX      DH    DL    Data

32-Bit Pointer Registers
Reg-32  Reg-16  Name
ESI     SI      Source Index
EDI     DI      Destination Index
EBP     BP      Stack Frame Base Pointer
ESP     SP      Stack Pointer

Segment/Selector Registers
Reg   Name
CS    Code Segment
DS    Data Segment
ES    Extra Segment
FS    Extra Segment
GS    Extra Segment
SS    Stack Segment

The EAX register is the AX register, extended to 32-bits. While AX is still the mnemonic representing the lower-half of EAX, there is no register name for the upper-half of EAX. So, while there is no "EAH" register, the shift instructions are extended to the full 32-bit register, so you can copy register halves with shifts and rotates:

  ror   eax, 16         ; Exchange high- and low-word of EAX
  shr   eax, 16         ; Shift right, clear the high-word

Another advantage of 32-bit processors is the enhanced instruction set. Instructions like bsf (bit scan), bt (bit test), cdq (convert double to quad), movzx (move with zero-extend), setz (set byte to 1 if zero flag is set), shld (shift left double precision), and enhancements to existing instructions greatly enhance the aged Intel 8086 architecture.

When we exploit the features and power of 32-bit processors, our programs rise from the doldrums of tired-old XT's. We can combine 16- and 32-bit registers and instructions within a single function.

Tip: Several books have promoted the misconception that 32-bit registers are not available to 16-bit DOS programs. This is not true. You cannot use 32-bit offsets larger than 64K, but all 32-bit general purpose registers are available. For example, you can store any value in EBX, but you cannot use EBX for addressing memory if the offset is greater than 64K.
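
As an illustration, here is a sketch that adds two long integers first with 16-bit registers and then with EAX. The variable names are hypothetical, and the .386 directive is assumed to appear after .MODEL so that MASM keeps the segments 16-bit:

```asm
.MODEL medium, basic
.386                    ; After .MODEL, so segments stay 16-bit
.DATA
Long1 DD 12345678h      ; Hypothetical long integers
Long2 DD 11111111h
Sum   DD ?
.CODE
  ; 16-bit version: add the low words, then the high words with carry
  mov   ax, WORD PTR Long1
  mov   dx, WORD PTR Long1 + 2
  add   ax, WORD PTR Long2
  adc   dx, WORD PTR Long2 + 2
  mov   WORD PTR Sum, ax
  mov   WORD PTR Sum + 2, dx

  ; 32-bit version of the same addition
  mov   eax, Long1
  add   eax, Long2
  mov   Sum, eax        ; Sum = 23456789h
```

Both versions store the same result; the 32-bit one needs half the instructions and no carry handling.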

That's it. The rest is as easy as 32-bit code (and floating point, MMX, 3DNow!, and Streaming SIMD), but we'll save that for another article.

This information is borrowed from a 16-bit edition of my Assembly language book.

Note: Teachers find this document on the Internet as often as students do.

© Copyright 2007 R. E. Harvey, All rights reserved.