Chapter 3 - Switching to Protected Mode

Now, for what you've long been waiting for - a way to actually apply protected mode concepts in a real program. The code we will describe shortly will switch the CPU of a computer from real mode to protected mode, briefly demonstrate memory access in protected mode, switch back to real mode, and return to DOS.

Making sure we're really in real mode to start

The code to switch to and from protected mode makes use of what are called privileged instructions. If you are already in protected mode, privileged instructions are disabled for application programs and can only be used by the operating system itself. In real mode, however, any program can use privileged instructions. It may appear that any DOS program created in MASM will always run in real mode, but this is not the case. In fact, any program run from a Windows DOS box is actually running in protected mode. (The CPU's virtual-8086 mode, which Windows uses for DOS boxes, reproduces the shift-and-add method of segmentation, so most ordinary programs run just like they would in real mode.) Since programs run in a DOS box are in protected mode (and are clearly application programs), all privileged instructions are disabled. (It may seem odd that Windows doesn't run DOS programs in real mode. The reason is that if DOS programs were run in real mode, Windows shortcuts such as Alt-Tab, Ctrl-Alt-Delete, and the taskbar would not work properly. Also, only one program could run at a time, and a crash in one program would crash the entire computer.)

You can positively determine if you are in real mode to start by checking the PE bit of the CR0 register. The following code performs this test:

      mov eax,cr0
      test eax,1
      je L1
      ;already in protected mode - privileged instructions are disabled
      call ReportError
      jmp Exit
L1:
      ;real mode - privileged instructions are enabled.  Switch to protected mode is allowed.

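Strictly speaking, reading cr0 is itself a privileged operation, but a protected-mode host such as a Windows DOS box typically emulates the read instead of terminating the program, which is why this test can be run safely. (The unprivileged alternative is 'smsw ax', which returns the low word of cr0.) You can therefore add this test to the start of any program that switches to protected mode. If the program is already in protected mode, it can display an error message and then exit without a crash.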

Format of a Descriptor

The first thing we need to do is set up a GDT somewhere in memory. In order to set up the GDT, however, we must first examine the format of a descriptor.

Each descriptor is eight bytes long. For our purposes, we'll use a simplified format as follows:

Data Descriptor

Offset	Byte Contents
0	FFh
1	FFh
2	Bits 0-7 of segment base
3	Bits 8-15 of segment base
4	Bits 16-23 of segment base
5	92h
6	10h
7	Bits 24-31 of segment base

Code Descriptor

Offset	Byte Contents
0	FFh
1	FFh
2	Bits 0-7 of segment base
3	Bits 8-15 of segment base
4	Bits 16-23 of segment base
5	9Ah
6	10h
7	Bits 24-31 of segment base

Note that the segment base is not stored contiguously.
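
The byte layout above can be checked with a few lines of Python. This is only a sketch of the simplified format used in this chapter; the function name make_descriptor and the sample base 12340h are assumptions made for illustration.

```python
def make_descriptor(base, access):
    """Pack a descriptor in the chapter's simplified 8-byte format."""
    raw = bytes([
        0xFF, 0xFF,            # bytes 0-1: FFh, FFh
        base & 0xFF,           # byte 2: bits 0-7 of the base
        (base >> 8) & 0xFF,    # byte 3: bits 8-15 of the base
        (base >> 16) & 0xFF,   # byte 4: bits 16-23 of the base
        access,                # byte 5: 92h for data, 9Ah for code
        0x10,                  # byte 6: fixed at 10h in this format
        (base >> 24) & 0xFF,   # byte 7: bits 24-31 of the base
    ])
    # x86 stores multi-byte values little-endian, so byte 0 is the lowest byte
    return int.from_bytes(raw, "little")


# A code descriptor with a hypothetical base of 12340h
print(f"{make_descriptor(0x12340, 0x9A):016X}h")
```

Read as a 64-bit number, the base ends up split into two pieces on either side of bytes 5 and 6, which is why it cannot be written with a single store.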

Note: Although a GDT is required for protected mode, an LDT is optional. Obviously, selectors cannot refer to an LDT entry if an LDT does not exist. The LDT has exactly the same format as the GDT.

Data Descriptors vs. Code Descriptors

The processor distinguishes between data descriptors and code descriptors by the contents of byte 5 in the descriptor. Selectors referring to data descriptors can be loaded into any segment register except cs. Data descriptors can be accessed through instructions such as 'mov ax,ds:[2]', but cannot be referenced in an instruction like 'jmp 0010:0000'. Attempting to load a data descriptor into the cs register with a far jmp or a far ret will crash the computer.

Code descriptors, however, are designed explicitly for the purpose of being loaded into cs, that is, jumped to, called, or returned to. Selectors referencing code descriptors can also be loaded into segment registers ds, es, fs, and gs (but not ss), and the cs segment can still be used to access data (like 'mov eax, cs:[0]'). When a code segment is accessed like data, however, the segment becomes read-only. This means that 'mov eax, cs:[0]' would be allowed, but 'mov cs:[0],eax' would not. This prevents a program from accidentally modifying its own code. (Self-modifying code can still be done by creating a data descriptor in the GDT that has the same base as a code descriptor.)

Technically, you only have to have a data descriptor if your program needs to write to memory (which in practice means all the time). At least one code descriptor, however, must be present no matter what. This is because cs must reference a valid code descriptor at all times while in protected mode so the processor will be able to read the instructions from memory that need to be executed.
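
To see what separates 92h from 9Ah, byte 5 can be split into its individual bits. The sketch below follows the standard x86 layout of this byte, which the simplified format in this chapter does not spell out; the function and key names are made up for the example.

```python
def describe_access(access):
    """Split byte 5 of a descriptor into the fields the CPU looks at."""
    return {
        "present": bool(access & 0x80),            # bit 7: descriptor is usable
        "privilege_level": (access >> 5) & 0x3,    # bits 5-6: 0 = most privileged
        "code_or_data": bool(access & 0x10),       # bit 4: 1 = ordinary code/data segment
        "executable": bool(access & 0x08),         # bit 3: 1 = code, 0 = data
        "writable_or_readable": bool(access & 0x02),  # bit 1: writable (data) / readable (code)
    }


for value in (0x92, 0x9A):
    kind = "code" if describe_access(value)["executable"] else "data"
    print(f"{value:02X}h -> {kind}")
```

The two values differ only in bit 3, the bit that marks a segment as executable.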

Lining up our segments

Now, we use the descriptor format to set up a table. Our first descriptor will be our program's code descriptor. We can make our code simpler by setting up the code descriptor so that when our program switches to protected mode, every procedure has the same offset as in real mode (although obviously not the same segment!). When assembling code, MASM assumes, rightly or wrongly, that your program stays in real mode at all times. This means that when a call instruction is assembled, MASM calls the real-mode offset of the destination procedure. For example, suppose you have a procedure "getch" that waits for a keystroke and is located at the real-mode logical address 1234:5678. Whenever you have an instruction to call the getch procedure, MASM will always generate code to call offset 5678 of the code segment. In protected mode, however, the actual offset of the getch procedure is arbitrary: it depends on what base you put in the code descriptor. If we cleverly choose a base of 12340h, the getch procedure will have the same offset in protected mode as in real mode, and the attempt to call it will work properly in both modes. MASM does not know which selector refers to the segment containing getch (this is arbitrary too, depending on where in the GDT the code descriptor is located), but this does not matter: as long as you stick to near calls, MASM does not need to know the segment. Far calls and jumps can be made to work by explicitly stating the destination selector (ex. "call 0008:getch").

Just as we line up our code descriptor with our program's code segment, we can do the same with the data descriptor. Since MASM implicitly references the real-mode offsets of your data variables, lining up the data segment makes the real-mode and protected-mode offsets identical, so that MASM's data references remain correct.
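
The arithmetic behind this alignment is small enough to check directly. The Python sketch below uses the 1234:5678 example from the text; the function names are invented for illustration.

```python
def real_mode_physical(segment, offset):
    """Shift-and-add: the real-mode physical address of segment:offset."""
    return (segment << 4) + offset


def aligned_base(segment):
    """Descriptor base that lines a protected-mode segment up with a real-mode one."""
    return segment << 4


def protected_mode_address(base, offset):
    """In protected mode the CPU adds the descriptor's base to the offset."""
    return base + offset


base = aligned_base(0x1234)
print(f"base = {base:05X}h")
print(f"real mode:      {real_mode_physical(0x1234, 0x5678):05X}h")
print(f"protected mode: {protected_mode_address(base, 0x5678):05X}h")
```

Both modes reach the same physical address with the same offset 5678h, so a near call assembled for real mode still lands on getch.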

Setting up the GDT

Our GDT will contain four descriptors after the NULL descriptor. The first will refer to our program's code. The second will have a base of zero. The third will refer to our program's data, and the fourth will have a base of b8000h and will be used for displaying text on the screen. Our GDT is summarized in the following table:

Offset	Contents
00-07	NULL Descriptor (ignored)
08-0F	index=1 (selector=0008h), code, base=[physical address of [program's code segment]:[0] ]
10-17	index=2 (selector=0010h), data, base=00000000h
18-1F	index=3 (selector=0018h), data, base=[physical address of [program's data segment]:[0] ]
20-27	index=4 (selector=0020h), data, base=000b8000h
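
The offsets and selectors in a table like this follow directly from the index, since every descriptor is eight bytes long. A minimal Python sketch, assuming GDT selectors at privilege level 0 (as used throughout this chapter) and using made-up function names:

```python
def selector_for(index):
    """Selector of GDT entry 'index' at privilege level 0 (index shifted left by 3)."""
    return index << 3


def split_selector(selector):
    """Return (index, table bit, requested privilege level) of a selector."""
    return selector >> 3, (selector >> 2) & 1, selector & 3


names = ["NULL", "code", "data, base 0", "program data", "screen"]
for index, name in enumerate(names):
    start = index * 8  # each descriptor occupies 8 bytes
    print(f"{start:02X}-{start + 7:02X}  index={index}  "
          f"selector={selector_for(index):04X}h  {name}")
```

For these selectors the value is also the byte offset of the descriptor within the GDT, which is why entry 4 starts at offset 20h.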

In memory, our GDT can be stored as follows:

	GDT dq 0000000000000000h	;NULL descriptor (ignored, 
					;so we can just put a zero here)

	    dq ??109A??????FFFFh	;index 1 descriptor (? marks 
					;contain the location of the program's code)

	    dq 001092000000FFFFh	;index 2 descriptor

	    dq ??1092??????FFFFh	;index 3 descriptor (? marks contain 
					;the location of the program's data)

	    dq 0010920B8000FFFFh	;index 4 descriptor (base=000B8000h)
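
Filling in the ? marks at run time means patching only the base bytes of a descriptor that already holds its fixed bytes. The Python sketch below models that on a byte string; the template, the function name, and the sample base 12340h are assumptions for illustration.

```python
# Code descriptor template with a zero base, listed from byte 0 to byte 7
TEMPLATE = bytes.fromhex("FFFF0000009A1000")


def patch_base(descriptor, base):
    """Return a copy of the descriptor with 'base' written into bytes 2-4 and 7."""
    patched = bytearray(descriptor)
    patched[2:5] = (base & 0xFFFFFF).to_bytes(3, "little")  # bits 0-23
    patched[7] = (base >> 24) & 0xFF                        # bits 24-31
    return bytes(patched)


result = patch_base(TEMPLATE, 0x12340)
print(result.hex())
print(f"byte 5 is still {result[5]:02X}h")
```

Bytes 5 and 6 sit between the two pieces of the base, so whatever code does the patching has to leave them untouched.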

Now, we need to tell the CPU where to find the GDT and how long the GDT is. To do this, we make use of the instruction 'lgdt'. (lgdt is a privileged instruction and will crash your program if you are not truly in real mode.) lgdt takes one memory operand, which specifies the address of a 6-byte buffer called the GDTR. The first two bytes of the buffer contain the number 1 less than the size (in bytes) of the GDT, and the remaining four bytes contain the physical address of the GDT. We can now set up our GDT with the following code:

.data		

	GDT dq 0000000000000000h	;NULL descriptor (ignored, so we 
					;can just put a zero here)

	    dq 00109A000000FFFFh	;index 1 descriptor (? marks 
					;substituted with zeros, filled in at run-time)

	    dq 001092000000FFFFh	;index 2 descriptor

	    dq 001092000000FFFFh	;index 3 descriptor (? marks substituted 
					;with zeros, filled in at run-time)

	    dq 0010920B8000FFFFh	;index 4 descriptor

GDTR df ?

.code

setup_gdt proc
	;get data segment in ds
	xor eax,eax			;make sure the upper word of eax is zero
	mov ax,@data
	mov ds,ax

	;update base of data descriptor
	shl eax,4			;eax now contains base of data 
					;descriptor (shift-and-add method applied here).
	mov word ptr ds:GDT[1Ah],ax	;store bits 0-15 of the base (bytes 2-3 of descriptor 3)
	shr eax,16
	mov byte ptr ds:GDT[1Ch],al	;store bits 16-23 of the base (byte 4).  Byte 5 (92h)
					;must not be overwritten.  Bits 24-31 (byte 7) stay
					;zero because real-mode addresses are below 1 MB.

	;repeat the above procedure to store 
	;the base of the code descriptor
	
	xor eax,eax
	mov ax,@code
	shl eax,4
	mov word ptr ds:GDT[0Ah],ax
	shr eax,16
	mov byte ptr ds:GDT[0Ch],al

	call setup_gdtr
	ret
	
setup_gdt endp

setup_gdtr proc
	
	;get physical address of GDT table

	xor eax,eax	;clear eax so the upper word is zero
	mov ax,@data	;get data segment in ax

	shl eax,4
	add eax,offset GDT		;apply the shift-and-add method.  
					;eax now contains the physical address of the GDT.

	mov di,offset GDTR
	mov word ptr ds:[di],27h	;Copy GDT limit to bytes 0-1 of the GDTR.  GDT is 28h bytes long.  
					;We need to subtract 1 so 27h is the number we need.

	mov dword ptr ds:[di+2],eax	;copy physical address of GDT to bytes 2-5 of the GDTR

	lgdt fword ptr ds:[di]		;Specify the location of the GDTR, which 
					;specifies the location and size of the GDT.  
					;(This is a privileged instruction.)

	ret				;done - GDT is properly set up
setup_gdtr endp

Here are some important points to remember about the above code:

Pay special attention to the difference between the GDT and the GDTR. They are not the same!
In MASM syntax, 'fword' is the type for the 6-byte buffer needed for the lgdt instruction. 'dword' or 'qword' will generate an error because MASM will think the operand sizes don't match, and your program won't assemble.
Note that the LGDT instruction itself uses a real-mode logical address to specify the location of the GDTR (remember, we're still in real mode!). The GDTR specifies the actual physical address of the GDT.
Notice where the GDT limit field in the GDTR comes from. (Just multiply the number of GDT entries, including the NULL descriptor, by 8 and then subtract 1.)
If you want to use an LDT, you build the table itself just like a GDT, but you load it differently. The 'LLDT' instruction does not take a 6-byte buffer; it takes a 16-bit selector that refers to a special LDT descriptor stored in the GDT, and that descriptor holds the base and limit of the LDT.
Make sure you understand exactly how the code and data segment bases were chosen.
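
The 6-byte buffer handed to lgdt can be modelled with Python's struct module. This is a sketch only: the function name and the sample physical address 23450h are assumptions, while the layout (2-byte limit first, then 4-byte base) is the one the CPU expects.

```python
import struct


def make_gdtr(gdt_physical_address, entry_count):
    """Build the 6-byte buffer for lgdt: a 16-bit limit followed by a 32-bit base."""
    limit = entry_count * 8 - 1  # 8 bytes per descriptor, minus 1
    # "<HI" = little-endian, unsigned 16-bit value, then unsigned 32-bit value
    return struct.pack("<HI", limit, gdt_physical_address)


gdtr = make_gdtr(0x23450, 5)  # 5 entries, counting the NULL descriptor
print(gdtr.hex())
```

With five entries the limit is 27h, and the buffer is exactly six bytes long, matching the 'fword' type used in the assembly code.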

Disabling Interrupts

Now that the GDT is set up, we need to disable interrupts. Interrupts in protected mode do not make use of the interrupt vector table as in real mode, but instead use a new table called the Interrupt Descriptor Table or IDT. For right now, for simplicity's sake, we won't set up an IDT, but interrupts must still be disabled so that hardware interrupts won't crash the computer (even if you do set up an IDT, you should still disable interrupts before switching to protected mode and re-enable them once protected mode has been reached). To disable interrupts, we use the instruction:

cli

Performing the switch

Once the GDT is set up, we're ready for the switch. Remember that bit 0 of cr0 (called the PE bit) stores the CPU's mode. A 1 denotes protected mode, while a 0 denotes real mode. Just as you can determine the CPU's mode by reading the PE bit, you can also set the CPU's mode by writing to it. Therefore, once interrupts are disabled and the GDT is set up, you can switch to protected mode with just three instructions:

	mov eax,cr0	;read the cr0 register.  (Still in real mode.)

	or eax,1	;set the PE bit without changing the other 
			;bits in cr0.  (Still in real mode.)

	mov cr0,eax	;copy the image of cr0 in eax (including the PE bit) 
			;to the actual cr0.  (This is a privileged instruction.  The CPU is
			;in protected mode once this instruction is complete.)

Once in protected mode, we must immediately execute a far jmp instruction. This is because cs still contains the same real-mode value as before, which is not valid in protected mode. Thus, our first protected mode instruction must be:

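	;far jmp to 0008:pmode_entry_point.  MASM generally will not assemble a
	;far jmp with an explicit selector, so the instruction is encoded by hand.
	db 0EAh				;opcode of a 16-bit far jmp
	dw offset pmode_entry_point	;destination offset
	dw 0008h			;destination selector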

pmode_entry_point:

It may appear that the above code does nothing, but it is crucial since it sets cs to the proper value (0008 is the selector that references our code descriptor since our code descriptor is located at index 1 of the GDT.)

Next, we have the problem that the remaining segment registers (ds, es, fs, gs, and ss) still contain 'real mode' values that are incompatible with protected mode. All of them should immediately be loaded with sensible values for protected mode. It is good practice to load a value into every segment register, including those that you think you won't be using. Segment registers that you do not anticipate using should be loaded with the NULL selector, while segment registers that you do anticipate using should be loaded with a selector that refers to a data descriptor. Thus our code continues with:

	mov ax,0	;NULL selector
	mov bx,0018h 	;data selector

	mov ds,bx
	mov ss,bx

	mov es,ax
	mov fs,ax
	mov gs,ax

Setting up the stack

To set up the stack, all you have to do is load a valid data selector into the ss register and set sp to an appropriate value. By aligning your stack segment so that the base of the stack descriptor has the same value as (16 * @stack), the value in sp left over from real mode is already appropriate. Not only do you not have to mess with the value of sp, but you also have the advantage that data pushed on the stack in real mode can still be popped off the stack in protected mode (and vice versa). In our program, the above code meets this criterion if it is assembled in the "tiny" memory model, where the data and the stack share the same segment. Even if you do not need to use the stack, the ss register must still contain a valid (non-NULL) data selector.

Switching back to real mode

On a 286 computer, the only way to switch from protected mode back to real mode is to reset the CPU. On a 386 or later, you can switch back to real mode simply by clearing the PE bit. Thus, the following code switches from protected mode back to real mode:

	;clear the PE bit
	mov eax,cr0
	and eax,0FFFFFFFEh
	mov cr0,eax

	;make cs have correct real mode value
	jmp far ptr real_mode_code	;far jmp to @code:real_mode_code
real_mode_code:
	mov ax,@stack
	mov ss,ax
	mov ax,@data
	mov ds,ax
	mov es,ax
	mov fs,ax
	mov gs,ax

	sti		;enable interrupts

	;done

Note: If your program was transparently started in protected mode, as in the case of a DOS box, you cannot simply switch to real mode by clearing the PE bit and then enter protected mode yourself. This is because the instruction that modifies cr0 is a privileged instruction. When your own program, rather than Windows, has made the switch from real mode to protected mode, however, the privileged instruction to switch back to real mode is allowed because the CPU treats your program as if it were the operating system.
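
The three cr0 operations used in this chapter (testing, setting, and clearing the PE bit) are plain bit manipulation. The Python sketch below models them on an ordinary integer; the function names are made up, and the sample value 60000010h is only an illustrative assumption for the contents of cr0.

```python
PE_BIT = 0x00000001  # bit 0 of cr0


def in_protected_mode(cr0):
    """Equivalent of 'test eax,1': true when the PE bit is set."""
    return bool(cr0 & PE_BIT)


def set_pe(cr0):
    """Equivalent of 'or eax,1': set PE, leave every other bit alone."""
    return cr0 | PE_BIT


def clear_pe(cr0):
    """Equivalent of 'and eax,0FFFFFFFEh': clear PE, leave every other bit alone."""
    return cr0 & 0xFFFFFFFE


cr0 = 0x60000010  # sample real-mode value (assumption)
switched = set_pe(cr0)
print(f"{cr0:08X}h -> {switched:08X}h -> {clear_pe(switched):08X}h")
```

Setting and then clearing the bit returns exactly the original value, which is why the assembly code uses 'or' and 'and' rather than loading a constant into cr0.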

Putting it all together

Here is the complete framework for a protected mode program. It will do nothing except display the letter 'A' in the upper-left-hand corner of the screen.

.model tiny
.386p					;allow 386 registers and privileged instructions
					;(placed after .model so segments stay 16-bit)
.stack 100h
.data
	GDT dq 0000000000000000h	;NULL descriptor (ignored, 
					;so we can just put a zero here)

	    dq 00109A000000FFFFh	;index 1 descriptor (? marks substituted 
					;with zeros, filled in at run-time)

	    dq 001092000000FFFFh	;index 2 descriptor

	    dq 001092000000FFFFh	;index 3 descriptor (? marks substituted 
					;with zeros, filled in at run-time)

	    dq 0010920B8000FFFFh	;index 4 descriptor

GDTR df ?

.code

setup_gdt proc
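	;get data segment in ds
	xor eax,eax			;make sure the upper word of eax is zero
	mov ax,@data
	mov ds,ax

	;update base of data descriptor

	shl eax,4			;eax now contains base of data 
					;descriptor (shift-and-add method applied here).
	mov word ptr ds:GDT[1Ah],ax	;store bits 0-15 of the base (bytes 2-3 of descriptor 3)
	shr eax,16
	mov byte ptr ds:GDT[1Ch],al	;store bits 16-23 of the base (byte 4).  Byte 5 (92h)
					;must not be overwritten.

	;repeat the above procedure to store the 
	;base of the code descriptor

	xor eax,eax
	mov ax,@code
	shl eax,4
	mov word ptr ds:GDT[0Ah],ax
	shr eax,16
	mov byte ptr ds:GDT[0Ch],al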

	call setup_gdtr
	ret
	
setup_gdt endp

setup_gdtr proc
	
	;get physical address of GDT table

	xor eax,eax	;clear eax so the upper word is zero
	mov ax,@data	;get data segment in ax

	shl eax,4
	add eax,offset GDT		;apply the shift-and-add method.  eax now 
					;contains the physical address of the GDT.

	mov di,offset GDTR
	mov word ptr ds:[di],27h	;Copy GDT limit to bytes 0-1 of the GDTR.  GDT is 28h bytes long.  
					;We need to subtract 1 so 27h is the number we need.

	mov dword ptr ds:[di+2],eax	;copy physical address of GDT to bytes 2-5 of the GDTR

	lgdt fword ptr ds:[di]		;Specify the location of the GDTR, which implicitly 
					;specifies the location and size of the GDT.  
					;(lgdt is a privileged instruction)

	ret				;done - GDT is properly set up
setup_gdtr endp

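;ReportError is called below but not defined in the original listing.
;A minimal version is assumed here: it prints a message through DOS.
.data
ErrorMsg db 'Not in real mode - cannot switch to protected mode.',13,10,'$'
.code

ReportError proc
	mov ax,@data
	mov ds,ax
	mov dx,offset ErrorMsg
	mov ah,9		;DOS function 9 : print '$'-terminated string
	int 21h
	ret
ReportError endp

main proc

	;step 1 : make sure we're really in real mode

	mov eax,cr0
	test eax,1
	je L1
	
	;already in protected mode - privileged 
	;instructions are disabled
	
	call ReportError
Exit:
	;return to dos - not in real mode to start
	mov ax,4c00h
	int 21h

L1:
	;real mode - privileged instructions are enabled.  
	;Switch to protected mode is allowed.

	;step 2 : setup the GDT and GDTR, then disable interrupts
	;(no IDT has been set up)
	call setup_gdt
	cli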

	;step 3 : perform the switch

	mov eax,cr0	;read the cr0 register.  (Still in real mode.)
	or eax,1	;set the PE bit without changing the 
			;other bits in cr0.  (Still in real mode.)
	mov cr0,eax	;copy the image of cr0 in eax (including the PE 
			;bit) to the actual cr0.  (This is a privileged instruction.  
			;The CPU is in protected mode once this instruction is complete.)

	;step 4 : update cs with a far jmp to 0008:pmode_entry_point.
	;MASM generally will not assemble a far jmp with an explicit
	;selector, so the instruction is encoded by hand.
	db 0EAh				;opcode of a 16-bit far jmp
	dw offset pmode_entry_point	;destination offset
	dw 0008h			;destination selector

pmode_entry_point:
	;step 5 : update remaining segment registers

	mov ax,0	;NULL selector
	mov bx,0018h 	;data selector
	mov cx,0020h	;screen selector

	mov ds,bx	;make ss and ds refer to data/stack segment
	mov ss,bx

	mov es,cx	;make es refer to screen segment

	mov fs,ax	;assign fs and gs the NULL selector 
			;since they won't be used

	mov gs,ax

	;step 6 : display the 'A' by directly writing to 
	;the screen buffer (dos/bios interrupts don't work in protected mode)

	mov byte ptr es:[0],'A'

	;step 7 : switch back to real mode

	;clear the PE bit
	mov eax,cr0
	and eax,0FFFFFFFEh
	mov cr0,eax

	;make cs have correct real mode value
	jmp far ptr real_mode_code	;far jmp to @code:real_mode_code
real_mode_code:
	mov ax,@stack
	mov ss,ax
	mov ax,@data
	mov ds,ax
	mov es,ax
	mov fs,ax
	mov gs,ax

	sti		;enable interrupts

	;done
	
	;step 8 : return to dos
	mov ax,4c00h
	int 21h

main endp
end main

General points to consider

DOS and BIOS interrupts will not work in protected mode. To call a DOS or BIOS interrupt, you must switch to real mode first, call the interrupt as you normally would, and then switch back to protected mode.
Text can be displayed by writing to the text buffer at physical address 0b8000h just as in real mode.
Physical addresses above 1 MB can be accessed once the A20 address line is enabled, provided no extended-memory manager such as HIMEM.SYS is using them. (DOS itself, as a real-mode operating system, cannot address that memory directly.)
If you boot the computer in DOS mode and you find that the PE bit is still set at the beginning of your program, then that means you have a memory manager installed. You can update the files config.sys or autoexec.bat to disable the memory manager, or you can store your program in the boot sector of a floppy disk, and your program should start in real mode.
Even in a DOS box under Windows, protected mode can still be accessed through a technique called DPMI. DPMI uses a series of interrupts which are implemented by Windows to simulate a switch to protected mode by disabling Windows's implementation of the shift-and-add method. Interrupts can also be used to add or edit LDT entries. Google search "DPMI" for more information.
To find more information on protected mode, don't search "protected mode" because you'll get junk ranging from protecting the president to the operating mode of a company. Instead, use a technical protected-mode-related term as a search term. Searching "lgdt", "mov cr0,eax", or "gdt" should yield more useful results. For information on implementing interrupts in protected mode, a good search term is "lidt". ("lidt" is a special assembly instruction used to set up the protected-mode equivalent of an interrupt vector table.) Also, for more information on protected mode and assembly programming in general, you can check our link page.
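
As a small illustration of the text-buffer point above, the offset of any character cell can be computed from its row and column. The Python sketch below assumes the standard 80-column colour text mode, where each cell takes two bytes (the character, then its colour attribute); the function name is made up for the example.

```python
TEXT_BUFFER = 0xB8000  # physical address of the text buffer


def text_cell_offset(row, column, columns=80):
    """Offset of the character byte for (row, column), counting from 0."""
    return (row * columns + column) * 2  # 2 bytes per cell: character, attribute


print(f"upper-left corner:  offset {text_cell_offset(0, 0):04X}h")
print(f"start of second row: offset {text_cell_offset(1, 0):04X}h")
print(f"physical address of row 1, column 0: {TEXT_BUFFER + text_cell_offset(1, 0):05X}h")
```

Offset 0 is the upper-left corner, which is where the program in this chapter writes its 'A' through the screen selector.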
