www.xbdev.net xbdev - software development
Thursday April 27, 2017

     
 

Assembly Language

What every pc speaks..1010...

 

Protected Mode.....

by bkenwright@xbdev.net

 

 

 

Well I should at this stage add some sample code on how to go about booting from a floppy, boot sector information, etc....but we'll just take it as we go at first.

 

Now in the days of...of...hmmm...star wars...the first one came out in 1977, and a year later we had the good old 8086, and life was simple....1 meg of memory would make people shake, as it was sooOOooo much memory....*smile*...well that sure changed...processors got faster and memory got bigger.  Then came the 286 and 386 etc...which could address more memory and needed to be able to handle multitasking etc...but most of all they wanted the chips backward compatible.  So the CPUs still start out in real mode like the original 8086, but can be switched into protected mode and give you more...so much more.

The 386 offered new 32 bit registers and operated at considerably faster speeds.  It still runs the basic 8086 code, but does a lot more.

 

When the chip starts up it's in real mode... we need to see how we can go from real to protected mode...and how memory is arranged etc.

 

 

Real Mode & Protected Mode

 

Segment Registers (CS,DS,SS..) remember them?  Code Segment, Data Segment, etc... well the CPU uses these to calculate the address of your code/data etc.   If we take the segment value and multiply it by 16, this shifts it left 4 bits.  Then we add the 16 bit offset to the shifted segment value and this gives us our 20 bit physical address.

 

Real Mode

Segment  = SSSS SSSS SSSS SSSS 0000   (16 bit segment shifted left 4 bits)

+

Offset   = 0000 OOOO OOOO OOOO OOOO   (16 bit offset)

=

Physical Address (20 bits)
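
To make the arithmetic concrete, here is a small sketch in Python (not part of the original tutorial, and the function name real_mode_address is just made up for illustration).  It assumes the classic 8086 behaviour where the result wraps around at 1 meg because there are only 20 address lines.

```python
def real_mode_address(segment, offset, wrap_20_bits=True):
    """Return the physical address for a real mode segment:offset pair."""
    # Both parts are 16 bit values, so mask them first.
    segment &= 0xFFFF
    offset &= 0xFFFF
    # Multiplying by 16 is the same as shifting left by 4 bits.
    address = (segment << 4) + offset
    # The original 8086 only had 20 address lines, so the sum wraps at 1MB.
    if wrap_20_bits:
        address &= 0xFFFFF
    return address


# Text mode video memory lives at segment 0xB800.
print(hex(real_mode_address(0xB800, 0x0000)))
# Two different segment:offset pairs can name the same byte.
print(real_mode_address(0x07C0, 0x0000) == real_mode_address(0x0000, 0x7C00))
```

Notice that many different segment:offset pairs land on the same physical byte, which is why you see the boot sector address written both as 07C0:0000 and 0000:7C00.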

 

 

Protected Mode

It gets a little tricky at first when you're new to protected mode, as there's a lot of new things to grasp.  I must admit, if it wasn't for the fact that I read a lot of material on protected mode and also fiddled around with a lot of code on the subject, it would still be a little fuzzy to me.  So I'm going to get right down to basics on this and work upwards to a complete protected mode system with all the frilly bits added on.

The first thing you have to put into your mind is that protected mode uses 'descriptors' to describe each piece of memory.  These descriptors are all stored together in groups, usually called tables.  Now there are two main types of tables, well 3 if you include the interrupts.  Global Descriptor Table (GDT), Local Descriptor Table (LDT) and Interrupt Descriptor Table (IDT).  The GDT is the main one, and usually small programs get an LDT each which is referenced in the GDT.  The IDT is for all the interrupts, and it lists all the interrupts, their type, the position of their handlers in memory etc.

 

Each Descriptor is 8 bytes long, which is 64bits...

 

Now this descriptor contains the base address value and a limiting size...it also contains a few bits which describe the type of memory, such as whether it's read only, code, stack etc.

 

dw 0x0ffff ;                   Limit bits 0-15 (bits 0-15 of the segment descriptor)
dw 0x0000 ;                    Base bits 0-15 (bits 16-31 of the segment descriptor)
db 0x00 ;                      Base bits 16-23 (bits 32-39 of the descriptor)
db 0x09a ;                     P,DPL(2),S,TYPE(3),A -> Present bit 1, Descriptor privilege level 0-3,
                               ; S=1 means a code or data segment descriptor, Type of seg, Accessed bit
db 0x0cf ;                     Upper 4 bits G,D,0,AVL -> G=1 segment limit is page granular (4K units),
                               ; D=1 default operation size is 32bit, AVL : Available field for user or OS
                               ; Lower nibble is bits 16-19 of the segment limit
db 0x00 ;                      Base bits 24-31 (bits 56-63 of the descriptor)
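
Because the base and limit are chopped up and scattered around the 8 bytes, it can help to see the packing done by a program.  The following is only a sketch in Python (the helper name pack_descriptor and its parameter names are my own invention, not anything from nasm or the CPU manuals), but it builds the same bytes as the dw/db lines for a flat 4GB code segment.

```python
import struct


def pack_descriptor(base, limit, access, flags):
    """Build the 8 bytes of a segment descriptor.

    base   : 32 bit linear base address
    limit  : 20 bit limit
    access : access byte (P, DPL, S, type, A), e.g. 0x9A for code
    flags  : upper nibble of byte 6 (G, D, 0, AVL), e.g. 0xC
    """
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                                # limit bits 0-15
        base & 0xFFFF,                                 # base bits 0-15
        (base >> 16) & 0xFF,                           # base bits 16-23
        access & 0xFF,                                 # access byte
        ((flags & 0xF) << 4) | ((limit >> 16) & 0xF),  # flags + limit bits 16-19
        (base >> 24) & 0xFF,                           # base bits 24-31
    )


# Flat 4GB code segment: base 0, limit 0xFFFFF pages, access 0x9A, flags G=1 D=1.
code = pack_descriptor(base=0, limit=0xFFFFF, access=0x9A, flags=0xC)
print(code.hex())
```

The bytes come out in memory order, so the first two are the dw 0x0ffff and the last one is the final db 0x00.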
 

 

As I said earlier, we're not in Protected Mode by default....so we have to switch into it....so how can we tell if we're in PM, and how can we switch in and out of PM etc?

 

Well the 386 has four control registers called CR0, CR1, CR2 and CR3, each is 32 bits.  CR1 is reserved and not used.  CR0 is the register we're interested in!  And it contains a single bit which is used to switch PM on and off.  Bit 0 of CR0 tells us if we're in PM, the bit is sometimes called the PE (Protection Enable) bit as well just for your knowledge.  When it's set to 1 we're in PM, when it's set to 0 we're in Real Mode.
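
Just to show there is no magic in it, here is the PE bit test written out as a tiny Python sketch.  We obviously can't touch the real CR0 from Python, so the value passed in is only a made up example and the function names are my own.

```python
CR0_PE = 1 << 0  # bit 0 of CR0 is the Protection Enable bit


def in_protected_mode(cr0):
    """True when the PE bit is set in the given CR0 value."""
    return bool(cr0 & CR0_PE)


def enable_protected_mode(cr0):
    """Return the CR0 value with PE set and every other bit left alone."""
    return cr0 | CR0_PE


example_cr0 = 0x00000010  # made up real mode value, PE is clear
print(in_protected_mode(example_cr0))
print(in_protected_mode(enable_protected_mode(example_cr0)))
```

The important habit is read, modify, write: only bit 0 changes and the other bits of CR0 keep whatever they had.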

 

But we can't go just switching this bit on and off.  I mean we have to tell the CPU where things are, we have to let it know about our GDT etc.  We can do this using new registers which hold the location of our tables.  These new registers are GDTR, LDTR and IDTR.  GDTR and IDTR are each 48 bits, and are basically made up of two parts, a 32 bit linear memory address and a 16 bit limiting value (the table size in bytes minus one).  LDTR is a little different, it just holds a 16 bit selector which picks out an LDT descriptor inside the GDT.  Simple eh?  If only the rest of it was that simple.
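
The 6 bytes that lgdt reads from memory can be sketched like this in Python.  Again this is only an illustration, pack_table_register is a name I made up, and the base address used is just an example value.

```python
import struct


def pack_table_register(base, size_in_bytes):
    """Build the 6 byte image that lgdt/lidt load: 16 bit limit, then 32 bit base."""
    # The limit is the offset of the last valid byte, so it is size - 1.
    return struct.pack("<HI", size_in_bytes - 1, base)


# A GDT with 3 descriptors (null, code, data) at an example linear address.
image = pack_table_register(base=0x00012490, size_in_bytes=3 * 8)
print(image.hex())
```

This is the same layout as a dw for the limit followed by a dd for the base address.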

 

Immediately after setting the PE bit to 1 we have to execute a jump instruction to flush the execution pipeline of any instructions that may have been fetched in real mode. This jump is typically to the next instruction, and making it a far jump also loads CS with our protected mode code selector. The steps to switch to protected mode then reduce to the following.

 

The few things we need to do to have a simple working piece of PM code are:

* Build a simple GDT (an LDT is optional, and we skip it in the code here)

* Set GDTR so it points at the GDT

* Disable Interrupts (If left disabled we can skip IDTR here)

* Enable protected mode by setting the PE bit in CR0

* Jump to clear the prefetch queue

 

Looks messy and I think it is...I mean they could have done the descriptors so the base address is one continuous value instead of it being split up....but still we can take it.

 

Let's look at a piece of asm code that shows a basic (but still working) GDT and how we set it up:

 


   cli		    ; Clear or disable interrupts
   lgdt[gdtr]	   ; Load GDT
   mov eax,cr0	   ; The lsb of cr0 is the protected mode bit
   or al,0x01	    ; Set protected mode bit
   mov cr0,eax	   ; Mov modified word to the control register
   jmp codesel:go_pm

bits 32
go_pm : 
   mov ax,datasel   
   mov ds,ax	     ; Initialise ds & es to data segment
   mov es,ax	
   mov ax,videosel   ; Initialise gs to video memory
   mov gs,ax	
spin : jmp spin      ; Loop forever


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Our GDTR register value
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

gdtr :
   dw gdt_end-gdt-1    ; Length of the gdt
   dd gdt	       ; physical address of gdt

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; This is the start of our gdt - its actual value
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

gdt:
nullsel equ $-gdt      ; $->current location,so nullsel = 0h
gdt0:                  ; Null descriptor,as per convention gdt0 is 0
   dd 0                ; Each gdt entry is 8 bytes, so at 08h it is CS
   dd 0                ; In all the segment descriptor is 64 bits
codesel equ $-gdt      ; This is 8h,ie 2nd descriptor in gdt
code_gdt:              ; Code descriptor 4Gb flat segment at base 0
   dw 0x0ffff          ; Limit 4Gb  bits 0-15 of segment descriptor
   dw 0x0000           ; Base 0h bits 16-31 of segment descriptor (sd)
   db 0x00             ; Base addr of seg 16-23 of 32bit addr,32-39 of sd
   db 0x09a            ; P,DPL(2),S,TYPE(3),A->Present bit 1,Descriptor
                       ; privilege level 0-3,Segment descriptor 1 ie code
                       ; or data seg descriptor,Type of seg,Accessed bit
   db 0x0cf            ; Upper 4 bits G,D,0,AVL ->1 segment len is page
                       ; granular, 1 default operation size is 32bit seg
                       ; AVL : Available field for user or OS
                       ; Lower nibble bits 16-19 of segment limit
   db 0x00             ; Base addr of seg 24-31 of 32bit addr,56-63 of sd
datasel equ $-gdt      ; ie 10h, beginning of next 8 bytes for data sd
data_gdt:              ; Data descriptor 4Gb flat seg at base 0
   dw 0x0ffff          ; Limit 4Gb
   dw 0x0000           ; Base 0
   db 0x00             ; Descriptor format same as above
   db 0x092            ; present,ring 0,data,expand-up,writable
   db 0x0cf
   db 0x00
videosel equ $-gdt     ; ie 18h,next gdt entry
   dw 3999             ; Limit 80*25*2-1
   dw 0x8000           ; Base 0xb8000
   db 0x0b
   db 0x92             ; present,ring 0,data,expand-up,writable
   db 0x00             ; byte granularity 16 bit
   db 0x00
gdt_end:

 

 

We've set up 3 main descriptors, one for code, one for data and one for video, so we can output things to the screen (debug information).  The best way to start off is to learn the default values that are best - as for the code, I've used 0xfffff for the limit (0xffff in the first word, plus 0xf in the lower nibble of the flags byte), and with the page granular bit set that is counted in 4K pages, so we get the full 4gig....a base address of 0, so our segment starts at linear address 0...then there's a few flags to show what kind of segment it is, readable code for the code descriptor, read/write data for the other two.
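
If you want to check a descriptor by hand, going the other way helps too.  Here is a rough Python sketch (decode_descriptor is my own helper name) that takes the 8 raw bytes and pulls the base, limit and real size back out, using the code and video descriptor byte values from the table above.

```python
import struct


def decode_descriptor(raw):
    """Split 8 descriptor bytes back into base, limit and a few flags."""
    limit_low, base_low, base_mid, access, flags_limit, base_high = struct.unpack(
        "<HHBBBB", raw
    )
    base = base_low | (base_mid << 16) | (base_high << 24)
    limit = limit_low | ((flags_limit & 0x0F) << 16)
    page_granular = bool(flags_limit & 0x80)  # G bit
    # With G=1 the limit counts 4K pages, otherwise it counts bytes.
    size = (limit + 1) * 4096 if page_granular else limit + 1
    return {
        "base": base,
        "limit": limit,
        "size": size,
        "is_code": bool(access & 0x08),            # executable bit of the type field
        "default_32bit": bool(flags_limit & 0x40),  # D bit
    }


code = decode_descriptor(bytes.fromhex("ffff0000009acf00"))
video = decode_descriptor(bytes.fromhex("9f0f00800b920000"))
print(code["size"] == 4 * 1024 ** 3)      # the full 4 gig
print(hex(video["base"]), video["size"])  # text screen base and its size in bytes
```

So the same 0xffff in the first word means very different sizes depending on the G bit and the extra limit nibble.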

 

You can see that our GDT has a null descriptor at the start, this is always there.....you just have to accept we always need a null descriptor.  Then we have added code and data descriptors...they don't have to be in this order...you can do it any way you want.  And at the end, I've added a video descriptor, you could in fact leave this out, but it's so much easier to set a video descriptor as you're always doing graphical output.  You'll probably have lots of other descriptors, for the stack, audio etc.
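
The values 8h, 10h and 18h that go into the segment registers are called selectors.  The top 13 bits are the index into the table, bit 2 says GDT or LDT, and the bottom 2 bits are the requested privilege level.  A quick Python sketch (the helper names are made up by me) shows why our selectors are just the byte offsets into the gdt:

```python
def make_selector(index, use_ldt=False, rpl=0):
    """Build a selector from a table index, table choice and privilege level."""
    return (index << 3) | (int(use_ldt) << 2) | (rpl & 3)


def split_selector(selector):
    """Return (index, use_ldt, rpl) for a 16 bit selector."""
    return selector >> 3, bool(selector & 4), selector & 3


# Entries 1, 2 and 3 of the GDT at ring 0 are our code, data and video selectors.
print([hex(make_selector(i)) for i in (1, 2, 3)])
print(split_selector(0x10))
```

With the GDT selected and privilege level 0, the low three bits are all zero, so the selector is simply index times 8, which is exactly what equ $-gdt gives us.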

 

As I mentioned at the start of the tutorial, we need to boot from a floppy or run the program from dos...not from under windows.  As windows is running in protected mode, and it won't let our program take over, well not without a struggle... but we don't want to crash windows and mess up our settings.  So usually it's better to run it from plain dos or boot from a floppy like an operating system.

 

I prefer to use the nasm assembler myself, as you can create binary files very easily and it's a free and very powerful assembler that lets you do everything you need.

 

For those who might have forgotten, here is a simple assembly program you can assemble and run, to show a basic text out in dos.

 

; Hello World Example

 

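; assemble using 'nasm' assembler (flat binary output, which is what a DOS .com file is)
; C:>nasm -f bin hello.asm -o hello.com

org 100h        ;DOS loads a .com file at offset 100h, just after the 256 byte PSP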

 

[BITS 16]


mov dx, msg1    ;register dx=msg1
mov ah, 9       ;register ah=9 -- the print string function
int 21h         ;dos services interrupt...looks at register ah to figure out what to do

mov dx, msg2    ;same code as above, but this time we're displaying msg2
mov ah, 9
int 21h

mov ah, 4Ch     ;register ah=4Ch -- the end program function
int 21h         ;dos services interrupt...looks at ah to figure out what to do

msg1 db 'Hello world!',0dh,0ah,'$'   ;stores "hello world!" and a new line into msg1
msg2 db "Goodbye world!$"            ;stores "goodbye world!" in msg2

 

 

It's interesting to note that when we run a program from dos, our code is loaded at offset 0x100 within whatever segment dos picks, but if we boot from a floppy, as the OS does on your computer, our code is loaded at physical address 0x7c00....you'll find that using asm gives you a real understanding of how your computer operates and what power it holds :).

 

*slurps some coffee*

 

Let's put some code together so we can expand on our protected mode understanding.  We'll do it so that we can run our code from dos....we can boot from our dos disk then run our program.  We can do a bootable disk later on just to show how you can boot from a disk into protected mode...just a matter of putting the code in the right place on a disk and knowing where in memory the code is loaded into.

Because our sample is being run from dos here...there are a few tricks which I think are very very important...mostly because of where the code ends up in memory.  I mean if we were booting from the boot up process and our program is loaded into memory location 0x7C00, then you'll find it easier, as we know where our code is loaded and we know what the offsets are.  When we assemble for dos we don't know where our code is going to be at run time...so we can't work out the addresses of our GDT, IDT tables etc ahead of time.  But you'll see how we get around that :)

 

Anyhow, here is the code:

 

code: asm_2.asm
; Run in dos (not under windows) and it will take us to 32 bit protected mode

[ORG 0x100]         ; Dos loads us at offset 0x100, after the 256 byte PSP

    
[BITS 16]           ; Dos is 16 bits

; assemble using 'nasm' assembler

; C:>nasm -f bin asm_2.asm -o test.com

jmp entry           ; Jump to the start of our code

msg1 db "We're good to go..$"

jumpOffset:
    dd go_pm
    dw 0x08

entry:
 
; Display a message showing where alive!

mov dx, msg1        ; register dx=msg1
mov ah, 9           ; register ah=9 -- the print string function
int 21h             ; dos service interrupt .. looks at register ah to figure out what to do

; Thanks from Brendan, as we have to make sure our GDTR points to the actual
; memory address: add the linear base of our segment (cs*16) onto the offsets
; that nasm worked out (these already include the 0x100 from ORG)

    mov eax,0
    mov ax,cs
    shl eax,4
    add [gdtr+2],eax
    add [jumpOffset],eax


  cli		    ; Clear or disable interrupts
  lgdt[gdtr]	    ; Load GDT
  mov eax,cr0	    ; The lsb of cr0 is the protected mode bit
  or al,0x01	    ; Set protected mode bit
  mov cr0,eax	    ; Mov modified word to the control register
  

jmp far dword [jumpOffset]  ;can't just use "jmp go_pm" as where in dos!


nop                 ; ignore - no operation opcodes :)
nop

; Once we reach here we're in protected mode!  32 Bit!  We're not in
; the real world (mode) anymore :)
[BITS 32]
go_pm :

mov ax, 0x10        ; use our data selector, the 3rd entry in the gdt (offset 0x10)
mov ds, ax
mov es, ax

mov word [es: 0xb8000],0x740 ; put a char to the screen!...yeahh!


lp: jmp lp  ; loops here forever and ever...


; We use 16 bits here - as you'll notice we use dw and dd only,
; and our data will be packed together nice and tight.

[BITS 16]

align 4


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Our GDTR register value
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

gdtr :
   dw gdt_end-gdt-1    ; Length of the gdt
   dd gdt	       ; physical address of gdt

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; This is the start of our gdt - its actual value
; We only have 2 descriptors here at present (plus the null one)...but once
; we understand them we can easily add more :)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

align 4

gdt:
nullsel equ $-gdt      ; $->current location,so nullsel = 0h
gdt0 		       ; Null descriptor,as per convention gdt0 is 0
   dd 0		       ; Each gdt entry is 8 bytes, so at 08h it is CS
   dd 0                ; In all the segment descriptor is 64 bits
codesel: ; equ $-gdt      ; This is 8h,ie 2nd descriptor in gdt
code_gdt:	       ; Code descriptor 4Gb flat segment at 0000:0000h
   dw 0x0ffff	       ; Limit 4Gb  bits 0-15 of segment descriptor
   dw 0x0000	       ; Base 0h bits 16-31 of segment descriptor (sd)
   db 0x00             ; Base addr of seg 16-23 of 32bit addr,32-39 of sd	
   db 0x09a	       ; P,DPL(2),S,TYPE(3),A->Present bit 1,Descriptor	
                       ; privilege level 0-3,Segment descriptor 1 ie code	    		                  
                       ; or data seg descriptor,Type of seg,Accessed bit
   db 0x0cf	       ; Upper 4 bits G,D,0,AVL ->1 segment len is page		                                     
                       ; granular, 1 default operation size is 32bit seg		                              
                       ; AVL : Available field for user or OS
                       ; Lower nibble bits 16-19 of segment limit
   db 0x00	       ; Base addr of seg 24-31 of 32bit addr,56-63 of sd
datasel: ; equ $-gdt      ; ie 10h, beginning of next 8 bytes for data sd
data_gdt:	       ; Data descriptor 4Gb flat seg at 0000:0000h
   dw 0x0ffff	       ; Limit 4Gb
   dw 0x0000	       ; Base 0000:0000h
   db 0x00	       ; Descriptor format same as above
   db 0x092
   db 0x0cf
   db 0x00

    
gdt_end:



TIMES 0x500-($-$$) DB 0x90    ; And of course, this will make our file size
                             ; equal to 0x500 a nice round number -
                             ; 0x500 ... so if you assemble the file
                             ; you should find its that size exactly.

 

EEeeekkKKk....it looks bad...to newbies to asm, it looks horrifying I bet.  But this isn't too bad, it gets us into protected mode, and uses the bare minimum to get us there.  We can play around with this asm code now, displaying output to the screen to test our results and things.  What happens is it starts in standard dos real mode and uses the dos interrupt (int 21h) to display a message, from this we break all ties with real mode and interrupts and go into protected mode.  Once we are in protected mode we modify the video memory area so some output is done on the screen, so we know it made it into protected mode okay.

 

Remember our screen/graphics card is still set in its original text mode settings, so the screen memory is at b800:0000 (physical address 0xb8000)....later on we'll see how we can change this, so we can have an improved graphical output.  Well just the basics, as the graphics card internal registers and settings is a big topic.

 

Warning : DOS Memory Locations!
The few lines that differ from the first listing are the ones to watch!  I mean if you don't account for dos loading your code into some unknown location, and just use 'lgdt [gdtr]' and 'jmp go_pm' etc, it will just reboot your pc or crash.  As we know, in dos the first 0x100 bytes of our segment hold the PSP (Program Segment Prefix), then the code is loaded just after that.  Our offsets to go_pm and gdt etc will all be determined by nasm as offsets from the start of that segment...which is why we use org 0x100.  What nasm can't know is where the segment itself sits in memory.  We can though determine this at run time, using the cs register, which holds our current code segment.  Using this knowledge we can then modify our values by adding cs*16 onto the offsets to get the correct....'real memory locations'...of gdt and go_pm.

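Just a tip, but the 'shl eax,4' is what does the multiply by 16, it turns the segment value in cs into the linear address of the start of our segment.  The 0x100 is already in the offsets because of the org, so we don't add it again :)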
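
Here is the same fix-up worked through in Python, as a rough sketch only.  The cs value and the two offsets are made up example numbers (dos will pick its own segment and nasm its own offsets), and fix_up is just a name I chose.

```python
import struct


def fix_up(cs, gdtr_image, jump_image):
    """Add the segment's linear base (cs*16) to the GDT base and the jump target."""
    segment_base = (cs & 0xFFFF) << 4                 # same as shl eax,4
    limit, gdt_offset = struct.unpack("<HI", gdtr_image)
    target_offset, selector = struct.unpack("<IH", jump_image)
    new_gdtr = struct.pack("<HI", limit, gdt_offset + segment_base)
    new_jump = struct.pack("<IH", target_offset + segment_base, selector)
    return new_gdtr, new_jump


# Made up example: dos loaded us at segment 0x1234, nasm put gdt at offset
# 0x0150 and go_pm at offset 0x0140 (both already include the 0x100 from org).
gdtr = struct.pack("<HI", 3 * 8 - 1, 0x0150)
jump = struct.pack("<IH", 0x0140, 0x08)
new_gdtr, new_jump = fix_up(0x1234, gdtr, jump)
print(hex(struct.unpack("<HI", new_gdtr)[1]))  # linear address of the gdt
print(hex(struct.unpack("<IH", new_jump)[0]))  # linear address of go_pm
```

The limit and the 0x08 selector are left alone, only the two addresses move.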

 

Also if we did a boot disk and ran our code as a bootup program!  Our code would be loaded at 0x7C00, so we could do 'org 0x7c00' at the top of our code (with the segment registers set to 0) and all our offsets would be exact and right...and we wouldn't need to modify them as I did above :)

People usually do a kernel asm file and a bootloader asm file, so at bootup the code would be loaded into an exact memory location and the kernel would do all the work, while all the bootloader would do is simple cpu checks, load the kernel binary into memory and get it started on its working journey.  But you can do it any way you want in real life...that's the beauty of code :)

 

 

When the program does eventually run, you will get a light grey '@' character in the top left of your screen.... and you'll know it all went well.
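
Where does the '@' come from?  Each cell of the 80x25 text screen is 2 bytes, the character code in the low byte and the colour attribute in the high byte, so the word 0x740 is character 0x40 with attribute 0x07.  A little Python sketch (helper names are mine, and it assumes the standard 80 column colour text mode) shows how to work out the cell value and address for any position:

```python
VIDEO_BASE = 0xB8000  # start of colour text mode memory
COLUMNS = 80          # characters per row in the standard text mode


def cell_address(row, col):
    """Linear address of the 2 byte cell at (row, col)."""
    return VIDEO_BASE + (row * COLUMNS + col) * 2


def make_cell(char, attribute=0x07):
    """Word to store: attribute in the high byte, character in the low byte."""
    return (attribute << 8) | ord(char)


print(hex(make_cell("@")))        # the word our asm code writes
print(hex(cell_address(0, 0)))    # top left corner
print(hex(cell_address(24, 79)))  # bottom right corner
```

So to print in a different place or colour from the asm code, you only need to change the address and the high byte of the word.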

 

Some people at this point are probably saying...'well that's way too messy for me'....'what on earth happened'....'what's mov eax, 0'....lol.....'eeKKkkk'..... but it's not too bad.  I mean once you can set up a basic Descriptor Table you're mostly there.  The rest is just adding more bits so you can do more things....as if you know what one table does you should know what the other tables do :)

 

Add just a tad more...

We have a basic startup phase going now... it's a lot more complicated than you think....there's the understanding of the GDT and how its descriptors are set up and what their initialisation values mean etc.  It's a lot trickier than people think...especially when you come to bugs that cause crashing or unpredictable results....which happens a lot to me...*grin*...but you learn a lot more through the trials and errors than anything.

 

What we should do next is add another table, do some more text out to the screen....build from what we already have.  As we're not setting up any stack yet, we can't call any functions, and because we don't know our code's memory address until run time it makes things trickier, as we'll see later on.  It's not as bad as you think, we can easily overcome this and we get a far greater understanding and power over this....we'll have this code in its place in no time at all.

 

 

Well next we'll have a fiddle around with interrupts!...IDT.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 
Copyright (c) 2002-2017 xbdev.net - All rights reserved.
Designated tutorial and software are the property of their respective owners.