Programs Under the Hood...Introduction
Posted by: dargueta
 

(Part 1)

I have a habit of re-inventing the wheel to see why it rolls. So one day, I hit upon a way to inflict severe mental punishment on myself and invent...ProDIA, a program disassembler written in assembly language. Ironic, I know. This should be a great way for you to get to know how processors and programs work under the hood, as well as to see how this little project is coming along. Technical discussions will go in separate posts, so you can skip whatever you find boring. Well, without further ado, let's start!


VAGUE INTRODUCTION TO ASSEMBLY AND MACHINE LANGUAGE

We higher-level programmers have it easy. We don't have to worry about how our program is executed or how memory is handled, so long as our program works. The compiler's job is to take that high-level printf statement and turn it into 9A BE BA FE CA, etc. We can read the former; our computers can read the latter. An assembler or compiler translates human-readable code into machine code, which is just a whole bunch of instructions represented by numbers. A disassembler or decompiler does the exact opposite. Simple, right? The problem comes when we run into other machines: different processors have different instruction sets. This project will concern itself solely with Intel processors, which have kept more or less the same instruction format since the 8086 in the Stone Age of computers. Although the i8086's capabilities are laughable nowadays (a 16-bit processor running at a blindingly fast 5 to 10 MHz, depending on the model), this little piece o' junk is the foundation on which our quad-core monsters are built. We'll start there and gradually work our way to more modern times. From there you can build disassemblers for the Zilog Z80, the Motorola 6800, or whatever else you want. The principles are more or less the same; only the details differ.
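To make that byte string a little less mysterious, here is a minimal Python sketch (Python is used only for illustration; ProDIA itself is written in assembly). The five bytes 9A BE BA FE CA happen to form a valid 8086 instruction, a far CALL: opcode 9A, then a 16-bit offset, then a 16-bit segment, both stored low byte first. The function name decode_far_call is made up for this example.

```python
import struct

def decode_far_call(code: bytes) -> str:
    """Decode an 8086 far CALL: opcode 9A, 16-bit offset, 16-bit segment."""
    if len(code) < 5 or code[0] != 0x9A:
        raise ValueError("not a far CALL instruction")
    # Both operands are little-endian, and the offset comes before the segment.
    offset, segment = struct.unpack_from("<HH", code, 1)
    return f"CALL {segment:04X}:{offset:04X}"

print(decode_far_call(bytes.fromhex("9A BE BA FE CA")))  # CALL CAFE:BABE
```

A real disassembler repeats this kind of step for every opcode it knows, which is why the details change from one processor to the next while the overall approach stays the same.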


Let's take a look under the hood of a simple program, command.com. Anyone familiar with MS-DOS, or who has used the command prompt, knows what this is for: almost nothing now. It's a holdover from the old days, when you typed commands at a black screen with a little blinking cursor to get things done. Anyway, we're going to attempt to disassemble it with debug.exe, an ancient utility by our friends at Microsoft. With it you can assemble and disassemble COM programs, among many other things. debug is quite unintelligent: it treats data like code and tries to disassemble it, and it only recognizes instructions for the i8086 CPU and 8087 FPU. That means it'll choke on anything that uses newer instructions, which covers most code written after 1982 or so. Wow, that's useful...this program is a decade older than me...
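Here is a rough Python sketch of what "treating data like code" means. It is a toy that knows only a handful of real 8086 opcodes (EB = short JMP, 40-4F = INC/DEC on a 16-bit register, C3 = RET) and walks the bytes from start to finish. It is not how debug.exe is actually implemented; the function name linear_sweep and the sample program are invented for illustration.

```python
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]

def linear_sweep(code: bytes, base: int = 0x100):
    """Decode every byte in order, with no idea which bytes are data."""
    lines = []
    i = 0
    while i < len(code):
        op = code[i]
        addr = base + i
        if op == 0xEB and i + 1 < len(code):       # JMP short, signed 8-bit displacement
            rel = code[i + 1] - 256 if code[i + 1] >= 0x80 else code[i + 1]
            text, size = f"JMP {(addr + 2 + rel) & 0xFFFF:04X}", 2
        elif 0x40 <= op <= 0x47:                   # INC r16
            text, size = f"INC {REG16[op & 7]}", 1
        elif 0x48 <= op <= 0x4F:                   # DEC r16
            text, size = f"DEC {REG16[op & 7]}", 1
        elif op == 0xC3:                           # RET (near)
            text, size = "RET", 1
        else:                                      # unknown to this toy: emit a raw byte
            text, size = f"DB {op:02X}", 1
        lines.append(f"{addr:04X} {text}")
        i += size
    return lines

# A jump over the two data bytes "OK", followed by a RET.
program = bytes([0xEB, 0x02]) + b"OK" + bytes([0xC3])
for line in linear_sweep(program):
    print(line)
```

The letters "OK" come out as DEC DI and DEC BX, because 4F and 4B are perfectly valid opcodes. The sweep has no way of knowing that the jump skips right over them.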

  1. Go to Start > Run and type debug. Press Enter. A little console screen should appear (black with a white dash and a blinking cursor). Welcome to debug.exe.
  2. Type N c:\windows\system32\command.com and press Enter. It should look like nothing happened. This gives debug the name of the file you want to open, in this case command.com.
  3. Type L and press Enter. Again, no message should appear. This opens the file and loads it into debug's memory.
  4. Type U 1660 and press Enter.

Figure 1: DEBUG screenshot

You should get a few lines of assembly language instructions, like those in figure 1. U stands for "unassemble" (since D is already taken for "dump"), and 1660 is the hexadecimal address to start disassembling at. It'd take a while to explain where I got this address, so just bear with me here.
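If you'd rather poke at the same bytes in a hex editor, the address needs a small adjustment. The sketch below assumes debug loaded the file at offset 100h, as it does for COM programs (the first 256 bytes of the segment hold the Program Segment Prefix). The names COM_LOAD_OFFSET and file_offset are made up for this example.

```python
COM_LOAD_OFFSET = 0x100  # assumption: the file was loaded just past the 256-byte PSP

def file_offset(address_text: str) -> int:
    """Turn a hexadecimal address typed into debug into an offset within the file."""
    address = int(address_text, 16)
    if address < COM_LOAD_OFFSET:
        raise ValueError("address lies inside the PSP, not in the file")
    return address - COM_LOAD_OFFSET

print(int("1660", 16))            # 5728, the same address in decimal
print(hex(file_offset("1660")))   # 0x1560
```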

This, my friend, is what assembly language looks like. If you don't know how to program in assembly language, or want to know more, go to http://burks.bton.ac.uk/burks/language/asm/artofasm/artof001.htm for a good thorough tutorial. If you want a quick introduction, go to http://www.skynet.ie/~darkstar/assembler/. It may look daunting at first, but trust me...it's worth it. From now on I'll assume you know at least some assembly language.


5 Comments
John
June 22, 2008
24.191.57.21
Votes: +0

This has to be one of the better blogs/tutorials I have read in a while. I'm looking forward to reading more from you. As a side note, did your slashes in "c:windowssystem32command.com" get removed automatically?

Jordan
June 23, 2008
63.211.21.46
Votes: +0

Looks like the numbered LI is broken on the blog as well. Also, you posted a screenshot but left it as local on your computer:

file:///F:Documents and SettingsDiegoDesktopDiegoProgramsProDIADay1screenshot_disasm_cmdcom.bmp

Excellent blog/read! I look forward to more blogs by you.

dargueta
June 23, 2008
70.131.130.194
Votes: +0

Thank you all. I've been having problems figuring out how this blog thing works...this was the first blog I've ever written. Anyway, I'm working on the next one, which'll hopefully be better. Let me know if I can improve anything.

TcM
June 26, 2008
92.251.112.46
Votes: +0

I'd love to know how you got the 1660...

dargueta
June 26, 2008
70.131.130.194
Votes: +0

Instead of typing U 1660, type U by itself. The first instruction is JMP 1660. Everything debug.exe shows after that is disassembled incorrectly, because the bytes between the jump and address 1660 are a data section, not code.
