Programs Under the Hood...Introduction
Posted by: dargueta in Untagged on Jun 22, 2008
(Part 1)
I have a habit of re-inventing the wheel to see why it rolls. So one day, I hit upon a way to inflict severe mental punishment on myself and invent...ProDIA, a program disassembler written in assembly language. Ironic, I know. This'll be a great way for you to get to know how processors and programs work under the hood, as well as to see how this little project is coming along. Technical discussions will go in separate posts, so you can skip anything you find boring or uninteresting. Well, without further ado, let's start!
VAGUE INTRODUCTION TO ASSEMBLY AND MACHINE LANGUAGE
We higher-level programmers have it easy. We don't have to worry about how our program is executed or how memory is handled and used, so long as our program works. The compiler's job is to take that high-level printf statement and turn it into 9A BE BA FE CA, etc. We can read the former; our computers can read the latter. An assembler or compiler translates human-readable code into machine code: a whole bunch of instructions represented by numbers. A disassembler or decompiler does the exact opposite. Simple, right?

The problem comes when we run into other machines. Different processors have different instruction sets. This project will concern itself solely with Intel processors, which have maintained more or less the same instruction format since the 8086 in the Stone Age of computers. Although the i8086's capabilities are laughable nowadays (a 16-bit processor running at a blindingly fast 8 MHz), this little piece o' junk is the foundation on which our quad-core monsters are built. We'll start there and gradually work our way to more modern times. From there you can build disassemblers for the Zilog Z80, the Motorola 6800, or whatever else you want. The principles are more or less the same; only the details are different.
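To make the "numbers to instructions" idea concrete, here is a minimal sketch of a table-driven disassembler in Python. It is not ProDIA and not how debug works internally; it only knows a handful of 8086 opcodes (NOP, RET, INT 3, HLT, MOV reg16,imm16 and the far CALL), and the function names decode_one and disassemble are made up for this illustration. It does show what the bytes 9A BE BA FE CA from above mean.

```python
import struct

# Names of the eight 16-bit general registers, in 8086 encoding order.
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]

# A few one-byte 8086 instructions that take no operands.
NO_OPERAND = {0x90: "NOP", 0xC3: "RET", 0xCC: "INT 3", 0xF4: "HLT"}


def decode_one(code, pos):
    """Decode the instruction at code[pos]; return (text, length in bytes)."""
    op = code[pos]
    remaining = len(code) - pos
    if op in NO_OPERAND:
        return NO_OPERAND[op], 1
    if 0xB8 <= op <= 0xBF and remaining >= 3:
        # MOV reg16, imm16: the register number is the low 3 bits of the opcode.
        (imm,) = struct.unpack_from("<H", code, pos + 1)
        return f"MOV {REG16[op & 7]},{imm:04X}", 3
    if op == 0x9A and remaining >= 5:
        # Far CALL: 16-bit offset first, then 16-bit segment, both little-endian.
        offset, segment = struct.unpack_from("<HH", code, pos + 1)
        return f"CALL {segment:04X}:{offset:04X}", 5
    # Anything this sketch does not know (or a truncated instruction)
    # is emitted as a raw data byte.
    return f"DB {op:02X}", 1


def disassemble(code):
    """Decode a bytes object front to back into a list of text lines."""
    lines, pos = [], 0
    while pos < len(code):
        text, length = decode_one(code, pos)
        lines.append(text)
        pos += length
    return lines


print(disassemble(bytes.fromhex("9A BE BA FE CA")))  # ['CALL CAFE:BABE']
print(disassemble(bytes.fromhex("B8 34 12 90 C3")))  # ['MOV AX,1234', 'NOP', 'RET']
```

A real disassembler needs the full opcode map plus the ModR/M byte, prefixes and so on, but the basic loop (look at a byte, work out how long the instruction is, print it, move on) stays the same.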
Let's take a look under the hood of a simple program, command.com. Anyone familiar with MS-DOS or who has used the command prompt knows what this is for: almost nothing now. It's a holdover from the old days, when you typed commands at a black screen with a little blinking cursor to get things done. Anyway, we're going to attempt to disassemble it with debug.exe, an ancient utility by our friends at Microsoft. With it you can assemble and disassemble COM programs, as well as do many other things. debug is quite unintelligent: it treats data like code and tries to disassemble it, and it only recognizes instructions for the i8086 CPU and 8087 FPU. That means it'll choke if you try to disassemble anything that uses instructions introduced after 1982 or so. Wow, that's useful...this program is a decade older than me...
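Here is a small Python sketch of why "treating data like code" goes wrong. On the 8086, every byte from 40h to 5Fh is a complete one-byte instruction (INC, DEC, PUSH or POP on a 16-bit register), and that range happens to cover the uppercase ASCII letters. The function name misread_as_code is invented for this example; it imitates only that one slice of the opcode map, not debug itself.

```python
# Names of the eight 16-bit general registers, in 8086 encoding order.
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]

# On the 8086, opcodes 40h-5Fh are four groups of eight one-byte instructions.
GROUPS = ["INC", "DEC", "PUSH", "POP"]


def misread_as_code(data):
    """Decode bytes in 40h-5Fh as instructions; show everything else as DB."""
    lines = []
    for byte in data:
        if 0x40 <= byte <= 0x5F:
            group = GROUPS[(byte - 0x40) >> 3]   # which block of eight opcodes
            lines.append(f"{group} {REG16[byte & 7]}")  # low 3 bits = register
        else:
            lines.append(f"DB {byte:02X}")
    return lines


# The text "HELP" is data, but it decodes as four perfectly valid instructions.
print(misread_as_code(b"HELP"))  # ['DEC AX', 'INC BP', 'DEC SP', 'PUSH AX']
```

A disassembler that simply walks through memory has no way of telling that those four bytes were meant to be a string, which is why listings of data regions look like nonsense code.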
- Go to Start > Run and type debug. Press Enter. A little console screen should appear (black with a white dash and a blinking cursor). Welcome to debug.exe.
- Type N c:/windows/system32/command.com and press Enter. It should look like nothing happened. This gives debug the name of the file you want to open, in this case command.com.
- Type L and press Enter. Again, no message should appear. This opens the file and loads it into debug's memory.
- Type U 1660 and press Enter.
You should get a few lines of assembly language instructions, like those in Figure 1. U stands for "unassemble" (since D is already taken for "dump"), and 1660 is the address, in hexadecimal, at which to start disassembling. It'd take a while to explain where I got this address, so just bear with me here.
This, my friend, is what assembly language looks like. If you don't know how to program in assembly language, or want to know more, go to http://burks.bton.ac.uk/burks/language/asm/artofasm/artof001.htm for a good, thorough tutorial. If you want a quick introduction, go to http://www.skynet.ie/~darkstar/assembler/. It may look daunting at first, but trust me...it's worth it. From now on I'll assume you know at least some assembly language.
[Figure 1: screenshot of command.com disassembled in debug (screenshot_disasm_cmdcom.bmp)]
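One note on the address given to the U command: it is a hexadecimal offset into debug's memory, not a position in the file on disk. The sketch below converts between the two, assuming debug places the first byte of the file at offset 100h, as it does for COM programs. The constant and function name are made up for this illustration, and this is not an explanation of how the 1660 address was chosen.

```python
# Assumption: the first byte of the file sits at offset 100h in debug's memory,
# which is where COM programs are loaded.
COM_LOAD_OFFSET = 0x100


def debug_address_to_file_offset(address_hex):
    """Convert a hex address as typed into debug to a byte offset in the file."""
    address = int(address_hex, 16)  # debug reads every number as hexadecimal
    if address < COM_LOAD_OFFSET:
        raise ValueError("address lies before the start of the loaded file")
    return address - COM_LOAD_OFFSET


print(hex(debug_address_to_file_offset("1660")))  # 0x1560
```

So under that assumption, U 1660 starts disassembling at byte 1560h (5472 decimal) of the file.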