Programs Under the Hood... Part 7: Making Decisions

Posted by: dargueta

 

Welcome to Part 7 of Programs Under the Hood. Today we're going to learn a bit more about how programs make decisions on the assembly language level. I promise after this we'll get back to the disassembler project.

 

FLOW CONTROL STRUCTURES

Everyone who knows C/C++ is familiar with the goto keyword, the bane of many a programmer's existence. Using goto makes code harder to read and terrible to debug because you have to jump around inside the code. That's why Java doesn't have a usable goto statement (the word is reserved, but unused).

Take a look at the following code:

 

while(!sleeping)
{
      if(!programming)
            BeginProgramming();
      else
            ContinueProgramming();
}

 

Easy enough to understand, right? You can easily trace the flow of the program. I want you to notice one thing, however: at some point or another you have to make a jump, whether it's in the while loop or in the if-else statement. To understand flow control in assembly language, you must realize that

 

All higher-level flow control statements, at their most basic level, are made up of two parts: 1) a test and 2) a jump. The only exception to this rule is the function call, which saves the return address first, then jumps.

 

Examine the above example: You test sleeping at the beginning of the while loop. If it's true, then goto just past the end of the block, which exits the loop. When you encounter the if statement, you test programming. If it's false, then you call BeginProgramming() and then goto the end of the while block, where you'll goto the beginning again. If programming is true, you do basically the same thing, except that you call ContinueProgramming() instead.
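To make that walk-through concrete, here is a minimal sketch of the same loop written in C with nothing but tests and gotos. The variables, the hours counter, and the bodies of BeginProgramming() and ContinueProgramming() are made up for illustration; the original loop never says what they do.

```c
#include <stdbool.h>

/* Hypothetical state and helpers, standing in for the ones in the loop above. */
static bool sleeping = false;
static bool programming = false;
static int hours = 0;

static void BeginProgramming(void)    { programming = true; }
static void ContinueProgramming(void) { if (++hours == 3) sleeping = true; }

/* while(!sleeping) { if(!programming) ... else ... } using only tests and jumps. */
int run_goto_loop(void)
{
loop_top:
    if (sleeping) goto loop_end;        /* the while test */
    if (programming) goto else_branch;  /* the if test */
    BeginProgramming();                 /* runs when programming is false */
    goto loop_bottom;                   /* skip the else branch */
else_branch:
    ContinueProgramming();              /* runs when programming is true */
loop_bottom:
    goto loop_top;                      /* end of the block: go around again */
loop_end:
    return hours;
}
```

With these made-up helpers, the first pass calls BeginProgramming() and every later pass calls ContinueProgramming() until sleeping becomes true.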

So does that mean that switch, if-elseif-else, function calls, the question-mark operator, etc. are all made up of tests and goto statements? Yep. There are some ways of getting around it when using highly optimized assembly language (cf. the CMOV instruction), but that's for later. Right now we're going to get the basics down, then worry about making it look pretty later.

 

SIMPLIFYING TO THE BASICS

How does one break down all of these into gotos? Well, every one of the previously mentioned statements can be made into an if-else statement, which is broken down like so:

 

if(condition)
      goto $TRUE_BLOCK;
$FALSE_BLOCK:
      /*code to execute if condition is false
      skip over the code executed if true.
      this section doesn't always have to exist,
      by the way. If you just want to do something
      when the condition is true, just jump
      automatically to $RESUME if it's false.*/
      goto $RESUME;
$TRUE_BLOCK:
      //code to execute if condition is true
$RESUME:
      //resume execution after the if-else statement

 

Messy and very difficult to read, isn't it? Imagine converting a ten-block switch statement into one of these. No wonder we decided to ditch programming in straight assembly language a long time ago. Nowadays the compiler takes care of it all for us. I'm going to leave it as an exercise to the reader to figure out how to reduce the more complex flow control statements into simple tests and jumps.
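As a starting point for that exercise, here is one possible reduction of a simple for loop into a test and a jump, written in C. The function name sum_below and the loop it models are made up for this sketch.

```c
/* Models: for (i = 0; i < n; i++) total += i;
   written with one test and one backward jump. */
int sum_below(int n)
{
    int total = 0;
    int i = 0;                       /* the initializer runs once */
loop_test:
    if (!(i < n)) goto loop_done;    /* test: leave when the condition fails */
    total += i;                      /* loop body */
    i++;                             /* increment step */
    goto loop_test;                  /* jump back to the test */
loop_done:
    return total;
}
```

A switch can be reduced in the same spirit: one test and jump per case, with each break becoming a goto to the end of the statement.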

 

Side note: Why do we use assembly language, anyway? Because carefully hand-tuned assembly can run faster than compiler-produced code (how much faster depends heavily on the compiler and the code in question), so sections that need to be fast, e.g. graphics routines, are sometimes written in pure assembly language. It's also used when you need to access ports at a low level, enter protected mode, write a virus, and so on.

 

TESTING CONDITIONS IN ASSEMBLY LANGUAGE

The CPU tests conditions in a slightly different manner than we high-level programmers seem to think. There is no if(a & b != 0xFFFF) in assembly language. Instead, we use CPU flags. These flags are updated after most arithmetic and bitwise instructions (and a few others) execute. From these, we can check to see whether the result of an AND operation is zero, see if an ADD instruction overflowed, etc. There are a few flags that you need to worry about:

  • Zero Flag: Set to 1 if the result of the operation is 0; otherwise cleared to 0.
  • Carry Flag: Set if an operation produces a carry out of (or a borrow into) the highest bit, i.e. unsigned overflow. For example, the highest unsigned 16-bit number is 0xFFFF, or 65,535. If we add 1, the result wraps around to 0x0000, so the carry flag (and the zero flag) would be set. (This is the reason why some for loops end up becoming infinite loops.) Also used by DOS interrupts to indicate an error.
  • Parity Flag: Set if the number of 1's in the low byte of the result is even, cleared if odd.
  • Overflow Flag: Set if an operation resulted in signed overflow, i.e. the result doesn't fit when the operands are treated as signed numbers. For example, the highest signed 16-bit number is 0x7FFF, or 32,767. If we add 1, the result is 0x8000, which is -32,768 as a signed number, so the overflow flag would be set.
  • Sign Flag: Set if the result of the operation, if treated as a signed number, would be negative.
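To see how these five flags relate to one another, here is a rough C model of what a 16-bit ADD would leave in them. The struct and function names are invented for this sketch; on a real CPU the flags are bits in the FLAGS/EFLAGS register, not a struct. Note that in this model 0xFFFF + 1 sets the carry and zero flags, while a signed overflow such as 0x7FFF + 1 sets the overflow flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of the five flags discussed above. */
struct flags { bool zf, cf, pf, of, sf; };

/* Compute the flags a 16-bit ADD of a and b would produce. */
struct flags add16_flags(uint16_t a, uint16_t b)
{
    uint32_t wide = (uint32_t)a + b;   /* wide enough to keep the carry-out */
    uint16_t r = (uint16_t)wide;       /* the truncated 16-bit result */
    struct flags f;

    f.zf = (r == 0);
    f.cf = (wide > 0xFFFF);            /* unsigned overflow */
    f.sf = (r & 0x8000) != 0;          /* top bit is the sign bit */
    /* signed overflow: both inputs share a sign that the result lacks */
    f.of = ((~(a ^ b)) & (a ^ r) & 0x8000) != 0;

    /* parity only looks at the low 8 bits of the result */
    unsigned ones = 0;
    for (unsigned bit = 0; bit < 8; bit++)
        ones += (r >> bit) & 1u;
    f.pf = (ones % 2 == 0);
    return f;
}
```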

 

So how do we "test" these? First we have to tell the CPU to make a test, usually (but not always, as we will see) by using the CMP instruction, like in the following examples:

 

cmp   eax,5

cmp   [437F03DAh],esi

cmp   ebx,edx

 

CMP subtracts the source operand from the destination operand, sets the flags, and discards the result so that neither operand is changed.
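In C terms, CMP behaves roughly like the sketch below: do the subtraction, record the flags, and throw the difference away. The struct and the name cmp32 are made up for illustration, and the parity flag is left out to keep it short.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of the flags we care about after a CMP. */
struct cmp_flags { bool zf, cf, sf, of; };

/* Model of "cmp dst, src" on 32-bit operands. Both operands are passed
   by value, so neither one is changed, just like with the real CMP. */
struct cmp_flags cmp32(uint32_t dst, uint32_t src)
{
    uint32_t r = dst - src;            /* the result that gets discarded */
    struct cmp_flags f;

    f.zf = (r == 0);                   /* operands were equal */
    f.cf = (dst < src);                /* unsigned borrow */
    f.sf = (r >> 31) != 0;             /* result looks negative */
    /* signed overflow: operands differ in sign and the result's sign
       differs from dst's sign */
    f.of = (((dst ^ src) & (dst ^ r)) >> 31) != 0;
    return f;
}
```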

 

CONDITIONAL BRANCHING

Another instruction goes hand-in-hand with the CMP instruction: the conditional jump, usually abbreviated Jcc where cc takes the place of a condition code. There are a lot of condition codes, all built from the five flags I outlined above. To negate any condition (e.g. jz jumps if zero, jnz jumps if not zero), just stick an n immediately after the j. Remember, the conditions describe the destination (first) operand relative to the source (second) operand: after cmp eax,5, jl jumps if eax is less than 5.

  • JZ / JE: Jump if zero flag is set. JE stands for jump if equal. Both of these do the exact same thing; it's really a matter of preference which one you use. I use JE for arithmetic comparisons, JZ for logic comparisons.
  • JL, JLE, JG, JGE: Signed arithmetic comparisons; jump if less than, jump if less than or equal, jump if greater than, and jump if greater than or equal, respectively.
  • JB, JBE, JA, JAE: Unsigned arithmetic comparisons; jump if below, jump if below or equal, jump if above, and jump if above or equal, respectively.
  • JC: Jump if carry flag is set. This is often used after calling a BIOS interrupt because most of them set the carry flag if an error has occurred.
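The difference between the signed and unsigned jumps is easier to see when each condition is written out in terms of the flags. The sketch below is a C model using the standard x86 flag combinations for each condition code; the struct, the cmp32 helper, and the lowercase function names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag record for this sketch. */
struct flags { bool zf, cf, sf, of; };

/* Flags left behind by "cmp dst, src" on 32-bit operands. */
struct flags cmp32(uint32_t dst, uint32_t src)
{
    uint32_t r = dst - src;
    struct flags f;
    f.zf = (r == 0);
    f.cf = (dst < src);
    f.sf = (r >> 31) != 0;
    f.of = (((dst ^ src) & (dst ^ r)) >> 31) != 0;
    return f;
}

/* Each function returns true when the matching jump would be taken. */
bool je (struct flags f) { return f.zf; }                   /* same as JZ */
bool jl (struct flags f) { return f.sf != f.of; }           /* signed <   */
bool jle(struct flags f) { return f.zf || f.sf != f.of; }   /* signed <=  */
bool jg (struct flags f) { return !f.zf && f.sf == f.of; }  /* signed >   */
bool jge(struct flags f) { return f.sf == f.of; }           /* signed >=  */
bool jb (struct flags f) { return f.cf; }                   /* unsigned <, same as JC */
bool jbe(struct flags f) { return f.cf || f.zf; }           /* unsigned <= */
bool ja (struct flags f) { return !f.cf && !f.zf; }         /* unsigned >  */
bool jae(struct flags f) { return !f.cf; }                  /* unsigned >= */
```

For example, comparing 0xFFFFFFFF against 1 takes both JL and JA in this model: read as a signed number the first operand is -1, which is less than 1, but read as an unsigned number it is far above 1.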

 

Let's take a look at some C/C++ code (addresses are for illustrative purposes only):

 

eat(FOOD_TWINKIE);

if(tired)

      sleep(20);

else

      program();

eat(FOOD_HOT_POCKET);

 

In assembly language, this would end up being something like:

 

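    push        0000H           ;FOOD_TWINKIE
    call        eat
    ;entering the IF statement
    cmp         tired,0         ;if(tired) is true for any nonzero value
    je          $FALSE          ;tired == 0, so take the else branch
$TRUE:
    push        0014H           ;20
    call        sleep
    jmp         $RESUME
$FALSE:
    call        program
    ;fall through to resume execution
$RESUME:
    push        0001H           ;FOOD_HOT_POCKET
    call        eat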

 

Recognize the structure? It's the same as the goto form of the if-else statement I showed you earlier, just with the test inverted so that the true block comes first.

 

All right, that's it for today. Next time, we'll use everything we've learned to get going on this disassembler project. If you have any questions, comments, or suggestions, feel free to post a message on my page or just leave a comment here.

