Thursday, April 25, 2013

Memory Segmentation of a Compiled Program

When a program is compiled and executed, it is loaded into the computer's memory as binary code.
The binary code of a program is organized into five segments in the memory space. Each segment is a portion of memory set aside for a specific purpose.

Figure 1



The five segments of program memory are shown in Figure 1. The first segment is called the Text segment, or Code segment. This is where the assembled machine-language instructions of the program are located. Once the program starts, the CPU reads and executes these instructions one at a time. On x86, the EIP register always points to the next instruction to be executed. Execution does not necessarily proceed linearly through memory, since the flow of execution is often redirected by control structures and function calls.

The Data and Bss segments are used to store global and static variables. Specifically, the data segment holds initialized global and static variables, whereas the bss segment holds their uninitialized counterparts. Both the data and bss segments have fixed sizes.
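As a rough illustration of the split between the two segments, here is a minimal C sketch. All names in it (`initialized_global`, `uninitialized_global`, `next_ticket`, `count_calls`, `print_data_and_bss`) are made up for this example, and the exact placement of each variable is ultimately up to the compiler and linker, so treat the comments as the typical case rather than a guarantee.

```c
#include <assert.h>
#include <stdio.h>

int initialized_global = 42;   /* has an initializer: typically placed in the data segment */
int uninitialized_global;      /* no initializer: typically placed in the bss segment */

/* Hands out ticket numbers. `ticket` is static and initialized, so it
   usually lives in the data segment and keeps its value between calls. */
int next_ticket(void)
{
    static int ticket = 100;
    return ticket++;
}

/* Counts its own calls. `calls` is static and uninitialized, so it
   usually lives in the bss segment and starts at zero. */
int count_calls(void)
{
    static int calls;
    return ++calls;
}

/* Prints where the two globals live. The addresses differ between
   systems and, with address randomization, between runs. */
void print_data_and_bss(void)
{
    /* C zero-initializes objects with static storage duration, so this
       holds as long as nothing has assigned to the variable yet. */
    assert(uninitialized_global == 0);

    printf("initialized_global   (data) at %p\n", (void *)&initialized_global);
    printf("uninitialized_global (bss)  at %p\n", (void *)&uninitialized_global);
}
```

Calling `print_data_and_bss()` from `main` should usually show the two addresses close together, since the data and bss segments normally sit next to each other, but that is an observation about common toolchains rather than something the language promises.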

The next segment down the memory axis is the Heap segment. It is a segment of memory that a programmer can directly control: blocks of memory in this segment can be allocated and used as needed. The size of the heap is dynamic, and it is managed by allocator and deallocator algorithms. This means a programmer using the heap allocation functions can reserve and free memory on the fly. The heap grows downward in the figure, toward higher memory addresses.
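Reserving and freeing heap memory on the fly might look like the following C sketch, which uses the standard `malloc` and `free` functions. The helper names `make_squares` and `heap_demo` are invented for this example, and nothing here depends on a particular allocator.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reserves room for n ints on the heap and fills it with 0, 1, 4, 9, ...
   Returns NULL if the allocation fails. The caller must free() the block. */
int *make_squares(size_t n)
{
    int *block = malloc(n * sizeof *block);
    if (block == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        block[i] = (int)(i * i);
    return block;
}

/* Allocates two blocks, prints their addresses, then releases them. */
void heap_demo(void)
{
    int *first = make_squares(4);
    int *second = make_squares(4);

    if (first != NULL && second != NULL) {
        assert(first[3] == 9);
        printf("first block  at %p\n", (void *)first);
        printf("second block at %p\n", (void *)second);
    }

    free(first);    /* free(NULL) is a harmless no-op */
    free(second);
}
```

On many systems the second block is printed at a higher address than the first, which matches the growth direction described above, but the allocator is free to place blocks wherever it likes, so this should not be relied on.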

The last segment, the Stack segment, is a very important and unique segment that deserves more explanation. In a modern program, where many functions are called from the main function, the flow of execution is managed by a mechanism called the stack frame. When a function is called, the program needs to remember a few things: the address of the next instruction to run after the function call finishes, the set of local variables associated with the function, and the parameters passed to the function for this call. All of this information is stored together in a block of memory called a stack frame. Each function call creates a stack frame in the stack segment. Figure 2 shows the contents of one stack frame.

Figure 2


Figure 2 shows the layout of one stack frame. Starting from the bottom of the frame, the parameters passed to the called function are stored first. The address of the instruction to which execution will return after the function call finishes is stored above the parameters. The next slot up (at a lower address) is the so-called saved frame pointer, which holds the frame pointer of the previous stack frame. The address of this slot is stored in the EBP register, which always holds the frame pointer of the current stack frame. Above the saved frame pointer, the local variables are stored. The top of the current stack frame is always pointed to by another register, ESP. If the function in question calls another function, then once the parameters, the return address, and the saved frame pointer have been pushed, the value in ESP becomes the frame pointer of the next stack frame. When the called function returns, its stack frame is popped, and the previous stack frame becomes the current stack frame again.
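One way to watch stack frames being created and popped is to print the address of a local variable at each level of a recursive call. The C sketch below does that; `sum_to` and `stack_demo` are names invented for this example. What actually ends up in each frame depends on the compiler, the optimization level, and the calling convention (on x86-64, for instance, parameters usually travel in registers), so the comments describe the classic 32-bit x86 picture from Figure 2.

```c
#include <assert.h>
#include <stdio.h>

/* Returns n + (n-1) + ... + 0. Every call gets its own stack frame,
   which holds the parameter `n` and the local variable `marker`. */
int sum_to(int n)
{
    int marker = n;   /* local variable, stored in this call's frame */

    /* Each nested call typically prints a lower address, because the
       stack grows toward lower addresses on x86. */
    printf("sum_to(%d): local at %p\n", n, (void *)&marker);

    if (n == 0)
        return 0;     /* deepest frame; frames are popped from here on */

    /* The call below creates a new frame; the address of the instruction
       that follows the call is saved so execution can resume here. */
    return n + sum_to(n - 1);
}

/* Runs the recursion four levels deep and checks the result. */
void stack_demo(void)
{
    int total = sum_to(3);
    assert(total == 6);
    printf("total = %d\n", total);
}
```

Calling `stack_demo()` from `main` prints four addresses, one per frame. Taking the address of `marker` is what forces it into memory; without that, an optimizing compiler might keep it in a register and the frame would look different.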
