 
   
 
Implementation Details
   
 
This page describes technical details of the implementation of the port.
   
 
 
 


 

File Names

Most of the time, there is a file className.h (the header) and a file className.c (the implementation). Sometimes there is also a file className.inline.h that holds the implementation of inline functions for that class. In some places, however, there are className.h and className.dcl.h instead. Sometimes the class definition is in .h and the inline functions are in .dcl.h, and sometimes vice versa. To simplify grepping through the source, I took the liberty of renaming the files containing inline code to .inline.h.

Additionally, I prefixed all files containing i386 specifics with 'i386-'. I did not use 'i386.' (by analogy with 'sun4.') because I had started in vm/asm with files named 'sparc-*'. The hyphenated prefix is consistent with other GNU naming rules, and it also simplifies my grepping.

Assembler

The assembler has been derived from the machine-specific parts of gas (GNU binutils). It is wrapped in an object-oriented shell similar to that of the Sparc assembler (see directory vm/asm, especially i386*).
The supported instructions are:

mov, push, pop, pusha, popa, add, sub, inc, dec, neg, mul, imul, div, idiv, cmp, test, and, or, xor, not, shl, shr, sal, sar, jmp, jcc, call, ret, lea, clr, nop, align

As in the Sparc assembler, there is a function for each assembler instruction, named after the instruction with the first letter capitalized (e.g. Call). Unlike in the Sparc assembler, the operand type is not encoded in the function name. The functions do not take locations or numbers as operands but objects of class Operand. This class has several subclasses for the different addressing modes:
 
Class name  Description                        Assembler notation
R           register direct                    %eax
I           immediate                          $3
M           memory                             my_label
B           based                              (%ebx)
Bd          based with displacement            4(%ebx)
Id          indexed with displacement          4(,2,%eax)
BId         based with index and displacement  4(%ebx,2,%eax)
PC          PC relative (for jumps)
Md          memory direct (for jumps), creates a relocated PC relative jump

Operand type is given separately for each operand. The generated AddrDescs contain offsets to the respective values instead of the whole instruction as in Sparc code.
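To illustrate the operand scheme described above, here is a minimal sketch in C++. It is not the code from vm/asm: the enum Reg, the text() method, and the source-then-destination argument order are assumptions made for this example, and the functions only render assembler text, whereas the real assembler emits machine code. Only three of the operand classes are shown.

```cpp
#include <cassert>
#include <string>

// Hypothetical register type; the port defines its own.
enum Reg { eax, ecx, edx, ebx, esp, ebp, esi, edi };

inline std::string regName(Reg r) {
    static const char* const names[] = {
        "%eax", "%ecx", "%edx", "%ebx", "%esp", "%ebp", "%esi", "%edi"};
    return names[r];
}

// Base class: each addressing mode knows how to print itself.
struct Operand {
    virtual ~Operand() {}
    virtual std::string text() const = 0;
};

struct R : Operand {   // register direct, e.g. %eax
    Reg reg;
    explicit R(Reg r) : reg(r) {}
    std::string text() const override { return regName(reg); }
};

struct I : Operand {   // immediate, e.g. $3
    int value;
    explicit I(int v) : value(v) {}
    std::string text() const override { return "$" + std::to_string(value); }
};

struct Bd : Operand {  // based with displacement, e.g. 4(%ebx)
    int disp;
    Reg base;
    Bd(int d, Reg b) : disp(d), base(b) {}
    std::string text() const override {
        return std::to_string(disp) + "(" + regName(base) + ")";
    }
};

// One function per instruction, named after it with a capital first letter.
// The operand type is carried by the Operand objects, not by the name.
inline std::string Mov(const Operand& src, const Operand& dst) {
    return "mov " + src.text() + ", " + dst.text();
}

inline std::string Add(const Operand& src, const Operand& dst) {
    return "add " + src.text() + ", " + dst.text();
}
```

With these definitions, Mov(I(3), R(eax)) and Add(Bd(4, ebx), R(ecx)) accept any mix of addressing modes through the same two functions, which is the point of passing Operand objects instead of encoding operand types in the function names.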

Register Usage

The i386 has eight general-purpose registers:
 
Name  Description
EAX   Return value
EDX   call clobbered, temp
ECX   call clobbered, temp
EBX   call saved, local
ESI   call saved, ByteMapBase
EDI   call saved, currentProcess
EBP   call saved, frame pointer (FP)
ESP   hardwired stack pointer (SP)

I wanted to have as little trouble as possible when calling C code, so I stuck to the C calling convention wherever I could. EAX is used for result passing. ECX and EDX are temporary registers, EBX is a 'local' register, ESI holds the ByteMapBase pointer, and EDI points to the currentProcess, where all other values that occupy global registers on Sparc are stored. EBP is used as the frame pointer.
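The register assignment above can be written down as a small lookup table, for instance as a starting point for a register allocator. This is a sketch only: the names SaveKind, RegInfo, registerTable, countOfKind, and findRegister are invented for this example and do not come from the port.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical classification of the roles listed in the table above.
enum SaveKind { Result, CallClobbered, CallSaved, StackPointer };

struct RegInfo {
    const char* name;
    SaveKind    kind;
    const char* role;
};

const RegInfo registerTable[] = {
    { "EAX", Result,        "return value" },
    { "EDX", CallClobbered, "temp" },
    { "ECX", CallClobbered, "temp" },
    { "EBX", CallSaved,     "local" },
    { "ESI", CallSaved,     "ByteMapBase" },
    { "EDI", CallSaved,     "currentProcess" },
    { "EBP", CallSaved,     "frame pointer (FP)" },
    { "ESP", StackPointer,  "stack pointer (SP)" },
};

const std::size_t registerCount =
    sizeof registerTable / sizeof registerTable[0];

// Number of registers with the given role kind.
inline std::size_t countOfKind(SaveKind kind) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < registerCount; ++i)
        if (registerTable[i].kind == kind) ++n;
    return n;
}

// Look up a register by name; returns a null pointer if it is unknown.
inline const RegInfo* findRegister(const char* name) {
    for (std::size_t i = 0; i < registerCount; ++i)
        if (std::strcmp(registerTable[i].name, name) == 0)
            return &registerTable[i];
    return 0;
}
```

The table makes the cost of the fixed assignments visible: of the four call-saved registers, only EBX is available as a local.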

In the future, if the byte map base is compiled in as an absolute address, and if a new kind of AddrDesc is used to update all referring locations when the base address changes, ESI can be used as an additional local register. The register for currentProcess can also be abandoned in favor of the global variable.

Stack Frame

Since the i386 has no register saving mechanism like the Sparc's, and the receiver consequently has to go into a stack location anyway, the receiver is stored on the stack in conformance with the C calling convention. In the absence of 'branch and link' instructions, calls to 'recompile' or 'di' are ordinary calls that push their return address on the stack.
 
SP -->  outgoing receiver
        outgoing arguments
        local slots
        saved registers (currently EBX)
        current_pc
        pc_chain
FP -->  saved EBP
        return address
        incoming receiver
        incoming arguments

SendDescs

On a method call, more detailed information about the state is stored inside the calling code, where it can be referenced through the return address. Although I do not like filling the I-cache with data, I did not see a simple way to remove this. The sending call instruction is aligned oddly in order to produce a 32-bit aligned return address. The called code returns with 'ret', which can make use of the call-return buffer sometimes found in i386 CPUs. The following 2 words (32-bit quantities), which are occupied by a call and a delay slot on Sparc, are used for a jump that skips the whole data section that follows.
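The odd alignment of the call can be worked out with a little arithmetic. The sketch below assumes the send uses the direct near call with a 32-bit displacement (opcode E8), which is 5 bytes long; the function name paddingBeforeCall is invented for this example, and how the port actually pads (for instance with nop) is not shown.

```cpp
#include <cassert>
#include <cstdint>

// Length of a direct near call: one opcode byte (E8) plus a 32-bit
// displacement. Assumed here to be the instruction used for sends.
const std::uint32_t callLength = 5;

// Number of padding bytes to emit at address pc so that the return
// address of a call placed right after the padding is 4-byte aligned.
inline std::uint32_t paddingBeforeCall(std::uint32_t pc) {
    std::uint32_t returnAddr = pc + callLength;
    return (4 - returnAddr % 4) % 4;
}
```

Because the call is 5 bytes long, it has to start at an address that is 3 modulo 4: at pc = 0x1000 three bytes of padding put the call at 0x1003 and the return address at 0x1008.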

Stack Walking

A big difference from Sparc is the following: Sparc calls a subroutine and saves the return address in a register. A stack frame is then allocated atomically with a 'save' instruction, which stores the return address along with the other saved values on the stack (lazily). On i386, the call first pushes the return address, then the old FP is pushed, and then new space is allocated by subtracting from the stack pointer. If you look at the stack pointer, you never really know whether it points to the top of a stack frame, to a new return address, or to a saved FP. Stack walking code should therefore orient itself by the frame pointer (EBP).

The code that possibly walks the stack is in directory 'vm/runtime', in the files frame.[ch], stack.[ch], and process.[ch]. A stack frame 'frame' is composed of two 'halfFrame's that represent the incoming and the outgoing end. A Sparc stack starts at the current stack pointer (SP) and winds upwards as a linked list, where the stack pointer is called frame pointer (FP) once it is saved. On i386, however, this list is rooted at the current frame pointer (EBP). To obtain the location of the outgoing arguments of the topmost frame, it is probably necessary to look at the executing code for its frame size. Furthermore, since Sparc saves from outgoing to incoming registers and only then to the stack, while i386 saves on the stack in the first place, I assume that the access to arguments or local variables must be un-shifted by one frame.
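A walk along the EBP-rooted list can be sketched as follows. This is not the code from vm/runtime: it runs over a simulated stack (a vector of 32-bit words indexed by byte address / 4), and it assumes that the outermost frame stores 0 as its saved EBP. The names Memory, FrameInfo, and walkStack are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simulated stack memory: word i holds the contents of byte address 4 * i.
typedef std::vector<std::uint32_t> Memory;

struct FrameInfo {
    std::uint32_t fp;          // value of EBP for this frame
    std::uint32_t returnAddr;  // word stored just above the saved EBP
};

// Follow the chain of saved EBP values, starting at the current EBP.
// Assumes the chain ends with a saved EBP of 0 (an assumption of this
// sketch, not something the port is known to guarantee).
inline std::vector<FrameInfo> walkStack(const Memory& mem, std::uint32_t ebp) {
    std::vector<FrameInfo> frames;
    while (ebp != 0) {
        assert(ebp % 4 == 0);
        std::size_t slot = ebp / 4;
        assert(slot + 1 < mem.size());

        FrameInfo f = { ebp, mem[slot + 1] };
        frames.push_back(f);

        std::uint32_t next = mem[slot];   // saved EBP of the caller
        assert(next == 0 || next > ebp);  // callers live at higher addresses
        ebp = next;
    }
    return frames;
}
```

Note that the walk never consults ESP. It yields the frame pointer and return address of each frame, but, as described above, not the extent of the outgoing arguments of the topmost frame.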

 
           
 
 
 
 
   
© 1997-2019 Gordon Cichon - Contact