## Wednesday, September 23, 2009

### Decoding operations

We know that Z-machine decodes and executes one operation at a time. Let's look at exactly how it does this.

Remember the call stack? One piece of information stored in the call stack is the current program counter (PC). This points to an address in the Z-machine's memory (often beyond the 64k mark, so a 32-bit integer value is recommended). We begin by reading the byte in memory pointed to by the PC. We increment the counter every time we use it. The byte we just read contains the opcode number. After we have the opcode number we need to load the operands to be used during this operation. First we need to determine how many and what type they are.

The Z-machine has three types of operands: small constants (1 byte value), large constants (2 byte value) , and variables (1 byte variable number). It is useful to define them in an enum like this:

public enum OperandType
{
LargeConstant = 0,
Omitted = 3,
SmallConstant = 1,
Variable = 2
}

The enum values correspond to a pair of bits, where a large constant is binary 00, a small constant is 01, a variable is 10, and an omitted operand is 11.

Where do we read these values from? That depends on the opcode number.

If the opcode number is less than 128, the operation uses 'long form'. The operation always has two operands with either small constant or variable operands. Their types are determined by the seventh and sixth bits of the opcode number. Add 1 to the bit value to determine the operand type. Example: The opcode number is 78 which in binary is 01001110. The seventh bit is 1 and the sixth bit is 0. This gives the first operand as 2 (variable) and the second as 1 (small constant).

If the opcode is > 127 and less than 192 then it uses 'short form'. There may be zero or one operands. The type is given by the sixth and fifth bits taken together. It may be any of the three types or omitted.

If the opcode is > 191 (but not 236 or 250, which are special) then it uses 'variable form'. The operand types are specified by an additional byte. We read the byte of memory pointed to by the PC. Each pair of bits in this byte corresponds to one operand, which can be any of the types, including omitted. This gives us 0-4 operands.

If the opcode is 236 or 250 it uses 'double variable form'. The operands are determined by two additional bytes read in the same way as variable form. This gives us 0-7 operands.

Once we have the number and types of operands we can then get their values. Small constants are loaded by reading the next byte pointed to by the PC. Large constants by the next word (two bytes) pointed to by the PC. Variable operands are more complicated. First the variable number is determined by reading the next byte pointed to by the PC. What happens next depends on the variable number: If it is zero, we pop a value off the temporary stack in the call stack. If it is between one and fifteen we read the value of a routine local variable (also stored in the call stack), otherwise we read from a so-called "global" variable, which is really just an offset from a known location in the Z-machine memory. In any of those three cases, the resulting 16-bit value becomes the value of the variable operand.

Now we can execute our operation with the given operands. The following routine decodes and executes a single operation.

private void ExecuteInstruction()
{
this.Operands.Reset();
if (operationCode < 128)
{
this.PopulateOperand((OperandType)((operationCode >> 6 & 1) + 1));
this.PopulateOperand((OperandType)((operationCode >> 5 & 1) + 1));
}
else
{
if (operationCode < 192)
{
this.PopulateOperand((OperandType)(operationCode >> 4 & 3));
}
else
{
byte operandByte2 = 255;
if (operationCode == 236 || operationCode == 250)
{
}

this.PopulateOperand((OperandType)(operandByte1 >> 6 & 3));
this.PopulateOperand((OperandType)(operandByte1 >> 4 & 3));
this.PopulateOperand((OperandType)(operandByte1 >> 2 & 3));
this.PopulateOperand((OperandType)(operandByte1 & 3));
this.PopulateOperand((OperandType)(operandByte2 >> 6 & 3));
this.PopulateOperand((OperandType)(operandByte2 >> 4 & 3));
this.PopulateOperand((OperandType)(operandByte2 >> 2 & 3));
this.PopulateOperand((OperandType)(operandByte2 & 3));
}
}

Operations[operationCode](this);
}

We can take advantage of the fact that a byte value of of 255 corresponds to all operands being omitted to combine the variable and double variable case. The final line of the routine executes the operation by invoking a delegate stored in an immutable array called Operations and indexed by the opcode number.

Here's the code for populating the operands:

protected void PopulateOperand(OperandType type)
{
switch (type)
{
case OperandType.SmallConstant:
break;
case OperandType.Variable:
break;
case OperandType.LargeConstant:
this.CallStack.ProgramCounter += 2;
break;
}
}

and finally, here is the code for reading a variable:

{
if (variableNumber > 15)
{
}

return variableNumber > 0 ? this.CallStack.LocalVariableRead((byte)(variableNumber - 1)) : this.CallStack.Pop();
}

CallStack.Pop() pops a value off of the temporary stack within the current frame. It does not remove the entire frame. I use methods called BeginRoutine and EndRoutine for that.

That's it for today.

1. I'm intrigued why you chose to use a switch for the ReadVariable method.

Is it not cleaner just to write this as an if?

if (variableNumber == 0)
...
else if (variableNumber < 16)
...
else
...

If performance is an issue the tests can be rewritten as bit tests, but that really would be premature optimization.

2. I suppose it would be cleaner as an if/else. Honestly that was one of the methods I wrote very early on and I suppose I just got used to seeing it that way. However, I do dislike having lots of if/else statements and usually use a switch if possible.

I'll update the code.

3. I replaced the switch with an if and a conditional. It looks very readable in visual studio, but not quite so much here because of the wrapping.