Tuesday, October 27, 2009

Variable references and stack fun!

As I mentioned last time, there are some simple opcodes which are easy to get wrong on your first attempt at writing an interpreter. Section 14 of the Z-machine standard lists the complete table of opcodes and among these are seven which are listed as having a single operand marked as (variable). This means the operation uses the value of that operand as a variable number. This has nothing to do with variable operand types discovered while decoding the instruction.

To help us see what the standard means, we'll use the example of the load opcode. The description for load in the standard is "The value of the variable referred to by the operand is stored in the result." This sounds simple enough and we might try a naive implementation using the methods ReadVariable and Store like so:

ushort variable = this.Operands[0];
ushort value = this.ReadVariable(variable);
this.Store(value);


In most cases this will work properly but there is a corner case where it will fail, namely when the operand value is zero. In this case ReadVariable pops a value off the stack. What's wrong with this? Surely it is correct because section 6.3 of the standard says:

Writing to the stack pointer (variable number $00) pushes a value onto the stack; reading from it pulls a value off. Stack entries are 2-byte words as usual.

Unfortunately this case is not mentioned in the 1.0 standard but it is clarified in version 1.1:

Indirect variable references
----------------------------
In the seven opcodes that take indirect variable references (inc, dec,
inc_chk, dec_chk, load, store, pull), an indirect reference to the stack
pointer does not push or pull the top item of the stack - it is read
or written in place.


For these opcodes only, this behavior overrides that defined in the older standard.

Knowing this, a better implementation is:

ushort variable = this.Operands[0];
ushort value = this.ReadVariable(variable);

this.WriteVariable(variable, value);
this.Store(value);


This writes the value back to the variable immediately after reading, which in the above case would push the popped value back onto the stack. In all other cases, the write essentially does nothing since it is writing an identical value back into the variable. The other six opcodes can be implemented similarly by making sure whenever we read a variable we follow it with a write. I think this is a fairly clean way of dealing with the situation. Although it is possible to allow in-place reads/writes to the stack, doing so requires extras methods or extra parameters to the existing ones to change their behavior.

The rest of this post contains some of my random thoughts about stacks.

Typically a stack will have at least three methods, Push, Pop, and Peek. Sometimes Peek is referred to as Top, although I dislike that naming as it seems less descriptive to me. Push puts a value on the stack, Pop removes a value, and Peek let's you see the top element without removing it.

Why do we only allow read-only access to the top element? It is easy to imagine a method called Poke which allows in-place writes to the top element. If a Poke method seems like an abuse of the stack type, it makes me wonder if Peek might also be. Imagine a stack with just Push and Pop methods. No peeking allowed! You couldn't use the stack as a convenient holding area for the current working element. I'm curious as to what pressure for design changes this might exert and whether they would be good or bad.

Working on this reminded me of some comments about stacks on Eric Lippert's blog. Although I don't remember exactly when it was, the discussion involved whether 'Pop' should return a value or merely throw away the top element. I would argue that Pop should return the top element because it is the inverse of Push. One puts an element in, one takes an element out. With this arrangement, Peek is merely a performance optimization for a common set of operations! The equivalent of Peek could be achieved by a Pop followed by a Push of the same element back again. A hypothetical Poke method to allow in-place writes could just as easily be imagined to be a Pop followed by a Push of an altered value. It is interesting to me that this is a far less common case.

If you prefer Pop not to return a value, then it seems fair that Push not take a value, but rather create a default valued element. If you also added our hypothetical Poke method, you could set the new element to the value you want. From this point of view, Push accepting a value is just a performance improvement for a common set of operations!

What do you think?

Tuesday, October 20, 2009

Basic operations and branches

  Now that we have some pieces in place to run operations, let's look at a simple operation for the Z-Machine that is used in all of its versions:

///
/// Operation 20 (ADD).
///
protected override void Operation20()
{
  var num1 = (short)this.Operands[0];
  var num2 = (short)this.Operands[1];
  this.Store((ushort)(num1 + num2));
}

  This obviously performs a simple signed addition and stores the result. Math and comparison operations in the Z-machine use signed numbers, but many other operations treat operands and memory values as unsigned.

Here's another:

///
/// Operation 3 (JG).
///
protected override void Operation3()
{
  var num1 = (short)this.Operands[0];
  var num2 = (short)this.Operands[1];
  this.Branch(num1 > num2);
}

  This operation compares two operands and branches if the first is greater than the second. I mentioned branch instructions before but now we will see exactly how they work.

protected void Branch(bool condition)
{
  byte branchData = this.Memory.ReadByte(this.CallStack.ProgramCounter++);
  var offset = (ushort)(branchData & 63);
  if ((branchData & 64) != 64)
  {
    offset = (ushort)((offset << 8) + this.Memory.ReadByte(this.CallStack.ProgramCounter++));
  }

  if (condition ^ ((branchData & 128) != 128))
  {
    if (offset > 1)
    {
      this.Jump((short)(offset > 8191 ? offset - 16384 : offset));
    }
    else
    {
      this.Return(offset, this.CallStack.EndRoutine());
    }
  }
}

  We begin by reading a byte pointed to by the program counter. The top two bits are used as flags, while the remaining six bits are the offset value. If the second highest bit is not set, we need to read another byte and combine this byte with the current offset to produce a single 14-bit signed offset. If the top bit is not set, then the branch condition is inverted and we branch on false instead of true. In any case, if the branch is taken, the program counter jumps based on the offset. There is a special case where if the offset is one or zero we return that value from the current routine immediately instead of jumping.

  The actual jump is implemented as another method. This is simply because it is identical to the behavior of the unconditional jump operation (opcode #140), so I factored it out into a separate method even though it is just a one-liner:

protected void Jump(short offset)
{
  this.CallStack.ProgramCounter += offset - 2;
}

  The other math and comparison operators are implemented similarly. They are:

#1 Jump if equal (Sometimes used with more than two operands, in which case the branch is taken if the first operand is equal to any of the others)
#2 Jump if less than
#3 Jump if greater than
#20 Add
#21 Subtract
#22 Multiply
#23 Divide
#24 Modulus
#128 Jump if zero

  Next time I'll look at some other seemingly simple operations that can be a source of problems for first time interpreter writers.

Wednesday, October 14, 2009

In the beginning...

For anyone interested in the history of the Z-machine, it is very much the history of Zork itself. Zork was inspired by an earlier (non-Infocom) game sometimes called 'Colossal Cave' or 'Adventure', which also has an interesting history and was inspired by a real cave in Kentucky. I've been to the end of the real road where the game Adventure begins. It may sound a bit silly but it was fun just being there.

My fascination with Zork began on a dark and stormy night in 1984. I was visiting a cousin of mine who had just gotten one of the first Macintosh computers and we sat down to play the first game he had gotten for it: Zork I. I was thrilled by the game and quickly become a fan of all things Infocom. Even though this was my introduction to Infocom it was not my first foray into text adventures.

I got my first computer in September of 1983. It was a Commodore Vic-20 and my first game for it was a cartridge of Scott Adams' adventure game 'Pirate's Cove'. I got several more of his cartridges over the next year, including Voodoo Castle, The Count, and Adventureland. The next year I upgraded to a C-64, bought a 1541 floppy disk drive and started getting Infocom games like Zork, Deadline, and The Lurking Horror (my all-time favorite). I also got a very interesting program called 'Adventure Writer' (also known as 'Quill') which allowed you to make your own games and came with two game compilations: The 'Junkyard' series and the 'Thriller' series. These were the first computer games I owned, but even they weren't my first contact with text adventures.

In 1981 I was in fifth grade. Once, I stayed overnight at a classmate's house. He and his brother had a TRS-80 and we made a trip to Radio Shack to pick up a game to play. We picked out Raaka-Tu and played it much of the night. The game was very unforgiving by today's standards, tending to kill you off rather randomly, but nevertheless I was enthralled. This was what first sparked my interest in computers and text adventures especially.

Monday, October 5, 2009

Static and Virtual?

  A few times while reading developer forums I've come across people asking if it possible to have a class member which is both static and virtual. The obvious answer is 'no'. Imagine a hypothetical static and virtual method named Foo. Because static members belong to no particular instance, when we say BaseClass.Foo() there is no way to determine whether we really meant to invoke DerivedClass.Foo(). Therefore it makes no sense to be both static and virtual.

  All very interesting, but how does it apply to the Z-machine? I wanted to be able to invoke a particular operation by using a simple array of delegates and accessing them by index (the opcode number). While the eight different versions of the Z-machine define different operations, all instances of a particular version have identical sets. Because of this and the fact that I implement each version as a class, the array of operations for a given version would logically be a static member of that version's class. This led to the situation of eight different static arrays, each one defined in a different class. This was wasteful because they have more operations in common than they have differences. Additionally it was ugly because it forced me to wrap the call to the delegate in a virtual method to ensure the proper array was accessed. Yuck.

  After thinking about the situation a little I thought it would be nice to have a static array of delegates to virtual methods, then I could just use a single array in my base class. After unsuccessfully trying to do just that, I ended up creating a single static array in my base class populated with delegates which take a ZMachine parameter and that call a virtual method on the instance passed to it to ensure the proper operation is invoked. This was ugly too because it involved a wrapper method to call the virtual method for each operation. I was able to use lambda syntax to clean it up quite a bit.

The result in my abstract ZMachine class looks like this:

protected delegate void Operation(ZMachine machine);

private static readonly ImmutableArray<Operation> operations = InitializeOperations();

private static ImmutableArray<Operation> InitializeOperations()
{
  var op = new ImmutableArray<Operation>.Builder(256);
  op[0] = op[32] = op[64] = op[96] = op[192] = z => z.Op0();
  op[1] = op[33] = op[65] = op[97] = op[193] = z => z.Op1();
  ...

  op[254] = z => z.Op254();
  op[255] = z => z.Op255();
  return op.MakeImmutable();
}


  The methods Op1, Op2, etc. are virtual methods in the abstract base class with the default behavior of calling an abstract method called InvalidOperation, allowing derived classes to implement their own handling for any illegal operations as well as override only those operations which they care about.

  Overall I'm pleased with the result. I ended up with a single array of delegates to invoke, and virtual operations to override, with the only wart being the extra method call using the lambdas.