Saturday, June 25, 2011

Putting the text in 'text adventures'

Decoding text so it can be displayed on-screen is something the Z-machine has to do a lot of. Let's see how it's done:

Firstly, text in a z-machine story file is stored in an encoded format where every two bytes represents three five-bit 'z-characters', each with values of 0-31. These z-characters are decoded into ZSCII characters, a character encoding scheme similar to ASCII. The algorithm used to convert z-characters to ZSCII varies somewhat in Z-machine versions 1 through 3, making it awkward to implement cleanly using an inheritance model like I have chosen.

First we convert the encoded text into z-characters.


protected ImmutableStack<byte> EncodedTextToZCharacters(ref int address)
{
  ushort character;
  ImmutableStack<byte> characters = null;
  var storyLength = this.StoryLength;
  do
  {
    if (address >= storyLength)
    {
      this.FrontEnd.ErrorNotification(ErrorCondition.InvalidAddress, "Encoded text does not end before the end of memory.");
      break;
    }

    character = this.Memory.ReadWord(address);
    characters = characters.Add((byte)(character >> 10 & 31));
    characters = characters.Add((byte)(character >> 5 & 31));
    characters = characters.Add((byte)(character & 31));
    address += 2;
  }
  while ((character & 32768) == 0);
  return characters.Reverse();
}


You may notice that three five-bit values per two bytes leaves one bit unused, this bit is set to mark the end of the text as no other length indicator is given. Once we have our z-characters, we can convert them to ZSCII. The common parts of the algorithm use the following method which in turn calls a virtual method by the same name containing the version specific differences.

protected ImmutableStack<Zscii> ZCharactersToZscii(bool calledRecursively, ImmutableStack<byte> zcharacters)
{
  byte lockedAlphabet = 0;
  byte nextAlphabet = 0;
  ImmutableStack<Zscii> zsciiText = null;
  while (zcharacters != null)
  {
    var currentAlphabet = nextAlphabet;
    nextAlphabet = lockedAlphabet;
    var zcharacter = zcharacters.Top;
    zcharacters = zcharacters.Tail;
    this.ZCharactersToZscii(calledRecursively, zcharacter, currentAlphabet, ref nextAlphabet, ref lockedAlphabet, ref zcharacters, ref zsciiText);
  }
  return zsciiText.Reverse();
}


This method loops through all the z-characters, building up the zscii text as it goes. Alphabet values range from 0 to 2 and affect the meaning of an individual z-character. Initially the alphabet is zero but can be shifted for a single character (changing nextAlphabet) or shift-locked (changing lockedAlphabet). The calledRecursively parameter is used in version 2 and above as we'll see in a bit. Next we'll see the actual decoding as it is done in version 1.

protected virtual void ZCharactersToZscii(bool calledRecursively, byte zcharacter, byte currentAlphabet, ref byte nextAlphabet, ref byte lockedAlphabet, ref ImmutableStack<byte> zcharacters, ref ImmutableStack<Zscii> zsciiText)
{
  switch (zcharacter)
  {
    case 0:
    zsciiText = zsciiText.Add(Zscii.Space);
    break;
    case 1:
    zsciiText = zsciiText.Add(Zscii.NewLine);
    break;
    case 2:
    case 3:
    nextAlphabet = (byte)((lockedAlphabet + zcharacter - 1) % 3);
    break;
    case 4:
    case 5:
    nextAlphabet = lockedAlphabet = (byte)((lockedAlphabet + zcharacter) % 3);
    break;
    default:
    if (zcharacter == 6 && currentAlphabet == 2)
    {
      if (zcharacters.Count() > 1)
      {
        zsciiText = zsciiText.Add((Zscii)((zcharacters.Top * 32) + zcharacters.Tail.Top));
        zcharacters = zcharacters.Tail.Tail;
      }
      else
      {
        zcharacters = null;
      }

      break;
    }

    zsciiText = zsciiText.Add(this.GetZsciiAlphabetCharacter((byte)((currentAlphabet * 26) + zcharacter - 6)));
    break;
    }
  }

  return zsciiText.Reverse();
}


As you can see, z-characters 0 and 1 are translated to 'space' and 'newline' respectively. Values 2 and 3 are single character alphabet shifts while 4 and 5 are shift locks. With a single exception, all other values are converted to zscii by subtracting 6 and adding 26 times the current alphabet number. The exception to this is the z-character 6 when the current alphabet is 2, which represents an escape sequence. In this case the zscii value is determined by the next two z-characters and is calculated as 32 times the first z-character plus the second. Next is version 2, which adds a new feature called abbreviations.

protected override void ZCharactersToZscii(bool calledRecursively, byte zcharacter, byte currentAlphabet, ref byte nextAlphabet, ref byte lockedAlphabet, ref ImmutableStack<byte> zcharacters, ref ImmutableStack<Zscii> zsciiText)
{
  if (zcharacter == 1)
  {
    this.AppendAbbreviation(zcharacter, calledRecursively, ref zcharacters, ref zsciiText);
    return;
  }

  base.ZCharactersToZscii(calledRecursively, zcharacter, currentAlphabet, ref nextAlphabet, ref lockedAlphabet, ref zcharacters, ref zsciiText);
}


Z-character 1 is no longer a newline, but instead represents an abbreviation decoded in the following method.

protected void AppendAbbreviation(byte zcharacter, bool calledRecursively, ref ImmutableStack<byte> zcharacters, ref ImmutableStack<Zscii> zsciiText)
{
  if (calledRecursively)
  {
    this.FrontEnd.ErrorNotification(ErrorCondition.NestedAbbreviation, "Nested abbreviation detected.");
    return;
  }

  if (zcharacters != null)
  {
    var abbreviationNumber = ((zcharacter - 1) * 32) + zcharacters.Top;
    zcharacters = zcharacters.Tail;
    var abbreviationsTableAddress = this.Memory.ReadWord(24);
    var abbreviationAddress = 2 * this.Memory.ReadWord(abbreviationsTableAddress + (2 * abbreviationNumber));
    var abbreviation = this.ZCharactersToZscii(true, this.EncodedTextToZCharacters(ref abbreviationAddress));
    foreach (var zsciiCharacter in abbreviation.Enumerable())
    {
      zsciiText = zsciiText.Add(zsciiCharacter);
    }
  }
}


An abbreviation is essentially just another encoded string which needs to be decoded separately and appended to the text we've decoded so far. After looking up the location of the text we end up calling our first method recursively (remember that parameter?) The calledRecursively parameter allows us to detect the situation where an abbreviation contains another abbreviation, which is illegal according to the Z-machine standard and could potentially lead to an endless loop otherwise. Lastly, version three expands the number of possible abbreviations and alters the behavior of alphabet shifts.

protected override void ZCharactersToZscii(bool calledRecursively, byte zcharacter, byte currentAlphabet, ref byte nextAlphabet, ref byte lockedAlphabet, ref ImmutableStack<byte> zcharacters, ref ImmutableStack<Zscii> zsciiText)
{
  switch (zcharacter)
  {
    case 1:
    case 2:
    case 3:
    this.AppendAbbreviation(zcharacter, calledRecursively, ref zcharacters, ref zsciiText);
    break;
    case 4:
    case 5:
    if (currentAlphabet == 0)
    {
      nextAlphabet = (byte)(zcharacter % 3);
      break;
    }

    nextAlphabet = lockedAlphabet = (byte)(currentAlphabet - ((zcharacter - currentAlphabet) % 3));
    break;
    default:
    base.ZCharactersToZscii(calledRecursively, zcharacter, currentAlphabet, ref nextAlphabet, ref lockedAlphabet, ref zcharacters, ref zsciiText);
    break;
  }
}


This implementation allows shift locks in version 3 caused by consecutive single shifts, although the 1.1 standard disallows it. Many (all?) of Infocom's version 3 and later interpreters did this which is the reason I included it.

That's it. The resulting ZSCII is mostly ready to be displayed. I say mostly because some ZSCII values fall in what is called the 'extra characters' range, outside the standard ASCII printable characters and used primarily to display accented characters and the like. These are easily converted to unicode via a simple lookup.

Sunday, March 20, 2011

Fun with source control

I've been doing a lot of clean up on my code, however many of my changes haven't made their way to the project at http://grue.codeplex.com just yet. For some time now I've been using two different source repositories. Back before I created the project on codeplex I had set up my own TFS 2010 server. I did this both to keep better track of the changes I was making and to become more familiar with TFS 2010, which I really enjoy using. I still check my changes into my TFS and copy these files over to a duplicate project for uploading to codeplex (not as often as I should). I probably ought to just switch entirely over to codeplex, but haven't because I haven't had time to look into what I might be giving up by doing so.

Despite my procrastination, I did manage to add upper window support a while back, making most games playable now. My next update will likely include support for color. I'm also planning on putting up at least one frontend as an example, but I haven't decided which would be the best to use. I have frontends for a variety of platforms: console, WPF, Xbox, etc. The console is a good example of a simple frontend without a lot of distracting code, but it would prohibit demonstrating more advanced features, like mouse input or graphics.

Tuesday, October 12, 2010

Small update

This is just a small update to let everyone know I'm still around. A lot of things conspired this summer to keep me from posting. The latest is a recent surgery from which I am still recovering.

Still, I haven't stopped work on the project. In fact I recently created a project at http://grue.codeplex.com/ to host the source code for my class library.
I haven't done a binary release yet as I want to make a few more changes first. I eventually want to add the code for some of my front end applications as well, although at the moment the library's public API is still changing enough that they become outdated fairly quickly.

If anyone has the desire to dig through the source, I'd be happy to answer any questions. If anyone is feeling particularly adventurous and wants to try their hand at writing a front end, the project files require Stylecop be installed in order to build. If you aren't using Stylecop yet...why?

The current status of the library is such that Version 1 through 3 games run great (with the notable exception of Seastalker versions which use the SonarScope display). Upper window support and color is currently broken, which means V4-5, V7-8 will run but display incorrectly. Version 6 support is unfinished. Your mileage may vary.

Tuesday, April 6, 2010

You learn something everyday...if you read the documentation

As I mentioned once, I have a static array of delegates in my Z-machine base class.

The delegate signature looks like this:
protected delegate void Operation(ZMachine machine);

The array is intialized with a series of statements like this:
op[0] = op[32] = op[64] = op[96] = op[192] = z => z.Op0();
op[1] = op[33] = op[65] = op[97] = op[193] = z => z.Op1();
...
op[254] = z => z.Op254();
op[255] = z => z.Op255();

I was forced to create a wrapper method which allows the delegate (which is static) to call an instance method specified at the time of invocation. This is because the actual operations are virtual to deal with the changing behavior of the various Z-machine versions. Using lambdas instead of named methods makes things look cleaner and involves a lot less typing, but always bugged me because of the extra method call needed for each instruction that is executed. I could find no better way though.

Today while reading through some documentation I discovered a way to do this without the wrapper represented by the lambda. About a third of the way down the page I found this:

"When a delegate represents an open instance method, it stores a reference to the method's entry point. The delegate signature must include the hidden this parameter in its formal parameter list; in this case, the delegate does not have a reference to a target object, and a target object must be supplied when the delegate is invoked."

I'd never heard of open instance delegates before. This was exactly what I was looking for!

Now I can write a method like:
private static Operation CreateOperation(string method)
{
return (Operation)Delegate.CreateDelegate(typeof(Operation), typeof(ZmachineV1).GetMethod(method, BindingFlags.NonPublic BindingFlags.Instance));
}


and my array initialization looks like:
op[0] = op[32] = op[64] = op[96] = op[192] = CreateOperation("Operation0");
op[1] = op[33] = op[65] = op[97] = op[193] = CreateOperation("Operation1");
...
op[254] = CreateOperation("Operation254");
op[255] = CreateOperation("Operation255");

This reduces the size of the binary and removes the extra method call when each instruction is executed. I only wish C# had some sort of native support for this type of delegate, as the CreateOperation method is ugly and can fail at runtime if the wrong name is passed whereas delegate creation using a method group name is checked at compile time.

I've nearly finished my refactoring and hope to make some code publicly available soon.

[Update: Thanks for the comments. They inspired me to keep trying.]

I'll admit I'm inexperienced with functional programming. It gives me headaches.
From what I see, expression trees reintroduce the code bloat I had with the lambdas. Honestly I didn't look into it deep enough to see if the extra method call gets reinserted at execution time.

Anyway, I played around with it some more and came up with this (sorry about the formatting):

namespace ConsoleApplication1
{
using System;

///
/// The program.
///

internal class Program
{
///
/// Program entry.
///

private static void Main()
{
var z = new ZmachineV2();
z.Run();
}

///
/// The zmachine V1.
///

private class ZmachineV1
{
///
/// The operations.
///

private static Operation[] operations;

///
/// A fake op (Needed to get the MethodInfo).
///

private delegate void FakeOp();

///
/// An operation.
///

///
/// The zmachine.
///
private delegate void Operation(ZmachineV1 zmachine);

///
/// Gets the operations.
///

///
/// The operations.
///

private Operation[] Operations
{
get
{
return operations ?? (operations = this.InitializeOperations());
}
}

///
/// Runs one cycle of the zmachine.
///

public void Run()
{
this.Operations[0](this);
}

///
/// Operation 0.
///

protected virtual void Operation0()
{
Console.WriteLine("Called ZmachineV1.Operation0.");
}

///
/// Creates an operation.
///

///
/// The fake operation.
///
///
/// An operation.
///

private static Operation CreateOperation(FakeOp fake)
{
return (Operation)Delegate.CreateDelegate(typeof(Operation), fake.Method.GetBaseDefinition());
}

///
/// Initializes the operations.
///

///
/// The operations.
///

private Operation[] InitializeOperations()
{
var ops = new Operation[1];
ops[0] = CreateOperation(this.Operation0);
return ops;
}
}

///
/// The zmachine V2.
///

private class ZmachineV2 : ZmachineV1
{
///
/// Operation 0.
///

protected override void Operation0()
{
Console.WriteLine("Called ZmachineV2.Operation0.");
}
}
}
}


The only downsides to this that I see are that I had to give up the readonly modifier on the delegate array, wrap it in a property, create another delegate type, and the initialization generates a bit of garbage for the FakeOp delegates. On the plus side, the CreateOperation method looks better and I no longer need to refer to anything in the Reflection namespace, so I can drop the using statement for that. I no longer need to pass strings and my method names are verified at compile time. I also don't need any wrapper methods. What do you think?

Tuesday, February 16, 2010

Design decisions

This post is long overdue and was originally going to feature my code for playing sounds in the Z-machine. That will have to wait however as I've recently been making some significant changes to my Z-machine library and the sound code faces some significant rewriting.

Supporting all eight Z-machine versions in one codebase is an interesting challenge. Back when I first started this project, I had one class which encompassed every version. It was filled with hundreds of if/else statements and switches which directed behavior based on the machine version. This worked well enough, but was a mess to look at and maintain. After some time I decided to use an inheritance model to separate the behavior of each version. This helped me clean up the code as well as fix a few bugs which were hidden before.

One thing that became clear to me as I refactored my code was that my original design, in the interest of typing less code, had actually created a sort of hybrid machine where capabilities of later versions are actually available to earlier ones. An interpreter can get away with this sort of design by relying on the fact that earlier games will not try to use these capabilities, as I mentioned during my last post. Indeed, I think this approach is fairly common among Z-machine interpreters. While this is fine from a practical point of view of running games. I've recently come to the conclusion that it is poor from a code-as-documentation standpoint. If we had to reverse engineer existing interpreters to reproduce the Z-machine standards document, a hybrid machine would give us a badly distorted picture. Theoretically, it could also encourage the development of games that use capabilities not properly belonging to a given version. Because of this I've decided to adopt a more strict interpretation of the specification and make the appropriate changes to my library.

Moving to an inheritance model has helped me improve my code a lot, but it hasn't all been a cure all. There are a few methods, namely converting between Z-characters and ZSCII which don't lend themselves easily to inheritance. Key pieces of the algorithms change between versions, which makes it difficult to implement them as virtual/override methods and leads to blocks of repeated code. This is the exception however, and the overall improvement in the code has been substantial.

I still have some refactoring to do to implement my latest design changes, but I hope to make my source code available soon.

Thursday, December 10, 2009

What the bleep?! Sounds and Music in the Z-machine

Before I get to the topic of this post, I'll tell you where I got the idea for my blog's name. When I was trying to decide on a title, I knew I wanted something which was relevant to my project, but also was reminiscent of Infocom. It didn't take long to find. After I made my choice, I felt it was particularly apt considering my goal is to recreate the Z-machine, which runs quite possibly the most 'classic' games of all, using a modern language and object oriented design principles. So how is it related to Infocom? My favorite Infocom game is 'The Lurking Horror'. In it, you play a college student at G.U.E. Tech who is trying to finish an assignment at the last minute in the campus computer lab during a winter snowstorm. You begin the game in the lab holding an assignment. If you read the assignment (which is not needed to win the game), you'll find the name of the course is "The Classics in the Modern Idiom".

Even if you are an old pro at Infocom's text adventures, sound is not likely to come to mind when thinking about them. That's because few Infocom games actually made sounds at all. Of the thirty or so games Infocom released just two had sound effects: The Lurking Horror and Sherlock. To make matters worse, only selected versions of those two games actually played sound effects. Beyond this there were nine other games which could make simple beeping sounds referred to as 'bleeps' in the Z-machine standards document. Long after the age of Infocom came to a close, the Z-machine was extended to play music as well, but we'll get to that a bit later.

All sound effect manipulation in the Z-machine, including loading, playing, stopping, and unloading is done using the seriously overloaded sound_effect opcode #245. A remark at the bottom of the table of opcodes:

The notation "5/3" for sound_effect is because this plainly Version 5 feature was used also in one solitary Version 3 game, 'The Lurking Horror' (the sound version of which was the last Version 3 release, in September 1987).

I take issue with that statement because several version 4 games also make use of the sound_effect opcode, including 'Trinity', 'A Mind Forever Voyaging', and 'Bureaucracy'. What I think the author of the standard was getting at is only one version 3 game played sampled sound effects. It would be a mistake to fail to include support for the sound_effect opcode in version 4 games.

The sound_effect opcode is rather messy to implement due to the way its operands are handled. It can take 1 to 4 operands. The first operand is the sound effect number. A value of 1 indicates a high pitched beep should be played, while 2 indicates a low pitched one. Values 3 and higher indicate sampled sound effects and require additional operands. A value of zero means all sounds and is allowed only if a second operand is provided and has a value of 3 or 4 which means stop all sounds and unload all sounds respectively (this was clarified in the 1.1 standard). If a sampled sound effect is indicated, a second operand is required. The additional operand has the following meanings: 1:load, 2:play, 3:stop, and 4:unload. Values 1 and 4 are merely hints as to what the game needs at any given time and may be implemented in any convenient way. If the second operand is 2 (play) then a third operand is required. The third operand's lower byte contains the volume to play the sound at. From version 5 onward, the upper byte contains the number of times to play the sound. In the single version 3 game with sampled sounds the sound either played once or looped forever until stopped. This looping information was contained in the sound file itself and not in the game code. Starting with version 5 a fourth operand can be supplied containing the address of a routine to be called when the sound completes but not if it is stopped prematurely by another sound being started.

In addition to the operand madness above, the behavior of playing sampled sounds can be complicated as well. Originally, it was meant to be simple: only one effect can play at a time and starting another will terminate a currently playing one. This becomes more complicated today because 'The Lurking Horror' plays several sounds in sequence. Computers at the time of the game's release would be slow enough to play each sound, but faster computers will cut off the sounds prematurely, so we need a mechanism to prevent this. Another complication is that the 1.1 update to the standard introduced the concept of 'music' as a second sound channel. Sound effects only interrupt other sound effects and music only interrupts music. When we play a sound, the channel is chosen using the file type of the sampled sound itself...yuck. It would have made far more sense to introduce a new opcode at this point, instead of complicating the behavior of sound_effect. The effort required to introduce a new opcode into existing interpreters is far less than the work required to support the new music file types needed and so would not have been an additional hardship on interpreter writers. One could make the argument that the chosen method did not require any updates to Inform to create games supporting music, but if you are going to do something, then I think it is worth doing it right.

In my implementation, I provide the full sound_effect capabilities from version 3 onward***. This includes the fourth operand which is restricted to version 5 and higher in the standard because the version 5 game Sherlock is the only Infocom game ever to use it. It also includes the 1.1 music extension which does not specify supported versions. I felt that the extra work to exclude features from version 3, then restrict them even further in version 4 only to reintroduce them in version 5 to be excessive considering all the existing games are compatible with the notion that these capabilities existed from version 3 onwards. The extra capabilities simply go unused in current version 3 and 4 games.

Next time...we'll see some code.

***Update: I've changed my mind on this (see next post).

Thursday, November 26, 2009

Happy Thanksgiving

I just wanted to say Happy Thanksgiving.

I haven't posted in a while...The last month has been crazy with family members getting the flu and work demanding much of my time.

I will soon be posting about sound and music in the Z-machine, but for now I'm just going to leave you with this question: Can you guess where I got the title for my blog?