Friday, May 13, 2011

My first scripting language

I've always enjoyed tinkering with scripting languages, especially after years of working with/on Unrealscript for the Unreal Engine, and so at some point in the last few months (March 2nd was the first changelist it looks like) I decided to take a stab at creating my own language. It's been a fun challenge filled with many rewrites already and there's a tremendous amount of work left before I think I'll start seeing real benefits from the effort, but it's been quite rewarding and educational regardless.

So far the feature list is pretty mundane - simple math operations, declaring/calling functions (both 'script-only' and mapping to C++ functions), setting context through member properties on objects, etc. Up until yesterday there was no serialization either, as I was just entering expressions manually in the in-game console after every compile and sifting through the contents of my debug log to see where things went wrong (or surprisingly right). Now that I've built a reasonable foundation for parsing and execution, however, I've taken a step back to improve usability, probably the most important part being the introduction of "useful" error messages from the parser.

To keep it interesting I've been purposely building this from scratch with no significant prior knowledge of how to build a good parser/interpreter. My plan is to get to a working state that I'm happy with and then go read a few compiler design books to see where I landed, and discover a few ways to improve it further for the next attempt. This obviously doesn't lead to finished results as quickly, but it does keep it entertaining and satisfying as I find myself learning quite a bit.

typedef bool (*ParseFunc)(TString &Token, ParseContext *Context, ExprStream *Stream);
static ParseFunc StatementParsers[] =
{
 &Declare_Function::Parse,
 &Call_Function::Parse,
 &Context_Assign::Parse,
 NULL,
};

bool Parser::ParseStatement(TString &Token, ParseContext *Context, ExprStream *Stream)
{
 Token.TrimWhiteSpace();
 if (Token.Length() == 0)
 {
  // nothing to parse
  return false;
 }
 // try each statement parser in order until one accepts the token
 for (int ParserIdx = 0; StatementParsers[ParserIdx] != NULL; ParserIdx++)
 {
  if (StatementParsers[ParserIdx](Token,Context,Stream))
  {
   return true;
  }
  if (Errors.Num() > 0)
  {
   // fatal error, abort
   return false;
  }
 }
 // no parser recognized the token
 return false;
}

My first 'recursive' attempt yielded simple results very quickly (i.e. '1+2') but fell apart once I needed to start re-arranging or inserting expressions elsewhere in the stream for execution. Once I started supporting automatic type conversions and simple type deduction, things got messy very quickly, and so I scrapped that approach for a new one that attempted to parse the entire token linearly. At some point that became frustrating as well, so I ended up with the current approach, which is a strange hybrid of the two previous attempts.

     Parse: Parsing token 'func test_a()'...
     Parse: Attempting to declare function 'test_a'...
     Parse:    Parsing token '{'...
     Parse:    Parsing token 'result = "hello"'...
     Parse:    Parsing assignment statement...
     Parse:        Parse expression from 'result ', expected type 'Undefined'...
     Parse:            Add expr 'set_context: prop -> Result' at [0]
     Parse:        Add expr 'noop' at [1]
     Parse:        Parse expression from ' "hello"', expected type 'String'...
     Parse:            Add expr 'const_Str: hello' at [2]
     Parse:        Insert expr 'assign' at [0]
     Parse:    Parsing token '}'...
     Parse:    Declared function 'test_a' for type 'Environment'
     Parse: Parsing token 'test_a()'...
     Parse: Add expr 'func: test_a' at [0]
     Parse: Calling function 'test_a' of type 'Undefined'...
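
The interesting line in that log is the 'Insert expr' one: the operands get added to the stream first, and the 'assign' expression is inserted ahead of them afterwards. Here is a minimal sketch of that idea. ExprStreamSketch and its Add/Insert methods are hypothetical stand-ins rather than the real ExprStream, and expressions are plain strings purely for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the ExprStream in the post: a flat list of expressions.
struct ExprStreamSketch
{
    std::vector<std::string> Exprs;

    // Append an expression and return the index it landed at.
    std::size_t Add(const std::string &Expr)
    {
        Exprs.push_back(Expr);
        return Exprs.size() - 1;
    }

    // Insert ahead of already-parsed expressions, shifting them to the right.
    void Insert(std::size_t Index, const std::string &Expr)
    {
        Exprs.insert(Exprs.begin() + static_cast<std::ptrdiff_t>(Index), Expr);
    }
};

// Replays the add/insert order shown in the log for 'result = "hello"'.
std::vector<std::string> BuildAssignExample()
{
    ExprStreamSketch Stream;
    Stream.Add("set_context: prop -> Result");
    Stream.Add("noop");
    Stream.Add("const_Str: hello");
    Stream.Insert(0, "assign"); // the operator ends up in front of its operands
    return Stream.Exprs;
}
```

The result is a prefix-style stream (operator first, then operands), which is presumably what makes it easy to execute front to back.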

One thing that has been a surprise with building a language is how once you get all the puzzle pieces in the correct places you've accidentally unlocked a large amount of functionality. With functions for example it sort of went like this -

  • Write initial simple declaration parser
  • Debug a few test parsing cases and work out the kinks
  • Write the initial simple call function parser
  • Debug a few test cases as well, test out simple execution
  • Go back and add support for declaring and calling functions with parameters
  • Add support for return types/values
  • Debug a while, track down a case where you're corrupting the stack, debug some more
  • Holy shit, can now call functions that call functions with other functions as parameters, etc :)

Like I said, I've got a large list of work items ahead of me before I can start seeing benefits in the game, but the potential is great. I'm excited about the prospects of being able to edit behaviors on the fly, defining new item/creature/whatever types on the fly, and building the game in real-time within the game. It's definitely been a long detour from actually building the meat of the game however...

Sunday, February 6, 2011

Fun with autoexp.dat in Visual Studio

I decided to add some auto expansion for debugging various classes in my project, the simplest of which just expose the object name more or less...



DataType=<typename.elements,s>
Property=<name.elements,s>
AObject=<objname.elements,s> [Type: <objtype>]


Works well enough but doesn't look so great when you're looking at uninitialized or otherwise invalid data, at which point enter the obtuse syntax of visualizers...

TString{
preview (
#if ($e.Elements == 0) ( "null" )
#else ( [$e.Elements,s] )
)
}

Simple enough, if it's a null pointer then show "null" instead of the default error text.  Now for something slightly more entertaining...

ObjectType{
preview (
#if ($e.SuperType == 0) (
[$e.TypeName.Elements,s]
) #else (
#(
[$e.TypeName.Elements,s],
" [",
*$e.SuperType,
"] "
)
)
)
}
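
In case the visualizer syntax is hard to read, here is roughly the same rule written as plain C++. This is only a sketch of what the preview ends up displaying: ObjectTypeSketch is a hypothetical, simplified stand-in for the real ObjectType (it uses std::string where the real class uses TString), and the recursion assumes the debugger applies the same visualizer to *$e.SuperType.

```cpp
#include <string>

// Hypothetical, simplified stand-in for the ObjectType class being visualized.
struct ObjectTypeSketch
{
    std::string TypeName;
    const ObjectTypeSketch *SuperType; // null for a root type
};

// Mirrors the preview rule: a root type shows just its name, anything else
// shows "Name [<preview of super type>] ", recursing up the chain.
std::string Preview(const ObjectTypeSketch &Type)
{
    if (Type.SuperType == nullptr)
    {
        return Type.TypeName;
    }
    return Type.TypeName + " [" + Preview(*Type.SuperType) + "] ";
}
```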

Make sure you don't mess up your pairings as you'll have to restart Visual Studio and a debugging session to figure out if you got it right or not.  Oh and the documentation for the syntax apparently exists solely as examples for other types (mainly STL) in the autoexp.dat file itself.

Monday, July 26, 2010

Binary woes.

Did you know that if you mix binary reads with normal text reads on a file opened in text mode, the reads will eventually fail?  I didn't know until today and it took a while to track down the root cause.  All of my IO up to this point has been simple fputs/fputc/fgetc which has worked fine, but with the level loading I wanted to serialize the tiles array (basically the level layout) as a binary chunk in the midst of a normal text file.  Saving that worked as expected, where I end up with something like the following...

object Level_0 type=Level
  TileBinaryData:{insert lots of random bits here}
  Width=50
  Height=50
  LevelName="TestLevel"
end

I added a couple of hooks to my Reader/Writer classes that wrap around fread/fwrite, which essentially just take a pointer and a size.  Writing worked fine as I said, but about 400 bytes into reading the tiles it looked like some internal buffer in the FILE structure would run out, and fread would start to fail from then on.  That led me to believe there's a fixed limit on how big text files will successfully read, but that's not the case, as my binary data actually reduces the file size quite a bit since I'm able to be more efficient with the storage.

At any rate, adding a little 'b' to the fopen calls when appropriate seems to have solved the issues just fine.  Not a big deal once I figured it out but it was a bit frustrating to come across it in this manner.  Looks like this isn't terribly uncommon either as a quick google search yielded a few hits...

http://stackoverflow.com/questions/474733/unexpected-output-copying-file-in-c
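
For reference, here is a minimal sketch of the binary-mode round trip. On Windows a stream opened in text mode translates CR/LF pairs and treats a Ctrl-Z byte (0x1A) as end of file, which is the likely culprit once arbitrary tile bytes are in the stream. RoundTripBinary and the file name used with it are made up for illustration and are not part of my Reader/Writer classes.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Writes Bytes to Path and reads them back, both in binary mode.
// Returns true when the data survives the round trip unchanged.
bool RoundTripBinary(const char *Path, const std::vector<unsigned char> &Bytes)
{
    FILE *Out = std::fopen(Path, "wb"); // 'b' disables text-mode translation
    if (Out == nullptr)
    {
        return false;
    }
    const std::size_t Written = std::fwrite(Bytes.data(), 1, Bytes.size(), Out);
    std::fclose(Out);
    if (Written != Bytes.size())
    {
        std::remove(Path);
        return false;
    }

    FILE *In = std::fopen(Path, "rb");
    if (In == nullptr)
    {
        std::remove(Path);
        return false;
    }
    std::vector<unsigned char> ReadBack(Bytes.size());
    const std::size_t Read = std::fread(ReadBack.data(), 1, ReadBack.size(), In);
    std::fclose(In);
    std::remove(Path); // clean up the scratch file

    return Read == Bytes.size() && ReadBack == Bytes;
}
```

Feeding it bytes like 0x0D, 0x0A and 0x1A is a quick way to check that nothing is being translated or cut short.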

File IO seems like exactly the kind of API that is unforgiving, somewhat finicky to get working right, and once you're done you never touch it again.  I look forward to never having to touch it again some day.  As an upside level saving/loading is now significantly faster, and there's even further room to optimize as I add binary import/export options to the property system (currently only doing manual binary writes for the tiles).

Great success!

Level saving and loading works!  I need to fix up a few more things and optimize some obvious bits, but the core functionality is there!  I would post a screenshot to show the success but it really wouldn't look any different than a normal shot - oh well.

I'm currently exporting all level objects in a text format (the same format I've built for loading data) which seems to work fairly well.  It's fairly slow to parse (string manipulation in C++ isn't really a pleasant process, and I'm sure my string class implementation isn't helping matters much) and it takes quite a bit of space (the current 50x50 level with 15 creatures is taking about 185 KB on disk), but the upside is that it's human readable/editable/debuggable.

The basic process for saving is as follows:

- Construct a new package object
- Add current level to the root set of objects for that package
- Copy that root set to a new set of ObjectsToSave
- Start traversing all objects in the ObjectsToSave set, adding any objects they reference to ObjectsToSave such that we recurse through the entire object "tree"
- Write out a "packageinfo" block, which is really just forward declarations of objects contained in the package (useful for loading which I'll point out below)
- Iterate through the ObjectsToSave set and write out any data that differs from the current set of defaults
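
The traversal step above can be sketched like this. ObjectSketch and CollectObjectsToSave are hypothetical names, and the real version walks the property system to find references instead of a hand-maintained list, but the shape is the same: a worklist that grows while it is being walked, plus a "seen" set so cycles don't loop forever.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical minimal object: only the outgoing references matter here.
struct ObjectSketch
{
    std::vector<ObjectSketch *> References;
};

// Starting from the root set, keep appending referenced objects
// until no new ones turn up.
std::vector<ObjectSketch *> CollectObjectsToSave(const std::vector<ObjectSketch *> &RootSet)
{
    std::vector<ObjectSketch *> ObjectsToSave;
    std::set<ObjectSketch *> Seen;

    for (ObjectSketch *Root : RootSet)
    {
        if (Root != nullptr && Seen.insert(Root).second)
        {
            ObjectsToSave.push_back(Root);
        }
    }

    // Index-based loop because ObjectsToSave grows while we walk it.
    for (std::size_t Idx = 0; Idx < ObjectsToSave.size(); ++Idx)
    {
        ObjectSketch *Current = ObjectsToSave[Idx];
        for (ObjectSketch *Ref : Current->References)
        {
            // insert().second is false for anything already queued
            if (Ref != nullptr && Seen.insert(Ref).second)
            {
                ObjectsToSave.push_back(Ref);
            }
        }
    }
    return ObjectsToSave;
}
```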

Fortunately I'm able to leverage the object/property system pretty well for this, so at the highest logical point it's only 15 lines of code or so to save a set of objects.  Loading is a little more complicated at the top level since I have to do some extra parsing to know what data is incoming, but it's still pretty manageable at this point.  At any rate, the process for loading is currently:

- Construct a new package object
- Open the package and read the "packageinfo", which creates all of the objects contained in this package using the default values and the same names that they had on save.  This allows us to easily find references to other objects contained in the package when loading without having to resort to a separate fix-up phase.
- Read in the objects' data
- Hand out some post-load notifications for specific objects to fix-up any data (currently only the level object does some munging)
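
Here is a rough sketch of why creating everything by name up front avoids a fix-up phase. All of the names here (LoadedObject, ObjectRecord, LoadPackage) are invented for the example, and the records stand in for whatever the real parser produces from the text file.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical loaded object: a name plus resolved references.
struct LoadedObject
{
    std::string Name;
    std::vector<LoadedObject *> References;
};

// Hypothetical parsed form of one object block: its name plus the names it refers to.
struct ObjectRecord
{
    std::string Name;
    std::vector<std::string> ReferenceNames;
};

void LoadPackage(const std::vector<ObjectRecord> &Records,
                 std::map<std::string, LoadedObject> &Objects)
{
    // Pass 1 ("packageinfo"): create every object by name with default values.
    for (const ObjectRecord &Record : Records)
    {
        Objects[Record.Name].Name = Record.Name;
    }

    // Pass 2: read the data. Every name already resolves, even if the
    // referenced object's own data appears later in the file.
    for (const ObjectRecord &Record : Records)
    {
        LoadedObject &Object = Objects[Record.Name];
        for (const std::string &RefName : Record.ReferenceNames)
        {
            std::map<std::string, LoadedObject>::iterator Found = Objects.find(RefName);
            if (Found != Objects.end())
            {
                Object.References.push_back(&Found->second);
            }
        }
    }
}
```

std::map keeps element addresses stable as it grows, so the pointers handed out in the second pass stay valid for as long as the map is alive.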

There's also a little trickery of finding the loaded player, copying over its values to the current player, and then deleting the loaded version.  All in all it's fairly straightforward and seems to work fairly well.  It took quite a while to find all the properties that I wasn't saving out but needed to, and all the other bits of data that get initialized on level creation and now need to be handled separately.  For the most part this has helped me find older chunks of code that needed a slight refactoring anyway, which is always a good thing (although I did wonder if it would ever end on several occasions).

Woot!

Saturday, July 24, 2010

Good design reference

Listing this more so that I don't forget about it, but someone else may find it useful -

http://www.designersnotebook.com/Design_Resources/No_Twinkie_Database/no_twinkie_database.htm

Thursday, July 22, 2010

Bit in the ass...

Ran into this the hard way, and the first hit on google clearly explains the issue:
http://www.artima.com/cppsource/nevercall.html.  I can't wait until I finish this project and move on to another language with a different (hopefully more obvious) set of quirks.
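
For anyone who doesn't want to click through: the linked article is about never calling virtual functions during construction or destruction. A minimal sketch of the gotcha follows; BaseSketch and DerivedSketch are made-up types, not the ones from my project.

```cpp
#include <string>

struct BaseSketch
{
    std::string InitLog;

    // While the base constructor runs the object is still a BaseSketch,
    // so this calls BaseSketch::Init and never the derived override.
    BaseSketch() { Init(); }
    virtual ~BaseSketch() {}

    virtual void Init() { InitLog += "Base"; }
};

struct DerivedSketch : BaseSketch
{
    virtual void Init() { InitLog += "Derived"; }
};
```

Constructing a DerivedSketch leaves InitLog as "Base"; only a call to Init() made after construction reaches the override.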

Tuesday, July 20, 2010

Hey, that ain't right.

Still plugging away, although some friends at work convinced me to play some WoW again to prepare for the impending goblin invasion, so progress has been a little stilted.  At any rate, I've managed to make some headway on a few important additions, the easiest of which (and yet the one that created the most work) was writing a proper logging system.

In the past I only had in-game logging meant for game messages that I was hijacking for specific errors/warnings I cared about (syntax errors when loading data files mostly).  Given that I didn't want to flood the game UI with a bunch of other information it tended to be very specific and not terribly helpful - but now I've added a log which goes straight to disk, which has given me freedom to spread little Logf()'s all throughout my code base.  Adding some to a few key places exposed some interesting bugs that would have surely gone undetected for quite some time.

- All objects were doing duplicate work from a copy and paste error in my object definition macros (gg virtual destructors, you win).
- Level objects were being added to the tickable list twice which caused double game updates.  I don't have much in game animation yet, so it's not really noticeable at this stage but it would have been annoying once I started implementing projectiles and other effects.
- Object instance counts were incrementing on one type (correct) and decrementing on another (incorrect) for dynamic types.  Fortunately the right counter was being incremented, so I wasn't seeing any funky object naming (or name clashing) yet, but again this would inevitably cause pain at some point.
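
For the curious, a printf-style log-to-disk helper along the lines of Logf() can be sketched as below. The signature is an assumption made for the example (it takes the FILE* explicitly), and FormatLogLine is a hypothetical helper.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats printf-style arguments into a std::string.
std::string FormatLogLine(const char *Format, va_list Args)
{
    va_list Copy;
    va_copy(Copy, Args);
    const int Needed = std::vsnprintf(nullptr, 0, Format, Copy); // measure only
    va_end(Copy);
    if (Needed < 0)
    {
        return std::string();
    }
    std::string Line(static_cast<std::size_t>(Needed) + 1, '\0');
    std::vsnprintf(&Line[0], Line.size(), Format, Args);
    Line.resize(static_cast<std::size_t>(Needed)); // drop the trailing terminator
    return Line;
}

// Writes one formatted line to the log file and flushes it right away,
// so the last message before a crash still makes it to disk.
bool Logf(FILE *LogFile, const char *Format, ...)
{
    if (LogFile == nullptr)
    {
        return false;
    }
    va_list Args;
    va_start(Args, Format);
    const std::string Line = FormatLogLine(Format, Args);
    va_end(Args);

    const bool Ok = std::fputs(Line.c_str(), LogFile) >= 0 &&
                    std::fputc('\n', LogFile) != EOF;
    std::fflush(LogFile);
    return Ok;
}
```

Flushing on every call is slow, but for a debug log that trade-off is usually worth it.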

Aside from that fun I've written most of the framework I think I'll need to save level instances.  I've already added the logic to link up levels correctly to each other (stairs up match the stairs down on the other side for example) and left the hooks needed to handle other entities transitioning across levels once I get to that feature.  Hopefully in another 100 lines of code or so this game will finally be susceptible to save scumming!