Porting Games from DOS to Modern Platforms

For someone like myself who started programming long after assembly programming went out of style, the task of porting an old game from MS-DOS to modern platforms may seem impossible. The games tend to appear to be half written in assembly and, to make things worse, 16-bit assembly. If they are 16-bit, the portable C part of the program is filled with compiler-specific extensions in order to cover some of the details of the platform. On top of that, the operating system provides next to no abstractions, so it's all talking directly to hardware.

However, it turns out that this is a very achievable task. Reading assembly code isn't particularly difficult, as one can gloss over a number of details that one would need to know in order to write it. Compared to the games of today, the old game engines are actually quite small. So even if an engine appears on the surface to make heavy use of hand-tuned assembly, a sizable portion of that is probably code to talk to the hardware. The rest is typically tight rendering loops. The takeaway is that most of the assembly code will need to be rewritten anyway, and it is likely that most of the code we do care about keeping is already reasonably portable C.

Before we start, I would like to throw in a disclaimer. My experience with game porting comes down to porting the Super 3D Noah's Ark changes from the disassembly, converting some Blake Stone assembly routines to C, and porting almost all of the Catacomb 2 engine to modern systems. However, I think the information here will still be useful as a general case.

Porting Methodology

If you have ever tried to start a project like this and the original code base was written for Borland C++ 2 or 3, your first thought may have been to port it to a development environment that supports both DOS and Windows, such as Borland C++ 5.02. The idea is to port the game in little steps and hopefully keep things working correctly along the way. My recommendation, however, is to not do this, but rather to make a straight jump to the desired platform. There are a few reasons for this.

How familiar are you with DOS programming? If you're reading this, it's probably safe to assume that the answer is "not at all." Even updating the tools can break things. Will you know what to do when the executable just doesn't run? Do you know how to use the debuggers of the day? What if the error occurs before the program even starts executing? The Internet may have a lot of information, but good luck finding help with 16-bit x86 programming. On top of all this, you still need to deal with the portability issues of going from 16-bit to 32-bit, and then you need to take care of abstracting away the hardware.

To me, this sounds like a waste of time with marginal benefits. So instead I will suggest a method that I heard about in some presentation on porting Windows games to Linux: Just make it compile. Feed the C code into GCC or MSVC (I'd probably recommend GCC in this case due to better C support) and if you get an error, stub out the code. I will return to this idea in a moment, but first we need to talk about the C extensions, which means we need to talk about 16-bit memory models!

X86-16 Memory

The first thing to go over is the size of types in a typical 16-bit program. For Borland C, short and int are 16-bit, and long is 32-bit. Note that with modern 32 and 64-bit compilers int is 32-bit, and long is either 32-bit (MSVC, and GCC on 32-bit targets) or 64-bit (GCC on 64-bit Linux and similar systems). This generally does not matter, and for performance reasons I would not recommend changing every variable to use fixed-size types. However, you will want to convert structures which are read directly from disk to fixed-size types. Which leads us to what to do about pointers.

For those used to modern programming with a flat address space, the first surprise of 16-bit x86 is segmented memory. As is to be expected, in 16-bit mode a pointer is 16 bits. However, there may be more than 64KB of memory. This was handled by dividing a memory address into two 16-bit values: segment and offset. This does not result in an effectively 32-bit address space, since the segment and offset overlap: the segment is shifted left by 4 bits and added to the offset. The segment therefore only provides 4 unique bits, giving a 20-bit effective address.
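The overlap is easiest to see with a little arithmetic. The following is a minimal sketch of real-mode address translation; the function names are made up for illustration, and wraparound past the 1MB mark is ignored.

```c
#include <stdint.h>

/* Real-mode address translation: the segment is shifted left by four bits
 * (multiplied by 16) and the offset is added. Wraparound past 1MB is
 * ignored in this sketch. */
static uint32_t linear_address(uint16_t seg, uint16_t ofs)
{
    return ((uint32_t)seg << 4) + ofs;
}

/* Two segment:offset pairs alias when they name the same linear address. */
static int same_location(uint16_t seg_a, uint16_t ofs_a,
                         uint16_t seg_b, uint16_t ofs_b)
{
    return linear_address(seg_a, ofs_a) == linear_address(seg_b, ofs_b);
}
```

For example, 1234:0010 and 1235:0000 both refer to linear address 0x12350, which is why comparing segment:offset pairs field by field is not reliable.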

In Borland C there are three pointer types: near (normal), far, and huge. You may notice the far and huge keywords scattered throughout the program on pointers. The plain old pointer is a 16-bit offset value, and the effective address is determined by some assumed segment. Far pointers are 32-bit, with the lower 16 bits (assuming little endian) being the offset and the upper 16 bits being the segment. Huge pointers are an emulation of a flat address space. They are used where pointer arithmetic on a full address is needed.

While it is important to know what these keywords do, for the purposes of porting code it is adequate to simply strip the far and huge keywords. The only catch is data structures which are read from disk. If you have such a structure with pointers, you will probably want two structures: one for the on-disk representation and one for the in-memory representation. The on-disk one will need the pointers converted to an integer type: 16-bit for near pointers, or 32-bit for far and huge pointers. If you have far pointers on disk, you may find the following code useful to convert far pointers to a flat offset:

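#include <stdint.h>
// A far pointer as stored by a 16-bit program: offset first, then segment.
typedef struct { uint16_t ofs; uint16_t seg; } farptr_t;
static inline uint32_t flatptr(farptr_t ptr) { return ((uint32_t)ptr.seg<<4) + ptr.ofs; }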

For the most part, the only other time we will need to be concerned about far pointers is when dealing with assembly code. One final note about type sizes: the size of an enum is the same as an int, which is 16-bit in this case. For on-disk structures you will not be able to use the enum type and will have to use int16_t.
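To pull the points of this section together, here is a rough sketch of the two-structure approach. Everything in it is hypothetical: the record layout, the names objdisk_t, obj_t and dirtype, and especially the assumption that the flattened far pointer can be used as an offset into a loaded block of data. What the pointer really refers to depends on the file format in question.

```c
#include <stdint.h>

/* Defining the Borland keywords away lets old declarations compile as-is. */
#define far
#define huge

typedef enum { DIR_NORTH, DIR_EAST, DIR_SOUTH, DIR_WEST } dirtype;

/* Hypothetical on-disk record: 6 bytes, as a 16-bit build would write it. */
typedef struct {
    uint16_t name_ofs;  /* far pointer: the offset half comes first */
    uint16_t name_seg;  /* far pointer: the segment half */
    int16_t  dir;       /* enum, stored as a 16-bit int */
} objdisk_t;

/* In-memory record used by the ported code. */
typedef struct {
    const char far *name;  /* 'far' now expands to nothing */
    dirtype dir;
} obj_t;

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode the raw bytes field by field so that neither struct padding nor
 * host endianness can change the result. */
static objdisk_t objdisk_parse(const uint8_t raw[6])
{
    objdisk_t d;
    d.name_ofs = read_le16(raw);
    d.name_seg = read_le16(raw + 2);
    d.dir = (int16_t)read_le16(raw + 4);
    return d;
}

/* Assumption: the flattened far pointer is an offset into 'blob'. */
static obj_t obj_from_disk(const objdisk_t *d, const char *blob)
{
    obj_t o;
    uint32_t flat = ((uint32_t)d->name_seg << 4) + d->name_ofs;
    o.name = blob + flat;
    o.dir = (dirtype)d->dir;
    return o;
}
```

Keeping the on-disk structure separate means the in-memory one is free to use native pointers and enums, whatever size they happen to be on the target platform.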

Step 1 - Stub Everything

Now that we know how to handle the foreign keywords, we can resume the discussion on handling code that doesn't compile. Here's a piece of code I recommend using:

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
// Always start by having the function terminate the program.
#define FIXME { printf("FIXME: %s\n", __FUNCTION__); assert(false); }
// Use this when you know it is safe for the function to do nothing.
#define STUB { printf("STUB: %s\n", __FUNCTION__); }

When you reach some code that doesn't compile, wrap it in #if 0/#endif and then before that write FIXME. If you have an assembly function that you need to translate, write out the prototype and then put FIXME after it:

void SomeASMFunction() FIXME
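A slightly fuller sketch of how the two macros might be applied is shown here. All function names are hypothetical, and the disabled assembly lines only stand in for whatever the original body contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define FIXME { printf("FIXME: %s\n", __FUNCTION__); assert(false); }
#define STUB { printf("STUB: %s\n", __FUNCTION__); }

/* Hypothetical assembly routine, not translated yet: prototype plus FIXME. */
void SetPaletteASM(void) FIXME

/* Hypothetical routine that is safe to skip for now. */
void WaitForRetrace(void) STUB

/* Hypothetical C function whose original body does not compile.
 * The old code stays in place, disabled, as a reference for the port. */
void ReadJoystick(void)
{
    FIXME
#if 0
    asm mov dx, 0x201
    asm in  al, dx
#endif
}

/* Ported code can run as long as everything it calls is at least a STUB. */
int DrawFrame(void)
{
    WaitForRetrace();
    return 1;
}
```

Note that assert compiles to nothing when NDEBUG is defined, so FIXME only terminates the program in builds that leave assertions enabled.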

The goal here is to get a binary file built. When you run it, it will terminate when it reaches a function that hasn't been ported and print the name. All you need to do from here is fix the function it tells you to, then rinse and repeat. Easy as that.

At least it would be if we didn't have to deal with assembly and code that knocks on the hardware!

Step 2 - Assembly

As I stated in the introduction, I'm not going to teach how to write assembly code. In the context of porting 16-bit applications, it's pointless. For those who are totally unfamiliar with assembly, it is a way to write machine code using mnemonics. The exact syntax and available directives vary between processor architectures and assemblers, although the mnemonics for each instruction are generally the same since they're typically modeled after the processor's specification sheet. To make matters worse, each instruction has different constraints on how you can use it. We need not concern ourselves with any of this, and can instead focus on the basic ideas and how they translate to C.

The first thing one needs to know about assembly is that the processor generally does not operate directly on main memory. Instead, the processor has a handful of registers to work with. Registers can basically be seen as variables on the processor. In x86-16 we have four general purpose registers: AX, BX, CX, and DX; four pointer registers: SI, DI, SP, and BP; and four segment registers: CS, DS, SS, and ES. The general purpose registers can be split into 8-bit registers by changing the X to H (high) or L (low). For example, AH is the upper byte of AX. There are a few other registers, but these are the ones that you will probably encounter. The classification of the registers (general purpose, pointer, and segment) doesn't matter to us. They're all 16-bit values. The classification only matters if you're writing assembly code.
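When translating to C, a register can simply become a uint16_t local variable. The high and low halves then turn into shifts and masks, as in the following sketch (the helper names are made up for illustration):

```c
#include <stdint.h>

/* One 16-bit register viewed as a C value. */
typedef uint16_t reg16;

/* Reading AH or AL out of AX. */
static uint8_t reg_high(reg16 x) { return (uint8_t)(x >> 8); }
static uint8_t reg_low(reg16 x)  { return (uint8_t)(x & 0xFF); }

/* Writing AH or AL changes only that half of AX. */
static reg16 set_high(reg16 x, uint8_t h)
{
    return (reg16)((x & 0x00FF) | (h << 8));
}

static reg16 set_low(reg16 x, uint8_t l)
{
    return (reg16)((x & 0xFF00) | l);
}
```

The thing to watch for is code that loads AL and AH separately and then uses AX as a whole, since the combined value is easy to miss when reading line by line.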

Before we look at function code, let's first examine how global variables are stored or referenced.

; External global variable.
EXTRN tileptr:WORD

; Uninitialized variable (uint16_t LastRnd;)
LastRnd      dw ?
; This is an uninitialized array (uint16_t RndArray[17];)
RndArray     dw 17 dup (?)

; This is an initialized array (uint16_t baseRndArray[] = { ... };)
baseRndArray dw 1,1,2,3,5,8,13,21,54,75,129,204
             dw 323,527,850,1377,2227

The syntax here should be fairly self-explanatory. The variable is given a name, then a size (db = 8 bits, dw = 16 bits, dd = 32 bits), and finally some initial value or '?' for uninitialized. The uninitialized array uses shorthand notation (dup) to specify 17 uninitialized values. You may see other directives, but it should be fairly obvious what they do. One thing to note here is that there's no signed or unsigned. We will have to infer that from how the program code uses the variable.

Let's take a look at some code. An assembly instruction typically looks like a mnemonic followed by some number of comma separated arguments. What types of arguments are accepted depends on the addressing modes supported by the instruction. Since we're not writing assembly code, we don't need to worry about knowing what modes are available. If the instruction takes arguments, the destination is typically the first value and the operands follow. Knowing this the following example should be easy to follow:

mov ax, [myvar]
shl ax, 4
add ax, 10
mov [myvar], ax

What this does is load the global "myvar" into the ax register, multiply it by 16 (left shift by 4), add 10, and then store it back into memory. One thing that may be tricky for a C developer is how arrays work in assembly. There really is no such thing as an array in assembly, just consecutive addresses. The important thing to know here is that addresses count bytes, so if you have an array of something other than bytes, moving to the next element means incrementing the address by sizeof(array[0]) rather than by 1. You might have already been aware that C does that behind the scenes.
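A line-by-line C rendering of that snippet might look like the following sketch. It assumes myvar is an unsigned 16-bit global; the table array and the load_word helper are invented here to illustrate the byte-offset point.

```c
#include <stdint.h>

static uint16_t myvar;

static void update_myvar(void)
{
    uint16_t ax = myvar;        /* mov ax, [myvar] */
    ax = (uint16_t)(ax << 4);   /* shl ax, 4   (bits shifted past 16 are lost) */
    ax = (uint16_t)(ax + 10);   /* add ax, 10 */
    myvar = ax;                 /* mov [myvar], ax */
}

/* Assembly indexes memory by byte offset, C by element. A word access
 * such as [table+bx] therefore corresponds to table[bx / 2]. */
static uint16_t table[4] = { 100, 200, 300, 400 };

static uint16_t load_word(uint16_t bx)
{
    return table[bx / sizeof(table[0])];
}
```

Once the translation is known to work, the body of update_myvar can of course be collapsed to a single expression. Keeping the 16-bit type matters, because the original code may rely on the value wrapping around.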

One thing I would like to point out about memory accesses: you may occasionally see some constant arithmetic (or sometimes arithmetic with a register) in the brackets. It should be fairly obvious what that means. What may not be obvious is what [WORD cs:bx] means. Recall the previous topic of segmented memory. The segment registers are typically used to hold a segment address, and then SI, DI, BX, or BP is used to store the offset. In general, the segment register can be ignored for the purposes of porting, so it is typically correct to treat that as [WORD bx]. Just know that the 32-bit addresses are going to be split into two 16-bit registers.
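One way to picture a segment-qualified word access in C is as a read or write at a byte offset into some buffer, where the buffer plays the role of the segment. This is only a sketch; load16 and store16 are hypothetical helpers, not something taken from any of the games mentioned.

```c
#include <stdint.h>

/* mov ax, [WORD es:di+2]  becomes  ax = load16(buffer, di + 2);
 * The segment register selects the buffer and the rest is a byte offset.
 * x86 is little endian, so the low byte comes first. */
static uint16_t load16(const uint8_t *segment_base, uint16_t offset)
{
    return (uint16_t)(segment_base[offset] | (segment_base[offset + 1] << 8));
}

/* mov [WORD es:di+2], ax  becomes  store16(buffer, di + 2, ax); */
static void store16(uint8_t *segment_base, uint16_t offset, uint16_t value)
{
    segment_base[offset]     = (uint8_t)(value & 0xFF);
    segment_base[offset + 1] = (uint8_t)(value >> 8);
}
```

If the buffer is already declared as an array of 16-bit values in the C code, plain array indexing does the same job as long as the offset is divided by two.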

Hopefully, it's starting to seem like assembly porting isn't too bad after all. For the most part this is true. The code that works directly with hardware may be difficult to follow, but it can usually be solved by doing some research on how the hardware worked (and it's typically possible to find programming guides). If that fails, since we don't care about running on the old hardware, it may be possible to redo the function from scratch by examining the inputs and not worrying about the how.

All is fine and dandy until we come across self-modifying assembly code. In case you're not familiar with self-modifying code, the idea is that since the number of registers on x86 is low, it may be faster to build executable code on the fly and substitute in new constant values. The idea in itself isn't bad; the issue is that it's not really possible to translate such code into C line by line. From what I've seen, self-modifying code is usually produced in C. Here's an excerpt from Wolf3D's WL_SCALE.C:

//
// mov al,[si+src]
//
*code++ = 0x8a;
*code++ = 0x44;
*code++ = src;

Please refer to the complete code to get a full understanding, but what we have here is the machine code for the instruction "mov al,[si+<const>]". The code inserts that instruction into a byte array which will then be executed later as a function. This kind of code can't be translated literally, so you will need to identify what the generated functions are doing and write a generic version of it. Self-modifying code can be done in assembly as well, as can be seen in Catacomb 3D's C3_SCA_A.ASM:

mov  bx,[endtable+bx]
push [cs:bx]              ;save the code that will be modified over
mov  [WORD cs:bx],0d18eh  ;mov ss,cx
push [cs:bx+2]            ;save the code that will be modified over
mov  [WORD cs:bx+2],90c3h ;ret / nop
push bx

The only difference here is that the old code is stored on the stack (to be popped later) and two bytes are written at a time.
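To give an idea of what a generic replacement can look like, here is a sketch of a column scaler written as an ordinary loop. It assumes the generated routines each copy one column of texture data to the screen at a fixed scale. It is not the algorithm from either game, and the function name and parameters are made up.

```c
#include <stdint.h>

/* Stretch or shrink one column of src_height texels to dest_height pixels.
 * dest_pitch is the distance in bytes between vertically adjacent pixels. */
static void scale_column(const uint8_t *src, int src_height,
                         uint8_t *dest, int dest_height, int dest_pitch)
{
    uint32_t step, pos = 0;
    int y;

    if (src_height <= 0 || dest_height <= 0)
        return;

    /* 16.16 fixed point: how far to advance in the source per output pixel. */
    step = ((uint32_t)src_height << 16) / (uint32_t)dest_height;

    for (y = 0; y < dest_height; y++) {
        dest[y * dest_pitch] = src[pos >> 16];
        pos += step;
    }
}
```

Where the original generated a separate routine for each size, the generic version takes the size as a parameter. On modern hardware the loop overhead this reintroduces is unlikely to matter.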

Step 3 - Hardware

I briefly mentioned working with hardware in the assembly section. Code that does this is usually easy to recognize due to using the functions inportb (read) and outportb (write). The nice thing is if the hardware in question is the OPL chip (or similar) these calls probably correspond to some function in the emulator interface. Ultimately if all else fails, look for some programming information on the hardware and try to translate the calls with that information.
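As a sketch of what that can look like for an AdLib-style OPL chip, replacement versions of the two functions can forward port writes to an emulator. AdLib cards use port 0x388 to select a register and port 0x389 to write data to it. The opl_write_reg function is a stand-in for whatever call your emulator provides; here it only records the value so the example stays self-contained.

```c
#include <stdint.h>

/* Hypothetical emulator hook: a real port would call into its OPL emulator. */
static uint8_t opl_regs[256];

static void opl_write_reg(uint8_t reg, uint8_t value)
{
    opl_regs[reg] = value;
}

static uint8_t opl_selected_reg;

/* Replacement for outportb(): route known ports, ignore the rest. */
static void outportb(int port, unsigned char value)
{
    switch (port) {
    case 0x388: opl_selected_reg = value; break;                 /* register select */
    case 0x389: opl_write_reg(opl_selected_reg, value); break;   /* data write */
    default:    break;                                           /* unknown hardware */
    }
}

/* Replacement for inportb(). Always returning 0 is a placeholder; code
 * that polls the OPL status register may need something smarter. */
static unsigned char inportb(int port)
{
    (void)port;
    return 0;
}
```

This keeps the original sound code untouched at first, which is handy for getting something audible quickly. The port accesses can be replaced with direct emulator calls later.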

The real catch is hardware like the video chip (be it CGA, EGA, or VGA). The display buffer for these cards is mapped into memory. Literally, the way to display something is to declare a pointer to some constant address and then write to it. This has some implications when translating to modern hardware. For example, the game probably expects that stuff in video memory just shows up on the screen, which typically isn't the case these days. You'll probably need to scatter calls throughout the program in order to handle getting the display buffer to the screen.
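A minimal sketch of this idea, assuming the game uses the linear 320x200 256-color VGA mode (planar EGA and CGA layouts need more work): an ordinary array stands in for video memory, and a present function converts it to 32-bit pixels. The names vga_memory, vga_palette and present_frame are invented for this example.

```c
#include <stdint.h>

#define SCREEN_W 320
#define SCREEN_H 200

/* Stands in for the memory-mapped buffer at segment A000. */
static uint8_t vga_memory[SCREEN_W * SCREEN_H];

/* 256 palette entries of three 6-bit components (0..63), as the VGA stores them. */
static uint8_t vga_palette[256 * 3];

/* Expand a 6-bit palette component to 8 bits. */
static uint8_t dac_to_8bit(uint8_t c)
{
    return (uint8_t)((c << 2) | (c >> 4));
}

/* Convert the indexed frame to 0x00RRGGBB pixels. A port would call this
 * wherever the original expected the picture to simply appear, and then
 * hand the result to its windowing or graphics library. */
static void present_frame(uint32_t *out)
{
    int i;
    for (i = 0; i < SCREEN_W * SCREEN_H; i++) {
        const uint8_t *rgb = &vga_palette[vga_memory[i] * 3];
        out[i] = ((uint32_t)dac_to_8bit(rgb[0]) << 16) |
                 ((uint32_t)dac_to_8bit(rgb[1]) << 8) |
                  (uint32_t)dac_to_8bit(rgb[2]);
    }
}
```

Pointers that used to be initialized to the video segment can then be pointed at vga_memory instead, leaving most of the drawing code alone.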

Secondly, I would like to mention that if the game runs at 320x200 like most did, the refresh rate for the screen is 70Hz. The game logic is probably tied to the vsync signal, so if the engine doesn't specify a tic rate, that is probably what it is.
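Since a modern display will rarely refresh at exactly 70Hz, one option is to derive the tic count from a millisecond clock instead. The sketch below assumes one game tic per 70Hz retrace; the function names are hypothetical and the source of the millisecond value is left to whatever timer your platform provides.

```c
#include <stdint.h>

#define TIC_RATE 70  /* assumed: one game tic per 70Hz vertical retrace */

/* Number of whole tics contained in elapsed_ms milliseconds. */
static uint32_t tics_elapsed(uint32_t elapsed_ms)
{
    return (uint32_t)(((uint64_t)elapsed_ms * TIC_RATE) / 1000);
}

/* How many tics to simulate now, given how many have already been run. */
static uint32_t tics_to_run(uint32_t elapsed_ms, uint32_t tics_done)
{
    uint32_t target = tics_elapsed(elapsed_ms);
    return target > tics_done ? target - tics_done : 0;
}
```

Computing the total from the elapsed time, rather than adding up per-frame deltas, keeps rounding errors from accumulating.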

Hardware is a really complex subject, so I can't possibly cover all the details. Look up any documentation on the hardware you can find and learn how it expects things to be formatted. (Usually the game data is formatted for the hardware.) If all else fails, try to work from the inputs to the function and write an implementation from scratch.

Conclusion

Hopefully this brief overview of porting DOS games has provided some useful information. If assembly still seems to be difficult, I would recommend getting a copy of IDA Free and trying to map out the original binary. Mapping out the binary just means going through it and assigning the proper names to the functions and variables. If you're like me, you might find disassembling stuff oddly addictive.

Good luck!