It was Science!

Writing a Gameboy Emulator in Nim

I've always wanted to sit down and write an emulator for a gaming system. It's an excellent opportunity to learn how the hardware works and what quirks it allows, and it's a fun programming project in general. This post isn't going to detail every line of code I've written, but it contains some notes on how I structured my approach and some of the design choices I made. A big thank you to a friend of mine who helped flesh out the SM83 core code.

Source Code

You can find the code here: GitHub - itwasscience/nimboy

Hardware

Instead of detailing the hardware in this post I will just provide a link to the excellent Pan Docs reference^1.

The Gameboy uses an SM83 CPU, which is a modified version of the Z80 CPU. A few registers are missing and a few extra opcodes are available. As CPU cores go it is very straightforward to emulate, with the only "tricky" part being the opcodes that start with the 0xCB prefix byte, which are one byte longer and are decoded from a second opcode table. You can check out a full opcode list on GBDocs^1.
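To make the 0xCB handling concrete, here is a minimal sketch in Python rather than Nim. It is not the code from nimboy: the dictionary-based CPU state, the two tiny tables and the function names are all made up for illustration, and flag updates are left out.

```python
# Hypothetical sketch: only two opcodes are implemented, flags are ignored.

def op_nop(cpu):
    # 0x00: NOP does nothing and takes 4 clock ticks.
    return 4

def op_cb_swap_a(cpu):
    # 0xCB 0x37: SWAP A exchanges the high and low nibbles of A (8 clock ticks).
    a = cpu["a"]
    cpu["a"] = ((a << 4) | (a >> 4)) & 0xFF
    return 8

MAIN_TABLE = {0x00: op_nop}
CB_TABLE = {0x37: op_cb_swap_a}

def fetch(cpu, mem):
    # Read the byte at PC and advance PC, wrapping at 16 bits.
    byte = mem[cpu["pc"]]
    cpu["pc"] = (cpu["pc"] + 1) & 0xFFFF
    return byte

def step(cpu, mem):
    # Execute one instruction and return the clock ticks it took.
    opcode = fetch(cpu, mem)
    if opcode == 0xCB:
        # The prefix consumes one extra byte, which indexes the second table.
        return CB_TABLE[fetch(cpu, mem)](cpu)
    return MAIN_TABLE[opcode](cpu)
```

Keeping the prefixed opcodes in their own table means the main decoder only needs a single special case.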

Other things I found fun include the following:

  • Cartridge Paged RAM and ROM
  • Color Palettes
  • 2bpp Pixel Encoding
  • Color Support (on the Gameboy Color)
  • Audio with waveform generator
  • Cartridges may extend the capabilities of the system
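As a taste of the pixel encoding and palettes: tile data is stored at two bits per pixel, with each 8-pixel row split across two bytes (low bits first, then high bits, leftmost pixel in bit 7). The sketch below is Python, not the Nim code from the project, and the function names are my own. The palette step assumes the original Gameboy's BGP register layout of four packed 2-bit shades.

```python
def decode_tile_row(low, high):
    # Combine one bit from each byte into a 2-bit color index, left to right.
    pixels = []
    for bit in range(7, -1, -1):
        lo = (low >> bit) & 1
        hi = (high >> bit) & 1
        pixels.append((hi << 1) | lo)
    return pixels

def apply_palette(pixels, bgp):
    # BGP packs four 2-bit shades; color index n lives in bits 2n+1..2n.
    return [(bgp >> (p * 2)) & 0b11 for p in pixels]

row = decode_tile_row(0x3C, 0x7E)
print(row)                        # [0, 2, 3, 3, 3, 3, 2, 0]
print(apply_palette(row, 0xE4))   # 0xE4 maps every index to itself
```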

Language Choice

Going in, I wanted to use a statically typed, compiled language that made bit manipulation easy. Years ago I wrote an SM83 core in C but never finished my emulator. This time I decided to try the language Nim^2. I was very pleased with the syntax, the compiler and the ability to use Valgrind to profile my application.

Goals

The goal of the project was to run a ROM of Pokemon Red, the first game I owned on the Gameboy. Additionally, I wanted to follow the recently discovered operation of the Picture Processing Unit (PPU) detailed in the Ultimate Gameboy Talk^3.

Overview of Modules

I broke the application into a few modules. The diagram below shows a brief overview of how internal communication works. Essentially the entire thing is kept on a global loop where I advance the state of each module synchronously. This is not perfect but was much simpler to code.
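A global loop of this kind can be sketched roughly as follows. This is Python rather than Nim and is not lifted from nimboy: the FakeCpu and Counter classes are stand-ins I made up, and the real modules obviously do far more per tick.

```python
class Counter:
    """Stand-in module (PPU, timer, ...) that only counts the ticks it receives."""
    def __init__(self):
        self.ticks = 0

    def tick(self, cycles):
        self.ticks += cycles

class FakeCpu:
    """Stand-in CPU: pretends every instruction takes 4 clock ticks."""
    def __init__(self):
        self.executed = 0

    def step(self):
        self.executed += 1
        return 4

def run(cpu, modules, total_ticks):
    # Run one CPU instruction, then let every other module catch up
    # by the same number of ticks before the next instruction starts.
    elapsed = 0
    while elapsed < total_ticks:
        cycles = cpu.step()
        for module in modules:
            module.tick(cycles)
        elapsed += cycles
    return elapsed
```

The trade-off is that everything advances in instruction-sized steps instead of truly in parallel, which is where the "not perfect" comes from.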

Since many modules may touch memory, they all go through a shared memory module. It keeps track of which page is active and handles special operations such as flipping pages when something writes to specific locations in the ROM address range.
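Here is a rough Python sketch of that idea (again, not the project's Nim code). It assumes an MBC1-style cartridge where writing to 0x2000-0x3FFF selects the ROM page visible at 0x4000-0x7FFF. The Memory class is hypothetical, and everything from 0x8000 up is modelled as one flat array for brevity.

```python
class Memory:
    def __init__(self, rom):
        self.rom = rom
        self.rom_bank = 1               # page mapped at 0x4000-0x7FFF
        self.ram = bytearray(0x8000)    # 0x8000-0xFFFF, flat for brevity

    def read(self, addr):
        if addr < 0x4000:
            return self.rom[addr]       # page 0 is always visible here
        if addr < 0x8000:
            return self.rom[self.rom_bank * 0x4000 + (addr - 0x4000)]
        return self.ram[addr - 0x8000]

    def write(self, addr, value):
        if 0x2000 <= addr < 0x4000:
            # A write into ROM space flips the page instead of storing data.
            bank = value & 0x1F
            self.rom_bank = bank if bank != 0 else 1
        elif addr >= 0x8000:
            self.ram[addr - 0x8000] = value & 0xFF
        # Other writes below 0x8000 are ignored in this sketch.
```

Because the CPU, PPU and debugger all call the same read and write functions, none of them need to know which page is currently mapped.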

Modules

Progress Over Time

In around three weeks I made great progress. Here are some screenshots from along the way. Each "week" is around a real-time week but also around 40-50 hours of development (I really tend to overdo things when I have a passion project).

Week 1 - Debugger and SM83 Core

After the first week of work I'd learned the basics of Nim and started on the SM83 core. A friend of mine who was already writing his own 8080 CPU core in Python jumped in and fleshed out many of the more interesting opcodes while I started mapping out the modules for the rest of the system.

I wrote the debugger this week and spent some time trying to figure out how to construct the PPU logic. By the end of the week I was seeing the YouTube diagrams in my sleep.

Debugger

Week 2 - Interrupts, Timers and Pixels

This week the CPU learned to process interrupts and the global timer started running. After learning how sprites (objects in Gameboy parlance) are encoded, I got the background mapper debugger working. As the week came to a close, I started getting the screen to actually draw, albeit in a very broken fashion.
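For anyone curious what "processing interrupts" involves, here is a small Python sketch (not the Nim code from the project; the dictionary-based CPU state and the function name are hypothetical). It follows the documented behaviour: the enable register at 0xFFFF is ANDed with the request register at 0xFF0F, the lowest set bit wins, and the CPU pushes PC and jumps to a fixed vector. Waking from HALT is left out.

```python
IF_ADDR, IE_ADDR = 0xFF0F, 0xFFFF
# Bit 0..4: VBlank, LCD STAT, Timer, Serial, Joypad.
VECTORS = (0x40, 0x48, 0x50, 0x58, 0x60)

def service_interrupt(cpu, mem):
    # Returns the clock ticks spent dispatching, or 0 if nothing was serviced.
    pending = mem[IE_ADDR] & mem[IF_ADDR] & 0x1F
    if not cpu["ime"] or pending == 0:
        return 0
    for bit, vector in enumerate(VECTORS):
        if pending & (1 << bit):
            mem[IF_ADDR] &= ~(1 << bit) & 0xFF   # acknowledge the request
            cpu["ime"] = False                   # block nested interrupts
            # Push PC onto the stack, high byte first.
            cpu["sp"] = (cpu["sp"] - 1) & 0xFFFF
            mem[cpu["sp"]] = cpu["pc"] >> 8
            cpu["sp"] = (cpu["sp"] - 1) & 0xFFFF
            mem[cpu["sp"]] = cpu["pc"] & 0xFF
            cpu["pc"] = vector
            return 20
    return 0
```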

PPU Nearly Working

Week 3 - PPU Work, Profiling

The struggle with video is real. The "fetcher" solution is awesome since it provides true cycle accuracy with the real hardware, but it's complicated and very, VERY slow. Running this through Valgrind, it's apparent that the PPU is nearly as intensive as the CPU. On my i5-8600K machine I can hit 60 ticks per second, just fast enough.
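To give a feel for why this is slow, here is a heavily simplified Python sketch of the fetcher and pixel FIFO as described in the Ultimate Gameboy Talk. It is not the project's Nim code and it is not cycle accurate: it ignores sprites, the window and scrolling, and render_line and its arguments are names I made up. The fetcher takes three 2-dot steps per tile, pushes 8 pixels only when the FIFO has room, and the FIFO shifts out one pixel per dot only while it holds more than 8.

```python
from collections import deque

def render_line(tile_rows, width=160):
    """tile_rows: list of 8-pixel lists, one per background tile on this line."""
    fifo = deque()
    out = []
    fetch_dots = 0     # dots spent on the fetch currently in progress
    next_tile = 0
    fetched = None     # 8 pixels waiting to be pushed, if any
    dots = 0
    while len(out) < width:
        dots += 1
        # Fetcher: tile number, data low, data high, 2 dots each.
        if fetched is None:
            fetch_dots += 1
            if fetch_dots == 6:
                fetched = tile_rows[next_tile % len(tile_rows)]
                next_tile += 1
                fetch_dots = 0
        # Push only when the FIFO can take 8 more pixels.
        if fetched is not None and len(fifo) <= 8:
            fifo.extend(fetched)
            fetched = None
        # Shift out one pixel per dot, but only with more than 8 queued.
        if len(fifo) > 8:
            out.append(fifo.popleft())
    return out, dots
```

Every visible pixel costs at least one pass through this loop, and a full frame has 144 such lines, which adds up quickly compared to drawing whole tiles at once.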

Valgrind

TETRIS IS RUNNING! It's possible to actually complete the game and I wrote a very simple scaler in the rendering system to make it a bit easier to see.
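A very simple scaler can be as small as the Python sketch below. I'm assuming plain nearest-neighbour integer scaling here, where each pixel is repeated "factor" times in both directions. The actual Nim implementation in the renderer may differ, and the function name is only for illustration.

```python
def scale(frame, factor):
    # frame is a list of rows; each pixel becomes a factor x factor block.
    scaled = []
    for row in frame:
        wide = [p for p in row for _ in range(factor)]
        for _ in range(factor):
            scaled.append(list(wide))
    return scaled

print(scale([[1, 2], [3, 4]], 2))
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```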

Author's Note: It's a bit humbling when 200 lines of code take 60 hours to figure out.

PPU Working

Playing Tetris

Week 4 - PPU Work

There's much more to be done. Tetris doesn't use windowing, nor does it use any object masking / transparency. It also doesn't use any of the object OAM bits, such as horizontal / vertical flipping, which I haven't implemented, or the 8x16 object size.
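For reference, this is roughly what those OAM features involve, sketched in Python rather than Nim. It is a sketch of the documented behaviour, not code from nimboy: parse_oam_entry, sprite_row and the pixels_for callback are hypothetical names. Each object is four bytes (Y, X, tile, attributes), color index 0 is transparent, and in 8x16 mode the low bit of the tile number is ignored.

```python
def parse_oam_entry(entry):
    y, x, tile, attrs = entry
    return {
        "y": y - 16,                        # stored with a 16 pixel offset
        "x": x - 8,                         # stored with an 8 pixel offset
        "tile": tile,
        "behind_bg": bool(attrs & 0x80),    # bit 7: background drawn over object
        "y_flip": bool(attrs & 0x40),       # bit 6: vertical flip
        "x_flip": bool(attrs & 0x20),       # bit 5: horizontal flip
        "palette": (attrs >> 4) & 1,        # bit 4: OBP0 or OBP1
    }

def sprite_row(obj, line_in_sprite, height, pixels_for):
    """pixels_for(tile, row) is a hypothetical callback returning 8 color indices."""
    row = (height - 1 - line_in_sprite) if obj["y_flip"] else line_in_sprite
    tile = obj["tile"]
    if height == 16:
        # 8x16 objects use two consecutive tiles starting at an even index.
        tile = (tile & 0xFE) + (row // 8)
        row %= 8
    pixels = pixels_for(tile, row)
    if obj["x_flip"]:
        pixels = pixels[::-1]
    # None marks a transparent pixel that lets the background show through.
    return [None if p == 0 else p for p in pixels]
```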

All of these things must be working for the next game, Kirby's Dream Land, to be playable, and I've started work on all of them.

Week 5 - Pause for Now

It's been a wild ride. For now, I'm setting this project aside a bit to let my mind rest. I plan to come back to this project one day and finish the PPU and start working on audio emulation. After that it's just cartridge-specific implementations.

References

There's a huge list of resources available here that I strongly recommend reading: Awesome Gameboy Development