Apologies to the non-coders who view this post: these are pretty much all screenshots of code, as I have not been focusing on testing and fixing many games this year. It's important to remember that I view decaf-emu as an experiment in software engineering; it is my playground to have fun, test new programming ideas and write some quality code. That said, after Part 2 I'm sure we're all a bit bored of screenshots of game menus.
I updated the Windows build to also use CMake - this is great because it creates less of a maintenance nightmare. At the time both Windows devs (Brett and I) were not working on the project, so any change achurch or others pushed could break the Windows build. So instead of having to maintain two separate build systems I did the obvious thing and started working on a unified CMake build.
With my newfound knowledge of CMake I decided to port my Wii U toolchain wut to CMake as well. This should make it much easier to write Wii U homebrew - the old Makefiles were a confusing mess! The benefit of this for decaf-emu is that we can integrate test homebrew into the decaf CMake build system.
One of my pet projects that I had been trying to do for a long time was to rewrite our filesystem emulation from the ground up to be much closer to the original implementation in coreinit.rpl. This was a nightmare to reverse, especially the FS State Manager, which I still don't fully understand!
I really jumped down the rabbit hole when rewriting the filesystem - I ended up reversing it all the way to the IOS layer, where it communicates over IPC with the ARM chip in the Wii U. This is actually a very important change: having IOS emulation allows for some exciting stuff in the future. For example, many system libraries use IOS to communicate with hardware devices, which means we would be able to do cool stuff like USB or Bluetooth passthrough for supporting peripherals such as Wiimotes. It also opens the door for low level emulation of system libraries which rely on IOS.
The actual control flow of a filesystem operation is quite complex - that's why writing good comments is useful. I'd never remember this if I came back to it later without a written explanation.
Obviously this meant I had to rewrite every single filesystem command, reversing the serialisation to the IOS IPC buffers and then implementing the command over in the IOS FSA driver. There are a lot of commands!
At the same time I was also trying to help achurch with debugging the last remaining issues in libbinrec, the amazing new JIT he wrote for decaf-emu. It's an incredible effort: it decompiles PPC to an internal representation, then performs optimisation passes over it to generate some beautifully optimised x64. I was very impressed by the comprehensive test suite he wrote for his new library, there were 3296 tests!
One last thing I wanted to make sure worked before merging the JIT was debugging. I modified our breakpoints to write actual trap instructions and then invalidate the JIT code cache in the region where the breakpoint was placed, so that when the JIT recompiles the function it will see the tw instruction and emit a handler which processes the breakpoint and returns control to the debugger. The debugger will fall back to the interpreter for stepping through instructions, or when properly resumed will continue with the JIT.
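For the curious, here is a rough sketch of the idea in C++. It is not the actual decaf-emu code: GuestMemory, JitCodeCache and Breakpoints are made-up names for illustration, and guest memory is modelled as a flat big-endian byte array. The only hard fact in it is the encoding of the unconditional PowerPC trap, tw 31, 0, 0, which is 0x7FE00008.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// PowerPC encoding of `tw 31, 0, 0`, the unconditional trap.
constexpr uint32_t kTrapInstruction = 0x7FE00008u;

// Hypothetical stand-in for guest memory: a flat big-endian byte array.
struct GuestMemory {
   std::vector<uint8_t> bytes;

   uint32_t read32(uint32_t addr) const {
      assert(static_cast<size_t>(addr) + 4 <= bytes.size());
      return (static_cast<uint32_t>(bytes[addr]) << 24) |
             (static_cast<uint32_t>(bytes[addr + 1]) << 16) |
             (static_cast<uint32_t>(bytes[addr + 2]) << 8) |
             static_cast<uint32_t>(bytes[addr + 3]);
   }

   void write32(uint32_t addr, uint32_t value) {
      assert(static_cast<size_t>(addr) + 4 <= bytes.size());
      bytes[addr] = static_cast<uint8_t>(value >> 24);
      bytes[addr + 1] = static_cast<uint8_t>(value >> 16);
      bytes[addr + 2] = static_cast<uint8_t>(value >> 8);
      bytes[addr + 3] = static_cast<uint8_t>(value);
   }
};

// Hypothetical JIT code cache: just a list of compiled guest ranges.
struct JitCodeCache {
   struct Block { uint32_t start; uint32_t size; };
   std::vector<Block> blocks;

   // Drop every compiled block that overlaps [addr, addr + size).
   void invalidate(uint32_t addr, uint32_t size) {
      blocks.erase(std::remove_if(blocks.begin(), blocks.end(),
                                  [=](const Block &b) {
                                     return b.start < addr + size &&
                                            addr < b.start + b.size;
                                  }),
                   blocks.end());
   }
};

class Breakpoints {
public:
   Breakpoints(GuestMemory &memory, JitCodeCache &cache) :
      mMemory(memory), mCache(cache) {}

   // Save the original instruction, patch in a trap, flush stale JIT code.
   bool add(uint32_t addr) {
      if (mSaved.count(addr)) {
         return false;
      }
      mSaved[addr] = mMemory.read32(addr);
      mMemory.write32(addr, kTrapInstruction);
      mCache.invalidate(addr, 4);
      return true;
   }

   // Restore the original instruction and flush the JIT code again.
   bool remove(uint32_t addr) {
      auto itr = mSaved.find(addr);
      if (itr == mSaved.end()) {
         return false;
      }
      mMemory.write32(addr, itr->second);
      mCache.invalidate(addr, 4);
      mSaved.erase(itr);
      return true;
   }

private:
   GuestMemory &mMemory;
   JitCodeCache &mCache;
   std::unordered_map<uint32_t, uint32_t> mSaved;
};
```

The important part is that the cache is invalidated on both add and remove, otherwise the JIT would keep running code compiled from the old instruction.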
I decided to start playing around with low level emulation of system libraries. Previously we had to reimplement every single library by hand in C++, function by function, which is a slow and tedious process. It would also be a very bad approach in the long term, as some system libraries are simply drivers around USB devices. Now that we have IOS support it would be much better to allow USB passthrough to these LLE libraries rather than have to implement our own USB drivers! The LLE approach is only suitable for libraries which are not performance critical and which do not directly use MMIO. This excludes the important libraries coreinit, gx2 and snd_core. However, erreula.rpl is a great candidate - it's a system library whose sole purpose is to display error messages on the screen, and now that it is LLE'd, Bit Trip Runner 2 shows an error message instead of a black screen.
I tried playing around with LLE of swkbd.rpl, and now the Wii U system keyboard almost works in Xenoblade. There is lower level functionality it relies on which I have not implemented yet, so it's not usable, but it's another great candidate for LLE. It's also cool to see the real Wii U keyboard in decaf-emu!
A big part of emulation is figuring out the system internals, so I finally got around to writing a much nicer RPC server/client for the Wii U which lets me easily prod at Wii U functions from Python. Thanks to all the recent strides in Wii U homebrew it's much easier now than it was before. No more messing around with browser exploits: I can simply compile a .rpx using my toolchain, upload it to my console via FTP, and run it from the Homebrew Launcher!
My IOS IPC reversing days are far from over. After figuring out the IOS device /dev/fsa for the filesystem, I have now moved on to reversing /dev/mcp for the MCP_ family of functions. These were much harder to reverse but I have made some progress. MCP is responsible for title management (installed games, patches and DLC).
After reworking the debugger for the new JIT and trying to debug some issues in some games, I found myself cross referencing our debugger and the same code in IDA, so I thought why not merge the two and expose decaf-emu's debugger as a GDB stub which IDA can connect to. Hopefully this will be useful for some people, but I also found it rewarding because it allowed me to clean up all the debugger code which I have wanted to refactor for a long time!
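If you have never looked at what a GDB stub speaks on the wire, the framing of the GDB Remote Serial Protocol is very simple: each packet is sent as $payload#xx, where xx is the sum of the payload bytes modulo 256, written as two hex digits. Below is a minimal sketch of that framing in C++. The function names are invented for this example and are not decaf-emu's, and it deliberately ignores the protocol's escaping of special characters and run-length encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Checksum used by the GDB remote protocol: payload bytes summed modulo 256.
uint8_t gdbChecksum(const std::string &payload) {
   unsigned sum = 0;
   for (unsigned char c : payload) {
      sum += c;
   }
   return static_cast<uint8_t>(sum & 0xFF);
}

// Wrap a payload as "$payload#xx".
std::string gdbEncodePacket(const std::string &payload) {
   char suffix[4];
   std::snprintf(suffix, sizeof(suffix), "#%02x",
                 static_cast<unsigned>(gdbChecksum(payload)));
   return "$" + payload + suffix;
}

// Convert one hex digit to its value, or -1 if it is not a hex digit.
int hexValue(char c) {
   if (c >= '0' && c <= '9') return c - '0';
   if (c >= 'a' && c <= 'f') return c - 'a' + 10;
   if (c >= 'A' && c <= 'F') return c - 'A' + 10;
   return -1;
}

// Unwrap "$payload#xx"; returns false if the framing or checksum is wrong.
bool gdbDecodePacket(const std::string &packet, std::string &payload) {
   const size_t size = packet.size();
   if (size < 4 || packet[0] != '$' || packet[size - 3] != '#') {
      return false;
   }
   const int hi = hexValue(packet[size - 2]);
   const int lo = hexValue(packet[size - 1]);
   if (hi < 0 || lo < 0) {
      return false;
   }
   std::string body = packet.substr(1, size - 4);
   if (gdbChecksum(body) != static_cast<uint8_t>((hi << 4) | lo)) {
      return false;
   }
   payload = body;
   return true;
}
```

For example, gdbEncodePacket("OK") produces $OK#9a, since 'O' (0x4F) plus 'K' (0x4B) is 0x9A. A real stub also has to send + or - acknowledgements and dispatch on the first character of the payload, which is where most of the work is.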
I finally started looking into the issue with the new Zelda title: an allocation was failing due to us not providing the game with enough memory. A simple hack is to increase the memory available to games, but that is not the correct approach. I ended up rewriting the whole virtual memory management and physical memory mapping from scratch to be much more accurate to how a real console works. In line with the low level IOS work I've been doing lately, I decided to jump down that rabbit hole once again. I ended up implementing the virtual memory stuff as kernel syscalls, as they are on the console, and used my new Python RPC tool to prod at the actual virtual memory map on the console to reproduce the same mappings in decaf. Then I decided to re-evaluate our approach to accessing virtual memory from HLE code and ended up trying to write a much safer way of using it, which is what is shown in this screenshot. Sorry for those people who don't know C++ ;D.
I decided to bring my previous work with CMake together and integrate building the HLE unit tests with my toolchain wut from inside the decaf-emu tree. It even works from inside the Visual Studio solution!
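To give a flavour of what "safer" can mean here, below is a small C++ sketch of a typed guest pointer. It is an illustration under my own assumptions and not the code from the screenshot: GuestAddressSpace and GuestPtr are hypothetical names, and the address space is a single contiguous mapping. The idea is that HLE code never touches raw host pointers: every access is bounds checked against the mapping and converted from the guest's big-endian byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical guest address space: one mapping starting at a base address.
class GuestAddressSpace {
public:
   GuestAddressSpace(uint32_t base, size_t size) : mBase(base), mBytes(size) {}

   // True if [addr, addr + size) lies entirely inside the mapping.
   bool contains(uint32_t addr, size_t size) const {
      return addr >= mBase && size <= mBytes.size() &&
             static_cast<size_t>(addr - mBase) <= mBytes.size() - size;
   }

   // Translate a guest address to a host pointer, checking bounds first.
   uint8_t *translate(uint32_t addr, size_t size) {
      assert(contains(addr, size));
      return mBytes.data() + (addr - mBase);
   }

private:
   uint32_t mBase;
   std::vector<uint8_t> mBytes;
};

// Typed pointer into guest memory, values are stored big-endian.
template<typename T>
class GuestPtr {
   static_assert(std::is_unsigned<T>::value,
                 "this sketch only handles unsigned integer types");

public:
   GuestPtr(GuestAddressSpace &space, uint32_t addr) :
      mSpace(&space), mAddr(addr) {}

   uint32_t address() const { return mAddr; }

   // Read a big-endian value byte by byte, independent of host endianness.
   T load() const {
      const uint8_t *p = mSpace->translate(mAddr, sizeof(T));
      T value = 0;
      for (size_t i = 0; i < sizeof(T); ++i) {
         value = static_cast<T>((value << 8) | p[i]);
      }
      return value;
   }

   // Write a value in big-endian byte order.
   void store(T value) const {
      uint8_t *p = mSpace->translate(mAddr, sizeof(T));
      for (size_t i = 0; i < sizeof(T); ++i) {
         p[i] = static_cast<uint8_t>(value >> (8 * (sizeof(T) - 1 - i)));
      }
   }

   // Pointer arithmetic in units of T, like a normal pointer.
   GuestPtr operator+(uint32_t count) const {
      return GuestPtr(*mSpace, mAddr + count * static_cast<uint32_t>(sizeof(T)));
   }

private:
   GuestAddressSpace *mSpace;
   uint32_t mAddr;
};
```

Because a GuestPtr carries a guest address rather than a host pointer, it cannot be accidentally dereferenced or mixed up with host memory, and an out of range access trips the check in translate instead of silently corrupting the emulator. In this sketch that check is only an assert, so it disappears in release builds.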
A new game boots.
After unifying the test building, I can now use CTest to actually execute the tests. A great step forward for automated testing of the emulator, now we just need to write thousands of tests... simple! And also write a framework to get these tests running automatically on console so we can match up results to verify behaviour.
Playing around with a POC of a Qt based PM4 graphical debugger tool. This tool would allow you to inspect the state of the latte GPU and would function similarly to how graphical debuggers work for OpenGL / DirectX etc. This is a significant amount of work however, and I'm not sure how valuable the gains are. I feel effort is better spent on gx2 unit tests and hardware testing.
Another prime example of games with full function symbols included being very useful in the development process. Here I was working on a library for wut to allow easy loading of GFD files (which are files that can store shaders and textures). This is the first step towards easier use of GX2 from homebrew (and more importantly, graphical tests for decaf!).
Future

So what's the future of decaf-emu going to look like? At this point our weakest area is the GPU emulation, so most of my ideas are around that. Some of the things I have been thinking about are:
- Fix the mysterious file system bug - ever since I rewrote the file system, several games have been failing in file system related functions for no apparent reason. This includes most (all?) Unity engine games (Meme Run!!!), and is also seen in sound loading functions for several games which presumably share the same library code.
- Improve PM4 replay recording - fix texture size bug (now fixed), reduce file size of replays by supporting memory diffs.
- PM4 replay GUI.
- Vulkan graphics backend - it's good to write a new backend from scratch using the knowledge we've gained, hopefully it will clean out some of the organically grown mess from the GL backend.
- latte shader assembler, super important for homebrew and for GX2 tests.
- GX2 graphics unit tests, with results dumped from console to compare to decaf.
- Something like dolphin's FifoCI using our PM4 replay tool.
- Improve the interface between gx2.rpl and our GPU emulation. This involves another one of those low level rabbit holes I seem to enjoy diving down lately. More accurate emulation of the GPU ringbuffer control via tcl.rpl, and hardware display control via dc.rpl and avm.rpl.
- More code documentation!
- More HLE unit tests.
- Framework for running HLE unit tests on console and comparing results with decaf-emu.
- Figure out ways to improve the development infrastructure to help new and existing developers.