I'm an electronics engineer by profession. Unlike IT engineers, we're used to programming in assembly language for whatever processor, microcontroller or PIC we're dealing with. Nobody works from a raw hex dump unless they are a masochist. It's easier to feed the hex code into a monitor programme so that it is displayed as assembler, in this case for the 80x86.
I've been programming in assembly language since 1977, starting with the Signetics 2650 processor on my first computer and moving on to the Z80, 6502, 6800, 68000, 8088, 8086, 80x86, 8031, 8049, 8051, 80251, PIC, etc., and whatever new processors appear in the future. It has become second nature and a mindset. It's why electronics and computer engineers know themselves to be superior in skills to IT engineers.
I've programmed the PC in assembly language a few times in the past, including graphics displaying moving objects, creating multiple nested arrays, etc. without using C or C++.
C++ writes more than 20 times the amount of actual code needed to perform the task, due to the way it compiles. This is due to the included libraries of background function code it needs to add. All high level language compilers do this. It's a trade-off between simplifying software creation and cluttering up memory space when the code is executed. The added routines will be things like keyboard scanning, mouse tracking, screen display routines, IO routines, etc., as the creators of C++ assume that the majority of coders are (A) too stupid to do something as simple as write a few bytes of assembly code to track a mouse or (B) too lazy to do so even if they know how.
Clues to look for when reverse engineering somebody else's assembly code on the PC:
Any call to INT 27 preceded by registers being loaded with values, either immediate or from a memory location, will be a graphic pixel display call. This will typically sit inside a loop of some description.
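If the monitor programme can dump its disassembly to a text file, the hunt for that clue can be partly automated. What follows is only a rough Python sketch, not a finished tool. It assumes a hypothetical listing format of one instruction per line (for example "mov cx, [x_pos]"), takes the INT 27 vector named above at face value, and uses made-up names (find_int_calls, lookback, x_pos, y_pos) purely for illustration.

```python
import re

# Assumed listing format: one instruction per line, optional ";" comment.
INT_RE = re.compile(r"^\s*int\s+([0-9a-f]+)h?\s*$", re.IGNORECASE)
MOV_RE = re.compile(r"^\s*mov\s+(\w+)\s*,\s*(.+?)\s*$", re.IGNORECASE)


def strip_comment(line):
    """Drop anything after a ';' so comments do not confuse the patterns."""
    return line.split(";", 1)[0]


def find_int_calls(lines, vector, lookback=6):
    """Return (line_index, {register: operand}) for each 'int <vector>'.

    Only the 'lookback' instructions before the call are inspected, and the
    scan stops early if it runs into an earlier INT instruction.
    """
    code = [strip_comment(line) for line in lines]
    hits = []
    for i, line in enumerate(code):
        m = INT_RE.match(line)
        if not m or m.group(1).lower() != vector.lower():
            continue
        loads = {}
        for prev in reversed(code[max(0, i - lookback):i]):
            if INT_RE.match(prev):
                break  # loads before an earlier INT belong to that call
            mm = MOV_RE.match(prev)
            if mm:
                # Keep the load nearest the call for each register.
                loads.setdefault(mm.group(1).lower(), mm.group(2))
        hits.append((i, loads))
    return hits


listing = [
    "mov cx, [x_pos]   ; hypothetical pixel X",
    "mov dx, [y_pos]   ; hypothetical pixel Y",
    "mov al, 0Fh",
    "int 27",
    "inc cx",
]
for index, loads in find_int_calls(listing, "27"):
    print(index, loads)
```

Each hit gives the line index of the call and the register loads leading up to it, which is a starting point for working out what each register means. It only recognises plain MOV loads; anything cleverer (LEA, XOR to zero a register, values left over from a loop) still needs a human eye.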
As the SFC game's code was created before Core processors, with their dedicated ALUs for interrupt calls, IO calls, subroutines, etc., there will be a lot of register stacking going on, to and from memory. This is why running the game on a Core processor doesn't give any speed increase: the software is still using up clock cycles stacking registers instead of delegating to the dedicated assistant ALUs and registers.
If you run pre-Core compiled code on a Core processor you can still witness a stack overflow error, something impossible with code created purely to run on a Core processor.
You also cannot run code compiled for a Core processor on a pre-Core processor.
Look for registers being stacked and then unstacked, in sequential and reverse sequential order, to and from the stack space reserved in the code's memory map. This will indicate a subroutine's start and termination points. Whatever happens in between the stacking and unstacking is worth investigating.
A typical sequence of events would be: Stack all processor registers -> load registers for pixel coords, colour, brightness, duration -> call INT 27 -> reload all processor registers from stack.
On a Core processor running Core code: Load Interrupt registers for pixel coords, colour, brightness, duration -> INT 27.
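The stack / work / unstack pattern in the pre-Core sequence is regular enough to search for mechanically. Here is a rough Python sketch, again assuming a hypothetical one-instruction-per-line text listing with no comments; the names find_saved_regions and _run are invented for the example. It looks for a run of PUSH instructions followed later by a run of POPs of the same registers in reverse order, and reports the lines in between as the part worth investigating.

```python
import re

PUSH_RE = re.compile(r"^\s*push\s+(\w+)\s*$", re.IGNORECASE)
POP_RE = re.compile(r"^\s*pop\s+(\w+)\s*$", re.IGNORECASE)


def _run(lines, start, pattern):
    """Collect registers from consecutive matching lines beginning at start."""
    regs = []
    i = start
    while i < len(lines):
        m = pattern.match(lines[i])
        if not m:
            break
        regs.append(m.group(1).lower())
        i += 1
    return regs, i


def find_saved_regions(lines):
    """Return (body_start, body_end, regs) for each push run that is later
    mirrored by a pop run in reverse order. body_end is exclusive."""
    regions = []
    i = 0
    while i < len(lines):
        pushed, body_start = _run(lines, i, PUSH_RE)
        if not pushed:
            i += 1
            continue
        j = body_start
        while j < len(lines):
            popped, after = _run(lines, j, POP_RE)
            if popped and popped == pushed[::-1]:
                regions.append((body_start, j, pushed))
                i = after  # resume scanning after the pops
                break
            j = after if popped else j + 1
        else:
            i = body_start  # no mirror found, move past this push run
    return regions


listing = [
    "push ax", "push bx", "push cx",
    "mov cx, [x_pos]", "mov dx, [y_pos]", "int 27",
    "pop cx", "pop bx", "pop ax",
    "ret",
]
for start, end, regs in find_saved_regions(listing):
    print(regs, listing[start:end])
```

This is deliberately naive: it does not follow jumps, and nested push/pop pairs inside the body are not paired up separately. It is only meant to narrow down where to start reading.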
When reverse engineering code, the experts use a pen-and-paper or whiteboard diagramming method called a Jackson Structure to gradually chart out what the code is doing, so as to pinpoint where changes can be made and what needs to be done.
I keep putting off reverse engineering SFC 1 so that I can add in more weapons, such as SFB Lasers and Particle Beam Cannons.