Bootrom
Gamecube bootrom reverse engineering for CubeDocumented project
This reverse-engineered listing is not meant to be compilable and is for educational purposes only. I take no responsibility for any damage to your Gamecube clock settings, memory card saves, or anything else resulting from use of the code or information in this document.
Documented by Andrei Chestakov (org). Version 1.1. (19 Jul 2005)
Additions and corrections by tmbinc.
19 Jul 2005: Small corrections regarding the "BS" acronym (see Appendix).
PART I. Reset Vector. Boot Stage.
The NTSC 1.0 bootrom version is used for reverse engineering.
See the brief description of the BS run flow at the end of the disassembly.
After a hardware reset (POWER ON button pressed), the CPU jumps to the reset vector. The reset exception handler address is 0x00100 by default, but Gekko sets the MSR[IP] bit after reset, so the final address is 0xFFF00100. The physical range 0xFFFxxxxx is mapped by Flipper's memory interface to the first megabyte of the GC bootrom area. Exception handlers, including this one, start executing with memory translation disabled.
This part of the GC bootrom is called BS, the Bootstrap Stage. It is written in assembly language. Its main purpose is to load the second logical part of the bootrom, called BS2 or IPL (Initial Program Loader). BS occupies bootrom offsets 0x000-0x800. The disassembly was obtained from the Dolwin Debugger; skipped parts are filled with zeroes and not shown. Well, let's go..
Set HID0 to 0x00110C64. This initializes Gekko implementation-specific features. Meaning of this operation:
- Mask machine check exception (MCP line).
- Ignore memory bus parity errors.
- Enable dynamic power management mode.
- Check hard reset bit (to distinguish software/hardware resets)
- [!] Disable data and instruction cache.
- Cache invalidation will not write-back to memory.
- Disable memory coherency for instruction fetch.
- Disable gathering of non-word accesses.
- Force data cache to set invalid entries on data miss.
- Enable branch-instruction cache (BTIC).
- [!] Disable broadcasting. CPU now can read/write data only from data cache.
- Enable on-chip branch history (Gekko has 512-entry BHT).
- Enable dcbt and dcbtst instructions.
:FFF00100 lis r4, 0x0011
:FFF00104 addi r4, r4, 3172
:FFF00108 mtspr HID0, r4
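The lis/addi pair used throughout BS builds a 32-bit constant from a high halfword and a sign-extended low halfword. The sketch below reproduces that arithmetic in C so the decimal immediates in the listing can be checked against the hex values in the text. The helper names are hypothetical; the ICE and DCE masks are taken from the later "ori r4, r3, 0xC000" step.

```c
#include <stdint.h>

/* Value produced by "lis rD, hi" followed by "addi rD, rD, lo".
   addi sign-extends its 16-bit immediate before adding. */
uint32_t lis_addi(uint16_t hi, int16_t lo)
{
    return ((uint32_t)hi << 16) + (uint32_t)(int32_t)lo;
}

/* HID0[ICE] and HID0[DCE], as set together by "ori r4, r3, 0xC000". */
enum { HID0_ICE = 0x8000, HID0_DCE = 0x4000 };

/* Nonzero if either cache enable bit is set in the given HID0 value. */
int hid0_caches_enabled(uint32_t hid0)
{
    return (hid0 & (HID0_ICE | HID0_DCE)) != 0;
}
```

For example, lis_addi(0x0011, 3172) gives the HID0 value 0x00110C64, and neither cache enable bit is set in it.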
Set the machine state register (MSR) to 0x2000. This initializes the CPU programming model. Meaning of this operation:
- Disable power management.
- [!] Disable interrupts and decrementer exception.
- [!] Set supervisor privilege level.
- Enable floating-point instructions.
- Disable machine check exception.
- Disable FPU exceptions.
- [!] Set exception vectors base to point on low memory.
- [!] Disable instruction and data address translation.
- Disable Gekko performance monitor.
- Clear recoverable exception flag.
- Select big-endian mode (MSR[LE] = 0, the default for PowerPC).
:FFF0010C lis r4, 0x0000
:FFF00110 addi r4, r4, 8192
:FFF00114 mtmsr r4
Initialize auxiliary memory (ARAM). The meaning of the bits in the 0x5012 register is still unknown. Set this register to 0x40 | 3. The last two bits are set only when ARAM expansion is present (as seen in the ARAM library). Set the ARAM refresh value to 156MHz (0x501A register).
:FFF00118 lis r4, 0x0C00
:FFF0011C addi r4, r4, 0x5012
:FFF00120 li r5, 67
:FFF00124 sth r5, 0 (r4) ; Set 0x5012 register value to 0x43.
:FFF00128 li r5, 156
:FFF0012C sth r5, 8 (r4) ; Set ARAM refresh value.
Initialize the Flipper memory interface. The meaning of the 0x4026 register is still unknown. Set this register to 0x40.
:FFF00130 lis r3, 0x0C00
:FFF00134 ori r3, r3, 0x4000
:FFF00138 li r4, 64
:FFF0013C sth r4, 38 (r3) ; Set 0x4026 register value to 0x40.
:FFF00140 nop
:FFF00144 nop
Enable the data and instruction caches (set HID0[ICE] and HID0[DCE]).
:FFF00148 mfspr r3, HID0
:FFF0014C ori r4, r3, 0xC000
:FFF00150 mtspr HID0, r4
:FFF00154 nop
:FFF00158 nop
:FFF0015C nop
:FFF00160 isync
Initialize CPU memory model. Clear BATs and segment registers.
:FFF00164 li r4, 0
:FFF00168 mtspr DBAT0U, r4
:FFF0016C mtspr DBAT1U, r4
:FFF00170 mtspr DBAT2U, r4
:FFF00174 mtspr DBAT3U, r4
:FFF00178 mtspr IBAT0U, r4
:FFF0017C mtspr IBAT1U, r4
:FFF00180 mtspr IBAT2U, r4
:FFF00184 mtspr IBAT3U, r4
:FFF00188 isync
:FFF0018C lis r4, 0x8000
:FFF00190 addi r4, r4, 0
:FFF00194 mtsr 0, r4
:FFF00198 mtsr 1, r4
:FFF0019C mtsr 2, r4
:FFF001A0 mtsr 3, r4
:FFF001A4 mtsr 4, r4
:FFF001A8 mtsr 5, r4
:FFF001AC mtsr 6, r4
:FFF001B0 mtsr 7, r4
:FFF001B4 mtsr 8, r4
:FFF001B8 mtsr 9, r4
:FFF001BC mtsr 10, r4
:FFF001C0 mtsr 11, r4
:FFF001C4 mtsr 12, r4
:FFF001C8 mtsr 13, r4
:FFF001CC mtsr 14, r4
:FFF001D0 mtsr 15, r4
Configure the memory model. Dolphin OS uses MIPS-like translation to simplify memory operations. The segmented memory model (page tables) is not used in Dolphin OS, although some applications can use page tables to simulate direct ARAM access. BAT registers 0 and 1 are reserved by Dolphin OS. BAT registers 2 and 3 can be used by applications. BAT register 3 is temporarily used by the bootrom.
Configuration of DBAT and IBAT registers in Dolphin OS is as follows:
DBAT0: 80001FFF 00000002  Write-back cached main memory, 256MB block.
DBAT1: C0001FFF 0000002A  Cache-inhibited, guarded main memory and hardware registers, 256MB block.
DBAT2: 00000000 xxxxxxxx  Don't care, reserved.
DBAT3: FFF0001F FFF00001  Bootrom, 1MB block.

IBAT0: 80001FFF 00000002  Write-back cached main memory, 256MB block.
IBAT1: 00000000 xxxxxxxx  Don't care, reserved.
IBAT2: 00000000 xxxxxxxx  Don't care, reserved.
IBAT3: FFF0001F FFF00001  Bootrom, 1MB block.
Write-back means that writes go only to the cache and reach main memory later, when the cache line is stored or evicted. Cache-inhibited means that every access bypasses the cache and goes directly to the bus. The lower word 0x0000002A of DBAT1 decodes to WIMG = 0101 (I and G set) and PP = 10 (read/write), so the 0xC0000000 block is uncached rather than write-through.
Data accesses to hardware registers therefore always reach the bus, since the I bit is set for the 0xC0000000-0xCFFFFFFF block.
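The BAT values listed above can be decoded mechanically. The sketch below assumes the standard 32-bit PowerPC BAT layout (upper word: BEPI in the top 15 bits, BL in bits 2-12 counted from the least significant bit; lower word: BRPN in the top 15 bits, WIMG in bits 3-6, PP in bits 0-1). The struct and function names are hypothetical and not taken from the bootrom.

```c
#include <stdint.h>

typedef struct {
    uint32_t ea_base;    /* effective base address (BEPI << 17) */
    uint32_t phys_base;  /* physical base address (BRPN << 17) */
    uint32_t size;       /* block length in bytes */
    unsigned wimg;       /* W, I, M, G packed as a 4-bit value, W is bit 3 */
    unsigned pp;         /* protection bits: 1 = read-only, 2 = read/write */
} BatInfo;

/* Decode one upper/lower BAT register pair (assumed standard layout). */
BatInfo decode_bat(uint32_t upper, uint32_t lower)
{
    BatInfo b;
    uint32_t bl = (upper >> 2) & 0x7FFu;   /* block length mask */
    b.ea_base   = upper & 0xFFFE0000u;
    b.phys_base = lower & 0xFFFE0000u;
    b.size      = (bl + 1u) * 0x20000u;    /* 128KB granularity */
    b.wimg      = (lower >> 3) & 0xFu;
    b.pp        = lower & 3u;
    return b;
}
```

Under this layout, the pair C0001FFF/0000002A decodes to a 256MB block at 0xC0000000 with WIMG = 0101, and FFF0001F/FFF00001 decodes to a read-only 1MB block at 0xFFF00000.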
So, here are the Dolphin OS (effective addresses) and GC (physical addresses) memory maps. Note that the bootrom's effective address is useful only during hard reset.
Effective Address (Dolphin OS)
80000000 24MB Main Memory (RAM), write-back cached
C0000000 24MB Main Memory (RAM), uncached
C8000000 2MB  Embedded Framebuffer (EFB)
CC000000      Command Processor (CP)
CC001000      Pixel Engine (PE)
CC002000      Video Interface (VI)
CC003000      Peripheral Interface (PI)
CC004000      Memory Interface (MI)
CC005000      DSP and DMA Audio Interface (AID)
CC006000      DVD Interface (DI)
CC006400      Serial Interface (SI)
CC006800      External Interface (EXI)
CC006C00      Audio Streaming Interface (AIS)
CC008000      PI FIFO (GX)
FFF00000 1MB  Boot ROM (first megabyte), used during BS only.
Any other access first raises ISI/DSI (if the address is not mapped by the MMU) and then a Flipper memory interface interrupt (if the target is missing or not allowed).
Physical Address (Flipper memory interface)
00000000 24MB Main Memory (RAM)
08000000 2MB  Embedded Framebuffer (EFB)
0C000000      Command Processor (CP)
0C001000      Pixel Engine (PE)
0C002000      Video Interface (VI)
0C003000      Peripheral Interface (PI)
0C004000      Memory Interface (MI)
0C005000      DSP and DMA Audio Interface (AID)
0C006000      DVD Interface (DI)
0C006400      Serial Interface (SI)
0C006800      External Interface (EXI)
0C006C00      Audio Streaming Interface (AIS)
0C008000      PI FIFO (GX)
FFF00000 1MB  Boot ROM (first megabyte)
Any other access generates a Flipper memory interface interrupt.
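The two maps differ only by the BAT translation. As a minimal sketch, assuming only the three data BAT blocks listed above are active (0x80000000 and 0xC0000000 both mapping to physical 0, and the bootrom mapped one-to-one), an effective data address can be converted as follows. The function name is hypothetical.

```c
#include <stdint.h>

/* Translate an effective data address to a physical one using the
   DBAT0, DBAT1 and DBAT3 blocks described in the text.
   Returns 1 and stores the result in *phys if the address is mapped,
   0 otherwise (on hardware this case would raise DSI). */
int ea_to_phys(uint32_t ea, uint32_t *phys)
{
    uint32_t top = ea >> 28;
    if (top == 0x8u || top == 0xCu) {   /* 256MB blocks based at physical 0 */
        *phys = ea & 0x0FFFFFFFu;
        return 1;
    }
    if (ea >= 0xFFF00000u) {            /* 1MB bootrom block, identity mapped */
        *phys = ea;
        return 1;
    }
    return 0;
}
```

For instance, the PI register at effective address 0xCC003024 lands on physical 0x0C003024, which matches the two tables.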
:FFF001D4 lis r4, 0x0000
:FFF001D8 addi r4, r4, 2
:FFF001DC lis r3, 0x8000
:FFF001E0 addi r3, r3, 8191
:FFF001E4 mtspr DBAT0L, r4
:FFF001E8 mtspr DBAT0U, r3
:FFF001EC isync
:FFF001F0 mtspr IBAT0L, r4
:FFF001F4 mtspr IBAT0U, r3
:FFF001F8 isync
:FFF001FC lis r4, 0x0000
:FFF00200 addi r4, r4, 42
:FFF00204 lis r3, 0xC000
:FFF00208 addi r3, r3, 8191
:FFF0020C mtspr DBAT1L, r4
:FFF00210 mtspr DBAT1U, r3
:FFF00214 isync
:FFF00218 lis r4, 0xFFF0
:FFF0021C addi r4, r4, 1
:FFF00220 lis r3, 0xFFF0
:FFF00224 addi r3, r3, 31
:FFF00228 mtspr DBAT3L, r4
:FFF0022C mtspr DBAT3U, r3
:FFF00230 isync
:FFF00234 mtspr IBAT3L, r4
:FFF00238 mtspr IBAT3U, r3
:FFF0023C isync
Enable instruction and data translation by setting MSR[IR] and MSR[DR] to 1.
:FFF00240 mfmsr r4
:FFF00244 ori r4, r4, 0x0030 ; Enable address translation.
:FFF00248 mtmsr r4
:FFF0024C isync
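A small sketch of the MSR arithmetic may help here. It assumes the MSR still holds the 0x2000 written earlier; the IR and DR masks follow from the "ori r4, r4, 0x0030" above, FP is the only bit set in 0x2000, and the LE mask (0x0001) is an assumption taken from the generic PowerPC MSR layout. The names are hypothetical.

```c
#include <stdint.h>

enum {
    MSR_FP = 0x2000,  /* floating-point available */
    MSR_IR = 0x0020,  /* instruction address translation */
    MSR_DR = 0x0010,  /* data address translation */
    MSR_LE = 0x0001   /* little-endian mode (assumed mask) */
};

/* Mirror of the mfmsr / ori 0x0030 / mtmsr sequence. */
uint32_t msr_enable_translation(uint32_t msr)
{
    return msr | MSR_IR | MSR_DR;
}
```

Starting from 0x2000 this yields 0x2030: floating point and both translations enabled, everything else still off.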
Write 0x0245248A to the 0x3030 register. Both the register and the meaning of the value are unknown.
:FFF00250 lis r3, 0xCC00
:FFF00254 ori r3, r3, 0x3000
:FFF00258 lis r4, 0x0245
:FFF0025C ori r4, r4, 0x248A
:FFF00260 stw r4, 48 (r3) ; Write 0x0245248A to 0x3030 register.
Reset DVD, through PI reset register. Meaning of bits in reset register is still unknown.
:FFF00264 lwz r4, 36 (r3) ; Read PI reset register.
:FFF00268 ori r4, r4, 0x0001
:FFF0026C rlwinm r4, r4, 0, 31, 28 ; Set bit 31, clear bit 29.
:FFF00270 stw r4, 36 (r3) ; Write new value in reset register.
:FFF00274 mftbl r5
:FFF00278 mftbl r6
:FFF0027C sub r7, r6, r5
:FFF00280 cmplwi r7, 4388
:FFF00284 blt+ 0xFFF00278 ; Wait ~9 us (with 486MHz clock)
:FFF00288 ori r4, r4, 0x0003 ; Set bit 31, set bit 29.
:FFF0028C stw r4, 36 (r3) ; Write new value in reset register.
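The length of the delay loop depends on how fast the time base counts. The listing's comment assumes 486MHz. As an assumption not stated in the source, the Gamecube time base is commonly documented as running at a quarter of the 162MHz bus clock (40.5MHz), which would make the same 4388 ticks last considerably longer. The helper below is a hypothetical sketch that just does the conversion for both cases.

```c
#include <stdint.h>

/* Convert a time base tick count to microseconds for a given
   time base frequency in Hz. */
double ticks_to_us(uint32_t ticks, double timebase_hz)
{
    return (double)ticks * 1.0e6 / timebase_hz;
}
```

With ticks_to_us(4388, 486.0e6) the result is about 9 microseconds, as in the comment; with ticks_to_us(4388, 40.5e6) it is roughly 108 microseconds.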
Allow 32MHz EXI clock setting by CPU.
:FFF00290 lis r14, 0xCC00
:FFF00294 ori r14, r14, 0x6400
:FFF00298 li r4, 0
:FFF0029C stw r4, 60 (r14) ; SI EXICLK[LOCK] = 0
Probe EXI AD16. It is still unknown what kind of hardware AD16 represents. It seems to be used to store the bootrom "trace" step (this is the only kind of value seen written to AD16 so far). AD16 works at 8MHz.
AD16 trace state (which values are written to AD16 and when):
BS:
0x01000000 ?
0x02000000 ?
0x03000000 ?
0x04000000 Memory test passed.
0x05xxxxxx \
0x06xxxxxx | Memory test failed.
0x07xxxxxx /

IPL:
0x08000000 IPL and OSInit called.
0x09000000 DVDInit.
0x0A000000 CARDInit.
0x0B000000 VIInit.
0x0C000000 PADInit.
To probe AD16 we must read its EXI ID, which should be 0x04120000. The expected ID is kept in R21 and the value read back is placed in R20. The two are compared in the AD16 write routine.
:FFF002A0 lis sd2, 0xCC00
:FFF002A4 ori sd2, sd2, 0x6800
:FFF002A8 lis r22, 0x0000
:FFF002AC ori r22, r22, 0x00BA
:FFF002B0 li r8, 1
:FFF002B4 li r10, 0
:FFF002B8 lis r21, 0x0412
:FFF002BC ori r21, r21, 0x0000
:FFF002C0 lis r3, 0x0000
:FFF002C4 ori r3, r3, 0x0000
:FFF002C8 lis r7, 0x0000
:FFF002CC ori r7, r7, 0x0015
:FFF002D0 stw r3, 56 (sd2) ; EXI2 DATA = 0 (Get ID command)
:FFF002D4 stw r22, 40 (sd2) ; Select AD16, through EXI2 CSR.
:FFF002D8 lwz r16, 40 (sd2)
:FFF002DC stw r7, 52 (sd2) ; Write immediate 2 bytes from DATA.
:FFF002E0 lwz r16, 52 (sd2) ; \
:FFF002E4 and. r16, r16, r8 ; | Wait until transfer complete.
:FFF002E8 bgt+ 0xFFF002E0 ; /
:FFF002EC lis r7, 0x0000
:FFF002F0 ori r7, r7, 0x0031
:FFF002F4 stw r7, 52 (sd2) ; Read immediate 4 bytes to DATA (ID).
:FFF002F8 lwz r16, 52 (sd2) ; \
:FFF002FC and. r16, r16, r8 ; | Wait until transfer complete.
:FFF00300 bgt+ 0xFFF002F8 ; /
:FFF00304 stw r10, 40 (sd2) ; Deselect device.
:FFF00308 lwz r16, 40 (sd2) ; Read EXI2 CSR twice. Why? No idea.
:FFF0030C lwz r16, 40 (sd2) ; Maybe its deselect attribute..
:FFF00310 lwz r20, 56 (sd2) ; r20 = DATA. It should contain ID.
:FFF00314 b 0xFFF00320
:FFF00318
:FFF0031C
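The values written to the EXI control register (0x15, 0x31, and later 0x05, 0x35 and 0x03) can be read as bit fields. The sketch below assumes a layout that is consistent with the comments in the listing but is not confirmed by it: bit 0 starts the transfer, bit 1 selects DMA, bits 2-3 give the direction (0 = read, 1 = write), and bits 4-5 hold the immediate length minus one. The type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    int start;   /* transfer start bit */
    int dma;     /* 1 = DMA transfer, 0 = immediate transfer */
    int write;   /* 1 = write to device, 0 = read from device */
    int len;     /* immediate transfer length in bytes (1..4) */
} ExiCr;

/* Decode an EXI control register value under the assumed layout. */
ExiCr decode_exi_cr(uint32_t cr)
{
    ExiCr c;
    c.start = (int)(cr & 1u);
    c.dma   = (int)((cr >> 1) & 1u);
    c.write = ((cr >> 2) & 3u) == 1u;
    c.len   = (int)((cr >> 4) & 3u) + 1;
    return c;
}
```

Under these assumptions 0x15 is an immediate 2-byte write and 0x31 an immediate 4-byte read, which agrees with the comments next to the stores above.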
Write the "trace step" value to AD16, only if the probe succeeded (R20 = AD16 ID). The input value (trace step) must be in R15.
:FFF00320 b 0xFFF00344
:FFF00324 lis r3, 0xA000
:FFF00328 ori r3, r3, 0x0000
:FFF0032C lis r7, 0x0000
:FFF00330 ori r7, r7, 0x0005
:FFF00334 stw r3, 56 (sd2) ; EXI2 DATA = 0xA0000000 (Write AD16 command)
:FFF00338 stw r22, 40 (sd2) ; Select AD16, through EXI2 CSR.
:FFF0033C lwz r16, 40 (sd2)
:FFF00340 b 0xFFF00348
:FFF00344 b 0xFFF00364
:FFF00348 stw r7, 52 (sd2) ; Write immediate 1 byte from DATA.
:FFF0034C lwz r16, 52 (sd2) ; \
:FFF00350 and. r16, r16, r8 ; | Wait until transfer complete.
:FFF00354 bgt+ 0xFFF0034C ; /
:FFF00358 nop
:FFF0035C nop
:FFF00360 b 0xFFF00368
:FFF00364 b 0xFFF00384
:FFF00368 lis r7, 0x0000
:FFF0036C ori r7, r7, 0x0035
:FFF00370 stw r15, 56 (sd2) ; EXI2 DATA = trace step
:FFF00374 stw r7, 52 (sd2) ; Write immediate 4 bytes from DATA.
:FFF00378 lwz r16, 52 (sd2) ; \
:FFF0037C and. r16, r16, r8 ; | Wait until transfer complete.
:FFF00380 b 0xFFF00388 ; /
:FFF00384 b 0xFFF003A0
:FFF00388 bgt+ 0xFFF00378
:FFF0038C stw r10, 40 (sd2) ; Deselect device.
:FFF00390 lwz r16, 40 (sd2)
:FFF00394 lwz r16, 40 (sd2)
:FFF00398 blr
:FFF0039C
:FFF003A0 b 0xFFF003B0
:FFF003A4 cmplw r20, r21 ; If AD16 probe failed, then skip.
:FFF003A8 beq+ 0xFFF00324
:FFF003AC blr
Trace step 0x01 - Nothing ?
:FFF003B0 lis r15, 0x0100 ; AD16 = 0x01000000
:FFF003B4 bl 0xFFF003A4
:FFF003B8 nop
:FFF003BC nop
:FFF003C0 nop
:FFF003C4 nop
:FFF003C8 nop
:FFF003CC nop
:FFF003D0 nop
:FFF003D4 b 0xFFF003E0
:FFF003D8
:FFF003DC
Trace step 0x02 - Nothing ?
:FFF003E0 nop
:FFF003E4 nop
:FFF003E8 nop
:FFF003EC nop
:FFF003F0 nop
:FFF003F4 lis r15, 0x0200 ; AD16 = 0x02000000
:FFF003F8 bl 0xFFF003A4
:FFF003FC b 0xFFF00400
Memory self test with a given pattern. Input parameters: R25 - base address, R26 - word pattern. Registers R17, R18, R19, R27, R28 and R29 hold information about failed addresses and are cleared before the test.
The failure information has the following format:
R17 \
R18 | Boundary.
R19 /
R27 Number of fails on addresses with last digit 8,9,A,B,C,D,E,F.
R28 Number of fails on addresses with last digit 0,1,2,3,4,5,6,7.
R29 Holds the last address where the test failed.
(Well its not important and only for addicted people :))
It is not a good idea to test all of memory with the same pattern. It is better to test the memory at address X with value X (e.g. compare [0x80000000] with 0x80000000, etc.).
Here is a C version of this routine:
typedef unsigned int u32;

u32 R17, R18, R19, R27, R28, R29;   // failure info, cleared by the caller

void SelfTest(u32 *base, u32 pattern)
{
    u32 memsize = 24 * 1024 * 1024;
    u32 *ptr = base;
    u32 i, addr;

    // Fill memory with the test pattern, 32 bytes per iteration.
    for(i=0; i<memsize/32; i++)
    {
        *ptr++ = pattern; *ptr++ = pattern; *ptr++ = pattern; *ptr++ = pattern;
        *ptr++ = pattern; *ptr++ = pattern; *ptr++ = pattern; *ptr++ = pattern;
    }

    // Test memory, one word per iteration.
    ptr = base;
    for(i=0; i<memsize/4; i++, ptr++)
    {
        if(*ptr == pattern) continue;
        addr = (u32)ptr;                            // 32-bit address, as on Gekko
        R17 |= 1u << ((addr >> 18) & 0x1f);         // rlwinm r15, r23, 14, 18, 31
        R18 |= 1u << (((addr >> 18) - 32) & 0x1f);
        R19 |= 1u << (((addr >> 18) - 64) & 0x1f);
        if((addr & 0xf) < 8) { R28++; continue; }   // this path skips the R29 update
        R27++;
        if(R29 < addr) R29 = addr;
    }
}
:FFF00400 b 0xFFF00424
:FFF00404 nop
:FFF00408 nop
:FFF0040C nop
:FFF00410 nop
:FFF00414 nop
:FFF00418 mr r23, r25
:FFF0041C lis r24, 0x0180 ; Main memory size (24MB)
:FFF00420 b 0xFFF00428
:FFF00424 b 0xFFF00444
:FFF00428 rlwinm r24, r24, 27, 5, 31 ; Fill memory (by 32 byte portions).
:FFF0042C mtctr r24
:FFF00430 stw r26, 0 (r23)
:FFF00434 stw r26, 4 (r23)
:FFF00438 stw r26, 8 (r23)
:FFF0043C stw r26, 12 (r23)
:FFF00440 b 0xFFF00448
:FFF00444 b 0xFFF00464
:FFF00448 stw r26, 16 (r23)
:FFF0044C stw r26, 20 (r23)
:FFF00450 stw r26, 24 (r23)
:FFF00454 stw r26, 28 (r23)
:FFF00458 addi r23, r23, 32
:FFF0045C bdnz+ 0xFFF00430
:FFF00460 b 0xFFF00468
:FFF00464 b 0xFFF00484
:FFF00468 mr r23, r25
:FFF0046C lis r24, 0x0180
:FFF00470 rlwinm r24, r24, 30, 2, 31
:FFF00474 mtctr r24
:FFF00478 lwz r15, 0 (r23) ; Begin to test.
:FFF0047C cmplw r15, r26
:FFF00480 b 0xFFF00488
:FFF00484 b 0xFFF004A4
:FFF00488 beq- 0xFFF00514
:FFF0048C rlwinm r15, r23, 14, 18, 31
:FFF00490 andi. r15, r15, 0x001F
:FFF00494 li r16, 1
:FFF00498 slw r16, r16, r15
:FFF0049C or r17, r17, r16
:FFF004A0 b 0xFFF004A8
:FFF004A4 b 0xFFF004C4
:FFF004A8 rlwinm r15, r23, 14, 18, 31
:FFF004AC subi r15, r15, 32
:FFF004B0 andi. r15, r15, 0x001F
:FFF004B4 li r16, 1
:FFF004B8 slw r16, r16, r15
:FFF004BC or r18, r18, r16
:FFF004C0 b 0xFFF004C8
:FFF004C4 b 0xFFF004E4
:FFF004C8 rlwinm r15, r23, 14, 18, 31
:FFF004CC subi r15, r15, 64
:FFF004D0 andi. r15, r15, 0x001F
:FFF004D4 li r16, 1
:FFF004D8 slw r16, r16, r15
:FFF004DC or r19, r19, r16
:FFF004E0 b 0xFFF004E8
:FFF004E4 b 0xFFF00504
:FFF004E8 rlwinm r15, r23, 0, 28, 31
:FFF004EC cmplwi r15, 8
:FFF004F0 bge- 0xFFF004FC
:FFF004F4 addi r28, r28, 1
:FFF004F8 b 0xFFF00514
:FFF004FC addi r27, r27, 1
:FFF00500 b 0xFFF00508
:FFF00504 b 0xFFF00520
:FFF00508 cmplw r29, r23
:FFF0050C bge- 0xFFF00514
:FFF00510 mr r29, r23
:FFF00514 addi r23, r23, 4
:FFF00518 bdnz+ 0xFFF00478
:FFF0051C blr
Clear registers for memory test (see next). Trace step 0x03 - Nothing ?
:FFF00520 li r17, 0
:FFF00524 li r18, 0
:FFF00528 li r19, 0
:FFF0052C li r27, 0
:FFF00530 li r28, 0
:FFF00534 li r29, 0
:FFF00538 lis r15, 0x0300 ; AD16 = 0x03000000
:FFF0053C bl 0xFFF003A4
:FFF00540 b 0xFFF00560
:FFF00544
:FFF00548
:FFF0054C
:FFF00550
:FFF00554
:FFF00558
:FFF0055C
Test memory with several patterns. Results are saved in AD16:
0x04000000 - Test passed.
0x05xxxxxx - Failed on address with last digit 0,1,2,3,4,5,6,7.
0x06xxxxxx - Failed on address with last digit 8,9,A,B,C,D,E,F.
0x07xxxxxx - Failed on address with last digit 0-F.
'x' represents bits 6...29 of the last address where the test failed.
:FFF00560 b 0xFFF00584
:FFF00564 lis r25, 0x8000
:FFF00568 lis r26, 0xAAAA
:FFF0056C ori r26, r26, 0xAAAA
:FFF00570 bl 0xFFF00404 ; Test memory with 0xAA pattern.
:FFF00574 not r26, r26
:FFF00578 bl 0xFFF00404 ; Test memory with 0x55 pattern.
:FFF0057C nop
:FFF00580 b 0xFFF00588
:FFF00584 b 0xFFF005A4
:FFF00588 lis r15, 0x0400
:FFF0058C mr. r16, r27
:FFF00590 beq- 0xFFF00598
:FFF00594 oris r15, r15, 0x0200
:FFF00598 mr. r16, r28
:FFF0059C beq- 0xFFF005AC
:FFF005A0 b 0xFFF005A8
:FFF005A4 b 0xFFF005C4
:FFF005A8 oris r15, r15, 0x0100
:FFF005AC rlwinm r29, r29, 30, 2, 31
:FFF005B0 or r15, r15, r29
:FFF005B4 bl 0xFFF003A4 ; Set AD16 value.
:FFF005B8 nop
:FFF005BC nop
:FFF005C0 b 0xFFF005CC
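The way the result code is assembled from the fail counters can be written as a short C function. This is only a sketch that follows the instructions at 0xFFF00588-0xFFF005B0 literally; the function name is hypothetical and the parameters stand for the register values after both test passes.

```c
#include <stdint.h>

/* Build the AD16 trace value from the self test results.
   r27, r28: fail counters, r29: last failed address (0 if none). */
uint32_t trace_value(uint32_t r27, uint32_t r28, uint32_t r29)
{
    uint32_t r15 = 0x04000000u;            /* lis r15, 0x0400 */
    if (r27 != 0) r15 |= 0x02000000u;      /* oris r15, r15, 0x0200 */
    if (r28 != 0) r15 |= 0x01000000u;      /* oris r15, r15, 0x0100 */
    return r15 | (r29 >> 2);               /* rlwinm r29, r29, 30, 2, 31 ; or */
}
```

With no failures the result is 0x04000000. A nonzero R28 alone gives the 0x05 code, a nonzero R27 alone the 0x06 code, and both together 0x07.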
Halt execution if memory test failed.
:FFF005C4 cmplw r20, r21
:FFF005C8 beq+ 0xFFF00564
:FFF005CC mr. r16, r27 ; Bad address with last digit 8-F ?
:FFF005D0 bne+ 0xFFF005CC
:FFF005D4 mr. r16, r28 ; Bad address with last digit 0-7 ?
:FFF005D8 bne+ 0xFFF005CC
Prepare GPR registers for IPL loading.
:FFF005DC lis sd2, 0xCC00 ; EXI registers base
:FFF005E0 ori sd2, sd2, 0x6800
:FFF005E4 lis r6, 0x0000 ; EXI0 CSR setup: device 1, 32MHz
:FFF005E8 ori r6, r6, 0x0150
:FFF005EC lis r7, 0x0000
:FFF005F0 ori r7, r7, 0x0035
:FFF005F4 li r8, 1
:FFF005F8 lis r9, 0x0000
:FFF005FC ori r9, r9, 0x0003
:FFF00600 li r10, 0
:FFF00604 lis r11, 0x0000 ; Max length of single transfer
:FFF00608 ori r11, r11, 0x0400
:FFF0060C lis r12, 0x0001
:FFF00610 ori r12, r12, 0x0000
:FFF00614 lis r3, 0x0002 ; Bootrom starting offset (0x800)
:FFF00618 ori r3, r3, 0x0000
:FFF0061C lis r4, 0x012F ; Main memory starting address
:FFF00620 ori r4, r4, 0xFFE0
:FFF00624 lis sd1, 0x0017 ; Transfer length
:FFF00628 ori sd1, sd1, 0x0000
:FFF0062C b 0xFFF00640
:FFF00630
:FFF00634
:FFF00638
:FFF0063C
:FFF00640 b 0xFFF00664
Final step. Transfer the IPL to main memory. EXI is used for the DMA transfer, since the bootrom is mapped as an EXI device. The maximum length of a single DMA transfer is 1024 bytes, so the whole IPL is loaded piece by piece (this is not a hardware limitation, but a BS specific).
Starting memory address: 0x012FFFE0. Starting bootrom offset: 0x800. Total length of transfer: 0x170000 bytes.
Bootrom scrambler is decrypting data on the fly.
:FFF00644 cmpwi sd1, 0 ; All bytes transferred ?
:FFF00648 beq- 0xFFF006D4
:FFF0064C mr r5, r11
:FFF00650 cmplw sd1, r5
:FFF00654 bgt- 0xFFF0065C
:FFF00658 mr r5, sd1
:FFF0065C stw r6, 0 (sd2) ; Select bootrom, through EXI0 CSR.
:FFF00660 b 0xFFF00668
:FFF00664 b 0xFFF00684
:FFF00668 stw r3, 16 (sd2) ; EXI0 DATA - offset in bootrom + write command
:FFF0066C lwz r16, 0 (sd2)
:FFF00670 stw r7, 12 (sd2) ; Write immediate 4 bytes from DATA.
:FFF00674 lwz r16, 12 (sd2) ; \
:FFF00678 and. r16, r16, r8 ; | Wait until transfer complete.
:FFF0067C bgt+ 0xFFF00674 ; /
:FFF00680 b 0xFFF00688
:FFF00684 b 0xFFF006A4
:FFF00688 stw r4, 4 (sd2) ; EXI0 MAR - DMA memory address.
:FFF0068C lwz r4, 4 (sd2)
:FFF00690 stw r5, 8 (sd2) ; EXI0 LEN - DMA transfer length.
:FFF00694 lwz r5, 8 (sd2)
:FFF00698 stw r9, 12 (sd2) ; Start EXI0 DMA write transfer.
:FFF0069C lwz r16, 12 (sd2)
:FFF006A0 b 0xFFF006A8
:FFF006A4 b 0xFFF006C4 ; \
:FFF006A8 and. r16, r16, r8 ; | Wait until transfer complete.
:FFF006AC bgt+ 0xFFF0069C ; /
:FFF006B0 stw r10, 0 (sd2) ; Deselect device.
:FFF006B4 lwz r16, 0 (sd2)
:FFF006B8 lwz r16, 0 (sd2)
:FFF006BC add r3, r3, r12
:FFF006C0 b 0xFFF006C8
:FFF006C4 b 0xFFF006E4
:FFF006C8 add r4, r4, r11 ; Advance pointers.
:FFF006CC sub sd1, sd1, r5
:FFF006D0 b 0xFFF00644
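The bookkeeping of the transfer loop is easier to follow in C. The sketch below only simulates the register arithmetic (r3, r4, r5, r11, r12, sd1) with the initial values loaded at 0xFFF005DC-0xFFF00628; the actual EXI accesses are reduced to a comment. The struct and function names are hypothetical. Note that the command word in r3 equals the bootrom offset shifted left by 6 (0x800 << 6 = 0x00020000), which is why it advances by 0x10000 per 0x400-byte chunk.

```c
#include <stdint.h>

typedef struct {
    uint32_t chunks;    /* number of DMA transfers issued */
    uint32_t end_addr;  /* r4 after the last chunk */
    uint32_t end_cmd;   /* r3 after the last chunk */
} LoadResult;

/* Simulate the pointer arithmetic of the IPL load loop. */
LoadResult simulate_ipl_load(void)
{
    uint32_t cmd       = 0x00020000u;  /* r3: bootrom offset 0x800 << 6 */
    uint32_t addr      = 0x012FFFE0u;  /* r4: main memory address */
    uint32_t remaining = 0x00170000u;  /* sd1: bytes left */
    const uint32_t max_len  = 0x0400u;     /* r11 */
    const uint32_t cmd_step = 0x00010000u; /* r12: 0x400 << 6 */
    LoadResult r = { 0u, 0u, 0u };

    while (remaining != 0u) {
        uint32_t len = remaining > max_len ? max_len : remaining;  /* r5 */
        /* Here BS selects the bootrom, sends cmd, and DMAs len bytes to addr. */
        cmd       += cmd_step;
        addr      += max_len;
        remaining -= len;
        r.chunks++;
    }
    r.end_addr = addr;
    r.end_cmd  = cmd;
    return r;
}
```

With these values the loop runs 1472 times (0x170000 / 0x400) and the loaded image ends at physical address 0x0146FFE0. The entry point 0x81300000 used afterwards lies 0x20 bytes past the load address 0x012FFFE0.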
Set link register to IPL entrypoint.
:FFF006D4 lis r4, 0x8130
:FFF006D8 ori r4, r4, 0x0000
:FFF006DC mtlr r4 ; LR = 0x81300000
:FFF006E0 b 0xFFF006E8
Disable bootrom decryption logic and disallow 32MHz EXI clock setting by CPU.
:FFF006E4 b 0xFFF00704
:FFF006E8 lis r6, 0x0000
:FFF006EC ori r6, r6, 0x2000
:FFF006F0 stw r6, 0 (sd2) ; Set ROMDIS bit in EXI0 CSR.
:FFF006F4 li r4, 1
:FFF006F8 stw r4, 60 (r14) ; SI EXICLK[LOCK] = 1
:FFF006FC lwz r4, 60 (r14)
:FFF00700 b 0xFFF00708
:FFF00704 b 0xFFF00720
Clear OS pointer to DVD BI2 location. Jump to IPL entrypoint.
:FFF00708 lis r4, 0x8000
:FFF0070C li r3, 0
:FFF00710 stw r3, 0x00F4 (r4)
:FFF00714 blr ; !! IPL START TO EXECUTE !!
Why is BS jumping around? The jumping is caused by the way instructions are fetched. They must be fetched in exact linear order, otherwise the scrambling goes out of sync. So in order to do any loops, BS enables the icache, and to fill the icache it jumps to the first location in each icache line. That's why BS jumps in 0x20-byte steps.
:FFF00720 b 0xFFF00740 ; No-ops between branches.
:FFF00740 b 0xFFF00760
:FFF00760 b 0xFFF00780
:FFF00780 b 0xFFF007A0
:FFF007A0 b 0xFFF007C0
:FFF007C0 b 0xFFF007E0
:FFF007E0 b 0xFFF00644
Since BS is very jumpy, here is a brief run flow:
- Init Flipper (ARAM, MI, reset DVD).
- Init Gekko (enable cache, set Dolphin OS memory model, enable translation).
- Probe AD16.
- Test memory. Halt CPU, if test failed.
- Load IPL into main memory and disable bootrom scrambler.
- Jump to IPL's __start.
PART II. IPL - Initial Program Loader (Boot Stage 2)
The IPL is written in C and uses an old development version of the default Nintendo SDK libraries and Dolphin OS. You can think of the IPL as a DOL application inside the bootrom.
I will not reverse old versions of library calls. The old version of __start differs from the new version by a missing OSInit call. At the end, it calls the IPL's main, as always. (As a reminder, __start is the CRT init for GCC-like compilers.)
void main()
{
    ???();
    OSInit();
    AD16Init();
    AD16WriteReg(0x08000000);
    DVDInit();
    AD16WriteReg(0x09000000);
    CARDInit();
    AD16WriteReg(0x0A000000);
    ???();
    __VIInit(0);
    VIInit();
    AD16WriteReg(0x0B000000);
    ???();
    ???();
    ???();
    PADSetSpec(PAD_SPEC_5);
    PADInit();
    AD16WriteReg(0x0C000000);
    ???();
    OSHalt("BS2 ERROR >>> SHOULD NEVER REACH HERE");
}
PART III. Bootrom hack tutorial and hardware details
Questions
- Q: Where is the bootrom chip placed? I can't see it..
- Q: What length does the bootrom actually have?
- Q: Who developed the bootrom chip and its data protection?
- Q: What algorithm is used for bootrom encryption?
- Q: Is it possible to read the bootrom by EXI DMA after reset (in my application)?
- Q: Yes, I know it's not possible with EXI DMA, but how about physically mapped direct access from the 0xFFFxxxxx area?
- Q: How do I dump the encrypted bootrom?
- Q: Yes, the decryption algo is unknown, but I've seen some "IPL Replacements". How is that possible?
- Q: Where can I get the XOR cipher key to decrypt my encrypted dump?
Appendix: Glossary of Terms and Acronyms
ToDo
- Get all bootrom versions and crosscheck against them.
- Find more details of unknown registers.
- Get more info about AD16. Trace steps 1, 2, 3.
- Continue on IPL.
- Finish bootrom hack tutorial.
Source
This information was found at: [1]