Hardware/GX/Blitting Processor
The Blitting Processor is a component of the Wii's GX subsystem. It is responsible for copying the EFB to the XFB, doing the RGBA->YCbCr conversion and scaling/antialiasing in the process.
BP (blitting processor) registers
The BP registers are accessed by writing an 8-bit value of 0x61 to the FIFO, followed by a 32-bit value. The high 8 bits of that value select the register, and the low 24 bits are the value written to it.
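As a rough illustration of this encoding, the sketch below packs a register number and a 24-bit value into one word and serializes the 5-byte command into a plain byte buffer. The names `bp_pack` and `bp_emit` are made up for this example, and the buffer only stands in for the FIFO. The most-significant-byte-first order is an assumption based on the PowerPC being big-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BP_LOAD_OPCODE 0x61u /* 8-bit value that precedes every BP write */

/* Combine an 8-bit register number and a 24-bit value into one 32-bit word.
 * Bits above the low 24 of `value` are discarded. */
static uint32_t bp_pack(uint32_t reg, uint32_t value)
{
    assert(reg <= 0xFFu);
    return (reg << 24) | (value & 0x00FFFFFFu);
}

/* Append one BP write (opcode byte, then the 32-bit word, most significant
 * byte first) to a byte buffer standing in for the FIFO.
 * Returns the number of bytes written. */
static size_t bp_emit(uint8_t *out, uint32_t reg, uint32_t value)
{
    uint32_t word = bp_pack(reg, value);

    out[0] = (uint8_t)BP_LOAD_OPCODE;
    out[1] = (uint8_t)(word >> 24); /* register number */
    out[2] = (uint8_t)(word >> 16);
    out[3] = (uint8_t)(word >> 8);
    out[4] = (uint8_t)word;
    return 5;
}
```

On real hardware the same bytes would be written to the FIFO rather than to memory; that part is deliberately left out here.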
EFB source registers
One can specify which part of the EFB is copied to the XFB or texture, using the following BP registers:
GX_BP_EFB_BOXCOORD (0x49) | ||||||||||||||||||||||||
23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
Access | U | R/W | R/W | |||||||||||||||||||||
Field | Y | X |
- 0x49: coordinates of the top-left corner of the rectangle in the EFB that will be copied (packed format, unknown)
GX_BP_EFB_BOXSIZE (0x4a) | ||||||||||||||||||||||||
23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
Access | U | R/W | R/W | |||||||||||||||||||||
Field | height-1 | width-1 |
- 0x4a: width-1 and height-1 of the rectangle to copy from the EFB (again, unknown packed format)
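Since the exact packed format is not documented here, the following is only a guess at how the two source registers might be built. It assumes 10-bit fields, with X (or width-1) in bits 9-0 and Y (or height-1) in bits 19-10, which follows the field order shown in the tables above but is not confirmed by them. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: both registers use two 10-bit fields, the X-like field in
 * bits 9-0 and the Y-like field in bits 19-10. The field widths are a guess. */

/* Value for register 0x49: top-left corner of the source rectangle. */
static uint32_t efb_box_coord(uint32_t x, uint32_t y)
{
    assert(x < 1024u && y < 1024u);
    return (y << 10) | x;
}

/* Value for register 0x4a: the hardware fields hold width-1 and height-1. */
static uint32_t efb_box_size(uint32_t width, uint32_t height)
{
    assert(width >= 1u && width <= 1024u);
    assert(height >= 1u && height <= 1024u);
    return ((height - 1u) << 10) | (width - 1u);
}
```

Under these assumptions, a full 640x528 EFB copy would use `efb_box_coord(0, 0)` and `efb_box_size(640, 528)`.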
XFB destination registers
The destination of the copy in the XFB is specified by the physical address of the XFB and the row stride (basically the width of a row, but with no scaling applied).
GX_BP_XFB_ADDR (0x4B) | |
23-0 | |
Access | R/W |
- 0x4b: Address of destination (XFB). BEWARE: Address is a PHYSICAL address, SHIFTED RIGHT by 5.
GX_BP_XFB_STRIDE (0x4D) | ||
23-10 | 9-0 | |
Access | U | R/W |
- 0x4d: Low 10 bits specify row stride of destination.
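A minimal sketch of the two destination values, assuming the caller already has the physical address of the XFB. The helper names are invented for this example. The alignment check is an inference from the shift (the low 5 bits cannot be represented), and the units of the stride are not stated on this page, so the stride is passed through unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Value for register 0x4b: physical XFB address shifted right by 5.
 * An address that is not a multiple of 32 would lose its low bits,
 * so this sketch rejects it. */
static uint32_t xfb_addr_reg(uint32_t phys_addr)
{
    assert((phys_addr & 0x1Fu) == 0u);
    return (phys_addr >> 5) & 0x00FFFFFFu;
}

/* Value for register 0x4d: only the low 10 bits carry the row stride.
 * The unit of `stride` is whatever the hardware expects (not specified here). */
static uint32_t xfb_stride_reg(uint32_t stride)
{
    assert(stride < 1024u);
    return stride & 0x3FFu;
}
```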
Copy control register
GX_BP_COPY_CONTROL (0x52) | ||||||||||||||||||||||||
23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
Access | U | W | R/W | U | ||||||||||||||||||||
Field | Start | Clr |
- 0x52: This register starts a copy. Important bits:
Field | Description |
Start | Writing both of these bits to 1 will start a copy. |
Clr | Enables or disables clearing of the EFB during the copy. |
Copy clear registers
If EFB clearing is enabled in GX_BP_COPY_CONTROL, at each copy, the EFB is filled with the values specified by these registers.
GX_BP_COPY_CLEAR_COLOR_HIGH (0x4F) | ||||||||||||||||||||||||
23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
Access | U | R/W | R/W | |||||||||||||||||||||
Field | Alpha | Red |
- 0x4F: This register defines the alpha and red components of the copy clear color.
GX_BP_COPY_CLEAR_COLOR_LOW (0x50) | ||||||||||||||||||||||||
23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
Access | U | R/W | R/W | |||||||||||||||||||||
Field | Green | Blue |
- 0x50: This register defines the green and blue components of the copy clear color.
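The split of the clear color across two registers can be sketched as follows. This assumes 8-bit components, with the first-listed field (alpha, green) in bits 15-8 and the second (red, blue) in bits 7-0; the tables above give the field order but not the widths, so treat the layout as an assumption. The function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: 8-bit color components, first field in bits 15-8,
 * second field in bits 7-0, upper 8 bits of the 24-bit value unused. */

/* Value for register 0x4F: alpha and red. */
static uint32_t clear_color_high(uint8_t alpha, uint8_t red)
{
    return ((uint32_t)alpha << 8) | (uint32_t)red;
}

/* Value for register 0x50: green and blue. */
static uint32_t clear_color_low(uint8_t green, uint8_t blue)
{
    return ((uint32_t)green << 8) | (uint32_t)blue;
}

/* Convenience check used by callers: both values must fit in 24 bits. */
static void clear_color_check(uint32_t high, uint32_t low)
{
    assert(high <= 0x00FFFFFFu && low <= 0x00FFFFFFu);
}
```

For example, opaque black would be `clear_color_high(0xFF, 0x00)` and `clear_color_low(0x00, 0x00)` under this layout.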
GX_BP_COPY_CLEAR_DEPTH (0x51) | |
23-0 | |
Access | R/W |
- 0x51: This register defines the copy clear depth.
A depth value of 0 represents 0.0 in floating-point, and a value of 0xFFFFFF (16777215) represents 1.0.
So the formula to convert a floating-point depth to a 24-bit depth would be:
- 24_bit_depth = (floating_point_depth * 16777215.0)
Where floating_point_depth is between 0.0 and 1.0, of course.
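The formula above translates directly into C. The sketch below adds clamping to the 0.0 to 1.0 range and truncates toward zero when converting to an integer; both of those choices are assumptions of this example rather than documented hardware or SDK behavior, and the function name is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a floating-point depth in [0.0, 1.0] to the 24-bit value
 * expected by register 0x51. Out-of-range input is clamped, and the
 * fractional part of the product is truncated. */
static uint32_t depth_to_24bit(double depth)
{
    assert(depth == depth); /* reject NaN, which has no meaningful depth */

    if (depth < 0.0)
        depth = 0.0;
    if (depth > 1.0)
        depth = 1.0;
    return (uint32_t)(depth * 16777215.0);
}
```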
Copy filter registers
GX_BP_FILTER_0 (0x01) | |
23-0 | |
Access | R/W |
GX_BP_FILTER_1 (0x02) | |
23-0 | |
Access | R/W |
GX_BP_FILTER_2 (0x03) | |
23-0 | |
Access | R/W |
GX_BP_FILTER_3 (0x04) | |
23-0 | |
Access | R/W |
Registers 0x01-0x04 are used for tricks like antialiasing. For a plain copy (i.e. no antialiasing), set all four to 0x666666.
Vertical filter registers
GX_BP_VFILTER_0 (0x53) | ||||||||||||||||||||||||
23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
Access | R/W | R/W | R/W | R/W | ||||||||||||||||||||
Field | f3 | f2 | f1 | f0 |
GX_BP_VFILTER_1 (0x54) | ||||||||||||||||||||||||
23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
Access | U | R/W | R/W | R/W | ||||||||||||||||||||
Field | f6 | f5 | f4 |
Like the filter registers, these vertical filter registers must be set up properly for you to see anything at all. Default values for no fancy operations are as follows:
Field | Default value |
f0 | 0x00 |
f1 | 0x00 |
f2 | 0x15 |
f3 | 0x16 |
f4 | 0x15 |
f5 | 0x00 |
f6 | 0x00 |
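To show how the defaults above might end up in the two registers, the sketch below packs the seven coefficients assuming each field is 6 bits wide, with f0 and f4 in the lowest bits of their registers. The 6-bit width is inferred from four fields filling the 24 bits of 0x53 and is not stated explicitly on this page. The function and array names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Default coefficients f0..f6 for a plain copy, from the table above. */
static const uint8_t vfilter_defaults[7] = {
    0x00, 0x00, 0x15, 0x16, 0x15, 0x00, 0x00
};

/* ASSUMPTION: every coefficient occupies 6 bits. */

/* Value for register 0x53: f0 in bits 5-0 up to f3 in bits 23-18. */
static uint32_t vfilter0_reg(const uint8_t f[7])
{
    assert(f[0] < 64u && f[1] < 64u && f[2] < 64u && f[3] < 64u);
    return (uint32_t)f[0] | ((uint32_t)f[1] << 6) |
           ((uint32_t)f[2] << 12) | ((uint32_t)f[3] << 18);
}

/* Value for register 0x54: f4 in bits 5-0, f5 in bits 11-6, f6 in bits 17-12. */
static uint32_t vfilter1_reg(const uint8_t f[7])
{
    assert(f[4] < 64u && f[5] < 64u && f[6] < 64u);
    return (uint32_t)f[4] | ((uint32_t)f[5] << 6) | ((uint32_t)f[6] << 12);
}
```

With the default coefficients and this layout, the register values come out as 0x595000 for 0x53 and 0x000015 for 0x54.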
Scissor registers
BPMEM_SCISSORTL (0x20) | ||||||||||||||||||||||||
23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
Field | Unused | Left (inclusive) | Unused | Top (inclusive) | ||||||||||||||||||||
Value | +1024 | +512 | +256 | +128 | +64 | +32 | +16 | +8 | +4 | +2 | +1 | +1024 | +512 | +256 | +128 | +64 | +32 | +16 | +8 | +4 | +2 | +1 |
BPMEM_SCISSORBR (0x21) | ||||||||||||||||||||||||
23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
Field | Unused | Right (inclusive) | Unused | Bottom (inclusive) | ||||||||||||||||||||
Value | +1024 | +512 | +256 | +128 | +64 | +32 | +16 | +8 | +4 | +2 | +1 | +1024 | +512 | +256 | +128 | +64 | +32 | +16 | +8 | +4 | +2 | +1 |
BPMEM_SCISSOROFFSET (0x59) | ||||||||||||||||||||||||
23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
Field | Unused | Y offset / 2 | Unused | X offset / 2 | ||||||||||||||||||||
Value | +512 | +256 | +128 | +64 | +32 | +16 | +8 | +4 | +2 | +512 | +256 | +128 | +64 | +32 | +16 | +8 | +4 | +2 |
All values have 342 added to them by the SDK/libogc, so a scissor located at (0, 0) with a size of (640, 528) would have left = 342, top = 342, right = 981, bottom = 869, x offset = 342/2 = 171, y offset = 171. Note that 342 is also added to the viewport's center position; the scissor offset is used to undo this.
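The arithmetic in the paragraph above can be sketched as a small encoder. It takes an unbiased scissor rectangle and offset, adds 342, and packs the top-left and bottom-right registers using the layout in the tables (11-bit fields, left/right in bits 22-12, top/bottom in bits 10-0). The offset register is returned only as its two field values, because the exact bit positions of those fields are not clear from the table. All names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define SCISSOR_BIAS 342u /* added to every value by the SDK/libogc */

typedef struct {
    uint32_t tl;          /* value for BPMEM_SCISSORTL (0x20) */
    uint32_t br;          /* value for BPMEM_SCISSORBR (0x21) */
    uint32_t x_off_field; /* "X offset / 2" field of BPMEM_SCISSOROFFSET */
    uint32_t y_off_field; /* "Y offset / 2" field of BPMEM_SCISSOROFFSET */
} scissor_regs;

/* x, y, width, height describe the scissor rectangle before biasing;
 * x_off and y_off are the unbiased offsets and must be even, since the
 * hardware stores them divided by 2. */
static scissor_regs scissor_encode(uint32_t x, uint32_t y,
                                   uint32_t width, uint32_t height,
                                   uint32_t x_off, uint32_t y_off)
{
    scissor_regs r;
    uint32_t left = x + SCISSOR_BIAS;
    uint32_t top = y + SCISSOR_BIAS;
    uint32_t right, bottom;

    assert(width >= 1u && height >= 1u);
    right = x + width - 1u + SCISSOR_BIAS;   /* inclusive */
    bottom = y + height - 1u + SCISSOR_BIAS; /* inclusive */
    assert(right <= 2047u && bottom <= 2047u);
    assert((x_off & 1u) == 0u && (y_off & 1u) == 0u);

    r.tl = (left << 12) | top;
    r.br = (right << 12) | bottom;
    r.x_off_field = (x_off + SCISSOR_BIAS) / 2u;
    r.y_off_field = (y_off + SCISSOR_BIAS) / 2u;
    return r;
}
```

Calling `scissor_encode(0, 0, 640, 528, 0, 0)` reproduces the numbers in the text: left = top = 342, right = 981, bottom = 869, and both offset fields equal to 171.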
The offset value is subtracted from a pixel's position when writing it to the EFB, and wraps around 1024 on a per-pixel basis. (The EFB's size is (640, 528), and the maximum texture size is (1024, 1024).) With an offset of 2 (encoded as (2+342)/2 = 172), a pixel at x=0 will pass the scissor test but not be written into the EFB as it is off screen, while a pixel at x=2 will be written to x=0 in the EFB and a pixel at x=639 will be written to x=637 in the EFB.
With some offsets, the wrapping will result in writing to both sides of the EFB. For instance, an offset of 510 (encoded as (342+510)/2 = 426) will write a pixel with x=510 into the EFB at x=0 and a pixel with x=639 at x=129, but will also write a pixel with x=0 at x=1024-510=514 and a pixel with x=125 at x=639. (This only happens if the scissor left and right values are set to allow pixels with x=0 through x=639 to pass the scissor test and the viewport is set to draw to that region of the screen.)
The scissor test itself allows setting a maximum value of 2047. Since the wrapping when writing to the EFB happens on a per-pixel basis, it is possible for multiple pixels within a single triangle to write to the same location in the EFB. For instance, if the left is set to 0 (342) and the right is set to 1024+639=1663 (2005), and the x offset is set to 0 (342/2 = 171) then both x=0 and x=1024 will write to x=0 in the EFB.
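The wrapping behavior described in the last three paragraphs can be modelled for the x axis as below. This is a simplified model of the description, not of the hardware itself: it uses unbiased coordinates (342 already removed), ignores the scissor test, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EFB_WIDTH 640u
#define EFB_WRAP 1024u /* positions wrap around 1024 per pixel */

/* Model of where a pixel at horizontal position `x` lands in the EFB for a
 * given unbiased scissor offset. Returns 1 and stores the destination in
 * *efb_x if the pixel falls inside the EFB, or 0 if it is off screen. */
static int efb_dest_x(uint32_t x, uint32_t offset, uint32_t *efb_x)
{
    uint32_t wrapped;

    assert(efb_x != NULL);
    /* Unsigned subtraction followed by the mask gives (x - offset) mod 1024,
     * including when x < offset. */
    wrapped = (x - offset) & (EFB_WRAP - 1u);
    if (wrapped >= EFB_WIDTH)
        return 0;
    *efb_x = wrapped;
    return 1;
}
```

With an offset of 2, x=0 maps to 1022 and is dropped, while x=2 maps to 0. With an offset of 510, x=0 maps to 514 and x=125 maps to 639, matching the examples in the text.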
Beginning a copy
The following must take place to do a copy:
- Set up clear and z clear registers (optional)
- Set source and destination registers
- Write to display copy control register to begin a copy
- Set clear, z, and control registers again (probably unnecessary, but libogc's GX does it)
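The first three steps could be sketched as a function that builds the list of BP words in order, ending with the copy control write that triggers the copy. Register values are passed in as opaque 24-bit numbers, because this page does not pin down every bit layout (in particular which bits of 0x52 are Start and Clr). The struct and function names are made up, the filter registers are assumed to have been set up beforehand, and each word would still need the 0x61 prefix when sent to the FIFO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    BP_EFB_BOXCOORD = 0x49, BP_EFB_BOXSIZE = 0x4A,
    BP_XFB_ADDR = 0x4B, BP_XFB_STRIDE = 0x4D,
    BP_CLEAR_AR = 0x4F, BP_CLEAR_GB = 0x50,
    BP_CLEAR_Z = 0x51, BP_COPY_CONTROL = 0x52
};

/* Raw 24-bit register values, prepared by the caller. */
typedef struct {
    uint32_t clear_ar, clear_gb, clear_z; /* optional clear setup */
    uint32_t box_coord, box_size;         /* EFB source rectangle */
    uint32_t xfb_addr, xfb_stride;        /* XFB destination */
    uint32_t copy_control;                /* must have the Start bits set */
} copy_setup;

static uint32_t bp_word(uint32_t reg, uint32_t value)
{
    assert(reg <= 0xFFu);
    return (reg << 24) | (value & 0x00FFFFFFu);
}

/* Fill `out` with the BP words for one copy and return how many were
 * written. The copy control register is written last, since writing it
 * is what starts the copy. */
static size_t build_copy(const copy_setup *s, uint32_t out[8])
{
    size_t n = 0;

    assert(s != NULL && out != NULL);
    out[n++] = bp_word(BP_CLEAR_AR, s->clear_ar);
    out[n++] = bp_word(BP_CLEAR_GB, s->clear_gb);
    out[n++] = bp_word(BP_CLEAR_Z, s->clear_z);
    out[n++] = bp_word(BP_EFB_BOXCOORD, s->box_coord);
    out[n++] = bp_word(BP_EFB_BOXSIZE, s->box_size);
    out[n++] = bp_word(BP_XFB_ADDR, s->xfb_addr);
    out[n++] = bp_word(BP_XFB_STRIDE, s->xfb_stride);
    out[n++] = bp_word(BP_COPY_CONTROL, s->copy_control);
    return n;
}
```

The final step of the list (setting the clear, z, and control registers again) is left out of this sketch.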