Changes

861 bytes added, 22:52, 10 July 2010
Added misc. that should finish it.
Line 1: Line 1: −
Paired singles are a unique part of the Gekko/[[Hardware/Broadway|Broadway]] processors used in the GameCube and Wii. They provide fast vector math by keeping two single-precision floating-point numbers in a single floating-point register, and doing math across registers. This page will demonstrate how these instructions are to be used.
+
Paired singles are a unique part of the Gekko/[[Hardware/Broadway|Broadway]] processors used in the GameCube and Wii. They provide fast vector math by keeping two single-precision floating-point numbers in a single floating-point register, and doing math across registers. This page will demonstrate how these instructions work.
    
== Quantization and Dequantization ==
 
== Quantization and Dequantization ==
Line 21: Line 21:  
To load and store Paired-singles, one must use the psq_l and psq_st instructions respectively, or one of their variants.
 
To load and store Paired-singles, one must use the psq_l and psq_st instructions respectively, or one of their variants.
 
=== psq_l ===
 
=== psq_l ===
  psq_l     frD, d(rA), W, I
+
  psq_l     frD, d(rA), W, I
 
This instruction dequantizes values from the memory address '''d'''+('''rA'''|0) and puts them into PS0 and PS1 of '''frD'''. If '''W''' is 1, however, it dequantizes only one number and places it into PS0; PS1 is then always loaded with 1.0. '''I''' specifies the GQR to use for dequantization parameters. The two numbers read from memory are adjacent, regardless of size (for example, if the GQR specifies a u16 load, '''d'''+('''rA'''|0) should point to a two-element array of u16s).
 
This instruction dequantizes values from the memory address '''d'''+('''rA'''|0) and puts them into PS0 and PS1 of '''frD'''. If '''W''' is 1, however, it dequantizes only one number and places it into PS0; PS1 is then always loaded with 1.0. '''I''' specifies the GQR to use for dequantization parameters. The two numbers read from memory are adjacent, regardless of size (for example, if the GQR specifies a u16 load, '''d'''+('''rA'''|0) should point to a two-element array of u16s).
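
The load behaviour described above can be sketched as a small Python model. This is only an illustration under stated assumptions: the function name, the <code>kind</code> strings and the <code>scale</code> argument are hypothetical stand-ins for the type and scale fields of the selected GQR, and the scale is assumed to act as a power-of-two divisor on integer types. Memory is treated as big-endian.

```python
import struct

# Hypothetical type names standing in for the GQR load-type field.
_FORMATS = {"f32": ">f", "u8": ">B", "u16": ">H", "s8": ">b", "s16": ">h"}


def psq_l(memory, addr, w, kind, scale):
    """Return (ps0, ps1) as psq_l would load them from big-endian memory."""
    fmt = _FORMATS[kind]
    size = struct.calcsize(fmt)

    def load(offset):
        (raw,) = struct.unpack_from(fmt, memory, addr + offset)
        if kind == "f32":
            return raw  # floats are taken as-is, no scaling
        # Assumption: integer types are divided by 2**scale.
        return raw * 2.0 ** -scale

    ps0 = load(0)
    # With W = 1 only one element is read and PS1 is forced to 1.0.
    ps1 = 1.0 if w == 1 else load(size)
    return (ps0, ps1)
```

For example, with two adjacent u16 values 256 and 512 in memory and a scale of 8, the model yields PS0 = 1.0 and PS1 = 2.0; with '''W''' set to 1 it yields PS0 = 1.0 and PS1 = 1.0.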
 
===== psq_lx =====
 
===== psq_lx =====
  psq_lx   frD, rA, rB, W, I
+
  psq_lx     frD, rA, rB, W, I
 
This instruction acts exactly like psq_l, except instead of ('''rA''') being offset by '''d''', it is offset by ('''rB''').
 
This instruction acts exactly like psq_l, except instead of ('''rA''') being offset by '''d''', it is offset by ('''rB''').
 
===== psq_lu =====
 
===== psq_lu =====
  psq_lu   frD, d(rA), W, I
+
  psq_lu     frD, d(rA), W, I
 
This instruction acts exactly like psq_l, except '''rA''' cannot be 0, and '''d'''+('''rA''') is placed back into '''rA'''.
 
This instruction acts exactly like psq_l, except '''rA''' cannot be 0, and '''d'''+('''rA''') is placed back into '''rA'''.
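
The addressing forms of psq_l, psq_lx and psq_lu described so far differ only in how the effective address is formed and whether it is written back. A minimal Python sketch, using a hypothetical <code>effective_address</code> helper and a plain list as the register file (both are illustrative assumptions, not part of any real toolchain):

```python
def effective_address(regs, ra, offset, update=False):
    """Compute the address used by a quantized load.

    offset is the immediate d (psq_l, psq_lu) or the value held in rB (psq_lx).
    """
    if update and ra == 0:
        raise ValueError("update forms require rA != 0")
    # (rA|0): register number 0 reads as the constant zero, not as r0.
    base = 0 if ra == 0 else regs[ra]
    ea = (base + offset) & 0xFFFFFFFF  # 32-bit address arithmetic
    if update:
        regs[ra] = ea  # the update form writes the address back into rA
    return ea
```

The update form is convenient for walking an array: each load advances '''rA''' by '''d''', so no separate add instruction is needed.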
 
===== psq_lux =====
 
===== psq_lux =====
  psq_lux   frD, rA, rB, W, I
+
  psq_lux   frD, rA, rB, W, I
 
This instruction acts exactly like psq_lx, except '''rA''' cannot be 0, and ('''rA''')+('''rB''') is placed back into '''rA'''.
 
This instruction acts exactly like psq_lx, except '''rA''' cannot be 0, and ('''rA''')+('''rB''') is placed back into '''rA'''.
    
=== psq_st ===
 
=== psq_st ===
  psq_st   frD, d(rA), W, I
+
  psq_st     frD, d(rA), W, I
 
This instruction quantizes values from the Paired Singles in '''frD''' and places them at the memory address '''d'''+('''rA'''|0). If '''W''' is 1, however, it quantizes only PS0. '''I''' specifies the GQR to use for quantization parameters. The two numbers written to memory are adjacent, regardless of size (for example, if the GQR specifies a u16 store, '''d'''+('''rA'''|0) is treated as a two-element array of u16s).
 
This instruction quantizes values from the Paired Singles in '''frD''' and places them at the memory address '''d'''+('''rA'''|0). If '''W''' is 1, however, it quantizes only PS0. '''I''' specifies the GQR to use for quantization parameters. The two numbers written to memory are adjacent, regardless of size (for example, if the GQR specifies a u16 store, '''d'''+('''rA'''|0) is treated as a two-element array of u16s).
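
The store direction can be modelled in the same spirit. The sketch below is an assumption-laden illustration rather than a description of the hardware: the <code>kind</code> and <code>scale</code> arguments are hypothetical stand-ins for the store fields of the selected GQR, the scale is assumed to be a power-of-two multiplier, and conversion is assumed to truncate toward zero and saturate at the limits of the target type. NaN and infinite inputs are not handled.

```python
import struct

# Hypothetical type names with (struct format, minimum, maximum).
_INT_FORMATS = {
    "u8": (">B", 0, 255),
    "u16": (">H", 0, 65535),
    "s8": (">b", -128, 127),
    "s16": (">h", -32768, 32767),
}


def psq_st(memory, addr, pair, w, kind, scale):
    """Quantize pair into a bytearray; only PS0 is written when w == 1."""
    count = 1 if w == 1 else 2
    offset = addr
    for value in pair[:count]:
        if kind == "f32":
            struct.pack_into(">f", memory, offset, value)
            offset += 4
            continue
        fmt, lo, hi = _INT_FORMATS[kind]
        scaled = int(value * 2.0 ** scale)  # assumed: truncate toward zero
        struct.pack_into(fmt, memory, offset, max(lo, min(hi, scaled)))
        offset += struct.calcsize(fmt)
```

Storing the pair (1.5, 300.0) as u8 with a scale of 1 writes the bytes 3 and 255 in this model: the second value scales to 600 and saturates.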
 
===== psq_stx =====
 
===== psq_stx =====
  psq_stx   frD, rA, rB, W, I
+
  psq_stx   frD, rA, rB, W, I
 
This instruction acts exactly like psq_st, except instead of ('''rA''') being offset by '''d''', it is offset by ('''rB''').
 
This instruction acts exactly like psq_st, except instead of ('''rA''') being offset by '''d''', it is offset by ('''rB''').
 
===== psq_stu =====
 
===== psq_stu =====
  psq_stu   frD, d(rA), W, I
+
  psq_stu   frD, d(rA), W, I
 
This instruction acts exactly like psq_st, except '''rA''' cannot be 0, and '''d'''+('''rA''') is placed back into '''rA'''.
 
This instruction acts exactly like psq_st, except '''rA''' cannot be 0, and '''d'''+('''rA''') is placed back into '''rA'''.
 
===== psq_stux =====
 
===== psq_stux =====
  psq_stux frD, rA, rB, W, I
+
  psq_stux   frD, rA, rB, W, I
 
This instruction acts exactly like psq_stx, except '''rA''' cannot be 0, and ('''rA''')+('''rB''') is placed back into '''rA'''.
 
This instruction acts exactly like psq_stx, except '''rA''' cannot be 0, and ('''rA''')+('''rB''') is placed back into '''rA'''.
   Line 49: Line 49:  
These functions operate on one FPR.
 
These functions operate on one FPR.
 
=== ps_abs ===
 
=== ps_abs ===
  ps_abs   frD, frB
+
  ps_abs     frD, frB
    
  frD(ps0) = abs(frB(ps0))
 
  frD(ps0) = abs(frB(ps0))
Line 55: Line 55:     
=== ps_mr ===
 
=== ps_mr ===
  ps_mr     frD, frB
+
  ps_mr     frD, frB
    
  frD(ps0) = frB(ps0)
 
  frD(ps0) = frB(ps0)
Line 61: Line 61:     
=== ps_nabs ===
 
=== ps_nabs ===
  ps_nabs   frD, frB
+
  ps_nabs   frD, frB
    
  frD(ps0) = -abs(frB(ps0))
 
  frD(ps0) = -abs(frB(ps0))
Line 67: Line 67:     
=== ps_neg ===
 
=== ps_neg ===
  ps_neg   frD, frB
+
  ps_neg     frD, frB
    
  frD(ps0) = -frB(ps0)
 
  frD(ps0) = -frB(ps0)
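
As an illustration of the sign-manipulation instructions above, the following Python sketch models a paired-single register as a two-element tuple <code>(ps0, ps1)</code> and applies the same operation to both slots. The tuple representation and the helper name <code>ps_unary</code> are assumptions made for this example only.

```python
def ps_unary(op, b):
    """Apply op independently to both slots of the pair b."""
    return (op(b[0]), op(b[1]))


def ps_abs(b):
    return ps_unary(abs, b)


def ps_mr(b):
    return ps_unary(lambda x: x, b)  # plain register move


def ps_nabs(b):
    return ps_unary(lambda x: -abs(x), b)


def ps_neg(b):
    return ps_unary(lambda x: -x, b)
```

For the input pair (-1.5, 2.0), the model gives (1.5, 2.0) for ps_abs, (-1.5, -2.0) for ps_nabs and (1.5, -2.0) for ps_neg.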
Line 73: Line 73:     
=== ps_res ===
 
=== ps_res ===
  ps_res   frD, frB
+
  ps_res     frD, frB
    
  frD(ps0) = 1/frB(ps0)
 
  frD(ps0) = 1/frB(ps0)
Line 80: Line 80:     
=== ps_rsqrte ===
 
=== ps_rsqrte ===
  ps_rsqrte frD, frB
+
  ps_rsqrte frD, frB
    
  frD(ps0) = 1/sqrt(frB(ps0))
 
  frD(ps0) = 1/sqrt(frB(ps0))
Line 89: Line 89:  
Simple everyday math.
 
Simple everyday math.
 
=== ps_add ===
 
=== ps_add ===
  ps_add   frD, frA, frB
+
  ps_add     frD, frA, frB
    
  frD(ps0) = frA(ps0) + frB(ps0)
 
  frD(ps0) = frA(ps0) + frB(ps0)
Line 95: Line 95:     
=== ps_div ===
 
=== ps_div ===
  ps_div   frD, frA, frB
+
  ps_div     frD, frA, frB
    
  frD(ps0) = frA(ps0) / frB(ps0)
 
  frD(ps0) = frA(ps0) / frB(ps0)
Line 101: Line 101:     
=== ps_mul ===
 
=== ps_mul ===
  ps_mul   frD, frA, frC
+
  ps_mul     frD, frA, frC
    
  frD(ps0) = frA(ps0) * frC(ps0)
 
  frD(ps0) = frA(ps0) * frC(ps0)
Line 107: Line 107:     
=== ps_sub ===
 
=== ps_sub ===
  ps_sub   frD, frA, frB
+
  ps_sub     frD, frA, frB
    
  frD(ps0) = frA(ps0) - frB(ps0)
 
  frD(ps0) = frA(ps0) - frB(ps0)
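
The four basic operations can be sketched in Python by applying an operator to each slot and rounding the result to single precision. This is an approximation under stated assumptions: registers are modelled as <code>(ps0, ps1)</code> tuples, the helper names are hypothetical, and the intermediate result is computed in double precision before rounding. Division by zero raises a Python exception here instead of producing an infinity as the hardware would.

```python
import operator
import struct


def to_single(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


def ps_op(op, a, b):
    """Apply a two-operand operator to both slots, rounding each result."""
    return (to_single(op(a[0], b[0])), to_single(op(a[1], b[1])))


def ps_add(a, b):
    return ps_op(operator.add, a, b)


def ps_sub(a, b):
    return ps_op(operator.sub, a, b)


def ps_mul(a, c):
    return ps_op(operator.mul, a, c)


def ps_div(a, b):
    return ps_op(operator.truediv, a, b)
```

One instruction therefore performs two independent scalar operations, which is where the speed-up over ordinary floating-point code comes from.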
Line 114: Line 114:  
== Comparison ==
 
== Comparison ==
 
=== ps_cmpo0 ===
 
=== ps_cmpo0 ===
  ps_cmpo0 crfD, frA, frB
+
  ps_cmpo0   crfD, frA, frB
  ps_cmpu0 crfD, frA, frB
+
  ps_cmpu0   crfD, frA, frB
    
  crfD = frA(ps0) compare frB(ps0)
 
  crfD = frA(ps0) compare frB(ps0)
    
=== ps_cmpo1 ===
 
=== ps_cmpo1 ===
  ps_cmpo1 crfD, frA, frB
+
  ps_cmpo1   crfD, frA, frB
  ps_cmpu1 crfD, frA, frB
+
  ps_cmpu1   crfD, frA, frB
    
  crfD = frA(ps1) compare frB(ps1)
 
  crfD = frA(ps1) compare frB(ps1)
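
The "compare" operation above can be sketched as a function that returns the four-bit condition-register field. The sketch assumes the usual PowerPC floating-point compare layout (less than, greater than, equal, unordered) and models registers as <code>(ps0, ps1)</code> tuples; the name <code>ps_cmp</code> and the <code>slot</code> argument are hypothetical. As far as the result field goes, the ordered and unordered variants are assumed to agree; they are expected to differ only in which exception flags a NaN operand raises, which this model ignores.

```python
import math

# Bit values within the 4-bit CR field, most significant first.
LT, GT, EQ, UN = 0b1000, 0b0100, 0b0010, 0b0001


def ps_cmp(a, b, slot):
    """Compare one slot (0 or 1) of a and b and return the CR field value."""
    x, y = a[slot], b[slot]
    if math.isnan(x) or math.isnan(y):
        return UN  # unordered: at least one operand is NaN
    if x < y:
        return LT
    if x > y:
        return GT
    return EQ
```

Here <code>slot=0</code> corresponds to ps_cmpo0/ps_cmpu0 and <code>slot=1</code> to ps_cmpo1/ps_cmpu1.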
Line 128: Line 128:  
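These instructions multiply, optionally combining the product with an addition or subtraction, or taking the multiplier from a single slot of '''frC'''.
 
These instructions multiply, optionally combining the product with an addition or subtraction, or taking the multiplier from a single slot of '''frC'''.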
 
=== ps_madd ===
 
=== ps_madd ===
  ps_madd   frD, frA, frC, frB
+
  ps_madd   frD, frA, frC, frB
    
  frD(ps0) = frA(ps0) * frC(ps0) + frB(ps0)
 
  frD(ps0) = frA(ps0) * frC(ps0) + frB(ps0)
Line 134: Line 134:     
=== ps_madds0 ===
 
=== ps_madds0 ===
  ps_madds0 frD, frA, frC, frB
+
  ps_madds0 frD, frA, frC, frB
    
  frD(ps0) = frA(ps0) * frC(ps0) + frB(ps0)
 
  frD(ps0) = frA(ps0) * frC(ps0) + frB(ps0)
Line 140: Line 140:     
=== ps_madds1 ===
 
=== ps_madds1 ===
  ps_madds1 frD, frA, frC, frB
+
  ps_madds1 frD, frA, frC, frB
    
  frD(ps0) = frA(ps0) * frC(ps1) + frB(ps0)
 
  frD(ps0) = frA(ps0) * frC(ps1) + frB(ps0)
Line 146: Line 146:     
=== ps_msub ===
 
=== ps_msub ===
  ps_msub   frD, frA, frC, frB
+
  ps_msub   frD, frA, frC, frB
    
  frD(ps0) = frA(ps0) * frC(ps0) - frB(ps0)
 
  frD(ps0) = frA(ps0) * frC(ps0) - frB(ps0)
Line 152: Line 152:     
=== ps_muls0 ===
 
=== ps_muls0 ===
  ps_muls0 frD, frA, frC
+
  ps_muls0   frD, frA, frC
    
  frD(ps0) = frA(ps0) * frC(ps0)
 
  frD(ps0) = frA(ps0) * frC(ps0)
Line 158: Line 158:     
=== ps_muls1 ===
 
=== ps_muls1 ===
  ps_muls1 frD, frA, frC
+
  ps_muls1   frD, frA, frC
    
  frD(ps0) = frA(ps0) * frC(ps1)
 
  frD(ps0) = frA(ps0) * frC(ps1)
Line 164: Line 164:     
=== ps_nmadd ===
 
=== ps_nmadd ===
  ps_nmadd frD, frA, frC, frB
+
  ps_nmadd   frD, frA, frC, frB
    
  frD(ps0) = -(frA(ps0) * frC(ps0) + frB(ps0))
 
  frD(ps0) = -(frA(ps0) * frC(ps0) + frB(ps0))
Line 170: Line 170:     
=== ps_nmsub ===
 
=== ps_nmsub ===
  ps_nmsub frD, frA, frC, frB
+
  ps_nmsub   frD, frA, frC, frB
    
  frD(ps0) = -(frA(ps0) * frC(ps0) - frB(ps0))
 
  frD(ps0) = -(frA(ps0) * frC(ps0) - frB(ps0))
 
  frD(ps1) = -(frA(ps1) * frC(ps1) - frB(ps1))
 
  frD(ps1) = -(frA(ps1) * frC(ps1) - frB(ps1))
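
To show how the scalar variants fit together, here is a Python sketch of a 2x2 matrix times 2-vector product built from ps_muls0 and ps_madds1. Registers are modelled as <code>(ps0, ps1)</code> tuples, and the ps1 results of the scalar forms are assumed to reuse the same '''frC''' slot as the ps0 results. The helper <code>mat2_mul_vec2</code> is hypothetical, and the model does not reproduce single-precision rounding or fused rounding behaviour.

```python
def ps_madd(a, c, b):
    return (a[0] * c[0] + b[0], a[1] * c[1] + b[1])


def ps_nmsub(a, c, b):
    return (-(a[0] * c[0] - b[0]), -(a[1] * c[1] - b[1]))


def ps_muls0(a, c):
    # Both slots of a are multiplied by the scalar in c(ps0).
    return (a[0] * c[0], a[1] * c[0])


def ps_madds1(a, c, b):
    # Both slots of a are multiplied by the scalar in c(ps1), then b is added.
    return (a[0] * c[1] + b[0], a[1] * c[1] + b[1])


def mat2_mul_vec2(col0, col1, v):
    """Multiply a column-major 2x2 matrix by the vector v = (x, y)."""
    acc = ps_muls0(col0, v)          # col0 * x
    return ps_madds1(col1, v, acc)   # col1 * y + col0 * x
```

With columns (1, 3) and (2, 4) and the vector (5, 6), the model returns (17, 39), using two instructions instead of four scalar multiplies and two adds.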
 +
 +
== Miscellaneous ==
 +
Instructions that do not fit into the other categories.
 +
=== ps_merge00 ===
 +
ps_merge00 frD, frA, frB
 +
 +
frD(ps0) = frA(ps0)
 +
frD(ps1) = frB(ps0)
 +
 +
=== ps_merge01 ===
 +
ps_merge01 frD, frA, frB
 +
 +
frD(ps0) = frA(ps0)
 +
frD(ps1) = frB(ps1)
 +
 +
=== ps_merge10 ===
 +
ps_merge10 frD, frA, frB
 +
 +
frD(ps0) = frA(ps1)
 +
frD(ps1) = frB(ps0)
 +
 +
=== ps_merge11 ===
 +
ps_merge11 frD, frA, frB
 +
 +
frD(ps0) = frA(ps1)
 +
frD(ps1) = frB(ps1)
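
The four merge instructions follow one pattern: the first digit selects the slot taken from '''frA''' and the second digit selects the slot taken from '''frB'''. A minimal Python sketch, with registers modelled as <code>(ps0, ps1)</code> tuples and a hypothetical generic helper <code>ps_merge</code>:

```python
def ps_merge(a, b, slot_a, slot_b):
    """Build a pair from slot_a of a (into ps0) and slot_b of b (into ps1)."""
    return (a[slot_a], b[slot_b])


def ps_merge00(a, b):
    return ps_merge(a, b, 0, 0)


def ps_merge01(a, b):
    return ps_merge(a, b, 0, 1)


def ps_merge10(a, b):
    return ps_merge(a, b, 1, 0)


def ps_merge11(a, b):
    return ps_merge(a, b, 1, 1)
```

Passing the same register twice gives some handy idioms in this model: <code>ps_merge10(x, x)</code> swaps the two slots, and <code>ps_merge00(x, x)</code> copies ps0 into both.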
 +
 +
=== ps_sel ===
 +
ps_sel    frD, frA, frC, frB
 +
 +
if(frA(ps0) >= 0)
 +
        frD(ps0) = frC(ps0)
 +
else
 +
        frD(ps0) = frB(ps0)
 +
if(frA(ps1) >= 0)
 +
        frD(ps1) = frC(ps1)
 +
else
 +
        frD(ps1) = frB(ps1)
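
The selection logic above allows simple branch-free code. The Python sketch below models registers as <code>(ps0, ps1)</code> tuples and uses ps_sel to clamp negative values to zero; the helper name <code>clamp_negative_to_zero</code> is a hypothetical example, not an established idiom name.

```python
def ps_sel(a, c, b):
    """Per slot: pick from c when a >= 0, otherwise from b."""
    return (
        c[0] if a[0] >= 0 else b[0],
        c[1] if a[1] >= 0 else b[1],
    )


def clamp_negative_to_zero(x):
    """Per-slot max(x, 0) without a branch in the instruction stream."""
    zero = (0.0, 0.0)
    return ps_sel(x, x, zero)
```

A NaN in '''frA''' fails the comparison in this model and so selects the '''frB''' slot.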
 +
 +
=== ps_sum0 ===
 +
ps_sum0    frD, frA, frC, frB
 +
 +
frD(ps0) = frA(ps0) + frB(ps1)
 +
frD(ps1) = frC(ps1)
 +
 +
=== ps_sum1 ===
 +
ps_sum1    frD, frA, frC, frB
 +
 +
frD(ps0) = frC(ps0)
 +
frD(ps1) = frA(ps0) + frB(ps1)
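
A common use for the sum instructions is a horizontal add, for instance to finish a dot product. The Python sketch below models registers as <code>(ps0, ps1)</code> tuples; the <code>dot2</code> helper is a hypothetical example, and single-precision rounding is not modelled.

```python
def ps_mul(a, c):
    return (a[0] * c[0], a[1] * c[1])


def ps_sum0(a, c, b):
    # ps0 receives the cross-slot sum, ps1 is passed through from c.
    return (a[0] + b[1], c[1])


def ps_sum1(a, c, b):
    # ps1 receives the cross-slot sum, ps0 is passed through from c.
    return (c[0], a[0] + b[1])


def dot2(u, v):
    """Dot product of two 2-vectors: multiply per slot, then add across slots."""
    p = ps_mul(u, v)            # (u0*v0, u1*v1)
    return ps_sum0(p, p, p)[0]  # u0*v0 + u1*v1 ends up in ps0
```

For u = (1, 2) and v = (3, 4) the model returns 11.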