To load and store paired-singles, use the psq_l and psq_st instructions, respectively, or one of their variants.
=== psq_l ===
psq_l frD, d(rA), W, I
This instruction dequantizes two values from the memory address '''d'''+('''rA'''|0) and places them in PS0 and PS1 of '''frD'''. If '''W''' is 1, it dequantizes only one value, places it in PS0, and always sets PS1 to 1.0. '''I''' selects the GQR that supplies the dequantization parameters. The two values are read from consecutive memory locations, whatever their size (for example, if the GQR specifies u16 as the load type, '''d'''+('''rA'''|0) should point to a two-element array of u16s).
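The load behaviour described above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the type names in <code>LOAD_TYPES</code>, the <code>scale</code> parameter and the rule "multiply the raw integer by 2 to the power of minus the scale" are illustrative stand-ins for the parameters held in the GQR, and are not taken from this page. Memory is treated as big-endian.

```python
import struct

# Hypothetical type table: name -> (struct format, size in bytes).
# The names are illustrative; they are not the real GQR type encodings.
LOAD_TYPES = {
    "f32": (">f", 4),
    "u8": (">B", 1),
    "u16": (">H", 2),
    "s8": (">b", 1),
    "s16": (">h", 2),
}


def psq_l(memory, ea, w, load_type, scale):
    """Return (PS0, PS1) as a model of psq_l reading from memory at ea."""
    fmt, size = LOAD_TYPES[load_type]

    def dequantize(addr):
        (raw,) = struct.unpack_from(fmt, memory, addr)
        if load_type == "f32":
            return raw                  # assumed: floats are loaded unchanged
        return raw * 2.0 ** -scale      # assumed: integer scaled by 2**-scale

    ps0 = dequantize(ea)
    # With W = 1 only one value is read and PS1 is forced to 1.0.
    ps1 = 1.0 if w == 1 else dequantize(ea + size)
    return ps0, ps1


# A two-element array of u16s, stored back to back.
memory = struct.pack(">HH", 256, 512)
pair = psq_l(memory, 0, 0, "u16", 8)     # both elements
single = psq_l(memory, 0, 1, "u16", 8)   # W = 1: first element only
```

With a scale of 8, the raw values 256 and 512 come out as 1.0 and 2.0; with '''W''' set to 1 the second element is never read.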
===== psq_lx =====
psq_lx frD, rA, rB, W, I
This instruction acts exactly like psq_l, except that ('''rA'''|0) is offset by ('''rB''') instead of '''d'''.
===== psq_lu =====
psq_lu frD, d(rA), W, I
This instruction acts exactly like psq_l, except that '''rA''' cannot be 0, and the effective address '''d'''+('''rA''') is written back to '''rA'''.
===== psq_lux =====
psq_lux frD, rA, rB, W, I
This instruction acts exactly like psq_lx, except that '''rA''' cannot be 0, and the effective address ('''rA''')+('''rB''') is written back to '''rA'''.
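The four load forms differ only in how the effective address is formed and whether it is written back. The following Python sketch models just that part. The helper name <code>effective_address</code> and the register-file list are hypothetical, the 32-bit wrap-around is an assumption, and the indexed update form is assumed to write back ('''rA''')+('''rB''').

```python
def effective_address(gpr, ra, offset, update):
    """Return the effective address; write it back to rA for update forms.

    gpr    -- list of 32 general-purpose register values
    ra     -- register number used as the base
    offset -- d for psq_l/psq_lu, or the contents of rB for psq_lx/psq_lux
    update -- True for the 'u' forms (psq_lu, psq_lux)
    """
    if update and ra == 0:
        raise ValueError("rA cannot be 0 in the update forms")
    base = 0 if ra == 0 else gpr[ra]      # (rA|0): register number 0 reads as 0
    ea = (base + offset) & 0xFFFFFFFF     # assumed 32-bit wrap-around
    if update:
        gpr[ra] = ea                      # write-back performed by the 'u' forms
    return ea


gpr = [0] * 32
gpr[0] = 0x5555   # ignored when register 0 is named as rA
gpr[3] = 0x1000
gpr[4] = 0x20

ea_l = effective_address(gpr, 3, 8, update=False)         # psq_l:   d + (rA|0)
ea_lx = effective_address(gpr, 3, gpr[4], update=False)   # psq_lx:  (rA|0) + (rB)
ea_zero = effective_address(gpr, 0, 8, update=False)      # psq_l with rA = 0
ea_lu = effective_address(gpr, 3, 8, update=True)         # psq_lu:  rA updated
ea_lux = effective_address(gpr, 3, gpr[4], update=True)   # psq_lux: rA updated again
```

After the last two calls, register 3 holds the most recent effective address, which is what makes the update forms convenient for walking through an array.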
=== psq_st ===
psq_st frD, d(rA), W, I
This instruction quantizes the paired-singles in '''frD''' and stores them at the memory address '''d'''+('''rA'''|0). If '''W''' is 1, it quantizes and stores only PS0. '''I''' selects the GQR that supplies the quantization parameters. The two values are written to consecutive memory locations, whatever their size (for example, if the GQR specifies u16 as the store type, '''d'''+('''rA'''|0) is treated as a two-element array of u16s).
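A store can be modelled the same way as a load, in reverse. The sketch below is an illustration, not a description of the hardware: the type names in <code>STORE_TYPES</code>, the <code>scale</code> parameter, truncation toward zero and clamping to the range of the target type are all assumptions made for the example. Only integer store types are covered.

```python
import struct

# Hypothetical type table: name -> (struct format, minimum, maximum).
STORE_TYPES = {
    "u8": (">B", 0, 255),
    "u16": (">H", 0, 65535),
    "s8": (">b", -128, 127),
    "s16": (">h", -32768, 32767),
}


def psq_st(memory, ea, ps, w, store_type, scale):
    """Model of psq_st: quantize ps = (PS0, PS1) into memory at ea."""
    fmt, lo, hi = STORE_TYPES[store_type]
    size = struct.calcsize(fmt)
    count = 1 if w == 1 else 2            # W = 1 stores PS0 only
    for i in range(count):
        raw = int(ps[i] * 2.0 ** scale)   # assumed: scale, then truncate
        raw = max(lo, min(hi, raw))       # assumed: clamp to the type's range
        struct.pack_into(fmt, memory, ea + i * size, raw)


both = bytearray(4)
psq_st(both, 0, (1.0, 2.5), 0, "u16", 8)      # writes two consecutive u16s

only_ps0 = bytearray(4)
psq_st(only_ps0, 0, (1.0, 2.5), 1, "u16", 8)  # W = 1: second u16 untouched
```

Here 1.0 and 2.5 with a scale of 8 become the raw values 256 and 640, written directly after each other.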
===== psq_stx =====
psq_stx frD, rA, rB, W, I
This instruction acts exactly like psq_st, except that ('''rA'''|0) is offset by ('''rB''') instead of '''d'''.
===== psq_stu =====
psq_stu frD, d(rA), W, I
This instruction acts exactly like psq_st, except that '''rA''' cannot be 0, and the effective address '''d'''+('''rA''') is written back to '''rA'''.
===== psq_stux =====
psq_stux frD, rA, rB, W, I
This instruction acts exactly like psq_stx, except that '''rA''' cannot be 0, and the effective address ('''rA''')+('''rB''') is written back to '''rA'''.
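The write-back performed by the update forms is what lets a loop store a run of pairs without a separate pointer increment. The following Python sketch models psq_stu only, with '''W''' fixed at 0 and an assumed u16 store type with no scaling. The function name <code>psq_stu_u16</code>, the register-file list and the buffer layout are hypothetical.

```python
import struct


def psq_stu_u16(memory, gpr, ra, d, ps):
    """Model of psq_stu with W = 0 and an assumed unscaled u16 store type."""
    if ra == 0:
        raise ValueError("rA cannot be 0 in the update forms")
    ea = gpr[ra] + d
    # Two u16s are written directly after each other (big-endian).
    struct.pack_into(">HH", memory, ea, int(ps[0]), int(ps[1]))
    gpr[ra] = ea                          # effective address written back to rA
    return ea


memory = bytearray(16)
gpr = [0] * 32
buffer_start = 4
gpr[3] = buffer_start - 4                 # one stride before the first pair

for pair in [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]:
    psq_stu_u16(memory, gpr, 3, 4, pair)  # d = 4: size of one u16 pair
```

Starting the base register one stride before the buffer means every iteration uses the same displacement, and the register always points at the pair just written.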

== Single Parameter Operations ==
These instructions each take a single source FPR.
=== ps_abs ===
ps_abs frD, frB
This instruction takes the absolute value of each paired-single in '''frB''' and stores the results in the corresponding paired-singles of '''frD'''.
=== ps_mr ===
ps_mr frD, frB
This instruction copies both paired-singles in '''frB''' into the paired-singles in '''frD'''.
=== ps_nabs ===
ps_nabs frD, frB
This instruction takes the negative absolute value of each paired-single in '''frB''' and stores the results in the corresponding paired-singles of '''frD'''.
=== ps_neg ===
ps_neg frD, frB
This instruction negates each paired-single in '''frB''' and stores the results in the corresponding paired-singles of '''frD'''.
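The four sign and move operations above act on PS0 and PS1 independently. As a rough reference model in Python, with each FPR represented as a hypothetical <code>(PS0, PS1)</code> tuple of floats:

```python
import math


def ps_abs(frb):
    """Absolute value of each element."""
    return tuple(math.fabs(x) for x in frb)


def ps_mr(frb):
    """Copy both elements unchanged."""
    return tuple(frb)


def ps_nabs(frb):
    """Negative absolute value of each element."""
    return tuple(-math.fabs(x) for x in frb)


def ps_neg(frb):
    """Negate each element."""
    return tuple(-x for x in frb)


frb = (-1.5, 2.0)
results = {
    "ps_abs": ps_abs(frb),
    "ps_mr": ps_mr(frb),
    "ps_nabs": ps_nabs(frb),
    "ps_neg": ps_neg(frb),
}
```

Each function returns a new pair, mirroring the way the instructions write '''frD''' and leave '''frB''' untouched.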
=== ps_res ===
ps_res frD, frB
This instruction estimates the reciprocal of each paired-single in '''frB''', accurate to a precision of 1/4096, and stores the results in the corresponding paired-singles of '''frD'''.
=== ps_rsqrte ===
ps_rsqrte frD, frB
This instruction estimates the reciprocal of the square root of each paired-single in '''frB''', accurate to a precision of 1/4096, and stores the results in the corresponding paired-singles of '''frD'''.
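To give a feel for what a precision of 1/4096 means, the Python sketch below builds a stand-in estimate by computing the exact result and then discarding all but the top 12 fractional mantissa bits. This is not the algorithm the hardware uses. It is only an assumed model whose relative error stays below 1/4096, and all function names are hypothetical. Inputs are assumed to be non-zero (and positive for the square root).

```python
import math


def truncate_mantissa(x, bits=12):
    """Drop mantissa bits so the relative error stays below 2**-bits."""
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        return x
    m, e = math.frexp(x)                  # x = m * 2**e with 0.5 <= |m| < 1
    scale = 2.0 ** (bits + 1)
    return math.ldexp(math.trunc(m * scale) / scale, e)


def ps_res_model(frb):
    """Stand-in for ps_res: low-precision reciprocal of each element."""
    return tuple(truncate_mantissa(1.0 / x) for x in frb)


def ps_rsqrte_model(frb):
    """Stand-in for ps_rsqrte: low-precision 1/sqrt of each element."""
    return tuple(truncate_mantissa(1.0 / math.sqrt(x)) for x in frb)


def relative_error(estimate, exact):
    return abs(estimate - exact) / abs(exact)


res = ps_res_model((3.0, 7.0))
rsqrte = ps_rsqrte_model((3.0, 7.0))
```

Because the results are only estimates, code that needs full single precision cannot rely on them being exact, even though simple inputs such as powers of two come out exactly in this model.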