Difference between revisions of "Inline Assembler"
Revision as of 17:31, 24 May 2022
Inline assembler lets you insert assembler instructions into C or C++ code. GCC provides the "asm" expression for this purpose. This tutorial describes how to use PowerPC instructions as inline assembler code with GCC.
To start, this example adds an "ori 0,0,0" instruction, which acts as a NOP (no operation), to the C code:
asm volatile ("ori 0,0,0"); /* NOP */
The assembler code is in quotation marks. The modifier "volatile" stops GCC from optimizing the assembler code, for example by removing it when its results appear unused, which may not be what is intended. If "asm" or "volatile" are already used by your program as names you can use __asm__ and __volatile__ instead.
Some compilers save all registers before executing inline assembler instructions and restore them afterwards. GCC does not do this. Modifying a register without listing it in the "clobber list" may crash your program. The clobber list is described below.
To add several assembler statements at once you can use:
asm volatile (
    "ori 0,0,0\n\t"
    "mr 0,0\n\t"
    "rlwinm 0,0,0,0,31"
);
None of the statements in this example has any effect. The \n\t sequences separate the statements and improve the readability of the assembler source that GCC generates when you compile with the -save-temps switch. You can also compile the code with the -mregnames command line switch. GCC will then output register names in the assembly language output, so instead of "mr 1,11" GCC will output "mr %r1,%r11".
To access the variables within the C code the asm expression includes additional parts which are separated by colons. This is the general form of the asm expression:
asm(code : output operand list : input operand list : clobber list);
If e.g. no clobber list needs to be specified, the last colon may be omitted. The first example above was short for:
asm volatile ("ori 0,0,0" :::); /* NOP */
In the following example the long integer value in_value is moved into the long integer value out_value. Using the "r" constraint these variables are passed to the assembler code as registers:
long in_value;
long out_value;

asm (
    "mr %[out_value],%[in_value]"  /* destination first: out_value = in_value */
    :[out_value]"=r" (out_value)   /* output */
    :[in_value]"r" (in_value)      /* input */
    :"20"                          /* GPR20 will be clobbered */
);
Following the code, which consists only of the mr instruction, the output operand is specified. The symbolic name in square brackets is used within the assembler code to refer to this variable. The example uses the same name as in the C code to improve readability. Next, in quotation marks, comes the so-called "constraint", which specifies the type of the operand and is discussed below. Finally, enclosed in parentheses, comes the C expression that determines the variable passed to the assembler code. Output operands must be lvalues, that is, expressions that can be assigned to, such as variables.
In the next line the input operand is specified using the same syntax as the output operand. If there is more than one input operand, they are separated by commas within this part. These operands can be any C expression, e.g. a variable or a member of a structure.
Finally, the clobber list specifies the registers that are modified or overwritten by the assembler statements. This way GCC can make sure that these modifications do not corrupt the rest of the C code. Otherwise the program may behave incorrectly or crash! If you use compare instructions you have to include "cc" here for the condition register.
In the assembler code section the C expressions are referenced by a percent sign followed by the symbolic name in square brackets. Multiple statements are separated by a semicolon or \n.
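As a minimal sketch, the operand syntax described above can be wrapped in a small C function. The function name copy_value is hypothetical and not part of the original examples, and the plain C branch is only an assumed fallback so that the snippet also builds on machines that are not PowerPC:

```c
/* Hypothetical helper: copies a value from one register to another with "mr". */
long copy_value(long in_value)
{
    long out_value;
#if defined(__powerpc__) || defined(__PPC__)
    __asm__ (
        "mr %[out_value],%[in_value]"   /* out_value = in_value */
        : [out_value] "=r" (out_value)  /* output operand, write-only register */
        : [in_value] "r" (in_value)     /* input operand, read-only register */
    );                                  /* no clobber list: no other register is touched */
#else
    out_value = in_value;               /* assumed fallback for non-PowerPC builds */
#endif
    return out_value;
}
```

Calling copy_value(42) should simply return 42. Because only the two operand registers chosen by GCC are used, the clobber list can be left out here.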
In the following example in_value and out_value are passed to the assembler code as memory addresses by using the "m" constraint.
long in_value;
long out_value;

asm (
    "lwz 20,%[in_value];"          /* move in_value to GPR20 */
    "stw 20,%[out_value]"          /* move GPR20 to out_value */
    :[out_value]"=m" (out_value)   /* output */
    :[in_value]"m" (in_value)      /* input */
    :"20"                          /* GPR20 will be clobbered */
);
In older versions of GCC, symbolic names in square brackets were not available. The output and input operands were referenced by numbers instead. The sample above would then have looked like this:
long in_value;
long out_value;

asm (
    "lwz 20,%1\n\t"        /* move in_value to GPR20 */
    "stw 20,%0"            /* move GPR20 to out_value */
    :"=m" (out_value)      /* output - %0 -> operand zero */
    :"m" (in_value)        /* input - %1 -> operand one */
    :"20"                  /* GPR20 will be clobbered */
);
You will find this type of coding in source files and it is still supported by GCC. Here the assembler code lines are separated by \n\t. \t adds a tab on the new line in the assembler output file.
The most common constraints are:
r = general register
m = memory address
i = symbolic constant to be used as an immediate value
There are more constraints documented in the GCC manual.
You can add modifiers to these constraints. A constraint without a modifier denotes a read-only operand. Otherwise add:
"=" for a write-only operand
"+" for a read-write operand
"&" to mark an operand as "earlyclobber"
A "=" or "+" modifier has to be added to the constraint of every output operand.
The "earlyclobber" modifier can be added to output operands (e.g. "=&r") to make sure GCC uses different registers for input and output operands. GCC assumes that the assembler code has finished reading all input operands before it writes any output operand, so it may reuse an input register for an output operand. If the asm contains several statements, an output operand may be written before the code is finished with the input operands, and in that case the output operand needs the "&" modifier.
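The following sketch illustrates a case where "earlyclobber" matters. The function name sum_and_diff and its parameters are hypothetical, and the plain C branch is an assumed fallback for builds that do not target PowerPC. The first instruction writes sum while a and b are still needed by the second instruction, so sum is marked "=&r". The operand diff is only written by the last instruction and can do without the modifier:

```c
/* Hypothetical helper: computes a + b and a - b in one asm expression. */
void sum_and_diff(long a, long b, long *sum_out, long *diff_out)
{
    long sum;
    long diff;
#if defined(__powerpc__) || defined(__PPC__)
    __asm__ (
        "add %[sum],%[a],%[b]\n\t"   /* sum = a + b, written before a and b are done */
        "subf %[diff],%[b],%[a]"     /* diff = a - b (subf computes third minus second) */
        : [sum] "=&r" (sum),         /* earlyclobber: must not share a register with a or b */
          [diff] "=r" (diff)         /* written last, may reuse an input register */
        : [a] "r" (a),
          [b] "r" (b)
    );
#else
    sum = a + b;                     /* assumed fallback for non-PowerPC builds */
    diff = a - b;
#endif
    *sum_out = sum;
    *diff_out = diff;
}
```

Without the "&" modifier GCC would be free to place sum in the register that holds a or b, and the subf instruction would then read an already overwritten value.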
In the next example out_value is set to four. The constant "const_value" will be passed as an immediate operand:
#define const_value 4
long out_value;

asm (
    "li %[out_value],%[const_value]"
    :[out_value]"=r" (out_value)          /* output */
    :[const_value]"i" (const_value)       /* input */
);
If you want to use the same variable for input and output you have to add the "+" modifier to the output operand. The following example does this: it takes out_value as input and doubles it:
long out_value;
asm (
    "add %[out_value],%[out_value],%[out_value]\n\t"
    :[out_value]"+r" (out_value)   /* output and input */
);
With older versions of GCC this example would be written as:
long out_value;
asm (
    "add %[out_value],%[out_value],%1\n\t"
    :[out_value]"=r" (out_value)   /* output */
    :"0" (out_value)               /* input */
);
The constraint "0" (zero) for operand 1 (input) specifies that this operand must occupy the same location as operand 0 (output). A number in a constraint may only be used for an input operand and this has to refer to an output operand. Observe that this operand is used as %1 in the assembler instruction.
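To compare the two styles side by side, here is a minimal sketch with both variants wrapped in functions. The names double_rw and double_matching are hypothetical, and the plain C branches are assumed fallbacks for builds that do not target PowerPC:

```c
/* Hypothetical helper: doubles a value using one read-write ("+r") operand. */
long double_rw(long value)
{
#if defined(__powerpc__) || defined(__PPC__)
    __asm__ (
        "add %[value],%[value],%[value]"
        : [value] "+r" (value)          /* output and input in one operand */
    );
#else
    value = value + value;              /* assumed fallback for non-PowerPC builds */
#endif
    return value;
}

/* Hypothetical helper: doubles a value using the matching constraint "0". */
long double_matching(long value)
{
#if defined(__powerpc__) || defined(__PPC__)
    __asm__ (
        "add %0,%1,%1"
        : "=r" (value)                  /* operand 0: output */
        : "0" (value)                   /* operand 1: input, same location as operand 0 */
    );
#else
    value = value + value;              /* assumed fallback for non-PowerPC builds */
#endif
    return value;
}
```

Both functions should return the same result for the same argument, e.g. 42 for an argument of 21.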
Here is a very simple example of a subroutine that includes an "asm" statement. It just tests whether the input value (which is also the output value) is four. If this is the case the value is changed to eight. Since the "cmpwi" instruction modifies the condition register we have to add "cc" to the clobber list:
int test_for_4(int out_value)
{
    asm (
        "cmpwi %[out_value],4\n\t"     /* compare value in out_value with 4 */
        "bne else_label\n\t"           /* if not 4 goto else_label */
        "li %[out_value],8\n\t"        /* if 4 then make it 8 */
        "b endif_label\n\t"            /* jump over else part */
        "else_label:\n\t"
        "ori 0,0,0\n\t"                /* nop */
        "endif_label:\n\t"
        :[out_value]"+r" (out_value)   /* output and input */
        :                              /* no separate input operand */
        :"cc"                          /* condition register will be clobbered */
    );
    return out_value;
}
This function is then called e.g. with:
out_value = test_for_4(out_value);
You may branch only within a single "asm" expression; you cannot jump to a different "asm" expression within the C code.
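Named labels such as else_label become ordinary assembler symbols, so they can collide if the compiler emits the same "asm" expression more than once, for instance when the function is inlined in several places. A possible way around this, sketched below, is to use the numeric local labels of the GNU assembler ("1:" as the label and "1f" for "the next label 1 forward"). The function name test_for_4_local is hypothetical, and the plain C branch is an assumed fallback for builds that do not target PowerPC:

```c
/* Hypothetical variant of the test-for-four subroutine using a numeric local label. */
int test_for_4_local(int value)
{
#if defined(__powerpc__) || defined(__PPC__)
    __asm__ (
        "cmpwi %[value],4\n\t"      /* compare value with 4 */
        "bne 1f\n\t"                /* if not 4, skip forward to local label 1 */
        "li %[value],8\n"           /* if 4 then make it 8 */
        "1:"                        /* local label, may be reused in other asm expressions */
        : [value] "+r" (value)      /* output and input */
        :                           /* no separate input operand */
        : "cc"                      /* condition register will be clobbered */
    );
#else
    if (value == 4) {               /* assumed fallback for non-PowerPC builds */
        value = 8;
    }
#endif
    return value;
}
```

The branch still stays inside the single "asm" expression; only the naming of the label changes.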
The following example reads bit 31 of the machine state register to determine whether the processor is in big or little endian mode:
asm (
    "mfmsr 20\n\t"                           /* move from Machine State Register into GPR20 */
    "rlwinm %[out_value],20,0,31,31\n\t"     /* no rotation, mask all bits but bit 31 (the LE bit) */
    :[out_value]"=r" (out_value)             /* output operand */
    :                                        /* no input operand */
    :"20"                                    /* GPR20 will be clobbered */
);
if (out_value==0) {
    printf("Big endian mode\n");
} else {
    printf("Little endian mode\n");
}
The PowerPC processor defaults to big endian mode but can be switched into little endian mode on startup.
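For comparison, the byte order as seen by a program can also be checked without any assembler code. The following is a minimal sketch in plain C, not a replacement for reading the machine state register. The function name is_little_endian is hypothetical. It stores the value 1 in an integer and looks at the byte at the lowest address:

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the lowest-addressed byte of an
   integer holds its least significant bits (little endian), else 0. */
int is_little_endian(void)
{
    unsigned int probe = 1;
    unsigned char first_byte;

    memcpy(&first_byte, &probe, 1);   /* copy the byte at the lowest address */
    return first_byte == 1;
}
```

On a PowerPC system running in the default big endian mode this function is expected to return 0.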