Take a typical 'Hello World' C program [ hello.c ]:
/* Hello World in traditional C using printf() */
#include <stdio.h>
int main() {
    printf("Hello World!\n");
}
After compiling the code into an executable file, I can look at its assembler code using these commands:
gcc -o hello hello.c        -> compile the hello.c file into the hello executable file
./hello                     -> execute the file (run it on the command line)
objdump -d hello | less     -> disassemble the hello executable and page through the output
objdump prints a lot of other code as well (startup and support routines added by the toolchain), but the section that does what we asked for in our program looks like this:
00000000004004f6 <main>:
  4004f6:       55                      push   %rbp
  4004f7:       48 89 e5                mov    %rsp,%rbp
  4004fa:       bf a0 05 40 00          mov    $0x4005a0,%edi
  4004ff:       e8 ec fe ff ff          callq  4003f0 <puts@plt>
  400504:       b8 00 00 00 00          mov    $0x0,%eax
  400509:       5d                      pop    %rbp
  40050a:       c3                      retq
  40050b:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
This is basically how you would disassemble a compiled C program.
For our lab, we are going to disassemble similar programs written in x86_64 and AArch64 assembler. For example, here is the hello-gas.s file (the source code, not an executable file yet):
/* 
   This is a 'hello world' program in x86_64 assembler using the 
   GNU assembler (gas) syntax. Note that this program runs in 64-bit
   mode.
   CTyler, Seneca College, 2014-01-20
   Licensed under GNU GPL v2+
*/
.text
.globl  _start
_start:
        movq    $len,%rdx                       /* message length */
        movq    $msg,%rsi                       /* message location */
        movq    $1,%rdi                         /* file descriptor stdout */
        movq    $1,%rax                         /* syscall sys_write */
        syscall
        movq    $0,%rdi                         /* exit status */
        movq    $60,%rax                        /* syscall sys_exit */
        syscall
.section   .rodata
msg:    .ascii      "Hello, world!\n"
            len = . - msg
We need to build this into an executable file using the following commands, which I took from the Makefile:
as -o hello-gas.o hello-gas.s     -> assemble the source into an intermediate object file
ld -o hello-gas hello-gas.o       -> link the object file into an executable file
Then we can look at the assembler code inside the executable file:
objdump -d hello-gas
Which gives the result:
hello-gas:     file format elf64-x86-64
Disassembly of section .text:
0000000000400078 <_start>:
  400078:       48 c7 c2 0e 00 00 00    mov    $0xe,%rdx
  40007f:       48 c7 c6 a6 00 40 00    mov    $0x4000a6,%rsi
  400086:       48 c7 c7 01 00 00 00    mov    $0x1,%rdi
  40008d:       48 c7 c0 01 00 00 00    mov    $0x1,%rax
  400094:       0f 05                   syscall
  400096:       48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  40009d:       48 c7 c0 3c 00 00 00    mov    $0x3c,%rax
  4000a4:       0f 05                   syscall
I repeated the same steps to build all the x86_64 files and view their assembler code, then I moved on to the AArch64 files. One thing worth noting before moving on: the hand-written x86_64 assembler program contains strictly what is necessary to perform its task. The C programs that I had disassembled earlier had a LOT more assembler code that had nothing to do with the actual function of the program (which is simply to print "Hello, world!"). As far as I can tell, that extra code is the C runtime startup and the dynamic-linking stubs (such as puts@plt) that the compiler and linker add around main(), whereas the assembler version defines _start itself and makes the system calls directly.
For AArch64, the hello.s file looks like:
.text
.globl _start
_start:
        mov     x0, 1           /* file descriptor: 1 is stdout */
        adr     x1, msg         /* message location (memory address) */
        mov     x2, len         /* message length (bytes) */
        mov     x8, 64          /* write is syscall #64 */
        svc     0               /* invoke syscall */
        mov     x0, 0           /* status -> 0 */
        mov     x8, 93          /* exit is syscall #93 */
        svc     0               /* invoke syscall */
.data
msg:    .ascii      "Hello, world!\n"
len=    . - msg
I followed a similar process, starting with:
as -o hello.o hello.s
However, right away I got this response:
hello.s: Assembler messages:
hello.s:5: Error: too many memory references for `mov'
hello.s:6: Error: no such instruction: `adr x1,msg'
hello.s:7: Error: too many memory references for `mov'
hello.s:9: Error: too many memory references for `mov'
hello.s:10: Error: no such instruction: `svc 0'
hello.s:12: Error: too many memory references for `mov'
hello.s:13: Error: too many memory references for `mov'
hello.s:14: Error: no such instruction: `svc 0'
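I'm not sure yet why I got that, though messages like "too many memory references for `mov'" and "no such instruction" are what the x86 assembler reports when it is given syntax it does not recognise. Most likely I ran as on the x86_64 machine; AArch64 source has to be assembled on an AArch64 system or with an AArch64 cross-assembler. For now I will continue with the rest of the lab and come back to this.
The next part of the lab uses a loop, with some variations. The original code looks like: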
.text
.globl    _start
start = 0                       /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 10                        /* loop exits when the index hits this number (loop condition is i<max) */
_start:
    mov     $start,%r15         /* loop index */
loop:
    /* ... body of the loop ... do something useful here ... */
    inc     %r15                /* increment index */
    cmp     $max,%r15           /* see if we're done */
    jne     loop                /* loop if we're not */
    mov     $0,%rdi             /* exit status */
    mov     $60,%rax            /* syscall sys_exit */
    syscall
By itself, the code does nothing visible, but I need to adjust it so that it prints something like:
Loop Loop Loop Loop Loop Loop Loop Loop Loop Loop
So I need to find where to make a change that produces that result.
Looking at the previous examples, I will need a message definition like:
msg:    .ascii      "Hello, world!\n"
So I'll change it to:
msg:    .ascii      "Loop\n"
[ NEED TO FINISH THE LAB ]
 