Hello World on a 32-bit x86 GNU machine.

I used to know x86 assembly intimately.  It was required to pass high school advanced comp sci.  But I haven’t used it since.  So I thought I’d sit down and see if I could remember how to write Hello, World in 32-bit x86 assembly.  It ran the first time.  Yayy.  So I annotated it and now I’m posting it here for others who want to learn x86 assembly.

.text				# begins an executable code section
LC0:				# a local label marking the string constant
	.ascii "Hello, world!\12\0"	# \12 is octal for newline, \0 terminates the string

.globl _main			# exports the _main symbol so the linker can see it
_main:				# start of execution

	# The first three instructions are the standard function prologue.
	# They set up a stack frame and allocate space on the stack for local
	# variables and outgoing arguments.  In this case, we allocate 8 bytes.
	pushl	%ebp		# saves the caller's base address for locals onto the stack
	movl	%esp,%ebp	# replaces the local base address with the stack pointer
	subl	$8,%esp		# grows the stack downward by 8 bytes
	
	# Zero EAX and keep a copy in the upper half of the allocated stack
	# memory.  The stored copy is never read back.  printf always writes to
	# STDOUT (fd 1), so there is no output stream to set up here.
	xorl	%eax,%eax	# XORing a register with itself sets it to zero
	movl	%eax,-4(%ebp)	# puts zero into the upper half of the 8 bytes
	
	# Set up the C library environment for future calls.  On MinGW-style
	# toolchains (which these symbol names suggest), __alloca takes its
	# byte count in EAX, so the zero above means no extra stack is allocated.
	call	__alloca
	call	___main

	# Put the address of the string constant in the lower half of the
	# allocated stack, which is where printf looks for its first argument.
	movl	$LC0,(%esp)
	
	# Call the C printf function to print the message.
	call	_printf

	# printf left its return value in EAX but we don't care.
	# We're going to zero EAX since that'll also be our program's return value.
	xorl	%eax,%eax
	leave			# shorthand for movl %ebp,%esp; popl %ebp
	ret			# return to the caller, which hands EAX to the OS as the exit status

I know I could’ve done the bare-metal thing: move $4 into EAX (4 is the 32-bit Linux syscall number for write), load the file descriptor, buffer address, and length into EBX, ECX, and EDX, and trap into the kernel with int $0x80. But hardly anybody writes it that way anymore. There’s little performance penalty for calling printf since the C libraries are already in memory on a running GNU system.
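For comparison, here is a minimal sketch in C of skipping printf and handing the bytes to the kernel's write call. It assumes a POSIX system, and hello_via_write is a made-up name for illustration. It goes through the C library's write() wrapper rather than a hand-coded interrupt, so it only approximates the bare-metal route described above.

```c
#include <unistd.h>

/* Hypothetical helper: sends the greeting to file descriptor 1 (stdout)
   through the POSIX write() wrapper instead of printf.
   Returns the number of bytes written, or -1 on error. */
ssize_t hello_via_write(void)
{
	static const char msg[] = "Hello, world!\n";

	/* sizeof msg counts the trailing '\0', which we do not want to print. */
	return write(STDOUT_FILENO, msg, sizeof msg - 1);
}
```

Calling hello_via_write() from main() prints the same line as the printf version, without any formatting or buffering by stdio.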

The annotated assembly above is equivalent to the following C code:

#include <stdio.h>

int main() {
	printf("Hello, world!\n");
	return 0;
}
  1. #1 by Joshua on October 14, 2010 - 2:32 PM

    I named the string LC0 because that’s the way gcc does it. I could’ve called it anything. Lest you think I just generated the assembly from the C code, mine’s much shorter. gcc fills in argc and argv as arguments to main() even though they’re never used. GCC also aligns the stack to a 16-byte boundary for SIMD even though I never use any SIMD instructions. Also, unless you’re using an optimization switch, gcc won’t use xorl %eax,%eax to zero a register; it’ll use the longer mov $0,%eax.

  2. #2 by Chadwick on October 14, 2010 - 4:49 PM

    It’s always so nice when things work the first time through…and when you find out you haven’t forgotten how to do things.

    Oh, and these code boxes are way nicer than what you used to post.

    • #3 by Joshua on October 14, 2010 - 5:26 PM

      totally. Major props to that dude for telling me about the blocks.

    • #4 by Joshua on December 12, 2010 - 10:05 AM

      I’m just sort of disappointed that there’s no syntax highlighting for AT&T syntax assembly code. The C code looks pretty, though.
