Assembly HOWTO
aka *Free* 32-bit x86 Assembly FAQ
aka Linux x86 Assembly HOWTO

Francois-Rene Rideau rideau@ens.fr
v0.3c, 15 Jun 1996

HOWTO program in x86 assembly using only *FREE* programming tools.

Keywords: assembly, assembler, free, macroprocessor, preprocessor, asm, inline asm, 32-bit, x86, i386, gas, as86, nasm

LEGAL BLURP:
Copyright (C) 1996 Francois-Rene Rideau. You can freely distribute this document, provided no modification is done to the text, but annotations that are clearly indicated as such. You can freely ask me to distribute the document otherwise. The Linux Documentation Project maintainers are free to do anything with this document, that all other LDP documents simultaneously allow, which they can understand optimistically as for a fix-point meaning.

IMPORTANT NOTE:
This is still a *VERY PRELIMINARY* version of this document. You (hey, that's *you* I'm talking to, so please listen!) are especially invited to ask questions, to answer questions, to correct given answers, to add new FAQ answers, to give pointers to other software, to insult the current maintainer (me), and TO TAKE OVER THE MAINTENANCE OF THE FAQ in his place (mine), because I have other things to do... For any of these, please contact me
    mailto:rideau@ens.fr

Perhaps we can convince Raymond Moon to add a section to his FAQ for comp.lang.asm.x86... ?

1. INTRODUCTION.

This document aims at answering frequently asked questions of people who program or want to program 32-bit x86 assembly using *free* assemblers, particularly under the Linux operating system. It may also point to other documents about non-free, non-x86, or non-32-bit assemblers, though that is not its primary goal.

Because the main interest of assembly programming is to write the guts of operating systems, languages, and games, where a C compiler fails to provide the needed expressivity (performance is more and more seldom an issue), we stress the development of such software.

1.1 How to use this document

This document contains answers to some frequently asked questions. In many places, Uniform Resource Locators (URLs) are given for some software or documentation repository. Please note that the most useful repositories are mirrored, and that by accessing a nearer mirror site, you relieve the whole Internet of unneeded network traffic, while saving your own precious time. In particular, there are large repositories all over the world that mirror other popular repositories. You should learn and note which of those places are near you (networkwise). Sometimes, the list of mirrors is given in a file, or in a login message. Please heed the advice. Otherwise, you should ask archie about the software you're looking for...

The most recent version of this document sits at
    http://www.eleves.ens.fr:8080/home/rideau/Assembly
but what's in Linux HOWTO repositories *should* be fairly up to date, too (I can't know):
    ftp://sunsite.unc.edu/pub/linux/docs/HOWTO/ (?)

1.2 Other related documents

* If you don't know what *free* software is, please do read *carefully* the GNU General Public License, which is used in a lot of free software, and is a model for most; it generally comes in a file named "COPYING", with a library version in a file named "COPYING.LIB". Literature from the FSF (Free Software Foundation) might help you, too.

* Particularly, the interesting kind of free software comes with sources that you can consult and correct, or sometimes even borrow from. Read your particular license carefully, and do comply with it.

* There is a FAQ for comp.lang.asm.x86 that answers generic questions about x86 assembly programming, and questions about some commercial assemblers in a 16-bit DOS environment. Some of it applies to free 32-bit asm programming, so you may want to read this FAQ...
    http://www2.dgsys.com/~raymoon/faq/asmfaq.zip

* FAQs and docs exist about programming on your favorite platform, whichever it is; you should consult them for platform-specific issues not directly related to programming in assembler.

2. ASSEMBLERS.

2.1 GCC Inline Assembly

The well-known GNU C/C++ Compiler (GCC), an optimizing 32-bit compiler at the heart of the GNU project, supports the x86 architecture quite well, and includes the ability to insert assembly code in C programs, in such a way that register allocation can be either specified or left to GCC. GCC works on most available platforms, notably Linux, *BSD, VSTa, OS/2, *DOS, Win*, etc.

2.1.1 Where to find GCC

The original GCC site is
    ftp://prep.ai.mit.edu/pub/gnu/
together with all the released application software from the GNU project. There are lots of mirrors, though, and sources adapted to your favorite OS, as well as binaries precompiled for it, should be found at your usual FTP sites.

For GCC under Linux, see around
    http://www.linux.org.uk/

The most popular DOS port of GCC is named DJGPP, and can be found in directories of that name on FTP sites. See:
    http://www.delorie.com/djgpp/

There is also a port of GCC to OS/2 named EMX, which also works under DOS; see around:
    http://www.leo.org/pub/comp/os/os2/gnu/emx+gcc/
    http://warp.eecs.berkeley.edu/os2/software/shareware/emx.html

2.1.2 Where to find docs for GCC Inline Asm

The documentation of GCC includes documentation files in texinfo format, which you can convert to tex, compile (with tex), and print; convert to interactive emacs .info format and browse; convert (with the right tools) to whatever you like; or just read as is. The .info files are generally found on any good installation of GCC.

The right section to look for is:
    C Extensions::Extended Asm::

Section
    Invoking GCC::Submodel Options::i386 Options::
might help too.

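To give a taste of what the Extended Asm section describes, here is a minimal sketch, assuming GCC (or a compatible compiler) on an x86 target. The function names add_asm and negate_in_eax are made up for this illustration, and the plain C fallback for non-x86 targets is only there so that the file compiles everywhere.

```c
/* Minimal sketch of GCC extended asm on x86.
   add_asm and negate_in_eax are hypothetical names for illustration. */

/* Adds b to a with one instruction, leaving register choice to GCC. */
int add_asm(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
    int result;
    /* %0 is the output, in any general register ("=r").
       "0" ties the first input (a) to the same register as %0.
       %2 is the second input (b), in any general register ("r").
       "cc" tells GCC that the condition flags are modified. */
    __asm__ ("addl %2, %0"
             : "=r" (result)
             : "0" (a), "r" (b)
             : "cc");
    return result;
#else
    return a + b;               /* portable fallback, not inline asm */
#endif
}

/* Negates x, forcing the operand into %eax with the "a" constraint. */
int negate_in_eax(int x)
{
#if defined(__i386__) || defined(__x86_64__)
    int result;
    /* A register named explicitly in the template is written with a
       doubled percent sign, so that a single % reaches the assembler. */
    __asm__ ("negl %%eax"
             : "=a" (result)
             : "0" (x)
             : "cc");
    return result;
#else
    return -x;                  /* portable fallback, not inline asm */
#endif
}
```

With these definitions, add_asm(2, 3) should yield 5 and negate_in_eax(5) should yield -5. The first function leaves register allocation to GCC, while the second specifies it; the full list of operand constraints is given in the GCC documentation.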
In particular, the documentation gives the i386-specific constraint names for registers: abcdSDB correspond to %eax, %ebx, %ecx, %edx, %esi, %edi, %ebp respectively (no letter for %esp).

A URL for this document and section, as converted to HTML format, is
    http://www.cygnus.com/doc/usegcc_89.html#SEC92

The DJGPP Games resource (not only for game hackers) has this page specifically about assembly:
    http://www.rt66.com/~brennan/djgpp/djgpp_asm.html

Finally, there is a web page called "DJGPP Quick ASM Programming Guide", that covers URLs to FAQs, AT&T x86 ASM syntax, some inline ASM information, and converting .obj/.lib files:
    http://remus.rutgers.edu/~avly/djasm.html

GCC depends on GAS for assembling, and follows its syntax (see below); do mind that inline asm needs percent characters to be quoted so that they are passed to GAS. See the section about GAS below.

Find *lots* of useful examples in the linux/include/asm-i386/ subdirectory of the sources for the free Linux OS.

2.1.3 How should I invoke GCC for it to properly inline my assembly code?

Be sure to invoke GCC with the "-O" flag, to enable optimizations and inline assembly. If you don't, your code may compile, but not run properly!!!

More generally, good compile flags for GCC on the x86 platform are
    gcc -O2 -fomit-frame-pointer -m386

-O2 is the good optimization level. Optimizing beyond it yields code that is a lot larger, but only a bit faster; such overoptimization might be useful for tight loops only (if any), which you may be doing in assembly anyway; if you need that, do it just for the few routines that need it.

-fomit-frame-pointer allows generated code to skip the stupid frame pointer maintenance, which makes code smaller and faster, and frees a register for further optimizations. It precludes the easy use of debugging tools (gdb), but when you use these, you just don't care about size and speed anymore anyway.

-m386 yields more compact code, without any measurable slowdown (note that small code also means less disk I/O and faster execution), except perhaps on the above-mentioned tight loops.

To optimize even more, option -mregparm=2 and/or the corresponding function attribute might help, but might pose lots of problems when linking to foreign code...

Note that you can make these flags the default by editing the file
    /usr/lib/gcc-lib/i486-linux/2.7.2/specs
or wherever that is on your system.

2.2 GAS

GAS is the GNU Assembler, that GCC relies upon.

2.2.1 Where to find it

Find it at the same place where you found GCC, in a package named binutils.

2.2.2 What is this AT&T syntax

Because GAS was invented to support a 32-bit unix compiler, it uses standard "AT&T" syntax, which closely resembles the syntax of standard 680x0 assemblers. This syntax is no worse, no better than the "Intel" syntax. It's just different. When you get used to it, you find it much more regular than the Intel syntax, though a bit boring.

A program exists to help you convert programs from TASM syntax to AT&T syntax. See ftp://x2ftp.ou