Zero Grams of Trans Fat Binaries

People like their applications to work. Even better, they like them to keep working when things change. For the WinTel world, this isn’t a big deal (Vista aside ;), because the underlying CPU architecture hasn’t really changed, from a program’s point of view, in the past two decades. Unless you have a weird program that’s designed for AMD’s 3DNow! instruction set and you switch to an Intel CPU, or perhaps an application designed for a more esoteric old SIMD architecture, your application should run just fine (as long as your operating system is OK with it).

Mac OS X doesn’t have the luxury of working on the same underlying CPU though, so things need to be handled a little bit differently. The solution Apple came up with was the idea of a “Fat” binary, sometimes called a “Universal Binary”. In other words, instead of a single program being contained in a program file, the program file can contain several programs for different architectures. For example:

cwright@phendrana:~>file /bin/ls 
/bin/ls: Mach-O universal binary with 2 architectures
/bin/ls (for architecture i386):    Mach-O executable i386
/bin/ls (for architecture ppc7400): Mach-O executable ppc

or an even more convoluted example:

cwright@phendrana:~>file GLTools 
GLTools: Mach-O universal binary with 4 architectures
GLTools (for architecture ppc7400): Mach-O bundle ppc
GLTools (for architecture i386):    Mach-O bundle i386
GLTools (for architecture ppc64):   Mach-O 64-bit bundle ppc64
GLTools (for architecture x86_64):  Mach-O 64-bit bundle x86_64

This increases file size considerably (4x in the last example), but it provides you with the cool side effect of being able to drop the exact same program onto a PowerPC Mac, and have it operate identically – as long as you’re properly handling architectural differences such as endianness. Overall, this is a pretty slick solution, and with the exception of a few small tweaks, I doubt I could have come up with a better idea. (The small tweaks, in case you’re wondering, would be shared data segments across the binaries inside, such that non-code stuff only needs to be included once, instead of 4 times. This doesn’t work well when the data contains code though, so you’d need to have flags to control how it operates).
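If you only want the architecture names out of that file output, a little sed is enough. This is a minimal sketch that assumes the "(for architecture NAME):" layout shown in the transcripts above; list_archs is a made-up helper name, not a standard tool.

```shell
# list_archs: read `file` output on stdin and print one architecture per line.
# Assumes the "(for architecture NAME):" layout; lines without it are skipped.
list_archs() {
    sed -n 's/.*(for architecture \([^)]*\)).*/\1/p'
}

# Sample input copied from the /bin/ls transcript above.
printf '%s\n' \
    '/bin/ls: Mach-O universal binary with 2 architectures' \
    '/bin/ls (for architecture i386):    Mach-O executable i386' \
    '/bin/ls (for architecture ppc7400): Mach-O executable ppc' | list_archs
```

On a Mac the same function can be fed directly, as in: file /bin/ls | list_archs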

One of many problems rears its ugly head though when developing such portable applications: Linking with static and dynamic libraries.

Out of the box, OS X ships with many libraries that are all appropriately compiled to support all the above architectures, so you never notice this problem when compiling against standard included libraries. However, if you stray off the beaten path, and use another library, you’re destined for trouble. Open Source libraries, especially the ones whose build system depends on the monstrosity that is Autoconf (./configure scripts and all that), are surprisingly difficult to get working, in part because they’re not designed to be built for multiple architectures in parallel, and in part because Autoconf is infuriatingly worthless when it comes to documentation.

Of course, since I’m writing all of this, I’m obviously in the middle of such a battle :)

By default, I run configure like this:

CFLAGS="-Os -fomit-frame-pointer" ./configure [options]

where options is stuff like --enable-shared and other library stuff. On occasion (only 85% of the time), I also have to override other environment variables because parts like pkg-config don’t work, because other libraries don’t install properly, and because of a whole host of other problems. I really can’t believe I actually liked dealing with this crap when I used Linux… but I digress.

So first off, I think “Hey, I can add some magic to the CFLAGS parameter, and it’ll just compile!” … hahaha … I wish. Here’s what happens:

cwright@phendrana:~/Desktop/Recent Source Stuff/libSomeLib-X.Y.Z>CFLAGS="-Os -fomit-frame-pointer -arch i386 -arch ppc -arch x86_64 -arch ppc64" ./configure --enable-shared

Configure does its thing, and then says it’s done and you’re ready to build. It’s lying, of course:

cwright@phendrana:~/Desktop/Recent Source Stuff/libSomeLib-X.Y.Z>make
make  all-recursive
Making all in libSomeLib
/bin/sh ../libtool --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I.. -I..    -Os -fomit-frame-pointer -arch i386 -arch ppc -arch x86_64 -arch ppc64 -MT io.lo -MD -MP -MF .deps/io.Tpo -c -o io.lo io.c
 gcc -DHAVE_CONFIG_H -I. -I.. -I.. -Os -fomit-frame-pointer -arch i386 -arch ppc -arch x86_64 -arch ppc64 -MT io.lo -MD -MP -MF .deps/io.Tpo -c io.c  -fno-common -DPIC -o .libs/io.o
gcc-4.0: -E, -S, -save-temps and -M options are not allowed with multiple -arch flags
make[2]: *** [io.lo] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

So, -M options aren’t allowed to have multiple arch flags, says gcc. I do some spelunking to find out exactly what these flags do, and find that they’re for creating Makefile dependencies. But wait a minute, isn’t that Configure’s job? Man these build tools are awesome. … (technically speaking, it is Make’s job to make this stuff. There just isn’t a hook to add architecture support anywhere else without completely re-engineering the build system.)
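A side note on those flags: -MT, -MD, -MP and -MF are the dependency-tracking options that Automake-generated Makefiles hand to gcc, and configure scripts produced with Automake normally accept --disable-dependency-tracking to leave them out. That route isn't tried in this post, so take the following as an untested sketch. universal_cflags is a hypothetical helper, and whether a given library then builds cleanly in a single pass is an assumption.

```shell
# universal_cflags ARCH...: print the base flags followed by one -arch per argument.
universal_cflags() {
    flags="-Os -fomit-frame-pointer"
    for arch in "$@"; do
        flags="$flags -arch $arch"
    done
    printf '%s\n' "$flags"
}

# Print (not run) a single-pass configure line with dependency tracking off.
# Assumption: the library's configure script was generated with Automake.
printf 'CFLAGS="%s" ./configure --disable-dependency-tracking --enable-shared\n' \
    "$(universal_cflags i386 ppc x86_64 ppc64)"
```

The script only prints the command, so it is safe to run anywhere; paste the printed line into the source directory to try it.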

Next up, I read some Apple documentation to see how they go about doing it (for stuff like OpenSSL, etc.). They recommend making a new Xcode project, and then making a zillion build targets, taped together with some shell scripts. To be honest, it doesn’t seem very Apple-like. If I’m going to be dealing with shell scripts, I’ll just do it myself in Terminal.

The second attempt consists of duplicating the source tree into 4 directories, one for each architecture. Then, my plan goes, I’ll use lipo to glue them all together into a fat binary, and I’ll be on my way.
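That plan can be sketched as a tiny script generator. It only prints the commands (pipe the output to sh to actually run them), and it quotes every path, since this source tree lives under a directory with spaces in its name. print_build_plan is a made-up name, the flags are simply the ones from the configure line shown earlier, and the sketch assumes the tree name contains no double quotes.

```shell
# print_build_plan SRC ARCH...: print the commands that copy SRC into one
# tree per architecture and build each tree with a single -arch flag.
print_build_plan() {
    src=$1; shift
    for arch in "$@"; do
        tree="$src-$arch"
        printf 'cp -R "%s" "%s"\n' "$src" "$tree"
        printf '(cd "%s" && CFLAGS="-Os -fomit-frame-pointer -arch %s" ./configure --enable-shared && make)\n' \
            "$tree" "$arch"
    done
}

# Example: show the plan for three of the four architectures.
print_build_plan libSomeLib-X.Y.Z i386 ppc x86_64
```

Printing instead of executing keeps the sketch harmless and makes it easy to eyeball each command before committing to a long build.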

I run configure in each directory tree, careful to include only one -arch parameter in each one. I issue make in the i386, ppc, and x86_64 trees without issue, and start to think a bit smugly to myself that I’ve defeated this silly monster. But then, another Configure Dragon charges.

cwright@phendrana:~/Desktop/Recent Source Stuff/libSomeLib-X.Y.Z-ppc64>CFLAGS="-Os -fomit-frame-pointer -arch ppc64" ./configure --enable-shared
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
/bin/sh: /Users/cwright/Desktop/Recent: No such file or directory
configure: WARNING: `missing' script is too old or missing
checking for a thread-safe mkdir -p... ./install-sh -c -d
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
checking whether make sets $(MAKE)... yes
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... configure: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details.

Of course, whenever configure tells you to check “config.log” for more details, it’s about as informative as reading a Blue Screen. Also, note the 4th or 5th line, which shows a lack of quoted paths. That looks safe…

The problem, according to configure, is that it “cannot run C compiled programs,” which is actually accurate for this case: Rosetta doesn’t translate ppc64 binaries to x86. However, gcc can compile such programs, so we know we can do it. Configure suggests using the “--host” option.

Running configure with --help reveals how --host is supposed to be used. It’ll look like this, I suppose: “--host=ppc64”.

It issues this warning, but continues to do its thing:

configure: WARNING: In the future, Autoconf will not detect cross-tools
whose name does not start with the host triplet.  If you think this
configuration is useful to you, please write to autoconf@gnu.org.

I hope that’s ok.. ?

I run make, and it builds, and finishes, a bit earlier than the others. In the .libs directory, there’s no .dylib, which is what --enable-shared is supposed to create. So, I look at the output from previous builds, and copy the line it inexplicably skips. It’s a pretty long, but simple, gcc line that takes all the .o’s and makes them into a .dylib. No idea why it skipped over that one…

And at last, we’re able to build our fat binary using lipo. Hurray for portable cross-platform build tools!
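For the record, the gluing step is a single lipo invocation using its -create and -output options. Here is a sketch that builds that command line from the one-tree-per-architecture layout. print_lipo_cmd and the libSomeLib.dylib file name are hypothetical (the real library name will differ), and the function only prints the command rather than running it.

```shell
# print_lipo_cmd LIB SRC ARCH...: print the lipo command that merges
# SRC-ARCH/LIB for every ARCH into one fat file in the current directory.
print_lipo_cmd() {
    lib=$1; src=$2; shift 2
    cmd="lipo -create"
    for arch in "$@"; do
        cmd="$cmd \"$src-$arch/$lib\""
    done
    printf '%s -output "%s"\n' "$cmd" "$(basename "$lib")"
}

# Hypothetical library path inside each per-architecture tree.
print_lipo_cmd libSomeLib/.libs/libSomeLib.dylib libSomeLib-X.Y.Z i386 ppc x86_64 ppc64
```

After running the printed command on a Mac, pointing file at the result should list all four architectures, as in the GLTools example earlier.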