Developers who write code inevitably write code that crashes. Luckily for our users, 99% of those crashes happen on our own development machines, and we get the opportunity to fix them before shipping them off to our unsuspecting computers.
But every once in a while even the most polished and refined software exhibits that nastiest of malfunctions: the crashing bug. In the middle of a possibly important user task, the application comes screeching to a halt and disappears, leaving the user with nothing more than a cryptic message from the operating system, along with an offer to send the crash log off to Apple.
Needless to say, crashes suck. A lot. For users, especially. If there’s one behavior of your application that you should focus on eliminating, it’s the behavior of crashing. Above all other “nuisances,” this is the one that is absolutely unacceptable.
Which is why crash logs are so important. Occasionally, when a user is sophisticated enough to understand the value of the crash log and, instead of choosing to send it to Apple, copies and pastes it into an email to you, you should be overjoyed! Finally! A crash has been captured in the wild and delivered to you for easy inspection. This artifact not only proves that a crashing bug exists, but usually includes highly detailed information about where and how the crash has happened.
Which is why it boggles the mind to think that some developers see these crash logs as relatively useless. In my own experience of reporting bugs to other developers, I have found frequently that my crash log is considered more as a statistic than as a specimen. While I expect a response like “great, I see what the bug is, and I’ll get this fixed,” I often get responses more like “sigh, I’ve seen this a few times but have no idea what’s going on!”
Sometimes I find that the developer in question simply needs to be educated about the tools at his or her disposal. I imagine the hours wasted speculating about what the crash log means, and can’t help but think if they knew how to analyze it, they could have the bug solved (in many cases) in moments. Oh – and I’m sounding high and mighty right now, so before I get too carried away, let me just say that I too need to be educated on a great many things, but making sense of crash logs is not particularly high on that list. I leave it to other bloggers to fill in the gaps of my education.
Crash Logs: The Crappy Way
So what is this big thing that developers are missing? Well, a crash log essentially shows the “path” that the little hamsters in your computer were running on when they fell off the cliff. It paints a picture of where the program got misguided and went off doing something it never should have done. Problem is, the picture is a little blurry. Below is an actual crash log submitted by an actual MarsEdit user (abbreviated and condensed to fit the blog):
Thread 0 Crashed: Foundation 0x93cc2a2e _NSAPAddPage + 94 Foundation 0x93cc3b27 __NSAutoreleaseObject + 263 Foundation 0x93cc3962 -[NSObject(NSObject) autorelease] + 18 Foundation 0x93cceaca +[NSString stringWithFormat:] + 106 marsedit 0x000637fb 0x1000 + 403451 marsedit 0x00053883 0x1000 + 338051 marsedit 0x000539ea 0x1000 + 338410 marsedit 0x00053d7b 0x1000 + 339323 marsedit 0x00053e45 0x1000 + 339525 marsedit 0x0005238b 0x1000 + 332683
For any non-programmers who are still reading (bravi!), you may have a little trouble understanding what the heck this means. This is an ordered description of the path the computer followed while trying to do something. Problem is it ended up crashing. The path is shown in reverse-order, so the crash is at the top of the list, inside Foundation (an Apple system library). Trying to make human sense of this goes something like this:
- Something in MarsEdit at address 0x0005238b…
- … called something in MarsEdit at address 0x00053e45
- … called something in MarsEdit at address 0x00053d7b
- … called something in MarsEdit at address 0x000539ea
- … called something in MarsEdit at address 0×00053883
- … called something in MarsEdit at address 0x000637fb
- … which called a standard string library method “stringWithFormat”
- which eventually crashed in the system.
You can sort of see how some developers are forced to throw up their arms in despair, right? But what if you could turn this muddled picture into something a bit sharper? The fact is, you can. The reason the picture is so blurry is because MarsEdit shipped without debugging symbols. The reasons for this are mostly to keep the shipping application as small as possible, but there are also some good reasons to hide the symbols if they reveal too much about your business strategy.
Crash Logs: The Happy Way
Many developers seem to think that gathering useful crash logs requires shipping a symbol-laden application, but aren’t willing to do so. The fact is, you can have the best of both worlds by shipping a symbol-stripped version of your application, but keeping a symbol-laden version on-hand for the developer’s convenience.
What am I getting at? Apple offers a tool called “atos”, which makes it relatively easy to map an address from a symbol-stripped application to a name in a symbol-laden version of that same application. As long as the two applications were built with the same sources and using the same build options, the addresses are bound to match up. It’s just the symbols that are “stripped” from the shipping application.
Let’s take a look at what happens when I apply my atos powers to the crash log I pasted above. From the Terminal, I invoke atos with the literal memory addresses of all the mysterious points in the “path” from the crash log. You can list the addresses in whatever order you want, but I’m listing them so that the result will be in order from top to bottom, the same way I reasoned out the crash in plain “crappy” English above:
iBook> cd MarsEdit.app/Contents/MacOS iBook> atos -arch i386 -o MarsEdit \ 0x0005238b 0x00053e45 0x00053d7b 0x000539ea 0x00053883 0x000637fb
Note also that I specified the architecture explicitly. This is because I am debugging this on a PowerPC Mac, but the crash log I received showed that the customer is running on an Intel machine. This yields a magical response with, behold, real meaningful symbols!
-[RSXMLRPCCall invoke:] (in MarsEdit) (RSXMLRPCCall.m:397) -[RSXMLRPCRequest requestText] (in MarsEdit) (RSXMLRPCRequest.m:239) -[RSXMLRPCRequest serializeParams:] (in MarsEdit) (RSXMLRPCRequest.m:216) -[RSXMLRPCRequest serializeDictionary:toString:] (in MarsEdit) (RSXMLRPCRequest.m:153) -[RSXMLRPCRequest serializeData:toString:] (in MarsEdit) (RSXMLRPCRequest.m:123) -[NSData(extras) base64StringWithLineLength:] (in MarsEdit) (NSData+extras.m:215)
Now I can apply my human sense again, but with the benefit of symbolic names instead of raw memory addresses:
- MarsEdit invoked an RPC network request
- … which while building the actual request cargo text
- … was trying to serialize the parameters
- … when it tried to serialize a dictionary object
- … whose data
- … upon being converted to a Base64 representation
- … caused a “stringWithFormat:” call that went blammo!
Not only does the plot to this story make a lot more sense now, but I’ve even got handy pointers to the line numbers in the source code. Let’s see, NSData+extras.m, around line 215:
NSString *charString = [NSString stringWithFormat:@"%c", base64EncodingTable [outbuf [i]]];
Survey says? That’s the line I’m going to zero in on now as I try to figure out why this crash is occurring. Sure, I’ll have to now speculate as to why, and under what circumstances the observed crash might occur. So the magic of “atos” doesn’t take all the work out of debugging a crash log, but it brings most crashes out of the “frustratingly impossible” into the realm of “all in a day’s work.” I can use my knowledge of the situation to ask intelligent questions of the user, and hopefully get to the bottom of it, so that no other user ever has to see the nasty crash again.
I can do more than just tell the user “thanks for the report, I have no clue what’s happening here. Sorry.”
Doing It For Yourself
I hope I’ve intrigued you by the power of what I was able to achieve with “atos”, but of course to do the same with your own project, you’ll need to make sure you can come up with a “symbol-laden version” of your release app, so you can apply the same magic. The best way I know to do this is to simply add another build configuration in xcode, that is just like your “Release” configuration, but doesn’t strip out debugging symbols. You can read more about this approach and others in Apple’s fine documentation.
If you find yourself dealing with crash logs on a regular basis, it might be a good idea to make building a “symbol-laden” version of your app part of the standard build and distribute process. For myself, I’ve found it happens rarely enough that I’m happy with a process whereby I re-checkout a tagged version of the sources from Subversion, and then rebuild a copy using the “Release With Symbols” build configuration.
Nota Bene: If you’re on Leopard and trying to replicate a symbol-laden version of an app you shipped with Tiger, you’ll probably have to tweak Xcode to use the old linker, in order to get a build whose memory addresses match up with your old Tiger-built application. Might be easier to just go back to Tiger and rebuild it there.
I know this was probably common knowledge for some of you, but I also know from experience of chatting with my colleagues that this is completely new, forehead-slapping “why didn’t I know about that?!” type material for many others. I hope it saves you hours of work. Possibly even enough hours to continue reading long articles such as this one!