Comments on: Bit Hacking
http://www.red-sweater.com/blog/1947/bit-hacking
Mac & Technology Writings by Daniel Jalkut
Sat, 11 Oct 2014 01:25:38 +0000

By: johne
http://www.red-sweater.com/blog/1947/bit-hacking/comment-page-1#comment-247115
Sat, 24 Sep 2011 00:43:09 +0000

This “hack” is not just a “really bad idea”, it ranks as “Not Even Wrong” on the “Right, Wrong, and Not Even Wrong” scale.

This change breaks code that depends on the previously documented fact that the first (and absolutely required) thing a pointer to an Objective-C object points to is that object’s “isa” Class.

Can you guess what happens when you use `id object; Class objectClass = object->isa;` with these new, “fantastic” tagged pointers? Make no mistake about it, this was a well documented fact in the official documentation, including the official public headers (i.e., `objc/*.h`, and the `typedef` / declaration for `id`, with no warnings anywhere that `id` was a private, opaque type).
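To make the failure concrete, here is a minimal sketch in plain C. Everything named `fake_*` is hypothetical and is not the Objective-C runtime; it assumes a set low bit marks a tagged value and that pointer-to-integer conversion behaves as it does on common platforms (the conversion is implementation-defined). It only shows why a raw `->isa` read cannot work on a value that is not an address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an Objective-C object: the first field is isa. */
struct fake_object {
    const void *isa;
};

/* ASSUMPTION: a set low bit marks a tagged value (not taken from any header). */
#define FAKE_TAG_MASK ((uintptr_t)1)

/* Returns nonzero when the value carries the tag bit instead of an address. */
static int fake_is_tagged(const void *p) {
    return ((uintptr_t)p & FAKE_TAG_MASK) != 0;
}

/* Packs a small integer payload into a pointer-sized tagged value. */
static const void *fake_make_tagged(uintptr_t payload) {
    assert(payload <= (UINTPTR_MAX >> 1)); /* payload must survive the shift */
    return (const void *)((payload << 1) | FAKE_TAG_MASK);
}

/* Guarded replacement for `object->isa`: returns NULL for NULL or tagged
   values rather than dereferencing something that is not an address. */
static const void *fake_get_isa(const void *object) {
    if (object == NULL || fake_is_tagged(object)) {
        return NULL;
    }
    return ((const struct fake_object *)object)->isa;
}
```

With these helpers, `fake_get_isa(fake_make_tagged(21))` returns NULL, whereas an unguarded `->isa` would try to read memory at address 43. Code written against the documented `object->isa` layout has no such guard.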

Setting aside those fundamental problems, this is a /horrific/ idea when examined from the perspective of the C99 standard. There’s no equivocating or prevaricating on the fact that the “clever tagged pointers hack” results in all kinds of formal “undefined behavior” according to the C99 standard. A few of the obvious violations:

6.2.5 Types

27 A pointer to void shall have the same representation and alignment requirements as a pointer to a character type. Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements. All pointers to structure types shall have the same representation and alignment requirements as each other. All pointers to union types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements.

Specifically, “All pointers to structure types shall have the same representation and alignment requirements as each other.”

6.3.2.3 Pointers

5 An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.

6 Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

Michel Fortin comments that “There’s a big difference between hacking the lowest bits of the pointer versus hacking the highest bits.” I violently disagree: both make use of very specific architectural and implementation-specific details. They both break when the initial assumptions later turn out to be wrong. What they break may be different, but breaking is still breaking.

You then say

“Since those are pointers to *objects*, you can assume they will always be memory-aligned, and unless we go back to 8- or 16-bit processors the last bit is always going to be zero for non-packed data structures longer than one byte, such as Objective-C objects.”

… which is a very strong argument as to why this is a horrific idea. Anything that deals with pointers to objects is allowed to make that very same assumption. Anything that critically depends on this assumption will break when given a 10.7 tagged pointer. In fact, some of these assumptions are not actually assumptions at all, but are requirements mandated by the standard. This means that some of the “extra bits” that tagged pointers so cleverly use “because they are zero” are /required/ by the standard to be zero.

This means that anything that breaks because of the lower bits not being zero in 10.7’s tagged pointers cannot, by definition, be “buggy” or “incorrect”.
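As a rough illustration of “anything is allowed to make that same assumption”, consider unrelated code that borrows bit 0 of an aligned pointer for its own bookkeeping, such as a “visited” mark. The `mark_*` names below are hypothetical, the value 43 is an arbitrary odd number standing in for a tagged pointer, and the sketch assumes the usual flat pointer-to-`uintptr_t` conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for code that borrows bit 0 of an aligned pointer. */
static uintptr_t mark_set(const void *p) {
    return (uintptr_t)p | (uintptr_t)1;
}

static int mark_is_set(uintptr_t word) {
    return (word & (uintptr_t)1) != 0;
}

/* Clears bit 0, assuming the original pointer had it clear to begin with. */
static const void *mark_strip(uintptr_t word) {
    return (const void *)(word & ~(uintptr_t)1);
}

/* Self-check: call from main() or a test harness. */
static void mark_demo(void) {
    static uint32_t real_object;                      /* 4-byte aligned storage */
    const void *aligned = &real_object;
    const void *tagged = (const void *)(uintptr_t)43; /* odd: stand-in tagged value */

    assert(mark_is_set(mark_set(aligned)));
    assert(mark_strip(mark_set(aligned)) == aligned); /* aligned pointer survives */
    assert(mark_strip(mark_set(tagged)) != tagged);   /* tag bit is silently lost */
}
```

The aligned pointer comes back unchanged, while the odd value comes back as 42: the borrowing code and the tagging scheme both claimed the same bit.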

As a practical example of where this could go horribly wrong, one need only consider the case of the optimizer for C. Optimizers make extensive use of the pedantic details in the C standard. This can result in unbelievably complicated corner cases that are shockingly non-obvious. Any optimization done by the C compiler that depends on the lower bits being zero as required by the C standard will break when used on a 10.7 tagged pointer.

Although this is an admittedly contrived example, assume that for some reason the optimizer decides that some particular optimization and instruction selection requires that the pointer be logically shifted right by two places, and then eventually logically shifted back left two places at the end. The optimizer can take advantage of the fact that, in this particular context, `((p>>2)<<2)` must be identical to `p`.

Now you have a condition where code that has nothing to do with Objective-C (e.g., the standard C library, any of the numerous C-based shared libraries) will break merely because a 10.7 tagged pointer passes through it.
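The shift round trip above can be sketched directly. This is only a model of the arithmetic, not of anything a real optimizer is known to emit; the function names are hypothetical and 0x2B is an arbitrary value with its low bits set.

```c
#include <assert.h>
#include <stdint.h>

/* Models ((p >> 2) << 2): the two lowest bits are discarded. */
static uintptr_t shift_round_trip(uintptr_t bits) {
    return (bits >> 2) << 2;
}

/* Nonzero when the pointer's bit pattern is unchanged by the round trip. */
static int survives_round_trip(const void *p) {
    uintptr_t bits = (uintptr_t)p;
    return shift_round_trip(bits) == bits;
}

/* Self-check: call from main() or a test harness. */
static void shift_demo(void) {
    static uint32_t word; /* 4-byte aligned, so its two low address bits are zero */

    assert(survives_round_trip(&word));
    /* 0x2B has both low bits set; the round trip yields 0x28 instead. */
    assert(!survives_round_trip((const void *)(uintptr_t)0x2B));
}
```

For a pointer that really is 4-byte aligned the identity holds; for a value with either low bit set it does not, which is the whole problem.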

The fact that this is such an unbelievably bad idea from any number of compelling technical standpoints makes it all the more shocking that this massive, ABI-breaking change to the way pointers are handled isn’t even mentioned as a footnote in the 10.7 developer release notes.

By: Michel Fortin
http://www.red-sweater.com/blog/1947/bit-hacking/comment-page-1#comment-205646
Thu, 21 Jul 2011 18:14:29 +0000

There’s a big difference between hacking the lowest bits of the pointer versus hacking the highest bits. Since those are pointers to *objects*, you can assume they will always be memory-aligned, and unless we go back to 8- or 16-bit processors the last bit is always going to be zero for non-packed data structures longer than one byte, such as Objective-C objects.
