There are a few functions used to zero out memory on most unix variants.
bzero(), memset(), and calloc() are a few such functions.
calloc() isn’t very useful for clearing already allocated memory, so it won’t be appearing much more in this article. However, the other two are somewhat more interesting than meets the eye.
On the surface, bzero() and memset() look similar enough to be related, with bzero() being a special case of memset(). memset() takes three arguments: a pointer, a value, and a length. bzero(), on the other hand, omits the value parameter, since it’s implied to be zero.
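To make the two signatures concrete, here’s a minimal sketch in C. The helper names clear_with_memset() and clear_with_bzero() are made up for this illustration, and it assumes bzero() is declared in <strings.h>, as it is on most Unix-like systems.

```c
#include <string.h>   /* memset() */
#include <strings.h>  /* bzero(), assumed available as on most Unix-like systems */

/* Hypothetical helper: zero len bytes at buf with memset().
 * The value argument (0) has to be spelled out. */
static void clear_with_memset(void *buf, size_t len)
{
    memset(buf, 0, len);
}

/* Hypothetical helper: same effect with bzero().
 * There is no value argument, because zero is implied. */
static void clear_with_bzero(void *buf, size_t len)
{
    bzero(buf, len);
}
```

Either helper leaves the buffer in the same state; the only visible difference at the call site is the missing value argument.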
memset()’s documentation provides one somewhat worrying detail though:
The memset() function writes n bytes of value c (converted to an unsigned char) to the string s.
That conversion bit could mean that naive implementations operate a byte at a time – lousy for the wide register sizes of today. It’s trivial to combine that value a few times though, so hopefully that’s done?
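Here’s roughly what “combining that value a few times” could look like. This is only a sketch under my own assumptions (a 32-bit word, and the made-up names splat_byte() and fill_words()); it is not how any particular C library actually does it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Replicate the low byte of c into all four bytes of a 32-bit word,
 * e.g. 0xAB becomes 0xABABABAB. */
static uint32_t splat_byte(int c)
{
    return (uint32_t)(unsigned char)c * 0x01010101u;
}

/* Hypothetical word-at-a-time fill; an illustration only. */
static void *fill_words(void *s, int c, size_t n)
{
    unsigned char *p = s;
    uint32_t word = splat_byte(c);

    /* Store four bytes per iteration. memcpy sidesteps alignment and
     * aliasing problems and typically compiles to a single store. */
    while (n >= sizeof word) {
        memcpy(p, &word, sizeof word);
        p += sizeof word;
        n -= sizeof word;
    }

    /* Finish any remaining tail bytes one at a time. */
    while (n > 0) {
        *p++ = (unsigned char)c;
        n--;
    }
    return s;
}
```

The conversion to unsigned char happens once, up front; after that the loop moves whole words, so the documented per-byte semantics don’t force a per-byte implementation.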
bzero()’s documentation reports similar details:
The bzero() function writes n zeroed bytes to the string s.
So perhaps it’s in similar standing?
(At this point, I’m going to be rather open and state that it’s almost 100% ridiculous to presume that a non-embedded C library would be so lame as to work with individual bytes in such a manner. It’s been known to happen, but I’m pretty sure that this isn’t the case on Linux, Windows, or Mac OS X, where testing takes place for this article.)
Firing up a debugger, we uncover some even more interesting information. bzero() is a mere stub:
    (gdb) x/4i bzero
    0x9603ebc8 <bzero>:     mov    $0xffff0600,%eax
    0x9603ebcd <bzero+5>:   jmp    *%eax
    0x9603ebcf <bzero+7>:   nop
See that 0xffff0600 in there? That’s an address near the top of memory, and bzero() is just a trampoline into it.
Now for memset():

    (gdb) x/16i memset
    0x95fe7318 <memset>:      mov    0x8(%esp),%eax
    0x95fe731c <memset+4>:    mov    0xc(%esp),%edx
    0x95fe7320 <memset+8>:    and    $0xff,%eax
    0x95fe7325 <memset+13>:   jne    0x95fe7332 <memset+26>
    0x95fe7327 <memset+15>:   mov    $0xffff0600,%eax
    0x95fe732c <memset+20>:   mov    %edx,0x8(%esp)
    0x95fe7330 <memset+24>:   jmp    *%eax
    ...
There’s that address again: 0xffff0600. There’s a bit more checking going on (expected, since memset() is a bit more generic than bzero()), and a lot more happening afterwards, but for the value zero, it uses the same engine.
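In C terms, the dispatch seen in the disassembly might be sketched like this. All three function names are hypothetical stand-ins of mine: zero_engine() plays the role of whatever lives at 0xffff0600, and its byte loop is a placeholder, since the real routine’s internals aren’t shown here.

```c
#include <stddef.h>

/* Hypothetical stand-in for the shared zeroing routine at 0xffff0600.
 * The byte loop is a placeholder for whatever that routine really does. */
static void zero_engine(void *s, size_t n)
{
    unsigned char *p = s;
    while (n--)
        *p++ = 0;
}

/* bzero() as a stub: nothing to check, go straight to the engine. */
static void sketch_bzero(void *s, size_t n)
{
    zero_engine(s, n);
}

/* memset() masks the value down to a byte first (the "and $0xff" above)
 * and only hands off to the zeroing engine when that byte is zero. */
static void *sketch_memset(void *s, int c, size_t n)
{
    unsigned char byte = (unsigned char)(c & 0xff);

    if (byte == 0) {
        zero_engine(s, n);
        return s;
    }

    unsigned char *p = s;  /* generic path for non-zero values */
    while (n--)
        *p++ = byte;
    return s;
}
```

The extra work in sketch_memset() for a zero fill is just the mask and the branch, which matches the handful of instructions ahead of the jump in the listing.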
So, for cases where you know you’re going to zero out a block of memory, bzero() will be a bit more direct (avoiding the checking and whatnot). No profiling necessary, since it’s the same underlying algorithm.