Support this website or
listen to my music
Deadwood on OpenWRT; The case for 32-bit Skein
September 12 2010
Original post
When I was designing Deadwood, I made sure to make the program work
really nicely on 32-bit processors, while retaining compatibility on
64-bit processors. In the places where word size really matters,
namely the hash compression function and the cryptographic random
number generator, I went out of my way to use 32-bit algorithms (my
own home-grown algorithm for the 32-bit hash compression function;
RadioGatun[32] for the random number generator).
During
Deadwood’s development cycle, I even gave a nod to systems
that can work with both 16-bit and 32-bit numbers, by having it so
“int” in the source code means “something that can
fit in 16 bits”, and using int32_t for anything that would need
more than 16 bits (near the end of the Deadwood development cycle,
I went from “int32_t” to “int_fast32_t”
for “integer that needs more than 16 bits”). This is
untested—I don’t think there are any systems in use today
capable of routing packets that have 16-bit wide registers—but
allows it to be at least possible to port Deadwood to a 16-bit system.
The reason for all this attention to detail was because my
target platform while developing Deadwood was to have it run nicely on
the 32-bit router in the corner used to connect people to the internet.
And, I have succeeded.
Sebastian Müller has already gotten
Deadwood to recursively process queries
on an embedded router and is working on porting
Deadwood to a number of platforms. He was able to pull this off
in a single afternoon because Deadwood is, from beginning to end,
an architecture-independent recursive DNS server optimized for 32-bit
platforms.
Of course, Deadwood also runs nicely on 64-bit
platforms; one of my regressions I perform when building Deadwood is
to make sure nothing breaks in 64-bit. The only issue is that some
operations (the hash compression, the cryptographic random number
generator) are less efficient because they were written to run on
32-bit systems. [1]
Now, while Deadwood does run nicely on
various 32-bit embedded systems, I unfortunately can not readily support
these configurations because I run Deadwood on x86_32 and can not solve
problems for people running it on different systems. It works, but
people want to do this are on their own and need to take responsibility
for any problems they see.
This leads us to
Skein, a proposed hash primitive for
SHA-3.
Skein is a very good hash with a lot of features, including
having, in its core, Threefish, the only tweakable block
cipher primitive that appears secure.
However, I
can’t
use Skein because there’s no
native 32-bit version of it, even though
such
a thing is possible. When
I
pointed out this problem on Schneier’s blog, I was dismissed
by another poster with “all 32-bit desktop processors (and even
some ARM chips) have instructions that can do math on pairs of 64-bit
numbers”.
That’s all well and good, but Deadwood
isn’t just for the desktop. It’s also for the embedded 32-bit
space and that means no MMX/SSE instructions that work on 64-bit numbers.
One of the reasons why I feel so frustrated is because
I
can’t just make a 32-bit Skein variant. Not without possibly
making 10 different errors that could destroy its security.
I
understand why Schneier, et. al. don’t feel comfortable coming
out with some rotation constants to make a 32-bit Skein variant;
if they did make such a Skein variant and some cryptographers
found weaknesses in them, it would make them look bad. So, they
are being very conservative about what they will declare to be the
official “magic numbers” for Skein. Numbers I simply
do not feel comfortable changing.
If I were to use one
of the SHA-3 submissions for Deadwood’s PRNG, I would use
Keccak. Like Skein, it can output
a stream of infinite length from any input of any length. Unlike Skein,
it is more 32-bit compatible; not only is there a 32-bit “reduced
word length” variant officially blessed by the algorithm’s
creators, but also 64-bit Keccak more easily scales down to 32-bits
than Skein, since the only operations done are permutes, rotates, and
exclusive ORs.
32-bits is still very much a reality today.
While the transition to 64-bit desktops is well underway, it will be a
few years before 64-bit desktops become more common than 32-bit desktops.
The embedded space is very slowly letting go of 8-bit systems, replacing
them with 32-bit systems. I wouldn’t be surprised if 32-bit is
still around in 2038, when we will have to worry about systems with 32-bit
time_t overflowing (Deadwood will run until 2143 on systems with a 32-bit
time_t, uses 64-bit timestamps internally, and won’t overflow in
2143 if compiled on a system with a 64-bit time_t).
Yes,
we are making the transition to 64-bit. But we can’t make the
transition with solutions like Skein that pretend 32-bit no longer
exists.
[1] In terms of making a full 64-bit
port of Deadwood, I would have to:
- Make most, but not all, of
the int32_t declarations int_fast32_t declarations (64-bit compilers
should make int_fast32_t 64 bits wide)
- Use RadioGatun[64] instead of
RadioGatun[32] (because of how I wrote the RadioGatun code, this is a
fairly quick change)
- Replace the hash compression core with one using
64-bit instead of 32-bit operations. The most difficult part of this
is making a 63-bit random prime number; we can quickly brute-force test
the primality of a 31-bit number, but not for a 63-bit number.
No,
I don’t have plans to do this before MaraDNS 2.0 is released.