OpenSolaris, Pluggable Crypt, and the SunMD5 Password Hash Algorithm

Several years ago now, Darren Moffat, Casper Dik and I started swapping e-mail about how pathetic it was to still be using the traditional 8-character-password unix crypt() routine in Solaris, and how we could architect something to be much better.

The result was the Solaris Pluggable Crypt Framework, as was announced:

  1. A project exists within Solaris engineering, to integrate a pluggable crypt() routine into Solaris, which will allow use of arbitrary password-hashing algorithms, of arbitrary lengths, etc, in Solaris.

  2. Interoperability with Linux/*BSD MD5- and Blowfish-based hash algorithms is a goal of the project

  3. If I remember right – and I may well be incorrect, as I am not responsible for this aspect of the project – the release is scheduled for Solaris9.

    I am consulting with the team/doing development work with them, on account of my (erm) extensive experience with crypt() implementations…

  4. PAM was considered as a solution for this, and it was decided to not be the appropriate vehicle for delivery of an alternative crypt() routine, because (in summary) PAM is essentially an API for user-interfaces (/bin/login, ftpd, etc) – as opposed to an API for interfacing to the directory-services within which the password entries reside; consider “getpwent()” and family.

Pluggable Crypt was rolled-out in Solaris 9 (Update 2, if my memory is correct?) and we were quietly pleased that our users suddenly reported being able to transparently use a mixture of Traditional, Linux-MD5 and Blowfish hash algorithms. GNU Autoconfig (aka: "./configure") spotted it immediately, next time Darren built Apache. All was cool.

Incidentally, we didn’t get any awards or anything, but that’s management for you. It was a nice example of co-operation inside the company because I was at that point actually working for Sun Professional Services, and not really an engineer.

The hook which permitted us to implement “Pluggable Crypt” was the presence of a dollar-sign (‘$’) as a field separator in the latter two kinds of hashes. We took that as a style cue, and designed the new crypt() shim so that lack of a ‘$’ prefix causes a fall-through to the “traditional” algorithm.

After several weeks of considerable argument (three extremely opinionated and notoriously frank security programmers designing a API, it must have been funny to watch from a distance behind smoked glass) we decided that for a given ciphertext:

$foo$bar$wibblebongthud

…that “foo” – the token being terminated by a dollar or comma – would be used to look up and load a shared library in crypt.conf(4), and that the latter would be at liberty to re-interpret the ciphertext as it willed.

We argued over the design using the following two ciphertexts as examples of desirable syntax:

$caesar$shift=13$frfnzr

$rot13$frfnzr

…which are essentially the same algorithm and might possibly be implemented by the same shared library, but which would have different output syntaxes. The mechanism we created could permit this.

Provided with this wonderful power I mused what I could do in order to highlight the new features it provided, and so we came up with the SunMD5 Hash Algorithm that would be written exclusively with the purpose of annoying people who write brute-force or dictionary-based password guessing engines.

In short: for annoying people like me.

After some beer consideration of the problem, I concluded that password-cracker writers:

  • really hate hash-algorithms which allow large, near-infinite, clash-free numbers of salts.

  • really hate hash-algorithms with large, variable, possibly near-infinite round-counts.

  • really hate hash-algorithms which make precomputation/table-lookup impractical.

…and I further reasoned that:

  • writing a wholly new hash algorithm would be a really really unutterably silly thing to do, but…

  • aha! pumping vast and variable quantities of information through an existing hash algorithm would be reasonably computationally expensive, and the process may be tweakable to make it demonstrate the above desired qualities.

So, after some more beer coffee considerable thought, I hit upon the following process:

  1. choose a PLAINTEXT, and a SALTSTRING

  2. push PLAINTEXTSALTSTRING through MD5, generating HASH0; let n=0

  3. condense HASHn into a single BIT, using MuffettCoinTossAlgorithm

  4. where “NNN” is a decimal ASCII representation of “n”; and where “HAMLET” is the complete 1517 bytes of “To Be Or Not To Be” soliloquy (Hamlet act 3 scene 2, copyright-free source) …

  5. if BIT=0, push HASHnNNN through MD5 generating HASH(n+1)

  6. if BIT=1, push HASHnHAMLETNNN through MD5 generating HASH(n+1)

  7. let n=n+1

  8. repeat steps 3 thru 7 for (4096+X) iterations, where X is an arbitrary positive integer 32-bit number specified in the input salt/ciphertext/metainformation.

  9. result:

    $md5$rounds=X$SALTSTRING$HASHfinal

    – using base64 encoding where needed to encode binary data.

That’s a quick rundown and I think it’s approximately correct. I’ve not been drinking coffee this week so I may be a little off in the fine details, so check the source if you want verification. In terms of the blame we all argued the overall design, I did the SunMD5 algorithm, Darren the glue and Casper and Darren both the larger shared library interfaces, docs, integration and so forth.

The only blackbox in the above is MuffettCoinTossAlgorithm which I truly cannot be arsed to explain (or, more pointedly, why should I grep my mailbox archives for the design docs when you can read the source and figure it out for yourself?) – but the essence was to create something which was somewhat a function of the roundcount (strictly, there are 128 different ways it can go, since it’s f(roundcount modulo hashbits)) and which self-referentially recombines bits of the previous round’s hash both as dynamic-lookup-table and as data-looked-up, eventually extracting two bits from the hash, which then get XOR’ed to create a truth value. It was designed with painful memories of trying to speed-optimise DES S-boxes, plus a few hints of “let’s try to align a few fetches unevenly bridging word boundaries, in case someone tries to inline it with longword fetches.”.

You could try speculative computation, but then you’re a masochist.

Hmmm.

<DIG TYPE=’amusing’> Aside: Reviewing the code nowadays, I’m sure it read a lot more clearly before Darren put it through cstyle. </DIG>

The benefits of the SunMD5 design:

  1. The saltstrings are of near-arbitrary length and can contain all sorts of stuff, including (for instance) the username as a substring, which could thereby guarantee the uniqueness of a salt within a given environment.

  2. The roundcount can be enormous; 1 round = (1517 * 50%) + 8 + 4 = 770 bytes pushed through MD5, so setting “rounds=904″ (making 904 + 4096 builtin = 5000 rounds total) pushes 3.8Mb of data through MD5. If you want it to take longer, set “rounds=68000″ and pump 50Mb through it instead.

As ever, much of the power of an algorithm is not merely how it works, but also in how you use it. The above should supply you with some idea of how it’s meant to be used. In terms of the actual implementation, there’s a feature which I don’t like and for which there is a RFE somewhere; it’s demonstrated by the following code:



#!/bin/perl
# sunmd5.pl – test the sunmd5 implementation on solaris9u2 or above
# roundcount: 904 (user specified) + 4096 (builtin) = 5000 rounds total
# syntax examples
$exsalt1 = ‘$md5$rounds=904$saltstring$dummy';
# this one is strictly correct
$exsalt2 = ‘$md5$rounds=904$saltstring$';
# this one ought to be the same
$exsalt3 = ‘$md5$rounds=904$saltstring';
# as should this one
# test plaintexts
@words = qw( sesame abcdefgh abcdefghijklmnop );
foreach $word (@words) {
$ct1 = crypt($word, $exsalt1);
$ct2 = crypt($word, $ct1);
printf “with exsalt1: %-20s => %s\n”, $word, $ct1;
printf “sanity check: %-20s => %s\n”, $word, $ct2;
printf “with exsalt2: %-20s => %s\n”, “”, crypt($word, $exsalt2);
printf “with exsalt3: %-20s => %s\n”, “”, crypt($word, $exsalt3);
print “\n”;
}
exit 0;

Basically: in order to produce correct outputs, the current parser for SunMD5 algorithm requires a dummy (or real!) ciphertext at the end of the salt-input; this is not a problem in operational usage since /bin/passwd and the other user-admin tools are meant to do the right thing, however it’s not really a nice user experience for Joe Shmoe who’s hacking around in perl as above.

So: when the examples with exsalt2/exsalt3 show the same output as the preceding, then it means that my RFE’s gone through and the patch has been installed. Until then, be aware.

All this has been written in response to a bunch of questions from David Magda, who (over over the course of several e-mails) posed the following questions, the answers to which I shall use to wrap-up this posting:

  • What advantages does the SunMD5-based algorithm have over the BSD one?

    See the above; the algorithm is meant to demonstrate interesting features, innovative ideas in password hashing / what constitutes a ciphertext, and to highlight the Pluggable-Crypt framework which is now (of course) published as part of OpenSolaris.

    Also: it exposes more programmers to Shakespeare, which has got to be a good thing.

  • With the recent attacks on MD5 (and SHA-1), is there any thought to moving to the Blowfish-based hash as a default, or is MD5 still “good enough” for this purpose?

    I reckon that given our modus operandi it’s more than good enough, but for interest’s sake Darren and I have long been planning an uprated version and we’re considering SHA-512 as the underlying hash algorithm.

  • Why another hash?

    “Because People Always Want The Latest iPod” 8-)

    Plus, well, why not? Why can’t we live in a world with a dozen different hash algorithms and support the lot of them? Pluggable Crypt is a really good idea:

    WarStory: I gather that there is at least one site where old Ultrix / OpenVMS / OSF-1 ciphertexts have been imported into a Solaris environment by installing a suitable Pluggable-Crypt module, then prefixing the ciphertexts with $bc$ (bigcrypt) or $c16$ (crypt16) and performing migration from these old, weak algorithms by immediately deprecating $bc/$c16 by using the controls in policy.conf(4).

    Insanely neat, huh?

    Further: it allows the spooks to use their own password hashes. They really like that sort of thing. Makes them feel safe.

  • Given that the Blowfish-based one is extendable by increasing the rounds, why not use it?

    Because I thought it insufficiently evil, and evidently the progeny of a cryppie, rather than a cracker-programmer, mindset. 8-)

Solais manpages you need to read: passwd(1), crypt(3C), crypt.conf(4), crypt_genhash_impl(3C), crypt_gensalt(3C), crypt_gensalt_impl(3C), getpassphrase(3C), passwd(4), crypt_unix(5)

Manpage browser at [docs.sun.com]

17 thoughts on “OpenSolaris, Pluggable Crypt, and the SunMD5 Password Hash Algorithm

  1. Katz
    An update from the field.

    Thanks to you we’re using Blowfish on all our Solaris 9 servers.

    Reply
  2. Robin Wilton
    re: OpenSolaris, Pluggable Crypt, and the SunMD5 Password Hash Algorithm

    Thanks Alec, most enlightening. Your next mission is to convince our IT Ops people that allowing a maximum 8-character password is silly, even if it does have to contain enough non-alphabetic characters to make it truly unmemorable. Good luck.

    Reply
  3. alecm
    re: OpenSolaris, Pluggable Crypt, and the SunMD5 Password Hash Algorithm

    Actually, i side with them until the day that we can purge all instances of traditional unix crypt from our directory, because it is still there for backwards compatibility reasons.

    You have 8 characters and have to fill them with highly entropic gobbledegook, make the best of them.

    Come the revolution, maybe we’ll all be running SunMD5 with 32-char passphrases ?

    Reply
  4. Darren Moffat
    re: OpenSolaris, Pluggable Crypt, and the SunMD5 Password Hash Algorithm

    One of the other reasons for moving to a SHA-512 based crypt(3c) plugin is to use only FIPS 140-3 approved algorithms. Neither Blowfish or MD5 fit there and SHA-1’s days are numbered.

    Plus if we do an new SHA-512 based plugin we can add more Shakey script to the code, more Hamlet or something else this time ?

    Reply
  5. Stephen Usher
    re: OpenSolaris, Pluggable Crypt, and the SunMD5 Password Hash Algorithm

    Surely, we all await the day when DNA can be used as an identifier, then you could use the whole human genome as a salt string. No-one could brute force that. :-)

    Then again, the user database might need a few terrabytes of its own.

    Reply
  6. David Magda
    thanks

    Thanks for answering my questions, and for taking the time to write up this post.

    Reply
  7. Nik
    re: OpenSolaris, Pluggable Crypt, and the SunMD5 Password Hash Algorithm

    Fascinating, and the use of Hamlet adds a nice touch of class.

    Presumably CESG’s implementation substitutes a suitable Infosec memorandum :-)

    Reply
  8. Dave Walker
    re: OpenSolaris, Pluggable Crypt, and the SunMD5 Password Hash Algorithm

    While this code indeed first saw the light of day in Solaris 9 update 2, it was subsequently bundled into the patching system so that it can be applied to all versions of Solaris 9 back to GA.

    See patch 112874-32 (SPARC) or 114432 (x86) on Sunsolve for details.

    Reply
  9. Stuart

    Hi Alec, if you’re still listending:
    We now have pluggable crypt, plus the SHA256 and SHA512 algorithms are supplied with Solaris 10 (at least, recent releases, I haven’t checked when they were introduced).

    Do we have any compatibility with the hashed passwords stored by the Sun LDAP server e.g. via pam_unix?
    Such that they could be extracted, syntax/encoding reformatted, and pushed into /etc/shadow for (e.g.) disconnected machines which don’t authenticate to the LDAP server. The idea being to enable consistent password changes.

    By default SunDirectoryService/er seems to use sha1/seededsha1, for which I don’t see an algorithm in the crypt library — or can the SDS also do the identical SHA256 or SHA512 algorithms? (even though the stored format is different)

    Reply
  10. Stuart

    (or the reverse, for that matter: your sunmd5 as an available algorithm in the SunDS LDAP server)

    Perhaps that’s even better, your mooted SHA-512 based algorithm, available for both! :-)

    Reply

Leave a Reply