The Passlib documentation has moved to https://passlib.readthedocs.io
passlib.hash.sun_md5_crypt - Sun MD5 Crypt¶
This algorithm was developed by Alec Muffett [1] for Solaris, as a replacement for the aging des_crypt. It was introduced in Solaris 9u2. While based on the MD5 message digest, it has very little in common with the md5_crypt algorithm. It supports 32-bit variable rounds and an 8-character salt.
See also
password hash usage – for examples of how to use this class via the common hash interface.
Note
The original Solaris implementation has some hash encoding quirks which may not be properly accounted for in Passlib. Until more user feedback and sample hashes have been gathered, caveat emptor.
Interface¶
class passlib.hash.sun_md5_crypt¶
This class implements the Sun-MD5-Crypt password hash, and follows the PasswordHash API.
It supports a variable-length salt, and a variable number of rounds.
The using() method accepts the following optional keywords:
Parameters:
- salt (str) – Optional salt string. If not specified, a salt will be autogenerated (this is recommended). If specified, it must be drawn from the regexp range [./0-9A-Za-z].
- salt_size (int) – If no salt is specified, this parameter can be used to specify the size (in characters) of the autogenerated salt. It currently defaults to 8.
- rounds (int) – Optional number of rounds to use. Defaults to 34000, must be between 0 and 4294963199, inclusive.
- bare_salt (bool) – Optional flag used to enable an alternate salt digest behavior used by some hash strings in this scheme. This flag can be ignored by most users. Defaults to False (see Bare Salt Issue for details).
- relaxed (bool) – By default, providing an invalid value for one of the other keywords will result in a ValueError. If relaxed=True, and the error can be corrected, a PasslibHashWarning will be issued instead. Correctable errors include rounds that are too small or too large, and salt strings that are too long. New in version 1.6.
Format¶
An example hash (of passwd) is $md5,rounds=5000$GUBv0xjJ$$mSwgIswdjlTY0YxV7HBVm0.
A sun-md5-crypt hash string has the format $md5,rounds=rounds$salt$$checksum, where:
- $md5, is the prefix used to identify the hash.
- rounds is the decimal number of rounds to use (5000 in the example).
- salt is 0-8 salt characters drawn from [./0-9A-Za-z] (GUBv0xjJ in the example).
- checksum is 22 characters drawn from the same set, encoding a 128-bit checksum (mSwgIswdjlTY0YxV7HBVm0 in the example).
An alternate format, $md5$salt$$checksum, is used when the rounds value is 0.
There also exist some hashes which have only a single $ between the salt and the checksum; these have a slightly different checksum calculation (see Bare Salt Issue for details).
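The layout described above can be checked mechanically. The following is a minimal sketch using only the standard library; the names `HASH_RE`, `ParsedHash` and `parse_sun_md5_crypt` are hypothetical and are not part of Passlib. It only recognizes the three layouts listed in this section and does not apply the stricter rounds validation described under Deviations.

```python
import re
from typing import NamedTuple

# Hypothetical pattern assembled from the format description in this section.
HASH_RE = re.compile(
    r"\$md5"
    r"(?:,rounds=(?P<rounds>[0-9]+))?"   # optional rounds field; absent means 0
    r"\$(?P<salt>[./0-9A-Za-z]{0,8})"    # 0-8 salt characters
    r"(?P<sep>\$\$?)"                    # "$$" (common) or a single "$" (bare salt)
    r"(?P<checksum>[./0-9A-Za-z]{22})"   # 22 characters encoding the 128-bit checksum
)


class ParsedHash(NamedTuple):
    rounds: int
    salt: str
    checksum: str
    bare_salt: bool


def parse_sun_md5_crypt(hash_string: str) -> ParsedHash:
    """Split a sun-md5-crypt hash string into its fields (illustrative only)."""
    match = HASH_RE.fullmatch(hash_string)
    if match is None:
        raise ValueError("not a recognizable sun-md5-crypt hash string")
    rounds = int(match["rounds"] or 0)
    # A single "$" before the checksum marks the bare-salt variant.
    return ParsedHash(rounds, match["salt"], match["checksum"], match["sep"] == "$")


print(parse_sun_md5_crypt("$md5,rounds=5000$GUBv0xjJ$$mSwgIswdjlTY0YxV7HBVm0"))
```

For the example hash, the parsed fields are rounds 5000, salt GUBv0xjJ, the 22-character checksum, and bare_salt False.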
Note
Solaris seems to deviate from the Modular Crypt Format in that it considers , to indicate the end of the identifier in addition to $.
Algorithm¶
The algorithm used is based around the MD5 message digest and the “Muffett Coin Toss” algorithm.
1. Given a password, the number of rounds, and a salt string.
2. An initial MD5 digest is created from the concatenation of the password and the configuration string (using the format $md5,rounds=rounds$salt$, or $md5$salt$ if rounds is 0). (See Bare Salt Issue for details about an issue affecting this step.)
3. For rounds+4096 iterations, a new digest is created:
   i. A buffer is initialized, containing the previous round’s MD5 digest (for the first round, the digest from step 2 is used).
   ii. MuffetCoinToss(rounds, previous digest) is called, resulting in a 0 or 1.
   iii. If step 3.ii results in a 1, a constant data string is added to the buffer; if the result is a 0, the string is not added for this round. The constant data string is a 1517 byte excerpt from Hamlet [2] (To be, or not to be...all my sins remember'd.\n), including an appended null character.
   iv. The current iteration as a zero-indexed integer is converted to a string (not zero-padded) and added to the buffer.
   v. The output for this iteration is the MD5 digest of the buffer’s contents.
4. The final digest is then encoded into hash64 format using the same transposed byte order that md5_crypt uses, and returned.
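Steps 2 and 3 can be sketched as follows. This is an illustrative outline, not Passlib's implementation: `config_string`, `raw_digest`, `magic` and `coin_toss` are hypothetical names, the 1517-byte Hamlet constant is not reproduced here and must be supplied by the caller, and the coin toss is passed in as a function. The sketch assumes the coin toss receives the zero-indexed iteration number, following the "current round number" wording in the coin toss description. The final hash64 encoding (step 4) is omitted.

```python
import hashlib
from typing import Callable


def config_string(rounds: int, salt: str) -> bytes:
    """Configuration string that is hashed together with the password in step 2."""
    if rounds == 0:
        return f"$md5${salt}$".encode("ascii")
    return f"$md5,rounds={rounds}${salt}$".encode("ascii")


def raw_digest(
    password: bytes,
    rounds: int,
    salt: str,
    magic: bytes,                             # caller-supplied constant data string
    coin_toss: Callable[[int, bytes], int],   # returns 0 or 1
) -> bytes:
    # Step 2: initial digest of password + configuration string.
    digest = hashlib.md5(password + config_string(rounds, salt)).digest()
    # Step 3: rounds + 4096 iterations.
    for i in range(rounds + 4096):
        buf = hashlib.md5(digest)             # 3.i: buffer starts with previous digest
        if coin_toss(i, digest):              # 3.ii: coin toss on the previous digest
            buf.update(magic)                 # 3.iii: constant added only on a 1
        buf.update(str(i).encode("ascii"))    # 3.iv: iteration number, not zero-padded
        digest = buf.digest()                 # 3.v: MD5 of the buffer's contents
    return digest
```

Feeding the pieces to MD5 incrementally gives the same result as hashing the concatenated buffer, which avoids building a roughly 1.5 KB string on every iteration.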
Muffett Coin Toss¶
The Muffett Coin Toss algorithm is as follows: given the current round number and a 16-byte MD5 digest, it returns a 0 or 1, using the following formula:
Note
All references below to a specific bit of the digest should be interpreted mod 128. All references below to a specific byte of the digest should be interpreted mod 16.
1. An 8-bit integer X is generated from the following formula: for each i in 0..7 inclusive:
   - let A be the i'th byte of the digest, as an 8-bit int.
   - let B be the (i+3)'th byte of the digest, as an 8-bit int.
   - let R be A shifted right by B % 5 bits.
   - let V be the R'th byte of the digest.
   - if the (A % 8)'th bit of B is 1, divide V by 2.
   - use the V'th bit of the digest as the i'th bit of X.
2. Another 8-bit integer, Y, is generated in exactly the same manner as X, except that:
   - A is the (i+8)'th byte of the digest,
   - B is the (i+11)'th byte of the digest.
3. If bit round of the digest is 1, X is divided by 2.
4. If bit round+64 of the digest is 1, Y is divided by 2.
5. The final result is the X'th bit of the digest XORed against the Y'th bit of the digest.
Bare Salt Issue¶
According to the only existing documentation of this algorithm [1], its hashes were supposed to have the format $md5$salt$checksum, and include only the bare string $md5$salt in the salt digest step (see step 2, above).
However, almost all hashes encountered in production environments have the format $md5$salt$$checksum (note the double $$).
Unfortunately, it is not merely a cosmetic difference: hashes of this format incorporate the first $ after the salt within the salt digest step, so the resulting checksum is different.
The documentation hints that this stems from a bug within the production implementation’s parser. This bug causes the implementation to return $$-format hashes when passed a configuration string that ends with $. It returns the intended original format & checksum only if there is at least one letter after the $, e.g. $md5$salt$x.
Passlib attempts to accommodate both formats using the special bare_salt keyword. It is set to True to indicate a configuration or hash string which contains only a single $, and does not incorporate it into the hash calculation. The $$ hash is encountered more often in production since it seems the Solaris salt generator always appends a $; because of this bare_salt=False was chosen as the default, so that hashes will be generated which by default conform to what users are used to.
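To make the difference concrete, here is a small sketch of the salt digest step for a hash with 0 rounds under both conventions. `salt_digest` is a hypothetical helper rather than a Passlib function, and it covers only the initial MD5 digest, not the full hash.

```python
import hashlib


def salt_digest(password: bytes, salt: str, bare_salt: bool = False) -> bytes:
    """Initial MD5 digest for a rounds=0 hash, per the description above."""
    # bare_salt=True hashes "$md5$salt"; the default also includes the "$" after the salt.
    config = f"$md5${salt}" if bare_salt else f"$md5${salt}$"
    return hashlib.md5(password + config.encode("ascii")).digest()


common = salt_digest(b"passwd", "GUBv0xjJ")                # "$$"-style hashes
bare = salt_digest(b"passwd", "GUBv0xjJ", bare_salt=True)  # single-"$" hashes
print(common != bare)
```

Because the two inputs differ by one byte, every later round starts from a different digest, which is why the two formats cannot verify each other's checksums.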
Deviations¶
Passlib’s implementation of Sun-MD5-Crypt deliberately deviates from the official implementation in the following ways:
- Unicode Policy: The underlying algorithm takes in a password specified as a series of non-null bytes, and does not specify what encoding should be used; though a us-ascii compatible encoding is implied by all known reference hashes. In order to provide support for unicode strings, Passlib will encode unicode passwords using utf-8 before running them through sun-md5-crypt. If a different encoding is desired by an application, the password should be encoded before handing it to Passlib.
- Rounds encoding: The underlying scheme implicitly allows rounds to have zero padding (e.g. $md5,rounds=001$abc$), and also allows 0 rounds to be specified two ways ($md5$abc$ and $md5,rounds=0$abc$). Allowing either of these would result in multiple possible checksums for the same password & salt. To prevent ambiguity, Passlib will throw a ValueError if the rounds value is zero-padded, or specified explicitly as 0 (e.g. $md5,rounds=0$abc$).
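The rounds policy can be illustrated with a short check. This is a sketch of the rule as stated above, not Passlib's own parsing code; `parse_rounds_field` is a hypothetical name, and it expects only the text that follows `rounds=`.

```python
def parse_rounds_field(text: str) -> int:
    """Parse the decimal text after 'rounds=' using the strict policy above."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError("rounds must be a decimal integer")
    if text.startswith("0"):
        # Covers both zero padding ("001") and an explicit "0".
        raise ValueError("rounds must not be zero-padded or explicitly 0")
    return int(text)
```

Under this rule each rounds value has exactly one textual form, so a given password, salt and rounds value maps to a single configuration string and therefore a single checksum.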
Given the lack of documentation, lack of test vectors, and known bugs which accompany the original Solaris implementation, Passlib may not be able to accurately generate and verify all hashes encountered in a Solaris environment. Issues of concern include:
- Some hashes found on the web use a $ in place of the , character. It is unclear whether this is an accepted alternate format or just a typo, or whether this is supposed to affect the checksum in the resulting hash string.
- The current implementation needs additional test vectors, especially ones which contain an explicitly specified number of rounds.
- More information is needed about the parsing / formatting issue described in the Bare Salt Issue section.
Footnotes
[1] Overview of & motivations for the algorithm - http://dropsafe.crypticide.com/article/1389
[2] The source of Hamlet’s speech, used byte-for-byte as the constant data - http://www.ibiblio.org/pub/docs/books/gutenberg/etext98/2ws2610.txt