Header <boost/md5.hpp>


Abstract

«The [MD5] algorithm takes as input a message of arbitrary length and produces as output a 128-bit "fingerprint" or "message digest" of the input. It is conjectured that it is computationally infeasible to produce two messages having the same message digest, or to produce any message having a given prespecified target message digest. ... The MD5 algorithm is designed to be quite fast on 32-bit machines.» –RFC1321


Synopsis

namespace boost
{
    class md5
    {
    public:
        md5();
        ~md5();

        // Constructs a digest for given message data.
        md5(const char* a_str);
        md5(const void* a_data, uint32_t a_data_size);
        md5(std::istream& a_istream);

        // Updates the digest with additional message data.
        void update(const char* a_str);
        void update(const void* a_data, uint32_t a_data_size);
        void update(std::istream& a_istream);

        // A message digest.
        class digest_type
        {
        public:
            // A digest value as a 16-byte raw binary array.
            typedef uint8_t value_type[16];

            // A digest value as a 33-byte ascii-hex string.
            typedef char hex_str_value_type[33];

            digest_type();  // Constructs a zero digest.
            digest_type(const value_type& a_value);
            digest_type(const hex_str_value_type& a_hex_str_value);
            digest_type(const digest_type& a);

            void reset();  // Resets to a zero digest.
            void reset(const value_type& a_value);
            void reset(const hex_str_value_type& a_hex_str_value);
            digest_type& operator=(const digest_type& a);

            // Gets the digest value.
            const value_type& value() const;
            const hex_str_value_type& hex_str_value() const;

            ~digest_type();
        };

        // Acquires the digest.
        const digest_type& digest();
    };

    inline bool operator==(const md5::digest_type& a, const md5::digest_type& b);
    inline bool operator!=(const md5::digest_type& a, const md5::digest_type& b);
}

Examples

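#include <boost/md5.hpp>
#include <iostream>
#include <fstream>

int main()
{
    // Digest of a string, printed as ascii-hex.
    std::cout << boost::md5("message").digest().hex_str_value() << '\n';

    // Digest of a file's contents. The stream must be a named object,
    // because the constructor takes a non-const std::istream&.
    std::ifstream file("file.txt", std::ios::binary);
    std::cout << boost::md5(file).digest().hex_str_value() << '\n';
}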

Portability

The code has been compiled and tested with:

Security

The code does its best to ensure that memory holding potentially sensitive information is zeroized before it is released to the operating system. Clearing is usually done at the point where the code last uses such a memory fragment. However, a decent compiler may optimize this intentional behaviour away, because a write to memory that is never read again looks like a dead store that can be removed.

To preserve the clearing even in optimized builds, the code defines and uses the secure_memset function:

    void* secure_memset(void* dst, int c, uint32_t size)
    {
        return memset(dst, c, size);
    }

This is a non-inlined wrapper for memset, which prevents many current optimizers from eliminating the call. The technique is simple to implement and relatively compiler independent, which is why it is used. It remains a trick, however, and like any trick it is not guaranteed to always work. As always when security is a serious concern, the generated code should be examined carefully on a case-by-case basis.

There are other possible techniques to prevent some optimizers from eliminating the "dead code". They include writing to a memory fragment in question through a volatile pointer:

    inline void* secure_memset(void* dst, int c, uint32_t size)
    {
        volatile char* p = static_cast<char*>(dst);

        while (size-- != 0) *p++ = static_cast<char>(c);

        return dst;
    }

This method is not guaranteed to work with every compiler either. Optimizers are becoming smarter, and the volatile semantics of the C++ Standard do not actually guarantee the effect being exploited here.
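
As a usage sketch of the volatile-pointer variant, the code below wraps it in a small template so that a fixed-size buffer can be cleared without passing its size by hand. The names wipe_buffer and wipe_array are hypothetical and do not appear in the library. The same caveat applies: the sketch shows how the technique would be used, not that a given optimizer will keep the writes.

```cpp
#include <cstddef>

// Hypothetical name; same idea as the volatile-pointer secure_memset above.
// Every byte is written through a pointer to volatile.
inline void* wipe_buffer(void* dst, int c, std::size_t size)
{
    volatile unsigned char* p = static_cast<unsigned char*>(dst);

    while (size-- != 0)
        *p++ = static_cast<unsigned char>(c);

    return dst;
}

// Hypothetical convenience wrapper: zeroes a whole array, deducing its size
// so that the caller cannot pass a mismatched length.
template <typename T, std::size_t N>
inline void wipe_array(T (&a)[N])
{
    wipe_buffer(a, 0, sizeof(a));
}
```

A caller would invoke wipe_array(key) immediately after the last use of a buffer such as char key[32], in line with the "clear at the point of last use" policy described above.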

Another method, introducing a fake use of the memory fragment after the call to memset, has been reported to no longer work with some compilers. Among the advanced things modern optimizers can do is to analyse the code statistically while it executes, or to create background copies of some memory fragments at run-time. This means that the only absolutely reliable way to make sure sensitive information is secure in such cases is to turn off optimizations locally, which is very compiler dependent.


See also


License

Copyright © 2002-2003 Stanislav Baranov. Permission to copy, use, modify, sell and distribute this software and its documentation is granted provided this copyright notice appears in all copies. This software is provided "as is" without express or implied warranty, and with no claim as to its suitability for any purpose. Derived from the RSA Data Security, Inc. MD5 Message-Digest Algorithm.

Copyright © 1991-2, RSA Data Security, Inc. Created 1991. All rights reserved. License to copy and use this software is granted provided that it is identified as the "RSA Data Security, Inc. MD5 Message-Digest Algorithm" in all material mentioning or referencing this software or this function. License is also granted to make and use derivative works provided that such works are identified as "derived from the RSA Data Security, Inc. MD5 Message-Digest Algorithm" in all material mentioning or referencing the derived work. RSA Data Security, Inc. makes no representations concerning either the merchantability of this software or the suitability of this software for any particular purpose. It is provided "as is" without express or implied warranty of any kind. These notices must be retained in any copies of any part of this documentation and/or software.



© Copyright Stanislav Baranov, 2003