Thursday, August 31, 2006

Missing Functionality


The SGI extensions to the STL defines a hash_map being an efficient map without allowing ordered access to the elements and thus being "useful for dictionaries".

So far so good. The big surprise is that hash_map cannot be used with string as key out of the box, because the SGI extension does not define a proper hash function. From http://www.sgi.com/tech/stl/hash.html
The hash template is only defined for template arguments of type char*, const char*, crope, wrope, and the built-in integral types.
So, if you want to use string as the key for a hash_map you have to provide a template specialization like this (supposing you are using the GNU implementation of the STL)

namespace __gnu_cxx {
    template<> struct hash<std::string> {
    public:
        size_t operator()(const std::string& s) const
        {
            return h(s.c_str());
        }
    private:
        __gnu_cxx::hash<const char*> h;
    };
}

I have no idea why this spezialication is not included from the begining. Afterall dictionaries are often used to store strings. If someone knows some background please mail me. My conjecture is that SGI wanted rope to be prefered over string. First because there are hash functions for ropes and second because of http://www.sgi.com/tech/stl/basic_string.html:
Note that the C++ standard does not specify the complexity of basic_string operations. In this implementation, basic_string has performance characteristics very similar to those of vector: access to a single character is O(1), while copy and concatenation are O(N). By contrast, rope has very different performance characteristics: most rope operations have logarithmic complexity.
Thanks to Andreas Bernauer for reporting this.

No comments:

Post a Comment