A Fast String Implementation for STL Map

STL provides map support, and a natural choice for key is string under many circumstances. However, the performance of string is weak comparing to others. Let's test it first (using this test program) and see the results from a 550-MHz Celeron ThinkPad:
 
Key type MSVC 6 MSVC 6 (/Ox) MinGW 1.1 MinGW 1.1 (-O2)
int 3164 450 3274 310
const char* 3856 501 3065 310
std::string 19626 7791 18336 4206

It is clear that in whatever compilation mode and using whatever compiler string is at least 10 times slower than int or const char* as key of map. It makes things inconvenient. One has to decide between performance and ease of use. Any good solutions to this problem?

One method is to use a short string as an integer. For me, I chose the 64-bit integer, and thus allow for an 8-character string. The simplest definition might be

union mstring
{
    char        chars[8];
    ULONGLONG   index;
    mstring(const char* sz)
    {
        strncpy(chars, sz, 8);
    }
};

template<> bool
std::less<mstring>::operator()(const mstring& x, const mstring& y) const
{
    return x.index < y.index;
}

wherein ULONGLONG is the type definition for 64-bit unsigned integer on a specific platform. The updated test program is here. The results show that accessing map<mstring, ...> takes only one ninth to one fourth of the time accessing map<string, ...> takes. On MSVC 6 it is optimized so well that it takes only twice as much of time as map<int, ...>.

For a big-endian machine it should work perfectly. For a little-endian machine you will find the sort order a mess. I will not explain much about it but instead give my solution. You could have a look at my complete mstring.h. It also includes overridden operators, ostream support, and hash_map support (when HAVE_HASH is defined or <hash_map> from either SGI or STLport has been included). Many MinGW mailing list subscribers should be able to tell why I asked some silly questions about template specialization some time ago.

All code here is considered in the public domain though I do wish my name could be retained if anyone used them. :-)

2002-2-24, written by Wu Yongwei.


This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 Licence.

Return to Main