Is there a way to make this hash lookup any faster?

Asked 6/8, 2010 at 7:45 Answered 6/8, 2010 at 12:22

I have a requirement to (very) quickly process strings of a limited range, tallying their values. The input file is of the form:

January    7
March     22
September 87
March     36

and so forth. Because the line widths are identical, I can simply read in a line with fread reasonably fast, and I've developed a perfect hashing function which works, but I wanted to see if anyone could offer any advice on how to make it even faster. I'll profile each suggestion to see how it goes.

The hashing function is based on the month name to allow fast allocation of the value to a bucket. Bear with me here. I first figured out the minimal number of characters for a perfect hash:

January
February
March
April
May
June
July
August
September
October
November
December

Keep in mind that the months are all nine characters due to the fact I have the entire input line.

Unfortunately, there is no single column to mark a month unique. Column 1 duplicates J, column 2 duplicates a, column 3 duplicates r, column 4 duplicates u and columns 5 onwards duplicate <space> (there are other duplicates but one is enough to prevent a single-column hash key).

However, by using the first and fourth column, I get the values Ju, Fr, Mc, Ai, M<space>, Je, Jy, Au, St, Oo, Ne and De, which are unique. There will be no invalid values in this file so I don't have to worry about incorrect buckets for the input data.

By viewing the hex codes for the characters, I found I could get low unique values by just ANDing with strategic values:

FirstChar  Hex  Binary     &0x0f
---------  ---  ---------  -----
   A       x41  0100 0001      1
   D       x44  0100 0100      4
   F       x46  0100 0110      6
   J       x4a  0100 1010     10
   M       x4d  0100 1101     13
   N       x4e  0100 1110     14
   O       x4f  0100 1111     15
   S       x53  0101 0011      3

SecondChar  Hex  Binary     &0x1f
----------  ---  ---------  -----
 <space>    x20  0010 0000      0
    c       x63  0110 0011      3
    e       x65  0110 0101      5
    i       x69  0110 1001      9
    o       x6f  0110 1111     15
    r       x72  0111 0010     18
    t       x74  0111 0100     20
    u       x75  0111 0101     21
    y       x79  0111 1001     25

and this allowed me to set up a static array to create a (hopefully) blindingly-fast hash function:

#define __ -1
static unsigned int hash (const char *str) {
    static unsigned char bucket[] = {
        //   A       S   D       F               J           M   N   O
        __, __, __, __, __, __, __, __, __, __, __, __, __,  4, __, __, // space
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __, __, __, __,  2, __, __, // c
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, 11, __, __, __, __, __,  5, __, __, __, 10, __, // e
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __,  3, __, __, __, __, __, __, __, __, __, __, __, __, __, __, // i
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __,  9, // o
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __,  1, __, __, __, __, __, __, __, __, __, // r
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __,  8, __, __, __, __, __, __, __, __, __, __, __, __, // t
        __,  7, __, __, __, __, __, __, __, __,  0, __, __, __, __, __, // u
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __,  6, __, __, __, __, __  // y
    };
    return bucket[((unsigned int)(str[3]&0x1f)<<4)|(str[0]&0xf)];
}

Testing that with the code:

#include <stdio.h>
#include <string.h>

// Hash function here.

static char *months[] = {
    "January  ", "February ", "March    ", "April    ", "May      ", "June     ",
    "July     ", "August   ", "September", "October  ", "November ", "December "
};

int main (void) {
    int i;
    for (i = 0; i < sizeof(months)/sizeof(*months); i++)
        printf ("%-10s -> %2d\n", months[i], hash(months[i]));
    return 0;
}

shows that it's functionally correct:

January    ->  0
February   ->  1
March      ->  2
April      ->  3
May        ->  4
June       ->  5
July       ->  6
August     ->  7
September  ->  8
October    ->  9
November   -> 10
December   -> 11

but I want to know if it can be made faster.

Any suggestions out there? I'm open to any simple optimisations or even a total rewrite if there's something inherently bad with my hashing function.

I don't think this is that important but the final version will be using EBCDIC. The theory will still stand but the AND operation may change slightly since the characters have different code points. I'll be happy with any assistance only on the ASCII front since I'm confident whatever advice is offered will translate okay to EBCDIC.

Alkalimeter answered 6/8, 2010 at 7:45 Comment(24)

I dont understand why you need a hashtable when you have a fixed number of keys, that logically represent 0 - 11. Edit: I think I understand now, nevermind the question :) – Riband 6/8, 2010 at 7:49

To clarify, you cant just take an external string's address and hope it will give you an index to an array :) – Riband 6/8, 2010 at 7:53

@leppie, you're right, the input data are the strings, my test program output the (array index) numbers just for my debugging. Sorry about that, I'll edit to remove any confusion. – Alkalimeter 6/8, 2010 at 7:55

What kind of architecture, how large are the cache lines? – Cleisthenes 6/8, 2010 at 7:55

@roe, good luck with this one. Final architecture is a System z mainframe. Could be anything from a z900 to one of the new z10 EC machines. :-) – Alkalimeter 6/8, 2010 at 7:57

yeah, I figured something like that (EBCDIC kind of gave it away)... the z900 has a cacheline size of 256 bytes, which will just fit the table, so there'll be no cache-trashing because of it. You might be able to compact it with some bit-fiddling, but I doubt it'll be any faster. – Cleisthenes 6/8, 2010 at 8:0

If you're this concerned with performance, why not write it in assembly? – Pris 6/8, 2010 at 8:4

Oh wait, no! Your table is not 16x16... Yeah, you might have some thrashing. If you're able to fold it in to a 256 byte table, maybe you'll be able to speed it up some. – Cleisthenes 6/8, 2010 at 8:6

You ARE reading it more than one line at a time though, right? – Cleisthenes 6/8, 2010 at 8:8

@roe, I did consider using (str[3]&0x1e)<<2 then having less height in that structure (collapsing each two lines into one). The only relevant lines that would merge would be t and u and there's no clashes there because they don't share common columns. While that won't make it faster (in terms of the binary ops), it will take up less space (and possibly make it faster in re cache occupancy). So bang that in an answer and I'll give you a vote (at least). – Alkalimeter 6/8, 2010 at 8:8

@roe: And bang the multi-line in your answer too. At the moment, I'm relying on the caching of fread (or more precisely the equivalent in the language I'm using). It may well be that reading can be made more efficient as well. – Alkalimeter 6/8, 2010 at 8:10

@pelotom (sorry, missed your comment), it's actually already written in a very low level proprietary language, effectively asm with loops :-) – Alkalimeter 6/8, 2010 at 8:15

Actually, @roe, the current table is 16x26, collapsing the height will make it 16x13 at what looks like no execution penalty. Do you think that's worth a shot? – Alkalimeter 6/8, 2010 at 8:17

At this level of optimization, it seems reasonable to question why you're storing these data in such an inefficient format. Reading in the file has got to be taking up the majority of your time anyway, so why not just assign a 4-bit number to each month, and set aside 12 or 28 or however many bits you need for the value it maps to, and then you have a nice compact file which parses fast and requires no hashing once it's read in. – Pris 6/8, 2010 at 8:43

@pelotom, the file is from an external source, we have no control over its format. – Alkalimeter 6/8, 2010 at 9:0

Out of curiosity, what's the actual performance requirement? How many lines per second? – Waken 6/8, 2010 at 9:45

If you are reading the file from an external source over which you have no control, the format checking you need to do is going to swamp (in terms of processor cycles) anything you can squeeze out of the hash algorithm, never mind the time take to do the IO. – Chinchin 6/8, 2010 at 9:47

@Jeremy, no format checking is required. What we'll end up doing is having an extra bucket for all the other (invalid) months (currently set to __ or -1). We have agreements in place with the source that invalid data will be tossed away. – Alkalimeter 6/8, 2010 at 10:29

@Vinko, it's not a hard requirement, it's more about maximum efficiency. Mainframe time is still charged in some shops per CPU cycle and/or DASD/central-storage usage. We're trying to minimise those cost rather than get through N records per second. This is part of a performance monitoring application. We don't want to start showing up in the reports as one of the heavy CPU users :-) – Alkalimeter 6/8, 2010 at 10:33

@paxdiablo: If there's any possibility of invalid data, you have to do full string comparisons to detect and throw it away... – Foreman 6/8, 2010 at 12:25

Actually, that's a good point, R, the hash code might be out of range of the array. I'm pretty certain the specs state that the data has to be valid - I'll check that next week when I'm back at work. Good catch! – Alkalimeter 6/8, 2010 at 12:34

@paxdiablo: But if you have any invalid data there is a possibility of hash collisions with valid data which means you will be treating invalid data as valid. For instance, supposing an upstream bug occurs so that you get a month of Janeary? – Chinchin 6/8, 2010 at 12:35

Yes, but the veracity of the data will not be our concern. It is the customer supplying the data and the same customer using the resultant stats. Provided we've made it clear where the responsibilities lie (and I'm pretty certain we have), it will not be a problem. But now we're deviating from technical issues to contractual ones. I greatly appreciate the heads-up for these points but we don't need to go down that track for the purposes of this particular question. – Alkalimeter 6/8, 2010 at 12:58

@paxdiablo: I just hope your customer isn't doing anything that impacts on my life. – Chinchin 7/8, 2010 at 11:27

Here's the smallest sequence I could find for EBCDIC-US:

It has 24 elements in the bucket and uses only 2 operations to compute the index:

static unsigned int hash (const char *str)
{
 static unsigned char tab[] = {
    11, 4,__, 7,__,__, 9, 1,
    __,__,__,__,__,__,__,__,
     3, 5, 2,10, 8,__, 0, 6
 };
 return tab[0x17 & (str[ 1 ] + str[ 2 ])];
}

Second best, 25 items with xor:

static unsigned int hash(const char *str)
{
 static unsigned char tab[] = {
  9,__,__, 7,__,__,11, 1,
 __, 4,__,__,__,__, 3,__,
 __, 5, 8,10, 0,__,__, 6, 2
 };
 return tab[0x1f & (str[ 1 ] ^ str[ 2 ])];
}

(Actually, tab[] should be 32 entries long here, because 0x1f can generate an overflow for incorrect inputs).

Update from Pax: For what it's worth, the first option worked perfectly for EBCDIC code page 500:

## Month     str[1] str[2] Lookup
-- --------- ------ ------ ------
 0 January   a (81) n (95)      0
 1 February  e (85) b (82)      1
 2 March     a (81) r (99)      2
 3 April     p (97) r (99)      3
 4 May       a (81) y (a8)      4
 5 June      u (a4) n (95)      5
 6 July      u (a4) l (93)      6
 7 August    u (a4) g (87)      7
 8 September e (85) p (97)      8
 9 October   c (83) t (a3)      9
10 November  o (96) v (a5)     10
11 December  e (85) c (83)     11

Nerte answered 6/8, 2010 at 11:0 Comment(5)

I'll have to wait until next week to verify this one, I don't have a mainframe at home :-) But, since it looks good for the first few characters, I'll assume you've done the gruntwork correctly. +1. – Alkalimeter 6/8, 2010 at 11:34

The first solution looks great. In the 2nd one the array should really to be filled up with __ at the end to avoid illegal memory access. Both implementations are also case insensitive and don't rely on the trailing space. – Ciapha 6/8, 2010 at 14:21

I still wonder it we can take advantage from the empty middle row. I tried some shifting & pushing so that the month number can be pulled out off a 64 bit constant but this was much slower than the table. – Nerte 6/8, 2010 at 14:42

Luther, we have a winner (only just, but every CPU cycle counts for us, and it was substantially better than my first attempt). Thanks for your help. – Alkalimeter 8/8, 2010 at 13:22

\\(^_^)/. Will you be posting some stats later? I'm interested to hear what the throughput/cycle figures of the zOS machines are. – Nerte 8/8, 2010 at 19:29

I agree with the others that there is not much room for improvement. All I can suggest is a smaller lookup table that works with the same number of operations which might make it stay longer in the CPU cache. Additionally it does not rely on the space filling chars at the end and it works with any mixture of uppercase and lowercase characters. I found that adding some reasonable robustness against likely changes in the requirements often pays off in the future especially when the implementation is optimized to a point where small changes are not so easy anymore.

#define __ -1
static unsigned int hash (const char *str)
{
    static unsigned char tab[] = {
        __, __,  1, 11, __, __, __, __,  7, __, __, __, __,  6,  0,  5,
         8, __,  2,  3,  9, __, 10, __, __,  4, __, __, __, __, __, __
    };
    return tab[ ( ( str[ 1 ] >> 4 ) & 1 ) + ( str[ 2 ] & 0x1f ) ];
}

This works similar to your original idea but with less white space:

Month  s[1]          s[2]          s[1].4  s[2].4-0  sum  lookup
-----  ------------  ------------  ------  --------  ---  ------
Jan    61:0110 0001  6e:0110 1110       0        14   14       0
Feb    65:0110 0101  62:0110 0010       0         2    2       1
Mar    61:0110 0001  72:0111 0010       0        18   18       2
Apr    70:0111 0000  72:0111 0010       1        18   19       3
May    61:0110 0001  79:0111 1001       0        25   25       4
Jun    75:0111 0101  6e:0110 1110       1        14   15       5
Jul    75:0111 0101  6c:0110 1100       1        12   13       6
Aug    75:0111 0101  67:0110 0111       1         7    8       7
Sep    65:0110 0101  70:0111 0000       0        16   16       8
Oct    63:0110 0011  74:0111 0100       0        20   20       9
Nov    6f:0110 1111  76:0111 0110       0        22   22      10
Dec    65:0110 0101  63:0110 0111       0         3    3      11
             ^             ^ ^^^^
bits:        4             4 3210

Ciapha answered 6/8, 2010 at 8:47 Comment(5)

Hey, that's pretty good, gave me the same results as my larger table. I see your using different characters as well. Of course, I'm going to have to analyse it but it looks good from just plugging it into my test code. – Alkalimeter 6/8, 2010 at 8:56

That is elegant but, more importantly from my point of view, sneaky and underhanded :-) Added my analysis for completeness. – Alkalimeter 6/8, 2010 at 9:29

Thanks for the analysis. By the way, this implementation works also with the usual 3 letter abbreviations of months names. – Ciapha 6/8, 2010 at 10:12

Apologies, x4u. The solution was brilliant for ASCII but didn't translate that well for EBCDIC since the letters there cross three high-nybble ranges (so that you need two bits from the first character to ). I got it to work but the performance was (only slightly) less than Luther's answer. – Alkalimeter 8/8, 2010 at 13:20

No problem. I agree with your that Luther's implementation is the best for EBCDIC. I had not seen the EBCDIC requirement when I started my answer. – Ciapha 8/8, 2010 at 13:39

Here's the smallest sequence I could find for EBCDIC-US:

It has 24 elements in the bucket and uses only 2 operations to compute the index:

static unsigned int hash (const char *str)
{
 static unsigned char tab[] = {
    11, 4,__, 7,__,__, 9, 1,
    __,__,__,__,__,__,__,__,
     3, 5, 2,10, 8,__, 0, 6
 };
 return tab[0x17 & (str[ 1 ] + str[ 2 ])];
}

Second best, 25 items with xor:

static unsigned int hash(const char *str)
{
 static unsigned char tab[] = {
  9,__,__, 7,__,__,11, 1,
 __, 4,__,__,__,__, 3,__,
 __, 5, 8,10, 0,__,__, 6, 2
 };
 return tab[0x1f & (str[ 1 ] ^ str[ 2 ])];
}

(Actually, tab[] should be 32 entries long here, because 0x1f can generate an overflow for incorrect inputs).

Update from Pax: For what it's worth, the first option worked perfectly for EBCDIC code page 500:

## Month     str[1] str[2] Lookup
-- --------- ------ ------ ------
 0 January   a (81) n (95)      0
 1 February  e (85) b (82)      1
 2 March     a (81) r (99)      2
 3 April     p (97) r (99)      3
 4 May       a (81) y (a8)      4
 5 June      u (a4) n (95)      5
 6 July      u (a4) l (93)      6
 7 August    u (a4) g (87)      7
 8 September e (85) p (97)      8
 9 October   c (83) t (a3)      9
10 November  o (96) v (a5)     10
11 December  e (85) c (83)     11

Nerte answered 6/8, 2010 at 11:0 Comment(5)

Luther, we have a winner (only just, but every CPU cycle counts for us, and it was substantially better than my first attempt). Thanks for your help. – Alkalimeter 8/8, 2010 at 13:22

\\(^_^)/. Will you be posting some stats later? I'm interested to hear what the throughput/cycle figures of the zOS machines are. – Nerte 8/8, 2010 at 19:29

This is tested for EBDIC (CCSID 500), the table 32 byte (smaller than yours, same size as x4u's):

#define __ -1
static unsigned int hash(const char *str)
{
    static unsigned char bucket[] = {
        __, __, __, __, __, __,  1,  8,
        __,  7, __, __, __,  3, __, __,
        11,  6, __, __,  4, __,  2, __,
        __,  0, __,  5,  9, __, __, 10,
    }
    return bucket[(unsigned int)(str[0]|str[3]<<1)&0x1f];
}

Recruit answered 6/8, 2010 at 10:5 Comment(0)

I would start with a detailed profile of your larger process to make sure you aren't engaging in premature optimization.

This looks pretty fast on the face of it, but if memory is really cheap it might be better to just use an even sparser array and let your cache do some of the work. For instance (and thinking off the cuff here), what if you simply add the short found in the first two bytes to the short at the next two. That includes both the first and fourth characters, so at a guess it should produce your 12 distinct values, and it doesn't involve bit field extractions which may not optimize well. Then, make the matching bucket[] array have 64K entries, only 12 of which are ever hit. If it works out right, those 12 entries end up occupying some of your D cache and you've traded a handful of arithmetic operations for an index into a cached larger array.

But do profile both before and after any mucking about trying to make arithmetic faster, and don't bother optimizing where it won't actually save time. (I know Pax knows this, but its the obligatory warning attached to any optimization discussion.)

Aeroscope answered 6/8, 2010 at 8:5 Comment(5)

Actually, looking back at the data, that's not a bad idea. The third and fourth character are unique as a set on their own, so I could just use that single 16-bit value as an index, no arithmetic or bit manipulation at all. I'll have to test the larger lookup table size against the reduced instructions to see how it pans out (increased cache misses may offset fewer instructions) but +1 anyway. – Alkalimeter 6/8, 2010 at 8:32

You'd need to extraction anyway, there's no guarantee that the shorts are aligned.. don't know how that's handled on your architecture though. – Cleisthenes 6/8, 2010 at 8:41

@pax, Without knowing way more than I want to know about any system that still uses EBCDIC, I can't guess how the cache vs. arithmetic dynamic will play out. As long as the values are well spaced relative to your cache line size and associativity, there is a chance... so it seems worth a try. – Aeroscope 6/8, 2010 at 8:53

@roe, Good point. If the record length is fixed and even, then all you need to do is align the buffer. Murphy says it is neither, of course. – Aeroscope 6/8, 2010 at 8:54

If we're using a one character line-breaks, you're in luck (10 byte records), of not, then not so much. But yeah, that ****** OS is probably going to give you an unaligned buffer anyway... ;) – Cleisthenes 6/8, 2010 at 11:45

Ok, as everyone on SO, I'm all in it for the rep.. ;*) As I wrote in comments above, the lower end of your target architectures has a cache line size of 256 bytes, so you might end up with some cache trashing in your table lookups (your table is more than 256 bytes). An attempt to fold the table using some cheap bit-trick might actually gain some performance.

I've been playing around with your data. You also have the option of column 2 and 3. Haven't figured out a way to get that under 8 bits yet though.

... and as always, profile, make sure it's the best point to apply effort, and profile again afterward, make sure it's faster.

... and you are reading more than one line at a time, right? Fixed record sizes are good that way, that you don't have to search for separators (newlines), and you can read a big chunk of them at a time.

You can reduce the array size by using:

#define __ -1
static unsigned int hash (const char *str) {
    static unsigned char alloc_to[] = {
        //   A       S   D       F               J           M   N   O
        __, __, __, __, __, __, __, __, __, __, __, __, __,  4, __, __, // space
        __, __, __, __, __, __, __, __, __, __, __, __, __,  2, __, __, // c
        __, __, __, __, 11, __, __, __, __, __,  5, __, __, __, 10, __, // e
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __,  3, __, __, __, __, __, __, __, __, __, __, __, __, __, __, // i
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __,  9, // o
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __,  1, __, __, __, __, __, __, __, __, __, // r
        __,  7, __,  8, __, __, __, __, __, __,  0, __, __, __, __, __, // t/u
        __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, __, //
        __, __, __, __, __, __, __, __, __, __,  6, __, __, __, __, __  // y
    };
    return alloc_to[((unsigned int)(str[3]&0x1e)<<3)|(str[0]&0xf)];
}

which changes it from 16-by-26 to 16-by-13.

EDIT

If, as suggested by other posts, your strings ARE aligned, so that you may use them as shorts, you may add the first and second short, xor the two bytes together and you'll have a unique 8-bit key (well, seven, actually). Might be worth your while too. This is ASCII though, so might not work in EBCDIC. In ASCII, the keys turn out to be:

6e Jan
7f Feb
7b Mar
6a Apr
47 May
62 Jun
58 Jul
42 Aug
1a Sep
11 Oct
10 Nov
6d Dec

Cleisthenes answered 6/8, 2010 at 8:38 Comment(2)

I'll probably look into collapsing rows rather than columns since the row calculation already has an AND and a bit shift. To colapse columns, I'd have to add another bitshift. I can collapse every two columns to one by using (str[3]&0x1e)<<3 instead of (str[3]&0x1f)<<4. If you don't mind, I'd like to put that in your answer since the cache-miss minimisation was your idea. – Alkalimeter 6/8, 2010 at 8:47

@pax; thanks, I guess.. :) I wouldn't had thought of the (s&0x1e)<<3 though – Cleisthenes 6/8, 2010 at 9:7

Looks nice enough for me. The question is if the hash function itself is enough of a bottleneck to justify the ongoing efforts of eliminating one or two more simple binary operations from it. Given that file access seems to be involved, I doubt it, without knowing any details about the overall processing, of course.

EDIT:

Maybe you could see if you find any pair of characters that results in unique lower bits (4, 5 or 6) when added:

(str[1] + str[2]) & 0x1f

If addition won't do, maybe one of the other operations & | ^. If this won't help, maybe using three characters.

Orvah answered 6/8, 2010 at 7:59 Comment(1)

It may well be that the CPU improvements will not be relevant in relation to the file I/O. But mainframes are insanely efficient at I/O - that's where they blow other boxes out of the water since the CPUs, until recently, weren't actually that grunty (though they do come in multiple "books" of 54 each so that pretty well makes up for it). Your advice is noted - all these options will be undergoing rigorous performance testing. – Alkalimeter 6/8, 2010 at 9:42

In ASCII, if you take month[0] ^ month[2] ^ month[3] then you get a unique hash with a maximum value of 95 (July), which should allow you to reduce your table size a fair bit (and a minimum value of 20 (May), so a subtraction makes it smaller again).

The same might not be true in EBCDIC, but something similar might be.

Silvern answered 6/8, 2010 at 9:10 Comment(3)

That's not a bad strategy. It's less than my original table size and I could get a smaller table still with the cost of a single subtraction (since I assume there's a minimum value as well of at least x20 in ASCII since that's the upper/lower bit difference). I'll test it out on EBCDIC when I get back to work but +1 anyhow. – Alkalimeter 6/8, 2010 at 9:37

@paxdiablo: There was actually a collision I'd missed - but adding month[2] as per my updated answer fixes that, and makes the table smaller to boot. – Silvern 6/8, 2010 at 9:48

In ASCII, month[2]^month[4] is unique (no matter if "May"[4] == 0 or ' ') – Nerte 6/8, 2010 at 10:27

Do you really need the mapping between the hash and the month index to do the tallying? You could eliminate a lookup if instead of returning the month you returned the hash and use that for tallying. In x4u's answer the last line of the hash function could look like

return ( ( str[ 1 ] >> 4 ) & 1 ) + ( str[ 2 ] & 0x1f )

and you'd still be able to do the sums, sorting the results only in the end, not inside the loop.

Osteopathy answered 6/8, 2010 at 12:22 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags