[lug] Perl, sorting hashtables by value, and floating-point
D. Stimits
stimits at comcast.net
Tue Jun 1 04:42:56 MDT 2004
Chris Riddoch wrote:
> Hi, everyone.
>
> I have a bit of a Perl question. I've got a very large hash table,
> where all values are floating point numbers all between 0 and .001 or
> so. (They're very tiny probabilities) It's the fourth through tenth
> digits that I really give a darn about, because that's where they
> really start looking different. Perl supposedly represents floating
> point numbers as doubles, which is fine; on the 64-bit Opteron,
> doubles are way more than big enough for what I need.
>
> Okay, here's the standard Perl idiom for sorting a hash numerically by
> value:
>
> foreach my $k (sort { $a <=> $b} keys %hash) {
> print $hash{$k} . " = " . $k . "\n";
> }
>
> This completely fails to work when those numbers are floating point.
> I've tried precomputing logarithms to make the numbers more easily
> computable, with no change in Perl's sorting ability. I've tried 'use
> bignum' in case it might help, but it seems to overload everything but
> the spaceship (<=>) operator.
>
> I'm certain there's a way to do this.
>
> Ideas?
>
I don't know about perl hash, I'm just speaking about hash in general,
and floating point. One question comes to mind...what will you use the
hash for? I'll assume it is just a function used for fast indexing or
alternate lookup scheme.
Take the hash of value*1000. Don't hash on the original value, hash on
the value with the decimal shifted. Now they are all between 0 and
1.something. You lose the ability to hash on higher numbers but gain the
ability to hash on smaller numbers. If floating point itself does not
work for this hash function...and numbers that are a fraction less than
1.0...then shift the decimal until only the smallest number is 0. If you
are not working with any numbers over 0.002, that should work well.
D. Stimits, stimits AT comcast DOT net
More information about the LUG
mailing list