[lug] Perl, sorting hashtables by value, and floating-point

D. Stimits stimits at comcast.net
Tue Jun 1 04:42:56 MDT 2004


Chris Riddoch wrote:
> Hi, everyone.
> 
> I have a bit of a Perl question.  I've got a very large hash table,
> where all values are floating point numbers all between 0 and .001 or
> so.  (They're very tiny probabilities) It's the fourth through tenth
> digits that I really give a darn about, because that's where they
> really start looking different.  Perl supposedly represents floating
> point numbers as doubles, which is fine; on the 64-bit Opteron,
> doubles are way more than big enough for what I need.
> 
> Okay, here's the standard Perl idiom for sorting a hash numerically by
> value:
> 
> foreach my $k (sort { $a <=> $b} keys %hash) {
>     print $hash{$k} . " = " . $k . "\n";
> }
> 
> This completely fails to work when those numbers are floating point.
> I've tried precomputing logarithms to make the numbers more easily
> computable, with no change in Perl's sorting ability.  I've tried 'use
> bignum' in case it might help, but it seems to overload everything but
> the spaceship (<=>) operator.
> 
> I'm certain there's a way to do this.  
> 
> Ideas?
> 

I don't know about perl hash, I'm just speaking about hash in general, 
and floating point. One question comes to mind...what will you use the 
hash for? I'll assume it is just a function used for fast indexing or 
alternate lookup scheme.

Take the hash of value*1000. Don't hash on the original value, hash on 
the value with the decimal shifted. Now they are all between 0 and 
1.something. You lose the ability to hash on higher numbers but gain the 
ability to hash on smaller numbers. If floating point itself does not 
work for this hash function...and numbers that are a fraction less than 
1.0...then shift the decimal until only the smallest number is 0. If you 
are not working with any numbers over 0.002, that should work well.

D. Stimits, stimits AT comcast DOT net



More information about the LUG mailing list