[lug] Perl, sorting hashtables by value, and floating-point

Tkil tkil at scrye.com
Tue Jun 1 00:06:50 MDT 2004


>>>>> "Chris" == Chris Riddoch <chris-blug at syntacticsugar.org> writes:
Chris> Okay, here's the standard Perl idiom for sorting a hash
Chris> numerically by value:

Chris> foreach my $k (sort { $a <=> $b} keys %hash) {
Chris> print $hash{$k} . " = " . $k . "\n";
Chris> }

>>>>> "Tkil" == tkil  <tkil at scrye.com> writes:

Tkil> Nope, you're sorting the hash *keys* by value.  If you want to
Tkil> get a list of keys in sorted by their values, you need to
Tkil> compare the values associated with $a and $b, not $a and $b
Tkil> themselves:

Wow, that was amazingly poorly phrased.

The code sample that Chris gave simply traverses the set of keys in
increasing numeric order.

The code I gave should provide a list of the keys, ordered such that
if you fetch the value associated with each of those keys in turn, the
resulting list of values will be in increasing numeric order.
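
To make the difference concrete, here is a minimal sketch of the two
orderings side by side.  The sample data and the sub names keys_by_key
and keys_by_value are made up for illustration; neither appears
earlier in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data, chosen so the two orderings differ.
my %hash = ( 10 => 0.5, 9 => 0.1, 100 => 0.3 );

# Keys in increasing numeric order of the keys themselves
# (this is what comparing $a and $b directly does).
sub keys_by_key {
    my ($h) = @_;
    return sort { $a <=> $b } keys %$h;
}

# Keys ordered so that their associated values increase
# (compare the values looked up through $a and $b).
sub keys_by_value {
    my ($h) = @_;
    return sort { $h->{$a} <=> $h->{$b} } keys %$h;
}

print "by key:   @{[ keys_by_key(\%hash) ]}\n";
print "by value: @{[ keys_by_value(\%hash) ]}\n";
```

With that data the first line prints the keys as 9 10 100, and the
second prints 9 100 10, since the values 0.1, 0.3, 0.5 belong to keys
9, 100 and 10 respectively.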

Chris mentioned probabilities.  One application that comes to mind is
Monte Carlo evaluation of volumes.  I think I can come up with a
similar data set by shooting rays at random into the square where x
and y each range over -10..+10, recording what fraction of those rays
hit the unit circle centered at the origin, and breaking those hits
down further on a 4x4 grid:

| #!/usr/bin/perl
| 
| use strict;
| use warnings;
| 
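| my $N_RAYS = 1_000_000;
| 
| # Each hit adds 1/N, so every value ends up as the fraction of all
| # rays that landed in that cell.
| my %hash;
| my $points_per_hit = 1.0 / $N_RAYS;
| for ( my $i = 0; $i < $N_RAYS; ++$i )
| {
|     # Uniform random point in the square -10..+10 on both axes.
|     my $x = -10 + rand 20;
|     my $y = -10 + rand 20;
|     if ( $x*$x + $y*$y <= 1 )
|     {
|         # 2*($x+1) runs over 0..4; chr truncates it to an integer,
|         # so each axis is split into four 0.5-wide bands 'A'..'D'.
|         my $xc = chr( ord('A') + 2*($x+1) );
|         my $yc = chr( ord('A') + 2*($y+1) );
|         $hash{ $xc . $yc } += $points_per_hit;
|     }
| }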
| 
| foreach my $cell ( sort { $hash{$a} <=> $hash{$b} } keys %hash )
| {
|     printf "%.6f => %s\n", $hash{$cell}, $cell;
| }

Which gives the following output:

| $ ./chris1.plx
| 0.000190 => AD
| 0.000210 => AA
| 0.000215 => DA
| 0.000225 => DD
| 0.000542 => AB
| 0.000550 => DB
| 0.000578 => BC
| 0.000580 => BA
| 0.000585 => CD
| 0.000590 => AC
| 0.000594 => DC
| 0.000599 => CB
| 0.000599 => CA
| 0.000602 => BD
| 0.000643 => CC
| 0.000654 => BB
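
A quick sanity check on those numbers: the unit circle has area pi and
the square has area 400, so the values should total roughly pi/400, or
about 0.007854.  The sixteen figures above add up to roughly 0.00796,
which is in the right neighbourhood for a million rays.  The sketch
below is a variant, not the script that produced the output above: it
counts hits as integers and divides only when printing, which avoids
adding a rounded 1/N a million times over, and it prints the total
next to pi/400.  The sub names cell_letter and count_hits are my own
invention for this sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map one coordinate in -1..1 onto 'A'..'D' in 0.5-wide bands,
# the same mapping as in the script above, with the truncation
# written out explicitly.
sub cell_letter {
    my ($v) = @_;
    return chr( ord('A') + int( 2 * ($v + 1) ) );
}

# Shoot $n_rays random points at the square -10..+10 and count,
# as integers, how many land in each cell of the unit circle.
sub count_hits {
    my ($n_rays) = @_;
    my %count;
    for ( 1 .. $n_rays ) {
        my $x = -10 + rand 20;
        my $y = -10 + rand 20;
        next unless $x*$x + $y*$y <= 1;
        $count{ cell_letter($x) . cell_letter($y) }++;
    }
    return \%count;
}

my $N_RAYS = 1_000_000;
my $count  = count_hits($N_RAYS);

# Sort by the integer counts; convert to a fraction only for display.
my $total = 0;
foreach my $cell ( sort { $count->{$a} <=> $count->{$b} } keys %$count ) {
    printf "%.6f => %s\n", $count->{$cell} / $N_RAYS, $cell;
    $total += $count->{$cell};
}

# Circle area over square area is pi/400; atan2(1, 1) is pi/4.
printf "total %.6f, expected about %.6f\n",
    $total / $N_RAYS, atan2(1, 1) * 4 / 400;
```

Sorting on integer counts also means cells with equal counts compare
as exactly equal, rather than depending on how the accumulated
fractions happened to round.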

t.


