[lug] Perl, sorting hashtables by value, and floating-point

Chris Riddoch chris-blug at syntacticsugar.org
Tue Jun 1 21:21:50 MDT 2004


First, thanks to everyone who sent me advice on the matter.

Tkil <tkil at scrye.com> writes:
> Tkil> Nope, you're sorting the hash *keys* by value.  If you want to
> Tkil> get a list of keys sorted by their values, you need to
> Tkil> compare the values associated with $a and $b, not $a and $b
> Tkil> themselves:
<snip>
> The code I gave should provide a list of the keys ordered such that, if
> you fetch the value associated with each of those keys, the resulting
> list of values will be in increasing numeric order.

Indeed, I overlooked the difference between ($a <=> $b) and ($hash{$a}
<=> $hash{$b}) - I'd been sitting and coding for long enough that my
brain was getting fuzzy, in spite of taking intermittent short breaks.

It works now.
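
For anyone finding this in the archives, here is roughly what the
difference looks like. This is only a minimal sketch, not my actual
code: the %prob hash and the numbers in it are made up for
illustration.

```perl
use strict;
use warnings;

# Hypothetical data: word => probability (values invented for this example).
my %prob = (
    the => 0.25,
    cat => 0.03125,
    sat => 0.125,
    mat => 0.0625,
);

# What I had originally: this compares the keys themselves as numbers,
# which is not what I wanted (and warns when the keys are words).
# my @wrong = sort { $a <=> $b } keys %prob;

# Compare the values the keys point to. The result is still a list of
# keys, ordered by increasing value.
my @by_value = sort { $prob{$a} <=> $prob{$b} } keys %prob;

# Swap $a and $b to get decreasing order instead.
my @by_value_desc = sort { $prob{$b} <=> $prob{$a} } keys %prob;

printf "%-4s %s\n", $_, $prob{$_} for @by_value;
```

If two keys can have equal values and the order between them matters,
a tie-breaker such as { $prob{$a} <=> $prob{$b} or $a cmp $b } keeps
the output predictable.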

> Chris mentioned probabilities.  One application that comes to mind is
> Monte Carlo evaluation of volumes.  I think I can come up with a
> similar data set by shooting rays at the x/y range -10..+10, and
> recording what portion of those rays hit the unit circle centered at
> the origin; those hits further broken down on a 4x4 grid:

Curious. I've read a little about Monte Carlo methods, but they're a bit
above my head at the moment - I understand that they're occasionally
useful for modeling data in computational linguistics, and I've seen
references to them in some historical linguistics work analyzing sound
changes over time.

What I'm actually doing is something involving computing the
probabilities of sequences of words.  This particular chunk of code
was partly there for debugging, but it is primarily one of the
required results: the list of words ordered by the expectedness of
seeing a new word pair, given that we've seen the first word already
and how many types/tokens we've seen, and how many types/tokens have
followed the first word.  It's for a smoothing technique called
Witten-Bell.

I'd go into more detail, and offer code, except that the professor for
the project in question has a habit of re-using assignments from
previous times he's taught the class, and I'd rather not be the
driving force behind him making new assignments (and don't want to
provoke any unnecessary work for him - he's on my thesis committee.)

-- 
epistemological humility
   - Chris Riddoch -



