[lug] File modification

Bear Giles bgiles at coyotesong.com
Thu Mar 14 11:40:01 MST 2002


> Bear Giles <bgiles at coyotesong.com> writes:
> > Did you know the size of the problem going in?  Were you misled 
> > about the size of the problem?
> 
> I knew about the size of the problem, yes.

Lesson 1: always run at least one early test case of the approximate
size of the "real" problem.

> Determining an appropriate algorithm for the job requires considering
> alternatives. Considering alternatives requires... guess what, time.

It really doesn't.  Does this problem require finding a specific match
(or set of matches), or does it require processing everything?  Can the
problem be broken into independent subproblems, or do you always need
to look at the entire problem space?

If the problem is a "lookup," find a way to order the elements using
whatever fields are passed to you in the lookup calls.  Then finding
the answer once the data is read (or accessible) is O(lg(N)).  If you
have to iterate through the data it's O(N).  If you know you'll never
need to find ranges, you should also consider a hash and hash table.

If the problem *involves* a lookup internally, again you start by finding
a way to order the elements.  This is a very common problem, and by
doing a single lookup you can convert most O(N^n) algorithms to 
O(N^(n-1)lg(N)).

I know you don't yet have the experience to know this intuitively, but 
you develop that intuition by going through this process with every 
nontrivial problem you face.  It doesn't take more than a few minutes,
and that's time well spent because you need this information anyway to
understand the problem.

Lesson 2: time spent understanding the nature of the problem is never
time wasted.

It has never taken me more than a few minutes to do the first pass...
with the noteworthy exception of those times when I misunderstood the
problem and would have wasted everyone's time if I ran off and started
coding the first thing that came to mind.  By taking a few minutes to
think about the problem, I saved *days* of effort.

Lesson 3: don't bitch about how long something takes until you've actually
tried it at least once. 

Another thing to consider is how much you can delegate to existing
code.  E.g., I use binary searches and hash tables all the time, but I
haven't written that lookup code in many, many years.  The reason is
I've learned the very simple Sleepycat DB (or GDBM) and I let it handle
the details.

Maybe this wouldn't have helped you in this case (since we have no idea
what your assignment was), but the more general point applies:

Lesson 4: learn to use the standard libraries.

> Time was something I had nothing to spare. The full data set was about
> 500 points, I was using test cases of 5 and 10 points that I could
> verify results for by hand.

So you used a problem set 1-2% of the size of "real" problems, and
expected to learn anything more than a basic "smoke test"?

> Debugging the first solution I could come
> up with took a precious 20 minutes.

How long would it have taken to write a validation function?  How did
you ever to expect to validate the results of the 500-point run?

> It's an academic project. I think a better question is: Does it
> demonstrate that I know what's involved in implementing <project>?
> Yes. 

No it does not, because you have never successfully completed a
test run on realistic data.

> If I was in the industry right now and ran into this exact situation,
> my first question would be why I'm paying so much money to work
> 80-hour weeks, and why I don't have anyone to delegate anything to.

You may wish to consider a different major and career.

> 1) Start projects long enough in advance to think about proper
> algorithms.

The more I look at this statement, the less sense it makes. It doesn't
take any longer to pick a good algorithm than to pick a bad one.  It
usually doesn't take any longer to implement a good algorithm than a
bad one.  (I would argue it takes significantly less time when you
look at the total cost.)

But I'm also struck by the reversal - do you really think any
customer (your boss, a paying client, whoever) will be willing
to accept a half-assed job because you didn't get around to 
starting the project?

> 2) Remove social life so I can start projects referred to in #1 long
> enough in advance.
> 3) Remove relaxing activities so I can sleep in between working on
> projects.

I'll let others address this.  




More information about the LUG mailing list