[lug] Fwd: Simple counter?

Fri May 24 22:16:26 MDT 2013

Bear Giles <bgiles at coyotesong.com> writes:

> It's the concurrency that will get you. 

Oh, don't I know it.  :)

(Current $dayjob is an embedded(ish) Linux system with 32+ threads, and
I'm in the middle of debugging a collision...)

> Even locks might not work if the underlying filesystem doesn't
> properly support them. 

Advisory locks (flock(2)) are handled by the kernel, so on a local
filesystem they shouldn't depend on the FS (network filesystems such
as NFS are the usual exception).  So long as everyone accesses the
file through the advisory locks, everything is good.  (But this
quickly becomes a battle of specifications and assumptions.)
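
For the archives, here is roughly what "everyone goes through the
advisory lock" looks like.  This is only a minimal sketch in Python
(fcntl.flock wraps flock(2)); the function name flock_next_value and
the one-integer-per-file format are my own assumptions for
illustration, not anything the counter tool actually defines.

```python
import fcntl
import os


def flock_next_value(path):
    """Increment the integer stored in `path` under an exclusive flock.

    Hypothetical helper: the file is assumed to hold a single decimal
    integer. An empty or missing file is treated as 0.
    """
    # Create the file if needed, but never truncate it on open.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            text = f.read().strip()
            value = (int(text) if text else 0) + 1
            # Rewrite the file in place while still holding the lock.
            f.seek(0)
            f.truncate()
            f.write(f"{value}\n")
            f.flush()
            os.fsync(f.fileno())
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return value
```

This only protects against other cooperating callers; anything that
writes the file without taking the lock can still clobber it.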

> That's why I was thinking along the lines of a named pipe - that way
> you have a single process on the other end of the pipe.

Interesting... although the persistence is an issue.  Have to make
sure you save it just once.  Also, no mechanism for "current value" if
all you have is a pipe.

> The database sequence is another possibility. I wouldn't use mysql
> though - there are lightweight JVM databases (e.g., H2) and you
> could also investigate SQLite.

That seems like sledgehammer-as-flyswatter, but it does serve to
properly separate out persistence of data from increment and/or
access.
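
A sketch of what the SQLite route might look like, using Python's
bundled sqlite3 module.  The table name "counters", the function name
sqlite_next_value, and the per-name counter layout are all assumptions
I made up for the example.  The point is that the transaction does the
locking and the database file does the persistence.

```python
import sqlite3


def sqlite_next_value(db_path, name="default"):
    """Increment and return the counter called `name` (hypothetical schema)."""
    # isolation_level=None means we issue BEGIN/COMMIT ourselves.
    conn = sqlite3.connect(db_path, timeout=30, isolation_level=None)
    try:
        # BEGIN IMMEDIATE takes the write lock up front, so the
        # read-modify-write below cannot interleave with another writer.
        conn.execute("BEGIN IMMEDIATE")
        conn.execute(
            "CREATE TABLE IF NOT EXISTS counters "
            "(name TEXT PRIMARY KEY, value INTEGER NOT NULL)"
        )
        conn.execute(
            "INSERT OR IGNORE INTO counters (name, value) VALUES (?, 0)",
            (name,),
        )
        conn.execute(
            "UPDATE counters SET value = value + 1 WHERE name = ?", (name,)
        )
        (value,) = conn.execute(
            "SELECT value FROM counters WHERE name = ?", (name,)
        ).fetchone()
        conn.execute("COMMIT")
        return value
    except Exception:
        # Only roll back if a transaction was actually opened.
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
    finally:
        conn.close()
```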

> Re-reading the same value will always be problematic outside of a
> database transaction though - how do you know some other process
> hasn't gotten a newer value in the interim? The only way to be
> really sure is to cache the information in your own process somehow
> and again that's adding a lot of weight to the code.

Right.

My assumption was that the end user would use this wisely, which
includes dealing with concurrency sanely.

And it's almost necessary that it's the end user that is responsible
for this; there's no way to predict all possible use cases, and the
end result of trying to do so is a huge, heavyweight, probably
difficult-to-deploy mess.

To draw a comparison, the pthread standard doesn't make things safe
for all users; instead, it outlines rules and assumptions that one can
make when writing concurrent / threaded programs.  The programmer is
still ultimately responsible for doing it right.

My concept of "counter" is intended to be such a tool, for use by
shell scripts.  I'll admit that my interpretations of "seed" and "next
value" are both fairly loose and designed more for sanity checking
than for rigorous use.

I suspect that the most advanced creatures of this sort will be found
in cluster filesystems, to get a unique identifier across thousands of
nodes.  Probably not sequential, but that's one more tradeoff, right?
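
To make that tradeoff concrete, one common shape for such an
identifier is "node id plus a local counter": unique without any
cross-node coordination, but not globally sequential.  I don't know
what any particular cluster filesystem actually does, so treat this as
an illustration only; make_id_generator and the 16-bit node field are
assumptions of mine.

```python
import itertools


def make_id_generator(node_id, node_bits=16):
    """Return a function yielding IDs unique across nodes (hypothetical scheme).

    Each ID packs a per-node local counter into the high bits and the
    node's own id into the low `node_bits` bits. Uniqueness relies on
    every node being given a distinct node_id.
    """
    if not 0 <= node_id < (1 << node_bits):
        raise ValueError("node_id does not fit in node_bits")
    local = itertools.count(1)  # per-node sequence: 1, 2, 3, ...

    def next_id():
        return (next(local) << node_bits) | node_id

    return next_id
```

Two nodes can hand out IDs independently and never collide, but sorting
the IDs tells you nothing reliable about the order in which different
nodes issued them.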

t.