[lug] Interesting Common File Locking Problem
D. Stimits
stimits at comcast.net
Sat Mar 6 16:15:27 MST 2004
Zan Lynx wrote:
> On Fri, 2004-03-05 at 19:20, D. Stimits wrote:
>
>>Zan Lynx wrote:
>>
>>>I spent a few hours scratching my head over this one, so I thought I
>>>would share it for any of you programmers out there. Also, I just have
>>>to vent the built-up frustration to _someone_, or it'll bother me all
>>>weekend! Heh.
>>>
>>>Some programs appear to lock files using this method:
>>>
>>>- open the file, get a file descriptor.
>>>- lock the file descriptor.
>>>- write data into a temporary file.
>>>- optionally fsync the data to ensure it is really in there.
>>>- rename temporary into real file name, which deletes the original,
>>>which removes the lock.
>>>
>>>
>>>
>>>At first glance, this looks reasonable and safe. What I discovered is
>>>that this can happen:
>>>- ProgA opens the file, gets a file descriptor.
>>>- ProgB opens the file, gets a file descriptor.
>>>- ProgB locks the file descriptor. ProgA asks for the lock too, and is
>>>now waiting...
>>>- ProgB creates temporary file, writes data into it.
>>>- ProgB now renames temporary to real name.
>>>- ProgB unlocks, closes, exits, etc.
>>>- ProgA finally gets its lock! Yay for ProgA!
>>>But wait! What lock is this!?
>>>Could it be a lock on the now removed original file!? The file ProgB
>>>just deleted by renaming over it?
>>>Why Yes! It could!
>>>
>>>Does anything stop ProgC from coming along and getting a lock on the
>>>file name?
>>>Why No, nothing does! Because ProgA's lock is on a file that no longer
>>>has that name! In fact, ProgA's locked file no longer has any name!
>>>
>>>And then does it stop there? No indeed! Because ProgC has the lock on
>>>the file name, it assumes no one else is using the file. Now, using the
>>>same temporary file name ProgA is also using, ProgC goes ahead to
>>>truncate and write into the temporary file. What does this lead to? Me
>>>getting large chunks of zeros in my mail client spool file.
>>>
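The sequence above can be reproduced in a few lines. This is only a sketch, not the code of any particular mail client: it assumes Linux-style flock() semantics through Python's fcntl module, the file names "spool" and "spool.tmp" are made up, and all three programs are played by one process (which works because flock() locks belong to the open file, so two separate open() calls on the same file would still conflict). ProgB's own lock is left out since it has already been released by the time ProgA wakes up.

```python
import fcntl
import os
import tempfile


def demo(directory):
    """Replay the ProgA / ProgB / ProgC sequence inside one process."""
    path = os.path.join(directory, "spool")   # hypothetical file name
    tmp = path + ".tmp"                       # hypothetical temporary name
    with open(path, "w") as f:
        f.write("old contents\n")

    # ProgA opens the real file but has not been granted its lock yet.
    fd_a = os.open(path, os.O_RDWR)

    # ProgB writes a temporary file and renames it over the real name.
    with open(tmp, "w") as f:
        f.write("new contents\n")
    os.rename(tmp, path)

    # ProgA finally gets its lock, on the file that no longer has a name.
    fcntl.flock(fd_a, fcntl.LOCK_EX)

    # ProgC opens the name and asks for a non-blocking exclusive lock.
    fd_c = os.open(path, os.O_RDWR)
    try:
        fcntl.flock(fd_c, fcntl.LOCK_EX | fcntl.LOCK_NB)
        c_got_lock = True
    except BlockingIOError:
        c_got_lock = False

    # Compare the files behind the two descriptors.
    same_file = os.fstat(fd_a).st_ino == os.fstat(fd_c).st_ino
    os.close(fd_c)
    os.close(fd_a)
    return c_got_lock, same_file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(demo(d))
```

If ProgC is granted its lock while ProgA still holds one, and the two descriptors report different inodes, then both programs believe they own the file, which is exactly the situation described above.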
>>
>>You just described a textbook example from a thread programming book.
>>Sounds like you need a mutex/semaphore system built into the filesystem
>>itself. It might be interesting to see how journals in various
>>journaling filesystems do it.
>
>
> Well, that is what file locks are for. The filesystem locks implemented
> by lockf, flock and fcntl are all intended to make this work. They are
> the equivalent of the mutex.
>
> Programmers must use them _correctly_, however.
Heck, I know of one successful software company that says the *marketing
department* has to use them correctly...programmers are not required to
do that. :P
Seriously though, threaded programming has so many gotchas that doing it
correctly often requires several failures first.
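For the rename-over case Zan describes, one way to use the locks correctly is to check, once the lock is granted, that the descriptor still refers to the file that currently has the name, and to start over if it does not. The sketch below is an assumption about how that could look, not something taken from the programs discussed in this thread; open_locked is a hypothetical helper name, and it only protects against other programs that follow the same protocol and rename only while holding the lock.

```python
import fcntl
import os


def open_locked(path):
    """Return a descriptor holding an exclusive lock on the file currently named path."""
    while True:
        fd = os.open(path, os.O_RDWR)
        # May block while another process holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        held = os.fstat(fd)
        try:
            current = os.stat(path)
        except FileNotFoundError:
            current = None
        # Same device and inode means the name still points at our file.
        if current is not None and (
            (held.st_dev, held.st_ino) == (current.st_dev, current.st_ino)
        ):
            return fd
        # The file was renamed over (or removed) while we waited.
        # Closing the descriptor drops the stale lock; try again.
        os.close(fd)
```

A caller would do its temporary-file write and rename while holding the returned descriptor, then close it. Another common choice is to lock a separate lock file that is never renamed, which avoids the check entirely at the cost of an extra file.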
D. Stimits, stimits AT comcast DOT net