wting/autojump

autojump deletes its own database often

Closed this issue · 10 comments

Description of problem:
autojump seems to loose it's database, frequently. I have 249 entries in the autojump DB. I used it yesterday and didn't notice a problem. Today, I tried to use it and j

didn't change directories. I did autojump --stat and it said there were only a few directories in the DB. I went to ~/.local/share/autojump/ and found that autojump.txt was nearly empty. I restored autojump.txt.bak to autojump.txt and that worked. But, sometimes the backup is empty, too.

Version-Release number of selected component (if applicable):
autojump-22.3.2-2.fc25.noarch

How reproducible:
Cannot be reproduced on demand, but happens frequently.

Steps to Reproduce:

  1. use cd, let it create a database
  2. after a while, autojump will "forget" directories CD-ed-to in the past

Actual results:
j

will just do nothing

Expected results:
j

should go to you visited before.

Additional info:
I noticed this before, but seems to be quite frequent
now. It happened twice this week.

I reported this to Fedora, too, but I'm reporting here as I don't really know where lies the problem.

This happens very often to me too. Not sure what triggers the deletion. I made a screenshot to show this: https://asciinema.org/a/0fn2g5ni1g768sxxqcu8deu3f

What's your distro? Just trying to assure myself this issue isn't a package problem with my distro.

Yeah, I haven't thought of a way this could happen, so, I figure the authors will know. I'm glad I'm not the only one experiencing this, too.

@PorcelainMouse ArchLinux with zsh. What's yours?

Update:

could you try the following step?

My database seems to persist after this modification. I guess the problem is that the following mv is not really atomic across different filesystems. (https://stackoverflow.com/questions/3716325/is-pythons-shutil-move-atomic-on-linux)

@Frefreak Cool, good thinking; I follow your logic. That could be it.
I'd be happy to try that. I'm using Fedora.

Do you know the logic that lead to save()? Just trying to piece together the scenario that causes this.

I take it you are speculating about a race condition? Making the move atomic will help, but I think what you really want is to make autojump_data.save() atomic. Right? Just fixing move() doesn't completely work, right? I could be wrong, I just glanced at that function for a second.

Do you know what I'm getting at? I worry that the problem is multiple entry into save(), not actually the race condition on move().

Hi. I was not thinking about race condition, in which case it would only miss new directory entries (this indeed happens sometimes to me though, maybe there's also race condition but that's not the main problem), not truncating database file.

IIRC when I was debugging the load function occasionally gets empty result, and that's when the file truncating happens. So my guess was maybe at some time the database file is actually empty (possibly during the move since the only part it touches that db file is move_file). So yes, I think atomic is the real problem here, but those are just my thought.

I haven't experienced this issue since then. Looking forward to your feedback :)

Hmm, well, I made the change and we'll see how it goes.

You should submit your patch, as it seems reasonable and is an improvement, regardless. I reused code I found elsewhere like this:

xdg_data_home = os.path.join(os.path.expanduser('~'), '.local', 'share')
xdg_aj_home = os.path.join(xdg_data_home, 'autojump')
temp = NamedTemporaryFile(delete=False, dir=xdg_aj_home)

The first thing I don't understand is why there is a difference between new directory entries and truncated db? Sounds like you found out that autojump sometimes deletes the whole db. Please share. Why does autojump ever do that?

My other confusion is why you don't consider this a race condition. The atomic-ness of mv is irrelevant for any single thread since mv blocks in all cases. Even if mv is interrupted, no single thread could simultaneously call mv and read the db. mv could fail, but then autojump would have to ignoring the error and then read. And, even then, I'm confident mv isn't failing with the mv half completed...hmm, okay, I guess I'm not confident, but then why would mv only be failing for autojump and not other callers?

I think it's not autojump that deletes the db, it just loads the db and write to a temp file then add a new directory entry, finally mv that to the original db. The problems is that sometimes the load gets empty (or nearly empty) result, maybe its because at the load time the original file is indeed empty (caused by non-atomic move). Answer in this SO question explains how copyfile works, which is used by shutil.move when not in the same filesystem.

But that's just my hypothesis, I'm not so sure and that's why I don't submit a PR, maybe the real problem is somewhere else, or maybe there's a better way. My modification seems more like a workaround. After all, putting tmp files in /tmp is very reasonable. I prefer leaving this to the author.

I also have this problem on Lubuntu 17.04, and had it on 16.10 as well.

I don't recall it having been an issue when I first started using autojump.

Though I haven't yet read the preceding comments in-depth, a race condition seems possible, since I am running many terminals at once and, at some point, closing them en masse when, e.g., I shutdown the computer.

Look like #391 is where discussion of this issue is happening.

Thanks r-barnes. You're right. I wonder if I should close this bug? I'm following #391 now.