New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Attempt to avoid infinite recursion when finding the list of new revs. #322
Conversation
For the record, it looks like pbrobinson only pushed these commits:
That merge in the middle is suspicious. And the branch that it merged also contained merges not in the base branch.. but trying to trace it all this afternoon turned my brain into mush. |
I'm not seeing anything wrong with this. Have we tried to replicate the issue (take the git repo, drop the last commits and re-add them or so)? |
So, I still don't understand how it could have dove into a cycle via recursion, but I do see a flaw in the original implementation and in the new implementation in 3182fe4). Consider this git graph:
If the person pushing was updating master from I think the current implementation is not right either. It only stops once it How does this new implementation sound?
|
Consider this git graph: ``` * commit A <-- HEAD | * merge commit B | \ | * commit C | | | * commit D | | | * merge commit E | | \ | | * commit F | | | | | * commit G | | | | | | * | | commit H <-- BASE | | | | * | commit I |/ / * | commit J | | | * commit K |/ * commit L | * commit M | * commit N ``` If the person pushing was updating master from ``BASE`` to ``HEAD``, the original implementation sought out all parents of the ``HEAD`` commit recursively, and the only stop-condition was when ``BASE`` was found for that branch of the recursion tree. In the example above, the two branches that branch out from and back into the right side *do not have* ``BASE`` in their history. For them, the original algorithm would spider all the way back to the first commit. The current implementation is not right either. It only stops once it finds a commit that has already been seen (preventing cycles) but it won't prevent this scenario from publishing messages about commits J, L, M, and N. The implementation in this commit works like this: - First, calculate all the ``ancestors`` of the ``BASE`` rev, including the ``BASE`` rev. In the example above, that would be H, J, L, M, and N. - Then, traverse recursively backwards from the ``HEAD`` rev. Stop recursion for that branch of the call tree when a parent is found that is in the ``ancestors`` list that we previously computed. This would end up yielding commits A, B, C, D, E, F, G, I, K. (The compliment of the ``ancestors`` list).
Implemented in c0ef2be. |
One idea, maybe we could check how the email hook provided in git handles this? http://git.kernel.org/cgit/git/git.git/tree/contrib/hooks/multimail/ |
As far as I can see, it doesn't try to iterate through the commit tree to figure out which changes to send an email for. Instead, if it's run as post-receive hook, it reads its stdin, which gets one line per change, and just sends an email for each of these, nothing more. If it's run as an update script, it just sends an email for the single change that was passed to it as arguments of the command-line. This is also what the GNOME email hook we were using before was doing. What was the reason for trying to iterate through the history tree? |
I didn't find where the upstream hook does this.. but it's not true. The post-receive hook receives one line over stdin per ref (i.e., branch or tag) which has the old base commit, the new head commit, and the name of the ref. So, if you pushed three commits to master and three commits to develop at the same time, stdin would have two lines in it.. one for master and one for develop. http://git-scm.com/docs/githooks#post-receive |
In progit, @pypingou does the work done here by shelling out to I've run tests locally on the latest changes in f860757 and it's looking good to me. |
👍 for me |
Attempt to avoid infinite recursion when finding the list of new revs.
http://git.kernel.org/cgit/git/git.git/tree/contrib/hooks/multimail/git_multimail.py#n2378
And you are right. I had misread the post-receive hook documentation, and then misread the email hook code in a way that confirmed what I thought. Reading it more carefully, you clearly are right, and the upstream hook shells out to http://git.kernel.org/cgit/git/git.git/tree/contrib/hooks/multimail/git_multimail.py#n995 Apologies for the noise. 😃 |
All good. Genuine thanks for the review. I appreciate more eyes on this one (it was driving me a little crazy). |
These are changes that have already been applied to our ansible repo, but I'd appreciate review to see if they make sense/look right.
We hit a scenario today that I don't understand where a set of commits pushed by pbrobinson to the 'file' package sent the fedmsg hook into a loop that published some 30k messages before nirik was able to kill it.
There might be a better way than this to detect what commits should and shouldn't be published. Ideas welcome.