New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bodhi2Client doesn't support retries #144
Comments
Fixing this would make all Taskotron folks happy, because right now every single Bodhi server failure (5xx response) makes one of our checks crash (and we receive a crash notification). Since technology is never reliable, we made use of Bodhi automatic retry functionality in the past and would like to continue doing so with Bodhi2 :) |
This also affects fedora-easy-karma. If there are any errors, f-e-k stops and the user will need to start over because f-e-k relies on setting the retries in the bodhi client code. |
This is having a decent sized effect in Taskotron right now - between dev/stg/prod, we saw 88 failures yesterday due to failed bodhi requests that weren't retried. The only reason that nobody's noticing is that our current suite of checks will pull in previous items which haven't already been done. Is there any guess on a timeline for a fix? |
Looking into the openidproxyclient code we retry in two cases:
The first situation seems like something we could fix, I am a little less incline for the second situation as I expect our newer API to return a meaningfull message upon 400 or 404 and asking the question multiple times shouldn't (in theory) return a different result. Thoughts? |
That sounds reasonable for time-outs and 4xx codes. But our major problem is with 5xx codes, that's what Bodhi returns very often, and retrying definitely helps. Of course it would be great to fix the issue in Bodhi (or Apache or whatever else is along the way), but the safest and easiest way in this library seems to me to retry on 5xx codes, as it worked with Bodhi1 python-fedora (it reduced the number of our Bodhi-related task crashes down by >95%). |
hm, I'll wait for @lmacken and @ralphbean to weight on this as changing the behavior of the openidbaseclient will impact more than just the bodhi client. |
Yeah, re-trying on 5xx codes but not 4xx codes makes sense to me (in addition to timeouts). |
5xx codes might be a problem for other apps (pkgdb2 for example, maybe tagger and probably others) |
How so? The 500 codes are all server errors, in which case the request shouldn't have changed anything. It should be harmless to try again .. and if it is not harmless, then that's a problem for that server (if a 500 request leaves resources in an inconsistent state, for instance). |
I agree it should be harmless but that also means (in the example shown above) that we're trying 10 times a request that we know for sure won't work from the first attempt, that was more what I was thinking about. |
Not sure I follow. Since it's a server error, another attempt might work. If there are certain requests which we don't want to retry on 5xx error codes (let's say publishing a bodhi comment - we might rather fail on the first attempt than to risk publishing 10 comments in a row - just an example, not sure if it's a good idea or not), maybe those requests could call the underlying methods with some internal parameter e.g.
Btw, if you could introduce something like |
The pkgdb2 API returns a 500 as error code for generic errors (for which I couldn't find a better code) :-(. The |
A new patch is in #153. |
Fixed by #153. |
I tried #153 (python-fedora-0.6.1-1.fc23) and I was still able to reproduce this using the reproducer from the first post - no retry is performed, no matter what number of retries is provided, it sends only one request (checked by wireshark) and ends. I also tried to use python-requests (inspired by
And this didn't work either, only one request sent. Tried python-requests from fedora repo and pypi. |
@ralphbean @pypingou Can you please look at this? I have the same problem as @lukassparrow, only one request sent (but with |
Re-opening by request from @kparal in |
OK, I think I see what's going on. First, there is an HTTP method whitelist that Retry uses. >>> from requests.packages.urllib3.util import Retry
>>> Retry.DEFAULT_METHOD_WHITELIST
frozenset(['HEAD', 'TRACE', 'GET', 'PUT', 'OPTIONS', 'DELETE']) Notice that Second, in your example, @lukassparrow, you have to specify #!/usr/bin/env python
import logging
logging.basicConfig(level=logging.DEBUG)
import requests
from requests.packages.urllib3.util import Retry
s = requests.session()
s.mount('http://', requests.adapters.HTTPAdapter(max_retries=Retry(
total=7,
status_forcelist=[500],
method_whitelist=Retry.DEFAULT_METHOD_WHITELIST.union(frozenset(['POST'])),
)))
s.post('http://httpstat.us/500') I'll modify python-fedora to send that new |
Great detective work. Both the new example and the python-fedora patch work for me. |
Bodhi1Client
support retries: it tries several times (user specified) and after last unsuccessful try it raises an exception. This unfortunately doesn't work whenBodhi2Client
is used.When you try to run code bellow, you can see that when
Bodhi1Client
is used it actually tries 10 times before raising.Bodhi2Client
fails that respond cannot be parsed as JSON.I check the code and it seems that
Bodhi2Client
's parent classOpenIdBaseClient
overrides methodsend_request
from its parent classOpenIdProxyClient
, in which the retry code is present.The text was updated successfully, but these errors were encountered: