McRouter not forwarding gets / sets / deletes for largish values #383
Comments
Hi there - https://github.com/facebook/mcrouter/blob/main/mcrouter/mcrouter_options_list.h#L617

Additionally, 831 KB is not so large that mcrouter should be unable to handle it. However, I have seen others set a [smaller] value threshold via the big value route (see https://github.com/facebook/mcrouter/blob/main/mcrouter/mcrouter_options_list.h#L119). This comes with the tradeoff of distributing pieces of the data across more than one machine, however.

Also, could you share more of the options and the routing configuration (with proprietary info redacted, please)? E.g. command-line parameters, etc. This would help in understanding the problem better. An easy way to do that for the routing side is the preprocessed config dump: https://github.com/facebook/mcrouter/wiki/Admin-requests#get-__mcrouter__preprocessed_config

Thanks!
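For anyone who wants to script the config dump mentioned above, here is a minimal sketch. It assumes the admin request is sent as an ordinary memcached ASCII `get` for the key `__mcrouter__.preprocessed_config`, as the linked wiki page describes. The host, port, and function names are placeholders for illustration, not part of mcrouter.

```python
import socket


def build_admin_request(name: str) -> bytes:
    """Format an mcrouter admin request as a memcached ASCII `get` line."""
    return f"get __mcrouter__.{name}\r\n".encode("ascii")


def fetch_admin_value(host: str, port: int, name: str, timeout: float = 5.0) -> bytes:
    """Send the admin request and return the raw reply.

    Reads until the connection closes or the reply ends with the
    ASCII-protocol terminator `END\\r\\n`.
    """
    reply = bytearray()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_admin_request(name))
        while not reply.endswith(b"END\r\n"):
            chunk = sock.recv(65536)
            if not chunk:  # peer closed the connection
                break
            reply.extend(chunk)
    return bytes(reply)


if __name__ == "__main__":
    # Hypothetical address; use whatever host/port your mcrouter listens on.
    print(fetch_admin_value("localhost", 5000, "preprocessed_config").decode())
```

Remember to redact hostnames and anything proprietary from the output before pasting it into a public issue.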
Man, that sounds like just the thing. But we're at the default of 1000ms; still, I may experiment. Here are the CLI options:
And confirming that default value:
Note: To solve our issue, we added the
Here's our config:
Mcrouter has been performing very well for us for years, but just recently we've started to notice a problem that's undermining our faith. Sometimes (for several minutes at a time), Mcrouter just stops forwarding sets / deletes (and some gets) for a particularly large value to both machines in our pool. During those minutes, the set and delete counts are wayyyy off and the commands only make it to one of the two machines. At other times the counts are even, and for other slabs they are even.
Here's an example of the meta-data about one of the keys in question in memcache:
key={the key name in question} exp=1631856704 la=1631856415 cas=1382190321 fetch=yes cls=39 size=831500
The size is 831 KB (under the 1 MB memcache limit, and we don't have value splitting turned on) and the expire time is ~5 min.
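As a quick sanity check on those numbers, here is a small sketch that parses a metadata line of this shape and compares it against the limit. The key name `example_key` is a stand-in for the redacted key, and the 1 MB limit is assumed to be 1024 * 1024 bytes. Note that `exp - la` is the time from last access to expiry, which is only a lower bound on the configured TTL.

```python
import re

# Same fields as the line above, with a hypothetical key name.
SAMPLE = (
    "key=example_key exp=1631856704 la=1631856415 "
    "cas=1382190321 fetch=yes cls=39 size=831500"
)

ONE_MIB = 1024 * 1024  # assumed item size limit in bytes


def parse_metadata_line(line: str) -> dict:
    """Split `name=value` pairs into a dict (values stay strings)."""
    return dict(re.findall(r"(\w+)=(\S+)", line))


meta = parse_metadata_line(SAMPLE)
size = int(meta["size"])
seconds_left_at_last_access = int(meta["exp"]) - int(meta["la"])

print(size < ONE_MIB)               # True: 831500 bytes is under the limit
print(seconds_left_at_last_access)  # 289 seconds, a little under 5 minutes
```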
Here's a graph of command counts between our two memcache machines in the pool. At other times and for other slabs, the values are nearly identical. But occasionally (maybe 1/day, but it's becoming more frequent) we see these imbalances that lead to almost constant cache misses (because we use AllFastestRoute for gets).
A reduced view of our mcrouter config:
Something of note: each time we see one of these anomalies, Mcrouter seemingly randomly doesn't send the commands to one of the two machines (but not the same machine each time).
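One way to catch these windows without staring at a graph is to compare counters from the memcached `stats` command on both hosts. This is only a sketch: the function name, the 10% tolerance, and the sample numbers are made up for illustration. Since the counters are cumulative, it is meant to be fed per-interval deltas rather than raw totals.

```python
def counter_imbalance(stats_a, stats_b,
                      counters=("cmd_get", "cmd_set", "delete_hits", "delete_misses"),
                      tolerance=0.10):
    """Return {counter: relative difference} for counters that diverge.

    The relative difference is |a - b| / max(a, b); counters that are
    zero on both hosts are skipped.
    """
    flagged = {}
    for name in counters:
        a = int(stats_a.get(name, 0))
        b = int(stats_b.get(name, 0))
        larger = max(a, b)
        if larger == 0:
            continue
        diff = abs(a - b) / larger
        if diff > tolerance:
            flagged[name] = round(diff, 3)
    return flagged


# Hypothetical per-minute deltas from the two hosts in the pool.
host_a = {"cmd_get": 1000, "cmd_set": 100}
host_b = {"cmd_get": 990, "cmd_set": 10}
print(counter_imbalance(host_a, host_b))  # {'cmd_set': 0.9}
```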
CC @djmetzle @andyg0808 @sctice-ifixit