-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-29642 Active cluster file is not being updated after promoting … #7437
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: HBASE-29081
Are you sure you want to change the base?
Conversation
…a new active cluster
This comment has been minimized.
This comment has been minimized.
…a new active cluster
This comment has been minimized.
This comment has been minimized.
|
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really understand the reasoning behind all this complicated logic. To my understanding the task is pretty simple:
Monitor the transition of read-only property and do the following:
- if it transitions from R/W -> R/O, delete the file,
- if R/O -> R/W, create the file.
If the above operation fails, log it and try to notify the user somehow.
That's basically it. No backoff logic, no retry logic is needed for now. User will fix the problem and restarts if needed.
|
|
||
| // Not timed out, log and sleep before retrying | ||
| LOG.warn( | ||
| "Failed to recreate active cluster ID file: {}. Retrying in {}ms... " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need to retry to create the file. Try it only once and log error if it failed. There's no such thing that a filesystem is unavailable for some reason, but will become available at some point and we wait for it. It should be available at all times or something is wrong.
| * specified permissions and overwrite flag. | ||
| * @param fs The FileSystem | ||
| * @param path The file path to recreate | ||
| * @param recursive If true, delete recursively (for a directory) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is about a single file. 'Recursive' doesn't make sense to me here.
| * @param path The file path to recreate | ||
| * @param recursive If true, delete recursively (for a directory) | ||
| * @param perm The FsPermission to apply to the new file | ||
| * @param overwrite If true, overwrite an existing file (though it should be deleted by this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're trying to recreate a file. This should be always true.
| FsPermission perm, boolean overwrite) throws IOException { | ||
| LOG.debug("Attempting to delete path {} (recursive={}) for recreation.", path, recursive); | ||
| try { | ||
| if (!fs.delete(path, recursive) && fs.exists(path)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.exists() is redundant. fs.delete() should give your a clear result about whether the delete operation was successful or not.
| throw new IOException("Failed to delete " + path + " for recreation. " | ||
| + "delete() returned false and path still exists."); | ||
| } | ||
| } catch (IOException e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need this all complicated logic? Delete either failed or successful, right?
| overwrite); | ||
|
|
||
| // Create the new file with the specified permissions | ||
| return create(fs, path, perm, overwrite); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should not recreate the file when read-only mode is turned on. In that case you only need to delete the file and let the active cluster a new. How do cover that scenario?
|
💔 -1 overall
This message was automatically generated. |
…a new active cluster
This change