Rename geo index directories in place when database is deleted #7
davisp
left a comment
This seems wrong to me but I could be reading things wrong.
So far as I can tell, hastings_util:rename_dir/1 takes a path that looks like "/srv/geo_index/shards/00-ff/davisp/testdb.12345/234adfe" and renames "/srv/geo_index/shards/00-ff/davisp/testdb.12345" to "/srv/geo_index/shards/00-ff/davisp/testdb.2017.09.21.09.50.34.deleted.12345/"
That is to say, rename_dir/1 takes a path to the directory of a specific index and then renames the containing directory, thus effectively renaming all indexes for that given database. That seems wrong given the variable names, and it also makes me think that hastings_vacuum:rename_all_indexes/1 is broken, given that it's operating on all indexes even though they should all be gone after the first index is processed.
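To make the concern concrete, here is a minimal sketch of the path arithmetic described above. It is not the actual hastings_util:rename_dir/1; the module name rename_sketch and the function deleted_parent/2 are hypothetical, and the function only computes paths without touching the filesystem.

```erlang
-module(rename_sketch).
-export([deleted_parent/2]).

%% Hypothetical helper: given the path of ONE index directory, return
%% {OldDbDir, NewDbDir} for the directory that CONTAINS it.
%% Timestamp is a preformatted string such as "2017.09.21.09.50.34".
deleted_parent(IndexDir, Timestamp) ->
    DbDir = filename:dirname(IndexDir),          % ".../davisp/testdb.12345"
    ParentDir = filename:dirname(DbDir),         % ".../davisp"
    DbName = filename:basename(DbDir),           % "testdb.12345"
    Base = filename:rootname(DbName),            % "testdb"
    "." ++ Suffix = filename:extension(DbName),  % "12345"
    NewName = Base ++ "." ++ Timestamp ++ ".deleted." ++ Suffix,
    {DbDir, filename:join(ParentDir, NewName)}.
```

Two index directories under the same database directory map to the same source directory, so after the first rename succeeds the second one no longer has anything to rename.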
@davisp Hi Paul, thanks a lot for your review on this PR. Yes, it renames the containing directory. No, each call to Let us have one example: When calling hastings_util:get_existing_index_dirs/2 at https://github.com/cloudant-labs/hastings/blob/86318-rename-geo-indexfiles-when-dbdeleted/src/hastings_vacuum.erl#L270, we get DirList with In Lists:foreach/2, each call to If there are multiple geo design documents defined for one database, then the result from hastings_util:get_existing_index_dirs/2 could be Thus, dbname part of indexes belonging to same shard will be renamed for each call to Let me change from
I don't think the function rename is what you want, I think you need to change the behavior. Your second example of multiple indexes is the case I'm worried about. Consider what you've got listed for multiple indexes: For simplification I'll call those: Your rename function takes this input: And renames the directory: To: When you fold across the original list of index directories this means that when you process the entry for What I think you should be doing instead is moving an index at a time, something along the lines of: Also I noticed while re-reviewing that we're throwing away the actual shard name and then falling back to
To clarify, throwing away the shard name and using mem3:shards in the cleanup functions means that we're losing track of the suffix from dbname.suffix that caused the event. Thus, if a user cycles a database and design doc, it's possible we end up grabbing the index from the wrong dbname.suffix and accidentally remove an index that should not have been moved (as it could be from a new database that has a newer suffix timestamp).
@davisp Hi Paul, thanks for your comments. I thought that if we need to rename all indexes, we could rename them in one call. Even if the rename failed for the second or a later index, all indexes belonging to that shard would already have been renamed. However, I agree with you that we need to modify the code to rename only the specified index. I have also adjusted the code to avoid the unexpected situation where a database is deleted and re-created quickly. Could you please help review when you get time? Thanks a lot.
davisp
left a comment
Few minor changes but good work otherwise.
src/hastings_util.erl
Outdated
Don't need to alias this anymore since it's not being modified.
src/hastings_vacuum.erl
Outdated
clean_sharddb is a bit difficult to read. Can you change it to clean_shard_db?
src/hastings_vacuum.erl
Outdated
You're gonna have a case match issue here since you don't have a clause matching the compaction for Context.
@davisp can you double check my recent change when you get time? Thanks
davisp
left a comment
Minor change to help reading that case statement.
src/hastings_vacuum.erl
Outdated
Ah, this case clause is broken but in a non-obvious manner. We already asserted that Context == delete up in handle_cast so only deletion contexts come through this function. If you want to keep the assertion here, you should do something like delete = couch_util:get_value(context, Options, delete) on the first line and then remove the Context pattern match from the case.
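The suggested assertion can be sketched roughly as follows. This is only an illustration: the module name context_sketch, the argument list of clean_shard_db, and its return value are hypothetical, and proplists:get_value/3 from the standard library stands in for couch_util:get_value/3 so that the snippet compiles on its own.

```erlang
-module(context_sketch).
-export([clean_shard_db/2]).

%% Hypothetical signature; only the first line mirrors the suggestion.
clean_shard_db(DbName, Options) ->
    %% Assert the context up front: this match raises badmatch unless the
    %% context is `delete` (or is missing, in which case it defaults to
    %% `delete`), so no later case clause needs to match on Context.
    delete = proplists:get_value(context, Options, delete),
    %% Placeholder for the real cleanup work.
    {would_rename, DbName}.
```

With the assertion on the first line, a call carrying `{context, compaction}` fails immediately with `{badmatch, compaction}` rather than falling through a case statement with no matching clause.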
davisp
left a comment
+1
Overview
Before this change, the geo index directories and files were not deleted when a database was deleted, which left orphaned geo index files and directories. In detail, when a database was deleted, gen_server:cast(?MODULE, {cleanup, DbName}) was called in https://github.com/cloudant-labs/hastings/blob/master/src/hastings_vacuum.erl#L99. Later, {ok, JsonDDocs} = get_ddocs(DbName) in clean_db/1 was called. Because the database had been deleted, a {'DOWN', Ref, _, _, {error, database_does_not_exist}} message was sent in https://github.com/cloudant-labs/hastings/blob/master/src/hastings_vacuum.erl#L149. As a result, cleanup(DbName, ActiveSigs) was not called in https://github.com/cloudant-labs/hastings/blob/master/src/hastings_vacuum.erl#L127.
This PR aims to address the above issue and take action on the geo index when a database is deleted.
The direct approach is to delete geo index files and directories when the database is deleted, i.e. https://github.com/cloudant-labs/hastings/pull/3/files#diff-7ac6be6388d90e131766d8c5824cb226R259
The second approach is to rename the geo index in place, such as from /srv/geo_index/shards/60000000-7fffffff/<dbname.ts>/e66df316792ab411705e2741bba44371 to /srv/geo_index/shards/60000000-7fffffff/<dbname.YYMMDD.HHMMSS.deleted.ts>/e66df316792ab411705e2741bba44371, when the corresponding database is deleted and the "enable_database_recovery" configuration item is set to true. This allows geo index files to be re-used if the database is recovered. https://github.com/cloudant-labs/hastings/pull/3/files#diff-7ac6be6388d90e131766d8c5824cb226R259
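The choice between the two approaches for a single index directory might look roughly like the sketch below. It is not the code from this PR: the module geo_cleanup_sketch and the functions plan/3 and timestamp/1 are hypothetical, the recovery flag is passed in as a plain boolean instead of being read from configuration, and the function only returns the intended action without touching the filesystem.

```erlang
-module(geo_cleanup_sketch).
-export([plan/3, timestamp/1]).

%% Format a calendar datetime as "YYMMDD.HHMMSS" (assumed layout, taken
%% from the <dbname.YYMMDD.HHMMSS.deleted.ts> pattern above).
timestamp({{Y, Mo, D}, {H, Mi, S}}) ->
    lists:flatten(io_lib:format(
        "~2..0B~2..0B~2..0B.~2..0B~2..0B~2..0B",
        [Y rem 100, Mo, D, H, Mi, S])).

%% Decide what to do with ONE index directory.
%% RecoveryEnabled stands in for the "enable_database_recovery" setting.
plan(IndexDir, _DateTime, false) ->
    {delete, IndexDir};
plan(IndexDir, DateTime, true) ->
    Sig = filename:basename(IndexDir),      % index signature directory
    DbDir = filename:dirname(IndexDir),     % ".../<dbname.ts>"
    DbName = filename:basename(DbDir),
    Base = filename:rootname(DbName),       % "<dbname>"
    Ext = filename:extension(DbName),       % ".<ts>"
    NewDb = Base ++ "." ++ timestamp(DateTime) ++ ".deleted" ++ Ext,
    Target = filename:join([filename:dirname(DbDir), NewDb, Sig]),
    {rename, IndexDir, Target}.
```

Because the target is computed per index, each index directory moves on its own and keeps the original database suffix, so a quickly re-created database with a newer suffix would not be affected.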
Testing recommendations
GitHub issue number
Bugzid: 86318
Related Pull Requests
#3 was deprecated with this PR
https://github.com/cloudant/chef-repo/pull/7719
N/A
Checklist