Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(WIP) Add option to import json dump file (#291) #293

Closed
wants to merge 1 commit into from

Conversation

hbruch
Copy link
Collaborator

@hbruch hbruch commented Jan 12, 2018

This is a first attempt to provide a json dump import (see #291). Importing a json dump on my machine was about 10x faster than importing from nominatim.

However, there are some points I'd like to receive feedback on before doing a PR for merge:

  • currently, no sanity checking is done on import, neither of the action line, nor the source line, should it?
  • the JsonDumper does not export the PhotonDoc.uid. It should either be exported or the PhotonDoc should be restored from json so it can be recalculated from source
  • JsonDumpConnector does not convert a source line into a PhotonDoc but bypasses the public Importer interface to add the source string. This is ugly, but avoids (unnecessary?) parsing and formatting of the source (but would require uid handling, see above)
  • the elasticsearcg.Importer uses BulkRequestBuilder. Using BulkProcessor would provide concurrent updates, as well as retries with exponential backup in case of failures (this should be another PR, I think)

@hbruch hbruch changed the title (WIP) Add option to import json dump file (WIP) Add option to import json dump file (#291) Jan 12, 2018
try {
this.reader = new BufferedReader(new FileReader(filename));
} catch (FileNotFoundException e) {
log.error("Json dump file '%1' not found", filename);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we should throw a RuntimeException to avoid usage of this connector.

}
}

public void setImporter(de.komoot.photon.Importer importer) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't it be provided on construction and then importer and reader could be final?

@karussell
Copy link
Collaborator

currently, no sanity checking is done on import, neither of the action line, nor the source line, should it?

Sanity of which form like empty content?

It should either be exported or the PhotonDoc should be restored from json so it can be recalculated from source

Which option do you prefer?

This is ugly, but avoids (unnecessary?) parsing and formatting of the source (but would require uid handling, see above)

Can't judge about it. Both methods seems ok to me.

this should be another PR, I think

Yes, speed is not important for the first implementation IMO

@karussell
Copy link
Collaborator

Have included some improvements of this in #438

@karussell karussell closed this Oct 12, 2019
@hbruch hbruch deleted the issue_291 branch May 22, 2020 20:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants