-
Notifications
You must be signed in to change notification settings - Fork 271
Add flat-file bypass for get.trait.data() #3601
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this important step in the decoupling process!
A few requests
- Could you please update the docs
- Add a test that uses the file option
Questions:
- Is there a reason not to have an argument to specify the location of the file? It seems having an
input_fileargument to would make the option more clear and straightforward to use. - Will this require updating query_pfts and query_priors so that they can also take a file?
I'm not opposed and I agree it'd make this easier to call directly, but am comfortable leaving it for a separate PR (and thus also punting on changes in query_pfts and query_priors) because the existing trait workflow already assumes each step modifies a settings object. +1 to all the other suggestions |
|
Yes , it would be good if we keep that to next separate PR , or i can just add a commit in this PR to make it pass as an argument . |
|
@dlebauer , we don't need to specifically change the functions : Yes i will update the docs and write a test as well ! I have added the support to pass the file_path as in argument directly meanwhile keeping the logic of passing the path in the settings intact as well . |
|
@harshagr70 re: "we completely bypass these functions [query_pfts and query_priors] when using the flatfile" - perhaps out of scope for this PR, but where will the [pft,variable,prior] table used in the meta-analysis come from? |
| if(is.null(file_path)){ | ||
| file.path <- pfts$file_path | ||
| } | ||
| use_flatfile <- !is.null(file_path) && file.exists(file_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest throwing an error if file_path is specified but the file doesn't exist, rather than revert to DB mode (which would be surprising to the user).
| PEcAn.logger::logger.info("Using flat file for trait data instead of database") | ||
|
|
||
| # Load flat file as data.frame | ||
| trait_data_flat <- read.csv(pfts$file_path, stringsAsFactors = FALSE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| trait_data_flat <- read.csv(pfts$file_path, stringsAsFactors = FALSE) | |
| trait_data_flat <- read.csv(file_path, stringsAsFactors = FALSE) |
@dlebauer It will be a CSV file specified by the user as @harshagr70 when we discussed this live we realized it could use a line in the documentation -- can you add that, please? |
| #check for flatfile path if present use it | ||
| file_path <- input_file | ||
| if(is.null(file_path)){ | ||
| file.path <- pfts$file_path | ||
| } | ||
| use_flatfile <- !is.null(file_path) && file.exists(file_path) | ||
|
|
||
| if (use_flatfile) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible implementation of the error-if-not-exist pattern (If you like this, need to also add @importFrom rlang %||% in the Roxygen block above):
| #check for flatfile path if present use it | |
| file_path <- input_file | |
| if(is.null(file_path)){ | |
| file.path <- pfts$file_path | |
| } | |
| use_flatfile <- !is.null(file_path) && file.exists(file_path) | |
| if (use_flatfile) { | |
| #check for flatfile path, if present use it | |
| file_path <- input_file %||% pfts$file_path | |
| if (!is.null(file_path) { | |
| if (!file.exists(file_path) { | |
| PEcAn.logger::logger.error("trait data file not found at specified path", sQuote(file_path)) | |
| } |
This PR introduces an optional flat-file bypass for
get.trait.data(), allowing trait and prior data to be loaded directly from a CSV specified via<file_path>in<pfts>within settings.xml. When a file path is provided, the function reads PFTs, priors, and trait names from the flat file instead of querying BETYdb, enabling DB-independent runs (useful for offline workflows, testing, and CI). If no file path is specified, the original database workflow remains unchanged.fixes a part of #3602