-
Notifications
You must be signed in to change notification settings - Fork 8
fix: issue with path is returned from download_files is always set t… #731
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -89,11 +89,10 @@ async def get_dataframe(self) -> "pd.DataFrame": | |
try: | ||
import pandas as pd | ||
|
||
path = self.path | ||
if self.input_prefix and self.path: | ||
path = await self.download_files(self.path) | ||
await self.download_files(self.path) | ||
# Use pandas native read_parquet which can handle both single files and directories | ||
return pd.read_parquet(path) | ||
return pd.read_parquet(self.path) | ||
|
||
except Exception as e: | ||
logger.error(f"Error reading data from parquet file(s): {str(e)}") | ||
# Re-raise to match IcebergInput behavior | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix assumes that
download_files
modifiesself.path
in-place, but this behavior is not evident from the code shown. Ifdownload_files
doesn't updateself.path
to point to the downloaded local file, this will still read from the original remote path. Consider either ensuringdownload_files
updatesself.path
or capturing its return value if it returns the local path.Copilot uses AI. Check for mistakes.