Skip to content

refactor: use API to fetch Hp index#53

Closed
sahiljhawar wants to merge 2 commits into
GFZ:mainfrom
sahiljhawar:logger-with-file
Closed

refactor: use API to fetch Hp index#53
sahiljhawar wants to merge 2 commits into
GFZ:mainfrom
sahiljhawar:logger-with-file

Conversation

@sahiljhawar
Copy link
Copy Markdown
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings February 13, 2026 09:54
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request refactors the HP index data fetching mechanism from FTP-based downloads to using the GFZ JSON API. The main change replaces FTP file downloads and text parsing with direct API calls that return JSON data. However, the PR also includes unrelated enhancements to the logging system.

Changes:

  • Migrated HP index data source from FTP file downloads to GFZ JSON API (https://kp.gfz.de/app/json/)
  • Refactored data processing to parse JSON responses instead of text files
  • Enhanced logging system with file output support and additional configuration options

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 10 comments.

File Description
swvo/io/hp/gfz.py Replaced FTP download logic with API calls; updated data processing to handle JSON format
tests/io/hp/test_hp.py Updated tests to mock API requests instead of FTP operations; adjusted test data format
swvo/logger.py Added file logging support, changed level parameter from int to str, added line numbers to log format

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread swvo/io/hp/gfz.py
Comment on lines +292 to +293
data_total = pd.DataFrame(
{f"Hp{self.index_number}": json_data[f"Hp{self.index_number}"]},
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing error handling: If the API response doesn't contain the expected keys ('datetime' or f'Hp{self.index_number}'), this will raise a KeyError. Consider adding validation to check if the expected keys exist in json_data before accessing them, and raise a more informative error or log a warning if they're missing.

Suggested change
data_total = pd.DataFrame(
{f"Hp{self.index_number}": json_data[f"Hp{self.index_number}"]},
hp_key = f"Hp{self.index_number}"
required_keys = ["datetime", hp_key]
missing_keys = [key for key in required_keys if key not in json_data]
if missing_keys:
logger.warning(
"Missing expected key(s) %s in JSON file %s; returning empty DataFrame",
", ".join(missing_keys),
json_file,
)
return data_total
data_total = pd.DataFrame(
{hp_key: json_data[hp_key]},

Copilot uses AI. Check for mistakes.
Comment thread swvo/io/hp/gfz.py
{f"Hp{self.index_number}": json_data[f"Hp{self.index_number}"]},
index=pd.to_datetime(json_data["datetime"], utc=True),
)
data_total.index = data_total.index.tz_convert("UTC")
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant timezone conversion: Line 296 converts the timezone to 'UTC', but line 294 already creates the datetime index with utc=True, which makes it UTC-aware. The tz_convert operation is unnecessary and can be removed.

Suggested change
data_total.index = data_total.index.tz_convert("UTC")

Copilot uses AI. Check for mistakes.
Comment thread swvo/logger.py
Comment on lines +32 to +45
def setup_logging(level: str = "INFO", log_file: Optional[Path] = None, file_mode: str = "w"):
"""Setup logging for the swvo package and root logger.

Parameters
----------
level : str, optional
Logging level, by default is INFO
log_file : Path, optional
Path to log file. If None, only console logging is enabled.If provided, logs will be written to both console and file., by default None
"""
try:
level = getattr(logging, level.upper())
except AttributeError:
raise ValueError(f"Invalid logging level: {level}. Use one of DEBUG, INFO, WARNING, ERROR, CRITICAL.")
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setup_logging function signature has changed from accepting a logging level constant (int) to accepting a string. This is a breaking API change. Existing code calling setup_logging(level=logging.INFO) will break. Consider maintaining backward compatibility by accepting both int and str types for the level parameter.

Copilot uses AI. Check for mistakes.
Comment thread swvo/logger.py
Comment on lines +32 to +67
def setup_logging(level: str = "INFO", log_file: Optional[Path] = None, file_mode: str = "w"):
"""Setup logging for the swvo package and root logger.

Parameters
----------
level : str, optional
Logging level, by default is INFO
log_file : Path, optional
Path to log file. If None, only console logging is enabled.If provided, logs will be written to both console and file., by default None
"""
try:
level = getattr(logging, level.upper())
except AttributeError:
raise ValueError(f"Invalid logging level: {level}. Use one of DEBUG, INFO, WARNING, ERROR, CRITICAL.")

# Configure root logger so all loggers inherit the formatting
root_logger = logging.getLogger()

# Check if already configured
if any(isinstance(h, logging.StreamHandler) for h in root_logger.handlers):
return
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(
_ColorFormatter(
"[%(levelname)-8s] %(asctime)s - %(name)s - %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
)
)

logger.setLevel(level)
logger.addHandler(handler)

# Console handler with colors
log_format = "[%(levelname)-8s] %(asctime)s - %(name)s:%(lineno)d - %(message)s"
datefmt = "%Y-%m-%d %H:%M:%S"

console_handler = logging.StreamHandler(sys.stdout)
console_handler.setFormatter(_ColorFormatter(log_format, datefmt=datefmt))

root_logger.setLevel(level)
root_logger.addHandler(console_handler)

if log_file:
file_handler = logging.FileHandler(log_file, mode=file_mode)
file_handler.setFormatter(logging.Formatter(log_format, datefmt=datefmt))
root_logger.addHandler(file_handler)
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logger.py changes appear to be unrelated to the PR title "refactor: use API to fetch Hp index". These changes introduce new functionality (file logging support, file_mode parameter, line number in log format) that should be in a separate PR or the PR description should be updated to mention these improvements.

Copilot uses AI. Check for mistakes.
Comment thread swvo/io/hp/gfz.py
Comment on lines +154 to +160
data = response.json()
logger.debug(f"Downloaded data: {len(data)} records")

with open(local_path, "wb") as f:
ftp.retrbinary(f"RETR {filename}", f.write)
# Save the JSON response to a temporary file for processing
output_file = temporary_dir / f"hp_data_{start_time.year}.json"
with open(output_file, "w") as f:
json.dump(data, f)
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing validation: The code doesn't check if the API response contains valid data before saving it. If the API returns an empty response or an error message in JSON format, this could result in invalid data being saved and processed. Consider validating that the response contains the expected keys and non-empty data arrays before saving.

Copilot uses AI. Check for mistakes.
Comment thread swvo/io/hp/gfz.py
json_data = json.load(f)

data_total = pd.DataFrame(
{f"Hp{self.index_number}": json_data[f"Hp{self.index_number}"]},
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Column name mismatch: The processed data uses 'Hp30' as the column name (capitalized), but the rest of the codebase expects 'hp30' (lowercase). The _read_single_file method at line 313 uses str(self.index) which is lowercase ('hp30'), and the read method at line 215 also expects self.index in lowercase. This inconsistency will cause issues when combining DataFrames. The column name should use self.index instead of f"Hp{self.index_number}" to maintain consistency.

Suggested change
{f"Hp{self.index_number}": json_data[f"Hp{self.index_number}"]},
{str(self.index): json_data[f"Hp{self.index_number}"]},

Copilot uses AI. Check for mistakes.
Comment thread tests/io/hp/test_hp.py
mock_ftp.retrbinary.assert_called()
mock_ftp.quit.assert_called()
# Verify the API was called with requests
assert mocker.patch("swvo.io.hp.gfz.requests.get").called or True
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion will always evaluate to True due to the 'or True' clause, making the test meaningless. Either verify the mock was called properly by checking mock_get.called (without the 'or True'), or remove this assertion if the URL verification test at line 167 is sufficient.

Copilot uses AI. Check for mistakes.
Comment thread swvo/logger.py
level : str, optional
Logging level, by default is INFO
log_file : Path, optional
Path to log file. If None, only console logging is enabled.If provided, logs will be written to both console and file., by default None
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing documentation for the 'file_mode' parameter. The parameter is declared in the function signature but not documented in the docstring.

Suggested change
Path to log file. If None, only console logging is enabled.If provided, logs will be written to both console and file., by default None
Path to log file. If None, only console logging is enabled. If provided, logs will be written to both console and file., by default None
file_mode : str, optional
File open mode used for the log file handler (passed to :class:`logging.FileHandler`), by default "w".

Copilot uses AI. Check for mistakes.
Comment thread swvo/logger.py
level : str, optional
Logging level, by default is INFO
log_file : Path, optional
Path to log file. If None, only console logging is enabled.If provided, logs will be written to both console and file., by default None
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling error: Missing space in the docstring. Should be "If None, only console logging is enabled. If provided, logs will be written to both console and file."

Suggested change
Path to log file. If None, only console logging is enabled.If provided, logs will be written to both console and file., by default None
Path to log file. If None, only console logging is enabled. If provided, logs will be written to both console and file., by default None

Copilot uses AI. Check for mistakes.
Comment thread swvo/io/hp/gfz.py
response.raise_for_status()

data = response.json()
logger.debug(f"Downloaded data: {len(data)} records")
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential data loss: The debug log message attempts to get the length of 'data' which is a dictionary returned from response.json(). This will only count the number of keys (likely 2: 'datetime' and 'Hp30'), not the number of records. Consider using len(data.get('datetime', [])) or len(data.get(f'Hp{self.index_number}', [])) to log the actual number of data records.

Suggested change
logger.debug(f"Downloaded data: {len(data)} records")
records_count = len(data.get("datetime", []))
logger.debug(f"Downloaded data: {records_count} records")

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants