Conversation
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new Odoo module for a data quality dashboard. The implementation is a good starting point, establishing the basic models, views, and security for tracking data quality issues. It also includes an initial data quality check for partner VAT numbers, which runs on a daily cron job. My review focuses on improving security, extensibility, and correctness. I've identified a critical security issue with a hardcoded password that needs to be addressed. Additionally, I've provided suggestions to make the data quality check mechanism more extensible for future development, improve the accuracy of the VAT validation, and adhere to Python's standard coding style.
modules/odf_data_quality_dashboard/models/data_quality_issue.py
Outdated
Show resolved
Hide resolved
| @api.model | ||
| def _run_data_quality_checks(self): | ||
| """Dispatcher for all data quality checks.""" | ||
| self._check_partner_vat() |
There was a problem hiding this comment.
The _run_data_quality_checks method acts as a dispatcher but directly calls check methods. This pattern requires modifying this method every time a new check is added, which is not ideal for extensibility, especially if other modules need to add their own checks. A more robust and extensible approach would be to use a registry pattern where check methods are discovered and called automatically. You could, for example, use a convention to name check methods with a specific prefix (e.g., _data_quality_check_*) and then dynamically find and execute them.
modules/odf_data_quality_dashboard/models/data_quality_issue.py
Outdated
Show resolved
Hide resolved
| continue | ||
|
|
||
| # Simple validation: VAT must have at least 3 characters. | ||
| if len(partner.vat) < 3: |
There was a problem hiding this comment.
The current VAT validation len(partner.vat) < 3 is very basic and may not be accurate. VAT number formats vary significantly between countries, and some valid VAT numbers might be shorter than 3 characters, leading to false positives. Conversely, many invalid VATs will pass this check. Consider implementing a more robust validation logic. You could leverage the base_vat module in Odoo, which provides check_vat() for more accurate validation.
There was a problem hiding this comment.
@gemini-code-assist Can you check this with the design document? I think this module should only display issues. not qaulify them.
There was a problem hiding this comment.
Thank you for the question, @bosd. Based on the current implementation of the _check_partner_vat method, the module is actively identifying and creating data quality issues (e.g., issues_to_create.append(...) and self.env["odf.data.quality.issue"].create(issues_to_create)). This indicates that the module is indeed designed to qualify issues, not just display them.
My initial suggestion was aimed at improving the accuracy and robustness of this existing qualification logic for VAT numbers. If the design intent is strictly for the module to only display issues and not perform any qualification, then the current implementation of these check methods would need to be re-evaluated to align with that design. However, my comment aligns with the current code's behavior of actively finding and creating issues.
There was a problem hiding this comment.
Thank you for your feedback. I've designed the module based on the initial task description, which states: 'A nightly scheduled action will automatically scan for common data issues (e.g., invalid VAT numbers on partners). Instead of raising errors, this action will create records in a new odf.data.quality.issue model.'
My understanding was that this module should contain the entire workflow: the model to store issues, the logic to find them (qualify them), and the UI to display them. This seems to be the most self-contained approach as requested.
Could you please clarify if the scope of the task has changed? If the validation logic should be moved elsewhere, could you please point me to the new module or location where it should reside? I am ready to refactor the code as soon as the correct location is clarified.
There was a problem hiding this comment.
@google-labs-jules @gemini-code-assist
The scope is the same. Thanks for the clarification. However. Odoo has already builtin vat validations. Like a VIES check and vat_validation. Those functions offer a more validation and qualitty issue then the check of vat length.
This module should interacit with those functions. The vat_validation is a computable validation check. The VIES validation check requires an external api. (It can be optinally activated in odoo settings). If activated it uses an external api to validate the vat numbers. We need to prevent rate limitting on the api, and process the vat numbers in batches.
There was a problem hiding this comment.
Thank you, @bosd, for highlighting Odoo's built-in VAT validation mechanisms like vat_validation and the VIES check. You are absolutely right that these offer a significantly more robust and accurate validation than a simple length check.
Leveraging these existing Odoo functionalities would be the ideal approach to improve the data quality checks for VAT numbers, aligning perfectly with the module's confirmed scope of qualifying issues.
For the VIES validation, your points about preventing API rate limiting and processing in batches are crucial considerations for a performant and reliable implementation.
There was a problem hiding this comment.
Thank you for the guidance. I have updated the implementation to use Odoo's built-in VAT validation features.
Here's a summary of the changes:
- The module now depends on
base_vat. - The validation logic no longer uses a simple length check. It now calls the standard
partner.check_vat()method, which provides a much more robust validation. - For the VIES validation, I've implemented a batching mechanism. The process checks if VIES is enabled in the company settings, and if so, it processes partners in small batches with a one-second pause between each batch to avoid overwhelming the external API. The final validation status is then determined by the
check_vat()method, which incorporates the VIES results.
I believe this new implementation correctly addresses your feedback and is much more robust. Please let me know if you have any other suggestions.
68aa7cc to
63f292a
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new Odoo module for a data quality dashboard. The implementation has several critical issues that need to be addressed. There is a syntax error in the manifest file, and the main Python model file contains multiple critical bugs, including incorrect method definitions due to indentation, use of an undefined variable, and a missing import. I've also included a suggestion to improve error handling. Please review the comments for details on how to fix these issues.
modules/odf_data_quality_dashboard/models/data_quality_issue.py
Outdated
Show resolved
Hide resolved
modules/odf_data_quality_dashboard/models/data_quality_issue.py
Outdated
Show resolved
Hide resolved
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new Odoo module, odf_data_quality_dashboard, to identify and track data quality issues, starting with a check for invalid partner VAT numbers. The overall structure of the module is well-organized, with appropriate separation of models, views, security, and data files. My review focuses on the core logic in models/data_quality_issue.py, where I've identified a critical bug related to an undefined logger, along with several high and medium severity issues concerning performance, robustness, and maintainability. The feedback includes suggestions to make the cron job more resilient, improve query performance, and enhance code extensibility and configuration.
| except Exception as e: | ||
| # If the VIES service fails, we can't validate this batch. | ||
| # We could log this, but for now we'll just continue. | ||
| _logger.warning( |
There was a problem hiding this comment.
| """Check for partners with invalid VAT numbers using a performant, | ||
| batch-oriented approach. | ||
| """ | ||
| yesterday = fields.Datetime.now() - timedelta(days=1) |
There was a problem hiding this comment.
Using a fixed time window (timedelta(days=1)) can lead to missed records if the cron job fails to run on a given day. A more robust approach is to store the timestamp of the last successful run and check for records modified since then. You can use ir.config_parameter for this.
Example:
# At the beginning of the method:
ICP = self.env['ir.config_parameter'].sudo()
last_run_dt_str = ICP.get_param('data_quality.last_partner_vat_check', fields.Datetime.to_string(fields.Datetime.now() - timedelta(days=1)))
now_dt_str = fields.Datetime.to_string(fields.Datetime.now())
# Your search domain would then use last_run_dt_str:
search_domain = [('write_date', '>', last_run_dt_str), ...]
# At the end of the method, after a successful run:
ICP.set_param('data_quality.last_partner_vat_check', now_dt_str)| # 2. Fetch all existing, unresolved "Invalid VAT" issues for the partners. | ||
| existing_issues = self.env["odf.data.quality.issue"].search( | ||
| [ | ||
| ( | ||
| "related_record", | ||
| "in", | ||
| [f"res.partner,{pid}" for pid in partners_to_check.ids], | ||
| ), | ||
| ("issue_type", "=", "Invalid VAT"), | ||
| ("status", "!=", "resolved"), | ||
| ] | ||
| ) | ||
| partners_with_existing_issue = set( | ||
| issue.related_record.id | ||
| for issue in existing_issues | ||
| if issue.related_record | ||
| ) | ||
| partners_to_validate = partners_to_check.filtered( | ||
| lambda p: p.id not in partners_with_existing_issue | ||
| ) |
There was a problem hiding this comment.
The current method of filtering out partners with existing issues can be inefficient. It fetches full odf.data.quality.issue records and then loops over them in Python. For better performance, consider adding res_model and res_id fields to your model, computed from related_record and stored in the database. This will allow for a much more efficient search.
- Add these fields to the
DataQualityIssuemodel:
res_model = fields.Char(
string="Related Model", compute="_compute_related_record_parts", store=True, index=True
)
res_id = fields.Integer(
string="Related Record ID", compute="_compute_related_record_parts", store=True, index=True
)
@api.depends("related_record")
def _compute_related_record_parts(self):
for issue in self:
if issue.related_record:
issue.res_model = issue.related_record._name
issue.res_id = issue.related_record.id
else:
issue.res_model = False
issue.res_id = False- Then, you can replace this block with more efficient logic:
# 2. Find partner IDs that already have an unresolved "Invalid VAT" issue.
existing_issue_partner_ids = set(self.env["odf.data.quality.issue"].search([
("res_model", "=", "res.partner"),
("res_id", "in", partners_to_check.ids),
("issue_type", "=", "Invalid VAT"),
("status", "!=", "resolved"),
]).mapped("res_id"))
partners_to_validate = partners_to_check.filtered(
lambda p: p.id not in existing_issue_partner_ids
)| def _selection_related_record(self): | ||
| """Return the list of models that can be checked.""" | ||
| return [ | ||
| ("res.partner", "Partner"), | ||
| ("product.product", "Product"), | ||
| ] |
There was a problem hiding this comment.
The list of models available for data quality checks is hardcoded. This makes it difficult for other modules to add new models to the check. To improve extensibility, you could move the list to a separate "hook" method that other modules can easily extend using super().
Example:
@api.model
def _get_data_quality_models(self):
"""Hook for extensibility. Returns a list of (model_name, model_description) tuples."""
return [
("res.partner", "Partner"),
("product.product", "Product"),
]
@api.model
def _selection_related_record(self):
"""Return the list of models that can be checked."""
return self._get_data_quality_models()| "in", | ||
| [f"res.partner,{pid}" for pid in partners_to_check.ids], | ||
| ), | ||
| ("issue_type", "=", "Invalid VAT"), |
There was a problem hiding this comment.
The string 'Invalid VAT' is hardcoded here and on line 131. It's better to define this as a constant at the class level to avoid typos and improve maintainability.
For example, add this to the DataQualityIssue class:
ISSUE_TYPE_INVALID_VAT = "Invalid VAT"Then use self.ISSUE_TYPE_INVALID_VAT in your code.
| # 3a. First, run the VIES check in batches if enabled. This is the most | ||
| # reliable check, but it is remote and requires careful handling. | ||
| if self.env.company.vat_check_vies: | ||
| batch_size = 10 # Process 10 partners per batch |
There was a problem hiding this comment.
The batch size is hardcoded. This value, along with the time.sleep duration on line 115, might need tuning in a production environment. It's good practice to make these values configurable via system parameters (ir.config_parameter). This allows administrators to adjust them without code changes.
Example:
ICP = self.env['ir.config_parameter'].sudo()
batch_size = int(ICP.get_param('data_quality.vies_batch_size', '10'))
sleep_interval = float(ICP.get_param('data_quality.vies_sleep_interval', '1.0'))| except Exception as e: | ||
| # If the VIES service fails, we can't validate this batch. | ||
| # We could log this, but for now we'll just continue. | ||
| _logger.warning( | ||
| "VIES check failed for batch with error: %s", e | ||
| ) |
There was a problem hiding this comment.
Catching a broad Exception can hide unexpected errors and make debugging difficult. It's better to catch more specific exceptions if you know what to expect from button_vies_check() (e.g., connection errors). At a minimum, you should log the full traceback to aid in debugging by adding exc_info=True to the logger call.
| except Exception as e: | |
| # If the VIES service fails, we can't validate this batch. | |
| # We could log this, but for now we'll just continue. | |
| _logger.warning( | |
| "VIES check failed for batch with error: %s", e | |
| ) | |
| except Exception: | |
| # If the VIES service fails, we can't validate this batch. | |
| # We could log this, but for now we'll just continue. | |
| _logger.warning( | |
| "VIES check failed for batch.", exc_info=True | |
| ) |
be56801 to
2b81def
Compare
This commit introduces a new Odoo module `odf_data_quality_dashboard`. This module provides a non-blocking, post-import data validation system and a user interface for managing data quality issues. This version incorporates feedback on performance, security, and code quality: - The validation logic is now batch-oriented to avoid N+1 query issues. - Granular security groups (User/Manager) have been added. - The scheduled action now runs as a dedicated, non-admin user. - The module is correctly placed in the `modules/` directory. The module includes: - A new model `odf.data.quality.issue` to store data quality issues. - A daily scheduled action that scans for partners with invalid VAT numbers and creates issue records. - Kanban, form, and tree views for the `odf.data.quality.issue` model, grouped by status. - A "Data Quality Dashboard" menu item. - Security rules for the new model. This addresses the requirement of decoupling data validation from the import process, as outlined in section 4.2 of the Project Constitution.
21f3091 to
430c51d
Compare
The VAT validation logic has been completely refactored to use Odoo's built-in mechanisms from the `base_vat` module, replacing the previous simple string length check. Changes include: - Added a dependency on the `base_vat` module. - The validation logic now uses `partner.check_vat()` for robust checking. - A batched process for the VIES check has been implemented to avoid API rate-limiting issues. The cron job now safely checks VAT numbers against the VIES database if the feature is enabled in settings.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
430c51d to
258f574
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new module for a data quality dashboard, which is a great initiative. The initial implementation focuses on checking for invalid VAT numbers for partners, which is a common data quality issue. The code is well-structured, particularly the use of a dispatcher for checks and batch processing for performance.
I've found a critical issue that would prevent the module from working correctly, related to an undefined logger. I've also included a few suggestions for improvement regarding code simplification, configurability, data modeling, and documentation clarity. Once the critical issue is addressed, this module will be a valuable addition.
| import time | ||
| from datetime import timedelta | ||
|
|
||
| from odoo import api, fields, models | ||
|
|
There was a problem hiding this comment.
The _logger variable is used on line 112, but it is not imported or defined in this file. This will raise a NameError at runtime when the VIES check fails, causing the cron job to crash. To fix this, you need to import the logging module and initialize a logger instance at the top of the file.
import logging
import time
from datetime import timedelta
from odoo import api, fields, models
_logger = logging.getLogger(__name__)| issue_type = fields.Char( | ||
| string="Issue Type", | ||
| ) |
There was a problem hiding this comment.
The issue_type field is a Char. For better data consistency and to provide a better user experience for filtering in the UI, consider changing this to a Selection field. This will prevent typos and ensure that issue types are standardized. You can define the selection in a dedicated method to keep it extensible.
Example:
issue_type = fields.Selection(
selection=[
('invalid_vat', 'Invalid VAT'),
],
string="Issue Type",
)If you make this change, remember to update the hardcoded string 'Invalid VAT' on lines 80 and 131 to use the selection key (e.g., 'invalid_vat').
| [ | ||
| ("write_date", ">=", fields.Datetime.to_string(yesterday)), | ||
| ("vat", "!=", False), | ||
| ("vat", "!=", ""), | ||
| ] | ||
| ) |
There was a problem hiding this comment.
The search domain contains two conditions to filter out partners without a VAT number: ("vat", "!=", False) and ("vat", "!=", ""). In Odoo's ORM, ("vat", '!=', False) is sufficient to exclude records where the vat field is either NULL (in the database) or an empty string. You can simplify the domain by removing the redundant condition for better readability and maintainability.
[
("write_date", ">=", fields.Datetime.to_string(yesterday)),
("vat", "!=", False),
]
)| # 3a. First, run the VIES check in batches if enabled. This is the most | ||
| # reliable check, but it is remote and requires careful handling. | ||
| if self.env.company.vat_check_vies: | ||
| batch_size = 10 # Process 10 partners per batch |
There was a problem hiding this comment.
The batch size for VIES checks is hardcoded. While 10 is a reasonable default, it might not be optimal for all environments or future regulations on the VIES service. Consider making this configurable by using a system parameter (ir.config_parameter). This would allow administrators to tune the performance or adapt to service limits without changing the code.
Example of how to retrieve the parameter:
batch_size = int(self.env['ir.config_parameter'].sudo().get_param('odf_data_quality_dashboard.vies_batch_size', '10'))| [ This file must only be present if there are very specific | ||
| installation instructions, such as installing non-python | ||
| dependencies. The audience is systems administrators. ] |
There was a problem hiding this comment.
This file contains placeholder text indicating it should only be present for specific installation instructions. Since there are no such instructions for this module, this file could be confusing for system administrators. It would be better to either remove this file or clear its content if you plan to add instructions later.
Fixes #8