Pull request for issue #8 by google-labs-jules[bot] · Pull Request #13 · bosd/odoo-data-flow

google-labs-jules · 2025-08-20T20:03:41Z

Fixes #8

bosd · 2025-09-05T10:20:39Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a new Odoo module for a data quality dashboard. The implementation is a good starting point, establishing the basic models, views, and security for tracking data quality issues. It also includes an initial data quality check for partner VAT numbers, which runs on a daily cron job. My review focuses on improving security, extensibility, and correctness. I've identified a critical security issue with a hardcoded password that needs to be addressed. Additionally, I've provided suggestions to make the data quality check mechanism more extensible for future development, improve the accuracy of the VAT validation, and adhere to Python's standard coding style.

modules/odf_data_quality_dashboard/data/res_users.xml

modules/odf_data_quality_dashboard/models/data_quality_issue.py

gemini-code-assist · 2025-09-05T10:22:11Z

modules/odf_data_quality_dashboard/models/data_quality_issue.py

+    @api.model
+    def _run_data_quality_checks(self):
+        """Dispatcher for all data quality checks."""
+        self._check_partner_vat()


The _run_data_quality_checks method acts as a dispatcher but directly calls check methods. This pattern requires modifying this method every time a new check is added, which is not ideal for extensibility, especially if other modules need to add their own checks. A more robust and extensible approach would be to use a registry pattern where check methods are discovered and called automatically. You could, for example, use a convention to name check methods with a specific prefix (e.g., _data_quality_check_*) and then dynamically find and execute them.

modules/odf_data_quality_dashboard/models/data_quality_issue.py

gemini-code-assist · 2025-09-05T10:22:11Z

modules/odf_data_quality_dashboard/models/data_quality_issue.py

+                continue
+
+            # Simple validation: VAT must have at least 3 characters.
+            if len(partner.vat) < 3:


The current VAT validation len(partner.vat) < 3 is very basic and may not be accurate. VAT number formats vary significantly between countries, and some valid VAT numbers might be shorter than 3 characters, leading to false positives. Conversely, many invalid VATs will pass this check. Consider implementing a more robust validation logic. You could leverage the base_vat module in Odoo, which provides check_vat() for more accurate validation.

@gemini-code-assist Can you check this with the design document? I think this module should only display issues. not qaulify them.

Thank you for the question, @bosd. Based on the current implementation of the _check_partner_vat method, the module is actively identifying and creating data quality issues (e.g., issues_to_create.append(...) and self.env["odf.data.quality.issue"].create(issues_to_create)). This indicates that the module is indeed designed to qualify issues, not just display them.

My initial suggestion was aimed at improving the accuracy and robustness of this existing qualification logic for VAT numbers. If the design intent is strictly for the module to only display issues and not perform any qualification, then the current implementation of these check methods would need to be re-evaluated to align with that design. However, my comment aligns with the current code's behavior of actively finding and creating issues.

Thank you for your feedback. I've designed the module based on the initial task description, which states: 'A nightly scheduled action will automatically scan for common data issues (e.g., invalid VAT numbers on partners). Instead of raising errors, this action will create records in a new odf.data.quality.issue model.'

My understanding was that this module should contain the entire workflow: the model to store issues, the logic to find them (qualify them), and the UI to display them. This seems to be the most self-contained approach as requested.

Could you please clarify if the scope of the task has changed? If the validation logic should be moved elsewhere, could you please point me to the new module or location where it should reside? I am ready to refactor the code as soon as the correct location is clarified.

@google-labs-jules @gemini-code-assist
The scope is the same. Thanks for the clarification. However. Odoo has already builtin vat validations. Like a VIES check and vat_validation. Those functions offer a more validation and qualitty issue then the check of vat length.

This module should interacit with those functions. The vat_validation is a computable validation check. The VIES validation check requires an external api. (It can be optinally activated in odoo settings). If activated it uses an external api to validate the vat numbers. We need to prevent rate limitting on the api, and process the vat numbers in batches.

Thank you, @bosd, for highlighting Odoo's built-in VAT validation mechanisms like vat_validation and the VIES check. You are absolutely right that these offer a significantly more robust and accurate validation than a simple length check.

Leveraging these existing Odoo functionalities would be the ideal approach to improve the data quality checks for VAT numbers, aligning perfectly with the module's confirmed scope of qualifying issues.

For the VIES validation, your points about preventing API rate limiting and processing in batches are crucial considerations for a performant and reliable implementation.

Thank you for the guidance. I have updated the implementation to use Odoo's built-in VAT validation features.

Here's a summary of the changes:

The module now depends on base_vat.

The validation logic no longer uses a simple length check. It now calls the standard partner.check_vat() method, which provides a much more robust validation.

For the VIES validation, I've implemented a batching mechanism. The process checks if VIES is enabled in the company settings, and if so, it processes partners in small batches with a one-second pause between each batch to avoid overwhelming the external API. The final validation status is then determined by the check_vat() method, which incorporates the VIES results.

I believe this new implementation correctly addresses your feedback and is much more robust. Please let me know if you have any other suggestions.

bosd · 2025-09-05T11:13:35Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a new Odoo module for a data quality dashboard. The implementation has several critical issues that need to be addressed. There is a syntax error in the manifest file, and the main Python model file contains multiple critical bugs, including incorrect method definitions due to indentation, use of an undefined variable, and a missing import. I've also included a suggestion to improve error handling. Please review the comments for details on how to fix these issues.

modules/odf_data_quality_dashboard/__manifest__.py

modules/odf_data_quality_dashboard/models/data_quality_issue.py

bosd · 2025-09-05T19:48:07Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a new Odoo module, odf_data_quality_dashboard, to identify and track data quality issues, starting with a check for invalid partner VAT numbers. The overall structure of the module is well-organized, with appropriate separation of models, views, security, and data files. My review focuses on the core logic in models/data_quality_issue.py, where I've identified a critical bug related to an undefined logger, along with several high and medium severity issues concerning performance, robustness, and maintainability. The feedback includes suggestions to make the cron job more resilient, improve query performance, and enhance code extensibility and configuration.

gemini-code-assist · 2025-09-05T19:50:07Z

modules/odf_data_quality_dashboard/models/data_quality_issue.py

+                except Exception as e:
+                    # If the VIES service fails, we can't validate this batch.
+                    # We could log this, but for now we'll just continue.
+                    _logger.warning(


The _logger variable is used here but is not defined. This will cause a NameError when this code is executed. You need to import the logging module and initialize _logger at the top of the file.

Add this after your imports:

import logging _logger = logging.getLogger(__name__)

gemini-code-assist · 2025-09-05T19:50:07Z

modules/odf_data_quality_dashboard/models/data_quality_issue.py

+        """Check for partners with invalid VAT numbers using a performant,
+        batch-oriented approach.
+        """
+        yesterday = fields.Datetime.now() - timedelta(days=1)


Using a fixed time window (timedelta(days=1)) can lead to missed records if the cron job fails to run on a given day. A more robust approach is to store the timestamp of the last successful run and check for records modified since then. You can use ir.config_parameter for this.

Example:

# At the beginning of the method: ICP = self.env['ir.config_parameter'].sudo() last_run_dt_str = ICP.get_param('data_quality.last_partner_vat_check', fields.Datetime.to_string(fields.Datetime.now() - timedelta(days=1))) now_dt_str = fields.Datetime.to_string(fields.Datetime.now()) # Your search domain would then use last_run_dt_str: search_domain = [('write_date', '>', last_run_dt_str), ...] # At the end of the method, after a successful run: ICP.set_param('data_quality.last_partner_vat_check', now_dt_str)

gemini-code-assist · 2025-09-05T19:50:07Z

modules/odf_data_quality_dashboard/models/data_quality_issue.py

+        # 2. Fetch all existing, unresolved "Invalid VAT" issues for the partners.
+        existing_issues = self.env["odf.data.quality.issue"].search(
+            [
+                (
+                    "related_record",
+                    "in",
+                    [f"res.partner,{pid}" for pid in partners_to_check.ids],
+                ),
+                ("issue_type", "=", "Invalid VAT"),
+                ("status", "!=", "resolved"),
+            ]
+        )
+        partners_with_existing_issue = set(
+            issue.related_record.id
+            for issue in existing_issues
+            if issue.related_record
+        )
+        partners_to_validate = partners_to_check.filtered(
+            lambda p: p.id not in partners_with_existing_issue
+        )


The current method of filtering out partners with existing issues can be inefficient. It fetches full odf.data.quality.issue records and then loops over them in Python. For better performance, consider adding res_model and res_id fields to your model, computed from related_record and stored in the database. This will allow for a much more efficient search.

Add these fields to the DataQualityIssue model:

res_model = fields.Char( string="Related Model", compute="_compute_related_record_parts", store=True, index=True ) res_id = fields.Integer( string="Related Record ID", compute="_compute_related_record_parts", store=True, index=True ) @api.depends("related_record") def _compute_related_record_parts(self): for issue in self: if issue.related_record: issue.res_model = issue.related_record._name issue.res_id = issue.related_record.id else: issue.res_model = False issue.res_id = False

Then, you can replace this block with more efficient logic:

# 2. Find partner IDs that already have an unresolved "Invalid VAT" issue. existing_issue_partner_ids = set(self.env["odf.data.quality.issue"].search([ ("res_model", "=", "res.partner"), ("res_id", "in", partners_to_check.ids), ("issue_type", "=", "Invalid VAT"), ("status", "!=", "resolved"), ]).mapped("res_id")) partners_to_validate = partners_to_check.filtered( lambda p: p.id not in existing_issue_partner_ids )

gemini-code-assist · 2025-09-05T19:50:07Z

modules/odf_data_quality_dashboard/models/data_quality_issue.py

+    def _selection_related_record(self):
+        """Return the list of models that can be checked."""
+        return [
+            ("res.partner", "Partner"),
+            ("product.product", "Product"),
+        ]


The list of models available for data quality checks is hardcoded. This makes it difficult for other modules to add new models to the check. To improve extensibility, you could move the list to a separate "hook" method that other modules can easily extend using super().

Example:

@api.model def _get_data_quality_models(self): """Hook for extensibility. Returns a list of (model_name, model_description) tuples.""" return [ ("res.partner", "Partner"), ("product.product", "Product"), ] @api.model def _selection_related_record(self): """Return the list of models that can be checked.""" return self._get_data_quality_models()

gemini-code-assist · 2025-09-05T19:50:07Z

modules/odf_data_quality_dashboard/models/data_quality_issue.py

+                    "in",
+                    [f"res.partner,{pid}" for pid in partners_to_check.ids],
+                ),
+                ("issue_type", "=", "Invalid VAT"),


The string 'Invalid VAT' is hardcoded here and on line 131. It's better to define this as a constant at the class level to avoid typos and improve maintainability.

For example, add this to the DataQualityIssue class:

ISSUE_TYPE_INVALID_VAT = "Invalid VAT"

Then use self.ISSUE_TYPE_INVALID_VAT in your code.

gemini-code-assist · 2025-09-05T19:50:07Z

modules/odf_data_quality_dashboard/models/data_quality_issue.py

+        # 3a. First, run the VIES check in batches if enabled. This is the most
+        # reliable check, but it is remote and requires careful handling.
+        if self.env.company.vat_check_vies:
+            batch_size = 10  # Process 10 partners per batch


The batch size is hardcoded. This value, along with the time.sleep duration on line 115, might need tuning in a production environment. It's good practice to make these values configurable via system parameters (ir.config_parameter). This allows administrators to adjust them without code changes.

Example:

ICP = self.env['ir.config_parameter'].sudo() batch_size = int(ICP.get_param('data_quality.vies_batch_size', '10')) sleep_interval = float(ICP.get_param('data_quality.vies_sleep_interval', '1.0'))

gemini-code-assist · 2025-09-05T19:50:07Z

modules/odf_data_quality_dashboard/models/data_quality_issue.py

+                except Exception as e:
+                    # If the VIES service fails, we can't validate this batch.
+                    # We could log this, but for now we'll just continue.
+                    _logger.warning(
+                        "VIES check failed for batch with error: %s", e
+                    )


Catching a broad Exception can hide unexpected errors and make debugging difficult. It's better to catch more specific exceptions if you know what to expect from button_vies_check() (e.g., connection errors). At a minimum, you should log the full traceback to aid in debugging by adding exc_info=True to the logger call.

Suggested change

except Exception as e:

# If the VIES service fails, we can't validate this batch.

# We could log this, but for now we'll just continue.

_logger.warning(

"VIES check failed for batch with error: %s", e

)

except Exception:

# If the VIES service fails, we can't validate this batch.

# We could log this, but for now we'll just continue.

_logger.warning(

"VIES check failed for batch.", exc_info=True

)

This commit introduces a new Odoo module `odf_data_quality_dashboard`. This module provides a non-blocking, post-import data validation system and a user interface for managing data quality issues. This version incorporates feedback on performance, security, and code quality: - The validation logic is now batch-oriented to avoid N+1 query issues. - Granular security groups (User/Manager) have been added. - The scheduled action now runs as a dedicated, non-admin user. - The module is correctly placed in the `modules/` directory. The module includes: - A new model `odf.data.quality.issue` to store data quality issues. - A daily scheduled action that scans for partners with invalid VAT numbers and creates issue records. - Kanban, form, and tree views for the `odf.data.quality.issue` model, grouped by status. - A "Data Quality Dashboard" menu item. - Security rules for the new model. This addresses the requirement of decoupling data validation from the import process, as outlined in section 4.2 of the Project Constitution.

The VAT validation logic has been completely refactored to use Odoo's built-in mechanisms from the `base_vat` module, replacing the previous simple string length check. Changes include: - Added a dependency on the `base_vat` module. - The validation logic now uses `partner.check_vat()` for robust checking. - A batched process for the VIES check has been implemented to avoid API rate-limiting issues. The cron job now safely checks VAT numbers against the VIES database if the feature is enabled in settings.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

bosd · 2025-09-05T22:48:45Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a new module for a data quality dashboard, which is a great initiative. The initial implementation focuses on checking for invalid VAT numbers for partners, which is a common data quality issue. The code is well-structured, particularly the use of a dispatcher for checks and batch processing for performance.

I've found a critical issue that would prevent the module from working correctly, related to an undefined logger. I've also included a few suggestions for improvement regarding code simplification, configurability, data modeling, and documentation clarity. Once the critical issue is addressed, this module will be a valuable addition.

gemini-code-assist · 2025-09-05T22:49:59Z

modules/odf_data_quality_dashboard/models/data_quality_issue.py

+import time
+from datetime import timedelta
+
+from odoo import api, fields, models
+


The _logger variable is used on line 112, but it is not imported or defined in this file. This will raise a NameError at runtime when the VIES check fails, causing the cron job to crash. To fix this, you need to import the logging module and initialize a logger instance at the top of the file.

import logging import time from datetime import timedelta from odoo import api, fields, models _logger = logging.getLogger(__name__)

gemini-code-assist · 2025-09-05T22:49:59Z

modules/odf_data_quality_dashboard/models/data_quality_issue.py

+    issue_type = fields.Char(
+        string="Issue Type",
+    )


The issue_type field is a Char. For better data consistency and to provide a better user experience for filtering in the UI, consider changing this to a Selection field. This will prevent typos and ensure that issue types are standardized. You can define the selection in a dedicated method to keep it extensible.

Example:

issue_type = fields.Selection( selection=[ ('invalid_vat', 'Invalid VAT'), ], string="Issue Type", )

If you make this change, remember to update the hardcoded string 'Invalid VAT' on lines 80 and 131 to use the selection key (e.g., 'invalid_vat').

gemini-code-assist · 2025-09-05T22:49:59Z

modules/odf_data_quality_dashboard/models/data_quality_issue.py

+            [
+                ("write_date", ">=", fields.Datetime.to_string(yesterday)),
+                ("vat", "!=", False),
+                ("vat", "!=", ""),
+            ]
+        )


The search domain contains two conditions to filter out partners without a VAT number: ("vat", "!=", False) and ("vat", "!=", ""). In Odoo's ORM, ("vat", '!=', False) is sufficient to exclude records where the vat field is either NULL (in the database) or an empty string. You can simplify the domain by removing the redundant condition for better readability and maintainability.

[ ("write_date", ">=", fields.Datetime.to_string(yesterday)), ("vat", "!=", False), ] )

gemini-code-assist · 2025-09-05T22:49:59Z

modules/odf_data_quality_dashboard/models/data_quality_issue.py

+        # 3a. First, run the VIES check in batches if enabled. This is the most
+        # reliable check, but it is remote and requires careful handling.
+        if self.env.company.vat_check_vies:
+            batch_size = 10  # Process 10 partners per batch


The batch size for VIES checks is hardcoded. While 10 is a reasonable default, it might not be optimal for all environments or future regulations on the VIES service. Consider making this configurable by using a system parameter (ir.config_parameter). This would allow administrators to tune the performance or adapt to service limits without changing the code.

Example of how to retrieve the parameter:

batch_size = int(self.env['ir.config_parameter'].sudo().get_param('odf_data_quality_dashboard.vies_batch_size', '10'))

gemini-code-assist · 2025-09-05T22:50:00Z

modules/odf_data_quality_dashboard/readme/INSTALL.md

+[ This file must only be present if there are very specific
+  installation instructions, such as installing non-python
+  dependencies. The audience is systems administrators. ]


This file contains placeholder text indicating it should only be present for specific installation instructions. Since there are no such instructions for this module, this file could be confusing for system administrators. It would be better to either remove this file or clear its content if you plan to add instructions later.

google-labs-jules bot mentioned this pull request Aug 20, 2025

Task 1.3.1: Create odf_data_quality_dashboard Module #8

Open

8 tasks

bosd marked this pull request as ready for review September 5, 2025 10:20

gemini-code-assist bot reviewed Sep 5, 2025

View reviewed changes

bosd force-pushed the feat/data-quality-module-final branch from 68aa7cc to 63f292a Compare September 5, 2025 11:12

gemini-code-assist bot reviewed Sep 5, 2025

View reviewed changes

bosd force-pushed the feat/data-quality-module-final branch from be56801 to 2b81def Compare September 5, 2025 22:39

bosd force-pushed the feat/data-quality-module-final branch 2 times, most recently from 21f3091 to 430c51d Compare September 5, 2025 22:45

google-labs-jules bot and others added 11 commits September 6, 2025 00:46

improve odf_data_quality_dashboard

c7bb411

Update modules/odf_data_quality_dashboard/data/res_users.xml

b83c79c

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

feat: make selection record extensionable

b3c55c7

fixup import

3cd1e20

Update modules/odf_data_quality_dashboard/__manifest__.py

812f957

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update modules/odf_data_quality_dashboard/models/data_quality_issue.py

8ee9de0

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update modules/odf_data_quality_dashboard/models/data_quality_issue.py

7ec1b86

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update modules/odf_data_quality_dashboard/models/data_quality_issue.py

d85d6b8

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

fix indentation

05e3825

Fixup: MVP

258f574

bosd force-pushed the feat/data-quality-module-final branch from 430c51d to 258f574 Compare September 5, 2025 22:46

gemini-code-assist bot reviewed Sep 5, 2025

View reviewed changes

Conversation

google-labs-jules bot commented Aug 20, 2025

Uh oh!

bosd commented Sep 5, 2025

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

gemini-code-assist bot Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

gemini-code-assist bot Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

bosd Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

google-labs-jules bot Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

bosd Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

google-labs-jules bot Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

bosd commented Sep 5, 2025

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

bosd commented Sep 5, 2025

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

bosd commented Sep 5, 2025

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!