
Check for custom callouts on app builds #35628

Open

wants to merge 15 commits into base: master

Conversation


@Charl1996 Charl1996 commented Jan 16, 2025

Technical Summary

Ticket

This PR adds some more insights into how users use the app dependencies feature in an application (or rather, how they don't use it).

The broader intent is to find out what level of adoption we have for the app dependencies feature and if any projects make use of app callouts but don't use the app dependencies feature. This will enable us to start asking "why not".

Notes

Celery task

I think the new method should be relatively cheap to run, but I have considered handing the analyse_app_build method off to celery for async processing, since reporting analytics shouldn't be a blocking operation. Any input/thoughts here would be welcome.

Datadog custom metric

Prereading

Reporting the app ID and domain as tags will count as additional custom metrics on Datadog (app_id being the biggest contributor to increased metric cardinality), but

  1. assuming hitting the "Make new version" button is a relatively low-frequency action, and
  2. of all new app builds, only those that have custom app callouts will be reported to Datadog

...I think the cardinality is acceptable given the low frequency of metric reporting.
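As a sanity check on the cost claim: Datadog bills custom metrics per unique tag combination (timeseries), so a rough estimate scales with the number of (domain, app_id) pairs rather than with how often the button is clicked. The figures below are purely hypothetical, for illustration only:

```python
# Back-of-envelope estimate of the metric's Datadog footprint.
# Each unique combination of tag values becomes its own timeseries,
# so cardinality (not call frequency) drives the cost.
# Both figures below are hypothetical stand-ins.
domains_with_custom_callouts = 20
avg_apps_with_callouts_per_domain = 3

estimated_timeseries = (
    domains_with_custom_callouts * avg_apps_with_callouts_per_domain
)
print(estimated_timeseries)  # 60
```

Even if the real numbers are an order of magnitude larger, that is still a small timeseries count by Datadog standards.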

Safety Assurance

Safety story

Tested locally. Staging testing to commence.

Automated test coverage

No automated testing

QA Plan

Going through QA

Rollback instructions

  • This PR can be reverted after deploy with no further considerations

Labels & Review

  • Risk label is set correctly
  • The set of people pinged as reviewers is appropriate for the level of risk of the change

@Charl1996 Charl1996 added the product/invisible label (Change has no end-user visible impact) Jan 16, 2025
@orangejenny (Contributor)

I have had some thoughts of handing the analyse_app_build method off to celery for async processing

Yeah, I think this should go in a celery task. My recollection is that wrapped_xform parses the form XML and that can be expensive for apps with a lot of large forms. If you don't move the new logic to celery, it's worth gathering some evidence that it won't be a performance hit for large apps (possibly using the app build timings page).

assuming hitting the "Make new version" button is a relatively low-frequency action

I agree the new metric isn't likely to incur significant cost, but just for the exercise, you could use analytics to confirm the actual frequency of this action. We should have a metric in GA for how often this button is clicked.


return copy

def analyse_app_build(self, new_build):
Contributor

+1 on doing this async.
Gathering metrics for our own use shouldn't delay user interaction.

if app_has_custom_intents():
    metrics_counter(
        'commcare.app_build.custom_app_callout',
        tags={'domain': new_build.domain, 'app_id': new_build.copy_of},
Contributor

Just checking on this, same as the other PR: does adding tags affect cost on Datadog, and how? I assume a new custom metric is a new cost of its own.

Contributor

I see there is context about this in the description.

So we are saying that since making a new version happens infrequently, and we report only when it's applicable, this should be okay. Seems fair.

I see Jenny already noticed this PR, but just tagging @gherceg as well for visibility that a new custom tag is being added.
(Let me know if it's not necessary for us to notify you or someone every time a new tag is added.)

Contributor Author

I still want to check out how often a new build is created so we can at least have a sense of frequency. Will update when I know.

def check_for_custom_callouts(self, new_build):
    from corehq.apps.app_manager.util import app_callout_templates

    templates = next(app_callout_templates)
Contributor

should this be list instead of next?

Contributor Author

The app_callout_templates variable holds a reference to a generator which yields the templates, so maybe next is a better fit here?
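A minimal sketch of the distinction, assuming app_callout_templates is a module-level generator object whose single yield hands back the parsed template data (the names and data below are illustrative stand-ins, not the real implementation):

```python
def _load_callout_templates():
    # Stand-in for the expensive template parsing done once in the
    # real helper; the generator yields the parsed data a single time.
    data = ["template-a", "template-b"]
    yield data

# A generator object, created once at module import time.
app_callout_templates = _load_callout_templates()

# next() advances to the single yield and returns the parsed data.
templates = next(app_callout_templates)
print(templates)  # ['template-a', 'template-b']

# Caveat: the generator is now exhausted, so calling next() again
# raises StopIteration. If this check runs on every build, the data
# needs to be cached rather than re-read from the generator.
```

list() would have the same single-use caveat here, since it also consumes the generator.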

@mkangia (Contributor) left a comment

Looks safe enough to me, though I would vote to get QA on this since we are touching a critical process used by all users. QA could then check it against different kinds of forms/apps so we can be more certain this won't break anything.

@Charl1996 Charl1996 added the awaiting QA label (QA in progress. Do not merge) Jan 17, 2025

Charl1996 commented Jan 17, 2025

I moved the analyse_new_app_build task call to after the new build has been saved, to avoid any race conditions. As such, I also had to update the function that checks for app dependencies added/removed on the previous build, since get_latest_build now refers to the actual newest app build rather than the one before it.

@Charl1996 Charl1996 requested a review from mkangia January 20, 2025 07:37

coderabbitai bot commented Jan 20, 2025

Walkthrough

The pull request introduces changes to the app build process across multiple files in the corehq/apps/app_manager module. The primary modifications involve removing the existing check_build_dependencies method from the ApplicationBase class in models.py and introducing new functionality for analyzing app builds in tasks.py and views/releases.py.

The changes implement a new asynchronous task analyse_new_app_build that checks for custom callouts and build dependencies when a new app build is created. This task is triggered automatically after an app build is made, allowing for background analysis without blocking the main execution flow. The modifications streamline the build process by moving dependency and custom callout checks into a separate, non-blocking task, potentially improving the overall performance and flexibility of the app build mechanism.

The changes affect how app builds are created and analyzed, introducing a more modular approach to tracking and reporting on app build characteristics.


@coderabbitai (bot) left a comment

Actionable comments posted: 4

🧹 Nitpick comments (1)
corehq/apps/app_manager/tests/test_apps.py (1)

42-42: Consider refactoring to avoid importing from views in tests

Importing make_app_build from corehq.apps.app_manager.views.releases into tests.test_apps.py may mix concerns between views and tests. To maintain a clean separation of concerns, consider refactoring make_app_build into a utility module or the model layer, and import it from there.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between bebe117 and 30598b0.

📒 Files selected for processing (4)
  • corehq/apps/app_manager/models.py (0 hunks)
  • corehq/apps/app_manager/tasks.py (3 hunks)
  • corehq/apps/app_manager/tests/test_apps.py (4 hunks)
  • corehq/apps/app_manager/views/releases.py (2 hunks)
💤 Files with no reviewable changes (1)
  • corehq/apps/app_manager/models.py

corehq/apps/app_manager/tasks.py (comment outdated, resolved)
Comment on lines +142 to +143
previous_build_id = app_build_ids[app_build_ids.index(new_build.id) + 1]
previous_build = get_app(new_build.domain, previous_build_id)
@coderabbitai (bot), Jan 20, 2025

⚠️ Potential issue

Potential IndexError when accessing previous build ID

In check_build_dependencies, lines 142-143 calculate previous_build_id using app_build_ids[app_build_ids.index(new_build.id) + 1]. If new_build.id is the last element in app_build_ids, this will result in an IndexError. Additionally, if new_build.id is not in app_build_ids, a ValueError will be raised.

Modify the code to handle these edge cases safely:

    if len(app_build_ids) > 1:
        try:
            idx = app_build_ids.index(new_build.id)
            if idx + 1 < len(app_build_ids):
                previous_build_id = app_build_ids[idx + 1]
                previous_build = get_app(new_build.domain, previous_build_id)
                last_build_has_dependencies = has_dependencies(previous_build) if previous_build else False
        except (ValueError, IndexError):
            # Handle the case where new_build.id is not found or is the last element
            last_build_has_dependencies = False

Committable suggestion skipped: line range outside the PR's diff.

Contributor Author

Good point

coderabbitai (bot)

@Charl1996 Would you like me to help you implement the suggested fix to handle the edge cases safely? Let me know if you need any assistance.

Contributor Author

No, it's fine. I don't think this will be a concern, because the docstring of get_build_ids says the following:
"Returns all the built apps for an application id, in descending order of time built."

I've verified that it's indeed so, meaning

  1. the new_build.id will definitely exist in the list
  2. if it's the last item in the list, it is also the only item in the list, so the surrounding len(app_build_ids) > 1 check means this block of code won't execute in the first place.
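That invariant can be sketched with plain lists (illustrative build ids, assuming the descending order the docstring guarantees):

```python
# get_build_ids returns builds in descending order of build time,
# so a newly saved build is always first (index 0).
app_build_ids = ["build-3", "build-2", "build-1"]
new_build_id = "build-3"

if len(app_build_ids) > 1:
    idx = app_build_ids.index(new_build_id)  # 0 for the newest build
    # idx is 0 and len > 1 inside this branch, so idx + 1 is in bounds.
    previous_build_id = app_build_ids[idx + 1]
print(previous_build_id)  # build-2
```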

coderabbitai (bot)

@Charl1996 I've found the implementation in dbaccessors.py and can confirm your explanation. The docstring indeed explicitly states that it "Returns all the built apps for an application id, in descending order of time built." This validates your assessment of the code's safety. Thank you for helping me understand the implementation better.

🔗 Analysis chain

@Charl1996 Thank you for the detailed explanation. I understand now that the code is indeed safe due to the guarantees provided by get_build_ids. Your explanation makes perfect sense - since the builds are returned in descending order and the new build will always be present, we don't need additional error handling.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for the implementation of get_build_ids
ast-grep --pattern 'def get_build_ids'

# Also search for any documentation or comments about get_build_ids
rg -A 5 'def get_build_ids'

Length of output: 1548


✏️ Learnings added
Learnt from: Charl1996
PR: dimagi/commcare-hq#35628
File: corehq/apps/app_manager/tasks.py:142-143
Timestamp: 2025-01-20T08:22:44.938Z
Learning: The `get_build_ids` function in corehq.apps.app_manager.dbaccessors returns all built apps for an application ID in descending order of build time, which guarantees that:
1. A newly built app's ID will always exist in the returned list
2. If the new build ID is the last item in the list, it means it's the only item, preventing any IndexError in the `check_build_dependencies` function

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.



Contributor Author

Here's a carrot, Mr Rabbit!

coderabbitai (bot)

@Charl1996 🥕 Thank you for the carrot! happy rabbit noises 🐰

Contributor

I agree this shouldn't be an issue, but you could save readers from going through the same thought process you outlined, and make it obvious that it isn't an issue, by doing something like:

build_index = app_build_ids.index(new_build.id)
if build_index + 1 < len(app_build_ids):
    previous_build_id = app_build_ids[build_index + 1]

coderabbitai (bot)

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

corehq/apps/app_manager/tests/test_apps.py (resolved)
corehq/apps/app_manager/views/releases.py (comment outdated, resolved)
@Charl1996 (Contributor Author)

@CodeRabbit Please summarize the things you have learned from contributors so far about the codebase and the way we do things.


coderabbitai bot commented Jan 20, 2025


@Charl1996 Based on the discussions in this PR, here are the key learnings about the codebase and development practices:

  1. Performance is a critical consideration:

    • Operations involving wrapped_xform (form XML parsing) can be expensive for large forms
    • Resource-intensive operations should be moved to Celery tasks for asynchronous processing
    • Performance impact should be validated with evidence, using tools like the app build timings page
  2. Monitoring and analytics are important:

    • Google Analytics tracks user interactions (e.g., button clicks)
    • Datadog is used for custom metrics
    • Metrics should be carefully considered for their cost impact
  3. Development best practices:

    • Race conditions should be avoided by ensuring proper operation ordering
    • Build-related operations are timing-sensitive
    • Changes should be backed by data (e.g., using analytics to validate assumptions)

✏️ Learnings added
Learnt from: Charl1996
URL: https://github.com/dimagi/commcare-hq/pull/35628

Timestamp: 2025-01-20T13:54:06.301Z
Learning: Performance-critical operations, especially those involving `wrapped_xform` for parsing form XML, should be moved to Celery tasks for asynchronous processing to avoid impacting large applications.

Learnt from: Charl1996
URL: https://github.com/dimagi/commcare-hq/pull/35628

Timestamp: 2025-01-20T13:54:06.301Z
Learning: Performance impact of changes should be validated using evidence, such as the app build timings page, particularly for operations affecting large applications.

Learnt from: Charl1996
URL: https://github.com/dimagi/commcare-hq/pull/35628

Timestamp: 2025-01-20T13:54:06.301Z
Learning: The codebase uses Google Analytics for tracking user interactions and Datadog for custom metrics, with consideration given to the cost impact of new metrics.

Learnt from: Charl1996
URL: https://github.com/dimagi/commcare-hq/pull/35628

Timestamp: 2025-01-20T13:54:06.301Z
Learning: Build-related operations are timing-sensitive and should be carefully ordered to avoid race conditions, particularly when dealing with latest builds and dependencies.


@orangejenny (Contributor)

I moved the analyse_new_app_build task call to after the new build has been saved to avoid any race conditions.

Can you say more about this? It looks like there's still the possibility that if analyse_new_app_build gets queued for a while, and the user makes an additional build, the task will fetch the wrong build. Why not pass the build id into the task instead of the app id?

@Charl1996 (Contributor Author)

It looks like there's still the possibility that if analyse_new_app_build gets queued for a while

Ah, that's a good point!

Why not pass the build id into the task instead of the app id?

Sounds like a good idea. Let me update.
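The race, and why passing the immutable build id fixes it, can be sketched without Celery; a plain queue stands in for the task broker, and all names here are illustrative:

```python
import queue

task_queue = queue.Queue()  # stands in for the celery broker
builds = {}
latest_build_id = None  # what a "fetch the latest build" task would see

def save_build(build_id):
    """Save a build, then enqueue analysis with the explicit build id."""
    global latest_build_id
    builds[build_id] = {"id": build_id}
    latest_build_id = build_id
    task_queue.put(build_id)

def analyse_new_app_build(build_id):
    # Fetches exactly the build that triggered this task, even if more
    # builds were saved while the task sat in the queue.
    return builds[build_id]

# The user makes two builds before any analysis task gets to run:
save_build("b1")
save_build("b2")

first = analyse_new_app_build(task_queue.get())
print(first["id"])  # b1 -- a "latest build" lookup would have seen b2
```

With the id baked into the task arguments, a long queue delay can no longer cause one build to be analysed twice and another skipped.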



def check_for_custom_callouts(new_build):
    from corehq.apps.app_manager.util import app_callout_templates
Contributor

It might be worth caching app_callout_templates if it's now going to be called often.

@@ -540,7 +541,13 @@ def _app_callout_templates():
yield data


@memoized
Contributor

Memoized is only an in-memory cache, so it isn't the thing to use here; you'll want to cache across requests. I was thinking you could use quickcache, probably for a while, maybe a week or longer. Check out these docs if you haven't seen them.
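To illustrate the intended behaviour, here is a rough stand-in for that pattern: a decorator that memoizes with a timeout. The real code would use corehq's quickcache (which shares results across processes via the cache backend); this in-process sketch only shows the cache-for-a-week idea, and the function body is a hypothetical stand-in:

```python
import functools
import time

def timed_cache(timeout):
    """In-process stand-in for a cross-request cache like quickcache."""
    def decorator(fn):
        cache = {}
        @functools.wraps(fn)
        def wrapper(*args):
            now = time.monotonic()
            hit = cache.get(args)
            if hit is not None and now - hit[1] < timeout:
                return hit[0]  # still fresh: serve the cached value
            value = fn(*args)
            cache[args] = (value, now)
            return value
        return wrapper
    return decorator

load_count = 0

@timed_cache(timeout=7 * 24 * 60 * 60)  # cache for a week
def app_callout_templates():
    global load_count
    load_count += 1  # counts how often the expensive load actually runs
    return ["template-a"]  # stand-in for the parsed template files

app_callout_templates()
app_callout_templates()
print(load_count)  # 1 -- the second call was served from the cache
```

Unlike this sketch, quickcache survives process restarts and is shared across web workers, which is exactly what the per-build check needs.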

@Charl1996 (Contributor Author), Jan 21, 2025

Memoized is only an in-memory cache ... you'll want to cache across requests

Ah, I thought it was the other way around (but haven't read those docs, to be honest). Probably the "quick" in quickcache that got my mind on the wrong track 🤷


Contributor

cool, thanks for updating


Charl1996 commented Jan 21, 2025

The test error is as follows:

2025-01-21T07:07:59.9588424Z ##[group]Run bash ./scripts/test-make-docs.sh
2025-01-21T07:07:59.9588979Z �[36;1mbash ./scripts/test-make-docs.sh�[0m
2025-01-21T07:07:59.9647325Z shell: /usr/bin/bash -e {0}
2025-01-21T07:07:59.9647634Z env:
2025-01-21T07:07:59.9647903Z pythonLocation: /opt/hostedtoolcache/Python/3.9.21/x64
2025-01-21T07:07:59.9648504Z PKG_CONFIG_PATH: /opt/hostedtoolcache/Python/3.9.21/x64/lib/pkgconfig
2025-01-21T07:07:59.9648977Z Python_ROOT_DIR: /opt/hostedtoolcache/Python/3.9.21/x64
2025-01-21T07:07:59.9649357Z Python2_ROOT_DIR: /opt/hostedtoolcache/Python/3.9.21/x64
2025-01-21T07:07:59.9649874Z Python3_ROOT_DIR: /opt/hostedtoolcache/Python/3.9.21/x64
2025-01-21T07:07:59.9650301Z LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.9.21/x64/lib
2025-01-21T07:07:59.9650624Z ##[endgroup]
2025-01-21T07:08:03.7839532Z Extension error (sphinxcontrib_django.docstrings):
2025-01-21T07:08:03.7841103Z Handler <function setup_django at 0x7f9d48e8dca0> for event 'config-inited' threw an exception (exception: Error 111 connecting to 127.0.0.1:6379. Connection refused.)
2025-01-21T07:08:03.7841892Z make[1]: *** [Makefile:25: html] Error 2
2025-01-21T07:08:03.7920244Z ##[error]Process completed with exit code 1.

Observation: Redis lives at 127.0.0.1:6379 if I'm not mistaken.

I've followed these steps but the output (and error log) is clean - no errors.
