Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add caching #179

Merged
merged 25 commits into from
Dec 30, 2024
Merged

Add caching #179

merged 25 commits into from
Dec 30, 2024

Conversation

gladyshcodes
Copy link
Contributor

@gladyshcodes gladyshcodes commented Dec 26, 2024

Issue #124

This PR introduces basic caching mechanism to reduce costs and increase effectiveness of running test suites.

Performance Boost: Achieves an average speedup of 400%-600%, automations like "Find Lionel Messi Wikipedia page" now completing about 5x faster

Flow diagram


Benchmarks

No caching
Screenshot 2024-12-29 at 22 36 05

With caching
Screenshot 2024-12-29 at 22 36 17

Perf boost: 6x

Copy link

vercel bot commented Dec 26, 2024

@gladyshcodes is attempting to deploy a commit to the Antiwork Team on Vercel.

A member of the Team first needs to authorize it.

@CLAassistant
Copy link

CLAassistant commented Dec 26, 2024

CLA assistant check
All committers have signed the CLA.

@gladyshcodes
Copy link
Contributor Author

gladyshcodes commented Dec 27, 2024

One challenge is:

async getComponentStringByCoords(x: number, y: number) {
    return await this.getPage().evaluate(
      ([x, y]) => {
        const elem = document.elementFromPoint(x, y);
        return elem?.outerHTML.trim().replace(/\s+/g, ' ');
      },
      [x, y]
    );
  }

The function above retrieves normalized component string given X and Y coordinates. The reason it's implemented this way is that currently, Playwright does not support selecting DOM elements using coordinates (see this closed issue). The only way I have found was to use native document func. If you have any other ideas in mind, please suggest 👍

Copy link

vercel bot commented Dec 27, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
shortest ✅ Ready (Inspect) Visit Preview 💬 Add feedback Dec 30, 2024 7:50pm

@m2rads
Copy link
Contributor

m2rads commented Dec 27, 2024

Would be nice to resolve the conflicts and then I can test your branch locally.

@gladyshcodes
Copy link
Contributor Author

Would be nice to resolve the conflicts and then I can test your branch locally.

@m2rads I have resolved conflicts now. You can check it out.

@gladyshcodes
Copy link
Contributor Author

Refinements / Improvements:

  1. Write new text execution to cache file only after it has been successfully completed. That way we can be sure cached tests are only successful ones
  2. Delete cache entry if it fails to run (e.g UI change) and rewrite it

@m2rads
Copy link
Contributor

m2rads commented Dec 27, 2024

@gladyshcodes sorry we made some more changes. Please resolve the conflicts again and I will test soon. Thank you :)

@gladyshcodes
Copy link
Contributor Author

@gladyshcodes sorry we made some more changes. Please resolve the conflicts again and I will test soon. Thank you :)

Hi there @m2rads. Those conflicts actually do not interfere with functionality added. I resolved them 👍 .

Comment on lines 16 to 35
export type BrowserAction =
| "mouse_move"
| "left_click"
| "left_click_drag"
| "right_click"
| "middle_click"
| "double_click"
| "screenshot"
| "cursor_position"
| "github_login"
| "clear_session"
| "type"
| "key"
| "run_callback"
| "navigate"
| "sleep"
| "check_email";
export enum BrowserActionEnum {
MouseMove = "mouse_move",
LeftClick = "left_click",
LeftClickDrag = "left_click_drag",
RightClick = "right_click",
MiddleClick = "middle_click",
DoubleClick = "double_click",
Screenshot = "screenshot",
CursorPosition = "cursor_position",
GithubLogin = "github_login",
ClearSession = "clear_session",
Type = "type",
Key = "key",
RunCallback = "run_callback",
Navigate = "navigate",
Sleep = "sleep",
CheckMail = "check_mail",
}

export type BrowserAction = `${BrowserActionEnum}`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Contributor Author

@gladyshcodes gladyshcodes Dec 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed to use screenshot browser action and founded it no good to make hard code comparison. That's why I created enum from interface, not touching the interface itself. Now we can do: BrowserActionEnum.Screenshot

@m2rads
Copy link
Contributor

m2rads commented Dec 29, 2024

Seems like Vercel deployment is failing. I am also getting some build errors locally.

Steps to reproduce this error:

Remove these files:
pnpm-lock.json
node_modules
packages/shortest/node_modules
packages/shortest/dist

Then run pnpm install

image

Let me know if you need Vercel logs for troubleshooting.

@gladyshcodes
Copy link
Contributor Author

@m2rads Build issue resolved, but Vercel pipeline does not re-run. I guess your approval is needed

@gladyshcodes gladyshcodes mentioned this pull request Dec 29, 2024
Comment on lines 24 to 25
this.cacheFile = path.join(process.cwd(), ".cache", "cache.json");
this.lockFile = path.join(process.cwd(), ".cache", "cache.lock");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to put this inside .shortest folder. Also does this only cache passing test cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to put this inside .shortest folder.

Good point. I will rename it

Also does this only cache passing test cases?

Not entirely sure what you mean. If you are asking whether test file only contains tests that succeeded, then yes

@m2rads
Copy link
Contributor

m2rads commented Dec 29, 2024

Thanks the build issue is resolved now but I think there might be another issue. When running a test, it caches the steps successfully, but on a consecutive run, it deletes the cache and does the same process. It doesn't seem like the consecutive test is being executed from the cache as it takes the same amount of time to run.

cache.mp4

@gladyshcodes
Copy link
Contributor Author

gladyshcodes commented Dec 29, 2024

@m2rads Hmm that's weird. I have just tried running several tests and it worked fine for me. I didn't try it with new dashboard test though. I will see now and let you know

shortest("Open Google"); runs:

Screenshot 2024-12-29 at 23 19 05

@m2rads
Copy link
Contributor

m2rads commented Dec 29, 2024

I see. The new test needs setup with Mailosaur. Maybe try a more complicated test that has multiple steps and see if this issue happens.

@gladyshcodes
Copy link
Contributor Author

@m2rads I have tried with this test, it works:

shortest("Open first article on Hackernews");

Screenshot 2024-12-29 at 23 33 15 Screenshot 2024-12-29 at 23 33 22

@m2rads
Copy link
Contributor

m2rads commented Dec 30, 2024

@m2rads I have tried with this test, it works:

shortest("Open first article on Hackernews");

Screenshot 2024-12-29 at 23 33 15 Screenshot 2024-12-29 at 23 33 22

Hhhm seems like this happens when you have multiple test cases. Can you try with more than one test case?

@gladyshcodes
Copy link
Contributor Author

@m2rads Thank you for caching the bug! I've realized the problem was caused by several factors:

  1. Missing sleep time after each browserTool.execute. Similar to how it's done in runner, we need to address this. However, I believe we should eliminate such code altogether since it affects the performance of both cached and uncached tests. Sometimes, waiting less than a second is sufficient for the next action, while other times, potentially, a longer wait might be needed
  2. For dashboard.test.ts:
  • I managed to get it running by referencing to the source code of Feature: Email Validation with Mailosaur #183, as the relevant documentation appears to be missing. It would be helpful under the services configuration section section. Also, Clerk configuration needs updating: you’ll need to disable the “Require the same device and browser” checkmark in the Clerk dashboard for the tests with mailosaur to work (at least in my case)
  • Issue itself was the difference in components from email letter as their href attrs differ every time. We now recursively clean up href attributes from the tags. The idea that if the URL changes and we need to click the link, the next step will fail

I ran several tests and they pass now:

Screenshot 2024-12-30 at 17 13 21

Perf boost: 5.6x

@slavingia If you have time, take a look as well, maybe any suggestions?

@slavingia
Copy link
Contributor

Merged in auto fix which led to some new failures

Copy link
Contributor

@slavingia slavingia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments, looks great otherwise!

@slavingia slavingia merged commit c43a054 into anti-work:main Dec 30, 2024
4 of 5 checks passed
@slavingia
Copy link
Contributor

Bug report:

however I've removed the button I was testing and it still pass

maybe I did something wrong, my test is just this:
shortest("Open the home page and signin via email").expect("A message to check your email");

what I did now is that I removed the "Signing via Email" link and it still pass

it seems the second time I run the tests they are run against the screenshots but don't hit the server

You can see the video here: https://x.com/madarco/status/1874141497404363128

I guess there needs to be some logic to destroy the cache, so that the spec starts to fail. We'll need to think more on the best way to do that dynamically; we can start by making it very easy (one command) to nuke the cache?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants