Remove problematic content slicing in test output parsing #6

ryanhoangt · 2025-06-16T11:01:03Z

Reference Issues/PRs

What does this implement/fix? Explain your changes.

When running patch eval on Modal, I see that for some instances the captured test output is written out of order, which causes the test summary to fall outside the `>>>>> Start Test Output` and `>>>>> End Test Output` markers. I have attached a sample log file below.

test_output_astropy__astropy-12907.txt

This PR removes the content slicing line and uses the whole file content for parsing.
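To illustrate the failure mode described above, here is a minimal sketch, not the harness's actual code. The two marker strings come from this description; the helper names `slice_between_markers` and `parse_status_lines`, and the toy log, are hypothetical and exist only for this example.

```python
START_MARKER = ">>>>> Start Test Output"
END_MARKER = ">>>>> End Test Output"


def slice_between_markers(content: str) -> str:
    """Hypothetical stand-in for the removed slicing step:
    keep only the text between the two markers."""
    return content.split(START_MARKER)[1].split(END_MARKER)[0]


def parse_status_lines(content: str) -> dict[str, str]:
    """Toy parser (an assumption, not the real log parser):
    collect lines of the form 'PASSED <name>' or 'FAILED <name>'."""
    statuses = {}
    for line in content.splitlines():
        parts = line.strip().split(maxsplit=1)
        if len(parts) == 2 and parts[0] in {"PASSED", "FAILED"}:
            statuses[parts[1]] = parts[0]
    return statuses


# A made-up log where the summary was written after the end marker,
# mimicking the out-of-order capture reported on Modal.
out_of_order_log = "\n".join(
    [
        START_MARKER,
        "collected 2 items",
        END_MARKER,
        "PASSED test_a",
        "FAILED test_b",
    ]
)

sliced_result = parse_status_lines(slice_between_markers(out_of_order_log))
whole_result = parse_status_lines(out_of_order_log)
```

With slicing, the summary lines are dropped and `sliced_result` is empty; parsing the whole file recovers both statuses in `whole_result`. The trade-off is that parsing the whole file may also pick up matching lines printed outside the markers.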

Any other comments?

🧡 Thanks for contributing!

Some of the repo_setup.sh scripts leave the working tree in a dirty state, which can make it difficult to generate a patch that applies cleanly. This change commits any outstanding changes so that any patch generated with `git diff` will apply cleanly to a newly launched container during the evaluation step.


* Support multilingual evaluation * CI: Fix documentation building vs deploying * Minor fixes * Remove some redundancy * Update dataset ref --------- Co-authored-by: Kilian Lieret <[email protected]> Co-authored-by: John Yang <[email protected]>


SWE-bench#417) * fix(build): fix python base images requirement types-setuptools incorrect version when replacing * Update clean_requirements and clean_environment_yml patterns to remove version specs safely --------- Co-authored-by: baixuran <[email protected]> Co-authored-by: carlose <[email protected]>

sedrick-keh-tri and others added 30 commits March 24, 2025 23:28

catch-all exception for docker pull (SWE-bench#366)

71af697

Simplify installation guidelines for inference submodule

e19aabe

Fixes SWE-bench#368

ae22bf4

Fix/missing text column (SWE-bench#376)

c8d7763

* Simplify installation guidelines for inference submodule * Fixes SWE-bench#368 * Update version

Merge remote-tracking branch 'refs/remotes/origin/main'

0c0de95

Docs (SWE-bench#381)

9b9b9d3

* add docs * Add leaderboard * Remove unused import * Update docs * Update version * Update: docs

deploy docs

65237b8

Update pytest workflow python version

de31aa5

Update docs link

3f01bd6

Update README.md

6c36e5d

Doc: Add links to other github repos (SWE-bench#387)

9f836d4

thx KL

Doc: Fix closing div

11710a2

CI: Remove griffe-pydantic from mkdocs extensions (SWE-bench#391)

f98dd10

Fix loading of jsonl data (SWE-bench#390)

fea293e

Closes SWE-bench#389

Fix: the pbar doesn't update immediately when futures fail unless the…

35a4152

…re are subsequent successful futures. (SWE-bench#370)

fix: preserve all issue references with same keyword in PRs (SWE-benc…

3fd9e87

…h#358) * fix: preserve all issue references with same keyword in PRs * Modified extract_resolved_issues to use a set instead of list to store references

Add test for PR SWE-bench#358

124897d

Update README.md (SWE-bench#388)

b627f5c

Co-authored-by: John Yang <[email protected]>

Urgent, there is a bug when generate prompt_col in create_text_datase…

d47ae07

…t.py SWE-bench#368 (SWE-bench#369) * fix prompt_col from text_inputs to text * update log --------- Co-authored-by: changqingai <[email protected]>

Minor SWE-bench#369 fix

a525e96

Release 4.0.1

af0938c

Release 4.0.2

35fee16

Release 4.0.3

547035d

Doc: Update inference.md (SWE-bench#397)

c6e7858

Match the documentation for installing additional dependencies with the contents of `pyproject.toml`

remove problematic content slicing

6a83d74

skip content slicing on Modal only

b8ffb7b

Fix docs

e3a6d5b

Update PR template

0c8d9f5

carlosejimenez and others added 18 commits June 1, 2025 21:07

Add more informative log for prepare_images script

54426ef

fixes SWE-bench#58 - overestimating recall for small k (SWE-bench#409)

fe2d3d1

Update namespace arg type for argparser to do null-conversion implicitly

1be8dbb

Add clean step for requirements / environments for python constants per

6a932bc

SWE-bench#399

Update README.md

aef44cb

fixes SWE-bench#421 - fix mention of private dockerhub link in readme (…

13c622a

…SWE-bench#422)

Fix: Allow empty namespace correctly. Ref SWE-bench#423 (SWE-bench#424)

9d79b3e

update modal docs + fix deprecations (SWE-bench#426)

dcf99be

Fix mm dev p5js log parsing

cd60a86

Fix chartjs log parsing

42703a6

support return exceptions (SWE-bench#431)

fc9161e

CI: Queue up doc pushes (SWE-bench#428)

2bf15e1

This action fails if more than one instance is running at the same time (which happens if you merge multiple PRs in quick succession). The fix is to disable concurrency so that runs queue up instead.

Fix docs for prediction format

e32ac10

Add tests for harness utils

63dce46

Release v4.0.4

c7c22a9

Merge branch 'main' into fix-modal-patch-eval

03846bf

add extra validation for make_run_report

aa0f1ed


16 participants
