Added support for purposeful failing test cases #528
Open: DyanB wants to merge 3 commits into `main` from `488-purposeful-fail-tests`
```diff
@@ -51,6 +51,7 @@
 TESTFILES_DST = LIND_ROOT / "testfiles"
 DETERMINISTIC_PARENT_NAME = "deterministic"
 NON_DETERMINISTIC_PARENT_NAME = "non-deterministic"
+FAIL_PARENT_NAME = "fail"
 EXPECTED_DIRECTORY = Path("./expected")
 SKIP_TESTS_FILE = "skip_test_cases.txt"
```
```diff
@@ -65,7 +66,12 @@
     "Lind_wasm_Segmentation_Fault": "Lind Wasm Segmentation Failure",
     "Lind_wasm_Timeout": "Timeout During Lind Wasm run",
     "Unknown_Failure": "Unknown Failure",
-    "Output_mismatch": "C Compiler and Wasm Output mismatch"
+    "Output_mismatch": "C Compiler and Wasm Output mismatch",
+    "Fail_native_succeeded": "Fail Test: Native Succeeded (Should Fail)",
+    "Fail_wasm_succeeded": "Fail Test: Wasm Succeeded (Should Fail)",
+    "Fail_both_succeeded": "Fail Test: Both Native and Wasm Succeeded (Should Fail)",
+    "Fail_native_compiling": "Fail Test: Native Compilation Failure (Should Succeed)",
+    "Fail_wasm_compiling": "Fail Test: Wasm Compilation Failure (Should Succeed)"
 }

 # ----------------------------------------------------------------------
```
```diff
@@ -440,10 +446,82 @@ def run_compiled_wasm(wasm_file, timeout_sec=DEFAULT_TIMEOUT):
 # TODO: Currently for non deterministic cases, we are only compiling and running the test case, success means the compiled test case ran, need to add more specific tests
 #
 def test_single_file_unified(source_file, result, timeout_sec=DEFAULT_TIMEOUT, test_mode="deterministic"):
-    """Unified test function for both deterministic and non-deterministic tests"""
+    """Unified test function for deterministic, non-deterministic, and failing tests"""
     source_file = Path(source_file)
     handler = TestResultHandler(result, source_file)

+    # For fail tests, we need to run both native and wasm
+    if test_mode == "fail":
+        # Run native version
+        native_success, native_output, native_retcode, native_error = compile_and_run_native(source_file, timeout_sec)
+
+        # NOTE: We explicitly early-abort here and report the native compilation failure
+        # rather than treating it as a successful "fail-test".
+        if native_error == "Failure_native_compiling":
+            # Record this specifically as a fail-test native-compilation error so it is
+            # counted alongside other `Fail_*` test categories instead of the generic
+            # compilation error bucket used elsewhere.
+            failure_info = (
+                "=== FAILURE: Native compilation failed during fail-test (expected runtime failure) ===\n"
+                f"Native output:\n{native_output}"
+            )
+            add_test_result(result, str(source_file), "Failure", "Fail_native_compiling", failure_info)
+            return
+
+        # Compile and run WASM
+        wasm_file, wasm_compile_error = compile_c_to_wasm(source_file)
+        if wasm_file is None:
+            # Record this specifically as a fail-test WASM-compilation error so it is
+            # counted alongside other `Fail_*` test categories instead of the generic
+            # Lind_wasm_compiling bucket used elsewhere.
+            failure_info = (
+                "=== FAILURE: Wasm compilation failed during fail-test (expected runtime failure) ===\n"
+                f"Wasm compile output:\n{wasm_compile_error}"
+            )
+            add_test_result(result, str(source_file), "Failure", "Fail_wasm_compiling", failure_info)
+            return
+
+        try:
+            wasm_retcode, wasm_output = run_compiled_wasm(wasm_file, timeout_sec)
+
+            # Normalize return codes for comparison
+            native_failed = native_retcode != 0
+
+            # Check if wasm_retcode is an integer or string
+            if isinstance(wasm_retcode, str):
+                wasm_failed = wasm_retcode in ["timeout", "unknown_error"]  # Explicitly check for failure strings
+            else:
+                wasm_failed = wasm_retcode != 0
```
> **Review comment (Member):** This works, but I'd tighten it up in case later on someone forgets why string return codes mean "failed."
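In the spirit of that comment, one hypothetical way to tighten the check is to name the sentinel values so the reason string return codes count as failures is self-documenting. The helper and constant names below are illustrative sketches, not part of the PR:

```python
# run_compiled_wasm() returns either an int exit code or a sentinel string
# such as "timeout"; naming the sentinels documents why strings mean failure.
# WASM_FAILURE_SENTINELS and wasm_run_failed are illustrative names.
WASM_FAILURE_SENTINELS = frozenset({"timeout", "unknown_error"})

def wasm_run_failed(wasm_retcode) -> bool:
    """True if a wasm run should be treated as failed."""
    if isinstance(wasm_retcode, str):
        # String retcodes are error markers, never successful exits.
        return wasm_retcode in WASM_FAILURE_SENTINELS
    return wasm_retcode != 0
```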
```diff
+            # Both should fail for this test to pass
+            if native_failed and wasm_failed:
+                # Success: both failed as expected
+                output_info = (
+                    f"Native exit code: {native_retcode}\n"
+                    f"Wasm exit code: {wasm_retcode}\n"
+                    "Both failed as expected."
+                )
+                handler.add_success(output_info)
+            elif not native_failed and not wasm_failed:
+                # Both succeeded when they should have failed
+                failure_info = build_fail_message("both", native_output, wasm_output, native_retcode, wasm_retcode)
+                add_test_result(result, str(source_file), "Failure", "Fail_both_succeeded", failure_info)
+            elif not native_failed:
+                # Only native succeeded
+                failure_info = build_fail_message("native_only", native_output, wasm_output, native_retcode, wasm_retcode)
+                add_test_result(result, str(source_file), "Failure", "Fail_native_succeeded", failure_info)
+            else:
+                # Only wasm succeeded
+                failure_info = build_fail_message("wasm_only", native_output, wasm_output, native_retcode, wasm_retcode)
+                add_test_result(result, str(source_file), "Failure", "Fail_wasm_succeeded", failure_info)
+
+        finally:
+            # Always clean up WASM file
+            if wasm_file and wasm_file.exists():
+                wasm_file.unlink()
+
+        return  # Exit early for fail tests
+
 # For deterministic tests, get expected output
 expected_output = None
 if test_mode == "deterministic":
```
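The branch ladder above amounts to a four-row decision table: a fail-test passes only when both runs fail. A minimal standalone sketch of that table (the helper name is illustrative; the category strings match the `Fail_*` keys added earlier in the diff):

```python
def classify_fail_test(native_failed: bool, wasm_failed: bool) -> str:
    """Map (native_failed, wasm_failed) to the outcome the harness records."""
    if native_failed and wasm_failed:
        return "Success"                    # both failed as expected
    if not native_failed and not wasm_failed:
        return "Fail_both_succeeded"        # neither failed
    if not native_failed:
        return "Fail_native_succeeded"      # only native succeeded
    return "Fail_wasm_succeeded"            # only wasm succeeded
```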
```diff
@@ -453,9 +531,9 @@ def test_single_file_unified(source_file, result, timeout_sec=DEFAULT_TIMEOUT, t
         return

     # Compile and run WASM
-    wasm_file, compile_err = compile_c_to_wasm(source_file)
+    wasm_file, wasm_compile_error = compile_c_to_wasm(source_file)
     if wasm_file is None:
-        handler.add_compile_failure(compile_err)
+        handler.add_compile_failure(wasm_compile_error)
         return

     try:
```
```diff
@@ -491,6 +569,9 @@ def test_single_file_deterministic(source_file, result, timeout_sec=DEFAULT_TIME
 def test_single_file_non_deterministic(source_file, result, timeout_sec=DEFAULT_TIMEOUT):
     test_single_file_unified(source_file, result, timeout_sec, "non_deterministic")

+def test_single_file_fail(source_file, result, timeout_sec=DEFAULT_TIMEOUT):
+    test_single_file_unified(source_file, result, timeout_sec, "fail")
+
 # ----------------------------------------------------------------------
 # Function: analyze_testfile_dependencies
 #
```
```diff
@@ -1141,9 +1222,50 @@ def run_tests(config, artifacts_root, results, timeout_sec):
             test_single_file_deterministic(dest_source, results["deterministic"], timeout_sec)
         elif parent_name == NON_DETERMINISTIC_PARENT_NAME:
             test_single_file_non_deterministic(dest_source, results["non_deterministic"], timeout_sec)
+        elif parent_name == FAIL_PARENT_NAME:
+            test_single_file_fail(dest_source, results["fail"], timeout_sec)
         else:
-            # Log warning for tests not in deterministic/non-deterministic folders
-            logger.warning(f"Test file {original_source} is not in a deterministic or non-deterministic folder - skipping")
+            # Log warning for tests not in deterministic/non-deterministic/fail folders
+            logger.warning(f"Test file {original_source} is not in a deterministic, non-deterministic, or fail folder - skipping")

+def build_fail_message(case: str, native_output: str, wasm_output: str, native_retcode=None, wasm_retcode=None) -> str:
+    """
+    Build a consistent failure message for fail-tests.
+
+    Args:
+        case: One of "both", "native_only", "wasm_only" describing which succeeded.
+        native_output: Captured native stdout/stderr text.
+        wasm_output: Captured wasm stdout/stderr text.
+        native_retcode: Native return code (optional, included where helpful).
+        wasm_retcode: Wasm return code (optional, included where helpful).
+
+    Returns:
+        A formatted failure string.
+    """
+    if case == "both":
+        return (
+            "=== FAILURE: Both Native and Wasm succeeded when they should fail ===\n"
+            f"Native output:\n{native_output}\n\n"
+            f"Wasm output:\n{wasm_output}"
+        )
+    elif case == "native_only":
+        return (
+            "=== FAILURE: Native succeeded when it should fail ===\n"
+            f"Native output:\n{native_output}\n\n"
+            f"Wasm failed with exit code {wasm_retcode}:\n{wasm_output}"
+        )
+    elif case == "wasm_only":
+        return (
+            "=== FAILURE: Wasm succeeded when it should fail ===\n"
+            f"Wasm output:\n{wasm_output}\n\n"
+            f"Native failed with exit code {native_retcode}:\n{native_output}"
+        )
+    else:
+        return (
+            "=== FAILURE: Unexpected fail-test result ===\n"
+            f"Native (rc={native_retcode}) output:\n{native_output}\n\n"
+            f"Wasm (rc={wasm_retcode}) output:\n{wasm_output}"
+        )
+
 def main():
     os.chdir(LIND_WASM_BASE)
```
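To illustrate the message shape, this sketch inlines a trimmed copy of the `"both"` branch of `build_fail_message` (the real function lives in the test runner; the copy here is only for demonstration):

```python
def build_fail_message_both(native_output: str, wasm_output: str) -> str:
    # Trimmed, illustrative copy of build_fail_message's "both" branch.
    return (
        "=== FAILURE: Both Native and Wasm succeeded when they should fail ===\n"
        f"Native output:\n{native_output}\n\n"
        f"Wasm output:\n{wasm_output}"
    )

msg = build_fail_message_both("native ran fine", "wasm ran fine")
print(msg.splitlines()[0])  # header line of the failure report
```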
```diff
@@ -1184,7 +1306,8 @@ def main():

     results = {
         "deterministic": get_empty_result(),
-        "non_deterministic": get_empty_result()
+        "non_deterministic": get_empty_result(),
+        "fail": get_empty_result()
     }

     # Prepare artifacts root
```
3 files renamed without changes.
> **Review comment (Member):** Nice early-abort. One thought: this means a fail-test doesn't expect compile errors. Fine, but maybe worth a note, since the semantics here are "expected failure = runtime failure," not "compile must fail."