Create directory to group native samples #2623

hcnpeiris · 2025-03-11T13:35:36Z

Create directory to group native samples

Fixes capa#1787
Depends on: mandiant/capa-testfiles#279

After changing the native samples path, I updated the pytest file paths accordingly.

However, one expected failure test (xfailed) unexpectedly passed.
I ran pytest on the latest master branch, and it produced the same result.

Pytest Results After Updating File Paths

Pytest Results for latest master branch

Request for Feedback

Can I get feedback on resolving this issue?

Checklist

No CHANGELOG update needed

No new tests needed

No documentation update needed

gemini-code-assist

Hello @hcnpeiris, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request addresses issue capa#1787 by creating a directory structure to group native samples within the test data. The primary change involves modifying file paths in tests/fixtures.py to reflect the new directory structure, which includes segregating samples into benign, malware, and pma_labs subdirectories. Additionally, file paths in tests/test_binexport_accessors.py, tests/test_elffile_features.py, and tests/test_scripts.py have been updated to align with the new directory structure. The author notes that one expected failure test unexpectedly passed after the file path updates and is seeking feedback on this issue.

Highlights

- Directory Restructuring: Creates a new directory structure under tests/data to group native samples into benign, malware, and pma_labs subdirectories.
- File Path Updates: Updates file paths in tests/fixtures.py, tests/test_binexport_accessors.py, tests/test_elffile_features.py, and tests/test_scripts.py to reflect the new directory structure.
- Unexpected Test Result: An expected failure test unexpectedly passed after the file path updates, requiring further investigation.

Changelog

tests/fixtures.py
- Modified search paths for binexport2 extractor to include data/malware.
- Updated file paths for various test samples to reflect the new directory structure (e.g., moving mimikatz to data/benign, kernel32 to data/malware, and PMA labs to data/pma_labs).
- Updated file paths for various malware samples.
tests/test_binexport_accessors.py
- Updated the file path for the mimikatz binexport extractor to point to the new location in data/benign.
tests/test_elffile_features.py
- Updated sample paths for ELF files to reflect the new location in data/malware.
tests/test_scripts.py
- Updated the binary file path and binexport2 file path to reflect the new locations in data/malware and data/benign respectively.
- Updated the CAPA_SAMPLES_DIR environment variable to point to the data/benign directory for binexport scripts.
- Updated the source file path for bulk processing tests to reflect the new location in data/malware.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

The tests did run, the paths aligned,
A structure new, carefully designed.
But a test that failed, now sees the light,
A puzzle to solve, both day and night.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

The pull request aims to improve the organization of test files by grouping native samples into a dedicated directory. The changes primarily involve updating file paths in the test suite. The author has identified an unexpected test pass and is seeking feedback on resolving it. Overall, the changes seem reasonable, but the identified test issue needs further investigation.

Summary of Findings

Inconsistent file categorization: The categorization of files as 'benign' or 'malware' seems inconsistent. For example, mimikatz is categorized as benign, which might not be accurate in all contexts. Review the categorization to ensure it aligns with the intended use case.
Unexpected test pass: The pull request description mentions that one expected failure test unexpectedly passed. While the author notes that this also occurs on the latest master branch, it's important to investigate and address this issue to ensure tests are behaving as expected.

Merge Readiness

The pull request should not be merged until the unexpected test pass is investigated and resolved. While the file path updates seem correct, the test results indicate a potential underlying issue that needs to be addressed. I am unable to approve this pull request, and recommend that others review and approve this code before merging. At a minimum, the high severity issue should be addressed before merging.

gemini-code-assist · 2025-03-11T13:35:58Z

tests/fixtures.py

    if name == "mimikatz":
-        return CD / "data" / "mimikatz.exe_"
+        return CD / "data" / "benign" / "mimikatz.exe_"


Is mimikatz truly benign? While it can be used for legitimate purposes, it's often associated with malicious activity. Consider whether it should be categorized as malware or have a separate 'potentially unwanted program' category.
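If the category of a sample is likely to change, a single lookup table would keep recategorization to a one-line edit. This is a rough sketch under assumptions: `DATA_DIR`, `SAMPLE_REGISTRY`, and `sample_path_sketch` are hypothetical names that do not exist in tests/fixtures.py, and only the mimikatz entry mirrors the diff above.

```python
from pathlib import Path

# Assumption: tests are run from the repository root; the real fixtures
# build paths from a CD constant instead.
DATA_DIR = Path("tests") / "data"

# Hypothetical registry: sample name -> (category directory, file name).
# Only the mimikatz entry mirrors the diff above.
SAMPLE_REGISTRY = {
    "mimikatz": ("benign", "mimikatz.exe_"),
}


def sample_path_sketch(name: str) -> Path:
    """Resolve a sample name to its path under the grouped layout."""
    category, filename = SAMPLE_REGISTRY[name]
    return DATA_DIR / category / filename
```

Moving mimikatz to another category would then mean changing only the first element of its tuple.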

gemini-code-assist · 2025-03-11T13:35:59Z

tests/test_binexport_accessors.py

@@ -593,7 +593,7 @@ def test_pattern_matching_not_stack():
    assert match_address_with_be2(BE2_EXTRACTOR_687, queries, 0x107918) is None


-BE2_EXTRACTOR_MIMI = fixtures.get_binexport_extractor(CD / "data" / "binexport2" / "mimikatz.exe_.ghidra.BinExport")
+BE2_EXTRACTOR_MIMI = fixtures.get_binexport_extractor(CD / "data" / "benign" / "binexport2" / "mimikatz.exe_.ghidra.BinExport")


Similar to the previous comment, is mimikatz truly benign in this context? Ensure the categorization aligns with the test's purpose.

gemini-code-assist · 2025-03-11T13:35:59Z

tests/test_scripts.py

@@ -96,7 +96,7 @@ def test_scripts(script, args):
 )
 def test_binexport_scripts(script, args):
    # define sample bytes location
-    os.environ["CAPA_SAMPLES_DIR"] = str(Path(CD / "data"))
+    os.environ["CAPA_SAMPLES_DIR"] = str(Path(CD / "data" / "benign" ))


Why is CAPA_SAMPLES_DIR set to the benign directory here? Does this mean that the binexport scripts are only intended to be run on benign samples?
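Whichever directory is chosen, assigning to `os.environ` directly leaves the value set for the rest of the test process. A minimal sketch of scoping the variable instead, using the standard library's `unittest.mock.patch.dict`; `SAMPLES_DIR`, `run_with_samples_dir`, and `read_samples_dir` are hypothetical names, and the path assumes the layout proposed in this pull request.

```python
import os
from pathlib import Path
from unittest import mock

# Assumption: the grouped layout from this pull request, relative to the repo root.
SAMPLES_DIR = Path("tests") / "data" / "benign"


def run_with_samples_dir(samples_dir, func):
    """Call func() with CAPA_SAMPLES_DIR set, restoring os.environ afterwards."""
    with mock.patch.dict(os.environ, {"CAPA_SAMPLES_DIR": str(samples_dir)}):
        return func()


def read_samples_dir():
    """Hypothetical stand-in for a script that reads the variable."""
    return os.environ["CAPA_SAMPLES_DIR"]
```

Calling `run_with_samples_dir(SAMPLES_DIR, read_samples_dir)` returns the directory as a string, and the previous environment is restored once the call finishes.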

hcnpeiris · 2025-03-17T04:17:04Z

@mr-tz
I changed the test file paths in the test scripts to match capa-testfiles PR #279.

Here is the pytest result. However, tests/test_scripts.py::test_proto_conversion, which was expected to fail, passed.
I also checked pytest against the latest master branch, and it shows the same result.

Can I get feedback on this?

mr-tz

Looks good, thank you.

mr-tz · 2025-03-20T18:55:01Z

I'll take another closer look and think about the naming. Maybe we add OS or arch but I'd like to keep it simple.

hcnpeiris · 2025-03-27T04:14:14Z

Hi @mr-tz, just checking in — any updates on the PR or anything you’d like me to adjust?

mr-tz · 2025-03-27T08:09:25Z

Hey, sorry it may take a bit on my end. Thanks for the ping.

Reset rules and tests/data to upstream/master and restored modified test files

c2f5df8

gemini-code-assist bot reviewed Mar 11, 2025

gemini-code-assist bot suggested changes Mar 11, 2025

hcnpeiris and others added 6 commits March 12, 2025 11:41

Merge branch 'mandiant:master' into fix/pytest-file-rearrange

97af695

Apply black formatting

71dcb21

rearnage pytest path

02c4c37

Rename file path

92672bf

Apply Black formatting to test scripts

b5127dc

Merge branch 'mandiant:master' into fix/pytest-file-rearrange

293cc89

Merge branch 'mandiant:master' into fix/pytest-file-rearrange

0fa9da3

hcnpeiris mentioned this pull request Mar 20, 2025

create directory to group native samples mandiant/capa-testfiles#279

Open

hcnpeiris marked this pull request as ready for review March 20, 2025 11:26

mr-tz reviewed Mar 20, 2025

Merge branch 'mandiant:master' into fix/pytest-file-rearrange

b2e1b4a
