getappmap · kgilpin · Nov 12, 2024 · Nov 12, 2024 · Nov 13, 2024 · Nov 13, 2024
diff --git a/.gitignore b/.gitignore
@@ -172,6 +172,7 @@ notebooks/
 *logs/
 work/
 appmap.log
+tmp/appmap
 
 # Solve
 solve

diff --git a/VERIFY.md b/VERIFY.md
@@ -0,0 +1,61 @@
+The purpose of this issue is to provide instructions on how to verify the open source status and benchmark results for AppMap Navie v2 on the Lite and Verified benchmarks.
+
+## Navie is open source
+
+You can find the benchmark code for Navie V2 here:
+
+[https://github.com/getappmap/navie-benchmark](https://github.com/getappmap/navie-benchmark)
+
+Within that project, there are two git submodules, which are also open source:
+
+* [https://github.com/getappmap/appmap-js/](https://github.com/getappmap/appmap-js/)
+* [https://github.com/getappmap/navie-editor](https://github.com/getappmap/navie-editor)
+
+These three projects completely contain the code of Navie v2.
+
+## Running the benchmark
+
+### General instructions
+
+You'll be using the GitHub Workflow `official.yml` to run the solver. It will generate test patches ("synthetic tests"), code patches ("solutions"), and then evaluate the results.
+
+For best results, use `claude-3-5-sonnet-20241022` and supply the API key through the GitHub Actions environment variable `ANTHROPIC_API_KEY`.
+
+Use the default branch of the repository, which is `swe-bench-2`.
+
+### Instance set option
+
+The primary input that you need to select is the instance set. The `instance_set` option names a ".txt" file that's located in `data/instance_sets`. For example, the instance set `verified_33_pct_1` includes 1/3 of the instances from the Verified set (every 3rd instance). Using instance sets enables you to run the solver more quickly and cheaply than running the entire dataset.
+
+To run a quick "smoke" test, use instance set `smoke`.
+
+To run the entire Verified dataset, use instance set `verified`.
+
+### Other options
+
+- **llm** `claude-3-5-sonnet-20241022`
+- **context_token_limit** `64000` For economy, you can run with a smaller token limit (e.g. `16000`), but you'll lose a couple of percent in the solve rate.
+- **context_token_limit_increase** `20` (default)
+- **temperature_increase** `0.1` (default)
+- **test_patch_solve_threshold** `1` (default)
+- **max_test_solve_iterations** `3` (default)
+- **num_runners** Size these according to the instance set that you use. We recommend using one runner for every 20-30 instances. With this many runners, you can expect the workflow to complete in 1-2 hours.
+- **name** As desired
+
+## Notes
+
+_Evaluation_
+
+If you prefer to use your own evaluation, rather than the code in this fork of swe-bench, you can remove that section from the Workflow.
+
+_Environments other than GitHub Actions_
+
+Of course, you don’t have to use GitHub Actions to run Navie. It’s just easy because it’s all configured. 
+
+You can see from `official.yml` that, aside from building a conda environment and installing some dependencies, it’s necessary to build `submodules/appmap-js` using yarn.
+
+---
+
+Please let me know if you have any questions, or if you would like these instructions in a different format or for a different target system. 
+
+
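The instance-set and runner-sizing guidance in `VERIFY.md` can be illustrated with a small sketch. It is not taken from the repository: `every_nth` and `recommended_runners` are hypothetical helpers, the instance IDs are placeholders, and the default of 25 instances per runner is simply an assumed midpoint of the recommended 20-30 range.

```python
import math


def every_nth(instance_ids: list[str], n: int, offset: int = 0) -> list[str]:
    """Keep every n-th instance ID, starting at position `offset`.

    Hypothetical helper: how the real instance-set files were produced
    is not stated in the source.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return instance_ids[offset::n]


def recommended_runners(num_instances: int, instances_per_runner: int = 25) -> int:
    """Size `num_runners` at roughly one runner per 20-30 instances.

    The default of 25 is an assumed midpoint; at least one runner is returned.
    """
    if instances_per_runner < 1:
        raise ValueError("instances_per_runner must be at least 1")
    return max(1, math.ceil(num_instances / instances_per_runner))


if __name__ == "__main__":
    # Placeholder IDs, not real benchmark instances.
    all_ids = [f"example__repo-{k}" for k in range(9)]
    subset = every_nth(all_ids, 3)
    print(subset)
    print(recommended_runners(len(subset)))
```

Under these assumptions, a one-third subset needs about a third as many runners as the full set to finish in a similar time.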
diff --git a/architecture/code-checkout/.navie/class-diagram/question.md b/architecture/code-checkout/.navie/class-diagram/question.md
@@ -0,0 +1 @@
+@diagram /noprojectinfo /include=\bsolver\b Create a class diagram for the feature "code checkout", using the provided documentation as a guide.
diff --git a/architecture/code-checkout/.navie/dependencyFiles.json b/architecture/code-checkout/.navie/dependencyFiles.json
@@ -0,0 +1,5 @@
+[
+  "solver/checkout_code.py",
+  "solver/harness/build_extended_image.py",
+  "solver/harness/image_store.py"
+]
diff --git a/architecture/code-checkout/.navie/readme/prompt.md b/architecture/code-checkout/.navie/readme/prompt.md
@@ -0,0 +1,14 @@
+## Task
+
+Your task is to document a feature in the style of a software architecture document. Use the
+available project information to document the feature.
+
+Document the feature from a usage point of view, not from an implementation point of view. Focus
+on the design and behavior of the feature as it is used by the end user or other parts of the
+system. Avoid including implementation details in the documentation. Do not provide a code
+breakdown, as the user is not interested in that information.
+
+Do not emit anything before or after the documentation content. Just emit the documentation content.
+
+Avoid using tentative language such as "may", "might", "could", "appears", "likely" etc. Describe
+only what you see from the data.
diff --git a/architecture/code-checkout/.navie/readme/question.md b/architecture/code-checkout/.navie/readme/question.md
@@ -0,0 +1 @@
+@explain /noprojectinfo /include=\bsolver\b Document the feature "code checkout".
diff --git a/architecture/code-checkout/README.md b/architecture/code-checkout/README.md
@@ -0,0 +1,24 @@
+**Feature: Code Checkout**
+
+The "Code Checkout" feature facilitates the creation of a local working copy of the code repository by leveraging a Docker container. This process involves exporting the current state of the version-controlled files from the container to the local file system, capturing the initial code baseline in a local git repository, and ensuring the code is set up for subsequent modifications and executions.
+
+### Feature Overview
+
+- **Container-Based Code Export**: The feature initiates by executing a command in a Docker container to create a compressed `.tar.gz` archive of the current state of the code. This archive is generated using the `git archive` command, emphasizing that the export is derived from a version-controlled git repository within the container.
+
+- **Local Directory Setup**: Prior to extraction, the feature verifies that the designated local directory (`source_dir`) for extracting the code does not already exist. If it does, a `ValueError` is raised to prevent unintentional overwrites. Otherwise, the directory is created.
+
+- **Extraction Process**: The generated archive is copied from the container to the local file system. The content of the archive is then extracted into `source_dir`. This step ensures that the working directory is populated with the latest version-controlled code from the container environment.
+
+- **Local Git Initialization**: After extraction, the feature performs a series of git commands within the local directory to initialize a new git repository. It adds all the extracted files to the staging area and performs an initial commit, labeling it as "Baseline commit". This establishes a baseline from which subsequent modifications can be tracked locally.
+
+### Error Handling
+
+The feature incorporates robust error handling to address potential issues during the checkout process:
+
+- If a failure occurs during the git initialization and commit stages, the error is logged with details of the exception. This ensures transparency and ease of troubleshooting.
+- Regardless of the operation outcome, the process guarantees that the system directory context is reset to its original state by using a `finally` block.
+
+### Usage Context
+
+This feature is a fundamental part of setting up the development workflow by ensuring that the source code is properly initialized with version control in the local environment after being exported from a controlled Docker container. This setup is particularly useful when working with remote environments, allowing developers to have a synchronized and consistent starting point for code development and testing on their local systems.
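The checkout steps described in this README can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `solver/checkout_code.py`: the function names `extract_archive` and `commit_baseline` are hypothetical, the archive is assumed to have been copied out of the container already, and the `run` parameter exists only so the git commands can be swapped out.

```python
import os
import subprocess
import tarfile
from pathlib import Path
from typing import Callable


def extract_archive(archive_path: Path, source_dir: Path) -> None:
    """Refuse to overwrite an existing directory, then unpack the archive."""
    if source_dir.exists():
        raise ValueError(f"Source directory {source_dir} already exists")
    source_dir.mkdir(parents=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(source_dir)


def commit_baseline(
    source_dir: Path,
    log: Callable[[str], None] = print,
    run: Callable[..., object] = subprocess.run,
) -> None:
    """Create a git repository in source_dir and record a baseline commit."""
    original_dir = os.getcwd()
    try:
        os.chdir(source_dir)
        for command in (
            ["git", "init"],
            ["git", "add", "."],
            ["git", "commit", "-m", "Baseline commit"],
        ):
            run(command, check=True)
    except Exception as error:
        # The README says failures are logged; whether the real code also
        # re-raises is not stated, so this sketch only logs.
        log(f"Failed to create baseline commit: {error}")
    finally:
        # Always return to the directory the caller started in.
        os.chdir(original_dir)
```

A caller would invoke `extract_archive(archive_path, source_dir)` and then `commit_baseline(source_dir)`. Running the git commands from inside `source_dir` is what makes the `finally` block necessary.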
diff --git a/architecture/code-checkout/class-diagram.md b/architecture/code-checkout/class-diagram.md
@@ -0,0 +1,34 @@
+
+```mermaid
+classDiagram
+  direction LR
+
+  class CodeCheckout {
+      -log: Callable
+      -container: docker.models.containers.Container
+      -source_dir: Path
+      -tmp_dir: Path
+      +checkout_code(): void
+  }
+
+  class GitOperations {
+      +initialize_git(source_dir: Path): void
+      +add_all_files(source_dir: Path): void
+      +commit_files(message: str): void
+  }
+
+  class DockerOperations {
+      +create_git_archive(container: docker.models.containers.Container, archive_path: Path): void
+      +copy_archive_to_local(container: docker.models.containers.Container, local_path: Path): void
+  }
+
+  class DirectoryManager {
+      +verify_directory_not_exists(path: Path): void
+      +create_directory(path: Path): void
+      +reset_directory_context(original_path: Path): void
+  }
+
+  CodeCheckout --> DockerOperations : uses
+  CodeCheckout --> GitOperations : uses
+  CodeCheckout --> DirectoryManager : uses
+```
diff --git a/architecture/code-solving/.navie/class-diagram/question.md b/architecture/code-solving/.navie/class-diagram/question.md
@@ -0,0 +1 @@
+@diagram /noprojectinfo /include=\bsolver\b Create a class diagram for the feature "code solving", using the provided documentation as a guide.
diff --git a/architecture/code-solving/.navie/dependencyFiles.json b/architecture/code-solving/.navie/dependencyFiles.json
@@ -0,0 +1,4 @@
+[
+  "solver/solve.py",
+  "solver/workflow/generate_code.py"
+]
diff --git a/architecture/code-solving/.navie/readme/prompt.md b/architecture/code-solving/.navie/readme/prompt.md
@@ -0,0 +1,14 @@
+## Task
+
+Your task is to document a feature in the style of a software architecture document. Use the
+available project information to document the feature.
+
+Document the feature from a usage point of view, not from an implementation point of view. Focus
+on the design and behavior of the feature as it is used by the end user or other parts of the
+system. Avoid including implementation details in the documentation. Do not provide a code
+breakdown, as the user is not interested in that information.
+
+Do not emit anything before or after the documentation content. Just emit the documentation content.
+
+Avoid using tentative language such as "may", "might", "could", "appears", "likely" etc. Describe
+only what you see from the data.
diff --git a/architecture/code-solving/.navie/readme/question.md b/architecture/code-solving/.navie/readme/question.md
@@ -0,0 +1 @@
+@explain /noprojectinfo /include=\bsolver\b Document the feature "code solving".
diff --git a/architecture/code-solving/README.md b/architecture/code-solving/README.md
@@ -0,0 +1,38 @@
+## Feature: Code Solving
+
+### Overview
+
+The "Code Solving" feature is designed to automate the generation and optimization of code patches within a specified project. This feature systematically identifies test errors, formulates a plan to address them, and applies code modifications confined to specific files, without altering the testing infrastructure. The ultimate objective is to create an environment where code changes are seamless and optimized for the existing architecture, avoiding disruptions or errors.
+
+### Functionality
+
+1. **Error Analysis**: 
+   - The code solver identifies and presents test errors that need addressing. A structured plan is generated that outlines these errors and how to prevent the corresponding test failures.
+
+2. **Plan and Modify**:
+   - A detailed plan is created for the necessary code modifications, restricting changes to explicitly mentioned files. This ensures that only the targeted areas of the codebase are altered, preserving the integrity of other components.
+
+3. **Patch Generation**:
+   - A new code patch is generated based on the specified plan. The code solver selects optimal code patches using mock functionalities to simulate and validate the generated code's effectiveness in resolving identified issues.
+
+4. **Test Compatibility**:
+   - Ensures compatibility with the test framework utilized, allowing seamless integration and execution of generated code patches using pre-defined command lines for testing. This maintains consistency across project environments.
+
+5. **Execution**:
+   - Utilizes Python's subprocess capabilities to execute commands related to instance sets and solve limits. This automation facilitates the smooth application of generated code patches across the codebase.
+
+6. **Archiving Logs**:
+   - After the code-solving process, logs for the applied patches and predictions are archived. This is instrumental in maintaining records of all code changes and predictions made during the process.
+
+### Design Considerations
+
+- **Environment-Specific Code**: 
+  - The feature is designed to respect the constraints of the specific Python version used by the project, ensuring no incompatibilities or unsupported features are introduced in the codebase.
+
+- **No Direct Testing Suggestions**:
+  - The feature does not provide direct testing recommendations or alterations, as these considerations are handled in a separate step to maintain focus on code optimization.
+
+- **User Interaction**: 
+  - Minimal user intervention is required, other than initiating the code-solving process and inputting necessary parameters such as instance sets and context tokens.
+
+Overall, the Code Solving feature offers a streamlined approach to code optimization within a project, facilitating automatic code patch creation and application while aligning with the existing architectural constraints and testing frameworks.
diff --git a/architecture/code-solving/class-diagram.md b/architecture/code-solving/class-diagram.md
@@ -0,0 +1,42 @@
+
+```mermaid
+classDiagram
+
+class CodeSolver {
+  +solve_errors(test_errors: List~str~): Plan
+  +apply_patch(patch: Patch): bool
+  +execute_cmds(cmds: List~str~)
+  +archive_logs(log_dir: Path)
+}
+
+class Plan {
+  +errors: List~str~
+  +generate_plan(): void
+  +modify_code(files: List~str~): bool
+}
+
+class Patch {
+  +content: str
+  +apply_to(file: str): bool
+}
+
+class Logger {
+  +log_process(step: str, message: str): void
+}
+
+class Environment {
+  +python_version: str
+  +validate_compatibility(): bool
+}
+
+class UserInteraction {
+  +init_code_solver()
+  +input_parameters(instance_set: str, context_tokens: int)
+}
+
+CodeSolver --> Plan
+CodeSolver --> Patch
+CodeSolver --> Logger
+CodeSolver --> Environment
+CodeSolver -- UserInteraction
+```
diff --git a/architecture/collect-appmap-context/.navie/class-diagram/question.md b/architecture/collect-appmap-context/.navie/class-diagram/question.md
@@ -0,0 +1 @@
+@diagram /noprojectinfo /include=\bsolver\b Create a class diagram for the feature "collect appmap context", using the provided documentation as a guide.
diff --git a/architecture/collect-appmap-context/.navie/dependencyFiles.json b/architecture/collect-appmap-context/.navie/dependencyFiles.json
@@ -0,0 +1,9 @@
+[
+  "solver/appmap/appmap.py",
+  "solver/observe_test.py",
+  "solver/solve.py",
+  "solver/workflow/collect_appmap_context.py",
+  "solver/workflow/observe_test.py",
+  "solver/workflow/solution_listener.py",
+  "solver/workflow/solve_code.py"
+]
diff --git a/architecture/collect-appmap-context/.navie/readme/prompt.md b/architecture/collect-appmap-context/.navie/readme/prompt.md
@@ -0,0 +1,14 @@
+## Task
+
+Your task is to document a feature in the style of a software architecture document. Use the
+available project information to document the feature.
+
+Document the feature from a usage point of view, not from an implementation point of view. Focus
+on the design and behavior of the feature as it is used by the end user or other parts of the
+system. Avoid including implementation details in the documentation. Do not provide a code
+breakdown, as the user is not interested in that information.
+
+Do not emit anything before or after the documentation content. Just emit the documentation content.
+
+Avoid using tentative language such as "may", "might", "could", "appears", "likely" etc. Describe
+only what you see from the data.
diff --git a/architecture/collect-appmap-context/.navie/readme/question.md b/architecture/collect-appmap-context/.navie/readme/question.md
@@ -0,0 +1 @@
+@explain /noprojectinfo /include=\bsolver\b Document the feature "collect appmap context".
diff --git a/architecture/collect-appmap-context/README.md b/architecture/collect-appmap-context/README.md
@@ -0,0 +1,35 @@
+## Feature: Collect AppMap Context
+
+### Overview
+
+The "Collect AppMap Context" feature is responsible for extracting context information from AppMap data files. This feature focuses on gathering code context details from `.appmap.json` files to support subsequent processing steps or analyses.
+
+### Usage
+
+The feature is used when AppMap data is collected for code analysis and validation. It serves as part of a broader workflow designed to enhance code understanding and processing. The collection process is typically triggered during the execution of synthetic tests, where AppMap data is generated.
+
+#### Workflow Integration
+
+1. **Test Execution and Observation**: The process begins with the observation and execution of synthetic tests within a controlled environment. This is achieved using a Docker container setup. During the test execution phase, AppMap data files are generated and stored in a specified directory. The `ObserveTest` class and its associated methods manage the test execution and data storage.
+
+2. **AppMap Context Collection**: Once the tests are run and AppMap data is available, the `collect_appmap_context_from_directory` function is invoked. It iterates over the generated `.appmap.json` files and extracts relevant context using the `AppMap` class functionalities. The context primarily includes code locations (filename and line number) and the associated function code.
+
+3. **Handling Data**: The collected AppMap context is maintained within a dictionary structure, where keys represent code locations and values contain the associated function code. This context data is then made available for downstream processes, such as improved code patch generation and validation.
+
+4. **Logging and Error Handling**: The feature includes logging mechanisms to track the status and progress of the context collection process. Errors encountered during data extraction are logged for troubleshooting and resolution.
+
+### Key Components
+
+- **AppMap Class**: The central component responsible for parsing and extracting location data from `.appmap.json` files. It provides the `list_locations` method to enumerate code locations present within the class map of the AppMap data.
+
+- **Collection Functions**: 
+  - `collect_appmap_context_from_directory`: This function initiates the collection of AppMap context from a specified directory containing the AppMap files.
+  - `collect_appmap_context`: Called by the directory function to handle individual AppMap data and populate the result dictionary with location-to-code mappings.
+
+### Benefits
+
+- **Enhanced Code Understanding**: By collecting detailed location and function code information, developers gain better insights into the structure and behavior of the codebase.
+- **Support for Code Analyses**: The available context facilitates various analyses and transformations, enabling more informed code generation and validation processes.
+- **Improved Workflow Efficiency**: Automation of context extraction reduces manual overhead and streamlines the workflow, enhancing overall productivity. 
+
+In summary, the "Collect AppMap Context" feature provides a robust mechanism to extract and maintain code context information, supporting advanced code analyses and improvements. It plays a crucial role in understanding and validating test-generated code efficiently and effectively.
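The location-to-code dictionary described in this README might be built along the following lines. This is a sketch, not the repository's `collect_appmap_context_from_directory`: the names `list_locations` and `collect_context` and the `load_code` callback are hypothetical, and the class-map layout (nested `children` nodes whose `function` entries carry a `location` of the form `path:line`) reflects my understanding of the AppMap JSON format rather than anything stated in the source.

```python
import json
from pathlib import Path
from typing import Callable


def list_locations(appmap_data: dict) -> list[str]:
    """Walk the class map and return each function's "path:line" location."""
    locations: set[str] = set()
    stack = list(appmap_data.get("classMap", []))
    while stack:
        node = stack.pop()
        if node.get("type") == "function" and node.get("location"):
            locations.add(node["location"])
        stack.extend(node.get("children", []))
    return sorted(locations)


def collect_context(
    appmap_dir: Path,
    load_code: Callable[[str], str],
    log: Callable[[str], None] = print,
) -> dict[str, str]:
    """Map every code location found under appmap_dir to its function code.

    `load_code` is a hypothetical callback that returns the source text for
    a location; how the real solver obtains function code is not shown here.
    """
    result: dict[str, str] = {}
    for appmap_file in sorted(appmap_dir.rglob("*.appmap.json")):
        try:
            data = json.loads(appmap_file.read_text())
        except (OSError, json.JSONDecodeError) as error:
            # Log and skip unreadable files instead of aborting the whole run.
            log(f"Skipping {appmap_file}: {error}")
            continue
        for location in list_locations(data):
            if location not in result:
                result[location] = load_code(location)
    return result
```

Keying the dictionary by location means a function that appears in several AppMaps is only loaded once.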
diff --git a/architecture/collect-appmap-context/class-diagram.md b/architecture/collect-appmap-context/class-diagram.md
@@ -0,0 +1,32 @@
+
+```mermaid
+classDiagram
+    class AppMap {
+        +data: dict
+        +__init__(data: Union[str, dict])
+        +list_locations(): List[str]
+    }
+    class ObserveTest {
+        +log
+        +work_dir: Path
+        +test_spec: TestSpec
+        +run(docker_client: docker.DockerClient, test_patch: Patch): Optional[ObserveTestResult]
+    }
+    class Path {
+    }
+    class TestSpec {
+    }
+    class Patch {
+    }
+    class ObserveTestResult {
+        +test_status: TestStatus
+        +appmap_dir: Path
+    }
+    class AppMapContextCollector {
+        +collect_appmap_context_from_directory(log, appmap_dir: Path): dict[str, str]
+        +collect_appmap_context(log, appmap: AppMap, result: dict[str, str]): dict[str, str]
+    }
+    AppMap --> AppMapContextCollector
+    ObserveTest --> AppMapContextCollector
+    ObserveTestResult --> Path
+```
diff --git a/architecture/filter-solutions-from-instance-set/.navie/class-diagram/question.md b/architecture/filter-solutions-from-instance-set/.navie/class-diagram/question.md
@@ -0,0 +1 @@
+@diagram /noprojectinfo /include=\bsolver\b Create a class diagram for the feature "filter solutions from instance set", using the provided documentation as a guide.
diff --git a/architecture/filter-solutions-from-instance-set/.navie/dependencyFiles.json b/architecture/filter-solutions-from-instance-set/.navie/dependencyFiles.json
@@ -0,0 +1,12 @@
+[
+  "solver/filter_solutions_from_instance_set.py",
+  "solver/prepare_predictions.py",
+  "solver/report.py",
+  "solver/solve.py",
+  "solver/solve_loop.py",
+  "solver/workflow/generate_code.py",
+  "solver/workflow/generate_test.py",
+  "solver/workflow/observe_test.py",
+  "solver/workflow/patch.py",
+  "solver/workflow/solve_listener.py"
+]