Modularity and support for a second language (discussion) #142

Open
maxirmx opened this issue Apr 1, 2024 · 9 comments
Labels
question Further information is requested

Comments

@maxirmx
Member

maxirmx commented Apr 1, 2024

Current state of modularity

The existing tebako packager consists of two core modules:

  • libdwarfs
  • tebako

libdwarfs

libdwarfs is a library that provides our implementation of IO functions used by Ruby. This implementation reroutes calls to memfs or host filesystem.

libdwarfs uses the upstream dwarfs project. dwarfs provides a filesystem driver API, and libdwarfs implements an application API on top of it. The 'application API' is a subset of the POSIX API plus some OS-specific functions used by Ruby, as mentioned above.

libdwarfs is

  • conventional, meaning that there is nothing unusual in functions that implement the POSIX API on top of a custom user-space filesystem driver
  • reusable, meaning that the same library can be used to support packaging of multiple languages in addition to Ruby
  • extendable, meaning that if languages other than Ruby require functions not currently supported by libdwarfs, such functions can be added to libdwarfs

These reusability and extendability statements hold for the Linux (GNU, musl) and macOS implementations of tebako.
The Windows version of Ruby implements its own POSIX compatibility layer, so the Windows version of libdwarfs includes a module that implements our version of the Ruby POSIX compatibility layer.
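
The rerouting idea can be illustrated with a small Ruby sketch. This is not libdwarfs code: the function name `backend_for` is hypothetical, and the mount point `/__tebako_memfs__` is assumed from the test paths used later in this thread. It only shows one plausible way to decide whether a path belongs to the packaged memfs or to the host filesystem.

```ruby
# Assumed mount point, taken from the test paths quoted later in this thread.
MEMFS_MOUNT = "/__tebako_memfs__"

# Decide which backend should serve a path (hypothetical helper).
# File.expand_path collapses "." and ".." so that a path such as
# "/__tebako_memfs__/../etc/passwd" is not mistaken for a packaged file.
def backend_for(path, cwd: "/")
  absolute = File.expand_path(path, cwd)
  inside = absolute == MEMFS_MOUNT || absolute.start_with?("#{MEMFS_MOUNT}/")
  inside ? :memfs : :host
end

puts backend_for("/__tebako_memfs__/lib/app.rb")
puts backend_for("/usr/lib/libc.so")
puts backend_for("lib/app.rb", cwd: "/__tebako_memfs__/local")
```

Comparing against the mount point plus a trailing slash avoids treating a sibling directory such as `/__tebako_memfs__x` as part of the memfs.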

tebako

The existing tebako component implements Ruby code patching and drives tebako package builds and rebuilds.
Ruby patching is required to meet two objectives:

  • static ("single file") linking. Theoretically, static linking is supported by Ruby. In practice, the Ruby team has not tested it since version 2.5. This results in issues, and these issues differ from version to version and from platform to platform
  • routing IO calls to libdwarfs.

Tebako itself is

  • unconventional, meaning that the Ruby (or any other language) core team will not endorse or support such patching
  • not reusable, meaning that all research, tests and implementation created for Ruby have little value for other languages
  • not extendable, meaning that it does not look feasible to extend the existing code to support other languages. I believe that for another language we have to develop independent patching and an independent build driver
@maxirmx maxirmx added the question Further information is requested label Apr 1, 2024
@maxirmx
Member Author

maxirmx commented Apr 1, 2024

Suggestions re support of other language (say, Julia)

libdwarfs

  • Move Ruby-Windows-specific code to a separate library. Currently it is isolated in separate source file(s) that are linked into the same library as the portable code. [I do not think it is critical, though, if the Julia version of the library on Windows includes some unused Ruby-specific code]

  • Extend the existing reusable POSIX-compliant or OS-specific code with additional functions if required. The existing library grew this way as new platforms and Ruby versions were added.

  • Create a language-specific library if another language uses an approach similar to Ruby's

tebako

  • Create patching code and build driver independently since they are language specific.
  • There may be some reusable utilities, for example the CLI code or the code that applies patches, but this part is implemented in Ruby. I am not sure whether we want to keep the CLI in Ruby for Julia rather than implement a Julia front-end for Julia.

@maxirmx
Member Author

maxirmx commented Apr 1, 2024

Support of additional Ruby version

Supporting an additional Ruby version means two tasks:

  • Adding our implementation of additional IO functions to libdwarfs as required (unlikely: extending support from 2.6 to 2.7 to 3.0 to 3.1 to 3.2 required only one new function)
  • Modifying the patching (very likely: it requires unpredictable research but usually a small change). Currently, patching for all Ruby versions is implemented in a single module because the version-to-version change is small (<5%)

It would be nice to make patching more modular, but I do not see how to achieve it.
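
One possible direction, sketched here under assumptions rather than taken from the tebako code: keep a shared base patch set and layer small version-specific overrides on top, selected with RubyGems version requirements. The file names and patch symbols below are placeholders, not the actual tebako patches.

```ruby
# Patches shared by every supported Ruby version (placeholder names).
BASE_PATCHES = {
  "io.c"   => :route_io_to_libdwarfs,
  "file.c" => :route_file_to_libdwarfs
}.freeze

# Version-specific additions or overrides, keyed by a RubyGems-style requirement.
VERSION_OVERRIDES = {
  "< 3.0"  => { "configure" => :static_link_fix_legacy },
  ">= 3.1" => { "configure" => :static_link_fix, "dir.c" => :route_dir_to_libdwarfs }
}.freeze

# Merge every override whose requirement matches the given Ruby version.
def patches_for(ruby_version)
  version = Gem::Version.new(ruby_version)
  VERSION_OVERRIDES.reduce(BASE_PATCHES.dup) do |patches, (requirement, extra)|
    Gem::Requirement.new(requirement).satisfied_by?(version) ? patches.merge(extra) : patches
  end
end

p patches_for("2.7.8")
p patches_for("3.2.4")
```

With this layout the small per-version delta lives in one table, while the shared patches stay in a single place.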

@maxirmx
Member Author

maxirmx commented Apr 1, 2024

Support of additional platform (like FreeBSD)

Supporting an additional platform means two tasks:

  • Adding our implementation of additional IO functions to libdwarfs as required (unlikely: extending support from GNU Linux to musl Linux to macOS required two new functions). Windows is an exception; all other platforms are supported in a similar way
  • Modifying the patching (very likely: it requires unpredictable research but usually a small change). Currently, patching for Linux GNU, Linux musl and macOS is implemented in a single module because the platform-to-platform change is small (<5%).
    Windows is different and requires ~20% of extra patching.

It would be nice to make patching more modular, but I do not see how to achieve it.
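
As a rough illustration of how the platform split could be expressed, the sketch below maps a `host_os` string (as reported by `RbConfig`) to a platform family and then to patch layers. The family names and the `patch_layers` helper are hypothetical; only the observation that Windows needs an extra layer comes from the comment above.

```ruby
require "rbconfig"

# Map a host_os string to a platform family (hypothetical naming).
def platform_family(host_os = RbConfig::CONFIG["host_os"])
  case host_os
  when /linux-musl/       then :linux_musl
  when /linux/            then :linux_gnu
  when /darwin/           then :macos
  when /mingw|mswin|msys/ then :windows
  else raise ArgumentError, "Unsupported platform: #{host_os}"
  end
end

# Windows gets an extra layer on top of the common patches.
def patch_layers(family)
  family == :windows ? %i[common windows] : %i[common]
end

%w[linux-gnu linux-musl darwin23 mingw32].each do |os|
  family = platform_family(os)
  puts "#{os}: #{family} #{patch_layers(family).inspect}"
end
```

A new platform such as FreeBSD would then need one more `when` branch and, if required, its own layer.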

@ronaldtse
Contributor

Separately package individual runtimes

An "individual runtime" here is an interpreter for a particular language at a particular version, e.g. Ruby 3.1, Ruby 3.2, Julia 1.10, Julia 1.9.

Benefits

  • Reusable individual runtimes across multiple Tebako binaries.
  • Modular architecture: enables easier and contained tests; ensures streamlining of API (which are supposed to only be File APIs?) across runtimes.
  • Ability to use a smaller Tebako core that downloads the necessary runtime on first run.
  • Ability to "upgrade" to have the executable use a newer runtime for performance.
  • Works like asdf or rbenv or jenv etc.
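
The "download on first run" behaviour could look roughly like the sketch below. The `RuntimeCache` class and its directory layout are assumptions made for illustration; the block passed to `fetch` stands in for a real download or build step.

```ruby
require "fileutils"
require "tmpdir"

# Resolve a runtime from a local cache, invoking the block only on a cache miss.
class RuntimeCache
  def initialize(root)
    @root = root
  end

  # Returns the runtime directory. The block receives a staging directory,
  # which is renamed into place only after the block finishes without raising,
  # so a failed download never leaves a half-populated runtime behind.
  def fetch(language, version)
    final_dir = File.join(@root, language, version)
    return final_dir if Dir.exist?(final_dir)

    staging_dir = "#{final_dir}.partial"
    FileUtils.rm_rf(staging_dir)
    FileUtils.mkdir_p(staging_dir)
    yield staging_dir
    File.rename(staging_dir, final_dir)
    final_dir
  end
end

Dir.mktmpdir do |root|
  cache = RuntimeCache.new(root)
  fetches = 0
  2.times do
    cache.fetch("ruby", "3.2.4") do |dir|
      fetches += 1
      File.write(File.join(dir, "READY"), "")
    end
  end
  puts "fetches performed: #{fetches}"
end
```

The second call finds the cached directory and skips the block, which is the behaviour tools such as rbenv rely on.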

@maxirmx maxirmx pinned this issue Apr 7, 2024
@ronaldtse
Contributor

To extend Tebako to support other language runtimes like Julia, we need to make several modifications to the existing system. Here's a high-level approach to achieve this:

  1. Generalize the runtime management system
  2. Create language-specific builders and patchers
  3. Update the packaging and execution processes
  4. Modify the configuration files

Let's go through these steps:

  1. Generalize the runtime management system:
module Tebako
  class RuntimeRepository
    def initialize(path)
      @path = path
      @metadata = {}
      load_metadata
    end

    def add_runtime(language, version, os, arch, file_path)
      runtime_key = "#{language}-#{version}-#{os}-#{arch}"
      # ... (similar to previous implementation)
    end

    def get_runtime(language, version, os, arch)
      runtime_key = "#{language}-#{version}-#{os}-#{arch}"
      # ... (similar to previous implementation)
    end

    # ... (other methods)
  end

  class RuntimeBuilder
    def initialize(repository)
      @repository = repository
    end

    def build_runtime(language, version, os, arch)
      builder = get_language_specific_builder(language)
      builder.build(version, os, arch)
    end

    private

    def get_language_specific_builder(language)
      case language
      when 'ruby'
        RubyRuntimeBuilder.new(@repository)
      when 'julia'
        JuliaRuntimeBuilder.new(@repository)
      else
        raise "Unsupported language: #{language}"
      end
    end
  end

  class RuntimeManager
    def initialize(local_repo_path, remote_repo_url = nil)
      @local_repo = RuntimeRepository.new(local_repo_path)
      @remote_repo_url = remote_repo_url
    end

    def ensure_runtime(language, version, os, arch)
      runtime_path = @local_repo.get_runtime(language, version, os, arch)
      return runtime_path if runtime_path

      if @remote_repo_url
        download_runtime(language, version, os, arch)
      else
        build_runtime(language, version, os, arch)
      end
    end

    # ... (other methods)
  end
end
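
The repository methods above are elided. As a hedged illustration only, a minimal repository could persist its index as JSON next to the stored archives. The class name `SketchRuntimeRepository` and the `metadata.json` file name are assumptions for this sketch, not part of Tebako.

```ruby
require "json"
require "fileutils"
require "tmpdir"

# Minimal runtime index backed by a JSON file (illustrative only).
class SketchRuntimeRepository
  def initialize(path)
    @path = path
    @metadata_file = File.join(path, "metadata.json")
    FileUtils.mkdir_p(path)
    @metadata = File.exist?(@metadata_file) ? JSON.parse(File.read(@metadata_file)) : {}
  end

  # Copies the archive into the repository and records it in the index.
  def add_runtime(language, version, os, arch, file_path)
    target = File.join(@path, File.basename(file_path))
    FileUtils.cp(file_path, target)
    @metadata[runtime_key(language, version, os, arch)] = { "file" => File.basename(target) }
    File.write(@metadata_file, JSON.pretty_generate(@metadata))
    target
  end

  # Returns the stored archive path, or nil when the runtime is unknown.
  def get_runtime(language, version, os, arch)
    entry = @metadata[runtime_key(language, version, os, arch)]
    entry && File.join(@path, entry["file"])
  end

  private

  def runtime_key(language, version, os, arch)
    "#{language}-#{version}-#{os}-#{arch}"
  end
end

Dir.mktmpdir do |dir|
  archive = File.join(dir, "julia-1.6.3-linux-x86_64.tar.gz")
  File.write(archive, "stub archive")
  repo = SketchRuntimeRepository.new(File.join(dir, "repo"))
  repo.add_runtime("julia", "1.6.3", "linux", "x86_64", archive)
  puts repo.get_runtime("julia", "1.6.3", "linux", "x86_64")
end
```

Returning `nil` for an unknown runtime matches how `ensure_runtime` above falls through to downloading or building.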
  2. Create language-specific builders and patchers:
require "etc" # provides Etc.nprocessors, used for parallel make below

module Tebako
  class RubyRuntimeBuilder
    def initialize(repository)
      @repository = repository
    end

    def build(version, os, arch)
      # ... (existing Ruby build process)
    end

    private

    def apply_tebako_patches(version)
      # ... (Ruby-specific patches)
    end
  end

  class JuliaRuntimeBuilder
    def initialize(repository)
      @repository = repository
    end

    def build(version, os, arch)
      puts "Building Julia #{version} for #{os} (#{arch})..."
      
      # Clone Julia source
      system("git clone https://github.com/JuliaLang/julia.git -b v#{version} julia-#{version}")
      
      # Apply Tebako patches
      apply_tebako_patches(version)
      
      # Build Julia
      Dir.chdir("julia-#{version}") do
        system("make -j#{Etc.nprocessors}")
        system("make install prefix=#{Dir.pwd}/install")
      end
      
      # Package the built Julia
      output_file = "julia-#{version}-#{os}-#{arch}.tar.gz"
      system("tar -czf #{output_file} -C julia-#{version}/install .")
      
      # Add to repository
      @repository.add_runtime('julia', version, os, arch, output_file)
      
      puts "Julia #{version} for #{os} (#{arch}) built and added to repository."
    end

    private

    def apply_tebako_patches(version)
      # Apply necessary Tebako patches for Julia
      puts "Applying Tebako patches for Julia #{version}..."
      # Implement Julia-specific patches here
    end
  end
end
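
The builders above call `system` without checking its result, so a failed `git clone` or `make` would go unnoticed. A small helper along these lines could make failures explicit; the name `run!` is hypothetical.

```ruby
require "rbconfig"

# Run a command without a shell and raise if it fails or cannot be started.
# Passing the arguments separately also avoids shell interpolation of values
# such as the version string.
def run!(*command, chdir: Dir.pwd)
  ok = system(*command, chdir: chdir)
  return true if ok

  raise "Command failed: #{command.join(' ')}"
end

# Demonstration with the current Ruby interpreter as a harmless command.
run!(RbConfig.ruby, "-e", "exit 0")

begin
  run!(RbConfig.ruby, "-e", "exit 3")
rescue RuntimeError => e
  puts e.message
end
```

In a builder this might be used as `run!("make", "-j#{Etc.nprocessors}", chdir: source_dir)`, where `source_dir` is whatever directory the sources were cloned into.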
  3. Update the packaging and execution processes:

Modify the Tebako::Packager class to handle different languages:

module Tebako
  class Packager
    def package(config)
      language = config['language']
      version = config['version']
      # ... (other configuration options)

      runtime_manager = RuntimeManager.new(LOCAL_REPO_PATH, REMOTE_REPO_URL)
      runtime_path = runtime_manager.ensure_runtime(language, version, os, arch)

      # Package the application with the appropriate runtime
      # ... (packaging logic)
    end
  end

  class Executor
    def execute(package_path)
      metadata = load_metadata(package_path)
      language = metadata['language']
      version = metadata['version']
      # ... (other metadata)

      runtime_manager = RuntimeManager.new(LOCAL_REPO_PATH, REMOTE_REPO_URL)
      runtime_path = runtime_manager.ensure_runtime(language, version, os, arch)

      # Execute the package with the appropriate runtime
      case language
      when 'ruby'
        system("#{runtime_path}/bin/ruby", package_path)
      when 'julia'
        system("#{runtime_path}/bin/julia", package_path)
      else
        raise "Unsupported language: #{language}"
      end
    end
  end
end
  4. Modify the configuration files:

Update the tebako.yaml format to include the language specification:

language: julia
version: 1.6.3
entry_point: main.jl
# ... (other configuration options)
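
A configuration in this format could be loaded and checked as sketched below. The set of required keys and the list of supported languages are assumptions based on the example above, and `load_tebako_config` is a hypothetical helper.

```ruby
require "yaml"

SUPPORTED_LANGUAGES = %w[ruby julia].freeze
REQUIRED_KEYS = %w[language version entry_point].freeze

# Parse and validate a tebako.yaml document given as a String.
def load_tebako_config(yaml_text)
  config = YAML.safe_load(yaml_text) || {}
  raise ArgumentError, "Configuration must be a mapping" unless config.is_a?(Hash)

  missing = REQUIRED_KEYS - config.keys
  raise ArgumentError, "Missing keys: #{missing.join(', ')}" unless missing.empty?

  unless SUPPORTED_LANGUAGES.include?(config["language"])
    raise ArgumentError, "Unsupported language: #{config['language']}"
  end

  # An unquoted two-part version such as 1.10 is parsed by YAML as the Float 1.1,
  # which silently changes its meaning, so only String versions are accepted.
  unless config["version"].is_a?(String)
    raise ArgumentError, "Quote the version value so that it is read as a string"
  end

  config
end

config = load_tebako_config(<<~YAML)
  language: julia
  version: 1.6.3
  entry_point: main.jl
YAML
puts config.inspect
```

A three-part version like `1.6.3` is read as a string, but `1.10` would not be, which is why quoting versions in `tebako.yaml` is the safer convention.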
  5. Update the GitHub Actions workflow:

Modify the build_runtimes.yml to build both Ruby and Julia runtimes:

name: Build and Release Tebako Runtimes

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
  workflow_dispatch:

jobs:
  build:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        language: [ruby, julia]
        # A mapping cannot contain two "include" keys, so all extra values
        # are merged into a single list; each entry adds keys to the
        # matrix combinations it matches
        include:
          - os: ubuntu-latest
            arch: x86_64
          - os: macos-latest
            arch: x86_64
          - os: windows-latest
            arch: x86_64
          - language: ruby
            versions: ['3.1.3', '3.2.4', '3.3.3']
          - language: julia
            versions: ['1.6.3', '1.7.2']

    runs-on: ${{ matrix.os }}

    steps:
    - uses: actions/checkout@v2

    - name: Set up Ruby
      uses: ruby/setup-ruby@v1
      with:
        ruby-version: '3.0'  # Use a stable version for building

    - name: Build Tebako Runtime
      run: |
        ruby -r ./lib/tebako/runtime_manager -e "
          repo = Tebako::RuntimeRepository.new('runtimes')
          builder = Tebako::RuntimeBuilder.new(repo)
          '${{ join(matrix.versions, ',') }}'.split(',').each do |version|
            builder.build_runtime('${{ matrix.language }}', version, '${{ runner.os }}', '${{ matrix.arch }}')
          end
        "

    # ... (rest of the workflow remains similar)

These changes allow Tebako to support multiple language runtimes, including Julia. The system is now more flexible and can be extended to support additional languages in the future by adding new language-specific builders and updating the configuration and execution processes accordingly.

@ronaldtse
Contributor

Tebako could be extended to support various interpreted languages that require a runtime environment. Some languages that could be added to Tebako in the future include:

  1. Python: A widely-used language with a large ecosystem of libraries.

  2. Node.js: For JavaScript and TypeScript applications.

  3. Lua: A lightweight scripting language often used in game development and embedded systems.

  4. R: Popular for statistical computing and data analysis.

  5. Perl: Still used in many legacy systems and for text processing.

  6. PHP: Commonly used for web development.

  7. Tcl: Used in networking, embedded systems, and testing.

  8. Erlang/Elixir: For building scalable and fault-tolerant systems.

  9. Groovy: A dynamic language for the Java platform.

  10. Scala: Another JVM language, combining object-oriented and functional programming.

  11. Haskell: A purely functional programming language.

  12. OCaml: A multi-paradigm programming language.

  13. Go: While typically compiled, its runtime could be packaged for certain use cases.

  14. Racket: A general-purpose, multi-paradigm programming language in the Lisp/Scheme family.

  15. Dart: Used for web, mobile, and desktop application development.

To add these languages, you would need to:

  1. Create a language-specific runtime builder (e.g., PythonRuntimeBuilder, NodeRuntimeBuilder, etc.).
  2. Implement the necessary Tebako patches for each language runtime.
  3. Update the RuntimeBuilder class to support the new language.
  4. Modify the Packager and Executor classes to handle the new language.
  5. Update the GitHub Actions workflow to build and release the new language runtimes.

Here's a sketch of how you might add Python support:

require "etc" # provides Etc.nprocessors, used for parallel make below

module Tebako
  class PythonRuntimeBuilder
    def initialize(repository)
      @repository = repository
    end

    def build(version, os, arch)
      puts "Building Python #{version} for #{os} (#{arch})..."
      
      # Clone Python source
      system("git clone https://github.com/python/cpython.git -b v#{version} python-#{version}")
      
      # Apply Tebako patches
      apply_tebako_patches(version)
      
      # Build Python
      Dir.chdir("python-#{version}") do
        system("./configure --prefix=#{Dir.pwd}/install")
        system("make -j#{Etc.nprocessors}")
        system("make install")
      end
      
      # Package the built Python
      output_file = "python-#{version}-#{os}-#{arch}.tar.gz"
      system("tar -czf #{output_file} -C python-#{version}/install .")
      
      # Add to repository
      @repository.add_runtime('python', version, os, arch, output_file)
      
      puts "Python #{version} for #{os} (#{arch}) built and added to repository."
    end

    private

    def apply_tebako_patches(version)
      puts "Applying Tebako patches for Python #{version}..."
      # Implement Python-specific patches here
    end
  end
end

Then, update the RuntimeBuilder:

class RuntimeBuilder
  # ...

  private

  def get_language_specific_builder(language)
    case language
    when 'ruby'
      RubyRuntimeBuilder.new(@repository)
    when 'julia'
      JuliaRuntimeBuilder.new(@repository)
    when 'python'
      PythonRuntimeBuilder.new(@repository)
    # Add more languages here
    else
      raise "Unsupported language: #{language}"
    end
  end
end

And update the Executor:

class Executor
  def execute(package_path)
    # ...
    case language
    when 'ruby'
      system("#{runtime_path}/bin/ruby", package_path)
    when 'julia'
      system("#{runtime_path}/bin/julia", package_path)
    when 'python'
      # A CPython 3 "make install" provides bin/python3, not an unversioned bin/python
      system("#{runtime_path}/bin/python3", package_path)
    # Add more languages here
    else
      raise "Unsupported language: #{language}"
    end
  end
end

By following this pattern, Tebako can be extended to support a wide range of interpreted languages, making it a versatile tool for packaging and distributing applications written in various programming languages.
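
As an alternative to extending the `case` statements for every language, builders could register themselves in a lookup table. This is a sketch of that design choice, not existing Tebako code; `BuilderRegistry` and `StubPythonBuilder` are hypothetical names.

```ruby
# Registry that maps a language name to its builder class.
module BuilderRegistry
  @builders = {}

  class << self
    def register(language, builder_class)
      @builders[language.to_s] = builder_class
    end

    # Instantiates the builder for a language or raises for unknown languages.
    def builder_for(language, repository)
      builder_class = @builders.fetch(language.to_s) do
        raise ArgumentError, "Unsupported language: #{language}"
      end
      builder_class.new(repository)
    end

    def languages
      @builders.keys.sort
    end
  end
end

# Stand-in builder used only to make the sketch runnable.
class StubPythonBuilder
  attr_reader :repository

  def initialize(repository)
    @repository = repository
  end
end

BuilderRegistry.register("python", StubPythonBuilder)
puts BuilderRegistry.languages.inspect
puts BuilderRegistry.builder_for("python", :some_repository).class
```

With a registry, adding a language means adding one file that defines and registers its builder, without touching the shared dispatch code.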

@ronaldtse
Contributor

To integrate Julia with Tebako using libdwarfs for a memory-based file system, we'll need to make several modifications to the Julia interpreter. Here's an overview of the changes needed and how to manage these patches:

  1. File System Redirections:
    Julia's file system operations need to be redirected to use libdwarfs when accessing files within the packaged application. This primarily involves modifying Julia's I/O subsystem.

    Key areas to patch:

    • src/jl_uv.c: Contains Julia's libuv integration for file operations.
    • src/sys.c: Implements various system-level operations, including file system functions.
    • src/init.c: Initializes Julia's runtime, where we can set up the libdwarfs integration.
  2. Memory Mapping:
    Julia uses memory mapping for efficient file access. We need to modify this to work with the in-memory file system provided by libdwarfs.

    Key area to patch:

    • src/mmap.c: Implements memory mapping functionality.
  3. Module Loading:
    Julia's module system needs to be aware of the in-memory file system for loading packages and modules.

    Key area to patch:

    • src/toplevel.c: Handles module loading and evaluation.
  4. Standard Library Adjustments:
    Some parts of Julia's standard library that interact directly with the file system may need modifications.

    Key areas to patch:

    • base/io.jl
    • base/filesystem.jl
  5. Initialization:
    We need to initialize the libdwarfs system when Julia starts up, before any file operations occur.

    Key area to patch:

    • src/init.c

Here's how we can manage these patches:

  1. Create a Patch Directory:
    In the Tebako project, create a directory structure like:

    patches/
      julia/
        v1.6.3/
        v1.7.2/
        ...
    
  2. Create Patch Files:
    For each Julia version, create separate patch files for each area of modification. For example:

    patches/julia/v1.6.3/
      01-jl_uv.patch
      02-sys.patch
      03-init.patch
      04-mmap.patch
      05-toplevel.patch
      06-stdlib.patch
    
  3. Patch Application Script:
    Create a script that applies these patches during the Julia build process in the JuliaRuntimeBuilder:

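    def apply_tebako_patches(version)
      patch_dir = File.join(PATCH_DIR, "julia", "v#{version}")
      # Apply patches in lexical order inside the cloned source tree and stop on the first failure
      Dir.glob(File.join(patch_dir, "*.patch")).sort.each do |patch_file|
        ok = system("patch", "-p1", "-d", "julia-#{version}", "-i", File.expand_path(patch_file))
        raise "Failed to apply #{patch_file}" unless ok
      end
    end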
  4. Version Control:
    Keep these patches under version control in the Tebako repository. This allows for easy management of different patches for different Julia versions.

  5. Patch Maintenance:
    As new versions of Julia are released, review and update the patches as necessary. You may need to create new patch sets for major Julia versions.

  6. Documentation:
    Maintain documentation explaining each patch, why it's necessary, and any potential implications for Julia's behavior.

Example of a patch (simplified) for src/jl_uv.c:

--- a/src/jl_uv.c
+++ b/src/jl_uv.c
@@ -100,6 +100,9 @@ int jl_fs_open(const char *path, int flags, int mode)
 {
+    if (is_tebako_path(path)) {
+        return tebako_open(path, flags, mode);
+    }
     uv_fs_t req;
     int r = uv_fs_open(NULL, &req, path, flags, mode, NULL);
     uv_fs_req_cleanup(&req);
     return r;
 }

This patch checks if the path is within the Tebako filesystem and redirects to a Tebako-specific open function if so.

By managing patches this way, you can:

  1. Easily apply and update patches for different Julia versions.
  2. Keep track of changes required for Tebako integration.
  3. Quickly rebuild Julia with Tebako support when new versions are released.
  4. Collaborate with other developers on maintaining and improving the Julia integration.

@ronaldtse
Contributor

To patch Julia to use the memory-based file system provided by libdwarfs, we need to focus on several key areas of the Julia codebase. Here are the specific patches needed, organized by the main components that require modification:

  1. File System Operations (src/jl_uv.c):

This file handles low-level file system operations using libuv. We need to intercept these calls and redirect them to libdwarfs when appropriate.

--- a/src/jl_uv.c
+++ b/src/jl_uv.c
@@ -1,5 +1,7 @@
 #include <uv.h>
 #include "julia.h"
+#include "tebako/tebako-io.h"
+#include "tebako/tebako-fs.h"

 int jl_fs_open(const char *path, int flags, int mode)
 {
+    if (within_tebako_memfs(path)) {
+        return tebako_open(path, flags, mode);
+    }
     uv_fs_t req;
     int r = uv_fs_open(NULL, &req, path, flags, mode, NULL);
     uv_fs_req_cleanup(&req);
     return r;
 }

 ssize_t jl_fs_read(int fd, char *data, size_t len)
 {
+    if (is_tebako_file_descriptor(fd)) {
+        return tebako_read(fd, data, len);
+    }
     uv_fs_t req;
     uv_buf_t buf = uv_buf_init(data, len);
     ssize_t r = uv_fs_read(NULL, &req, fd, &buf, 1, -1, NULL);
     uv_fs_req_cleanup(&req);
     return r;
 }

 // Similar modifications for jl_fs_write, jl_fs_close, jl_fs_stat, etc.
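
The `is_tebako_file_descriptor` check implies that the integration layer keeps track of which descriptors it handed out. One way this bookkeeping could work is sketched below in Ruby; the class name and the idea of allocating memfs descriptors from a separate high range are assumptions for illustration, not a description of libdwarfs.

```ruby
# Track descriptors handed out for packaged files so that later
# read/close calls can be dispatched to the right backend.
class MemfsDescriptorTable
  # Assumption: a range far above any descriptor the host OS will issue.
  FIRST_FD = 1_000_000

  def initialize
    @open_files = {}
    @next_fd = FIRST_FD
  end

  # Registers a packaged file and returns its synthetic descriptor.
  def open(path)
    fd = @next_fd
    @next_fd += 1
    @open_files[fd] = path
    fd
  end

  def memfs_descriptor?(fd)
    @open_files.key?(fd)
  end

  # Returns true if the descriptor was open, false otherwise.
  def close(fd)
    !@open_files.delete(fd).nil?
  end
end

table = MemfsDescriptorTable.new
fd = table.open("/__tebako_memfs__/lib/app.jl")
puts table.memfs_descriptor?(fd)
puts table.memfs_descriptor?(3)
```

Keeping the two descriptor ranges disjoint lets the dispatch check be a simple table lookup.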
  2. Memory Mapping (src/mmap.c):

Julia uses memory mapping for efficient file access. We need to modify this to work with the in-memory file system.

--- a/src/mmap.c
+++ b/src/mmap.c
@@ -1,4 +1,6 @@
 #include "julia.h"
+#include "tebako/tebako-io.h"
+#include "tebako/tebako-fs.h"

 void *jl_mmap(void *addr, size_t len, int prot, int flags, int fd, off_t offset)
 {
+    if (is_tebako_file_descriptor(fd)) {
+        return tebako_mmap(addr, len, prot, flags, fd, offset);
+    }
     return mmap(addr, len, prot, flags, fd, offset);
 }

 int jl_munmap(void *addr, size_t len)
 {
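+    /* Assumption: is_tebako_mapping() is a hypothetical address-based check; within_tebako_memfs() takes a path, not an address */
+    if (is_tebako_mapping(addr)) {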
+        return tebako_munmap(addr, len);
+    }
     return munmap(addr, len);
 }
  3. Module Loading (src/toplevel.c):

Modify the module loading system to be aware of the in-memory file system.

--- a/src/toplevel.c
+++ b/src/toplevel.c
@@ -1,4 +1,6 @@
 #include "julia.h"
+#include "tebako/tebako-io.h"
+#include "tebako/tebako-fs.h"

 jl_value_t *jl_load_file_string(const char *text, size_t len, char *filename)
 {
+    if (within_tebako_memfs(filename)) {
+        // Use tebako functions to read the file content
+        char *tebako_content = tebako_read_file(filename, &len);
+        if (tebako_content) {
+            jl_value_t *result = jl_parse_input_line(tebako_content, len, filename, 0);
+            free(tebako_content);
+            return result;
+        }
+    }
     // Existing implementation for non-tebako files
 }
  4. Initialization (src/init.c):

Initialize the libdwarfs system when Julia starts up.

--- a/src/init.c
+++ b/src/init.c
@@ -1,4 +1,6 @@
 #include "julia.h"
+#include "tebako/tebako-io.h"
+#include "tebako/tebako-fs.h"

 void jl_init(void)
 {
+    // Initialize tebako/libdwarfs
+    tebako_init();
+
     // Existing initialization code
 }

 void jl_cleanup(void)
 {
+    // Cleanup tebako/libdwarfs
+    tebako_cleanup();
+
     // Existing cleanup code
 }
  5. Standard Library Adjustments (base/filesystem.jl):

Modify Julia's standard library functions that interact with the file system.

--- a/base/filesystem.jl
+++ b/base/filesystem.jl
@@ -1,8 +1,14 @@
 # filesystem.jl
+import Tebako
+
 function open(filename::AbstractString, mode::AbstractString="r")
+    if Tebako.within_memfs(filename)
+        return Tebako.open(filename, mode)
+    end
     # Existing open implementation
 end

 function read(filename::AbstractString)
+    if Tebako.within_memfs(filename)
+        return Tebako.read(filename)
+    end
     # Existing read implementation
 end

 # Similar modifications for other file operations
  6. Tebako Integration Header (include/julia/tebako.h):

Create a new header file to define the interface between Julia and Tebako/libdwarfs.

// include/julia/tebako.h
#ifndef JULIA_TEBAKO_H
#define JULIA_TEBAKO_H

#include <stdbool.h>
#include <sys/types.h>

bool within_tebako_memfs(const char *path);
bool is_tebako_file_descriptor(int fd);
int tebako_open(const char *path, int flags, int mode);
ssize_t tebako_read(int fd, void *buf, size_t count);
ssize_t tebako_write(int fd, const void *buf, size_t count);
int tebako_close(int fd);
void *tebako_mmap(void *addr, size_t len, int prot, int flags, int fd, off_t offset);
int tebako_munmap(void *addr, size_t len);
void tebako_init(void);
void tebako_cleanup(void);

#endif // JULIA_TEBAKO_H

These patches provide the core modifications needed to integrate libdwarfs with Julia for accessing the memory-based file system. The key points are:

  1. Intercepting file system calls and redirecting them to Tebako/libdwarfs functions when appropriate.
  2. Modifying memory mapping functions to work with the in-memory file system.
  3. Adjusting the module loading system to be aware of the in-memory file system.
  4. Initializing and cleaning up the Tebako/libdwarfs system.
  5. Modifying standard library functions to work with the in-memory file system.

Remember that these patches are conceptual and may need to be adjusted based on the specific versions of Julia and libdwarfs you're working with. You'll also need to implement the Tebako-specific functions (like tebako_open, tebako_read, etc.) to interface with libdwarfs.

@ronaldtse
Contributor

Testing these patches effectively is crucial to ensure that the integration of Tebako and libdwarfs with Julia works correctly and doesn't introduce any regressions. Here's a comprehensive approach to testing these patches:

  1. Unit Tests

Create unit tests for each modified function. These tests should cover both Tebako and non-Tebako paths.

Example for jl_fs_open:

#include <assert.h>  /* assert */
#include <fcntl.h>   /* O_RDONLY */

void test_jl_fs_open() {
    // Test regular file open
    int fd = jl_fs_open("/tmp/test.txt", O_RDONLY, 0);
    assert(fd >= 0);
    jl_fs_close(fd);

    // Test Tebako file open
    fd = jl_fs_open("/__tebako_memfs__/test.txt", O_RDONLY, 0);
    assert(fd >= 0);
    assert(is_tebako_file_descriptor(fd));
    jl_fs_close(fd);
}
  2. Integration Tests

Create tests that exercise the entire stack, from Julia code down to the libdwarfs layer.

function test_file_operations()
    # Write to a file in the Tebako filesystem
    open("/__tebako_memfs__/test.txt", "w") do f
        write(f, "Hello, Tebako!")
    end

    # Read from the file
    content = read("/__tebako_memfs__/test.txt", String)
    @assert content == "Hello, Tebako!"

    # Test file existence
    @assert isfile("/__tebako_memfs__/test.txt")

    # Test directory operations
    mkdir("/__tebako_memfs__/testdir")
    @assert isdir("/__tebako_memfs__/testdir")

    # Test file copy
    cp("/__tebako_memfs__/test.txt", "/__tebako_memfs__/testdir/test_copy.txt")
    @assert isfile("/__tebako_memfs__/testdir/test_copy.txt")
end
  3. Performance Tests

Compare the performance of file operations between the regular filesystem and the Tebako filesystem.

function benchmark_file_operations()
    regular_time = @elapsed for i in 1:1000
        open("/tmp/bench.txt", "w") do f
            write(f, "Benchmark test")
        end
        content = read("/tmp/bench.txt", String)
    end

    tebako_time = @elapsed for i in 1:1000
        open("/__tebako_memfs__/bench.txt", "w") do f
            write(f, "Benchmark test")
        end
        content = read("/__tebako_memfs__/bench.txt", String)
    end

    println("Regular filesystem time: ", regular_time)
    println("Tebako filesystem time: ", tebako_time)
end
  4. Edge Case Tests

Test various edge cases and error conditions:

  • Accessing non-existent files
  • Accessing files with insufficient permissions
  • Handling of very large files
  • Concurrent access to the same file
  • Handling of special characters in filenames
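
A few of these cases can be probed with a short script. The sketch below is written in Ruby and runs against a temporary host directory, since a packaged memfs is not available here; `probe_edge_cases` is a hypothetical helper, and pointing it at a memfs mount would be an additional assumption about that filesystem being writable.

```ruby
require "tmpdir"

# Run a few edge-case probes against a root directory and return a result Hash.
def probe_edge_cases(root)
  results = {}

  # Non-existent file: reading must raise ENOENT rather than return data.
  results[:missing_file_raises] =
    begin
      File.read(File.join(root, "does-not-exist.txt"))
      false
    rescue Errno::ENOENT
      true
    end

  # Special characters in the file name must survive a write/read round trip.
  tricky = File.join(root, "spaces & 'quotes' #1.txt")
  File.write(tricky, "ok")
  results[:special_characters_round_trip] = File.read(tricky) == "ok"

  # Concurrent readers of the same file must all see the same content.
  readers = 4.times.map { Thread.new { File.read(tricky) } }
  results[:concurrent_reads_agree] = readers.map(&:value).uniq == ["ok"]

  results
end

Dir.mktmpdir do |root|
  probe_edge_cases(root).each { |name, passed| puts "#{name}: #{passed}" }
end
```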
  5. Memory Leak Tests

Use tools like Valgrind to check for memory leaks, especially in the Tebako/libdwarfs integration code.

  6. Stress Tests

Create tests that put heavy load on the filesystem:

function stress_test()
    for i in 1:10000
        filename = "/__tebako_memfs__/stress_test_$i.txt"
        open(filename, "w") do f
            write(f, "Stress test content for file $i")
        end
        content = read(filename, String)
        rm(filename)
    end
end
  7. Compatibility Tests

Ensure that existing Julia packages that heavily rely on filesystem operations still work correctly with the Tebako integration.

  8. Cross-Platform Tests

Test the patches on all supported platforms (Linux, macOS, Windows) to ensure consistent behavior.

  9. Regression Tests

Create a suite of tests that cover all previously known bugs and edge cases to prevent regressions.

  10. Continuous Integration

Set up CI pipelines to automatically run these tests on every commit and pull request.

name: Tebako Integration Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        julia-version: ['1.6', '1.7', '1.8']
    steps:
    - uses: actions/checkout@v2
    - name: Set up Julia
      uses: julia-actions/setup-julia@v1
      with:
        version: ${{ matrix.julia-version }}
    - name: Build Tebako-patched Julia
      run: |
        # Commands to apply patches and build Julia
    - name: Run tests
      run: |
        julia --project=@. -e 'using Pkg; Pkg.test()'
  11. Fuzzing Tests

Use fuzzing tools to generate random inputs and file operations to uncover potential bugs or crashes.

  12. Benchmark Suite

Create a comprehensive benchmark suite to compare performance across different scenarios and Julia versions.

To implement these tests effectively:

  1. Organize tests into categories (unit, integration, performance, etc.).
  2. Use Julia's built-in testing framework (Test module) for Julia-level tests.
  3. Use a C testing framework (like Google Test) for low-level C tests.
  4. Automate test running as part of the build process.
  5. Generate test coverage reports to ensure all code paths are tested.
  6. Regularly review and update tests as the codebase evolves.

By implementing this comprehensive testing strategy, you can ensure that the Tebako patches are robust, performant, and don't introduce regressions in Julia's core functionality.

2 participants