@pasabanov pasabanov commented Jan 25, 2025

Types of changes

  • Feature
  • Testing
  • Configuration (CI/CD)

Description

  1. Implemented dist/generate.py script to generate data for testing distributions.
    The script reimplements, in simplified form, every formula the library uses to generate distributions.

    a and b are global constants, and every function computes its result without reference to them. This is done for simplicity: instead of varying a and b, mapping_intervals (see below) are added.

    The calculations rely on the mpmath library and are performed with a precision of 100 significant digits. The numbers are then rounded to 17 significant digits before being output. Seventeen significant digits are enough to uniquely identify any IEEE 754 double-precision value, so the output loses nothing that a double could represent.

    The library's output is not expected to match the test data exactly, because the test data are near-ideal values computed at much higher precision. The library should agree with them up to a certain significant digit (the exact tolerance will be determined later).

    The script allows the output file name to be specified using the -o (--output) flag. If no output file is specified, the script’s output will be directed to the standard output stream.
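
A minimal sketch of the two behaviours described in this item: computing at 100 significant digits, rounding to 17 before output, and choosing the destination with -o (--output). It is not the code of dist/generate.py. To stay runnable with the standard library alone, it swaps mpmath for Python's decimal module, and the values 1/3 and 2/3 are placeholders rather than real distribution points. The names rounded, sample_values, parse_args and main are hypothetical.

```python
import argparse
import sys
from decimal import Context, Decimal, localcontext


def rounded(value: Decimal, digits: int = 17) -> str:
    """Round a high-precision value to `digits` significant digits."""
    return str(Context(prec=digits).plus(value))


def sample_values() -> list[str]:
    """Compute placeholder values at 100 significant digits, then round them."""
    with localcontext() as ctx:
        ctx.prec = 100  # working precision, standing in for mpmath's setting
        values = [Decimal(1) / Decimal(3), Decimal(2) / Decimal(3)]
    return [rounded(v) for v in values]


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse the command line; -o/--output is optional."""
    parser = argparse.ArgumentParser(description="Generate distribution test data.")
    parser.add_argument("-o", "--output", default=None,
                        help="output file; defaults to standard output")
    return parser.parse_args(argv)


def main(argv: list[str]) -> None:
    """Write the rounded values to the chosen file, or to stdout if none is given."""
    args = parse_args(argv)
    text = "\n".join(sample_values()) + "\n"
    if args.output is None:
        sys.stdout.write(text)
    else:
        with open(args.output, "w", encoding="utf-8") as f:
            f.write(text)
```

Calling main([]) prints the values to standard output, while main(["-o", "out.txt"]) writes them to a file. Parsing the 17-digit string for 1/3 back with float() gives exactly the double 1 / 3, which illustrates why 17 digits suffice.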

  2. Created dist/dist.toml as output of the script.
    The dist/dist.toml file has been carefully constructed step-by-step based on the output from the dist/generate.py script. At each stage, it was verified that the old values remained unchanged. The calculations for uniform and chebyshev were initially done manually using wolframalpha.com.

    The file begins with a list of intervals, mapping_intervals, on which each test case is additionally tested. For each interval, the expected values in the file are linearly mapped onto it, and the library generates new values for the same interval.

    Since the library first generates points on intervals like (-1, 1) or (0, 1) when creating distributions, the loss of precision during the linear mapping should be approximately the same for both generated and test data.
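
The linear mapping mentioned in this item can be sketched as follows. This is an illustration under assumptions, not the project's test code: the function name map_linear is hypothetical, and the source interval of each test case is assumed to be known, for example (-1, 1) or (0, 1).

```python
def map_linear(x: float, src: tuple[float, float], dst: tuple[float, float]) -> float:
    """Map x linearly from the interval src onto the interval dst."""
    (s0, s1), (d0, d1) = src, dst
    # Shift x to the start of src, rescale by the ratio of lengths, shift to dst.
    return d0 + (x - s0) * (d1 - d0) / (s1 - s0)


# Expected values given on (-1, 1), mapped onto a hypothetical interval (2, 6).
expected_on_base = [-1.0, 0.0, 1.0]
mapped = [map_linear(x, (-1.0, 1.0), (2.0, 6.0)) for x in expected_on_base]
```

Applying the same map to the expected values for every entry in mapping_intervals yields the reference data that the library's output on that interval would be compared against.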

  3. Added verify_data.py script to verify the generated data against the saved data.
    To verify the generated data, a temporary file is used, which is deleted after the verification process.
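
One way the temporary-file check could look is sketched below. It is an assumption about the approach, not the contents of verify_data.py: the function verify and its generate callback are hypothetical, and the comparison here is a plain byte-for-byte file comparison.

```python
import filecmp
import os
import tempfile
from typing import Callable


def verify(saved_path: str, generate: Callable[[str], None]) -> bool:
    """Regenerate the data into a temporary file and compare it with the saved file."""
    fd, tmp_path = tempfile.mkstemp(suffix=".toml")
    os.close(fd)  # the generator reopens the file by path
    try:
        generate(tmp_path)
        # shallow=False forces a comparison of file contents, not just metadata.
        return filecmp.cmp(saved_path, tmp_path, shallow=False)
    finally:
        os.remove(tmp_path)  # the temporary file never outlives the check
```

In a real script, generate would presumably run the generator with its -o flag pointed at the temporary path, and a False result would translate into a non-zero exit code so that CI fails.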

  4. Added .github/workflows/ci.yml to verify the data during the CI build process.
    To maintain the integrity of the repository, a CI script has been created to verify the test data with every change to the main branch.

@pasabanov pasabanov added the config, feature, and tests labels Jan 25, 2025
@pasabanov pasabanov self-assigned this Jan 25, 2025