@Unisay Unisay commented Sep 18, 2025

Summary

This PR implements costing functions for the lookupCoin and valueContains built-in functions in Plutus Core. It replaces the prohibitively expensive unimplementedCostingFun fallback with proper cost models, which makes both functions usable in practice.

Implementation

Complete Implementation Pipeline

  1. Cost Model Infrastructure - Added parameter definitions and infrastructure for both functions
  2. JSON Configuration - Updated all cost model files with initial parameter values
  3. Builtin Integration - Enabled proper costing in builtin function definitions
  4. Comprehensive Benchmarking - Created extensive benchmark suite with realistic test data
  5. Analysis & Verification - Added R modeling and test infrastructure for validation
  6. Code Quality Improvements - Refactored benchmarking code for maintainability

Function Specifications

  • lookupCoin: ByteString -> ByteString -> Value -> Integer (3 arguments)

    • Uses ModelThreeArguments with constant memory model
    • O(log n) CPU complexity, O(1) memory usage
  • valueContains: Value -> Value -> Bool (2 arguments)

    • Uses ModelTwoArguments with constant memory model
    • O(n₂ × log max(m₁, k₁)) CPU complexity, O(1) memory usage
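
As a rough illustration of the two signatures above, the following sketch models Value as a nested Data.Map. The Value type synonym, the exampleValue binding, and the containment semantics (every amount in the second value is covered by the first) are assumptions made for illustration; this is not the implementation from the PR.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BS8
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for the built-in Value type:
-- policy ID -> token name -> amount.
type Value = Map.Map ByteString (Map.Map ByteString Integer)

-- Two logarithmic map lookups; a missing policy or token yields 0.
lookupCoin :: ByteString -> ByteString -> Value -> Integer
lookupCoin policy token value =
  maybe 0 (Map.findWithDefault 0 token) (Map.lookup policy value)

-- Assumed semantics: the first value contains the second if it holds at
-- least the requested amount for every entry of the second value.
valueContains :: Value -> Value -> Bool
valueContains container contained =
  and
    [ lookupCoin policy token container >= amount
    | (policy, tokens) <- Map.toList contained
    , (token, amount) <- Map.toList tokens
    ]

-- Small example value for experimenting in GHCi.
exampleValue :: Value
exampleValue =
  Map.fromList
    [ (BS8.pack "policyA", Map.fromList [(BS8.pack "gold", 5), (BS8.pack "silver", 2)])
    , (BS8.pack "policyB", Map.fromList [(BS8.pack "ore", 10)])
    ]
```

In this sketch each lookupCoin call performs two logarithmic map lookups, and valueContains performs one such call per entry of its second argument, which is in line with the complexities listed above. Neither function builds a new map, matching the constant memory model.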

Benchmarking Highlights

  • Realistic Cardano Constraints: 28-byte policy IDs, variable token names, 20KB script limit
  • Comprehensive Size Coverage: From tiny (1 entry) to huge (30KB total)
  • Enhanced Parameter Spread: Added random test generation for better 3D visualization coverage
  • Pseudo-random Generation: Uses StdGen for reproducible, deterministic test data
  • Security-Conscious: Tests pathological cases (empty keys, huge keys, edge cases)
  • Memory Budget Aware: 30KB total constraint prevents DoS during benchmarking

Recent Improvements

  • Better Parameter Spreading: Added 100 random test combinations per function for diverse benchmark coverage
  • Code Simplification: Replaced Size datatype with direct Int parameters using numeric underscores
  • Improved Readability: Expanded abbreviated variable names and applied consistent formatting
  • Type System Cleanup: Eliminated unnecessary abstraction layer while maintaining functionality

Cost Model Approach

Both functions use constant memory models reflecting their O(1) memory characteristics:

  • No memory allocation proportional to input size
  • Consistent with similar lookup/contains operations in the cost model
  • Initial placeholder values will be refined through benchmark data analysis

Current State

All implementation steps are complete:

  • Infrastructure: Cost model parameters and type definitions
  • Configuration: JSON files with initial parameter values
  • Integration: Builtin functions use proper costing parameters
  • Benchmarking: Comprehensive benchmark suite for performance measurement
  • Analysis: R modeling and verification test infrastructure
  • Code Quality: Refactored and simplified benchmarking implementation

The functions are now usable with reasonable costing instead of being prohibitively expensive.

@Unisay Unisay self-assigned this Sep 18, 2025

github-actions bot commented Sep 18, 2025

PR Preview Action v1.6.2

🚀 View preview at
https://IntersectMBO.github.io/plutus/pr-preview/pr-7344/

Built to branch gh-pages at 2025-09-19 08:01 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@Unisay Unisay force-pushed the yura/costing-builtin-value branch from bfdb60d to 1fbd745 Compare September 22, 2025 08:37
- Add paramLookupCoin :: f ModelThreeArguments to BuiltinCostModelBase
- Add paramValueContains :: f ModelTwoArguments to BuiltinCostModelBase

Part of implementing costing for lookupCoin and valueContains builtins.
Currently these functions use unimplementedCostingFun, which makes them
prohibitively expensive. This adds the type-level infrastructure needed
for proper costing.
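
The shape of the two new fields can be sketched in isolation. Only the field names and the model type names come from the commit message; CostModelSketch, the single-field model types, partialModel, and complete are hypothetical simplifications, and the sketch only illustrates why a record parameterised over f is convenient in general.

```haskell
import Data.Functor.Identity (Identity (..))

-- Hypothetical, heavily simplified model types. The real ones are richer;
-- here each just wraps a single constant cost.
newtype ModelTwoArguments = ModelTwoArguments { twoArgConstantCost :: Integer }
  deriving (Show, Eq)

newtype ModelThreeArguments = ModelThreeArguments { threeArgConstantCost :: Integer }
  deriving (Show, Eq)

-- Simplified stand-in for BuiltinCostModelBase with only the two new fields.
data CostModelSketch f = CostModelSketch
  { paramLookupCoin    :: f ModelThreeArguments
  , paramValueContains :: f ModelTwoArguments
  }

-- With f = Maybe, individual parameters may still be missing.
partialModel :: CostModelSketch Maybe
partialModel = CostModelSketch
  { paramLookupCoin    = Just (ModelThreeArguments 1000)
  , paramValueContains = Nothing
  }

-- Produce a fully populated model only if every field is present.
complete :: CostModelSketch Maybe -> Maybe (CostModelSketch Identity)
complete model =
  CostModelSketch
    <$> (Identity <$> paramLookupCoin model)
    <*> (Identity <$> paramValueContains model)
```

Here complete partialModel evaluates to Nothing because paramValueContains is missing; filling that field in makes it succeed. The same record definition thus serves both the partially specified and the fully specified case.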

Updates all three cost model JSON files with initial parameter
values for the new built-in functions:

- lookupCoin: constant CPU cost (1000 ns), constant memory (1 unit)
- valueContains: constant CPU cost (1000 ns), constant memory (1 unit)

These are placeholder values that will be refined through benchmarking.
The constant CPU cost is only a starting point: on the Map-based Value
representation, lookupCoin is O(log n) and valueContains grows with the
size of the contained value, so only the memory usage is O(1).
…d valueContains

Updates the builtin function definitions to use the new cost model
parameters instead of unimplementedCostingFun:

- lookupCoin now uses paramLookupCoin with runCostingFunThreeArguments
- valueContains now uses paramValueContains with runCostingFunTwoArguments

This makes both functions usable in practice by providing reasonable
costing instead of prohibitively expensive fallback costs.
@Unisay Unisay force-pushed the yura/costing-builtin-value branch 2 times, most recently from 19c3d50 to 2eec71f Compare September 22, 2025 14:07

Implements extensive benchmarking infrastructure for the new built-in functions:

- Tests various Value sizes from tiny (1 entry) to huge (30KB)
- Uses pseudo-random test data generation with StdGen for reproducibility
- Covers realistic Cardano constraints (28-byte policy IDs, variable token names)
- Tests pathological cases (empty keys, huge keys, edge case sizes)
- Memory budget-conscious generation prevents DoS during benchmarking
- Generates data for both early-hit and late-hit scenarios

The benchmark suite enables accurate performance measurement to determine
optimal cost model coefficients for real-world usage patterns.
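
The idea of reproducible, budget-limited test data can be sketched as follows. The benchmark code uses StdGen; this sketch swaps in a tiny linear congruential generator so that it needs nothing beyond the packages shipped with GHC. The names Seed, nextInt, randomBytes, and generateKeys are hypothetical, and treating the budget as a plain byte count over key sizes is an assumption.

```haskell
import qualified Data.ByteString as BS

-- Stand-in for StdGen: a small linear congruential generator.
newtype Seed = Seed Int

nextInt :: Seed -> (Int, Seed)
nextInt (Seed s) =
  let s' = (s * 1103515245 + 12345) `mod` 2147483648
   in (s', Seed s')

-- Generate n pseudo-random bytes, threading the seed explicitly.
-- Bytes are prepended, so the first byte generated ends up last.
randomBytes :: Int -> Seed -> (BS.ByteString, Seed)
randomBytes n seed0 = go n seed0 []
  where
    go k seed acc
      | k <= 0 = (BS.pack acc, seed)
      | otherwise =
          let (x, seed') = nextInt seed
           in go (k - 1) seed' (fromIntegral (x `div` 65536) : acc)

-- Policy IDs are 28 bytes long, as stated in the benchmark description.
policyIdLength :: Int
policyIdLength = 28

-- Generate (policy ID, token name) keys until the byte budget is used up.
generateKeys :: Int -> Int -> Seed -> [(BS.ByteString, BS.ByteString)]
generateKeys budget tokenNameLength seed
  | budget < entrySize = []
  | otherwise =
      let (policy, seed1) = randomBytes policyIdLength seed
          (token, seed2) = randomBytes tokenNameLength seed1
       in (policy, token) : generateKeys (budget - entrySize) tokenNameLength seed2
  where
    entrySize = policyIdLength + max 0 tokenNameLength
```

Calling generateKeys 30000 4 (Seed 42) twice yields identical key lists, and the total key size never exceeds the budget. That determinism is what makes benchmark runs comparable across machines and revisions.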

Extends the cost modeling infrastructure to support the new functions:

- Adds R modeling code to analyze benchmark data and fit cost functions
- Includes test cases to verify agreement between R models and Haskell implementation
- Supports both ModelTwoArguments and ModelThreeArguments cost function types
- Enables automated validation of cost model accuracy

This completes the cost modeling pipeline allowing benchmark data to be
processed into production-ready cost model coefficients.
@Unisay Unisay force-pushed the yura/costing-builtin-value branch from 2eec71f to 58b2a86 Compare September 23, 2025 07:35
The original benchmark generation created repeated parameter combinations
due to Size enum mapping, causing clustered data points in 3D plots.

This hybrid approach maintains all original Size-based systematic tests
for guaranteed coverage while adding 100 random tests per function with
varied parameters:

- LookupCoin: 0-20KB keys, 1-2000 policies, 1-1000 tokens/policy
- ValueContains: 1-5000 entries with varied key sizes

Random tests bypass the Size system entirely, generating diverse
benchmark names like "LookupCoin/1/125/83083" instead of repeated
"LookupCoin/4/4/3100" entries, enabling better parameter spread
for cost model visualization and analysis.

This refactoring simplifies the benchmarking code by:
- Eliminating the intermediate Size datatype that was converting to Int anyway
- Using direct Int parameters with numeric underscores for clarity
- Expanding abbreviated variable names to improve code readability
- Ensuring proper code formatting with line length constraints

The changes maintain all existing benchmark logic while making the code
more maintainable and easier to understand.

Replace placeholder zero costs with empirically-derived parameters:
- lookupCoin: linear model with intercept=269678, slope=1
- valueContains: added_sizes model with intercept=30547830, slope=30

Remove obsolete complexity modeling comments in the R file, as the
implementation now uses proper linear models based on benchmark data.
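
To make the two model shapes concrete, here is a minimal sketch of how such parameters could be evaluated. The intercepts and slopes are the ones quoted in the commit message; the LinearModel type, the function names, and the choice of which argument sizes feed each model are assumptions.

```haskell
-- Hypothetical representation of a fitted straight line.
data LinearModel = LinearModel
  { intercept :: Integer
  , slope     :: Integer
  }

-- Linear model: cost grows with a single argument size.
linearCost :: LinearModel -> Integer -> Integer
linearCost model size = intercept model + slope model * size

-- added_sizes model: the same line, applied to the sum of two sizes.
addedSizesCost :: LinearModel -> Integer -> Integer -> Integer
addedSizesCost model sizeX sizeY = linearCost model (sizeX + sizeY)

-- Parameters quoted in the commit message.
lookupCoinModel :: LinearModel
lookupCoinModel = LinearModel { intercept = 269678, slope = 1 }

valueContainsModel :: LinearModel
valueContainsModel = LinearModel { intercept = 30547830, slope = 30 }
```

For example, linearCost lookupCoinModel 1000 gives 269678 + 1 × 1000 = 270678, and addedSizesCost valueContainsModel 100 50 gives 30547830 + 30 × 150 = 30552330. In both cases the intercept dominates for small inputs.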

Add extensive benchmark measurements covering various parameter
combinations to support the empirical cost model derivation:
- LookupCoin benchmarks with different map sizes and key counts
- ValueContains benchmarks with varying value sizes
- Additional BLS12-381 multiScalarMul benchmark data for completeness

This data was used to derive the cost model parameters committed separately.