feat(fuzz): ast-seeded dictionary #12015
.prop_flat_map(move |(use_ast_index, select_index)| {
    let dict = state_clone.dictionary_read();

    // AST string literals available: use 30/70 allocation
arbitrary value, we could change it if you are opinionated.
// Seed dict with AST literals if analysis is available.
if let Some(literals) = analysis {
    dictionary.ast_values = Some(literals);
IMO better to keep this simpler and just insert / reuse the existing fuzz dict samples - insert_sample_values, which stores values by type and uses them during fuzz runs - instead of having a new strategy / weights and ast_values. We probably need to make the samples limit configurable and bump the default value
foundry/crates/evm/fuzz/src/strategies/state.rs
Lines 374 to 377 in 020d515
/// Insert sample values that are reused across multiple runs.
/// The number of samples is limited to invariant run depth.
/// If collected samples limit is reached then values are inserted as regular values.
pub fn insert_sample_values(
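The doc comment above describes typed samples with a cap and a fallback to regular values. A minimal sketch of that idea follows, assuming the cap applies per type (the excerpt does not say) and using hypothetical names (`ToyDictionary`, `insert_sample`) rather than Foundry's actual types:

```rust
use std::collections::{BTreeSet, HashMap};

/// Toy stand-in for a fuzz dictionary; every name here is hypothetical.
#[derive(Default)]
struct ToyDictionary {
    /// Samples grouped by a type label such as "uint256" or "string".
    samples: HashMap<String, BTreeSet<Vec<u8>>>,
    /// Untyped pool used once a sample bucket is full.
    regular: BTreeSet<Vec<u8>>,
    /// Maximum number of samples kept per type (assumed to be per type).
    max_samples: usize,
}

impl ToyDictionary {
    fn new(max_samples: usize) -> Self {
        Self { max_samples, ..Default::default() }
    }

    /// Stores `value` under `ty` until the bucket is full,
    /// then falls back to the regular pool.
    fn insert_sample(&mut self, ty: &str, value: Vec<u8>) {
        let bucket = self.samples.entry(ty.to_string()).or_default();
        if bucket.len() < self.max_samples || bucket.contains(&value) {
            bucket.insert(value);
        } else {
            self.regular.insert(value);
        }
    }
}
```

Making `max_samples` a constructor argument mirrors the suggestion to make the samples limit configurable.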
Not blocking but I would recommend implementing constant folding to some degree i.e. evaluate
thanks! One thing here - this means we should collect from tests too which we don't do in PR, is this correct?
AFAIK neither Echidna nor Slither's printer filters tests out. I think not including forge-std makes sense. Also, I am not sure how the push/pop/log dictionary is currently managed in Foundry, but I think Echidna always keeps the constant pool around and ejects the dynamically collected values after running a full sequence. For example, a user's balance that is emitted in one run may help within the same sequence but is unlikely to help in a totally unrelated sequence.
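To make the described policy concrete, here is a small sketch of a dictionary that keeps a constant pool for its whole lifetime and drops runtime-collected values at the end of each sequence. It models the behavior the comment attributes to Echidna, not Echidna's or Foundry's actual code; `SequenceDictionary` and its methods are hypothetical.

```rust
use std::collections::BTreeSet;

/// Hypothetical dictionary with a persistent constant pool and
/// per-sequence dynamic values.
#[derive(Default)]
struct SequenceDictionary {
    /// Literals collected once (e.g. from the AST); never ejected.
    constants: BTreeSet<u64>,
    /// Values observed at runtime (logs, storage); cleared per sequence.
    dynamic: BTreeSet<u64>,
}

impl SequenceDictionary {
    fn with_constants(constants: impl IntoIterator<Item = u64>) -> Self {
        Self { constants: constants.into_iter().collect(), dynamic: BTreeSet::new() }
    }

    /// Records a value seen during the current sequence.
    fn observe(&mut self, value: u64) {
        if !self.constants.contains(&value) {
            self.dynamic.insert(value);
        }
    }

    /// Ejects everything collected during the sequence; constants stay.
    fn end_sequence(&mut self) {
        self.dynamic.clear();
    }

    /// All values currently available to the fuzzer.
    fn candidates(&self) -> Vec<u64> {
        self.constants.iter().chain(self.dynamic.iter()).copied().collect()
    }
}
```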
👍 @0xrusowsky let's include too
Please let us know if you see any redundant data / ways to improve the dict. Thank you!
^ note that AST literals are injected into
6f35d1b to fc4f3d4
a9373d6 to 3440429
thanks for the advice! will be tackled next in a follow-up PR:
thank you, looks good! left some comments / nits, pls check
if let Some(config) = cheatcodes {
    let mut cheatcodes = Cheatcodes::new(config);
    // Set analysis capabilities if they are provided
    if let Some(analysis) = analysis {
the analysis here is technically the compiler, not the analysis per se, should we process / analyze already, smth like
if let Some(compiler) = compiler {
    let ast_analysis = AstAnalysis::new(compiler);
    cheatcodes.set_struct_defs(ast_analysis.get_struct_defs().clone());
    stack.set_ast_analysis(ast_analysis);
}
and we consolidate AST analysis in a single place instead of having parts of it in cheatcodes, parts in stack? Then in EvmFuzzState::new we just pass AstAnalysis and populate dict with AstAnalysis words, strings and bytes - side note, in this way we could also pass the enums to the fuzzer to be used for #6623, but that's a different scope and complex (it affects mutations as well) as @DaniPopes pointed out
i was also thinking about where to place the LiteralsCollector, and i guess it could also make sense to upstream it to the inspector stack so that other inspectors could benefit from it.
however, i wouldn't eagerly perform the analysis as you suggest here, as most of the time you won't need to use the analysis capabilities (i.e. only a small subset of tests will use the cheatcodes that require struct defs).
also, i expect each consumer (inspector) to have different needs, hence why i thought having a more granular approach and implementing the actual analysis capabilities on each inspector (i.e. crates/evm/fuzz/src/strategies/state.rs, crates/cheatcodes/src/inspector/analysis.rs) would make more sense 🤔
let's see what @DaniPopes prefers and we can do what the majority thinks is best? haha
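The lazy approach argued for above can be sketched with `std::sync::OnceLock`: the analysis runs on first access and is cached afterwards, so tests that never need struct defs pay nothing. `LazyAnalysis` and its line-based "parser" are hypothetical stand-ins, not the PR's code; the `runs` counter exists only to show that the work happens once.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical holder that defers analysis until a consumer asks for it.
struct LazyAnalysis {
    sources: Vec<String>,
    /// Filled on first access, then reused.
    struct_defs: OnceLock<Vec<String>>,
    /// Counts how often the analysis actually ran (for illustration).
    runs: AtomicUsize,
}

impl LazyAnalysis {
    fn new(sources: Vec<String>) -> Self {
        Self { sources, struct_defs: OnceLock::new(), runs: AtomicUsize::new(0) }
    }

    /// Returns struct names, computing them only on the first call.
    /// The "analysis" is a toy scan for lines starting with `struct `.
    fn struct_defs(&self) -> &[String] {
        self.struct_defs.get_or_init(|| {
            self.runs.fetch_add(1, Ordering::Relaxed);
            self.sources
                .iter()
                .flat_map(|src| src.lines())
                .filter_map(|line| line.trim().strip_prefix("struct "))
                .map(|rest| rest.trim_end_matches('{').trim().to_string())
                .collect()
        })
    }
}
```

Each inspector could hold its own lazily initialized view like this, which matches the more granular design proposed in the comment.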
should we then analyze at build time and cache the values / write them to disk, then lazy-load what's needed in different components and only when / where needed? this will also mean we don't need to analyze each time we run forge test
.prop_flat_map(move |(use_ast_index, select_index)| {
    let dict = state_clone.dictionary_read();

    // AST string literals available: use 30/70 allocation
IMO this should follow the sample rules, we already have logic / bias to select them
maybe we could reuse the same and return DynSolValues from the AST-analyzed String / bytes here?
wdym exactly?
bias is a randomly generated bool (50-50) but allocating 50% to ast seeded literals feels like a lot (before it was 0-100).
my idea was that by using Index we can allocate a smaller pct to AST string literals, but we are already using them (30% of the time)
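A rough sketch of the 30/70 allocation discussed here, written with plain integers instead of the `Index` type mentioned in the thread. The function name, the `selector` convention (a value assumed uniform in 0..100) and the fallback string are assumptions for illustration only:

```rust
/// Share of draws that should come from AST string literals, in percent.
const AST_WEIGHT_PCT: u8 = 30;

/// Picks an AST literal about 30% of the time, otherwise the fallback.
/// `selector` is assumed uniform in 0..100; `index` chooses which literal.
fn pick_string<'a>(
    ast_literals: &'a [String],
    fallback: &'a str,
    selector: u8,
    index: usize,
) -> &'a str {
    if !ast_literals.is_empty() && selector % 100 < AST_WEIGHT_PCT {
        &ast_literals[index % ast_literals.len()]
    } else {
        fallback
    }
}
```

Compared with a 50-50 boolean bias, a percentage threshold lets the AST share be tuned without changing the selection logic.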
let max_int_plus1 = U256::from(1).wrapping_shl(n - 1);
let num = I256::from_raw(uint.wrapping_sub(max_int_plus1));
// Extract lower N bits
let uint_n = U256::from_be_bytes(value.0) % U256::from(1).wrapping_shl(n);
good catch, need to make some more tests to see how this affects overall perf
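For readers unfamiliar with the arithmetic in the snippet above, here is a scaled-down sketch using `u128`/`i128` in place of `U256`/`I256`. It only illustrates the two operations visible in the diff (keeping the lower N bits, and subtracting 2^(N-1) before reinterpreting as signed); how they are combined in the PR is not visible in this excerpt, and the function names are hypothetical.

```rust
/// Keeps only the lower `n` bits of `value` (requires 0 < n < 128).
fn lower_bits(value: u128, n: u32) -> u128 {
    value % (1u128 << n)
}

/// Shifts an unsigned `n`-bit value into the signed range
/// [-2^(n-1), 2^(n-1)) by subtracting 2^(n-1) with wrapping,
/// then reinterpreting the bits as two's complement.
fn offset_to_signed(uint: u128, n: u32) -> i128 {
    let max_int_plus1 = 1u128 << (n - 1);
    uint.wrapping_sub(max_int_plus1) as i128
}
```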
}

#[derive(Clone, Default, Debug)]
pub struct LiteralMaps {
as in comment above, would be nice to have all AST analysis consolidated and performed only once, these could be good candidates to move there. Let's add comments to the enum / structs and their members too
Motivation
closes #10233
Solution
Uses solar::sema::Compiler to collect all relevant AST literals found in the sources (excluding libs and scripts) and seed the FuzzerDictionary with them at initialization.
TODO
Future improvements
PR Checklist