Skip to content

Conversation

moratorium08
Copy link
Contributor

Note: You don’t need to look into the details of this implementation. This draft PR is not intended to be merged; it’s only for discussion of the future implementation.

I have implemented a prototype for "auto-annot" (only for focus).

The goal of this prototype is to automatically synthesize missing focus annotations for iterated resources by observing concrete executions of binaries instrumented by Fulminate, then infer a linear arithmetic term for the focused index from in-scope concrete values at each memory access. The overview of the pipeline is (i) well-formedness check (ii) Fulminate instrumentation with additional instrumentations (see below) (iii) Bennet tests, and (iv) Infer the focus from the execution.

What I have added are

  • New CLI subcommand, named auto-annot
  • Additional Fulminate instrumentations
  • Runtime support (libcn) for storing focus and iterated resource information
  • Inference algorithm (lib/autoannot/autoAnnot.ml)

I tried to avoid degrading the current Fulminate/Bennet as much as possible. To achieve this, I added a switch in Fulminate to check whether auto-annot is enabled. If auto-annot is disabled, the instrumented binary remains identical to the previous version. If any difference appears in that case, it should be due to a mistake on my part.

Fulminate Instrumentations

Basically, I added the machinery for dealing with focus context:

  • initialization + finalization at the prologue and epilogue of main + C-scope tracking for the context (push_focus_context/pop_focus_context at the beginning/end of each function)
  • Memory access: swapping CN_LOAD/CN_STORE/CN_STORE_OP for CN_*_ANNOT variants when enabled; Logs the values for the scoped variables if a focus is needed at that access (the case where the memory access is justified by
    some iterated resource, but no focus is stated for the index).
  • Iterated resources: injecting CN_INSERT_ITER_RES(ptr, start, end, step, type) at pre/loop spec points to
    register concrete resource ranges and element types at runtime.
  • Focus instantiation: on Extract of Owned/Block with a concrete type, injects CN_INSERT_FOCUS(index,
    type) so runtime knows which indices have already been focused.
    They may (eventually) be replaced by Rini's future machinery for better error reporting.

Runtime Support

  • Added per-frame stack storing the information of iterated resources and focused indices
  • Added interface functions for that.
  • Added needs_focus, a function to determine if the focus is missing or not.
  • Added small functional tests using Zain's GTest framework

Running Example

If you run cn auto-annot prog.c where prog.c is

unsigned f(int *p, unsigned long long n, unsigned j) 
/*@
requires
  take X = each(u64 i; i < n) {
      RW<int>(array_shift<int>(p, i))
  };
  n > 10u64;
  (u64)j < n;
ensures
  take Y = each(u64 i; i < n) {
      RW<int>(array_shift<int>(p, i))
  };
@*/
{
    /*@ focus RW<int>, 1u64; @*/
    return p[j];
}

You will get the log like this in a tmp file:

... 
(omitted)
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=2, n=19, j=2, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=12, n=23, j=12, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=4, n=11, j=4, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=12, n=14, j=12, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=10, n=17, j=10, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=16, n=22, j=16, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=6, n=15, j=6, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=11, n=14, j=11, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=7, n=15, j=7, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=11, n=24, j=11, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=2, n=23, j=2, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=15, n=24, j=15, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=7, n=16, j=7, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=8, n=24, j=8, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=8, n=12, j=8, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=11, n=13, j=11, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=5, n=15, j=5, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=10, n=12, j=10, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=9, n=12, j=9, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=13, n=19, j=13, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=8, n=16, j=8, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=14, n=19, j=14, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=16, n=21, j=16, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=3, n=18, j=3, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=7, n=22, j=7, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=0, n=17, j=0, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=6, n=15, j=6, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=12, n=14, j=12, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=3, n=15, j=3, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=9, n=19, j=9, 

Synthesis

Now assume we have samples like the above. Auto-annot takes such samples and tries to find integer coefficients (w0, {w_v}) such that w0 + Σ(w_v * v) = !index across all samples for a given site. In principle, there are many solutions in terms of a SAT problem, but for a better suggestion, I formalized the problem as an optimization to minimize the coefficients w_i as much as possible by using L1 norm. Currently, I'm using Z3 for the optimization, but there might be better choices.

Limitations

  • Only collects integer-typed variables (skips structs, floats, pointers).
  • Assumes Iterated resources are contiguous and type equality is string-based (this can be mitigated by the Rini's new machinery)
  • Only supports explicit memory manipulations (sometimes focus is needed for ghost state manipulations)
  • Only supports Owned-iterated resource (i.e. each(i; c) { RW(...) } not each(i; c) { Pred(...) } )

@moratorium08 moratorium08 changed the title Focus context [Draft] Prototype of auto-annot Aug 21, 2025
@dc-mak
Copy link
Contributor

dc-mak commented Aug 22, 2025

Thanks for the implementation and the overview of the state of the implementation. This is the sort of documentation we would need for merging this in and having it not bit-rot.

  1. Would you be willing to extend the "Running Example" so that it covers an end-user perspective on the feature? Specifically, once the synthesis is done, what is reported to the user and what actions could the user take. It's ok if it's not implemented, or it's vague/hypothetical, it will help clarify the story and context for this feature.
  2. Can you add a little bit of info about why the limitations are there (is it just time/version 0, or is it some other changes required in CN etc.)?

@moratorium08
Copy link
Contributor Author

moratorium08 commented Aug 22, 2025

@dc-mak

  1. Yes, it is still not so fixed because I'm not sure about a good UI/UX for humans and interface for language servers, but currently users can get the result from stdout like this (see the last line)
> cn auto-annot prog.c
Using temporary directory: /var/folders/5_/bc8gln213hx04f4_d3p9p7dr0000gn/T/autoannot.3EB7
Testing prog::f: 100 runs, 459 discards
PASSED

Testing Summary:
cases: 1, passed: 1, failed: 0, errored: 0, skipped: 0
[Result of auto-annotation]
prog.c:28: focus RW<signed int>, j
  1. Most of them are due to the immature implementation; they will be fixed by further engineering. However, the third item (Only supports explicit memory manipulations...) can be beyond engineering as the current implementation only tracks explicit memory accesses; the current approach cannot infer focuss below in principle:
/*@
predicate [nounfold] void Cn_char_array(pointer p, u64 size)
{
        take U = each(u64 i; i < size){
                W<unsigned char>(array_shift<unsigned char>(p, i))
        };
        return;
}
*@/

void Lemma(
        char *p, unsigned int size1, unsigned int size2
)
/*@
requires
        let size = (u64)size1 + (u64)size2;
        take X = Cn_char_array(p, size);
        let owned_by_ha = array_shift<unsigned char>(p, (u64)size1);
ensures
        take X1 = Cn_char_array(p, (u64)size1);
        take X2 = Cn_char_array(owned_by_ha, (u64)size2);
@*/
{
        /*@
        unpack Cn_char_array(p, (u64)size1 + (u64)size2);
        @*/
        unsigned long i;
        for (i = 0; i < (unsigned long)size2; i++)
        /*@
        inv
                take L0 = Cn_char_array(p, (u64)size1);
                take L1 = Cn_char_array_with_offset(p, (u64)size2 - i, (u64)size1 + (u64)i);
                take L2 = Cn_char_array(owned_by_ha, i);
                {p} unchanged;
                {size1} unchanged;
                {size2} unchanged;
                i <= (u64)size2;
        @*/
        {
                /*@
                  unpack Cn_char_array(owned_by_ha, i);
                  focus W<unsigned char>, i;
                  focus W<unsigned char>, ((u64)size1 + i);
                @*/
        }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants