[Draft] Prototype of `auto-annot` #327

moratorium08 · 2025-08-21T12:01:26Z

Note: You don’t need to look into the details of this implementation. This draft PR is not intended to be merged; it’s only for discussion of the future implementation.

I have implemented a prototype for "auto-annot" (only for focus).

The goal of this prototype is to automatically synthesize missing focus annotations for iterated resources by observing concrete executions of binaries instrumented by Fulminate, then infer a linear arithmetic term for the focused index from in-scope concrete values at each memory access. The overview of the pipeline is (i) well-formedness check (ii) Fulminate instrumentation with additional instrumentations (see below) (iii) Bennet tests, and (iv) Infer the focus from the execution.

What I have added are

New CLI subcommand, named auto-annot
Additional Fulminate instrumentations
Runtime support (libcn) for storing focus and iterated resource information
Inference algorithm (lib/autoannot/autoAnnot.ml)

I tried to avoid degrading the current Fulminate/Bennet as much as possible. To achieve this, I added a switch in Fulminate to check whether auto-annot is enabled. If auto-annot is disabled, the instrumented binary remains identical to the previous version. If any difference appears in that case, it should be due to a mistake on my part.

Fulminate Instrumentations

Basically, I added the machinery for dealing with focus context:

initialization + finalization at the prologue and epilogue of main + C-scope tracking for the context (push_focus_context/pop_focus_context at the beginning/end of each function)
Memory access: swapping CN_LOAD/CN_STORE/CN_STORE_OP for CN_*_ANNOT variants when enabled; Logs the values for the scoped variables if a focus is needed at that access (the case where the memory access is justified by
some iterated resource, but no focus is stated for the index).
Iterated resources: injecting CN_INSERT_ITER_RES(ptr, start, end, step, type) at pre/loop spec points to
register concrete resource ranges and element types at runtime.
Focus instantiation: on Extract of Owned/Block with a concrete type, injects CN_INSERT_FOCUS(index,
type) so runtime knows which indices have already been focused.
They may (eventually) be replaced by Rini's future machinery for better error reporting.

Runtime Support

Added per-frame stack storing the information of iterated resources and focused indices
Added interface functions for that.
Added needs_focus, a function to determine if the focus is missing or not.
Added small functional tests using Zain's GTest framework

Running Example

If you run cn auto-annot prog.c where prog.c is

unsigned f(int *p, unsigned long long n, unsigned j) 
/*@
requires
  take X = each(u64 i; i < n) {
      RW<int>(array_shift<int>(p, i))
  };
  n > 10u64;
  (u64)j < n;
ensures
  take Y = each(u64 i; i < n) {
      RW<int>(array_shift<int>(p, i))
  };
@*/
{
    /*@ focus RW<int>, 1u64; @*/
    return p[j];
}

You will get the log like this in a tmp file:

... 
(omitted)
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=2, n=19, j=2, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=12, n=23, j=12, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=4, n=11, j=4, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=12, n=14, j=12, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=10, n=17, j=10, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=16, n=22, j=16, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=6, n=15, j=6, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=11, n=14, j=11, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=7, n=15, j=7, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=11, n=24, j=11, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=2, n=23, j=2, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=15, n=24, j=15, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=7, n=16, j=7, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=8, n=24, j=8, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=8, n=12, j=8, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=11, n=13, j=11, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=5, n=15, j=5, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=10, n=12, j=10, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=9, n=12, j=9, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=13, n=19, j=13, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=8, n=16, j=8, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=14, n=19, j=14, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=16, n=21, j=16, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=3, n=18, j=3, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=7, n=22, j=7, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=0, n=17, j=0, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=6, n=15, j=6, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=12, n=14, j=12, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=3, n=15, j=3, 
[auto annot (focus)]prog.c:28:12-16:RW:signed int, !index=9, n=19, j=9,

Synthesis

Now assume we have samples like the above. Auto-annot takes such samples and tries to find integer coefficients (w0, {w_v}) such that w0 + Σ(w_v * v) = !index across all samples for a given site. In principle, there are many solutions in terms of a SAT problem, but for a better suggestion, I formalized the problem as an optimization to minimize the coefficients w_i as much as possible by using L1 norm. Currently, I'm using Z3 for the optimization, but there might be better choices.

Limitations

Only collects integer-typed variables (skips structs, floats, pointers).
Assumes Iterated resources are contiguous and type equality is string-based (this can be mitigated by the Rini's new machinery)
Only supports explicit memory manipulations (sometimes focus is needed for ghost state manipulations)
Only supports Owned-iterated resource (i.e. each(i; c) { RW(...) } not each(i; c) { Pred(...) } )

dc-mak · 2025-08-22T11:59:28Z

Thanks for the implementation and the overview of the state of the implementation. This is the sort of documentation we would need for merging this in and having it not bit-rot.

Would you be willing to extend the "Running Example" so that it covers an end-user perspective on the feature? Specifically, once the synthesis is done, what is reported to the user and what actions could the user take. It's ok if it's not implemented, or it's vague/hypothetical, it will help clarify the story and context for this feature.
Can you add a little bit of info about why the limitations are there (is it just time/version 0, or is it some other changes required in CN etc.)?

moratorium08 · 2025-08-22T13:04:11Z

@dc-mak

Yes, it is still not so fixed because I'm not sure about a good UI/UX for humans and interface for language servers, but currently users can get the result from stdout like this (see the last line)

> cn auto-annot prog.c
Using temporary directory: /var/folders/5_/bc8gln213hx04f4_d3p9p7dr0000gn/T/autoannot.3EB7
Testing prog::f: 100 runs, 459 discards
PASSED

Testing Summary:
cases: 1, passed: 1, failed: 0, errored: 0, skipped: 0
[Result of auto-annotation]
prog.c:28: focus RW<signed int>, j

Most of them are due to the immature implementation; they will be fixed by further engineering. However, the third item (Only supports explicit memory manipulations...) can be beyond engineering as the current implementation only tracks explicit memory accesses; the current approach cannot infer focuss below in principle:

/*@
predicate [nounfold] void Cn_char_array(pointer p, u64 size)
{
        take U = each(u64 i; i < size){
                W<unsigned char>(array_shift<unsigned char>(p, i))
        };
        return;
}
*@/

void Lemma(
        char *p, unsigned int size1, unsigned int size2
)
/*@
requires
        let size = (u64)size1 + (u64)size2;
        take X = Cn_char_array(p, size);
        let owned_by_ha = array_shift<unsigned char>(p, (u64)size1);
ensures
        take X1 = Cn_char_array(p, (u64)size1);
        take X2 = Cn_char_array(owned_by_ha, (u64)size2);
@*/
{
        /*@
        unpack Cn_char_array(p, (u64)size1 + (u64)size2);
        @*/
        unsigned long i;
        for (i = 0; i < (unsigned long)size2; i++)
        /*@
        inv
                take L0 = Cn_char_array(p, (u64)size1);
                take L1 = Cn_char_array_with_offset(p, (u64)size2 - i, (u64)size1 + (u64)i);
                take L2 = Cn_char_array(owned_by_ha, i);
                {p} unchanged;
                {size1} unchanged;
                {size2} unchanged;
                i <= (u64)size2;
        @*/
        {
                /*@
                  unpack Cn_char_array(owned_by_ha, i);
                  focus W<unsigned char>, i;
                  focus W<unsigned char>, ((u64)size1 + i);
                @*/
        }
}

moratorium08 changed the title ~~Focus context~~ [Draft] Prototype of auto-annot Aug 21, 2025

moratorium08 added 27 commits September 25, 2025 12:06

Add focus_context for runtime

33ded8a

Store iterated resources in the context

49616fa

Update test

221301f

Fix the signature for needs_focus

7bfce6e

insert_focus... maybe?

b9fade9

Update the API for focusing

0aca2a8

Handle iterated resources

19e59d4

Add needs_focus for each LOAD/STORE

c474610

Print variables when focus is missing

e554103

Do free when pop

50dd0c5

Store & push pop context

6e7e6df

Add line number

153e27d

Fix a printer logging problem for storeop

4e7f506

Small refactoring

5bfa7fb

Handle loop (wip)

60261fb

Add clear_focus function

feb3c83

Add switch for introducing autoannot things

e69543d

Add autoannot subcommand

8d5103b

Fix bugs

15bf472

Log to a file instead of stdout

7267c06

Integrate bennet with autoannot

63c8ea5

Clear log file everytime

4f9884f

Add the CSV parser

a37e636

parse and group

01c8521

refactor

ce9d16f

Preliminary optimization

b43892b

Add target index to the constraint

aea2aa6

moratorium08 added 7 commits September 25, 2025 12:28

Fix a minor issue

2845cca

pretty print the inferred index

4209988

Show type annotation

4c054c6

clang format

a326845

Only calls gen_env_fmt_printer when auto_annot is enabled

c99368c

clang-format, again ...?

097b1c4

print results to stdout

9526604

moratorium08 force-pushed the focus_context branch from 5b6ea44 to 9526604 Compare September 25, 2025 12:28

moratorium08 added 3 commits September 25, 2025 14:05

Fix compile error

d9b703c

Merged the machinery (WIP)

a5d918b

Fix weired pointer cast problem

0b3ae87

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Draft] Prototype of `auto-annot` #327

[Draft] Prototype of `auto-annot` #327

Uh oh!

moratorium08 commented Aug 21, 2025

Uh oh!

dc-mak commented Aug 22, 2025

Uh oh!

moratorium08 commented Aug 22, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

[Draft] Prototype of auto-annot #327

Are you sure you want to change the base?

[Draft] Prototype of auto-annot #327

Uh oh!

Conversation

moratorium08 commented Aug 21, 2025

Fulminate Instrumentations

Runtime Support

Running Example

Synthesis

Limitations

Uh oh!

dc-mak commented Aug 22, 2025

Uh oh!

moratorium08 commented Aug 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

[Draft] Prototype of `auto-annot` #327

[Draft] Prototype of `auto-annot` #327

moratorium08 commented Aug 22, 2025 •

edited

Loading