A Strategy for Inserting returns in Verification Conditions

F*'s verification condition generation is, for the most part, syntax directed. Every let yields a bind in a VC; applications are monadic, yielding binds of their own; lifts are inserted when sequentially composing computations in different monads, etc.Although the binds and lifts are determined syntactically, it's less clear when exactly a return should be inserted. After all, in our formalizations of F*'s type system, the rule of return looks something like this:

G |- e : Tot t
----------------------------
G |- e : Pure t (return e)

which is clearly not syntax directed.

F*'s default implementation of this rule is in fact (roughly) to insert a return on every Tot value, see:

here, followed by many other ad hoc insertions of returns, carefully crafted after much experience, but without any overarching principle. e.g., here, here, here, here, here etc.

But, this is very inefficient, from multiple perspectives:

Each return introduces an indirection in a VC, as defined by Prims.pure_return, which introduces a quantifier as an equality hypothesis.
Each return turns a Tot term into a Pure term (GTot into Ghost), disabling various optimizations that apply when sequentially composing Tot/GTot terms, leading to further bloat.
The additional indirections are hard to predict, making it difficult to write tactics to process VCs after the fact, as noticed especially in the context of Vale.
The indirections block normalization, causing more proofs to have to be done equationally by the SMT solver.

A strategy for inserting returns

In what follows, I'll mainly refer to Pure terms, treating Tot terms as a sub-class of Pure terms. What I say applies equally to Ghost and GTot terms.

A: The main observation

Pure terms are self-specifying---F* and SMT can reason directly about their definitions. Naming sub-terms of a pure term in a VC is usually not necessary. As such, inserting a return on every Tot term is unnecessary. Instead, let's aim to delay the insertion of a return until

a. the point where a pure sub-term p of a larger impure term is lifted to the impure effect of its context, since the larger impure term cannot be returned itself. Inserting a return on p at that point allows reasoning about the impure term in a VC while making use of the definition of its pure sub-term p.

b. the point where a pure term's computed type and effect are compared against its expected type and effect. A return at that point allows proving that the term has its expected type and effect relying on both on "intrinsic reasoning", using its computed type and effect, as well as "extrinsic reasoning", using its definition.

With this in mind here's how returns are now inserted in F*.

B: The form of return

First, rather than the simple return rule above, FStar.TypeChecker.Util.maybe_assume_result_eq_pure_term implements something that's closer to this:

G |- e : Pure t wp    [when should_return e t]
----------------------------------------------------------------------------
G |- e : Pure t (bind wp (fun x -> forall y. y==x /\ y == e ==> post y))

C: Inserting returns on pure sub-terms of impure computations

Consider term e = let x = (e1 <: M1 t1 wp1) in (e2 : Pure t2 wp2). If M1 is not Pure, then the resulting term is not Pure either and e cannot be returned in its entirety. So, we insert a return on its pure sub-term e2. See here (check_inner_let) and here (sequencing with ;)
Consider the application (e_f <: Pure (x:t1 -> Pure t2 wp2)) (e_x <: M t1 wp1), where M is not Pure. This is interpreted as the monadic application let f = e_f in let x = e_x in f x. As in the previous case, although the pure sub-term f x is part of a larger impure term (because of e_x in effect M). So, we insert a return on f x. See here (monadic_application)
Consider match x with P1 -> (e1 : Pure t1 wp1) | P2 -> (e2 : M t1 wp2). Again, the entire match cannot be returned, since it has an impure sub-term. So, we insert a return on the pure branch e1. See here (tc_eqn) and its call-site (bind_cases).
Consider the term e = let rec f x = e_f in (e' : Pure t wp). The let rec always binds a lambda term, so it is always Tot (even though, of course, e_f may itself be effectful). Now, although e itself is a Pure, due to a limitation of the SMT encoding, local let rec terms cannot appear in a VC. (See issue #1362). This means that until the SMT encoding is capable of encoding inner let recs, e cannot be returned in its entirety. So, we insert a return around its pure sub-term e'. See here (check_inner_let_rec)

D: Inserting returns when comparing the type of a computation with its expected type

When proving that a value (v : t) has its expected type t', we DO NOT insert a return on the value. Instead, we compute a predicate guard : t -> Type which is sufficient to establish t <: t' and simply apply guard v`. See here (value_check_expected_typ and here (check_and_ascribe).
When checking that a computation (e : M t wp) has an expected effect M' t' wp', we try to insert a return on e first (i.e., in case M is Pure and should_return is true etc.) See here (check_expected_effect)
When subsuming the result type of a computation (e : M t wp) to some other type t', where guard: t -> Type is sufficient to establish t <: t', we build the VC M.bind wp (fun x -> lift_Pure_M (Pure.assert (guard x) (Pure.return x))). See here (comp_check_expected_typ) and here (weaken_result_typ).

E: `should_return` and avoiding duplicate returns

All returns (except the return mentioned in D.3 above) are guarded by a should_return check. See its implementation here.

should_return (e : M t wp) is true when all of the following conditions are true:

M is pure (or ghost)
The return type t is not unit or squash
The head symbol of e is not marked irreducible
e does not contain a let rec

Additionally, in maybe_assume_result_eq_pure_term, we include a check to ensure that in addition to should_return (e : M t wp) inserts a return only if wp doesn't already contain a return.

Implicit returns in `bind`

Aside from the strategy described above, the implementation of bind contains many special cases, some of which cause also cause a return to be inserted.

The following is an adaptation of this code comment.

Consider sequentially composing let x = (e1 : Pure t1 wp1) in (e2 : M2 t2 wp2).

In case should_return (e1 : Pure t1 wp1) is false, then we'll just build M.bind (lift_Pure_M wp1) (fun x -> lift_M2_M wp2).

Otherwise, we aim to build the following VC: M.bind (lift_Pure_M wp1) (x == e1 ==> lift_M2_M (wp2[e1/x])). The reason for substituting e1 for x in wp2 is to allow proofs there to proceed by just computing with e1.

The additional equality hypothesis x = e1 may seem redundant, but it's not. The post-condition in wp1 or its result type t1 type may carry some meaningful information about e1. Then, it's important to weaken wp2 with the equality, so that whatever property is proven about the result of wp1 (i.e., x) is still available in the proof of wp2.

However, we apply a few optimizations:

a. if wp1 is already a return or a partial return, then it already provides this equality, so there's no need to add it again and we instead generate M.bind (lift_Pure_M wp1) (lift_M2_M (wp2[e1/x]).

b. if wp1 has a trivial post-condition (e.g., it is a strengthening of a Tot computation), then the post-condition does not carry any useful information. We have two sub-cases:

i. In case the user option vcgen.optimize_bind_as_seq = without_type we instead generate M.assert_wp (wp1 (\x. True)) (lift_M2_M (wp2[e1/x]))

ii. Although the post-condition of wp1 does not carry useful information, its result type might. When applying the optimization above, the SMT solver is faced with reconstructing the type of e1. Usually, it can do this, but in some cases (e.g., if the result type has a complex refinement), then this optimization can actually cause a VC to fail. So, we add an option to recover from this, at the cost of some VC bloat. In case the user option vcgen.optimize_bind_as_seq = with_type, M.assert_wp (wp1 (\x. True)) (lift_M2_M (wp2[Prims.with_type e1 t1/x])) where with_type e1 t1, decorates e1 with its type before substituting. This allows the SMT solver to recover the type of e1 (using a primitive axiom about with_type), without polluting the VC with an additional equality. Note, specific occurrences of with_type e t can be normalized away to e if requested explicitly by a user tactic.

Here is the effect of these optimizations.

module Vcgen

assume val f1 : x:nat -> y:int{y >= x}
assume val f2 : x:nat -> y:int{y >= x}
assume val f3 : x:nat -> y:int{y >= x}

#set-options "--max_fuel 0 --max_ifuel 0 --debug Vcgen --debug_level SMTQuery --print_full_names"
let test123 (x1:int{x1 > 1}) =
  let x2 = f1 x1 in
  let x3 = f2 x2 in
  let x4 = f3 x3 in
  assert (x4 >= x1)

(* With --vcgen.optimize_bind_as_seq without_type

Encoding query formula: forall (x1293:(x1294:Prims.int{ x1294 > 1 })).
  (*could not prove post-condition*)
  (*Subtyping check failed; expected type Prims.nat; got type (x1240:Prims.int{ x1240 > 1 })*)
  x1293 > 1 ==> x1293 >= 0 /\
  (*Subtyping check failed; expected type Prims.nat; got type (y244:Prims.int{ y244 >= x1158 })*)
  Vcgen.f1 x1293 >= x1293 ==> Vcgen.f1 x1293 >= 0 /\
  ((*Subtyping check failed; expected type Prims.nat; got type (y248:Prims.int{ y248 >= x2182 })*)
  Vcgen.f2 (Vcgen.f1 x1293) >= Vcgen.f1 x1293 ==> Vcgen.f2 (Vcgen.f1 x1293) >= 0 /\
  (Vcgen.f3 (Vcgen.f2 (Vcgen.f1 x1293)) >= x1293))

--------------------------------------------------------------------------------
Default:

Encoding query formula: forall (x1413:(x1414:Prims.int{ x1414 > 1 })).
  (*could not prove post-condition*)
  (*Subtyping check failed; expected type Prims.nat; got type (x1240:Prims.int{ x1240 > 1 })*)
  x1413 > 1 ==> x1413 >= 0 /\
  (forall (any_result415:(x1416:Prims.int{ x1416 > 1 })).
      x1413 == any_result415 ==>
      (forall (any_result417:(y418:Prims.int{ y418 >= x1413 })).
          Vcgen.f1 x1413 == any_result417 ==>
          (*Subtyping check failed; expected type Prims.nat; got type (y244:Prims.int{ y244 >= x1158 })*)
          Vcgen.f1 x1413 >= x1413 ==> Vcgen.f1 x1413 >= 0 /\
          (forall (any_result419:(y420:Prims.int{ y420 >= x1413 })).
              Vcgen.f1 x1413 == any_result419 ==>
              (forall (any_result421:(y422:Prims.int{ y422 >= Vcgen.f1 x1413 })).
                  Vcgen.f2 (Vcgen.f1 x1413) == any_result421 ==>
                  (*Subtyping check failed; expected type Prims.nat; got type (y248:Prims.int{ y248 >= x2182 })*)
                  Vcgen.f2 (Vcgen.f1 x1413) >= Vcgen.f1 x1413 ==> Vcgen.f2 (Vcgen.f1 x1413) >= 0 /\
                  (forall (any_result423:(y424:Prims.int{ y424 >= Vcgen.f1 x1413 })).
                      Vcgen.f2 (Vcgen.f1 x1413) == any_result423 ==>
                      (forall (any_result425:(y426:Prims.int{ y426 >= Vcgen.f2 (Vcgen.f1 x1413) })).
                          Vcgen.f3 (Vcgen.f2 (Vcgen.f1 x1413)) == any_result425 ==>
                          Vcgen.f3 (Vcgen.f2 (Vcgen.f1 x1413)) >= x1413))))))

*)

A Strategy for Inserting returns in Verification Conditions

A strategy for inserting returns

A: The main observation

B: The form of return

C: Inserting returns on pure sub-terms of impure computations

D: Inserting returns when comparing the type of a computation with its expected type

E: should_return and avoiding duplicate returns

Implicit returns in bind

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

E: `should_return` and avoiding duplicate returns

Implicit returns in `bind`