Nov 22, 2016

Parametricity, Ch 48

Announcements:

  • Talk at 3PM today
  • Project demos begin after Thanksgiving
  • Office hours tomorrow, 1-3PM

Questions

Kleene/Observational equivalence

  • What is the purpose of the base type 2? The System F that we saw in class did not have this type! Also, we don't want to reduce all computations in our language to just one type, do we?

[SCW: No, we don't. But we need a way of "observing" the results of computations, and functions by themselves aren't very good for that task. Part of the reason that we define observational equivalence is to have a good definition of equivalence for functions. But to do that we need to have a base type around to "bootstrap" the process.]
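
For reference, the chapter's definition works out to roughly this: e and e' of type τ are observationally equivalent iff for every program context C that places an expression of type τ into a complete program of type 2, the results C{e} and C{e'} are Kleene equivalent, i.e. they evaluate to the same boolean. The base type 2 supplies the ground observations at which programs are compared.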

  • This chapter's Kleene "equivalence" is also not really an equivalence until termination is proven, right?

[SCW: Yes. This chapter has the same "bug" as Ch 46.]

Definition of (parametric) logical equivalence

  • Could you clarify why the first attempt at defining logical equivalence doesn't work?

[SCW: The relation isn't well founded. It is defined by structural recursion on types, but the type doesn't get smaller.]

  • In Section 48.3 the book says “when polymorphism is impredicative, the type [p/t]τ might well be larger than ∀(t.τ)”. I’m not clear on what this sentence means: what does “impredicative” mean here? Why would [p/t]τ be larger than ∀(t.τ)?

    [SCW: Impredicative means that you can instantiate ∀(t.τ) with another quantified type, even ∀(t.τ) itself. Consider ∀(t.t). If we instantiate t with ∀(t.t -> t) we get a larger type.]
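
    To spell out that example: take τ = t, so the quantified type is ∀(t.t), and take p = ∀(t.t -> t). Then [p/t]τ = ∀(t.t -> t), which is a syntactically larger type than ∀(t.t), so a definition of logical equivalence by structural recursion on the type would not be well founded.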

  • Could you give more explanation of the definition of parametric logical equivalence?

[SCW: Will do! Please ask questions in class.]

  • Could we go over the logical equivalence section for System F in class tomorrow in detail? In particular, I don't quite understand the connection between free type variables and any admissible relation between types ρ and ρ', and how that helps in defining logical equivalence.

[SCW: The problem with free type variables is twofold. First, the well-foundedness issue above. Second, by generalizing to admissible relations, we prove a stronger property.]

  • Why are there tensor product symbols in the definition of Parametric Logical Equivalence? What do those symbols mean?

[SCW: These symbols are used to "extend" the substitutions with new mappings from type variables to types/relations.]
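
Concretely, if η is a relation assignment and δ is a type substitution, then η ⊗ t ↪ R is meant to denote η extended so that it sends t to the relation R, and δ ⊗ t ↪ ρ denotes δ extended with the binding t ↦ ρ.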

  • Could you kindly give simple-minded concrete examples for each of the definitions used in the chapter? For instance, what is an example of an admissible relation that is not logical equivalence? What is an example of an admissible relation assignment and so on...

[SCW: Here is an admissible relation that is not logical equivalence:

R : 2 <-> 2, where
  R(true, false)
  R(false, true)
  R(e, e') whenever e ->* v, e' ->* v', and R(v, v')

Here is another one:

R : 2 <-> (all a. a -> a -> a), where
  R(true, /\a. \x.\y.x)
  R(false, /\a. \x.\y.y)
  plus anything else required to respect observational equivalence]

Connection to normalization

  • The "other" normalization proof for F that I saw in undergrad was in Girard's book, using reducibility candidates. The connection I'm seeing is that whereas Bob essentially assigns relations to type variables, Girard assigns a "candidate set" of relations to type variables. I see the technical similarities between the two approaches - we're sidestepping a technical problem with induction using a clever logical relation trick - but I fail to see why Bob comments that "the concept of parametricity is latent in the strong normalization proof by Girard".

    [SCW: In both cases we are sidestepping a technical problem with structural recursion by using a larger (unary/binary) relation than necessary. This is exactly what Bob means by his remark. ]

Theorem 48.12 (Parametricity)

  • I cannot follow the proof of the parametricity theorem; it would be great if you could clarify it further.

  • Could we go through the proof of 48.12 more carefully? I'm still not quite sure I fully understand it, especially the type application case. The variable case also seems like it falls out magically from the definitions.

    [SCW: will do.]

Representation independence && applications of parametricity

  • I still somehow lack intuition about the parametricity proof (mostly about its key argument: taking related inputs to related outputs). It makes sense, but I still don't have a strong feeling about why this is the right thing to do™. I'd love to talk more about this and hopefully get to that point in my understanding.

    [SCW: not sure how to help here. Intuition about the definitions? The structure of the parametricity theorem? The applications?]

  • The use of parametricity (for the identity and for products - basically section 48.4) to derive important properties of the inhabitants of these types, while being fairly short, remains a little convoluted and seemingly ad hoc. Would there be a nice general "procedure" (depending only on the structure of the involved type) - or even better, a more powerful/explicit result - to prove these parametricity properties?

    [SCW: Again, not sure how to help with the 'seems ad hoc' complaint. The properties are defined generically via the type structure. They will differ for different type structures, so there won't be a general result for all types. But that is the point; the types tell us something about the program. A concrete instance for the identity type is sketched below.]
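
As a rough illustration of the Section 48.4 result for the identity type, here is a small Haskell sketch (mine, not the book's). Parametricity for the type ∀(t. t → t) says that any inhabitant commutes with every function, i.e. f (g x) = g (f x); taking g to be a constant function then forces f x = x, so f must behave like the identity. The name `commutes` below is made up for the illustration.

```haskell
{-# LANGUAGE RankNTypes #-}

-- The property parametricity guarantees for any f :: forall a. a -> a:
-- f commutes with every g.  (In System F there is no way for f to diverge
-- or to inspect its argument, so this forces f to be the identity.)
commutes :: Eq c => (forall a. a -> a) -> (b -> c) -> b -> Bool
commutes f g x = f (g x) == g (f x)

main :: IO ()
main = do
  let f :: forall a. a -> a
      f = id                               -- the only total inhabitant
  print (commutes f (+ 1) (41 :: Int))     -- True
  print (commutes f show  (42 :: Int))     -- True
```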

  • I'm kind of confused by Section 48.5. I thought the proof of representation independence in Chapter 17 depended on the particular choice of R (since R discusses how individual parts of the type should behave), but here it seems the exact definition of R is mostly ignored.

    [SCW: The requirement for representation independence is that there exists an (admissible) relation between the types that respects the operations of the abstract structure. ]
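
To make that concrete, here is a toy Haskell sketch (not the example from Chapter 17): two implementations of an abstract counter, one counting up and one counting down, related by R(n, m) iff n = -m. Each operation respects R, so no client written against the interface can tell the two apart. The names (Counter, upCounter, downCounter, client) are made up for the illustration.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- An abstract counter: a hidden representation type together with the
-- operations a client is allowed to use.
data Counter = forall rep. Counter
  rep            -- zero
  (rep -> rep)   -- increment
  (rep -> Int)   -- observe

-- Two implementations with different representations of the same behaviour.
-- They are related by R(n, m) iff n == -m, and each operation respects R.
upCounter, downCounter :: Counter
upCounter   = Counter (0 :: Int) (+ 1)        id        -- counts up
downCounter = Counter (0 :: Int) (subtract 1) negate    -- counts down

-- A client can only use the abstract interface, so it observes the same
-- result for either implementation (here, 3).
client :: Counter -> Int
client (Counter z inc obs) = obs (inc (inc (inc z)))
```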

  • Because we have kept talking about parametricity this semester, and this chapter is one of the final chapters of the book, I get the impression that it is very important and very useful. However, I don’t really feel that way after reading the chapter (maybe because I don’t fully understand it). For example, we said that parametricity can be used for compiler optimizations, but how many optimizations based on the parametricity property can we do in practice? My impression is that most compiler optimizations are about low-level concerns like memory allocation (for locality, etc.); can we justify those optimizations using the parametricity property?

    [SCW: An example of a compiler optimization that requires parametricity is "list fusion". We want, for example, to replace "map f (map g xs)" with "map (f . g) xs" so that we don't have to allocate and deallocate the inner list. A sketch of how this looks in practice is below.]
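
For what it's worth, here is a minimal GHC Haskell sketch of that optimization, stated as a rewrite rule (this is the standard "map/map" example from the GHC documentation, not something from the notes). The rule is only sensible because the equation holds for every choice of f, g, and xs, which is the kind of guarantee reasoning from map's parametric type provides; GHC's real list fusion (foldr/build) leans on parametricity even more directly.

```haskell
module MapFusion where  -- placeholder module name

-- Rewrite rule: whenever the left-hand side appears (with optimization on),
-- GHC may replace it with the right-hand side, avoiding the intermediate
-- list built by the inner map.
{-# RULES
  "map/map"  forall f g xs.  map f (map g xs) = map (f . g) xs
  #-}

-- With the rule (and -O), the intermediate list from map (+ 1) need not be built.
example :: [Int] -> [Int]
example xs = map (* 2) (map (+ 1) xs)
```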