-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathmotivation.tex
507 lines (464 loc) · 18.3 KB
/
motivation.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
\section{Motivation}
\label{sec:motivation}
% Examples, examples, examples from simple to more advanced showing off
% the (difficult) features of inference.
This section presents a sequence of more and more challenging
examples for global type inference (GTI). To spice up our examples
somewhat, we assume some predefined utility classes with the following
interfaces.
\begin{lstlisting}[style=fgj]
class Bool {
Bool not();
}
class Int {
Int negate ();
Int add (Int that);
Int mult (Int that);
}
class Double {
Double negate ();
Double add (Double that);
Double mult (Double that);
}
\end{lstlisting}
We generally use upper case single-letter identifiers like $\TVX,
\TVY, \dots$ for type variables.
Given a FGJ-GT class \CL 0,
we call any FGJ class \CL i that can be transformed to \CL 0 by
erasing type annotations a \emph{completion of \CL 0}.
\subsection{Multiplication}
\label{sec:multiplication}
Here is the \TFGJ code for multiplying the components of a
pair.\footnote{We indicate \TFGJ code fragments by using a
{gray background}.}
\begin{lstlisting}[style=tfgj]
class MultPair {
mult (p) { return p.fst.mult(p.snd); }
}
\end{lstlisting}
Assuming the parameter typing $\mv p:\mv P$, result type $\mv R$, and that
\texttt{mult} in the body refers to \texttt{Int.mult}, we
obtain the following constraints.
\begin{itemize}
\item From \texttt{p.fst}: $\mv P \subconstr \mathtt{Pair}\Angle{\TVX,\TVY}$ and
$\texttt{p.fst} : \TVX$.
\item From \texttt{p.snd}: $\mv P \subconstr \mathtt{Pair}\Angle{\TVZ,\TVW}$ and
$\texttt{p.snd} : \TVW$.
\item The two constraints on $\mv P$ imply that $\TVX \eqconstr \TVZ$ and
$\TVY \eqconstr \TVW$.
\item From \texttt{.mult (p.snd)}: $\TVX \subconstr \mathtt{Int}$, $\TVY \subconstr
\mathtt{Int}$, and $\mathtt{Int} \subconstr \mv R$.
\end{itemize}
The return type $\mv R$ only occurs positively in the constraints, so we can
set $\mv R = \mathtt{Int}$.
The argument type $\mv P$ only occurs negatively in the constraints,
so $\mv P = \mathtt{Pair} \Angle{\TVX,\TVY}$.
This reasoning gives rise to the following completion.
% \begin{lstlisting}[style=fgj]
% class MultPair {
% Int mult (Pair<Int,Int> p) { return p.fst.mult(p.snd); }
% }
% \end{lstlisting}
\begin{lstlisting}[style=fgj]
class MultPair {
<X extends Int, Y extends Int>
Int mult (Pair<X,Y> p) { return p.fst.mult(p.snd); }
}
\end{lstlisting}
We obtain a second completion if we assume that \texttt{mult} refers to
\texttt{Double.mult}.
\begin{lstlisting}[style=fgj]
class MultPair {
<X extends Double, Y extends Double>
Double mult (Pair<X,Y> p) { return p.fst.mult(p.snd); }
}
\end{lstlisting}
Finally, the definition of \texttt{mult} might be recursive, which
generates different constraints for the method invocation of \texttt{mult}.
\begin{itemize}
\item From \texttt{.mult (p.snd)}: $\TVX \subconstr \mathtt{MultPair}$, $\TVY \subconstr
\mathtt{P}$, and $\mathtt{R} \subconstr \mv R$.
\end{itemize}
Transitivity of subtyping applied to $\TVY \subconstr \mathtt{P}$ and $\mv P \subconstr \mathtt{Pair}\Angle{\TVX,\TVY}$
yields the constraint $\TVY \subconstr \mathtt{Pair}\Angle{\TVX,\TVY}$, which triggers the
occurs-check in unification and is hence rejected.
% The corresponding example in full Java would be the dot product. In
% full Java, we also have to deal with overloading of the multiplication
% operator, which amounts to considering \texttt{Int.mult} and
% \texttt{Double.mult}. Overloading is not allowed in FGJ, but method
% names in different classes may overlap.
The two solutions can be combined to
\begin{lstlisting}[style=fgj]
class MultPair {
<X extends T1, Y extends T2>
T0 mult (Pair<X,Y> p) { return p.fst.mult(p.snd); }
}
\end{lstlisting}
where
\begin{align*}
(T_0, T_1, T_2) & \in \{ (\mv{Int}, \mv{Int}, \mv{Int}), (\mv{Double}, \mv{Double}, \mv{Double}) \}
\end{align*}
\subsection{Inheritance}
\label{sec:inheritance}
\begin{figure}[tp]
\begin{subfigure}[t]{0.49\linewidth}
\begin{lstlisting}[style=tfgj]
class A1 {
m(x) { return x.add(x); }
}
class B1 extends A1 {
m(x) { return x; }
}
\end{lstlisting}
\end{subfigure}
\begin{subfigure}[t]{0.49\linewidth}
\begin{lstlisting}[style=tfgj]
class A2 {
m(x) { return x; }
}
class B2 extends A2 {
m(x) { return x.add(x); }
}
\end{lstlisting}
\end{subfigure}
\caption{Method overriding}
\label{fig:method-overriding}
\end{figure}
Let's start with the artificial example in the left
listing of Figure~\ref{fig:method-overriding} and ignore the \texttt{Double} class. Type
inference proceeds according
to the inheritance hierarchy starting from the superclasses. In class
\texttt{A1}, the inferred method type is \texttt{Int A1.m (Int)}. Class \texttt{B1} is a
subclass of \texttt{A1} which must override \texttt{m} as there is no
overloading in FGJ. However, the inferred method
type is \texttt{<T> T B1.m(T)}, which is not a correct
method override for \texttt{A1.m()}.
Hence, GTI must instantiate the method type in the subclass \texttt{B1} to
\texttt{Int B1.m(Int)}.
Conversely, for the right listing of
Figure~\ref{fig:method-overriding}, GTI infers the types
\texttt{<T> T A2.m (T)} and \texttt{Int B2.m (Int)}. Again, these
types do not give rise to a correct method override and
GTI is now forced to instantiate the type in the superclass to
\texttt{Int A2.m (Int)}.
\todo[inline]{This example shows that GTI must first collect the constraints
for all methods \texttt{m{}} in a class hierarchy. Generalization can only happen after
all constraints on \texttt{m}'s type have been considered.}
% Otherwise, we would get strange results. Consider extending the
% program with classes \texttt{A} and \texttt{B} by the following class.
% \begin{lstlisting}[style=tfgj]
% class C {
% mbool(x) { return x.m(new Bool()); }
% }
% \end{lstlisting}
% Inferring the type of \texttt{C.mbool()} using \texttt{A.m()} would
% fail because \texttt{Int} and \texttt{Bool} are not
% unifiable. However, inferring the type of \texttt{C.mbool()} using
% \texttt{B.m()} (with the generic type) would succeed and yield the
% typing
% \begin{lstlisting}[style=fgj]
% class C {
% Bool mbool(B x) { return x.m(new Bool()); }
% }
% \end{lstlisting}
In full Java, type inference would have to offer two alternative
results: either two different
overloaded methods (one inherited and one local) in \texttt{B1}/\texttt{B2} or
impose the typing \texttt{Int B1.m(Int)} or \texttt{Int A2.m(Int)} to enforce correct overriding.
\subsection{Inheritance and Generics}
\label{sec:inheritance-generics}
\begin{lstlisting}[float,caption={Function class}, label={lst:function-class},style=fgj]
class Function<S,T> {
T apply(S arg) { return this.apply (arg); }
}
\end{lstlisting}
Suppose we are given a generic class for modeling functions in FGJ (Listing~\ref{lst:function-class}).
This code is constructed to serve as an ``abstract'' super class to derive more
interesting subclasses.
The class \texttt{Function<S,T>} must be presented in this explicit
way. Its type annotations \textbf{cannot} be inferred by GTI because
the use of the generic class parameters in the method type cannot be inferred from the
implementation.
If we applied GTI to the type-erased version of
Listing~\ref{lst:function-class}, the \texttt{apply} method would be considered a generic method:
\begin{center}
\begin{minipage}{0.3\linewidth}
\begin{lstlisting}[style=tfgj]
apply (arg) { ... }
\end{lstlisting}
\end{minipage}
\hfill\texttt{ --GTI--> }\hfill
\begin{minipage}{0.45\linewidth}
\begin{lstlisting}[style=fgj]
<A,B> B apply (A arg) { ... }
\end{lstlisting}
\end{minipage}
\end{center}
The typing of \texttt{apply} in Listing~\ref{lst:function-class} is an
instance of this result, so that completeness of GTI is preserved!
% While it is possible to force GTI to infer the intended typing, it requires some awkward coding.
% \begin{lstlisting}[style=tfgj]
% class Function<S,T> {
% S in;
% T out;
% dummy (arg) { return this.dummy(in); }
% apply (arg) { return new Pair<>(this.out, this.dummy (arg)).fst; }
% }
% \end{lstlisting}
Now that we have the abstract class \texttt{Function<S,T>} at our
disposal, let us apply GTI to a class of boxed values with a
\texttt{map} function:
\begin{lstlisting}[style=tfgj]
class Box<S> {
S val;
map(f) {
return new Box<>(f.apply(this.val));
} }
\end{lstlisting}
GTI finds the following constraints
\begin{itemize}
\item the return value must be of type \texttt{Box<T>}, for some type
\texttt{T},
\item \texttt{T} is a supertype of the type returned by
\texttt{f.apply},
\item \texttt{apply} is defined in class \texttt{Function<S1,T1>} with
type \texttt{T1 apply(S1 arg)},
\item hence \texttt{T1\,<:\,T} and \texttt{S\,<:\,S1} (because
\texttt{this.val\,:\,S}),
\end{itemize}
and resolves them to the desired outcome where \texttt{T1=T} and
\texttt{S=S1} using the methods of
Simonet~\cite{DBLP:conf/aplas/Simonet03}.
\begin{lstlisting}[style=fgj]
class Box<S> {
S val;
<T> Box<T> map(Function<S,T> f) {
return new Box<T>(f.apply<S,T>(this.val));
} }
\end{lstlisting}
But what happens if we add subclasses of \texttt{Function}?
For example:
\begin{lstlisting}[style=tfgj]
class Not extends Function<Bool,Bool> {
apply(b) { return b.not(); }
}
class Negate extends Function<Int,Int> {
apply(x) { return x.negate(); }
}
\end{lstlisting}
\todo[inline]{Does this approach really make sense? If we have a
subclass that overrides a method of the superclass, then the public
interface should be the one of the superclass. Subsequently, the
implementation of the method in the subclass should be checked
against the type inferred for the superclass.}
If we rerun GTI with these classes, we now have additional
possibilities to invoke the \texttt{apply} method. With \texttt{Not}, we need to use the
generic type of \texttt{Function.apply()}, but instantiate it according to
\texttt{Function<Bool,Bool>}. Thus,
we obtain the constraints \texttt{Bool $\subconstr$ T} and \texttt{S $\subconstr$ Bool} for \texttt{T = Bool}
and \texttt{S = Bool}, which are both satisfiable. With
\texttt{Negate} we run into the same situation with the constraints
\texttt{Int $\subconstr$ Int} and \texttt{Int $\subconstr$ Int}.
% That means there is no way to
% \emph{directly} invoke \texttt{Not.apply} or \texttt{Negate.apply}
% from \texttt{Box.map}, but it may happen indirectly via inheritance.
Here is another subclass of \texttt{Function<S,T>} that we want
to consider.
\begin{lstlisting}[style=fgj]
class Identity<S> extends Function<S,S> {
S apply(S arg) { return arg; }
}
\end{lstlisting}
Here, we obtain the following type constraints
\begin{itemize}
\item \texttt{apply} is defined in class \texttt{Identity<S1>} with
type \texttt{S1 apply (S1 arg)},
\item hence \texttt{S1 $\subconstr$ T} and \texttt{S $\subconstr$ S1}.
\end{itemize}
Resolving the constraints yields \texttt{S = T} thus the typing
\begin{lstlisting}[style=fgj]
Box<S> map(Identity<S> f);
\end{lstlisting}
which is an instance of the previous typing.
\subsection{Multiple typings}
\label{sec:multiple-results}
% Our global type inference algorithm is able to infer every type annotation.
% For the sake of simplicity we will only consider method types in this
% paper and assume that field types are specified.
\begin{figure}[tp]
\begin{minipage}{0.49\linewidth}
\begin{lstlisting}[style=fgj]
class List<A> {
List<A> add(A item) {...}
A get() { ... }
}
\end{lstlisting}
\end{minipage}
~$\left|
\begin{minipage}{0.49\linewidth}
\begin{lstlisting}[style=tfgj]
class Global{
m(a){
return a.add(this).get();
} }
\end{lstlisting}
\end{minipage}\right.$
\caption{Example for multiple inferred types}
\label{fig:example-types-not-unique}
\end{figure}
Global type inference processes classes in order of
dependency.
To see why, consider the classes \texttt{List<A>} and \texttt{Global}
in Figure~\ref{fig:example-types-not-unique}.
Class \texttt{Global} may depend on
class \texttt{List} because \texttt{Global} uses methods \texttt{add} and
\texttt{get} and \texttt{List} defines methods with the same names.
The dependency is only approximate because, in general, there may be additional classes
providing methods \texttt{add} and \texttt{get}.
In the example, it is safe to assume that the types for the methods of class \texttt{List}
are already available, either because they are given (as in the code
fragment) or because they were inferred before considering class \texttt{Global}.
The method \texttt{m} in class \texttt{Global} first invokes
\texttt{add} on \texttt{a}, so the type of \texttt{a} as well as the
return type of \texttt{a.add(this)} must be
\texttt{List<T>}, for some \texttt{T}. As \texttt{this} has
type \texttt{Global}, it must be that \texttt{Global} is a
subtype of \texttt{T}, which gives rise to the constraint
\texttt{Global $\subconstr$ T}. By the typing of \texttt{get()} we
find that the return type of method \texttt{m} is also \texttt{T}.
But now we are in a dilemma because FGJ only supports \emph{upper bounds} for
type variables,\footnote{Java has the same restriction. Lower bounds
are only allowed for wildcards.} so that
\texttt{Global $\subconstr$ T} is not a valid constraint in FGJ.
To stay compatible with this restriction, global type inference
expands the constraint by instantiating \texttt{T} with the (two) superclasses fulfilling
the constraint, \texttt{Global} and \texttt{Object}. They give rise to two incomparable
types for
\texttt{m}, \texttt{List<Global> -> Global} and
\texttt{List<Object> -> Object}. So there are two different FGJ
programs that are completions of the \texttt{Global} class.
GTI models these instances by inferring an \emph{intersection type}
\texttt{List<Global> -> Global \& List<Object> -> Object}
for method \texttt{m} and the different FGJ-completions of class \texttt{Global} are
instances of the intersection type:\footnote{The cognoscenti will be
reminded of overloading. However, FGJ does not support overloading,
so we rely on resolution by subsequent uses of the method. Moreover,
this intersection type cannot be realized by overloading in a Java
source program because it is resolved according to the raw classes
of the arguments, in this case \lstinline{List}. It can be realized
in bytecode which supports overloading on the return type, too.%
}
\begin{center}
\begin{minipage}{0.49\linewidth}
\begin{lstlisting}[style=fgj]
class Global {
Global m(List<Global> a) {
return a.add(this).get();
}
\end{lstlisting}
\end{minipage}
\begin{minipage}{0.49\linewidth}
\begin{lstlisting}[style=fgj]
class Global {
Object m(List<Object> a) {
return a.add(this).get();
}
\end{lstlisting}
\end{minipage}
\end{center}
In this sense, the inferred intersection type represents a principal
typing for the class.
% \begin{lstlisting}[style=fgj]
% class Global {
% <Global <: T> T m( List<T> a ) {
% return a.add(this).get();
% }
% \end{lstlisting}
Additional classes in the program may further restrict the number of
viable types. Suppose we define a class \texttt{UseGlobal} as
follows:
\begin{lstlisting}[style=tfgj]
class UseGlobal {
main() {
return new Global().m((List<Object>) new List());
} }
\end{lstlisting}
Due to the dependency on \texttt{Global.m()}, type inference considers this class after class
\texttt{Global}. As it uses \texttt{m} at type
\texttt{List<Object> -> Object}, global type inference narrows the
type of \texttt{m} to just this alternative.
% \todo[inline]{If there are conflicting uses of a method with an
% intersection type \texttt{T1\&T2}, one might consider splitting the
% class into one providing the method with type \texttt{T1} and
% another providing it with \texttt{T2}.}
\subsection{Polymorphic recursion}
\label{sec:polym-recurs}
\begin{figure}[tp]
\begin{minipage}{0.49\linewidth}
\begin{lstlisting}[style=fgj]
class UsePair {
<X,Y> Object prc(Pair<X,Y> p) {
return this.prc<Y,X> (p.swap<X,Y>());
} }
\end{lstlisting}
\end{minipage}
~$\left|
\begin{minipage}{0.49\linewidth}
\begin{lstlisting}[style=tfgj]
class UsePair {
prc(p) {
return this.prc (p.swap());
} }
\end{lstlisting}
\end{minipage}\right.$
\caption{Example for polymorphic recursion}
\label{fig:examples-poly-rec}
\end{figure}
A program uses \emph{polymorphic recursion} if there is a generic method that is invoked
recursively at a more specific type than its definition.
As a toy example for polymorphic recursion consider the FGJ class \texttt{UsePair} with a
generic method \texttt{prc} that invokes itself
recursively on a swapped version of its argument pair
(Figure~\ref{fig:examples-poly-rec}, left).
This method makes use of polymorphic recursion because the type of the
recursive call is different from the declared type of the method. More
precisely, the declared argument type is \texttt{Pair<X,Y>} whereas
the argument of the recursive call has type
\texttt{Pair<Y,X>}---an instance of the declared type.
For this particular example, global type inference succeeds on the
corresponding stripped program shown in
Figure~\ref{fig:examples-poly-rec}, right, but it yields a more restrictive
typing of \texttt{<X> Object prc (Pair<X,X> p)} for the method.
A minor variation of the FGJ program with a non-variable instantiation makes type inference fail entirely:
\begin{lstlisting}[style=fgj]
class UsePair2 {
<X,Y> Object prc(Pair<X,Y> p) {
return this.prc<Y,Pair<X,Y>> (new Pair (p.snd, p));
}
}
\end{lstlisting}
Polymorphic recursion is known to make type inference intractable
\cite{DBLP:journals/toplas/Henglein93,DBLP:journals/toplas/KfouryTU93}
because it can be reduced to an undecidable semi-unification problem
\cite{DBLP:journals/iandc/KfouryTU93}. However, \emph{type checking} with
polymorphic recursion is tractable and routinely used in languages like Haskell
and Java.
GTI does not infer method types with polymorphic recursion. Inference either fails or
returns a more restrictive type. Classes making use of polymorphic recursion need to supply
explicit typings for methods in question.
% Features:
% \begin{itemize}
% \item overloaded methods
% \item class header is complete in the form
% $\mathtt{class}\ C\langle\overline X <: \overline N\rangle <: N \{
% \overline T\ \overline f;\ K\ \overline M \}$
% \item constructor fully typed (as it is determined by the types of the
% fields and the supertype)
% \item method signatures may be omitted, i.e.,
% $M\ ::=\ m(\overline x) \{ \ \mathtt{return}\ e;\ \}$
% \item polymorphic recursive methods must be annotated.
% \end{itemize}
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "TIforGFJ"
%%% End: