-
Notifications
You must be signed in to change notification settings - Fork 97
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
74 changed files
with
4,236 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
![[Week 2 - Cache-Consistency I]] | ||
![[Week 3 - Cache-Consistency II]] | ||
![[Week 4 - Atomic Execution]] | ||
![[Week 5 - Deadlocks]] | ||
![[Week 6 - Transactions]] | ||
![[Week 7 - Hardware Transactional Memory]] | ||
![[Week 8 - Function Dispatching]] | ||
![[Week 9 - Multi-Inheritance]] | ||
![[Week 10 - C++ Multi-Inheritance and Dynamic Dispatching]] | ||
![[Week 11 - Mixins and Traits]] | ||
![[Week 12 - Prototybe-Based Programming in Lua]] | ||
![[Week 13 - Aspect-Oriented Programming in AspectJ]] | ||
![[Week 14 - Metaprogramming]] | ||
![[Week 15 - Continuations]] |
Binary file not shown.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# About these notes | ||
|
||
These notes are my lecture notes on the Programming Languages course held in the winter term 2021/2022 by Dr. Michael Petter. The notes are based on the course slides [^1]. Images are taken from the course slides. | ||
|
||
- [[0 - everything.pdf]] contains the notes of all chapters exported as a single PDF | ||
|
||
The notes are written in [Obsidian markdown](https://obsidian.md/) and are best viewed in Obsidian. | ||
|
||
[^1]: Michael Petter -- Programming Languages, lecture slides |
242 changes: 242 additions & 0 deletions
242
Programming Languages/Week 10 - C++ Multi-Inheritance and Dynamic Dispatching.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,242 @@ | ||
|
||
# C++: Multi-Inheritance and Dynamic Dispatching | ||
|
||
What happens if we combine Multi-Dispatching and Virtual Tables? | ||
|
||
If $C(A, B)$, in the memory layout, we would need a vptr before $A$ as well as before $B$! | ||
|
||
##### C++ example | ||
Recall [[Week 9 - Multi-Inheritance#Ambiguities from Multi-Inheritance|this example]], but now with a virtual overwritten method `f`. | ||
|
||
```c++ | ||
class A { public: | ||
int a; | ||
virtual int f(int); | ||
}; | ||
class B { public: | ||
int b; | ||
virtual int f(int); | ||
virtual int g(int); | ||
}; | ||
class C : public A, public B { | ||
int c; | ||
virtual int f(int); | ||
}; | ||
|
||
// ...somewhere in the code: | ||
C c; | ||
B* pb = &c; | ||
pb->f(42); | ||
``` | ||
At the point where `pb` is of type `B*`, all the compiler is supposed to use is the fact that this points to an object of type `B`. So the vptr at the beginning of the memory layout of this `B` object must point to a vtable which contains the `C::f` version, not the `B::f` version. | ||
![[vtable-multiinh-ABC-1.png]] | ||
## Basic Virtual Tables | ||
Basic virtual tables are made up of | ||
- the *offset to top*, i.e. how much higher you need to go to reach the enclosing object's top, starting at the vptr (e.g. $\Delta B$) | ||
- *typeinfo pointer* to an RTTI object (= Run-Time Type Information) | ||
- multiple *virtual function pointers*, used to resolve virtual methods | ||
When multiple inheritance is used, *virtual tables are composed*: The vptrs are pointing to the corresponding *virtual subtables*. | ||
(QUESTION: Is this distinction into vtables/v-subtables actually relevant?) | ||
With this setup, casting preserves the link between an object and its corresponding virtual subtable. | ||
##### Finding the vtables of a program | ||
The vtables of a compilation unit are output by the command | ||
```bash | ||
clang -cc1 -fdump-vtable-layouts -emit-llvm source.cpp | ||
``` | ||
|
||
### Thunks | ||
##### Casting issues when calling virtual methods | ||
The following problem can occur when casting: Imagine the overwritten `C::f` method is called from an object of type `B`. Then the `C::f` methods expects as this reference an object of type `C`, so the `B` object must be cast to `C`. This information is not available, however, in the vtable as presented until now! | ||
|
||
![[vtable-multiinh-this-references.png]] | ||
|
||
##### Thunks to the rescue | ||
*Thunks* are "trampoline methods" that adapt the `this` reference before delegating to the original virtual method implementation. | ||
|
||
In our example, in the B-in-C-vtable, `f(int)` is represented by the thunk `_f(int)`. It adds the compiletime constant $\Delta B$ to `this` and then calls `f(int)`. | ||
|
||
Compiled code: | ||
```llvm | ||
define i32 @__f(%class.B* %this, i32 %i) { ; thunk definition | ||
%1 = bitcast %class.B* %this to i8* | ||
%2 = getelementptr i8* %1, i64 -16 ; subtract ΔB = size(A) = 16 | ||
%3 = bitcast i8* %2 to %class.C* ; interpret as C pointer... | ||
%4 = call i32 @_f(%class.C* %3, i32 %i) ; ...and call the f method | ||
ret i32 %4 | ||
} | ||
``` | ||
|
||
## Common Ancestors | ||
What if there are common ancestors? Where to place them in the memory layout? Classical example: Diamond problem. | ||
|
||
(Maybe this means you should rethink your class structure... But if the language allows it, the compiler must nevertheless handle that case.) | ||
|
||
### Standard C++ approach: Duplicated Bases | ||
Standard C++ Multi-Inheritance: conceptually, duplicates of common ancestors. | ||
![[duplicated-base-class.png]] | ||
|
||
Memory layout: | ||
![[duplicated-base-class-memory-layout.png]] | ||
```llvm | ||
%class.C = type { %class.A, %class.B, | ||
i32, [4 x i8] } | ||
%class.A = type { [12 x i8], i32 } | ||
%class.B = type { [12 x i8], i32 } | ||
%class.L = type { i32 (...)**, i32 } | ||
``` | ||
One can see that only `L` needs a vptr. | ||
|
||
##### Examples of Ambiguities of Duplicated Bases | ||
The following code fails to compile: | ||
```c++ | ||
C c; | ||
L* pl = &c; // 'L' is an ambiguous base of 'C' | ||
``` | ||
There are two L objects stored inside `c`, the compiler wouldn't know to which to point! | ||
|
||
This works: | ||
```c++ | ||
C c; | ||
L* pl = (B*)&c; | ||
``` | ||
|
||
This also fails: | ||
```c++ | ||
C c; | ||
L* pl = (B*)&c; | ||
C* pc = (C*)pl; // 'L' is an ambiguous base of 'C' | ||
``` | ||
|
||
This works: | ||
```c++ | ||
C c; | ||
L* pl = (B*)&c; | ||
|
||
// Even the call is allowed (other than c.f(42)): On an L object, | ||
// calling f(...) is unambiguous (the ambiguity would already | ||
// kick in during the cast to L if we didn't resolve it) | ||
pl->f(42); | ||
|
||
// For the cast back to C, give the compiler a hint (static casts | ||
// need compile-time constant offsets!) | ||
C* pc = (C*)(B*)pl; | ||
``` | ||
### Virtual base clases: Allow Common Bases | ||
C++ allows diamond-pattern-style shared base classes with the `virtual` keyword: | ||
```c++ | ||
class W { public: | ||
int w; | ||
virtual int f(int); virtual int g(int); virtual int h(int); | ||
}; | ||
class A : public virtual W { public: | ||
int a; | ||
int f(int); | ||
}; | ||
class B : public virtual W { public: | ||
int b; | ||
int g(int); | ||
}; | ||
class C : public A, public B { public: | ||
int c; | ||
int h(int); | ||
}; | ||
``` | ||
|
||
![[virtual-base-classes-example.png]] | ||
|
||
Ambiguities can occur (e.g. if `f` is overwritten in both `A` and `B`), resolved by explicit qualification (`pc->B::f`). | ||
|
||
#### Memory Layout with *Offset to Virtual Base* entries | ||
In the memory layout of `C`, the shared base class `W` cannot be placed both | ||
- directly above `A` | ||
- and directly above of `B` | ||
|
||
This violates the assumption that *the parent can be found within an offset of its child which is constant throughout each occurence of the particular parent in some inheritance relation*. We therefore have to drop this assumption to be able to layout shared base classes in memory. | ||
|
||
Each child of the virtual base class stores an *offset to virtual base* (`vbase_offset`) entry in the v-subtable. | ||
|
||
Disadvantage: Each time you access a field of a virtual parent, you will have an indirect memory access! | ||
|
||
##### Example Memory Layout with Offsets to Virtual Bases | ||
Memory layout (in this particular case): | ||
- place `W` at the end of the `C` representation | ||
- In the vtables for "`A` in `C`" and "`B` in `C`" include offsets to the virtual base class `W` within `C` (in addition to the "offset to top" entries) | ||
- e.g. from `A`, the offset is $\Delta W = |A| + |B| + |C|$ | ||
- from `B`, the offset is $\Delta W - \Delta B = \Delta W - |A| = |B| + |C|$ | ||
|
||
![[virtual-baseclass-vtables.png]] | ||
|
||
Some more details: see [[Week 10.2 - VTable experiments]] | ||
|
||
##### Dynamic Casting | ||
Since there is no guaranteed offset between virtual bases and their childs, *static casting* becomes impossible. Example: If `C(A, B)` and `D(C, B)`, then in the `C` layout we have $A|B|C|W$ and in the D layout, $A|B|C|B|D|W$. So a `W` pointer cannot be statically cast into a `C` pointer! | ||
|
||
```c++ | ||
C c; | ||
W* pw = &c; | ||
C* pc = (C*)pw; // This gives a compiler error | ||
C* pc = dynamic_cast<C*>(pw); // This works (uses offset-to-top fields) | ||
``` | ||
|
||
|
||
#### Virtual Thunks | ||
Recall that [[#Thunks to the rescue|thunks]] added a constant (statically known) offset to the this reference before calling the original method. | ||
|
||
This doesn't work anymore in the virtual base class setting, due to [[#Dynamic Casting|casts being dynamic]]. Instead we need *virtual thunks*, which obtain the offset by which to translate the `this` pointer from the vtable. | ||
|
||
The virtual table is extended by one additional entry for each method this is relevant for: The entry corresponding to a method pointer, e.g. `A::Wf`, contains by how much to shift the `this` pointer in the virtual thunk (e.g. `DeltaA-DeltaW` to get from W to the top, then down to A). These offset entries are called *virtual call offsets* (`vcall_offset`). | ||
|
||
(Recall notation: `A::Wf` means calling the `A::f` method from a `W` pointer). | ||
|
||
#### Complete convention for virtual subtable memory layout | ||
- entries $0$ to $n$: virtual function pointers | ||
- entry $-1$: RTTI pointer | ||
- entry $-2$: offset to top `offset_to_top` | ||
- entry $-3$: offset to virtual base `vbase_offset` (in case there is a virtual base) | ||
- entry $-4-i$ or $-3-i$, $i$ from $0$ to $n$: offsets for virtual thunks `vcall_offset` | ||
|
||
![[virtual-subtable-conventions.png]] | ||
|
||
### C++ Memory Layout: Compiler/Runtime assumptions | ||
Compiler generates: | ||
- *one codeblock* per method | ||
- i.e. not different codeblocks depending on the type (like e.g. Rust?) | ||
- *one virtual table* per class composition | ||
- referencing the *most recent* implementations of methods ("of a unique common signature"). I.e. single-dispatching. | ||
- containing sub-tables for the composed sub-classes, | ||
- top-of-object offsets per sub-table, | ||
- virtual base offsets and virtual call offsets per method/subclass if needed | ||
|
||
Runtime behaviour: | ||
- globally create vtables at startup (copied in from binary) | ||
- creating new object: allocate memory, call constructor; constructors stores vtable pointers in the objects | ||
- method calls: call methods *statically* or *dynamically from vtables*; *unaware* of real class identity | ||
- dynamic casts (usually downcasts): use offset-to-top fields. | ||
|
||
### Advantages and Disadvantages of Multi-Inheritance | ||
Pros of *Full Multiple Inheritance* (FMI): | ||
- Removes an (unneeded? inconvenient?) inheritance constraint | ||
- Can be convenient in common cases | ||
- Diamond patterns may occur, but are not as frequent as it seems from the discussion around it | ||
|
||
Pros of *Multiple Interface Inheritance* (MII): | ||
- simpler to implement | ||
- already expressive (enough?) | ||
- using FMI too frequently often considered a flaw in the class hierarchy design | ||
|
||
### Sidenote about Applicability (MS VC++) | ||
The discussion about implementation of FMI applies to GNU C++ and LLVM. | ||
|
||
In Microsoft's Visual C++, FMI is implemented a bit differently however: | ||
- split virtual table into several smaller tables | ||
- keep a *virtual base pointer* (`vbptr`) in the *object representation* which points to the virtual base of a child class |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
Layout: (according to slides) | ||
![[virtual-baseclass-vtables.png]] | ||
|
||
Emitted vtable: | ||
``` | ||
Vtable for 'C' (17 entries). | ||
0 | vbase_offset (32) | ||
1 | offset_to_top (0) | ||
2 | C RTTI | ||
-- (A, 0) vtable address -- | ||
-- (C, 0) vtable address -- | ||
3 | int A::f(int) | ||
4 | int C::h(int) | ||
5 | vbase_offset (16) | ||
6 | offset_to_top (-16) | ||
7 | C RTTI | ||
-- (B, 16) vtable address -- | ||
8 | int B::g(int) | ||
9 | vcall_offset (-32) | ||
10 | vcall_offset (-16) | ||
11 | vcall_offset (-32) | ||
12 | offset_to_top (-32) | ||
13 | C RTTI | ||
-- (W, 32) vtable address -- | ||
14 | int A::f(int) | ||
[this adjustment: 0 non-virtual, -24 vcall offset offset] | ||
15 | int B::g(int) | ||
[this adjustment: 0 non-virtual, -32 vcall offset offset] | ||
16 | int C::h(int) | ||
[this adjustment: 0 non-virtual, -40 vcall offset offset] | ||
Virtual base offset offsets for 'C' (1 entry). | ||
W | -24 | ||
Thunks for 'int C::h(int)' (1 entry). | ||
0 | this adjustment: 0 non-virtual, -40 vcall offset offset | ||
VTable indices for 'C' (1 entries). | ||
1 | int C::h(int) | ||
``` | ||
|
||
Corresponding output by the program: | ||
``` | ||
DeltaW 32 | ||
DeltaA 0 | ||
RTTI 55ae13efcd68 | ||
A::f 55ae13efa2a0 | ||
C::h 55ae13efa380 | ||
DeltaW-DeltaB 16 | ||
DeltaB -16 | ||
RTTI 55ae13efcd68 | ||
B::g 55ae13efa310 | ||
? -32 | ||
? -16 | ||
? -32 | ||
DeltaW -32 | ||
RTTI 55ae13efcd68 | ||
A::Wf 55ae13efa2e0 | ||
B::Wg 55ae13efa350 | ||
C::Wh 55ae13efa3c0 | ||
``` |
Oops, something went wrong.