Improve MHP precision using ancestor locksets #1865

dabund24 · 2025-11-05T18:14:11Z

First part of #1805.
Second case will be handled in a separate PR.

To be handled

Non-transitive version

When creating $t_1$, $t_0$ must hold a lock $l$. If $l$ is not released before $t_1$ is definitely joined into $t_0$, $t_1$ is protected by $l$.

Examples

graph TB;
subgraph t1;
    E["..."]-->F["return;"];
end;
subgraph t0;
    A["lock(l);"]-->B;
    B["create(t1);"]-->C;
    C["join(t1);"]-->D["unlock(l);"]
end;
B-.->E
F-.->C
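
To make the first diagram concrete, here is a minimal pthread sketch of it. The competitor thread `t2`, the counter `shared`, and the function name `run_example` are assumptions added for illustration and are not part of the PR. `t1` touches `shared` without taking a lock, yet its access cannot overlap with the one in `t2`, because `t0` holds `l` for the whole lifetime of `t1`.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t l = PTHREAD_MUTEX_INITIALIZER;
static int shared = 0;

/* t1: accesses `shared` without taking any lock itself. */
static void *t1_body(void *arg) {
  (void)arg;
  shared++;
  return NULL;
}

/* t2 (hypothetical competitor): accesses `shared` while holding l. */
static void *t2_body(void *arg) {
  (void)arg;
  pthread_mutex_lock(&l);
  shared++;
  pthread_mutex_unlock(&l);
  return NULL;
}

/* t0: holds l from before create(t1) until after join(t1). */
int run_example(void) {
  pthread_t t1, t2;
  int rc;

  shared = 0;
  rc = pthread_create(&t2, NULL, t2_body, NULL);
  assert(rc == 0);

  pthread_mutex_lock(&l);                        /* lock(l);   */
  rc = pthread_create(&t1, NULL, t1_body, NULL); /* create(t1); */
  assert(rc == 0);
  (void)rc;
  pthread_join(t1, NULL);                        /* join(t1);  */
  pthread_mutex_unlock(&l);                      /* unlock(l); */

  pthread_join(t2, NULL);
  return shared;
}
```

Whichever of `t0` and `t2` acquires `l` first, the two increments are ordered by the mutex and the create/join edges, so `run_example()` returns 2. This is the kind of pair of accesses the MHP refinement is meant to exclude.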

graph TB;
subgraph t1;
    E["..."]-->F["return;"];
end;
subgraph t0;
    A["lock(l);"]-->B;
    B["create(t1);"]-->C[return;];
end;
B-.->E

General version

Let $t_d$ be $t_1$ itself or a thread that has $t_1$ as a must-ancestor. When creating $t_1$, $t_0$ must hold a lock $l$. If $l$ is not released before $t_d$ is definitely joined into $t_0$, $t_d$ is protected by $l$.

Example

graph TB;
subgraph td;
    G["..."]-->H["return;"];
end;
subgraph t1;
    E["create(td);"]-->F["return;"];
end;
subgraph t0;
    A["lock(l);"]-->B;
    B["create(t1);"]-->C;
    C["join(td);"]-->D["unlock(l);"]
end;
B-.->E
E-.->G
H-.->C
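
The transitive case can be sketched the same way. As before, `t2`, `shared`, `td_handle`, and `run_transitive_example` are hypothetical names. One deviation from the diagram, made only to keep the C program free of races on the thread handle: `t0` joins `t1` before joining `td`, so that reading `td_handle` is properly ordered after `t1` wrote it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t l = PTHREAD_MUTEX_INITIALIZER;
static int shared = 0;
static pthread_t td_handle; /* written by t1, read by t0 after join(t1) */

/* td: descendant of t1, accesses `shared` without any lock. */
static void *td_body(void *arg) {
  (void)arg;
  shared++;
  return NULL;
}

/* t1: only creates td and returns. */
static void *t1_body(void *arg) {
  (void)arg;
  int rc = pthread_create(&td_handle, NULL, td_body, NULL);
  assert(rc == 0);
  (void)rc;
  return NULL;
}

/* t2 (hypothetical competitor): accesses `shared` while holding l. */
static void *t2_body(void *arg) {
  (void)arg;
  pthread_mutex_lock(&l);
  shared++;
  pthread_mutex_unlock(&l);
  return NULL;
}

/* t0: holds l from create(t1) until td has been joined. */
int run_transitive_example(void) {
  pthread_t t1, t2;
  int rc;

  shared = 0;
  rc = pthread_create(&t2, NULL, t2_body, NULL);
  assert(rc == 0);

  pthread_mutex_lock(&l);
  rc = pthread_create(&t1, NULL, t1_body, NULL);
  assert(rc == 0);
  (void)rc;
  pthread_join(t1, NULL);        /* not in the diagram, see note above */
  pthread_join(td_handle, NULL); /* join(td); */
  pthread_mutex_unlock(&l);

  pthread_join(t2, NULL);
  return shared;
}
```

Here `td` is protected by `l` although neither `td` nor its direct creator `t1` ever locks it, which is why the general version talks about must-ancestors rather than direct parents.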

Dependency Analyses

$\mathcal T$: Ego Thread Id at program point
$\mathcal L$: Must-Lockset at program point
$\mathcal C$: May-Creates of ego thread before program point
$\mathcal J$: Transitive Must-Joins of ego thread before program point
$\mathcal{DES}\ t$: Descendant threads of $t$ (implemented in this PR)
$t_a\in\mathcal{A}\ t$: $t_a$ is a must-ancestor thread of $t$

Conditions to satisfy

$t_0\in\mathcal A\ t_1\land (t_1=t_d\lor t_1\in\mathcal A\ t_d)$
Possibly also required: $\exists$ create(t1) in $t_0$ with $l\in\mathcal L$
$\forall$ create(t1) in $t_0:l\in\mathcal L$
$\forall$ unlock(l) in $t_0:t_d\notin\left(\mathcal C\cup\bigcup_{c\in\mathcal C}\mathcal{DES}\ c\right)\setminus\mathcal J$

Possible solutions

1. Explicitly listing all descendants

$\mathcal{CL}\subseteq T\to T\to 2^L$
$T\to 2^L$ is MapBot
$2^L$ is Must-Set
Flow-Insensitive
$(t_1\mapsto\{t_0\mapsto L\})\in\mathcal{CL}$ means "$t_1$ is protected by all mutexes in $L$ locked in $t_0$ and by nothing else".

Contributions

create(t1):
$\forall t\in \{t_1\}\cup\mathcal{DES}\ t_1:$
$$\mathcal{CL}\ t\sqsupseteq\{\mathcal T\mapsto\mathcal L\}$$
unlock(l):
$\forall t\in \left(\mathcal C\cup\bigcup_{c\in\mathcal C}\mathcal{DES}\ c \right)\setminus\mathcal J:$
$$\mathcal{CL}\ t\sqsupseteq \{\mathcal T\mapsto(\mathcal{CL}\ t\ \mathcal T)\setminus \{l\}\}$$
unlock of unknown mutex:
$\forall t\in \left(\mathcal C\cup\bigcup_{c\in\mathcal C}\mathcal{DES}\ c \right)\setminus\mathcal J:$
$$\mathcal{CL}\ t\sqsupseteq \{\mathcal T\mapsto\emptyset\}$$
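
The three contributions can be modelled in a few lines of C. This is only a sketch under simplifying assumptions, not the OCaml implementation in this PR: thread ids and mutexes are small integers, sets are bitmasks, and the names `cl_map`, `on_create`, `on_unlock`, and `on_unlock_unknown` are hypothetical. A real solver re-evaluates contributions until a fixpoint is reached, whereas this sketch applies each contribution once, so a create must be applied before the unlocks that refer to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_THREADS 8 /* hypothetical bound, for the sketch only */

typedef uint32_t lockset;   /* bit i set <=> mutex i is in the set  */
typedef uint32_t threadset; /* bit i set <=> thread i is in the set */

/* One entry (CL t) t0: bottom (no contribution yet) or a must-lockset. */
typedef struct { bool is_bot; lockset locks; } cl_entry;

/* e[t][t0] models (CL t) t0. */
typedef struct { cl_entry e[MAX_THREADS][MAX_THREADS]; } cl_map;

void cl_init(cl_map *cl) {
  for (int t = 0; t < MAX_THREADS; t++)
    for (int a = 0; a < MAX_THREADS; a++) {
      cl->e[t][a].is_bot = true;
      cl->e[t][a].locks = 0;
    }
}

/* Join on must-sets: bottom is neutral, otherwise intersect. */
static void join_entry(cl_entry *dst, lockset contribution) {
  if (dst->is_bot) {
    dst->is_bot = false;
    dst->locks = contribution;
  } else {
    dst->locks &= contribution;
  }
}

/* create(t1) in thread `ego` with must-lockset `held`; des_t1 = DES t1. */
void on_create(cl_map *cl, int ego, lockset held, int t1, threadset des_t1) {
  assert(0 <= ego && ego < MAX_THREADS && 0 <= t1 && t1 < MAX_THREADS);
  threadset targets = ((threadset)1 << t1) | des_t1;
  for (int t = 0; t < MAX_THREADS; t++)
    if (targets & ((threadset)1 << t))
      join_entry(&cl->e[t][ego], held);
}

/* unlock(l) in `ego`; running = (C u DES of C) \ J at this point. */
void on_unlock(cl_map *cl, int ego, int l, threadset running) {
  for (int t = 0; t < MAX_THREADS; t++) {
    cl_entry *e = &cl->e[t][ego];
    /* Assumption of this sketch: bottom stays bottom under set difference. */
    if ((running & ((threadset)1 << t)) && !e->is_bot)
      e->locks &= ~((lockset)1 << l);
  }
}

/* unlock of an unknown mutex: join in the empty must-set. */
void on_unlock_unknown(cl_map *cl, int ego, threadset running) {
  for (int t = 0; t < MAX_THREADS; t++)
    if (running & ((threadset)1 << t))
      join_entry(&cl->e[t][ego], 0);
}
```

For instance, after `on_create(&cl, 0, 0x3, 1, 1u << 2)` both thread 1 and its descendant 2 are recorded as protected by mutexes 0 and 1 of thread 0, and a later `on_unlock(&cl, 0, 0, 1u << 2)` removes mutex 0 for thread 2 only.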

Rules for MHP exclusion

Program points $s_1$ with $\mathcal T_1$, $\mathcal L_1$ and $\mathcal{CL}_1$ and $s_2$ with $\mathcal T_2$, $\mathcal L_2$ and $\mathcal{CL}_2$ cannot happen in parallel if at least one condition holds:

$\exists (t_a\mapsto L_a)\in\mathcal{CL}_1:L_a\cap\mathcal L_2\neq\emptyset\land t_a\neq \mathcal T_2$
$\exists (t_a\mapsto L_a)\in\mathcal{CL}_2:L_a\cap\mathcal L_1\neq\emptyset\land t_a\neq \mathcal T_1$
$\exists(t_{a1}\mapsto L_{a1})\in\mathcal{CL}_ 1,(t_{a2}\mapsto L_{a2})\in\mathcal {CL}_ 2: L_{a1}\cap L_{a2}\neq\emptyset\land t_{a1}\neq t_{a2}$
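
A direct transcription of the three rules into C may help when checking them against examples. It is a sketch under the same kind of simplifying assumptions as the formulas (small integer thread ids, bitmask locksets). The types `point` and `ancestor_locks` and all function names are hypothetical and do not mirror the actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_THREADS 8 /* hypothetical bound, for the sketch only */

typedef uint32_t lockset; /* bit i set <=> mutex i is in the set */

/* CL for one thread: ancestor id -> must-lockset, or bottom. */
typedef struct {
  bool is_bot[MAX_THREADS];
  lockset locks[MAX_THREADS];
} ancestor_locks;

/* What the rules need to know about one program point. */
typedef struct {
  int tid;           /* T  */
  lockset held;      /* L  */
  ancestor_locks cl; /* CL */
} point;

point make_point(int tid, lockset held) {
  point p;
  assert(0 <= tid && tid < MAX_THREADS);
  p.tid = tid;
  p.held = held;
  for (int a = 0; a < MAX_THREADS; a++) {
    p.cl.is_bot[a] = true;
    p.cl.locks[a] = 0;
  }
  return p;
}

void set_ancestor_locks(point *p, int ancestor, lockset locks) {
  assert(0 <= ancestor && ancestor < MAX_THREADS);
  p->cl.is_bot[ancestor] = false;
  p->cl.locks[ancestor] = locks;
}

/* Rules 1 and 2: some ancestor of p's thread, other than q's thread,
   protects p with a lock that q currently holds. */
static bool ancestor_vs_held(const point *p, const point *q) {
  for (int a = 0; a < MAX_THREADS; a++)
    if (!p->cl.is_bot[a] && a != q->tid && (p->cl.locks[a] & q->held) != 0)
      return true;
  return false;
}

/* Rule 3: two different ancestors protect p and q with a common lock. */
static bool ancestor_vs_ancestor(const point *p, const point *q) {
  for (int a1 = 0; a1 < MAX_THREADS; a1++)
    for (int a2 = 0; a2 < MAX_THREADS; a2++)
      if (a1 != a2 && !p->cl.is_bot[a1] && !q->cl.is_bot[a2] &&
          (p->cl.locks[a1] & q->cl.locks[a2]) != 0)
        return true;
  return false;
}

bool cannot_happen_in_parallel(const point *s1, const point *s2) {
  return ancestor_vs_held(s1, s2) || ancestor_vs_held(s2, s1) ||
         ancestor_vs_ancestor(s1, s2);
}
```

The inequality checks matter: a point in $t_1$ that is protected by a lock of $t_0$ is excluded against an unrelated thread holding that lock, but not against $t_0$ itself, which runs in parallel with $t_1$ while holding it.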


michael-schwarz · 2025-12-10T03:56:14Z

> Before discussing this, I'll start by first explaining the potential fix of the bug just in case this is part of the necessary considerations. The reason the bug happens is the fact that the lockset analysis does some path splitting, thus there exists a create(t2)-statement (the one after node 14) with mutex in the must-lockset.

Stupid question: Is that not a problem of how CL is constructed that is in principle independent of path-sensitivity and can, e.g., also arise as a result of context-sensitivity?

void evil(int x) {
      if(x) lock(a);
      create(t1);
      if(x) unlock(a);
}

void main() {
    // Branching to ensure the created thread has a unique tid, even if there are potentially two places where it is created :-)
    if (top) {
        evil(0);
    } else {
        evil(1);
    }
}

Assuming evil is analyzed context-sensitively with x in the context (which it usually is), do we have the same problem here, or not?

Would $T \to T \to 2^L$ with $2^L$ a must set not be the better choice?
(Alternatively, you may want to write it as $T \to 2^{(T \times 2^L)}$ with the invariant that there is only one tuple (t,L) for each t)

Then, $l \in \mathcal{CL}\ t_d\ t_0$ means that $t_0$ is a must-ancestor of $t_d$ and that $t_0$ always holds $l$ when creating $t_d$ (or the child through which $t_d$ descends).
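
As a worked instance of this reading, assume (hypothetically) that both calls in the example above create the same unique thread id $t_1$ from the main thread, written $t_0$ here, and that these are the only contributions. The two create contributions then combine in the must-set as follows:

```latex
\begin{aligned}
\text{evil(1):}\quad & \mathcal{CL}\ t_1 \sqsupseteq \{t_0 \mapsto \{a\}\} \\
\text{evil(0):}\quad & \mathcal{CL}\ t_1 \sqsupseteq \{t_0 \mapsto \emptyset\} \\
\text{join:}\quad & \mathcal{CL}\ t_1\ t_0 = \{a\} \cap \emptyset = \emptyset
\end{aligned}
```

So $a$ is not recorded as protecting $t_1$, regardless of whether the two creates stem from different paths or from different contexts.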

Afterthought: how do you deal with ambiguous creators? I guess giving up when the thread id is no longer unique?

The push for a more modular solution was that I implemented something that looks somewhat similar on the surface (#1065) which turned out to cause a slow-down by a factor of 4 (#1120), which we have still not fixed.

But maybe we can go with the descendant global invariant for now and then check later if it causes any slowdown we're unwilling to pay on real programs? We can still go for the more involved local solution later if this is the case?
(Probably something for @dabund24 and @DrMichaelPetter to decide).

dabund24 · 2025-12-10T11:20:18Z

> > Before discussing this, I'll start by first explaining the potential fix of the bug just in case this is part of the necessary considerations. The reason the bug happens is the fact that the lockset analysis does some path splitting, thus there exists a create(t2)-statement (the one after node 14) with mutex in the must-lockset.
>
> Stupid question: Is that not a problem of how CL is constructed that is in principle independent of path-sensitivity and can, e.g., also arise as a result of context-sensitivity?
>
> void evil(int x) {
>       if(x) lock(a);
>       create(t1);
>       if(x) unlock(a);
> }
>
> void main() {
>     // Branching to ensure the created thread has a unique tid, even if there are potentially two places where it is created :-)
>     if (top) {
>         evil(0);
>     } else {
>         evil(1);
>     }
> }
>
> Assuming evil is analyzed context-sensitively with x in the context (which it usually is), do we have the same problem here, or not?

I hadn't thought about that, but you are right: we do have the same problem here. In 1db14cb I added another test that covers this. However, I think the fix for the path-sensitivity problem should fix this too, since in that case we would again have another creation statement without the mutex locked and would add it to the tainted set (or use your approach).

> Afterthought: how do you deal with ambiguous creators? I guess giving up when the thread id is no longer unique?

I initially assumed that the three cases in the section "Notes on non-unique thread ids" of the PR summary are the only relevant cases in which ambiguous creators could be a problem. Thinking about it, that is a really bold claim, which I should not make without a proof. Checking whether the descendant threads have a unique thread id wouldn't even lose precision in most cases[1], since the threadJoin analysis also gives up on non-unique TIDs.

> Would $T \to T \to 2^L$ with $2^L$ a must set not be the better choice?
> (Alternatively, you may want to write it as $T \to 2^{(T \times 2^L)}$ with the invariant that there is only one tuple (t,L) for each t)

I'm amazed, that is so much nicer O_O

> But maybe we can go with the descendant global invariant for now and then check later if it causes any slowdown we're unwilling to pay on real programs? We can still go for the more involved local solution later if this is the case?
> (Probably something for @dabund24 and @DrMichaelPetter to decide).

I am going to implement the other case too, since I am now also somewhat intrigued by how the two approaches will compare. Thanks a lot for your remarks!

[1] it would if we never unlock and never join, but I think that this wouldn't be too tragic

dabund24 · 2025-12-11T12:44:42Z

Is there a way of accessing all (must-)ancestors of a thread? Getting all threads in general, or all keys of a global analysis, would also work, but at first glance I don't see a way to do either. If that's the case, I can add a MustAncestors analysis so that $\bigcup_{a\in\mathcal A}\ldots$ becomes possible to implement.


dabund24 · 2025-12-11T20:41:33Z

> I have not yet completely reviewed it (please re-request my review once you have fixed the bug you found).

The simpler version is now done (at least that's what I believe) and in sync with the PR-summary. If you want to review it already, you can do so. Otherwise, feel free to ignore the review request until the alternative solution is also implemented.

michael-schwarz · 2025-12-12T02:13:58Z

> If you want to review it already, you can do so.

I probably won't get around to it until some time next week, but I added it to my TODO (list / stack / multiset).

michael-schwarz · 2025-12-12T02:16:52Z

> Is there a way of accessing all (must-)ancestors of a thread?

I think in general no. However, if you only need the must ancestors of definite thread ids (which I guess is true in your case?), you can reconstruct them as the new create edge is simply appended to the sequence of the parent.

Such a function must_ancestors: TID -> TID list option could then be added to the generic thread id interface, and just return None, i.e., "information not available" for the thread ids which don't allow this trick.
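
The prefix trick can be sketched as follows. This is written in C only to match the other snippets in this thread, and the `tid` struct is a hypothetical stand-in for the real thread id domain: a unique id is modelled as the sequence of create edges leading to it, the main thread is the empty sequence, and a `false` return plays the role of `None`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 8 /* hypothetical bound on the creation depth */

/* Hypothetical thread id: the sequence of create edges from main. */
typedef struct {
  bool unique; /* false: ambiguous id, no ancestor information */
  size_t len;
  int edges[MAX_DEPTH];
} tid;

/* Writes the proper prefixes of t (outermost ancestor first) into out and
   stores their number in *count. Returns false ("information not
   available") if t is not a definite thread id. */
bool must_ancestors(const tid *t, tid out[MAX_DEPTH], size_t *count) {
  if (!t->unique)
    return false;
  assert(t->len <= MAX_DEPTH);
  for (size_t k = 0; k < t->len; k++) {
    out[k].unique = true;
    out[k].len = k;
    memcpy(out[k].edges, t->edges, k * sizeof t->edges[0]);
  }
  *count = t->len;
  return true;
}
```

For a thread with edge sequence `{5, 9}` this yields two must-ancestors, the main thread (empty sequence) and the thread `{5}`. Whether every prefix of a definite id is itself definite depends on the concrete thread id domain, so that part is an assumption of the sketch.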

dabund24 added 5 commits October 17, 2025 10:27

add inter-procedural lock c files

7831cbf

add lock-fork hb relationship c file

3ef7082

use pthread_create() and pthread_join() instead of race macros for inter-proc lock regression tests

9d88584

activate creationLockset analysis for inter-threaded lock regressions tests

b7d0d35

initial version of creationLockset analysis

93a513a

sim642 added feature student-job precision labels Nov 5, 2025

dabund24 added 6 commits November 7, 2025 12:26

AncestorLocksetSpec as common base module

946536b

initial version of TaintedCreationLockset analysis

a78e44c

use thread domain instead of lifted thread domain

8ba9094

add threadJoins as dependency for TaintedCreationLockset analysis

ead4843

initial version of transitive descendants analysis

73e6d9c

some comments in transitiveDescendats analysis

f212c45

sim642 changed the title ~~Improve mhp precision using ancestor locksets~~ Improve MHP precision using ancestor locksets Nov 10, 2025

dabund24 added 15 commits November 11, 2025 12:29

query for descendant analysis

4633ec7

get rid of unnecessary match expression

3f3e2f3

MayCreationLockset query

691cdfd

InterThreadedLockset query

61d5c3f

fix incorrect query answer type in transitive descendants analysis

e1719b2

cartesian product helper functions

8720b23

remove unused function from TaintedCreationLocksetSpec

7c6e5d9

correct comment in tainted lockset analysis

61a48dc

replace threadset and lockset module references with shorthand

31ceff8

function for getting currently running tids

450d349

inter-threaded lockset A module

7d1fa1a

use topped set for global domain in AncestorLocksetSpec

3c52c76

replace comparison operators with equals function of domains

8b1727f

add creationLockset analysis to dependencies of taintedCreationLockset analysis

b0751dc

fix regression test files

09f28ba

remove debug statements accidentally committed

6c2c849

add ambiguous context regression test

1db14cb

dabund24 marked this pull request as draft December 10, 2025 11:22

dabund24 added 7 commits December 10, 2025 23:47

change global domain for creation lockset analysis

099f742

remove creation lockset query

850e1cb

remove config constraints for creation lockset analysis

76c14dd

remove query function from creation lockset analysis

1f3cdef

enforce must-ancestor property in unlock transfer function

72afe4a

remove some semi-colons

3b494d9

fix comments for test files

583a41f

dabund24 added 9 commits December 11, 2025 14:11

reorder some statements in event transition function

744f584

comment on applying setminus to bottom

648de9a

move domain type definition back from queries to analysis

772cf03

move comment explaining the analysis to top of file

a8479fe

fix an outdated comment

14d195f

rename descendants analysis

a0c2446

make threadspawn transfer function in descendants analysis a little leaner

048ca71

top comment for thread descendants analysis

a677314

re-add an empty line, which was removed in a previous commit

97a6967

dabund24 requested a review from michael-schwarz December 11, 2025 20:41

remove redundant and inaccurate comment

2c9c39d

dabund24 added 3 commits December 12, 2025 22:17

must_ancestors function for thread ids

c36d6d1

must_ancestors test

3c1a51f

Merge branch 'master' into master

32c76cd
