Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client lease library does not check for expiration #19091

Open
4 tasks done
upamanyus opened this issue Dec 19, 2024 · 0 comments · May be fixed by #19092
Open
4 tasks done

Client lease library does not check for expiration #19091

upamanyus opened this issue Dec 19, 2024 · 0 comments · May be fixed by #19092
Labels

Comments

@upamanyus
Copy link

Bug report criteria

What happened?

The leasing library fails a new test case that does

client1.Put("k", "abc");
client1.Get("k"); // to add it to client1's cache
client2.Put("k", "def");
assert(client1.Get("k") == "def") // sometimes fails, seeing the stale "abc"

with some delays and failures added in between.
This is because the client Lease and Session libraries do not provide a method to check if a lease is currently unexpired, and so the leasing package does not correctly check that its lease is valid.

What did you expect to happen?

Based on leasing documentation, reads should be linearizable, which means the above sequence of operations should cause the final assert to succeed.

// Package leasing serves linearizable reads from a local cache by acquiring

How can we reproduce it (as minimally and precisely as possible)?

The new failing test is in forked tests/integration/clientv3/lease/leasing_test.go:

func TestLeasingGetChecksForExpiration(t *testing.T) {
	integration2.BeforeTest(t)

	ttl := 6

	// There's no way to partition a client from the server, so use multiple
	// servers and shut down a single server to partition the client from the
	// system.
	clus := integration2.NewCluster(t, &integration2.ClusterConfig{Size: 3, UseBridge: true})
	defer clus.Terminate(t)

	// Try to make sure server 0 is not the leader
	if clus.Members[0].Server.Leader() == clus.Members[0].Server.MemberID() {
		err := clus.Members[0].Server.MoveLeader(context.TODO(),
			clus.Members[0].Server.Lead(),
			uint64(clus.Members[1].Server.MemberID()))
		if err != nil {
			panic(err)
		}
	}
	time.Sleep(2 * time.Second)

	leaderID := clus.Members[0].Server.Leader()
	if leaderID == clus.Members[0].Server.MemberID() {
		panic("test wants 0 to not be leader")
	}

	// This client will get partitioned away from the system (by killing the
	// node it's connected to).
	lkv0, closeLKV0, err := leasing.NewKV(clus.Client(0), "pfx/", concurrency.WithTTL(ttl))
	require.NoError(t, err)
	defer closeLKV0()

	lkv1, closeLKV1, err := leasing.NewKV(clus.Client(1), "pfx/")
	require.NoError(t, err)
	defer closeLKV1()

	if _, err = lkv0.Put(context.TODO(), "k", "abc"); err != nil {
		t.Fatal(err)
	}
	if _, err = lkv0.Get(context.TODO(), "k"); err != nil {
		t.Fatal(err)
	}

	time.Sleep(6500 * time.Millisecond)
	clus.Members[0].Stop(t)

	if _, err = lkv1.Put(context.TODO(), "k", "def"); err != nil {
		t.Fatal(err)
	}

	resp, err := lkv1.Get(context.TODO(), "k")
	if err != nil {
		t.Fatal(err)
	}
	if len(resp.Kvs) != 1 || string(resp.Kvs[0].Value) != "def" {
		t.Fatalf(`expected "k"->"def" from lkv1, got response %+v`, resp)
	}

	go func() {
		// Eventually bring back the server so the disconnected client can
		// finish its last `Get()`.
		time.Sleep(1 * time.Second)
		clus.Members[0].Restart(t)
	}()
	cachedResp, err := lkv0.Get(context.TODO(), "k")
	if err != nil {
		t.Fatal(err)
	}
	if len(cachedResp.Kvs) != 1 || string(cachedResp.Kvs[0].Value) != "def" {
		t.Fatalf(`expected "k"->"def", got response %+v`, cachedResp)
	}
}

Anything else we need to know?

No response

Etcd version (please run commands below)

$ etcd --version
etcd Version: 3.6.0-alpha.0
Git SHA: d065a4a9e
Go Version: go1.23.4
Go OS/Arch: linux/amd64

$ etcdctl version
etcdctl version: 3.6.0-alpha.0
API version: 3.6

Etcd configuration (command line flags or environment variables)

No response

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

No response

Relevant log output

No response

@upamanyus upamanyus linked a pull request Dec 19, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging a pull request may close this issue.

1 participant