|
2 | 2 |
|
3 | 3 | ## Executive Summary |
4 | 4 |
|
5 | | -**Issue**: `CREATE VECTOR INDEX` times out at exactly 30 seconds when called from a Python subprocess on multi-node CockroachDB clusters, but completes successfully on single-node deployments. |
| 5 | +**Issue**: `CREATE VECTOR INDEX` times out at exactly 30 seconds when called from a Python subprocess on multi-node CockroachDB clusters (v25.2-v25.3), but completes successfully on single-node deployments. |
6 | 6 |
|
7 | | -**Impact**: VectorDBBench integration cannot complete on multi-node clusters (the recommended production configuration) |
| 7 | +**Root Cause**: CockroachDB v25.2-v25.3 does not support online table backfills for vector indexes, so `CREATE VECTOR INDEX` on a populated table has to take the table offline while the backfill runs. |
8 | 8 |
|
9 | | -**Evidence**: Comprehensive testing shows this is specifically a multi-node + subprocess interaction issue, not a VectorDBBench framework problem. |
| 9 | +**Solution**: ✅ **Fixed in CockroachDB v25.4.** The release introduces "online table backfills", which remove the offline backfill step behind this timeout. |
| 10 | + |
| 11 | +**Evidence**: Testing on v25.3.x reproduces the timeout, which points to a version-specific limitation; the v25.4 release notes quoted below describe the fix. |
| 12 | + |
| 13 | +**Reference**: https://www.cockroachlabs.com/docs/releases/v25.4 |
| 14 | +> "Online table backfills: adding vector indexes to tables with existing data no longer requires taking the table offline during the backfill process." |
10 | 15 |
|
11 | 16 | --- |
12 | 17 |
|
@@ -350,12 +355,26 @@ These are excellent results! We just need to solve the multi-node subprocess iss |
350 | 355 |
|
351 | 356 | --- |
352 | 357 |
|
353 | | -## Next Steps |
| 358 | +## Resolution: Upgrade to v25.4 |
| 359 | + |
| 360 | +**✅ FIXED IN v25.4**: The release notes confirm this exact issue has been resolved: |
| 361 | + |
| 362 | +From https://www.cockroachlabs.com/docs/releases/v25.4: |
| 363 | + |
| 364 | +> **"Online table backfills: adding vector indexes to tables with existing data no longer requires taking the table offline during the backfill process. This eliminates downtime when adopting vector search capabilities."** |
| 365 | +
|
| 366 | +### Recommended Actions |
| 367 | + |
| 368 | +1. **Upgrade to CockroachDB v25.4+** for production vector search workloads |
| 369 | +2. **Test VectorDBBench on v25.4** to confirm the timeout is resolved |
| 370 | +3. **Update documentation** to require v25.4+ for VectorDBBench compatibility |
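The v25.4+ requirement in the actions above could be enforced with a small pre-flight check before a benchmark run. The sketch below is an illustration under assumptions, not part of VectorDBBench: `parse_crdb_version` and `supports_online_vector_backfill` are hypothetical helper names, and the input is assumed to be text such as the result of `SELECT version()` that contains a token like `v25.4.0`.

```python
import re

# First release whose notes mention online table backfills (per the quote above).
MIN_VERSION = (25, 4)


def parse_crdb_version(version_string: str) -> tuple[int, int]:
    """Extract (major, minor) from text containing a token like 'v25.4.0'."""
    match = re.search(r"v(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"no version found in {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def supports_online_vector_backfill(version_string: str) -> bool:
    """Return True when the reported version is v25.4 or newer."""
    return parse_crdb_version(version_string) >= MIN_VERSION
```

A caller might run this check once per connection and fall back to the workaround for older versions when it returns `False`.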
| 371 | + |
| 372 | +### For v25.2-v25.3 Users (Workaround) |
354 | 373 |
|
355 | | -1. **Confirm if this is a known issue** with distributed schema changes from subprocess connections |
356 | | -2. **Identify the 30-second timeout** source (server-side setting?) |
357 | | -3. **Determine if there's a configuration** to extend this timeout |
358 | | -4. **Consider if schema change coordinator** needs to handle subprocess connections differently |
| 374 | +If you must use v25.2 or v25.3: |
| 375 | +- Use single-node clusters for benchmarking |
| 376 | +- Create indexes BEFORE loading data (`create_index_before_load=True`) |
| 377 | +- Note that multi-node clusters will experience 30s timeouts during index creation on populated tables |
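The index-before-load workaround comes down to statement ordering: the vector index is created while the table is still empty, so there is nothing to backfill. The sketch below only illustrates that ordering. The table `items`, its columns, and the helper `plan_statements` are hypothetical; only the `create_index_before_load` name mirrors the flag mentioned above.

```python
# Hypothetical schema and statements, used only to illustrate ordering.
CREATE_TABLE = "CREATE TABLE items (id INT PRIMARY KEY, embedding VECTOR(3))"
CREATE_INDEX = "CREATE VECTOR INDEX ON items (embedding)"
LOAD_DATA = "INSERT INTO items (id, embedding) VALUES (1, '[1.0, 2.0, 3.0]')"


def plan_statements(create_index_before_load: bool) -> list[str]:
    """Return the statements in the order they would be executed."""
    if create_index_before_load:
        # Index is built on an empty table, so no backfill is needed.
        return [CREATE_TABLE, CREATE_INDEX, LOAD_DATA]
    # Index is built on a populated table, which triggers a backfill.
    return [CREATE_TABLE, LOAD_DATA, CREATE_INDEX]
```

On v25.2-v25.3 multi-node clusters, only the first ordering avoids index creation on a populated table.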
359 | 378 |
|
360 | 379 | --- |
361 | 380 |
|
|