fix(clustered): Closes influxdata/DAR#472. Catalog terminology is confusing, needs scaling recommendations:

jstirnaman · jstirnaman · commit 8491969abde1 · 2025-03-11T17:18:05.000-05:00
- Distinguish between Catalog store (Postgres db) and Catalog service (API and cache for the store).
- Add scaling recommendations for each.
- Resolve conflicting scaling info, remove duplicated scaling info from storage-engine.md.
diff --git a/content/influxdb3/clustered/admin/backup-restore.md b/content/influxdb3/clustered/admin/backup-restore.md
@@ -12,29 +12,29 @@ weight: 105
 influxdb3/clustered/tags: [backup, restore]
 ---
 
-InfluxDB Clustered automatically stores snapshots of the InfluxDB Catalog that
+InfluxDB Clustered automatically stores snapshots of the InfluxDB Catalog store that
 you can use to restore your cluster to a previous state. The snapshotting
 functionality is optional and is disabled by default.
 Enable snapshots to ensure you can recover
 in case of emergency.
 
 With InfluxDB Clustered snapshots enabled, each hour, InfluxDB uses the `pg_dump`
-utility included with the InfluxDB Garbage Collector to export an SQL blob or
-“snapshot” from the InfluxDB Catalog and store it in the object store.
-The Catalog is a PostgreSQL-compatible relational database that stores metadata
+utility included with the InfluxDB Garbage collector to export an SQL blob or
+“snapshot” from the InfluxDB Catalog store to the Object store.
+The Catalog store is a PostgreSQL-compatible relational database that stores metadata
 for your time series data, such as schema data types, Parquet file locations, and more.
 
-The Catalog snapshots act as recovery points for your InfluxDB cluster that
-reference all Parquet files that existed in the object store at the time of the
-snapshot. When a snapshot is restored to the Catalog, the Compactor
+The Catalog store snapshots act as recovery points for your InfluxDB cluster that
+reference all Parquet files that existed in the Object store at the time of the
+snapshot. When a snapshot is restored to the Catalog store, the Compactor
 “[soft deletes](#soft-delete)” any Parquet files not listed in the snapshot.
 
 > [!Note]
 > InfluxDB won't [hard delete](#hard-delete) Parquet files listed in _any_ hourly or daily snapshot.
 > 
 > For example, if you have Parquet files A, B, C, and D, and you restore to a
 > snapshot that includes B and C, but not A and D, then A and D are soft-deleted, but remain in object
-> storage until they are no longer referenced in any Catalog snapshot.
+> storage until they are no longer referenced in any Catalog store snapshot.
 - [Soft delete](#soft-delete)
 - [Hard delete](#hard-delete)
 - [Recovery Point Objective (RPO)](#recovery-point-objective-rpo)
@@ -75,8 +75,8 @@ The InfluxDB Clustered snapshot strategy RPO allows for the following maximum da
  ## Recovery Time Objective (RTO)
 
 RTO is the maximum amount of downtime allowed for an InfluxDB cluster after a failure.
-RTO varies depending on the size of your Catalog database, network speeds
-between the client machine and the Catalog database, cluster load, the status
+RTO varies depending on the size of your Catalog store, network speeds
+between the client machine and the Catalog store, cluster load, the status
 of your underlying hosting provider, and other factors.
 
 ## Data written just before a snapshot may not be present after restoring
@@ -94,14 +94,14 @@ present after restoring to that snapshot.
 ### Automate object synchronization to an external S3-compatible bucket
 
 Syncing objects to an external S3-compatible bucket ensures an up-to-date backup
-in case your object store becomes unavailable. Recovery point snapshots only
-back up the InfluxDB Catalog. If data referenced in a Catalog snapshot does not
-exist in the object store, the recovery process does not restore the missing data.
+in case your Object store becomes unavailable. Recovery point snapshots only
+back up the InfluxDB Catalog store. If data referenced in a Catalog store snapshot does not
+exist in the Object store, the recovery process does not restore the missing data.
 
 ### Enable short-term object versioning
 
 If your object storage provider supports it, consider enabling short-term
-object versioning on your object store--for example, 1-2 days to protect against errant writes or deleted objects.
+object versioning on your Object store (for example, for 1-2 days) to protect against errant writes or deleted objects.
 With object versioning enabled, as objects are updated, the object store
 retains distinct versions of each update that can be used to “rollback” newly
 written or updated Parquet files to previous versions.
@@ -140,7 +140,7 @@ spec:
 
 #### INFLUXDB_IOX_CREATE_CATALOG_BACKUP_DATA_SNAPSHOT_FILES
 
-Enable hourly Catalog snapshotting. The default is `'false'`. Set to `'true'`:
+Enable hourly Catalog store snapshotting. The default is `'false'`. Set to `'true'`:
 
 ```yaml
 INFLUXDB_IOX_CREATE_CATALOG_BACKUP_DATA_SNAPSHOT_FILES: 'true'
@@ -217,22 +217,20 @@ written on or around the beginning of the next hour.
 ## Restore to a recovery point
 
 Use the following process to restore your InfluxDB cluster to a recovery point
-using Catalog snapshots:
+using Catalog store snapshots:
 
 1.  **Install prerequisites:**  
 
     - `kubectl` CLI for managing your Kubernetes deployment.  
-    - `psql` CLI to interact with the PostgreSQL-compatible Catalog database with
-      the appropriate Data Source Name (DSN) and connection credentials.  
-    - A client to interact with your InfluxDB cluster’s object store.
-      Supported clients depend on your object storage provider.
+    - `psql` CLI, configured with the Data Source Name (DSN) and credentials for your PostgreSQL-compatible Catalog store.
+    - A client from your object storage provider for interacting with your InfluxDB cluster's Object store.
 
 2.  **Retrieve the recovery point snapshot from your object store.**
 
     InfluxDB Clustered stores hourly and daily snapshots in the
     `/catalog_backup_file_lists` path in object storage. Download the snapshot
-    that you would like to use as the recovery point. If your primary object
-    store is unavailable, download the snapshot from your replicated object store.
+    that you would like to use as the recovery point. If your primary Object
+    store is unavailable, download the snapshot from your replicated Object store.
 
     > [!Important]
     > When creating and storing a snapshot, the last artifact created is the
diff --git a/content/influxdb3/clustered/admin/databases/delete.md b/content/influxdb3/clustered/admin/databases/delete.md
@@ -41,8 +41,8 @@ Once a database is deleted, data stored in that database cannot be recovered.
 
 After a database is deleted, you cannot reuse the same name for a new database.
 
-#### Never directly modify the Catalog
+#### Never directly modify the Catalog store
 
-Never directly modify the [PostgreSQL-compatible Catalog](/influxdb3/clustered/reference/internals/storage-engine/#catalog).
- Doing so will result in an undefined state for various components and may lead to data loss and crashes.
+Never directly modify the [PostgreSQL-compatible Catalog store](/influxdb3/clustered/reference/internals/storage-engine/#catalog-store).
+Doing so will result in an undefined state for various components and may lead to data loss and crashes.
 {{% /warn %}}
diff --git a/content/influxdb3/clustered/admin/scale-cluster.md b/content/influxdb3/clustered/admin/scale-cluster.md
@@ -22,19 +22,12 @@ resources available to each component.
 - [Scaling strategies](#scaling-strategies)
   - [Vertical scaling](#vertical-scaling)
   - [Horizontal scaling](#horizontal-scaling)
+- [Scale your cluster as a whole](#scale-your-cluster-as-a-whole)
 - [Scale components in your cluster](#scale-components-in-your-cluster)
   - [Horizontally scale a component](#horizontally-scale-a-component)
   - [Vertically scale a component](#vertically-scale-a-component)
   - [Apply your changes](#apply-your-changes)
-- [Scale your cluster as a whole](#scale-your-cluster-as-a-whole)
 - [Recommended scaling strategies per component](#recommended-scaling-strategies-per-component)
-  - [Ingester](#ingester)
-  - [Querier](#querier)
-  - [Router](#router)
-  - [Compactor](#compactor)
-  - [Garbage collector](#garbage-collector)
-  - [Catalog](#catalog)
-  - [Object store](#object-store)
 
 ## Scaling strategies
 
@@ -59,6 +52,14 @@ throughput a system can manage, but also provides additional redundancy and fail
 
 {{< html-diagram/scaling-strategy "horizontal" >}}
 
+## Scale your cluster as a whole
+
+Scale your entire InfluxDB cluster by scaling your Kubernetes cluster, which is
+managed outside of InfluxDB. The process of scaling your entire Kubernetes
+cluster depends on your underlying Kubernetes provider. You can also use
+[Kubernetes autoscaling](https://kubernetes.io/docs/concepts/cluster-administration/cluster-autoscaling/)
+to automatically scale your cluster as needed.
+
 ## Scale components in your cluster
 
 The following components of your InfluxDB cluster are scaled by modifying
@@ -69,11 +70,12 @@ properties in your `AppInstance` resource:
 - Compactor
 - Router
 - Garbage collector
+- Catalog service
 
 {{% note %}}
-#### Scale your Catalog and Object store
+#### Scale your Catalog store and Object store
 
-Your InfluxDB [Catalog](/influxdb3/clustered/reference/internals/storage-engine/#catalog)
+Your InfluxDB [Catalog store](/influxdb3/clustered/reference/internals/storage-engine/#catalog-store)
 and [Object store](/influxdb3/clustered/reference/internals/storage-engine/#object-store)
 are managed outside of your `AppInstance` resource. Scaling mechanisms for these
 components depend on the technology and underlying provider used for each.
@@ -451,22 +453,15 @@ helm upgrade \
 {{% /code-tab-content %}}
 {{< /code-tabs-wrapper >}}
 
-## Scale your cluster as a whole
-
-Scaling your entire InfluxDB Cluster is done by scaling your Kubernetes cluster
-and is managed outside of InfluxDB. The process of scaling your entire Kubernetes
-cluster depends on your underlying Kubernetes provider. You can also use 
-[Kubernetes autoscaling](https://kubernetes.io/docs/concepts/cluster-administration/cluster-autoscaling/)
-to automatically scale your cluster as needed.
-
 ## Recommended scaling strategies per component
 
 - [Router](#router)
 - [Ingester](#ingester)
 - [Querier](#querier)
 - [Compactor](#compactor)
 - [Garbage collector](#garbage-collector)
-- [Catalog](#catalog)
+- [Catalog store](#catalog-store)
+- [Catalog service](#catalog-service)
 - [Object store](#object-store)
 
 ### Router
@@ -563,16 +558,29 @@ efficiently as vertical scaling.
 
 ### Garbage collector
 
-The Garbage collector can be scaled [vertically](#vertical-scaling). It is a
-light-weight process that typically doesn't require many system resources, but
-if you begin to see high resource consumption on the garbage collector, you can
-scale it vertically to address the added workload.
+The Garbage collector is not designed for distributed load and should _not_ be
+scaled horizontally. It is a lightweight process that typically doesn't require
+significant system resources. [Vertical scaling](#vertical-scaling) should only
+be considered if you observe consistently high CPU usage or if the container
+regularly runs out of memory.
+
+### Catalog store
+
+The Catalog store is a PostgreSQL-compatible database that persistently stores metadata.
+Scaling strategies depend on your chosen PostgreSQL implementation.
+All implementations support [vertical scaling](#vertical-scaling), and most support
+[horizontal scaling](#horizontal-scaling) for redundancy and failover.
+
+### Catalog service
 
-### Catalog
+The Catalog service should maintain exactly 3 replicas for optimal redundancy.
+Additional replicas are discouraged; if you need better performance,
+scale the Catalog service [vertically](#vertical-scaling) instead.
 
-Scaling strategies available for the Catalog depend on the PostgreSQL-compatible
-database used to run the catalog. All support [vertical scaling](#vertical-scaling).
-Most support [horizontal scaling](#horizontal-scaling) for redundancy and failover.
+> [!Note]
+> The [Catalog service](/influxdb3/clustered/reference/internals/storage-engine/#catalog-service) is managed through the
+> `AppInstance` resource, while the [Catalog store](/influxdb3/clustered/reference/internals/storage-engine/#catalog-store) 
+> is managed separately according to your PostgreSQL implementation.
 
 ### Object store
 
diff --git a/content/influxdb3/clustered/install/_index.md b/content/influxdb3/clustered/install/_index.md
@@ -61,13 +61,13 @@ Updating your InfluxDB cluster is as simple as re-applying your app-instance wit
 
 The word safely here means being able to redeploy your cluster while still being able to use the tokens you’ve created, and being able to write/query to the database you’ve previously created.
 
-All of the important state in InfluxDB 3 lives in the Catalog (the Postgres equivalent database) and the Object Store (the S3 compatible store). These should be treated with the utmost care. 
+All of the important state in InfluxDB 3 lives in the Catalog store (the PostgreSQL-compatible database) and the Object store (the S3-compatible store). Treat both with the utmost care.
 
-If a full redeploy of your cluster needs to happen, the namespace containing the Influxdb instance can be deleted **_as long as your Catalog and Object Store are not in this namespace_**. Then, the influxdb AppInstance can be redeployed. It is possible the operator may need to be removed and reinstalled. In that case, deleting the namespace that the operator is deployed into and redeploying is acceptable.
+If a full redeploy of your cluster needs to happen, the namespace containing the InfluxDB instance can be deleted **_as long as your Catalog store and Object store are not in this namespace_**. Then, the InfluxDB `AppInstance` can be redeployed. It is possible the operator may need to be removed and reinstalled. In that case, deleting the namespace that the operator is deployed into and redeploying is acceptable.
 
 ### Backing up your data
 
-The Catalog and Object store contain all of the important state for InfluxDB 3. They should be the primary focus of backups. Following the industry standard best practices for your chosen Catalog implementation and Object Store implementation should provide sufficient backups.  In our Cloud products, we do daily backups of our Catalog, in addition to automatic snapshots, and we preserve our Object Store files for 100 days after they have been soft-deleted.
+The Catalog store and Object store contain all of the important state for InfluxDB 3. They should be the primary focus of backups. Following the industry-standard best practices for your chosen Catalog store and Object store implementations should provide sufficient backups. In our Cloud products, we do daily backups of our Catalog store, in addition to automatic snapshots, and we preserve our Object store files for 100 days after they have been soft-deleted.
 
 ### Recovering your data
 
diff --git a/content/influxdb3/clustered/install/secure-cluster/tls.md b/content/influxdb3/clustered/install/secure-cluster/tls.md
@@ -17,7 +17,7 @@ following:
 
 - Ingress to your cluster
 - Connection to your Object store
-- Connection to your Catalog (PostgreSQL-compatible) database
+- Connection to your Catalog store (PostgreSQL-compatible) database
 
 {{% note %}}
 If using self-signed certs,
@@ -177,8 +177,8 @@ objectStore:
 Refer to your PostreSQL-compatible database provider's documentation for
 installing TLS certificates and ensuring secure connections.
 
-If currently using an unsecure connection to your Catalog database, update your
-Catalog data source name (DSN) to **remove the `sslmode=disable` query parameter**:
+If you currently use an unsecured connection to your Catalog store database, update your
+Catalog store data source name (DSN) to **remove the `sslmode=disable` query parameter**:
 
 {{% code-callout "\?sslmode=disable" "magenta delete" %}}
 ```txt
diff --git a/content/influxdb3/clustered/install/set-up-cluster/prerequisites.md b/content/influxdb3/clustered/install/set-up-cluster/prerequisites.md
@@ -100,7 +100,7 @@ following sizing for {{% product-name %}} components:
 {{% tab-content %}}
 <!--------------------------------- BEGIN AWS --------------------------------->
 
-- **Catalog (PostgreSQL-compatible database) (x1):**
+- **Catalog store (PostgreSQL-compatible database) (x1):**
   - _[See below](#postgresql-compatible-database-requirements)_
 - **Ingesters and Routers (x3):**
   - EC2 m6i.2xlarge (8 CPU, 32 GB RAM)
@@ -117,7 +117,7 @@ following sizing for {{% product-name %}} components:
 {{% tab-content %}}
 <!--------------------------------- BEGIN GCP --------------------------------->
 
-- **Catalog (PostgreSQL-compatible database) (x1):**
+- **Catalog store (PostgreSQL-compatible database) (x1):**
   - _[See below](#postgresql-compatible-database-requirements)_
 - **Ingesters and Routers (x3):**
   - GCE c2-standard-8 (8 CPU, 32 GB RAM)
@@ -134,7 +134,7 @@ following sizing for {{% product-name %}} components:
 {{% tab-content %}}
 <!-------------------------------- BEGIN Azure -------------------------------->
 
-- **Catalog (PostgreSQL-compatible database) (x1):**
+- **Catalog store (PostgreSQL-compatible database) (x1):**
   - _[See below](#postgresql-compatible-database-requirements)_
 - **Ingesters and Routers (x3):**
   - Standard_D8s_v3 (8 CPU, 32 GB RAM)
@@ -151,7 +151,7 @@ following sizing for {{% product-name %}} components:
 {{% tab-content %}}
 <!------------------------------- BEGIN ON-PREM ------------------------------->
 
-- **Catalog (PostgreSQL-compatible database) (x1):**
+- **Catalog store (PostgreSQL-compatible database) (x1):**
   - CPU: 4-8 cores
   - RAM: 16-32 GB
 - **Ingesters and Routers (x3):**
diff --git a/content/influxdb3/clustered/query-data/troubleshoot-and-optimize/report-query-performance-issues.md b/content/influxdb3/clustered/query-data/troubleshoot-and-optimize/report-query-performance-issues.md
@@ -78,8 +78,8 @@ including the following:
 - CPU and memory resources set on each type of InfluxDB pod
 - The number of pods in each InfluxDB StatefulSet and Deployment
 - The type of object store used and how it is hosted
-- How the Catalog (PostgreSQL-compatible database) is hosted
-- Indicate if either the Object store or the Catalog is shared by more than one InfluxDB
+- How the Catalog store (PostgreSQL-compatible database) is hosted
+- Indicate if either the Object store or the Catalog store is shared by more than one InfluxDB
   Clustered product
   - If so, describe the network-level topology of your setup
 
diff --git a/content/influxdb3/clustered/reference/internals/storage-engine.md b/content/influxdb3/clustered/reference/internals/storage-engine.md
diff --git a/content/influxdb3/clustered/reference/release-notes/clustered.md b/content/influxdb3/clustered/reference/release-notes/clustered.md
diff --git a/content/influxdb3/clustered/write-data/best-practices/data-lifecycle.md b/content/influxdb3/clustered/write-data/best-practices/data-lifecycle.md