Conversation

@safianalicb
Contributor

Logging + stats in cbbackupmgr has (mostly) been moved to the repository-level. This change makes promtimer compatible with these changes.

Contributor

@dave-finlay dave-finlay left a comment

Hi Saf: the change is pretty straightforward and it works -- at least it did when I got the correct version of cbmstatparser built. :-)

One thing that wasn't clear to me was the distinction between "archive-level" stats and "repo-level" stats. Archive-level made sense when the only place stats could reside in an archive was under logs/stats, but now that stats may be found under any file path that matches */logs/stats, there are basically lots of "archive-level" stats.

Is "repo" a backup repository -- and is it the case that backup archives can now include backup information from multiple different backup repositories?

Contributor

@dave-finlay dave-finlay left a comment

Patch is fine and works. Interested in the response to my question - perhaps the comments could be a bit clearer. Let me know your thoughts.

@safianalicb
Contributor Author

safianalicb commented Dec 8, 2025

Hi Dave, apologies, I should have added more context in my initial comment.

We are moving all logging and stats from the archive-level to the repository-level. This is to make cbbackupmgr compatible with encryption-at-rest: since encryption information is at the repository-level, it doesn't make sense to encrypt the archive logs (which repository's information would we use?), so we instead move the logs (and stats) down to the repository-level, and encrypt them there.

So prior to this change, this is what a backup archive looked like:

archive/
├── logs/
│   └── stats/
│       ├── cpu/
│       ├── disk/
│       └── ...
├── repo1/
│   └── 2024-01-01T12_00_00/
│       └── ...
└── repo2/
    └── 2024-01-02T12_00_00/
        └── ...

You can see here that the only logs/stats is at the archive-level.

With the move to repository-level logging, a backup archive will look like this:

archive/
├── logs/
│   └── stats/
│       └── ...
├── repo1/
│   ├── logs/
│   │   └── stats/
│   │       └── ...
│   └── 2024-01-01T12_00_00/
│       └── ...
└── repo2/
    ├── logs/
    │   └── stats/
    │       └── ...
    └── 2024-01-02T12_00_00/
        └── ...

You can see here that logs/stats is present both at the archive-level and inside each of the repositories. For archives created after the move, the stats will only be at the repository-level, but it's possible for stats to be at both levels if a newer version of cbbackupmgr is used with an older archive. For backwards compatibility, we check in both places.
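
The "check in both places" lookup can be sketched roughly as follows. This is a minimal illustration, not the promtimer implementation: the helper name `find_stats_dirs` is hypothetical, and it assumes that repositories are the immediate subdirectories of the archive root.

```python
from pathlib import Path
from typing import List


def find_stats_dirs(archive_root: str) -> List[Path]:
    """Hypothetical helper: return every logs/stats directory in an archive."""
    root = Path(archive_root)
    found = []

    # Older archives keep a single logs/stats directory at the archive level.
    archive_stats = root / "logs" / "stats"
    if archive_stats.is_dir():
        found.append(archive_stats)

    # Newer archives keep one logs/stats directory inside each repository.
    # Assumption: every immediate subdirectory other than "logs" is a repository.
    for child in sorted(root.iterdir()):
        repo_stats = child / "logs" / "stats"
        if child.is_dir() and child.name != "logs" and repo_stats.is_dir():
            found.append(repo_stats)

    return found
```

Because it only inspects the immediate children of the archive root, the cost of the search grows with the number of repositories rather than with the number of backups inside them.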

One thing that wasn't clear to me was the distinction between "archive-level" stats and "repo-level" stats. Archive-level made sense when the only place stats could reside in an archive was under logs/stats, but now that stats may be found under any file path that matches */logs/stats, there are basically lots of "archive-level" stats.

The number of stats files will be the same; they have just been moved to the repo-level. Previously, the archive-level "logs/stats" had all of the stats for all of the repositories; now they will be placed in the appropriate repository directory. Also, in case you are concerned about the search for the stats paths taking too long, we typically see very few repositories in an archive - I don't remember seeing any customer cases with >5 repositories.

Is "repo" a backup repository -- and is it the case that backup archives can now include backup information from multiple different backup repositories?

Yes, repo and repository in this context both refer to backup repositories. Backup archives already contain backup repositories, and logging/stats for those repositories - we are just moving the logging from the archive-level to the repository-level.

Sorry, that might have been an overkill explanation. Let me know if anything isn't clear. Thanks

@dave-finlay
Contributor

dave-finlay commented Dec 9, 2025

Hey @safianalicb -- thank you! Great explanation.

Would you mind adding to the method-level comment for get_stats_paths that backup is moving to repo-level logging, and that instead of all logs residing under logs/stats/... there will now be logs per backup repo, with the structure looking as follows:

archive/
├── logs/
│   └── stats/
│       └── ...
├── repo1/
│   ├── logs/
│   │   └── stats/
│   │       └── ...
│   └── 2024-01-01T12_00_00/
│       └── ...
└── repo2/
    ├── logs/
    │   └── stats/
    │       └── ...
    └── 2024-01-02T12_00_00/
        └── ...

I think that will help folks that come after you understand what is going on in this method. Otherwise the change is great and I will push submit.
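
For illustration, a docstring along these lines might look like the sketch below. The function name `get_stats_paths_sketch`, its `archive_dir` parameter, and the glob-based body are assumptions made for this example; they are not taken from the promtimer source.

```python
import glob
import os


def get_stats_paths_sketch(archive_dir):
    """Return the stats directories found in a cbbackupmgr backup archive.

    Backup is moving to repository-level logging. Instead of all logs
    residing under logs/stats/..., there are now logs per backup
    repository, so an archive may look like this:

        archive/
        ├── logs/
        │   └── stats/
        ├── repo1/
        │   ├── logs/
        │   │   └── stats/
        │   └── 2024-01-01T12_00_00/
        └── repo2/
            ├── logs/
            │   └── stats/
            └── 2024-01-02T12_00_00/

    Both levels are checked so that older archives keep working.
    """
    base = glob.escape(archive_dir)  # guard against glob characters in the path
    patterns = [
        os.path.join(base, "logs", "stats"),       # archive-level (older layout)
        os.path.join(base, "*", "logs", "stats"),  # repository-level (newer layout)
    ]
    return sorted(p for pat in patterns for p in glob.glob(pat) if os.path.isdir(p))
```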

@safianalicb
Contributor Author

Thanks @dave-finlay, I updated the docstring for get_stats_path to make this clearer.

Contributor

@dave-finlay dave-finlay left a comment

Thanks Saf!

@dave-finlay dave-finlay merged commit be9501e into couchbaselabs:master Dec 10, 2025