* feat(metrics): add Prometheus GC metrics
Track garbage collection activity with three new metrics:
- zot_gc_runs_total (counter, label: error) — GC run count
- zot_gc_duration_seconds (summary) — GC run duration
- zot_gc_deleted_total (counter, label: type) — items deleted
by type: blob, manifest, upload
MetricServer is added to GarbageCollect and wired through
all callers (controller, verify-feature retention, tests).
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
* fix(test): add missing metrics var in GCS GC tests
TestGCSGarbageCollectImageIndex and
TestGCSGarbageCollectChainedImageIndexes were missing the
metrics variable required by NewGarbageCollect after the
MetricServer parameter was added.
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
* fix(test): add defer metrics.Stop() in GC tests
Prevent goroutine/port leaks by stopping MetricsServer in
storage_test.go (3 functions) and gcs_test.go (also add
missing metrics declaration in TestGCSGarbageCollectImageManifest).
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
* fix(test): cover `CleanRepo` error path
Add test that exercises the error branch in
`CleanRepo` where `cleanRepo` fails, covering
the metrics calls and log lines flagged by Codecov.
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
* test: Cover GC error paths for codecov
Add three tests in gc_internal_test.go to cover previously
untested error branches in `removeBlobUploads` and
`removeUnreferencedBlobs`: `ListBlobUploads` failure,
`addIndexBlobsToReferences` failure, and `PathNotFoundError`
from `GetAllBlobs`.
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
* test(gc): cover remaining error paths
Cover `StatBlobUpload`, `digest.Validate()`,
`isBlobOlderThan`, and `CleanupRepo` error branches
in `removeBlobUploads` and `removeUnreferencedBlobs`.
`removeUnreferencedBlobs` now at 100% coverage,
`removeBlobUploads` from 78.3% to 91.3%.
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
* test: cover `sanityChecks` label name mismatch
Try to avoid -0.09% coverage regression on `minimal.go`
by exercising the uncovered branch in `sanityChecks`
where label names have correct count but wrong values.
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
* test(gc): exercise real GC path in metrics test
TestGCMetrics was calling metric helpers directly instead of
running actual garbage collection, so it couldn't catch wiring
regressions where `CleanRepo` stops recording metrics.
Now uploads an orphaned blob and runs `gc.CleanRepo` end-to-end,
verifying metrics appear on the Prometheus endpoint.
Suggestion from Copilot: https://github.com/project-zot/zot/pull/3863#discussion_r3129324719
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
* fix(gc): skip deletion metrics when DryRun is enabled
https://github.com/project-zot/zot/pull/3863#discussion_r3129324684
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
* fix(test): stop leaked MetricsServer goroutines in GCS tests
https://github.com/project-zot/zot/pull/3863#discussion_r3129324657
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
* refactor(test): drop unnecessary zlog import alias
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
* fix(monitoring): expose metric types outside build tag
`MetricsCopy` and related types were only visible under `\!metrics`,
causing a typecheck failure when golangci-lint runs with `-tags metrics`.
Moving the type definitions to `common.go` makes them unconditionally available.
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
* fix(monitoring): remove extra blank line for gci
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
* test(gc): cover both dry-run and real deletion metrics
And fix issue with build tag with metrics
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
* Satisfy testpackage linter for gc metrics test
The `testpackage` linter allows `package gc` only in files named
`*_internal_test.go`; rename to follow that convention.
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
---------
Signed-off-by: Benoit Tigeot <benoit.tigeot@lifen.fr>
GC and scrub should not stop if a manifest or index is missing from storage.
Other similar changes are also included.
WRT metadb, the missing manifests cannot be added, and the results returned from metadb
do not include the descriptors for these manifests.
Signed-off-by: Andrei Aaron <andreifdaaron@gmail.com>
feat: add verify-feature retention subcommand with comprehensive testing and validation
Add a `verify-feature retention` subcommand that allows users to preview and
validate retention policy changes without running the actual Zot server.
The command runs GC and retention tasks in dry-run mode for immediate feedback.
- Run verify-feature retention standalone without starting the server
- Preview retention policy decisions in dry-run mode
- Configurable GC interval override via command-line flag
- Optional timeout for task completion
- Configurable log output (stdout or file)
Basic usage:
```bash
zot verify-feature retention <config-file>
```
With log file output:
```bash
zot verify-feature retention -l /var/log/zot-retention-check.log <config-file>
```
With GC interval override (runs GC tasks every 30 seconds):
```bash
zot verify-feature retention -i 30s <config-file>
```
With timeout (wait up to 5 minutes for tasks to complete):
```bash
zot verify-feature retention -t 5m <config-file>
```
Combined flags:
```bash
zot verify-feature retention -l /var/log/zot-retention-check.log -i 1m -t 10m <config-file>
```
The command supports overriding GC settings from the config:
- `-i, --gc-interval`: Override the GC interval setting (applies to all storage paths including subpaths)
- Refactored `RunGCTasks` from `controller.go` to be reusable
- Added `checkServerRunning` validation to prevent conflicts
- Implemented signal handling for graceful shutdown
- Added configuration sanitization and logging
- Set GCMaxSchedulerDelay programmatically (not user-configurable)
Added tests for coverage on main function:
- Negative test cases (no args, bad config, GC disabled, server running)
- Both BoltDB and Redis
- Retention enabled scenarios with complex image setups
- Retention disabled scenarios
- Delete referrers functionality
- Subpaths configuration
- GC interval override validation
Run the verify-feature retention tests:
```bash
go test -v ./pkg/cli/server -run TestRetentionCheck
```
Signed-off-by: Andrei Aaron <andreifdaaron@gmail.com>
* fix: migrate to Go module v2 for proper semantic versioning
This change updates the module path from 'zotregistry.dev/zot' to
'zotregistry.dev/zot/v2' to comply with Go's semantic versioning rules.
According to Go's module versioning requirements, major version v2+
must include the major version in the module path. The current
module path 'zotregistry.dev/zot' only supports v0.x.x and v1.x.x
versions, making existing v2.x.x tags (like v2.1.8) unusable.
Changes:
- Updated go.mod module path to zotregistry.dev/zot/v2
- Updated all internal import paths across 280+ Go source files
- Updated configuration files (golangcilint.yaml, gqlgen.yml)
- Updated README.md Go reference badge
This fix enables proper use of existing v2.x.x Git tags and allows
external packages to import zot v2+ versions without compatibility
errors.
Resolves: Go module import compatibility for v2+ versions
Fixes: #3071
Signed-off-by: Luca Muscariello <muscariello@ieee.org>
* fix: regenerate GraphQL files with updated v2 import paths
The gqlgen tool needs to regenerate the GraphQL schema files after
the module path change to use the new v2 imports.
Signed-off-by: Luca Muscariello <muscariello@ieee.org>
---------
Signed-off-by: Luca Muscariello <muscariello@ieee.org>
- fixes#3347: removeUntaggedManifests() did not consider compatible manifest types
- add AsDockerImage() to Image and MultiarchImage for testing
- extend TestGarbageCollectAndRetentionMetaDB to test docker image and multiarch image
Signed-off-by: Stephan Merker <stephan.merker@sap.com>
Using just the last repository is not enough as in the case when it is deleted
(either by GC or some other way), GetNextRepository returns empty string
causing the generator to be marked completed without any errors.
An alternative would have been to start over from the first repository,
but this can take hours if multiple repositories need to be deleted,
not to mention the processing power and I/O and S3 load this could take.
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
It is to fix#3185.
This fixes the case where MetaDB is not instantiated (none of the conditions match),
and we want to retain tags only by pattern (which should not need to use MetaBD).
Without this fix you could only use retention to delete untagged manifests.
If you specified only the key "patterns" under "keepTags", zot would crash.
It was possible to not specify "keepTags" all, which would retain all tags,
but it was not possible to retains specific tags.
Basically the case quoted below, from the documentation, was broken::
https://zotregistry.dev/v2.1.4/articles/retention/#configuration-example
```
When you specify a regex pattern with no rules other than the default, all tags matching the pattern are retained.
```
This would only work if MetaDb was instantiated by an unrelated configured feature.
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
* feat: add support for docker images
Issue #724
A new config section under "HTTP" called "Compat" is added which
currently takes a list of possible compatible legacy media-types.
https://github.com/opencontainers/image-spec/blob/main/media-types.md#compatibility-matrix
Only "docker2s2" (Docker Manifest V2 Schema V2) is currently supported.
Garbage collection also needs to be made aware of non-OCI compatible
layer types.
feat: add cve support for non-OCI compatible layer types
Signed-off-by: Ramkumar Chinchani <rchincha@cisco.com>
*
Signed-off-by: Ramkumar Chinchani <rchincha@cisco.com>
* test: add more docker compat tests
Signed-off-by: Ramkumar Chinchani <rchincha@cisco.com>
* feat: add additional validation checks for non-OCI images
Signed-off-by: Ramkumar Chinchani <rchincha@cisco.com>
* ci: make "full" images docker-compatible
Signed-off-by: Ramkumar Chinchani <rchincha@cisco.com>
---------
Signed-off-by: Ramkumar Chinchani <rchincha@cisco.com>
* fix(oras)!: remove ORAS artifact references support
ORAS artifacts/references predated OCI dist-spec 1.1.0 which now has the
same functionality and likely to see wider adoption.
Signed-off-by: Ramkumar Chinchani <rchincha@cisco.com>
* test: update to released official images
So that they are unlikely to be deleted.
*-rc images may be cleaned up over time.
Signed-off-by: Ramkumar Chinchani <rchincha@cisco.com>
---------
Signed-off-by: Ramkumar Chinchani <rchincha@cisco.com>
This causes the "fair" scheduler to run it too often in the detriment of other generators.
The intention was to run it every 2 hours but the measurement unit for 7200 was not specified.
Add more logs, including showing a generator name, in order to troubleshoot this kind of issues easier in the future.
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
wait for workers to finish before exiting
should fix tests reporting they couldn't remove rootDir because it's being
written by tasks
Signed-off-by: Petu Eusebiu <peusebiu@cisco.com>
- Cosign supports 2 types of signature formats:
1. Using tag -> each new signature of the same manifest is
added as a new layer of the signature manifest having that
specific tag("{alghoritm}-{digest_of_signed_manifest}.sig")
2. Using referrers -> each new signature of the same manifest is
added as a new manifest
- For adding these cosign signature to metadb, we reserved index 0 of the
list of cosign signatures for tag-based signatures. When a new tag-based
signature is added for the same manifest, the element on first position
in its list of cosign signatures(in metadb) will be updated/overwritten.
When a new cosign signature(using referrers) will be added for the same
manifest this new signature will be appended to the list of cosign
signatures.
Signed-off-by: Andreea-Lupu <andreealupu1470@yahoo.com>
no need to run dedupe/restore blobs for images being pushed or synced while
running dedupe task, they are already deduped/restored inline.
Signed-off-by: Petu Eusebiu <peusebiu@cisco.com>
fix(gc): fix cleaning deduped blobs because they have the modTime of
the original blobs, fixed by updating the modTime when hard linking
the blobs.
fix(gc): failing to parse rootDir at zot startup when using s3 storage
because there are no files under rootDir and we can not create empty dirs
on s3, fixed by creating an empty file under rootDir.
Signed-off-by: Petu Eusebiu <peusebiu@cisco.com>