Commit Graph

33 Commits

Author SHA1 Message Date
Petu Eusebiu 7d077eaf5a Added storage interface 2021-10-11 17:38:46 -07:00
Shivam Mishra c6670b1329 api: implement delete by tag 2021-08-23 17:30:41 -07:00
Ramkumar Chinchani 63b88d0e57 pkg/storage: fix partially initialized repo storage
Thanks shimish2 for the unit test.

Signed-off-by: Ramkumar Chinchani <rchincha@cisco.com>
2021-08-09 23:19:20 -07:00
Shivam Mishra 53b5fa6493 dedupe: stat blob path before creating link 2021-08-09 09:40:35 -07:00
Shivam Mishra af30c06aff api: use blob cache path while making hard link
previously mount blob will look for blob that is provided in http request and try to hard link that path
but ideally we should look for path from our cache and do the hard link of that particular path.
this commit does the same.
2021-06-30 01:42:21 -07:00
Shivam Mishra 28974e81dc config: support multiple storage locations
added support to point multiple storage locations in zot by running multiple instance of zot in background.

see examples/config-multiple.json for more info about config.

Closes #181
2021-05-21 10:18:28 -07:00
Shivam Mishra a7c17b7c16 spec: added support for mount request using hard link 2021-05-04 09:42:29 -07:00
Ramkumar Chinchani f7829d6470 dedupe: optimize check-blob with hard links
In use cases, when there are large images with shared layers across
repositories, clients may benefit from not re-uploading the same blobs
over and over again.

We ensure this by hard linking when check-blob api is called.

Signed-off-by: Ramkumar Chinchani <rchincha@cisco.com>
2021-05-04 09:42:29 -07:00
Shivam Mishra 2b7b57313a conformance: fix http status code for cross-repository mounting 2021-01-29 09:35:15 -08:00
Ramkumar Chinchani 386c72d332 routes: refactor locks to handle large file uploads
The storage layer is protected with read-write locks.
However, we may be holding the locks over unnecessarily large critical
sections.

The typical workflow is that a blob is first uploaded via a per-client
private session-id meaning the blob is not publicly visible yet. When
the blob being uploaded is very large, the transfer takes a long time
while holding the lock.

Private session-id based uploads don't really need locks, and hold locks
only when blobs are published after the upload is complete.
2020-10-16 13:33:11 -07:00
Shivam Mishra 3a30290e08 Using "destRecord" as a path in DeleteBlob function instead of "dst".
dstRecord :- blob path stored in cache.
dst :- blob path that is trying to be uploaded.

Currently, if the actual blob on disk may have been removed by GC/delete, during syncing the cache dst is being passed to DeleteBlob function and retry section is being continuously called because DeleteBlob function never deletes dst path (doesn't exist in db), dstRecord should be passed into DeleteBlob function because dstRecord is actual blob path stored in db.

If dst and dstRecord path value is same then this issue will not be produced and DeleteBlob method will delete the blob info from cache but if both are different then DeleteBlob method will try to delete dst path which is not in cache.

Note:- boltdb delete method return nil even when value doesn't exist (https://godoc.org/github.com/boltdb/bolt#Bucket.Delete)
2020-08-12 10:06:20 -07:00
Ramkumar Chinchani 324a517ea3 gc: add a policy to skip garbage collecting new blobs
We perform inline garbage collection of orphan blobs. However, the
dist-spec poses a problem because blobs begin their life as orphan blobs
and then a manifest is add which refers to these blobs.

We use umoci's GC() to perform garbage collection and policy support
has been added recently which can control whether a blob can be skipped
for GC.

In this patch, we use a time-based policy to skip blobs.
2020-07-06 15:52:35 -07:00
Shivam Mishra af77876306 Upgraded build pipeline
Go version changed to 1.14.4
Golangci-lint changed to 1.26.0
Bazel version changed to 3.0.0
Bazel rules_go version changed to 0.23.3
Bazel gazelle version changed to v0.21.0
Bazel build tools version changed to 0.25.1
Bazel skylib version changed to 1.0.2
2020-06-25 23:43:31 -07:00
Shivam Mishra 85d3e1db4b Changed umoci import path 2020-06-25 17:04:32 -07:00
Ramkumar Chinchani 3dc9885ee9 update the size field when existing manifest entry is updated
An existing manifest descriptor in index.json can be updated with
different manifest contents for the same/existing tag. We were updating
the digest but not the size field causing GC to report an error.

Add a unit test case to cover this.

Add logs.
2020-06-18 16:20:43 -07:00
Ramkumar Chinchani cb9e773a3e dedupe: record relative path for cache entries
In a production use case we found that the actual rootdir can be moved.
Currently, cache entries for dedupe record the full blob path which
doesn't work in the move use case.

Only for dedupe cache entries, record relative blob paths.
2020-05-27 22:11:26 -07:00
Ramkumar Chinchani dd1fc1e866 config: add gc and dedupe as configurable params (default to enabled)
Since we want to conform to dist-spec, sometimes the gc and dedupe
optimizations conflict with the conformance tests that are being run.
So allow them to be turned off via configuration params.
2020-04-16 16:01:53 -07:00
Ramkumar Chinchani b1f882e1b8 conformance: align with upstream conformance tests
Upstream conformance tests are being updated, so we need to align along
with our internal GC and dedupe features.

Add a new example config file which plays nice with conformance tests.

DeleteImageManifest() updated to deal with the case where the same
manifest can be created with multiple tags and deleted with the same
digest - so all entries must be deleted.

DeleteBlob() delete the digest key (bucket) when last reference is
dropped
2020-04-16 16:01:53 -07:00
Ramkumar Chinchani 25f5a45296 dedupe: use hard links to dedupe blobs
As the number of repos and layers increases, the greater the probability
that layers are duplicated. We dedupe using hard links when content is
the same. This is intended to be purely a storage layer optimization.
Access control when available is orthogonal this optimization.

Add a durable cache to help speed up layer lookups.

Update README.

Add more unit tests.
2020-04-03 09:29:12 -07:00
Ramkumar Chinchani 8ff60f9138 conformance: fix error msg for DELETE MANIFEST
---
Ran 27 of 27 Specs in 0.120 seconds
SUCCESS! -- 27 Passed | 0 Failed | 0 Pending | 0 Skipped
PASS
---
2020-03-25 12:53:15 -07:00
Ramkumar Chinchani 2fd87b6a86 pkg/api: use a rwlock when accessing storage
The original patch used a mutex, however, the workload patterns are
likely to be read-heavy, so use a rwlock instead.
2020-03-20 10:58:21 -07:00
Ramkumar Chinchani fe471a3c35 gc: fix test cases since umoci GC is more strict
umoci GC enforces a valid index.json and current tests were a little
lax.
2020-03-20 10:58:21 -07:00
Tycho Andersen 95d4a7ce04 zot: run GC after manifest removal
Clients today expect the repo to clean up if there are unused blobs, not to
manually delete things they think are unused. Let's do that, and use
umoci's code to do it since it's tested and works.

v2: also run GC on update as well as delete

v3: fix up error return paths needing two args

Signed-off-by: Serge Hallyn <shallyn@cisco.com>
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
2020-03-20 10:58:21 -07:00
Ramkumar Chinchani 58040f4562 check: add unit tests to cover the new code, fix linter errors 2020-01-31 13:21:43 -08:00
Ramkumar Chinchani 909a97b922 storage: compliance allows for a full blob upload without a session
implement a new method which just takes the repo name, body and digest
and creates a blob out of this
2020-01-31 11:49:15 -08:00
Ramkumar Chinchani faad2b1d1f manifest can be deleted only by digest and not tag
Fixes issue #67.

As per dist spec, DELETE of a image manifest can only be done with
digest as <reference> param. Previously, tags were being allowed as
well. This is not conformant to the spec.
2020-01-28 14:51:51 -08:00
Ramkumar Chinchani 964af6ba51 compliance: be compliant with dist-spec compliance tests
dist-spec compliance tests are now becoming a part of dist-spec repo
itself - we want to be compliant

pkg/api/regex.go:
	* revert uppercasing in repository names

pkg/api/routes.go:
	* ListTags() should support the URL params 'n' and 'last'
	  for pagination

	* s/uuid/session_id/g to use the dist-spec's naming

	* Fix off-by-one error in GetBlobUpload()'s http response "Range" header

	* DeleteManifest() success status code is 202

	* Fix PatchBlobUpload() to account for "streamed" use case
	  where neither "Content-Length" nor "Content-Range" headers are set

pkg/storage/storage.go:
	* Add a "streamed" version of PutBlobChunk() called PutBlobChunkStreamed()

pkg/compliance/v1_0_0/check.go:
	* fix unit tests to account for changed response status codes
2020-01-16 11:28:23 -08:00
Ravi Chamarthy 535b9d07b1 Fix comments in storage.go 2019-12-13 17:31:05 -08:00
zendril 4e22352e9c Fixing all the issues with upgrading to golangci-lint 1.21.0 2019-12-13 00:53:18 -05:00
Ramkumar Chinchani 9ae9e40b67 log: improve logging
- add a panic recovery handler
        - add logs on unexpected error paths
        - use logger's panic method
2019-11-26 14:18:20 -08:00
Ramkumar Chinchani bb5ebe6984 issue #14: fix repo path walk 2019-08-29 13:47:59 -07:00
Ramkumar Chinchani 36ca298507 tls: require mutual auth only when htpasswd not available 2019-07-21 15:10:09 -07:00
Ramkumar Chinchani 9d4e8b4594 zot: initial commit 2019-06-21 15:29:19 -07:00