Revolgy blog

Google Container Registry lifecycle policy for images retention

Written by Marek Bartík | October 14, 2019

Is your Google Container Registry filling up, taking up storage and becoming expensive? How do you handle image retention as a service?

Amazon’s Elastic Container Registry has a feature called Lifecycle Policies to handle image retention. Google doesn’t have this feature. A feature request has been open in their tracker since August 2018, with no ETA so far…

There is a popular bash script from Ahmet and a Go service on Cloud Run from Seth, but neither of them solves the requirements I had. What exactly do I need?

I wanna scan my whole GCR and delete the digests that are:

  • older than X days
  • not being used in my kubernetes cluster
  • not among the most recent Y digests (I wanna keep, say, the 20 most recent tagged digests)

I want to apply these lifecycle policies, checking all three rules, to all of my images.

Say I have a few images in GCR with certain prefixes:

eu.gcr.io/my-project/foo/bar/my-service:123

  • eu.gcr.io is the Docker registry endpoint
  • my-project is the ID of my GCP project
  • foo/bar is the prefix (“repo”)
  • my-service is the image name
  • 123 is a tag
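To make that anatomy concrete, here is a tiny hypothetical parser for such a reference (my own helper, assuming a tag is always present):

```python
# Hypothetical helper: split a GCR image reference into its parts.
# Assumes the reference always ends with an explicit :tag.
def parse_image_ref(ref: str) -> dict:
    host, _, rest = ref.partition("/")            # eu.gcr.io
    path, _, tag = rest.rpartition(":")           # split off the tag
    project, _, remainder = path.partition("/")   # my-project
    prefix, _, image = remainder.rpartition("/")  # foo/bar vs my-service
    return {"registry": host, "project": project,
            "prefix": prefix, "image": image, "tag": tag}

parse_image_ref("eu.gcr.io/my-project/foo/bar/my-service:123")
```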

my-service:123 is an image with a tag, but wait, what is the digest?

Image vs Layers, taken from https://windsock.io/explaining-docker-image-ids/

A Docker image digest is an ID: the hashing algorithm used plus the computed hash. A digest can look like this:

@sha256:296e2378f7a14695b2f53101a3bd443f656f823c46d13bf6406b91e9e9950ef0

You can tag a digest with several tags, or with zero tags (an untagged image).

Let’s say I build an image my-service and push it to the Docker registry, tagging it :123. The newly produced digest has two tags, :123 and :latest. The digest that was tagged :latest before I pushed this image got its :latest tag removed.

If I remove a tag from an image in GCR, I simply remove that tag from the digest; I don’t delete the digest itself.
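A toy Python model of this behavior (digest names made up): each digest owns a set of tags, pushing a tag steals it from its previous owner, and untagging leaves the digest in place:

```python
# Toy model of one image repo: digest -> set of tags (possibly empty).
repo = {
    "sha256:aaa": {"122", "latest"},
    "sha256:bbb": {"121"},
}

def push(repo, digest, tags):
    # a tag can point at only one digest, so take it away from any previous owner
    for existing in repo.values():
        existing -= tags
    repo.setdefault(digest, set()).update(tags)

def untag(repo, digest, tag):
    # removing a tag does NOT delete the digest (or free its storage)
    repo[digest].discard(tag)

push(repo, "sha256:ccc", {"123", "latest"})
# "sha256:aaa" now holds only "122"; "sha256:ccc" holds "123" and "latest"
```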

What I can delete, in order to save some space, is the digest, like this:

gcloud container images delete -q --force-delete-tags eu.gcr.io/my-project/foo/bar/my-service@sha256:296e2378f7a14695b2f53101a3bd443f656f823c46d13bf6406b91e9e9950ef0

Then, what do I need to do?

  • Recursively scan gcr.io for all image prefixes (eu.gcr.io/my-project/foo/bar/my-service)
  • For each prefix, list all its digests and delete the ones that don’t match my rules

How to check if they match the rules:

  • sort them, preserve the most recent Y digests
  • fetch the pods’ and replicaSets’ image:tags from the k8s cluster (all of them, even those scaled to zero; we’d need these images in case of a rollback), then go through the digests that belong to that image name and preserve every digest that has at least one tag used in the cluster
  • check the rest of the digests, if they are older than X days, delete them
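A rough Python sketch of these three rules (the record shape, field names and defaults are my own assumptions, not the actual script’s data model):

```python
from datetime import datetime, timedelta, timezone

def digests_to_delete(digests, used_tags, keep_recent=20, max_age_days=30, now=None):
    """digests: list of {'digest', 'uploaded', 'tags'} dicts (hypothetical shape).
    Returns the digests that violate all three retention rules."""
    now = now or datetime.now(timezone.utc)
    ordered = sorted(digests, key=lambda d: d["uploaded"], reverse=True)
    # rule: keep the Y most recent tagged digests
    tagged = [d for d in ordered if d["tags"]]
    keep = {d["digest"] for d in tagged[:keep_recent]}
    # rule: keep anything whose tags are in use in the cluster
    keep |= {d["digest"] for d in ordered if set(d["tags"]) & used_tags}
    # rule: of the rest, delete only digests older than X days
    cutoff = now - timedelta(days=max_age_days)
    return [d["digest"] for d in ordered
            if d["digest"] not in keep and d["uploaded"] < cutoff]
```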

You can use standard kubectl to fetch the data:

kubectl get rs,po --all-namespaces -o jsonpath={..image} | tr ' ' '\n'
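That prints one image reference per line; a hypothetical Python post-processing step (sample data made up) could reduce it to a per-image set of in-use tags:

```python
from collections import defaultdict

# Made-up sample of what the kubectl command above prints.
kubectl_output = """\
eu.gcr.io/my-project/foo/bar/my-service:123
eu.gcr.io/my-project/foo/bar/my-service:122
eu.gcr.io/my-project/other-service:7
"""

used_tags = defaultdict(set)   # image name -> tags in use in the cluster
for line in kubectl_output.splitlines():
    if line:
        name, _, tag = line.rpartition(":")
        used_tags[name].add(tag)
```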

GCR exposes the Docker Registry v2 API, so you can use a standard Docker client or just curl with a gcloud access token:

ACCESS_TOKEN=$(gcloud auth print-access-token)
curl --silent --show-error -u _token:"$ACCESS_TOKEN" -X GET \
  "https://eu.gcr.io/v2/_catalog"
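In a script, the same endpoints can be assembled as plain strings; the `_token` basic-auth header below mirrors the curl call above (helper names are my own):

```python
import base64

def catalog_url(host):
    # lists all repositories in the registry
    return f"https://{host}/v2/_catalog"

def tags_url(host, repository):
    # repository is the full path, e.g. "my-project/foo/bar/my-service"
    return f"https://{host}/v2/{repository}/tags/list"

def auth_header(access_token):
    # same credentials curl sends with -u _token:"$ACCESS_TOKEN"
    cred = base64.b64encode(f"_token:{access_token}".encode()).decode()
    return {"Authorization": f"Basic {cred}"}
```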

I implemented all of this using bash/jq (yep, that wasn’t a smart idea) and published it to GitHub:

Right now I’m running this in a GitLab CI pipeline on a cron schedule (once a day) to evaluate its dry-run logs for production GCP projects.

I’m planning on rewriting this in Python (pykube-ng and docker-py) if Google doesn’t come up with an ETA for this feature :(

 

FAQs

Q1: What is a container?

A container is a logical “package” that contains everything an application needs in order to function, including the application itself, its dependencies, libraries, and configuration files.

Q2: How did containers evolve, and how are they different from virtual machines (VMs)?

Containers are an evolution from physical servers and then virtual machines. Unlike VMs, which virtualize entire machines including the operating system, containers only virtualize the application. Because they do not contain an OS image, containers are much smaller, more portable, and more flexible than VMs.

Q3: What is the primary advantage of containers being independent of their environment?

The primary advantage is that a container with your application can be run seamlessly in any environment — whether it’s AWS, GCP, Azure, a private data center, or a developer’s laptop. This allows developers to focus on application development without having to worry about where or how their applications will run.

Q4: How do containers lead to more efficient resource utilization?

Containers are small compared to conventional virtual machines and require fewer resources like memory or CPU. As a result, you can use your physical server resources more efficiently by stacking multiple containers on one server in a smart way, making the most out of the available hardware.

Q5: In what ways do containers improve the agility and stability of development and operations?

Because they are lightweight, containers can be started, stopped, replicated (horizontally scaled), or patched very quickly. This allows development and operations teams to be more independent, spend less time on debugging, and achieve faster development cycles, a quicker time-to-market, and a more stable infrastructure that can respond immediately to frequent changes like new releases or traffic spikes.

Q6: Why are containers considered an ideal solution for a hybrid infrastructure?

Since containers can be run anywhere, they are perfectly suited for hybrid infrastructure setups where applications run partially in a private data center and partially on a public cloud. Currently, there is no better solution for designing a hybrid infrastructure than to “containerize” your applications.