scala/scala-dev

scala-ci.typesafe.com/artifactory is full? PR validation jobs are giving 413 errors

SethTisue opened this issue · 15 comments

so e.g. at https://scala-ci.typesafe.com/job/scala-2.12.x-validate-main/4276/console:

[error] (partest / publish) java.io.IOException: PUT operation to URL https://scala-ci.typesafe.com/artifactory/scala-pr-validation-snapshots;build.timestamp=1603922141869/org/scala-lang/scala-partest/2.12.13-bin-bfc824a-SNAPSHOT/scala-partest-2.12.13-bin-bfc824a-SNAPSHOT.pom failed with status code 413: Request Entity Too Large; Response Body: {
[error]   "errors" : [ {
[error]     "status" : 413,
[error]     "message" : "Datastore disk usage is too high. Contact your Artifactory administrator to add additional storage space or change the disk quota limits."
[error]   } ]
[error] }

iirc, last time it happened we dealt with this by zeroing out https://scala-ci.typesafe.com/artifactory/scala-pr-validation-snapshots/ — or did we not zero it out entirely, but just deleted the oldest builds? I can't remember

the PR validation snapshots do have some value even after a PR is merged, since they enable per-commit bisecting of regressions. (in contrast, the mergelies at https://scala-ci.typesafe.com/artifactory/scala-integration/ enable coarser-grained, per-PR bisecting; the mergelies we intend to retain indefinitely)

aha, I found the ticket from the last time this happened: #636

I went through all the remote repositories and set them to expire cached artifacts after 720 hours (30 days), then ran "Cleanup Unused Cached Artifacts". That's unlikely to buy us more than a small amount of time, though: https://scala-ci.typesafe.com/artifactory/webapp/#/admin/advanced/storage_summary shows that the lion's share of storage is going to the PR validation snapshots and mergelies.
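
(For the record, roughly how the same retention change could be scripted against the REST API instead of clicking through the admin UI. This is just a sketch: the unusedArtifactsCleanupPeriodHours field name and the $TOKEN admin token are assumptions I haven't verified against our Artifactory version, and it assumes jq is installed.)

HOST=https://scala-ci.typesafe.com/artifactory

# list all remote repositories, then push a partial config update to each one
for repo in $(curl -s -H "Authorization: Bearer $TOKEN" \
                   "$HOST/api/repositories?type=remote" | jq -r '.[].key'); do
  curl -s -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
       -X POST "$HOST/api/repositories/$repo" \
       -d '{"unusedArtifactsCleanupPeriodHours": 720}'   # 720 h = 30 days; field name assumed
done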

lrytz commented

I'll run my script (#636 (comment)) to delete everything older than 2019 from pr-validation-snapshots.
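
(The actual script is in #636; for context, here's a minimal sketch of its general shape: an AQL search for old artifacts followed by a DELETE per path. The host, $TOKEN, cutoff, and the jq dependency are placeholders/assumptions, not taken from the real script.)

HOST=https://scala-ci.typesafe.com/artifactory
REPO=scala-pr-validation-snapshots
CUTOFF=2019-01-01T00:00:00.000Z

# AQL: find every artifact in the repo created before the cutoff
QUERY="items.find({\"repo\":\"$REPO\",\"created\":{\"\$lt\":\"$CUTOFF\"}})"

curl -s -H "Authorization: Bearer $TOKEN" -H "Content-Type: text/plain" \
     -X POST "$HOST/api/search/aql" -d "$QUERY" \
  | jq -r '.results[] | "\(.repo)/\(.path)/\(.name)"' \
  | while read -r artifact; do
      # swap curl for echo to do a dry run first
      curl -s -H "Authorization: Bearer $TOKEN" -X DELETE "$HOST/$artifact"
    done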

lrytz commented

Seems we already deleted what's older than 2019, and half of 2019 too. So I'm going to delete artifacts for non-merged commits older than 2020.
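
(Where "non-merged" means the commit never landed on a release branch. A hypothetical way to check that, given the SHA prefix embedded in the snapshot version, e.g. 2.12.13-bin-bfc824a-SNAPSHOT, assuming a local clone of scala/scala with the release branches fetched:)

sha=bfc824a   # taken from the artifact's version string
if git merge-base --is-ancestor "$sha" origin/2.12.x || \
   git merge-base --is-ancestor "$sha" origin/2.13.x; then
  echo "$sha was merged, keep its artifacts"
else
  echo "$sha never landed on a release branch, candidate for deletion"
fi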

lrytz commented

🍿


lrytz commented

This didn't help enough, we're still at 85%.

I noticed

lrytz commented

@SethTisue let me know what you think. IMO we can also bump the EBS volume size.

SethTisue commented

I've never used scala-release-temp or seen it used, so I have no objection to zeroing that one out.

IMO we can also bump the EBS volume size

Can that be done without a lot of rebuilding effort?

lrytz commented

Resizing the EBS volume is simple; then I'll give the resize2fs command a try.
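
(For reference, the resize itself should be a single AWS CLI call; the volume ID below is a placeholder.)

aws ec2 modify-volume --volume-id vol-0123456789abcdef0 --size 600
# the modification proceeds in the background; this shows its progress
aws ec2 describe-volumes-modifications --volume-ids vol-0123456789abcdef0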

lrytz commented

Currently taking a snapshot of the volume (that's quite slow).
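
(Something along these lines, with a placeholder volume ID; the snapshot is just a safety net before touching the filesystem.)

aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
    --description "artifactory data volume before resize"
# poll until the snapshot State shows "completed" (this is the slow part)
aws ec2 describe-snapshots --filters Name=volume-id,Values=vol-0123456789abcdef0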

lrytz commented

Resizing worked fine:

admin@ip-172-31-10-237:~$ df -hT
...
/dev/xvdk      ext4      493G  419G   52G  90% /var/opt/jfrog/artifactory/data
...

admin@ip-172-31-10-237:~$ lsblk
...
xvdk    202:160  0  600G  0 disk /var/opt/jfrog/artifactory/data
...

admin@ip-172-31-10-237:~$ sudo /sbin/resize2fs /dev/xvdk
resize2fs 1.43.4 (31-Jan-2017)
Filesystem at /dev/xvdk is mounted on /var/opt/jfrog/artifactory/data; on-line resizing required
old_desc_blocks = 32, new_desc_blocks = 38
The filesystem on /dev/xvdk is now 157286400 (4k) blocks long.

admin@ip-172-31-10-237:~$ df -hT
...
/dev/xvdk      ext4      591G  419G  146G  75% /var/opt/jfrog/artifactory/data
...


Fixed. 🤞

SethTisue commented

@lrytz I don't remember if it was on a ticket somewhere or in private communication, but you asked if I wanted the behemoths resized to help the community build, and I said yes please — when you have time.

what I should have added: lately the most common disk space problem on the behemoths is actually inodes, not raw space. the community build is amazingly inode-hungry. is there a way to get more inodes as well as more gigabytes?
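
for concreteness, checking inode headroom looks roughly like this (device and mount path are hypothetical). my understanding, to be confirmed, is that growing an ext4 filesystem adds inodes in proportion to the added space, but making the ratio denser would mean recreating the filesystem with mkfs.ext4 -i, or switching to a filesystem like XFS that allocates inodes dynamically:

df -i /home/jenkins                          # IUse% column shows inode exhaustion
sudo tune2fs -l /dev/xvdf | grep -i inode    # total and free inode counts for ext4
# recreating with a denser ratio (destructive!), e.g. one inode per 8 KiB:
#   mkfs.ext4 -i 8192 /dev/xvdf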