scala-ci.typesafe.com/artifactory is full? PR validation jobs are giving 413 errors
SethTisue opened this issue · 15 comments
so e.g. at https://scala-ci.typesafe.com/job/scala-2.12.x-validate-main/4276/console:
[error] (partest / publish) java.io.IOException: PUT operation to URL https://scala-ci.typesafe.com/artifactory/scala-pr-validation-snapshots;build.timestamp=1603922141869/org/scala-lang/scala-partest/2.12.13-bin-bfc824a-SNAPSHOT/scala-partest-2.12.13-bin-bfc824a-SNAPSHOT.pom failed with status code 413: Request Entity Too Large; Response Body: {
[error] "errors" : [ {
[error] "status" : 413,
[error] "message" : "Datastore disk usage is too high. Contact your Artifactory administrator to add additional storage space or change the disk quota limits."
[error] } ]
[error] }
iirc, last time it happened we dealt with this by zeroing out https://scala-ci.typesafe.com/artifactory/scala-pr-validation-snapshots/ — or did we not zero it out entirely, but just deleted the oldest builds? I can't remember
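For the record, either approach can be scripted against Artifactory's Delete Item REST call; a minimal sketch, assuming admin credentials in the environment, with a made-up folder path standing in for whatever old build we'd actually target:

# delete one old snapshot folder (the path here is a placeholder, not a real build)
curl -u "$ARTIFACTORY_USER:$ARTIFACTORY_PASS" -X DELETE \
  "https://scala-ci.typesafe.com/artifactory/scala-pr-validation-snapshots/org/scala-lang/scala-compiler/2.12.0-bin-0000000-SNAPSHOT/"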
the PR validation snapshots do have some value even after a PR is merged, since they enable per-commit bisecting of regressions. (in contrast to the mergelies at https://scala-ci.typesafe.com/artifactory/scala-integration/ , which enable coarser-grained bisecting: per PR. the mergelies, we intend to retain indefinitely)
recent article (October 1, 2020) with advice: https://jfrog.com/knowledge-base/artifactory-cleanup-best-practices/
I went through all the remote repositories and set them to expire cached artifacts after 720 hours (30 days), then ran "Cleanup Unused Cached Artifacts". This is unlikely to buy us more than a small amount of time, though: https://scala-ci.typesafe.com/artifactory/webapp/#/admin/advanced/storage_summary shows that the lion's share of storage is going to the PR validation snapshots and mergelies.
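(The same per-repo breakdown should also be available over the REST API, which is handier for keeping an eye on it from a script; a sketch, assuming admin credentials and jq on the box:)

# per-repository storage numbers, same data as the storage summary page
curl -s -u "$ARTIFACTORY_USER:$ARTIFACTORY_PASS" \
  "https://scala-ci.typesafe.com/artifactory/api/storageinfo" \
  | jq '.repositoriesSummaryList[] | {repoKey, usedSpace, itemsCount}'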
I'll run my script (#636 (comment)) to delete from pr-validation-snapshots what's older than 2019.
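(Not the actual script, but the shape of it: ask AQL for everything in the repo created before a cutoff, then delete each result. The cutoff date and credentials below are placeholders.)

#!/usr/bin/env bash
set -euo pipefail
HOST="https://scala-ci.typesafe.com/artifactory"
AUTH="$ARTIFACTORY_USER:$ARTIFACTORY_PASS"
CUTOFF="2019-01-01"

# AQL query for artifacts in the repo created before the cutoff
curl -s -u "$AUTH" -X POST -H "Content-Type: text/plain" "$HOST/api/search/aql" \
  -d "items.find({\"repo\":\"scala-pr-validation-snapshots\",\"created\":{\"\$lt\":\"$CUTOFF\"}})" \
  | jq -r '.results[] | "\(.repo)/\(.path)/\(.name)"' \
  | while read -r item; do
      echo "deleting $item"
      curl -s -u "$AUTH" -X DELETE "$HOST/$item" > /dev/null
    done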
Seems we already deleted what's older than 2019, and half of 2019 too. So I'm going for all artifacts for non-merged commits older than 2020.
This didn't help enough, we're still at 85%.
I noticed
- 28 gigs in scala-release-temp. I think we used this at some point for temporary builds, e.g. for benchmarking, to avoid putting them in scala-integration (scala/scala@6ff389166e). We can probably clean that up (a rough sketch of how is below this list).
- Artifactory's internal database takes 67 gigs (du -h /var/opt/jfrog/artifactory/data/derby). There's a function to compress it (https://www.jfrog.com/confluence/display/JFROG/Regular+Maintenance+Operations#RegularMaintenanceOperations-Storage); after failing a few times with an error message it eventually ran, but it didn't help. Still 67 gigs.
- I can delete more PR validation builds; that repo is at 110 gigs.
- scala-integration is at 207 gigs, and we probably don't ever want to remove anything from there.
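For the scala-release-temp cleanup, the File List API should make it easy to double-check that nothing in there is still wanted before deleting anything (a sketch; credentials are placeholders, and the actual delete would be the same Delete Item call as in the earlier sketch):

# recursively list everything in scala-release-temp before deciding what to delete
curl -s -u "$ARTIFACTORY_USER:$ARTIFACTORY_PASS" \
  "https://scala-ci.typesafe.com/artifactory/api/storage/scala-release-temp?list&deep=1" \
  | jq -r '.files[].uri'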
@SethTisue let me know what you think. IMO we can also bump the EBS volume size.
I've never used scala-release-temp or seen it used, so I have no objection to zeroing that one out.
IMO we can also bump the EBS volume size
Can that be done without a lot of rebuilding effort?
Resizing the EBS volume is simple; then I'll give the resize2fs command a try.
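Roughly this, in AWS CLI terms (the volume id is a placeholder, and the first two steps can just as well be done from the console):

# snapshot first, then grow the volume, then grow the filesystem on-line
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "artifactory data, pre-resize"
aws ec2 modify-volume --volume-id vol-0123456789abcdef0 --size 600
sudo resize2fs /dev/xvdk   # ext4 grows to fill the enlarged device while mounted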
Currently taking a snapshot of the volume (that's quite slow).
Resizing worked fine
- took a snapshot of the EBS volume (will keep it around for a while)
- using "Modify Volume" changed the size to 600 gigs
- changed the filesystem according to https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/recognize-expanded-volume-linux.html
admin@ip-172-31-10-237:~$ df -hT
...
/dev/xvdk ext4 493G 419G 52G 90% /var/opt/jfrog/artifactory/data
...
admin@ip-172-31-10-237:~$ lsblk
...
xvdk 202:160 0 600G 0 disk /var/opt/jfrog/artifactory/data
...
admin@ip-172-31-10-237:~$ sudo /sbin/resize2fs /dev/xvdk
resize2fs 1.43.4 (31-Jan-2017)
Filesystem at /dev/xvdk is mounted on /var/opt/jfrog/artifactory/data; on-line resizing required
old_desc_blocks = 32, new_desc_blocks = 38
The filesystem on /dev/xvdk is now 157286400 (4k) blocks long.
admin@ip-172-31-10-237:~$ df -hT
...
/dev/xvdk ext4 591G 419G 146G 75% /var/opt/jfrog/artifactory/data
....
Fixed. 🤞
@lrytz I don't remember if it was on a ticket somewhere or in private communication, but you asked if I wanted the behemoths resized to help the community build, and I said yes please — when you have time.
what I should have added: lately the most common disk space problem on the behemoths is actually inodes, not raw space. the community build is amazingly inode hungry. is there a way to get more inodes as well as more gigabytes?
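(For context, a quick way to see the difference on any of the machines; lately it's the inode column, not the space column, that hits 100% first:)

df -h   # disk space usage per filesystem
df -i   # inode usage per filesystem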