MediaWiki - image sizes and duplication
Open question about image sizes: how to avoid wasting substantial disk space, and ideally how to integrate with images stored by dweb-mirror.
- figure out strategy - keep 400px & 2000px on MW and full size on DM (for now)
- remove unneeded images on MW - asked DK 8jan
- Run a crawl and look at size
- later: figure out how to a) cross-link or b) pass React a URL that fetches the image from DM
David says there should be full size and thumbnail (~400px) only, but that's not what I'm seeing:
-rw-r--r-- 1 www-data pi 1746 Dec 28 23:40 images/c/cc/120px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi 3048 Dec 28 23:39 images/0/04/180px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi 4539 Dec 28 23:40 images/b/b0/240px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi 7120 Dec 28 23:40 images/7/7c/320px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi 11038 Dec 28 23:40 images/f/f1/400px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi 23891 Dec 28 23:40 images/5/50/600px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi 45855 Dec 28 23:40 images/f/ff/800px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi 109011 Dec 28 23:39 images/6/6e/1200px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi 225615 Dec 28 23:40 images/9/9e/1599px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi 644082 Dec 29 12:36 images/7/72/20190913062124!usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi 644082 Dec 29 12:51 images/d/d6/usadha-cetik_56.jpeg
That shows, I believe, 11 copies of the same image: nine resized variants, one archived version, and the current original.
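To check the count across the whole tree rather than by eye, the files can be grouped by base name. This is a minimal sketch, assuming resized variants carry an `NNNpx-` prefix and archived versions a 14-digit timestamp followed by `!`, as in the listing above. `copies_per_image` is a hypothetical helper name, and file names are assumed to contain no whitespace.

```shell
# Count how many stored files share a base name, most duplicated first.
# Assumes the naming seen in the listing: "NNNpx-" for resized variants and
# "YYYYMMDDHHMMSS!" for archived versions; names must not contain whitespace.
copies_per_image() {
  find "${1:-.}" -type f |
    sed -E 's|.*/||; s/^[0-9]{14}!//; s/^[0-9]+px-//' |
    sort | uniq -c | sort -rn |
    awk '{ print $1, $2 }'
}
```

Something like `copies_per_image "$MW/images" | head` would then show the worst offenders first.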
From Slack: MA> So there is 400px for the thumbnail, 2000px for 'full-size' on MW, and then the original at much larger resolution, which you are getting via IIIF for the enhancement process. So I'd expect we have the 400px and 2000px in MW and cached in DW; it would be nice (but not if it's too hard) to eliminate that duplication. Then there is a full-size image that will be cached in DW *if* someone starts editing the image, and passed to React via IIIF.
DK> enhancement occurs in either viewing or editing, but only when zoomed in
so, the only sensible way to make it work offline is to already have the original IA image available
enhancement is not really optional -- you can't comfortably read everything without it -- hence my idea of removing the 2000px version and going directly to the original
For now, we'll keep 400px and 2000px in MW, and cache the full image on DM.
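The keep-400px-and-2000px decision could be checked with a dry run that only lists resized variants of other widths. This is a sketch under assumptions: the `NNNpx-` naming from the listing, no whitespace or newlines in file names, and the hypothetical names `KEEP_WIDTHS` and `unwanted_thumbs`. It deletes nothing.

```shell
# List resized variants whose width is NOT on the keep list (dry run only).
# KEEP_WIDTHS and the "NNNpx-" naming are assumptions taken from this thread.
KEEP_WIDTHS="${KEEP_WIDTHS:-400 2000}"

unwanted_thumbs() {
  find "${1:-.}" -type f -name '[0-9]*px-*' | while IFS= read -r path; do
    name=${path##*/}        # strip the directory part
    width=${name%%px-*}     # text before the first "px-"
    case "$width" in
      ''|*[!0-9]*) continue ;;   # prefix is not purely numeric: not a thumbnail
    esac
    case " $KEEP_WIDTHS " in
      *" $width "*) ;;           # width is on the keep list, skip it
      *) printf '%s\n' "$path" ;;
    esac
  done
}
```

Running `unwanted_thumbs "$MW/images" | wc -l` gives a count to review before anything is removed. Note that a bare `[0-9]*` name pattern is broader than this: it also matches archived versions such as `20190913062124!usadha-cetik_56.jpeg`, any original whose name starts with a digit, and the numeric hash directories.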
cd $MW/images ; find . -name "[0-9]*" -delete ; # Freed up space from 5.5GB of images to 3.02GB (51000 images)
# Should then fix the DB
php maintenance/checkImages.php | grep missing | sed -E 's/^(.+):.+/File:\1/' | sudo php maintenance/deleteBatch.php # VERY slow
# And to clean up
php maintenance/deleteArchivedRevisions.php --delete
php maintenance/deleteArchivedFiles.php --delete
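Since the batch delete is slow and destructive, it may be worth writing the titles to a file and reviewing them first. The sketch below assumes `checkImages.php` prints lines of the form `<name>: missing` (worth confirming against real output). `missing_titles` is a hypothetical helper, and the `/tmp` path is only an example.

```shell
# Turn checkImages.php-style output ("<name>: missing") into File: page titles.
# The exact output format is an assumption; inspect the real output first.
missing_titles() {
  sed -nE 's/^(.+): missing$/File:\1/p'
}

# Possible usage (not run here):
#   php maintenance/checkImages.php | missing_titles > /tmp/missing-titles.txt
#   wc -l /tmp/missing-titles.txt     # review the list before deleting
#   sudo php maintenance/deleteBatch.php /tmp/missing-titles.txt
```

Anchoring on `: missing` at the end of the line, rather than using `grep missing`, avoids picking up files that merely have "missing" in their name but were reported for another reason.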