
MediaWiki - image sizes and duplication


Open question about image sizes: how to avoid wasting substantial disk space and, ideally, how to integrate with the images stored by dweb-mirror (DM).

  • figure out strategy - keep 400px & 2000px on MW and full size on DM (for now)
  • remove unneeded images on MW - asked DK 8jan
  • Run a crawl and look at size
  • later: figure out how to a) cross-link or b) pass a URL to React so that it fetches the image from DM

David says there should be only the full-size image and a thumbnail (~400px), but that's not what I'm seeing:

-rw-r--r-- 1 www-data pi   1746 Dec 28 23:40 images/c/cc/120px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi   3048 Dec 28 23:39 images/0/04/180px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi   4539 Dec 28 23:40 images/b/b0/240px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi   7120 Dec 28 23:40 images/7/7c/320px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi  11038 Dec 28 23:40 images/f/f1/400px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi  23891 Dec 28 23:40 images/5/50/600px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi  45855 Dec 28 23:40 images/f/ff/800px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi 109011 Dec 28 23:39 images/6/6e/1200px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi 225615 Dec 28 23:40 images/9/9e/1599px-usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi 644082 Dec 29 12:36 images/7/72/20190913062124!usadha-cetik_56.jpeg
-rw-r--r-- 1 www-data pi 644082 Dec 29 12:51 images/d/d6/usadha-cetik_56.jpeg 

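That shows, I believe, 11 copies of the same image: nine resized versions (120px to 1599px), one timestamp-prefixed copy, and the current file. The last two are both 644082 bytes.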
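
To see how much of the disk goes to each width across the whole tree, rather than for one image, something like the following could help. It is only a sketch: it assumes resized copies are always named `<width>px-<name>` as in the listing above, it relies on GNU find's `-printf`, and `summarise_widths` is a name made up for this example.

```shell
# Print one line per width: "<width> <file count> <total bytes>".
# Assumption: resized copies are named "<width>px-<name>", as in the listing.
summarise_widths() {
  # $1: images directory (defaults to the current directory)
  find "${1:-.}" -type f -name '[0-9]*px-*' -printf '%f %s\n' |
    awk '
      $1 !~ /^[0-9]+px-/ { next }    # skip names that only resemble a thumbnail
      {
        width = $1
        sub(/px-.*/, "", width)      # keep just the leading number
        count[width]++
        bytes[width] += $NF          # size is the last field on the line
      }
      END {
        for (w in count) printf "%s %d %.0f\n", w, count[w], bytes[w]
      }' |
    sort -n
}
```

Running `summarise_widths "$MW/images"` before and after a cleanup gives a rough picture of which widths are worth removing.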

From Slack: MA> so there is 400px for thumbnail, 2000px for ‘full-size’ on MW and then original at much larger resolution which you are getting via IIIF for the enhancement process. So I’d expect we have the 400px and 2000px in MW and cached in DW, that would be nice (but not if it's too hard) to eliminate duplication, and then a full-size that will be cached in DW if someone starts editing the image, and passed to React via IIIF.
DK> enhancement occurs in either viewing or editing, but only when zoomed in
so, the only sensible way to make it work offline is to already have the original IA image available
enhancement is not really optional -- you can't comfortably read everything without it -- hence my idea of removing the 2000px version and going directly to the original

For now, we'll keep the 400px and 2000px versions in MW, and cache the full image on DM.
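
A more selective cleanup that follows this strategy might look like the sketch below. It lists resized copies whose width is neither 400 nor 2000, and only deletes them when asked to. The function name `prune_thumbnails` and the `delete` argument are invented for illustration, the `<width>px-<name>` naming is assumed from the listing above, and `-delete` needs GNU or BSD find.

```shell
# List resized copies that are neither 400px nor 2000px wide.
# Pass "delete" as the second argument to remove them instead of listing.
# Assumption: resized copies are named "<width>px-<name>".
prune_thumbnails() {
  # $1: images directory; $2: "delete" to remove, anything else lists only
  if [ "${2:-list}" = "delete" ]; then
    action=-delete
  else
    action=-print
  fi
  find "$1" -type f -name '[0-9]*px-*' \
    ! -name '400px-*' ! -name '2000px-*' "$action"
}
```

Running `prune_thumbnails "$MW/images"` first shows what would go; `prune_thumbnails "$MW/images" delete` then removes it. Unlike a bare `[0-9]*` pattern, this leaves timestamp-prefixed files and originals whose names happen to start with a digit alone.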

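# Delete everything whose name starts with a digit, e.g. the resized and timestamp-prefixed copies listed above.
# Careful: the pattern also matches originals whose names start with a digit, and the numeric hash directories (find only removes those if they are empty).
cd "$MW/images" && find . -name "[0-9]*" -delete # Freed up space from 5.5GB of images to 3.02GB (51000 images)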
php maintenance/checkImages.php | grep missing | sed -E 's/^(.+):.+/File:\1/' |sudo php 
# Should then fix the DB
php maintenance/deleteBatch.php # VERY slow 
# And to clean up
php maintenance/deleteArchivedRevisions.php --delete
php maintenance/deleteArchivedFiles.php --delete