ArctosDB/arctos

Help with locating media files on TACC

Closed this issue · 11 comments

I need to access files on TACC that I can no longer find. I have 21 daily folders that may be in this tar file (arctos-pgdata.tar.bz2)?

However, when I click on the zipped file, it just downloads (says it will take more than 80 minutes), so I cancelled the procedure.
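One way to check an archive for specific folders without unpacking it is to walk its member list. The following is a minimal sketch, assuming the archive sits on a disk the script can read directly (a bz2-compressed tar has no index, so the whole file is still read once). The function name `folders_in_archive` is hypothetical and not part of any TACC or Arctos tooling.

```python
import tarfile


def folders_in_archive(archive_path, wanted):
    """Return the subset of `wanted` folder names that occur as a
    path component of any member in the tar archive."""
    wanted = set(wanted)
    found = set()
    # mode "r:*" lets tarfile detect the compression (bz2 here).
    with tarfile.open(archive_path, mode="r:*") as tar:
        for member in tar:
            # Split the member path into components and intersect.
            found |= wanted.intersection(member.name.split("/"))
            if found == wanted:
                break  # every folder has been seen; stop early
    return found
```

Calling `folders_in_archive("arctos-pgdata.tar.bz2", ["2020-11-24", "2020-12-10"])` would return whichever of those names appear in the archive, or an empty set if none do.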


We essentially went back and brightened all of our early jpeg images for the oMeso project. We've already overwritten all the files present in /corral-tacc/projects/arctos/web/ucm/oMeso_herps, but again, we have 21 folders outstanding that I can't find. I'm assuming they are in the tar file, but they could be somewhere else(?). I searched around in the arctos project but couldn't find anything.

These are the missing folders; all the images within them share the following base URL: https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/ (e.g., https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/2020-11-24/UCM_HERP_51385_Urosaurus_nigricaudus_ventral.jpg).

2020-11-24
2020-12-10
2021-01-18
2021-01-25
2021-02-01
2021-02-08
2021-03-02
2021-03-22
2021-03-30
2021-04-12
2021-04-27
2021-05-04
2021-05-25
2021-06-07
2021-06-08
2021-06-15
2021-06-28
2021-07-05
2021-07-12
2021-07-27
2021-08-02

Alternatively, I have all of the old media bulkloaders, so I *could* reload all of these folders to the new allocation, /corral-tacc/projects/arctos/web/ucm/oMeso_herps. However, I would need help editing the existing Arctos media to have "ucm" instead of "UCM" in the URL (maybe that could be magicked?). Or help with deleting all those images and reloading, but that seems like a pain...

@ebraker that would have been a direct allocation, not through Arctos. Chris Jordan should be able to point you in the right direction.

@dustymc Chris says these images are on an old iRODS system and the original access method to them is shut down.

Is there any chance that you could globally update all media objects with URIs that contain the address "https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/" to "https://web.corral.tacc.utexas.edu/ucm/oMeso_Herps/" (essentially replace UCM with ucm)?

If this is a possibility, I will go ahead and copy the 19 folders' worth of photos that I have stored on our imaging computer onto the ucm/oMeso_Herps web directory and the updated URI should correctly point to them. Then I can just ask Chris to delete the UCM/oMeso_Herps folder on the old iRODS system.

If not, I will see what I can figure out with Chris.

@ebraker yes I can make replace-updates.
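A replace-update of this kind can be sketched as a preview query followed by an `UPDATE`. The example below is only an illustration under stated assumptions: it uses SQLite through Python's `sqlite3` module so that it is self-contained (Arctos itself runs on PostgreSQL, which also provides `replace()`), the table name `media` is hypothetical, and the lowercase target prefix is an assumption based on the "replace UCM with ucm" request. Only the column name `media_uri` comes from this thread.

```python
import sqlite3

OLD = "https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/"
# Assumed target prefix; the thread never spells it out in full.
NEW = "https://web.corral.tacc.utexas.edu/ucm/oMeso_Herps/"


def preview_and_update(conn, old=OLD, new=NEW, apply=False):
    """Return (current_uri, proposed_uri) pairs for URIs starting with
    `old`; rewrite them in place only when apply=True."""
    cur = conn.cursor()
    # instr(...) = 1 is a case-sensitive "starts with" test in SQLite,
    # so URIs that already use the lowercase prefix are left alone.
    rows = cur.execute(
        "SELECT media_uri, replace(media_uri, ?, ?) "
        "FROM media WHERE instr(media_uri, ?) = 1",
        (old, new, old),
    ).fetchall()
    if apply:
        cur.execute(
            "UPDATE media SET media_uri = replace(media_uri, ?, ?) "
            "WHERE instr(media_uri, ?) = 1",
            (old, new, old),
        )
        conn.commit()
    return rows
```

Running it first with `apply=False` gives a list that can be reviewed before anything changes, and a second call with `apply=True` performs the rewrite.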

Ooh, great! I will put this on my list for tomorrow and message you when I'm ready to run that update. Thanks!

@dustymc OK, I have moved all my folders to the /corral-tacc/projects/arctos/web/ucm/oMeso_herps allocation. Will you replace all "https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/" with "https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/" please?

> replace all "https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/" with "https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/" please?

I think you had a copypaste fail or something. I took a guess, let me know if this looks like the right thing and everything and such, or where I got lost.

temp_me_up.csv.zip

arctosprod@arctos>> select count(*) from temp_me_up;
 count 
-------
   359


arctosprod@arctos>> select split_part(replace(media_uri,'https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/',''),'/',1) f, count(*) c from temp_me_up group by f order by f;
     f      | c  
------------+----
 2020-11-24 | 56
 2020-12-10 | 38
 2021-01-18 | 13
 2021-01-25 |  9
 2021-02-01 | 13
 2021-02-08 | 23
 2021-03-02 | 24
 2021-03-22 | 21
 2021-03-30 | 20
 2021-04-12 | 26
 2021-04-27 | 24
 2021-05-04 | 22
 2021-05-25 | 10
 2021-06-07 | 20
 2021-06-08 |  8
 2021-06-15 | 14
 2021-06-28 |  6
 2021-07-05 | 12
(18 rows)
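The per-folder counts in this result can be cross-checked against the 21 folders requested at the top of the thread. The sketch below uses only the numbers shown in this issue; the helper `folder_of` is a hypothetical Python equivalent of the `split_part(replace(...), '/', 1)` expression in the query.

```python
PREFIX = "https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/"

# The 21 folders listed in the original request.
REQUESTED = [
    "2020-11-24", "2020-12-10", "2021-01-18", "2021-01-25", "2021-02-01",
    "2021-02-08", "2021-03-02", "2021-03-22", "2021-03-30", "2021-04-12",
    "2021-04-27", "2021-05-04", "2021-05-25", "2021-06-07", "2021-06-08",
    "2021-06-15", "2021-06-28", "2021-07-05", "2021-07-12", "2021-07-27",
    "2021-08-02",
]

# Folder -> media count, copied from the query output.
FOUND = {
    "2020-11-24": 56, "2020-12-10": 38, "2021-01-18": 13, "2021-01-25": 9,
    "2021-02-01": 13, "2021-02-08": 23, "2021-03-02": 24, "2021-03-22": 21,
    "2021-03-30": 20, "2021-04-12": 26, "2021-04-27": 24, "2021-05-04": 22,
    "2021-05-25": 10, "2021-06-07": 20, "2021-06-08": 8, "2021-06-15": 14,
    "2021-06-28": 6, "2021-07-05": 12,
}


def folder_of(uri, prefix=PREFIX):
    """Strip the base URL and return the first remaining path segment."""
    return uri.replace(prefix, "").split("/")[0]


# Requested folders with no media records in the result.
missing = sorted(set(REQUESTED) - set(FOUND))
# Sum of per-folder counts, to compare with the count(*) above.
total = sum(FOUND.values())
```

With these inputs, `total` comes to 359, which agrees with the `count(*)` result, and `missing` holds the three requested folders that have no rows here: 2021-07-12, 2021-07-27, and 2021-08-02.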


Doh! You guessed right! Thank you!

Sorry I wasn't clear: I DID NOT update anything. If that looks right, I can though.

oh, haha, YES! that looks right!

done