Tired of fighting the GUI. Going to implement this logic in MacOS's Automations scripting environment. See /MacOSAutomation/JoeregerMediaArchiveMirroringScript.sh
This project aims to create an application with a graphical user interface (GUI) to help me sort through my large collection of photos (approximately 2,000,000 images/videos) efficiently. The goal is to categorize each item as "Public," "Private," or "Unsure" so that a subset can eventually be published.
My photo collection consists of:
- Roughly 2,000,000 images and videos
- A mix of dSLR and cameraphone photos
- Organized in a single master directory
- Subdirectories based on years, life events, months, etc.
- Create a GUI application for efficient photo sorting
- Allow for incremental progress over weeks or months
- Achieve completeness in sorting all images
- Prepare for eventual public sharing of selected photos
- Directory and Image Views: The app will present both directory and individual image views for sorting.
- Quick Categorization: Users can quickly categorize images as "Public," "Private," or "Unsure."
- Progress Tracking: The app will maintain a list of sorted and unsorted files.
- Completeness: The app will continue presenting unsorted images until all are categorized.
- Non-destructive: Initial sorting will not move or alter original files.
- App presents a directory or image view
- User selects a categorization option (Public, Private, Unsure)
- App records the decision and moves to the next unsorted item
- Process repeats until all images are sorted
- Implement file moving functionality based on categorization
- Add machine learning for automatic categorization suggestions
- Integrate with cloud storage or photo management services
- Choose a suitable GUI framework (e.g., PyQt, Tkinter, Electron)
- Implement efficient file system scanning and metadata reading
- Design a database or file-based system to store sorting progress
- Ensure the app can handle large numbers of files without performance issues
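
A minimal sketch of how the scanning and progress-tracking items above could fit together, assuming SQLite as the progress store. The table name `decisions` and the function names are hypothetical, not part of the project.

```python
import os
import sqlite3
from typing import Iterable, Iterator, Optional

CATEGORIES = ("Public", "Private", "Unsure")


def scan_files(root: str) -> Iterator[str]:
    """Yield every regular file under root, walking iteratively with os.scandir."""
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    yield entry.path


def open_progress_db(db_path: str) -> sqlite3.Connection:
    """Open (or create) the progress store; a NULL category means 'unsorted'."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS decisions (path TEXT PRIMARY KEY, category TEXT)"
    )
    return conn


def register_files(conn: sqlite3.Connection, paths: Iterable[str]) -> None:
    """Add newly discovered files as unsorted; existing decisions are kept."""
    conn.executemany(
        "INSERT OR IGNORE INTO decisions (path, category) VALUES (?, NULL)",
        ((p,) for p in paths),
    )
    conn.commit()


def record_decision(conn: sqlite3.Connection, path: str, category: str) -> None:
    """Store the user's choice for one file."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    conn.execute("UPDATE decisions SET category = ? WHERE path = ?", (category, path))
    conn.commit()


def next_unsorted(conn: sqlite3.Connection) -> Optional[str]:
    """Return the next file without a decision, or None when sorting is complete."""
    row = conn.execute(
        "SELECT path FROM decisions WHERE category IS NULL ORDER BY path LIMIT 1"
    ).fetchone()
    return row[0] if row else None
```

Because decisions live in a database file rather than in memory, a session can stop at any point and resume later, which suits sorting spread over weeks or months.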
PUBLIC_ROOT = "test_public"
PRIVATE_ROOT = "test_private"
SAFE_DELETE_ROOT = "test_safe_delete"
METADATA_FOLDER = "test_metadata"
This system manages and sorts a large collection of photos and videos between public and private directories, tracking review status and maintaining file integrity.
- Two root directories:
- public_root: for publicly accessible photos/videos
- private_root: for private photos/videos
- Root directory paths stored in constants.py
- Year-based organization:
- First-level subdirectories named by year (e.g., 2003, 2004, 2005)
- Each year directory may contain multiple files, subdirectories, and nested subdirectories
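
As an illustration of the layout above, the following sketch derives the year directory of a path and maps a path from one root to the same relative location under the other. The helper names `year_of` and `counterpart` are hypothetical; the root values are the test constants.

```python
from pathlib import Path

PUBLIC_ROOT = Path("test_public")
PRIVATE_ROOT = Path("test_private")


def year_of(path: Path, root: Path) -> str:
    """Return the first-level (year) directory name of a path under root."""
    relative = path.relative_to(root)  # raises ValueError if path is outside root
    if not relative.parts:
        raise ValueError("path is the root itself and has no year directory")
    return relative.parts[0]


def counterpart(path: Path, source_root: Path, target_root: Path) -> Path:
    """Map a path under source_root to the same relative path under target_root."""
    return target_root / path.relative_to(source_root)
```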
- Function sort(path: str, is_public: bool) -> None in sift_io_utils.py
- Handles individual files and directories
- Sorting process:
- Moves files/directories between public and private roots as needed
- Preserves original directory structure
- Resolves filename conflicts by incrementing
- Verifies file integrity before deleting originals
- Deletes empty directories (except year directories)
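
Two of the steps above, sketched under assumptions: conflicts are resolved by appending `_1`, `_2`, and so on to the file stem (the exact naming scheme is not specified here), and empty directories are pruned upward until a year directory is reached. Both function names are hypothetical.

```python
from pathlib import Path


def resolve_conflict(destination: Path) -> Path:
    """Return destination, or the first 'stem_N.ext' variant that does not exist."""
    if not destination.exists():
        return destination
    counter = 1
    while True:
        candidate = destination.with_name(
            f"{destination.stem}_{counter}{destination.suffix}"
        )
        if not candidate.exists():
            return candidate
        counter += 1


def prune_empty_dirs(start: Path, root: Path) -> None:
    """Remove empty directories from start upward, never the year directory or root."""
    start.relative_to(root)  # raises ValueError if start is not under root
    current = start
    # A year directory is a direct child of root, so stop once we reach one.
    while current != root and current.parent != root:
        if any(current.iterdir()):
            break  # directory still has content; nothing above it is empty either
        current.rmdir()
        current = current.parent
```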
- Files are moved to SAFE_DELETE_ROOT instead of permanent deletion
- SAFE_DELETE_ROOT has public/ and private/ subdirectories
- Files in SAFE_DELETE_ROOT are kept indefinitely
- Checksum verification after copying, before deleting original
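
A rough sketch of the copy, verify, then retire sequence described above. SHA-256 is an assumption (no checksum algorithm is named here), and `verified_move` is a hypothetical helper that does not handle name collisions at the safe-delete location.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large videos are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_move(source: Path, destination: Path, safe_delete_path: Path) -> None:
    """Copy source to destination, verify the copy, then retire the original."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copy2 also preserves timestamps
    if sha256_of(source) != sha256_of(destination):
        destination.unlink()  # discard the bad copy; the original stays in place
        raise IOError(f"checksum mismatch while copying {source}")
    # Only after verification is the original moved to the safe-delete area.
    safe_delete_path.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(safe_delete_path))
```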
- JSON format, one file per year in each root directory
- Fields for each file:
- status: "public" or "private"
- last_reviewed: timestamp of last review (ISO 8601 format)
- reviewed: boolean indicating manual review status
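
One way an entry with these three fields might be written to a per-year JSON file. This is a sketch only: `make_entry` and `record_review` are hypothetical names, and the metadata file path is whatever the caller passes in.

```python
import json
from datetime import datetime
from pathlib import Path


def make_entry(is_public: bool, reviewed: bool = True) -> dict:
    """Build one metadata entry with the status, last_reviewed and reviewed fields."""
    return {
        "status": "public" if is_public else "private",
        "last_reviewed": datetime.now().isoformat(),  # ISO 8601, local time
        "reviewed": reviewed,
    }


def record_review(metadata_file: Path, relative_path: str, is_public: bool) -> dict:
    """Load the year file if it exists, upsert one entry, and write it back."""
    data = json.loads(metadata_file.read_text()) if metadata_file.exists() else {}
    data[relative_path] = make_entry(is_public)
    metadata_file.parent.mkdir(parents=True, exist_ok=True)
    metadata_file.write_text(json.dumps(data, indent=2))
    return data[relative_path]
```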
- Directory-level progress bars
- Shows percentage of reviewed files to total files
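
The percentage behind a directory's progress bar could be computed roughly as follows. The function name is hypothetical, and treating an empty directory as 0% is an assumption.

```python
from typing import Dict, Iterable


def review_progress(files: Iterable[str], metadata: Dict[str, dict]) -> float:
    """Return the percentage (0 to 100) of files whose metadata marks them reviewed."""
    total = 0
    reviewed = 0
    for path in files:
        total += 1
        # Files without a metadata entry count as not yet reviewed.
        if metadata.get(path, {}).get("reviewed", False):
            reviewed += 1
    return 100.0 * reviewed / total if total else 0.0
```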
- Permission issues: Error and abort
- Symbolic/hard links: Console error and abort
- Failed operations logged to console and recorded in metadata
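
A pre-flight check along these lines could detect links and permission problems before any file is touched. This is a sketch for POSIX-style file systems; `ensure_sortable` and `UnsupportedLinkError` are hypothetical names.

```python
import os
import stat


class UnsupportedLinkError(Exception):
    """Raised when a path is a symbolic link or a file with extra hard links."""


def ensure_sortable(path: str) -> None:
    """Raise if path is a link or lacks read/write permission; otherwise return."""
    info = os.lstat(path)  # lstat inspects the link itself, not its target
    if stat.S_ISLNK(info.st_mode):
        raise UnsupportedLinkError(f"symbolic link: {path}")
    # A regular file with more than one link count shares data with another name.
    if stat.S_ISREG(info.st_mode) and info.st_nlink > 1:
        raise UnsupportedLinkError(f"hard-linked file: {path}")
    if not os.access(path, os.R_OK | os.W_OK):
        raise PermissionError(f"insufficient permissions: {path}")
```

A caller would catch these exceptions, print the error to the console, and abort the current operation.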
- One JSON metadata file per year in each root directory
- Python Qt application (details to be provided separately)
- Files with no extension: Processed normally
- Long file names:
- Max length: 255 characters
- Truncation and unique identifier appending if exceeded
- Logging of modifications
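
The long-file-name rule might look like the sketch below. Using an 8-character SHA-1 prefix as the unique identifier is an assumption, as is counting characters rather than bytes; `shorten_name` is a hypothetical helper.

```python
import hashlib
import logging
import os

MAX_NAME_LENGTH = 255


def shorten_name(name: str, max_length: int = MAX_NAME_LENGTH) -> str:
    """Return name unchanged if it fits, else a truncated stem plus a short hash."""
    if len(name) <= max_length:
        return name
    stem, extension = os.path.splitext(name)
    unique = hashlib.sha1(name.encode("utf-8")).hexdigest()[:8]
    keep = max_length - len(extension) - len(unique) - 1  # 1 for the "_" separator
    if keep < 1:
        raise ValueError(f"extension too long to shorten: {name}")
    shortened = f"{stem[:keep]}_{unique}{extension}"
    logging.info("renamed %s to %s", name, shortened)  # record the modification
    return shortened
```

Deriving the identifier from the full original name keeps the result deterministic, so two long names that share a prefix still map to different shortened names.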
- sort() updates last_reviewed timestamp
- Checksum verification errors: Log to console, abort, preserve original file
- sort() handles multiple files/folders within given path, but not multiple sibling directories
- Designed for local storage only
- No summary statistics provided
- No batch operations across multiple sibling directories
The sort() function is the primary operation, updating the last_reviewed timestamp in the metadata.
Checksum verification errors are logged to the console, the operation is aborted, and the original file is preserved.
The sort() function accepts a single path, which may contain multiple files and folders to be processed.
This file contains the DirectoryDetailsPane class, which displays information about the selected directory and provides sorting options.
Key methods:
- update_directory(path): Updates the displayed directory information
- refresh_stats(): Refreshes the statistics for the current directory
- batch_sort(is_public): Initiates batch sorting of files in the current directory
This file implements the DirectoryTreePane class, which displays a tree view of the directory structure.
Key methods:
- populate_tree(): Builds the directory tree structure
- refresh_directory_structure(): Refreshes the entire directory tree
- refresh_stats(path): Updates the progress statistics for a specific directory
This file contains the FileGridItem class, which represents individual files in the grid view.
Key methods:
- update_border(): Updates the border color based on the file's review status
- sort_public() and sort_private(): Sort the file as public or private
This file implements the FilesGridPane class, which displays a grid of files in the selected directory.
Key methods:
- update_directory(path): Updates the displayed files for the given directory
- show_zoomed(file_path): Displays a zoomed view of the selected file
- sort_public(file_path) and sort_private(file_path): Sort a file as public or private
This file contains the MainWindow class, which is the main application window.
Key methods:
- on_directory_selected(path): Handles directory selection events
- on_directory_sorted(path): Handles directory sorting events
This file implements video-related widgets, including VideoThumbnailWidget and VideoPlayerWidget.
Key methods:
- play() and stop(): Control video playback
- set_position(position): Sets the video playback position
This file contains the ZoomedView class, which displays a zoomed view of selected files.
Key methods:
- show_zoomed(file_path, sift_io, sift_metadata): Displays a zoomed view of the file
- sort_public() and sort_private(): Sort the current file as public or private
This file implements the ScrollPositionManager class, which manages scroll positions for different views.
Key methods:
- save_scroll_position(path, position): Saves the scroll position for a specific path
- get_scroll_position(path): Retrieves the saved scroll position for a path
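
For orientation, a dictionary-backed class with the same two method names could be as small as the sketch below. It is an illustration, not the project's implementation; returning 0 for an unknown path is an assumption.

```python
from typing import Dict


class ScrollPositionManager:
    """Remember the last scroll offset for each directory path."""

    def __init__(self) -> None:
        self._positions: Dict[str, int] = {}

    def save_scroll_position(self, path: str, position: int) -> None:
        """Store the scroll offset for path, replacing any earlier value."""
        self._positions[path] = position

    def get_scroll_position(self, path: str) -> int:
        """Return the saved offset, or 0 (top of the view) if none was saved."""
        return self._positions.get(path, 0)
```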
This file contains the SiftIOUtils class, which handles file operations and sorting.
Key methods:
- sort(path, is_public): Sorts a file or directory as public or private
- move_file(file_path, is_public): Moves a file between public and private directories
- get_directory_status(dir_path): Retrieves the status of files in a directory
This file implements the SiftMetadataUtils class, which manages metadata for sorted files.
Key methods:
- get_file_status(file_path): Retrieves the status of a file
- update_manual_review_status(file_path, new_status): Updates the review status of a file
- update_file_path(old_path, new_path): Updates metadata when a file is moved
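
To show how the three methods relate, here is an in-memory stand-in that keeps entries in a dict keyed by file path. The class name `InMemoryMetadata`, the return type of `get_file_status`, and the validation of `new_status` are all assumptions.

```python
from datetime import datetime
from typing import Dict, Optional


class InMemoryMetadata:
    """Hypothetical stand-in for SiftMetadataUtils, without any file I/O."""

    def __init__(self) -> None:
        self.entries: Dict[str, dict] = {}

    def get_file_status(self, file_path: str) -> Optional[dict]:
        """Return the entry for file_path, or None if it was never recorded."""
        return self.entries.get(file_path)

    def update_manual_review_status(self, file_path: str, new_status: str) -> None:
        """Mark file_path as manually reviewed with the given status."""
        if new_status not in ("public", "private"):
            raise ValueError(f"unknown status: {new_status}")
        self.entries[file_path] = {
            "status": new_status,
            "last_reviewed": datetime.now().isoformat(),
            "reviewed": True,
        }

    def update_file_path(self, old_path: str, new_path: str) -> None:
        """Re-key an entry after its file has moved; unknown paths are ignored."""
        if old_path in self.entries:
            self.entries[new_path] = self.entries.pop(old_path)
```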
Example JSON metadata format:
"1979/tests/test_01/test_image_2.jpg": {
  "year": "1979",
  "status": "public",
  "last_reviewed": "2024-08-09T08:43:57.732855",
  "reviewed": true
}
These components work together to provide a comprehensive solution for managing and sorting a large collection of photos and videos, with a user-friendly interface and robust backend operations.
example call: sort(PUBLIC_ROOT/1975/foo/, is_public=False)
assert:
- PUBLIC_ROOT/1975/foo/, including all files and subfolders, is moved to PRIVATE_ROOT/1975/foo/
- file collisions in PRIVATE_ROOT/1975/foo/ are handled gracefully and no data is lost
- PUBLIC_ROOT/1975/foo/ is deleted
- PUBLIC_ROOT/1975/foo/ is saved at SAFE_DELETE_ROOT/public/1975/foo/
- if METADATA_FOLDER/public/1975/private_1975.json exists it contains no entries for the files that were moved
- METADATA_FOLDER/public/1975/private_1975.json exists
- METADATA_FOLDER/public/1975/private_1975.json contains entries for all files moved and they are recorded as "reviewed": true
example call: sort(PUBLIC_ROOT/1975/foo/, is_public=True)
assert:
- PUBLIC_ROOT/1975/foo/ exists and still includes all files and subfolders
- no files are moved to PRIVATE_ROOT/1975/foo/
- METADATA_FOLDER/public/1975/private_1975.json exists
- METADATA_FOLDER/public/1975/private_1975.json contains entries for all files and they are recorded as "reviewed": true
example call: sort(PUBLIC_ROOT/1975/foo/bar.png, is_public=False)
assert:
- PUBLIC_ROOT/1975/foo/bar.png is moved to PRIVATE_ROOT/1975/foo/bar.png
- file collisions in PRIVATE_ROOT/1975/foo/bar.png are handled gracefully and no data is lost
- PUBLIC_ROOT/1975/foo/bar.png is deleted
- PUBLIC_ROOT/1975/foo/bar.png is saved at SAFE_DELETE_ROOT/public/1975/foo/bar.png
- if METADATA_FOLDER/public/1975/private_1975.json exists it contains no entries for PUBLIC_ROOT/1975/foo/bar.png
- METADATA_FOLDER/public/1975/private_1975.json exists
- METADATA_FOLDER/public/1975/private_1975.json contains entries for PUBLIC_ROOT/1975/foo/bar.png and it is marked as "reviewed": true
example call: sort(PUBLIC_ROOT/1975/foo/bar.png, is_public=True)
assert:
- PUBLIC_ROOT/1975/foo/bar.png exists
- no files are moved to PRIVATE_ROOT/1975/foo/bar.png
- METADATA_FOLDER/public/1975/private_1975.json exists
- METADATA_FOLDER/public/1975/private_1975.json contains entries for PUBLIC_ROOT/1975/foo/bar.png and it is recorded as "reviewed": true
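
The file-system assertions for the single-file, is_public=False case could be turned into an executable check along these lines. `stand_in_sort` is a deliberately simplified, hypothetical substitute for the real sort() so the example runs on its own; the metadata assertions are left out because they depend on the metadata implementation.

```python
import shutil
import tempfile
from pathlib import Path


def stand_in_sort(path: Path, is_public: bool, roots: dict) -> None:
    """Simplified stand-in for sort(): handles one file under the public root."""
    if is_public:
        return  # already under the public root, so nothing moves
    relative = path.relative_to(roots["public"])
    destination = roots["private"] / relative
    backup = roots["safe_delete"] / "public" / relative
    for target in (destination, backup):
        target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(path, destination)
    shutil.move(str(path), str(backup))


def check_private_sort_of_single_file() -> None:
    """Build a throwaway tree, sort one file as private, and assert the outcome."""
    base = Path(tempfile.mkdtemp())
    roots = {
        name: base / f"test_{name}" for name in ("public", "private", "safe_delete")
    }
    source = roots["public"] / "1975" / "foo" / "bar.png"
    source.parent.mkdir(parents=True)
    source.write_bytes(b"image bytes")

    stand_in_sort(source, is_public=False, roots=roots)

    relative = Path("1975") / "foo" / "bar.png"
    assert not source.exists()
    assert (roots["private"] / relative).read_bytes() == b"image bytes"
    assert (roots["safe_delete"] / "public" / relative).exists()
```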