[Discussion] Is overwriting a big json file continuously okay?
pranitbauva1997 opened this issue · 2 comments
I am familiar with a repository named metakgp/naraad which overwrites a JSON file around 4-5 times a day. One can check the commit history for more details. This not only makes the JSON files big, it also piles up many large (non-binary) objects in the history, so the problem tends to go undetected.
Should such repositories be considered a problem? Personally, it's a big headache for me to clone and work on, and I get around it by cloning only the tip.
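For example, a shallow clone fetches just the most recent commit rather than the full history (the URL below is assumed from the repository name above):

```
# Fetch only the tip commit, skipping the heavy history
git clone --depth 1 https://github.com/metakgp/naraad.git
```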
This could also apply to package managers that write the current dependency state to a lock file. Best practice is to commit this file to git so that dependency installs are identical across environments, but in most cases the file's contents end up almost completely different whenever dependencies are updated.
> I am familiar with a repository named metakgp/naraad which overwrites a JSON file around 4-5 times a day. One can check the commit history for more details. This not only makes the JSON files big, it also piles up many large (non-binary) objects in the history, so the problem tends to go undetected.
For that repository, git-sizer does in fact report (with 7 asterisks) the large total expanded size of blobs. This is the consequence of modifying large files many times.
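If you want to reproduce that measurement yourself, a minimal sketch looks like this (assuming git-sizer is installed and that the repository lives at the GitHub URL implied by its name):

```
# Mirror-clone the repository and let git-sizer report all statistics
git clone --mirror https://github.com/metakgp/naraad.git
cd naraad.git
git-sizer --verbose
```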
I agree that this practice is not ideal, and it probably takes quite a long time to clone this repository or run git fsck on it, because every version of these files has to be expanded and its hash computed.
> Should such repositories be considered a problem? Personally, it's a big headache for me to clone and work on, and I get around it by cloning only the tip.
I don't want to be the arbiter of Git best practice. As long as a repository is not causing serious problems for GitHub's services, it is not my business to tell users what to do.
I suggest that you express your concerns to the repository owner. Even better would be if you figure out a better way to structure the JSON files and the code that uses them, and submit a pull request.
> This could also apply to package managers that write the current dependency state to a lock file. Best practice is to commit this file to git so that dependency installs are identical across environments, but in most cases the file's contents end up almost completely different whenever dependencies are updated.
It depends a lot on how big the lock files get and how often they are updated. It's not a problem to update small- or medium-sized files relatively often. It's only if the file gets quite large, or the updates are extremely frequent, that it becomes problematic.
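A quick, rough way to gauge both factors for a given repository (the lock file name below is only an example):

```
# How many commits, across all branches, touched the lock file
git log --all --oneline -- package-lock.json | wc -l

# Size in bytes of the lock file at the current tip
git cat-file -s HEAD:package-lock.json
```

Multiplying the two gives a rough estimate of how much expanded blob data that one file contributes to the history, assuming its size has stayed roughly constant.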