wmalik/ogit

Fetch repositories incrementally

wmalik opened this issue · 1 comments

Fetch the list from upstream at startup one page at a time, but update the UI incrementally as pages are being downloaded

Alternatively, or additionally, we could:

  • load the first page,
  • display it,
  • load the rest in the background,

so that we get the content for searching, but the UI gets populated faster.

Originally posted by @padawin in #6 (comment)

I gave this one a bit more thought, and I believe that the responsiveness of the
tool on startup (and some other issues) can be solved if the tool supports
offline usage. Details below.

Beware, wall of text :)

Problem 1

The tool is not usable without internet access.

  • I would find it useful to browse at least my cloned repositories, and cd
    into a project, even when I don't have internet access. Additionally, I
    would find it useful to browse repositories (name, description, and README)
    without internet access. The repositories of an organization change
    infrequently, especially the names, descriptions, and READMEs. The code
    changes quite often, but I believe it is worthwhile to have access to
    stale code (as compared to no code). Repository metadata and code are
    highly cacheable because they don't change often.

Problem 2

A high number of repositories slows down the user on each invocation of the
tool.

  • Currently, the tool fetches the entire list of repositories on startup.
    This can cause a delay of a few seconds for a high number of configured
    repos (we should allow users to add as many organizations as the API rate
    limits permit). Since the repository metadata seldom changes, it is
    inefficient to fetch the whole list every time and make the user wait for
    it. I see myself using this tool multiple times a day, and would find this
    behavior a bit inconvenient. This issue stems from the fact that the
    entire state of the application is kept in memory and has to be rebuilt on
    each invocation.

Proposal

I have the following proposal to address the above problems (and the parent issue):

  • Store the application state (i.e. the metadata of the organizations/repos)
    on disk using SQLite (or similar)
  • Decouple the UI from over-the-network API calls, and render the UI using
    the SQLite DB state only. The DB state consists of only the metadata (not
    the repository contents)
  • Rebuild the DB state in the following scenarios:
    • user presses 'r'
    • when the configured organization list differs from the list stored in
      the DB
    • when the DB state is empty
    • when the DB state is N minutes old

The above proposal addresses the responsiveness of the tool at startup, enables
offline usage, and opens possibilities for more features in the future:

  • sorting, filtering, grouping of the dataset is simpler to implement using SQL
  • a web based interface for the tool consuming the same state as the TUI
  • CLI tools that interact with all repos in the DB state in a non-TUI
    context, e.g.:
    • clone all
    • code analysis, contributor analysis, LICENSE analysis
    • arbitrary script execution on all cloned repos e.g.:
      • git fetch
      • git fetch && git log -5
      • git shortlog -sne
      • [ -f Makefile ] || echo "Missing Makefile"