ptsochantaris/trailer

[1.8.x] Issues with Github background sync

Jecoms opened this issue · 42 comments

On upgrade to 1.8/1.8.1, I reset/reloaded all data as the syncing seemed to be broken.

Initially, this seemed to update as expected and the menu bar icons have correct styling/count.

A few minutes later, the GitHub server label goes red and all menu bar icons switch to red Xs.

[Screenshot, 2023-07-06 at 10:37:08 AM]
[Screenshot, 2023-07-06 at 10:43:22 AM]

I am also having this issue since the upgrade. When I downgrade back to version 1.7.6 it's no longer an issue.

Hi, and thanks for reporting! This sounds to me like GH isn't liking the query style of this version on accounts with many items - can you try this build to see if it improves things for you?

Also, if possible, could you turn on logging from Misc and see if you can spot any issues via Console.app? If this is not a throttling issue there may be some other problem that the log could provide us with more info about.

[edit: removed, see below]

Thanks! Will be keeping an eye on this thread and try to fix this issue asap once we know more.

Whoops, scratch that, that build was totally broken, apologies - please give this a try instead:

[Edit: removed - version now updated]

I think mine is working again on this latest build 👍

That's great, thanks for the feedback, if things still seem fine by this evening I'll do an update.

I have put up an update with this tweak now. I shortly expect to add an option to allow parallel v4 API requests (default off) so that, for users like me where this doesn't cause GH to throttle, we can turn it on. Sorry for the hiccup, I'll leave this open for a little while in case there are further issues.

Jecoms commented

After upgrading I am seeing the same behavior. I turned on logging, but I'm not seeing any specific GitHub/GQL errors related to the data sync other than a log line saying it failed. Logs included at the bottom.

I do have a lot of repos filtered by participation, so I went ahead and hid a bunch of them to get under 20 total (I started with 40-50). It seems related to the GQL scope, and maybe it's an n+1 query situation to get PR-related data like assignees (there was a large block of logs saying "needs 1 more query").

This reduced set of repos works for my needs, so I'll stay on the latest version. I'm happy to retry with the larger set again with any new tweaks.

Appreciate the renewed time and attention to this project!

log examples:

default	10:00:14.462882-0500	Trailer	Will pause and retry call to https://api.github.com/graphql
default	10:00:34.797739-0500	Trailer	Failed call to https://api.github.com/graphql
default	10:00:34.804768-0500	Trailer	Status update: Processing update…
default	10:00:34.805377-0500	Trailer	Rolling back changes for failed sync on API server 'Github'
default	10:00:34.808521-0500	Trailer	Nuked total 0 items marked for deletion
default	10:00:34.809954-0500	Trailer	Committing synced data
default	10:00:34.810178-0500	Trailer	Synced data committed
default	10:00:34.810239-0500	Trailer	No DB changes
default	10:00:34.810874-0500	Trailer	Postprocess done - 0.0006099939346313477 sec
default	10:00:34.813730-0500	Trailer	No DB changes
default	10:00:34.813801-0500	Trailer	Refresh done
default	10:00:34.814079-0500	Trailer	Status update: Last update failed
default	10:00:34.876756-0500	Trailer	order window: ab2a op: 0 relative: 0 related: 0
default	10:00:34.876954-0500	Trailer	order window: ab2b op: 0 relative: 0 related: 0
default	10:00:34.877096-0500	Trailer	order window: ab2c op: 0 relative: 0 related: 0
default	10:00:34.899146-0500	Trailer	Updating general PullRequest menu, X total items

I only have 5 watched repositories. But I'm seeing the same issue & logs as @Jecoms is. Running 1.8.2 (1652).

Thanks for the update @Jecoms and @amayers - the "Will pause and retry" message does seem to imply a rate issue.

"1 more query" is fine, as basically it means that after processing some data, that data is signalling that more paging is needed.

If you turn on "Dump API responses to the console" in Misc, you may be able to track which request is failing and what error the server is returning, which may give us more info.
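To illustrate why "1 more query" is expected rather than an error, here is a minimal sketch of cursor-based paging over a connection shaped like the ones in the logged queries (`edges { node cursor } pageInfo { hasNextPage }`). It is written in Python purely for illustration and is not Trailer's code; `fetch_all` and `make_fake_connection` are hypothetical names, and the fake connection stands in for a real GraphQL call.

```python
from typing import Callable, List, Optional, Tuple


def fetch_all(fetch_page: Callable[[Optional[str]], dict]) -> Tuple[List[object], int]:
    """Follow cursors until hasNextPage is false; return (nodes, query count)."""
    nodes: List[object] = []
    cursor: Optional[str] = None
    queries = 0
    while True:
        page = fetch_page(cursor)
        queries += 1
        edges = page["edges"]
        nodes.extend(edge["node"] for edge in edges)
        # Stop when the server says there is nothing left (or sent an empty page).
        if not edges or not page["pageInfo"]["hasNextPage"]:
            return nodes, queries
        # Otherwise "1 more query" is needed, starting after the last cursor seen.
        cursor = edges[-1]["cursor"]


def make_fake_connection(items: List[object], page_size: int) -> Callable[[Optional[str]], dict]:
    """Hypothetical in-memory stand-in for a paged GraphQL connection."""
    def fetch_page(after: Optional[str]) -> dict:
        start = 0 if after is None else int(after) + 1
        chunk = items[start:start + page_size]
        return {
            "edges": [{"node": item, "cursor": str(start + i)} for i, item in enumerate(chunk)],
            "pageInfo": {"hasNextPage": start + page_size < len(items)},
        }
    return fetch_page


if __name__ == "__main__":
    print(fetch_all(make_fake_connection(list("abcde"), page_size=2)))
```

With five items and a page size of two, the loop issues three queries: the first two each report that more paging is needed, and the third ends the walk.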

@ptsochantaris I don't see any error messages from GitHub. In the servers tab my API usage is so low that it doesn't even show up on the bar. I don't have any other apps that are using the API, so I shouldn't be hitting the rate limit.

default	12:05:48.798306-0400	Trailer	Will sync items from: <redacted>, <redacted>, <redacted>, <redacted>, <redacted>
default	12:05:48.801183-0400	Trailer	order window: 23d5 op: 0 relative: 0 related: 0
default	12:05:48.805593-0400	Trailer	Updating general PullRequest menu, X total items
default	12:05:48.807607-0400	Trailer	order window: 23ea op: 1 relative: 23ea related: 0
default	12:05:48.813618-0400	Trailer	(TQL 'GitHub: Open PRs') Fetching: fragment milestoneFragment on Milestone { __typename title } fragment repositoryFragment on Repository { __typename id pullRequests(first: 50, states: [OPEN]) { edges { node { __typename ... pullrequestFragment } cursor } pageInfo { hasNextPage } } } fragment labelFragment on Label { __typename id name color createdAt updatedAt } fragment pullrequestFragment on PullRequest { __typename id bodyText state createdAt updatedAt number title url milestone { __typename ... milestoneFragment } author { __typename ... userFragment ... botFragment } assignees(first: 20) { edges { node { __typename ... userFragment } cursor } pageInfo { hasNextPage } } labels(first: 20) { edges { node { __typename ... labelFragment } cursor } pageInfo { hasNextPage } } headRefOid mergeable additions deletions headRefName baseRefName isDraft mergedBy { __typename ... userFragment } baseRepository { __typename nameWithOwner } headRepository { __typename nameWithOwner } } fragment userFragment on<…>
default	12:05:48.813788-0400	Trailer	(TQL 'GitHub: Open Issues') Fetching: fragment milestoneFragment on Milestone { __typename title } fragment botFragment on Bot { __typename id login avatarUrl } fragment issueFragment on Issue { __typename id bodyText state createdAt updatedAt number title url milestone { __typename ... milestoneFragment } author { __typename ... userFragment ... botFragment } assignees(first: 20) { edges { node { __typename ... userFragment } cursor } pageInfo { hasNextPage } } labels(first: 20) { edges { node { __typename ... labelFragment } cursor } pageInfo { hasNextPage } } } fragment labelFragment on Label { __typename id name color createdAt updatedAt } fragment repositoryFragment on Repository { __typename id issues(first: 50, states: [OPEN]) { edges { node { __typename ... issueFragment } cursor } pageInfo { hasNextPage } } } fragment userFragment on User { __typename id login avatarUrl } { nodes(ids: ["MDEwOlJlcG9zaXRvcnkzMTg4NDE1ODQ=","MDEwOlJlcG9zaXRvcnkzMjAxNjQ1NTk=","MDEwOlJlcG9zaXRvcnkzMDg3MDM1Mzk=","M<…>
default	12:05:48.813936-0400	Trailer	(TQL 'Authored Items') Fetching: fragment milestoneFragment on Milestone { __typename title } fragment botFragment on Bot { __typename id login avatarUrl } fragment pullrequestFragment on PullRequest { __typename id bodyText state createdAt updatedAt number title url milestone { __typename ... milestoneFragment } author { __typename ... userFragment ... botFragment } assignees(first: 20) { edges { node { __typename ... userFragment } cursor } pageInfo { hasNextPage } } labels(first: 20) { edges { node { __typename ... labelFragment } cursor } pageInfo { hasNextPage } } headRefOid mergeable additions deletions headRefName baseRefName isDraft mergedBy { __typename ... userFragment } baseRepository { __typename nameWithOwner } headRepository { __typename nameWithOwner } repository { __typename ... repositoryFragment } } fragment labelFragment on Label { __typename id name color createdAt updatedAt } fragment repositoryFragment on Repository { __typename id createdAt updatedAt isFork isArchived nameWithO<…>
default	12:05:48.814378-0400	Trailer	(TQL 'Authored Items') Fetching: fragment milestoneFragment on Milestone { __typename title } fragment issueFragment on Issue { __typename id bodyText state createdAt updatedAt number title url milestone { __typename ... milestoneFragment } author { __typename ... userFragment ... botFragment } assignees(first: 20) { edges { node { __typename ... userFragment } cursor } pageInfo { hasNextPage } } labels(first: 20) { edges { node { __typename ... labelFragment } cursor } pageInfo { hasNextPage } } repository { __typename ... repositoryFragment } } fragment botFragment on Bot { __typename id login avatarUrl } fragment repositoryFragment on Repository { __typename id createdAt updatedAt isFork isArchived nameWithOwner url isPrivate owner { __typename id } } fragment userFragment on User { __typename id login avatarUrl } fragment labelFragment on Label { __typename id name color createdAt updatedAt } { viewer { __typename issues(first: 100, states: [OPEN]) { edges { node { __typename ... issueFragment } <…>
default	12:05:48.814580-0400	Trailer	Status update: GitHub: Open PRs
default	12:05:50.074615-0400	Trailer	order window: 22b7 op: 0 relative: 0 related: 0
default	12:05:59.932531-0400	Trailer	Will pause and retry call to https://api.github.com/graphql
default	12:06:15.706038-0400	Trailer	Will pause and retry call to https://api.github.com/graphql
default	12:06:22.025238-0400	Trailer	Setting LAST_PREFS_TAB_SELECTED_OSX to 0
default	12:06:26.627696-0400	Trailer	Setting LAST_PREFS_TAB_SELECTED_OSX to 11
default	12:06:29.453216-0400	Trailer	Setting LAST_PREFS_TAB_SELECTED_OSX to 0
default	12:06:31.940960-0400	Trailer	Will pause and retry call to https://api.github.com/graphql
default	12:06:32.358858-0400	Trailer	order window front conditionally: 22c1 related: 0
default	12:06:35.945935-0400	Trailer	order window: 22c1 op: 0 relative: 0 related: 0
default	12:06:49.423100-0400	Trailer	Will pause and retry call to https://api.github.com/graphql
default	12:06:59.815551-0400	Trailer	Setting LAST_PREFS_TAB_SELECTED_OSX to 1
default	12:07:01.676331-0400	Trailer	Setting NEW_REPO_CHECK_PERIOD to 2.0
default	12:07:01.692583-0400	Trailer	order window front conditionally: 23f0 related: 0
default	12:07:04.500174-0400	Trailer	order window front conditionally: 22c1 related: 0
default	12:07:05.236879-0400	Trailer	Failed call to https://api.github.com/graphql
default	12:07:05.237083-0400	Trailer	Status update: GitHub: Open Issues
default	12:07:06.696990-0400	Trailer	order window: 22c1 op: 0 relative: 0 related: 0
default	12:07:06.825124-0400	Trailer	API data from https://api.github.com/graphql: ByteBuffer { readerIndex: 0, writerIndex: 46324, readableBytes: 46324, capacity: 65536, storageCapacity: 65536, slice: _ByteBufferSlice { 0..<65536 }, storage: 0x0000000158008000 (65536 bytes) }
default	12:07:06.830311-0400	Trailer	(TQL 'GitHub: Open Issues') Received page (Cost: 1, Remaining: 4659/5000 - Expected Count: 10255 - Returned Count: 10255)
default	12:07:06.830383-0400	Trailer	(TQL 'GitHub: Open Issues') Scanning result
default	12:07:06.830545-0400	Trailer	(TQL 'GitHub: Open Issues') Scanning result
default	12:07:06.830636-0400	Trailer	Status update: Authored Items
default	12:07:06.831142-0400	Trailer	(TQL 'GitHub: Open Issues') Parsed all pages
default	12:07:06.831259-0400	Trailer	(TQL 'GitHub: Open Issues') Parsed all pages
default	12:07:06.831656-0400	Trailer	Processing GQL nodes: Label: 20, Issue: 7, User: 9, Repository: 5
default	12:07:06.834255-0400	Trailer	Creating Issue ID: I_kwDOExVSz85UrDoI (v4)
default	12:07:06.834950-0400	Trailer	Creating Issue ID: I_kwDOEmZxM848sUfr (v4)
default	12:07:06.835493-0400	Trailer	Creating Issue ID: I_kwDOEmZxM849uBxS (v4)
default	12:07:06.835660-0400	Trailer	Creating Issue ID: I_kwDOEmZxM85hOq3X (v4)
default	12:07:06.835819-0400	Trailer	Creating Issue ID: I_kwDOEn8W285RZilp (v4)
default	12:07:06.836303-0400	Trailer	Creating Issue ID: I_kwDOEn8W285hO4fC (v4)
default	12:07:06.836472-0400	Trailer	Creating Issue ID: I_kwDOGVBfCM5RUV7s (v4)
default	12:07:06.837371-0400	Trailer	Creating PRLabel ID: LA_kwDOExVSz88AAAABGPqFBA (v4)
default	12:07:06.837584-0400	Trailer	Creating PRLabel ID: LA_kwDOExVSz88AAAABGPqFCA (v4)
default	12:07:06.837735-0400	Trailer	Creating PRLabel ID: LA_kwDOExVSz88AAAABGPqFDA (v4)
default	12:07:06.837879-0400	Trailer	Creating PRLabel ID: LA_kwDOExVSz88AAAABGQfioQ (v4)
default	12:07:06.838022-0400	Trailer	Creating PRLabel ID: LA_kwDOEmZxM88AAAABDeXTzA (v4)
default	12:07:06.838160-0400	Trailer	Creating PRLabel ID: LA_kwDOEmZxM88AAAABDeXT1g (v4)
default	12:07:06.838295-0400	Trailer	Creating PRLabel ID: LA_kwDOEmZxM88AAAABDeXT1w (v4)
default	12:07:06.838433-0400	Trailer	Creating PRLabel ID: LA_kwDOEmZxM88AAAABDejd5w (v4)
default	12:07:06.838565-0400	Trailer	Creating PRLabel ID: LA_kwDOEn8W288AAAABDNPnGA (v4)
default	12:07:06.838700-0400	Trailer	Creating PRLabel ID: LA_kwDOEn8W288AAAABDNPnGQ (v4)
default	12:07:06.838831-0400	Trailer	Creating PRLabel ID: LA_kwDOEn8W288AAAABDNPnGg (v4)
default	12:07:06.838973-0400	Trailer	Creating PRLabel ID: LA_kwDOEn8W288AAAABDNd_4A (v4)
default	12:07:06.839294-0400	Trailer	Creating PRLabel ID: LA_kwDOGVBfCM8AAAABDJlTSA (v4)
default	12:07:06.839445-0400	Trailer	Creating PRLabel ID: LA_kwDOGVBfCM8AAAABDJlTSw (v4)
default	12:07:06.839586-0400	Trailer	Creating PRLabel ID: LA_kwDOGVBfCM8AAAABDJlTVA (v4)
default	12:07:06.839711-0400	Trailer	Creating PRLabel ID: LA_kwDOGVBfCM8AAAABDJ27gQ (v4)
default	12:07:07.779143-0400	Trailer	API data from https://api.github.com/graphql: ByteBuffer { readerIndex: 0, writerIndex: 5870, readableBytes: 5870, capacity: 16384, storageCapacity: 16384, slice: _ByteBufferSlice { 0..<16384 }, storage: 0x000000012e8c6600 (16384 bytes) }
default	12:07:07.780575-0400	Trailer	(TQL 'Authored Items') Received page (Cost: 2, Remaining: 4657/5000 - Expected Count: 4100 - Returned Count: 4100)
default	12:07:07.780714-0400	Trailer	(TQL 'Authored Items') Scanning result
default	12:07:07.780988-0400	Trailer	(TQL 'Authored Items') Scanning result
default	12:07:07.781035-0400	Trailer	Status update: Authored Items
default	12:07:07.781402-0400	Trailer	(TQL 'Authored Items') Parsed all pages
default	12:07:07.781591-0400	Trailer	(TQL 'Authored Items') Parsed all pages
default	12:07:07.782053-0400	Trailer	Processing GQL nodes: User: 6, Label: 1, PullRequest: 3, Repository: 3, Organization: 3
default	12:07:07.786207-0400	Trailer	Creating PullRequest ID: PR_kwDOEwEi8M5VGI4G (v4)
default	12:07:07.786972-0400	Trailer	Creating PullRequest ID: PR_kwDOEwEi8M5VHYnd (v4)
default	12:07:07.787200-0400	Trailer	Creating PullRequest ID: PR_kwDOEwEi8M5VM_CH (v4)
default	12:07:07.787904-0400	Trailer	Creating PRLabel ID: LA_kwDOEwEi8M8AAAABRb5t-Q (v4)
default	12:07:08.246576-0400	Trailer	API data from https://api.github.com/graphql: ByteBuffer { readerIndex: 0, writerIndex: 199, readableBytes: 199, capacity: 16384, storageCapacity: 16384, slice: _ByteBufferSlice { 0..<16384 }, storage: 0x000000012f033000 (16384 bytes) }
default	12:07:08.246987-0400	Trailer	(TQL 'Authored Items') Received page (Cost: 2, Remaining: 4655/5000 - Expected Count: 4100 - Returned Count: 4100)
default	12:07:08.247056-0400	Trailer	(TQL 'Authored Items') Scanning result
default	12:07:08.247189-0400	Trailer	(TQL 'Authored Items') Scanning result
default	12:07:08.247359-0400	Trailer	(TQL 'Authored Items') Parsed all pages
default	12:07:08.247442-0400	Trailer	(TQL 'Authored Items') Parsed all pages
default	12:07:08.247692-0400	Trailer	Processing GQL nodes:
default	12:07:08.249692-0400	Trailer	Status update: Processing 33 items…
default	12:07:08.249995-0400	Trailer	Rolling back changes for failed sync on API server 'GitHub'
default	12:07:08.251969-0400	Trailer	Nuked total 0 items marked for deletion
default	12:07:08.252432-0400	Trailer	Committing synced data
default	12:07:08.252526-0400	Trailer	Synced data committed
default	12:07:08.252558-0400	Trailer	Saving DB
default	12:07:08.254170-0400	Trailer	Postprocess done - 0.0013219118118286133 sec
default	12:07:08.255312-0400	Trailer	No DB changes
default	12:07:08.255336-0400	Trailer	Refresh done
default	12:07:08.255429-0400	Trailer	Status update: Last update failed
default	12:07:08.269026-0400	Trailer	order window: 23ea op: 0 relative: 0 related: 0
default	12:07:08.271878-0400	Trailer	Updating general PullRequest menu, X total items

Indeed, definitely doesn't look like an API throttle issue @amayers - this looks way more like a GraphQL query problem. Two of the queries are failing there. It's odd that your log isn't showing the API error coming from the server. If you turn off Authored Items sync does the issue go away?

Also, are any of the repos you're following public? I'd love to try and reproduce the issue locally.

@ptsochantaris I tried turning off the authored items sync, but that didn't fix it. I also just tried downgrading to v1.7.6, and the console logs have more details for the API responses. No, none of these repos are public. Also of possible note, I'm using a fine-grained personal access token (my organization now requires it). However, Trailer did work with this token as of a week or two ago.

default	12:37:03.278708-0400	Trailer	Status update: GitHub: Open PRs
default	12:37:03.279405-0400	Trailer	Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> resuming, timeouts(60.0, 604800.0) QOS(0x9) Voucher (null)
default	12:37:03.281167-0400	Trailer	[Telemetry]: Activity <nw_activity 12:2[A7B09696-3341-4ABD-B614-549130F22252] (reporting strategy default)> on Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> was not selected for reporting
default	12:37:03.281853-0400	Trailer	Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> {strength 0, tls 4, sub 0, sig 1, ciphers 0, bundle 0, builtin 0}
default	12:37:03.282088-0400	Trailer	[C2] event: client:connection_reused @79.669s
default	12:37:03.283298-0400	Trailer	Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> now using Connection 2
default	12:37:03.283874-0400	Trailer	Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> sent request, body S 1276
default	12:37:06.721988-0400	Trailer	[C2] event: client:data_stall @83.109s
default	12:37:13.751696-0400	Trailer	Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> received response, status 502 content U
default	12:37:13.752280-0400	Trailer	Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> done using Connection 2
default	12:37:13.752438-0400	Trailer	[C2] event: client:connection_idle @90.139s
default	12:37:13.752876-0400	Trailer	Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> response ended
default	12:37:13.753189-0400	Trailer	Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> summary for task success {transaction_duration_ms=10471, response_status=502, connection=2, reused=1, request_start_ms=1, request_duration_ms=0, response_start_ms=10470, response_duration_ms=1, request_bytes=1398, response_bytes=707, cache_hit=true}
default	12:37:13.755007-0400	Trailer	Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> finished successfully
default	12:37:13.755760-0400	Trailer	API data from https://api.github.com/graphql: {
   "data": null,
   "errors":[
      {
         "message":"Something went wrong while executing your query. This may be the result of a timeout, or it could be a GitHub bug. Please include `CF2E:728F:1A97DE8:360CCCE:64AD8520` when reporting this issue."
      }
   ]
}
default	12:37:13.756566-0400	Trailer	(GQL 'GitHub: Open PRs') Received page (No stats)
default	12:37:13.757046-0400	Trailer	(GQL 'GitHub: Open PRs')  Error: Failed with error: 'Something went wrong while executing your query. This may be the result of a timeout, or it could be a GitHub bug. Please include `CF2E:728F:1A97DE8:360CCCE:64AD8520` when reporting this issue.'
default	12:37:13.757168-0400	Trailer	(GQL 'GitHub: Open PRs')  Pausing for retry, attempt 4

Oh, I know why you can't see error messages in the log - there's a clear bug there. This build will output errors correctly, so maybe that can give us some insight. BTW, thanks for your patience everyone, we'll get it sorted.

Trailer.app.zip

Reminder to self: Implement a user-visible sync log already!!!

@amayers The error that you see there is a clear throttling error from GitHub. Even though Trailer does query things inside the boundaries of the API limits, it seems that some queries just "overflow" internally. That's why Trailer has a retry mechanism (you can see the message for that at the end of your log). One of the best ways to calm GH is to just not run Trailer for a few minutes. I'm going to try heavy-handedly reducing query sizes in a build and seeing if that improves things.

(BTW the fine grained token you mention shouldn't make a difference)
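As a rough picture of the pause-and-retry behaviour mentioned above, here is a minimal Python sketch. It is not Trailer's implementation: the exponential delay schedule, the attempt limit, and the names `TransientError` and `call_with_retry` are all assumptions made for illustration. Which responses count as transient (for example the 502 seen in the log) is left to the caller.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a 502 or a timeout)."""


def call_with_retry(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, pausing and retrying on TransientError.

    Delays double on each retry (base_delay, 2x, 4x, ...). This schedule is an
    assumption for the sketch, not something taken from Trailer.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts:
                raise  # Out of attempts: let the caller mark the sync as failed.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a caller would normally leave it at the default.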

Using that build I do see more response details. But I'm not really seeing any additional details on the error itself.
I'll try dialing back some of Trailer's settings so hopefully it makes fewer requests, and asks for fewer items in each.

On a weirdly good side, this doesn't look like an issue with the rate per se, more like the specific query, so I can at least put back the multithreaded querying. But of course that leaves us with the mystery about why this query is failing. This build here cuts the PR batch by half for the Open PRs query (from 50 to 25 per page) - let's see if it helps.

Trailer.app.zip

(BTW If anyone wants to forward any queries or API results which may help but don't want to make them public, you can always reach me at my email which is my GH handle at me.com)

That build doesn't fix my issue (app still shows X). But I do see a lot more successful responses in the logs.
I filed a ticket with GitHub quoting the error: "Something went wrong while executing your query. This may be the result of a timeout, or it could be a GitHub bug. Please include CF2E:728F:1A97DE8:360CCCE:64AD8520 when reporting this issue." Hopefully they can identify what part of the request is causing the issue.

Nice, thanks for trying that. I'm just cooking up a build in which you can configure the page sizes manually so perhaps you can experiment. Will post it here shortly.

@amayers This build has a ... button next to the v4 API checkbox in preferences. From there you can change the page size of queries. The defaults are quite conservative, and I've made some other bits of those queries lighter as well, but I'd be interested to see what kind of results you get. Thanks so much for helping with this!

[Edit: Have put this up as an update to play it safe, but please do give it a try when you find the time to see if it helps you]

@ptsochantaris Around 5 PRs/page it started to work with an occasional timeout/retry. So I lowered it to 4 for now and that seems good. I then turned back on most of the other details in the requests (reactions, merge conflict, line counts) and it seems to be working with all of that. The issues/page doesn't matter as our org doesn't use issues so they are disabled on all these repos.

Thank you so much for this work! I don't see any way to send you some money. Would you like to share a Cashtag, Venmo, or other way to send you something for your work?

Wow, 4 is extremely low - I mean don't get me wrong, I'm super glad you're unblocked but someone with 5 repos should not even come close to causing an API timeout on GH, even if paging was close to 100. It must be some strange corner case which I'll try to keep in mind when going through the code. If you do think you're able to share any of the queries Trailer sends to my email (I totally appreciate this may not be possible BTW) then it may go a long way in helping me try to diagnose what kind of weird corner case is at work here :D

BTW statuses have been known to cause horrors in the past, so you may want to try disabling those to see if that somehow "unlocks" your page limit, although I realise Trailer is supposed to help with your work, not become your work :D So no pressure.

On the issue of money I've always maintained that I get as much from working on Trailer as I put into it and that's been good enough, but times are getting a bit tougher and at some point I will put up a sponsorship link to see if my hobbies can earn some pocket money, but that's a possibility in the future. I definitely thank you for your very kind offer though. If Trailer keeps making you happy in the future, feel free to pass by the repo and see if there's a monetisation link :D

@ptsochantaris I'm happy to share the request/response details via email. What's your email?

Normally I do have the Show PR CI / statuses enabled. But I disabled it while doing all this debugging, and so far haven't turned it back on. So that doesn't seem to be the limiting factor in this case.

Ah, that's good to know. Which makes things weirder and more interesting. My email is my GitHub handle (minus the leading "at" of course) at me.com - any and all info you are comfortable with sending over will be super helpful!

Here is an updated version which adds additional safeguards to batched GQL calls - it seems that even though Trailer respects GitHub's 500,000 node rule, anything above 40-50k nodes causes a timeout. This build enforces that limit which should mean there's no need to artificially limit item paging sizes or multithreaded queries.

[edit: removed, updated version below]
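To make the node arithmetic concrete, here is a small Python sketch of one way to estimate a batched query's node count and split repository IDs into batches under a budget. It is an illustration under stated assumptions, not the code in the build: the function names are hypothetical, the 40,000 budget is simply the low end of the "40-50k" figure above, and the page sizes (50 items, 20 assignees, 20 labels) are taken from the queries in the earlier log.

```python
from typing import List, Sequence


def estimate_nodes(roots: int, page_size: int, nested_sizes: Sequence[int]) -> int:
    """Estimate nodes requested: roots + items + every nested connection per item."""
    items = roots * page_size
    return roots + items + sum(items * size for size in nested_sizes)


def max_roots_per_batch(budget: int, page_size: int, nested_sizes: Sequence[int]) -> int:
    """Largest number of root objects (e.g. repos) whose estimate fits the budget."""
    per_root = estimate_nodes(1, page_size, nested_sizes)
    return max(1, budget // per_root)


def batch_ids(ids: List[str], budget: int, page_size: int, nested_sizes: Sequence[int]) -> List[List[str]]:
    """Split IDs into consecutive batches that each stay within the node budget."""
    size = max_roots_per_batch(budget, page_size, nested_sizes)
    return [ids[i:i + size] for i in range(0, len(ids), size)]


if __name__ == "__main__":
    # 5 repos, 50 issues each, 20 assignees + 20 labels per issue.
    print(estimate_nodes(5, 50, [20, 20]))
    # Assumed budget of 40,000 nodes per query.
    print(max_roots_per_batch(40_000, 50, [20, 20]))
```

For 5 repositories this estimate gives 5 + 250 + 10,000 = 10,255, which agrees with the "Expected Count: 10255" line in the log posted earlier. Under the assumed 40,000 budget, the same page sizes allow 19 repositories per query.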

Another iteration - also applies paging restrictions to things like review comments and reactions. Many thanks @amayers for helping with the testing!

[edit: removed, new build available below]

A specifically tweaked build that records response times in queries for testing - please note this build will not respect any settings from the v4 API paging settings in prefs.

[edit: removed, new build available below]

This build replaces the v4 sync settings with 3 presets: Safe / Default / High - Default being the same as the previous build, Safe is considerably lighter and worth trying if v4 API times out, while High is worth trying to batch up queries to a large extent to reduce API usage cost.

[edit: removed, new build available below]

Jecoms commented

I tried that latest build yesterday and I was having success when using the Safe mode with all of my repos marked as participating (50+).

Away from my work computer currently, but I'll confirm next week that it's performing as expected for a number of syncs.

Thanks for testing @Jecoms - sounds like "Safe" may have to be the actual default on the next release. Let me know how it goes 👍

Jecoms commented

@ptsochantaris It's looking good to me. API Limit remains a green sliver while using Safe mode. Thanks for the iteration to a solution for this!

I'll try leaving it on Default for a bit as well. Default seems to be working okay too. Maybe GitHub made some improvements to their GQL endpoint and we're not hitting the timeouts anymore (if Default is querying the same as before).

Also now seems okay with High. I'll leave it here for the rest of the day and update if I hit any issues.

@Jecoms Thanks so much for testing, I really appreciate it! I suspect that if you selected "reload all data" from Preferences -> Misc while "High" is selected then it will fail, as that sync transfers a much larger amount of data, but incremental syncs after that should very likely be fine on "High". If things continue to look fine I'll put up an "official" update a little later today or tomorrow 👍

👋 I have this sync problem as well. I've been following along and trying every new build. This latest one in safe mode is now working for me!

One change that I believe helped was removing all custom-watched repositories and adding back only the minimum I needed to get by at work. I now have 20 repositories in total.

Thanks for the update @wassimk - it's very helpful; clearly GH's GraphQL is very sensitive when querying a large number of repos in one query - I'll be sure to keep the safe setting as a default then, and maybe even consider a lighter one too.

Here's another update - this one features an even lighter mode if needed, as well as improved handling of weird burps that GH has been having where it returns zero-byte responses.

[edit: removed, new build available below]

... and another one that introduces better logging in case of invalid JSON coming from GH

[edit: removed, new build available below]
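For readers wondering what handling zero-byte responses and logging invalid JSON might look like, here is a minimal Python sketch. It is not the code in these builds; `BadResponse` and `decode_graphql` are hypothetical names. The error shape (`"data": null` plus an `"errors"` list with a `"message"`) matches the response pasted earlier in this thread. Treating any `errors` entry as a failure is a simplification chosen for the sketch.

```python
import json
from typing import Any


class BadResponse(Exception):
    """Hypothetical error type carrying a message suitable for a sync log."""


def decode_graphql(body: bytes) -> Any:
    """Decode a GraphQL HTTP body, failing loudly on empty, invalid or error responses."""
    if not body:
        raise BadResponse("empty (zero-byte) response")
    try:
        payload = json.loads(body)
    except ValueError as exc:  # Covers invalid JSON and undecodable bytes.
        # Keep a short preview so a truncated or non-JSON body is visible in logs.
        preview = body[:200].decode("utf-8", errors="replace")
        raise BadResponse(f"invalid JSON ({exc}); body starts with: {preview!r}") from exc
    if not isinstance(payload, dict):
        raise BadResponse(f"unexpected top-level JSON type: {type(payload).__name__}")
    errors = payload.get("errors")
    if errors:
        messages = "; ".join(
            str(err.get("message", err)) if isinstance(err, dict) else str(err)
            for err in errors
        )
        raise BadResponse(f"server reported errors: {messages}")
    return payload.get("data")
```

A caller could catch `BadResponse`, write its message to the log, and decide whether the failure is worth a pause and retry.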

... and a fix that tries to recover from occasional GH timeouts

[edit: removed, new build available now]

... and one that dumps the new network library used in 1.8.x, as I've seen logs that have truncated responses

[edit: removed, new build available now]

This build adds a simpler log viewer in Prefs -> Misc which displays the current activity in Trailer, which can be helpful when reporting syncing issues or even just browsing the API queries and responses of syncs.

Trailer-184-test10.zip

Ok, that last build seems to be working for me now. It looks like the automatic pause & retry of any 403 responses has caused it to eventually succeed. I see around 5 of the 403s back to back, before it succeeds. Thanks for all the work on this!

@amayers That's great news - I wish GitHub had a way to proactively tell clients if/when they are close to some rate limit instead of having to wait for a 403; it would really help tune all this without the guesswork. I really appreciate your help with this, and I'll get onto preparing an update with some cleanups so that other people with a similar issue can use the v4 API as well.

v1.8.4 is now up, and contains all the updates from this thread, as well as a stability fix. You can update from any build by selecting "check for updates" from the prefs window, or just wait for Trailer to ping you :) Thank you to everyone who tested or who gave feedback. I'm going to close this issue now; please feel free to open a new one if any issues persist.