Collection of scripts to export GitHub data into CSV
- Repos in an Organization
- PRs and PR Comments for a given Repo
- Issues and Issues Comments for a given Repo
- Contributors for a given Repo
- Events for a given Repo
Note: the provided PR, PR Comments, Issues, and Issue Comments exports include only basic data. You can alter the field names to include any additional data needed for your use case.
Export your GitHub token as an environment variable:
export GITHUB_ACCESS_TOKEN="your-token-here"
Each of the scripts below uses paging to export all of the data from the inception of that repo. Depending on the size of your repo, that may take a while. You can experiment with the starting page number in each script to start with more recent records.
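The paging behaviour described above can be sketched roughly as follows. This is an illustrative Python sketch, not the code used by the scripts in bin/: the names `fetch_page`, `iter_items`, and `start_page` are hypothetical, and it assumes the endpoint returns a JSON array and an empty array once the pages run out. `per_page` and `page` are the standard GitHub REST query parameters.

```python
import json
import os
import urllib.request

API = "https://api.github.com"


def fetch_page(url):
    """Fetch one page of JSON results, authenticating with the exported token."""
    req = urllib.request.Request(url, headers={
        "Authorization": f"Bearer {os.environ['GITHUB_ACCESS_TOKEN']}",
        "Accept": "application/vnd.github+json",
    })
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_items(path, start_page=1, per_page=100, fetch=fetch_page):
    """Yield every item from a paged endpoint, beginning at start_page.

    `fetch` is injectable so the loop can be exercised without network access.
    """
    page = start_page
    while True:
        sep = "&" if "?" in path else "?"
        items = fetch(f"{API}{path}{sep}per_page={per_page}&page={page}")
        if not items:  # assumed: an empty page means there is no more data
            return
        yield from items
        page += 1
```

For example, `iter_items("/repos/org_name/repo_name/pulls?state=all", start_page=5)` would skip the first four pages, which is the kind of adjustment the starting page number allows.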
bin/repo org_name
https://docs.github.com/en/rest/repos/repos
Outputs CSV file (repo-org_name.csv) containing:
- repo - name of the repo
- org - organization name
- url - fully qualified URL of the repo in GitHub
- home - homepage from repo settings
- lang - language from repo settings
- license - the type of license used in the repo (e.g. apache-2.0)
- update - last update date (yyyy-MM-dd)
- desc - repo description from settings
bin/pr [org_name] [repo_name]
https://docs.github.com/en/rest/pulls/pulls
Outputs CSV file (pr-org_name-repo-name.csv) containing:
- id - numeric identifier for that PR
- number - sequential number of that PR
- state - current state of that PR (open, closed)
- repo - name of the repo
- org - organization name
- user - username who submitted the PR
- created - ISO timestamp of when the PR was created
- updated - ISO timestamp of when the PR was updated
- merged - ISO timestamp of when the PR was merged
- closed - ISO timestamp of when the PR was closed
- title - PR title at that time
- labels - comma-separated list of labels
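As one way to use the PR export, the sketch below computes the hours from creation to merge for each merged PR. It assumes the CSV has a header row whose column names match the field names listed above, and that unmerged PRs have an empty `merged` value. Neither assumption is confirmed here, so check them against your actual file. The function names are hypothetical.

```python
import csv
import io
from datetime import datetime


def parse_ts(value):
    """Parse an ISO 8601 timestamp such as 2023-01-02T03:04:05Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def hours_to_merge(csv_file):
    """Map PR number -> hours between creation and merge, skipping unmerged PRs."""
    result = {}
    for row in csv.DictReader(csv_file):
        if row["merged"]:  # assumed: empty string when the PR was never merged
            delta = parse_ts(row["merged"]) - parse_ts(row["created"])
            result[int(row["number"])] = delta.total_seconds() / 3600
    return result


# Made-up sample rows standing in for pr-org_name-repo-name.csv
sample = io.StringIO(
    "number,state,created,merged\n"
    "1,closed,2023-01-01T00:00:00Z,2023-01-01T12:00:00Z\n"
    "2,open,2023-01-02T00:00:00Z,\n"
)
print(hours_to_merge(sample))  # {1: 12.0}
```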
The pr command also exports the PR Reviews for each one of the PRs:
https://docs.github.com/en/rest/pulls/reviews#list-reviews-for-a-pull-request
Outputs CSV file (prr-org_name-repo-name.csv) containing:
- id - numeric identifier for that PR review
- number - number of the PR
- state - current state of that PR review
- repo - name of the repo
- org - organization name
- user - username who submitted the PR review
- submitted - ISO timestamp of that PR review
- association - author association (CONTRIBUTOR, MEMBER, etc.)
bin/prc [org_name] [repo_name]
https://docs.github.com/en/rest/pulls/comments#list-review-comments-in-a-repository
Outputs CSV file (prc-org_name-repo-name.csv) containing:
- id - numeric identifier for that PR comment
- number - number of the PR
- review_id - pull request review ID
- in_reply_to_id - numeric identifier of the comment this comment replies to
- position - order of the comment in the context of the PR
- repo - name of the repo
- org - organization name
- author - username who made the PR comment
- association - author association (CONTRIBUTOR, MEMBER, etc.)
- created_at - comment creation timestamp
- updated_at - comment update timestamp
bin/issue [org_name] [repo_name]
https://docs.github.com/en/rest/issues/issues#list-repository-issues
Outputs CSV file (issue-org_name-repo-name.csv) containing:
- id - numeric identifier for that issue
- number - number of the issue
- state - current state of that issue
- repo - name of the repo
- org - organization name
- author - username who created the issue
- created - ISO timestamp of when the issue was created
- closed - ISO timestamp of when the issue was closed
- title - issue title at that time
- labels - comma-separated list of labels
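Because the labels field packs several labels into one CSV cell, it needs a second split after the CSV itself is parsed. The sketch below counts issues per label. It assumes a header row with a `labels` column and that the cell is quoted whenever it contains commas. Both are assumptions about the export format, and `label_counts` is a hypothetical helper.

```python
import csv
import io
from collections import Counter


def label_counts(csv_file):
    """Count how many rows carry each label in the comma-separated labels column."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        # Split the cell on commas and ignore empty entries (rows without labels).
        labels = (part.strip() for part in row["labels"].split(","))
        counts.update(label for label in labels if label)
    return counts


# Made-up sample rows standing in for issue-org_name-repo-name.csv
sample = io.StringIO(
    'number,state,labels\n'
    '1,open,"bug,help wanted"\n'
    '2,closed,bug\n'
    '3,open,\n'
)
print(label_counts(sample).most_common())  # [('bug', 2), ('help wanted', 1)]
```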
bin/issuec [org_name] [repo_name]
https://docs.github.com/en/rest/issues/comments#list-issue-comments-for-a-repository
Outputs CSV file (issuec-org_name-repo-name.csv) containing:
- id - numeric identifier for that issue comment
- number - number of the issue
- repo - name of the repo
- org - organization name
- author - username who made the issue comment
- updated - ISO timestamp of when the issue comment was last updated
- association - comment author association (role)
bin/contrib [org_name] [repo_name]
https://docs.github.com/en/rest/repos/repos#list-repository-contributors
Outputs CSV file (contrib-org_name-repo-name.csv) containing:
- id - numeric identifier for that contributor
- repo - name of the repo
- org - organization name
- user - username of the contributor
- admin - whether the user is a repo admin
- num - number of contributions to this repo made by that user
Note: Only events created within the past 90 days will be included in timelines.
bin/event [org_name] [repo_name]
https://docs.github.com/en/rest/activity/events
Outputs CSV file (event-org_name-repo-name.csv) containing:
- id - numeric identifier for that event
- repo - name of the repo
- org - organization name
- user - the actor (user) that caused that event
- type - the type of the event (e.g. PushEvent)
- time - ANSI timestamp of this event
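A quick way to summarize the event export is to tally rows by event type. The sketch below assumes a header row with a `type` column matching the field list above. `count_event_types` is a hypothetical helper, not part of the scripts.

```python
import csv
import io
from collections import Counter


def count_event_types(csv_file):
    """Return a Counter of event type -> number of rows in the export."""
    return Counter(row["type"] for row in csv.DictReader(csv_file))


# Made-up sample rows standing in for event-org_name-repo-name.csv
sample = io.StringIO(
    "id,user,type\n"
    "1,alice,PushEvent\n"
    "2,bob,IssuesEvent\n"
    "3,alice,PushEvent\n"
)
print(count_event_types(sample).most_common(1))  # [('PushEvent', 2)]
```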
This is my personal project and it does not represent my employer. I take no responsibility for issues caused by this code. I do my best to ensure that everything wor