Enhanced Github repository catalog
planetf1 opened this issue · 6 comments
There are a number of issues open to add additional repositories into the catalog at
Rather than manually adding each one, I experimented with a script which can:
- get a list of all repos under odpi that are not archived
- only get the 'egeria' ones (this was done by adding a metadata tag to each repository)
- output markdown in the same format as we currently use in the provided data table (albeit with no parsing of description yet)
The available fields include:
assignableUsers
codeOfConduct
contactLinks
createdAt
defaultBranchRef
deleteBranchOnMerge
description
diskUsage
forkCount
fundingLinks
hasIssuesEnabled
hasProjectsEnabled
hasWikiEnabled
homepageUrl
id
isArchived
isBlankIssuesEnabled
isEmpty
isFork
isInOrganization
isMirror
isPrivate
isSecurityPolicyEnabled
isTemplate
isUserConfigurationRepository
issueTemplates
issues
labels
languages
latestRelease
licenseInfo
mentionableUsers
mergeCommitAllowed
milestones
mirrorUrl
name
nameWithOwner
openGraphImageUrl
owner
parent
primaryLanguage
projects
pullRequestTemplates
pullRequests
pushedAt
rebaseMergeAllowed
repositoryTopics
securityPolicyUrl
squashMergeAllowed
sshUrl
stargazerCount
templateRepository
updatedAt
url
usesCustomOpenGraphImage
viewerCanAdminister
viewerDefaultCommitEmail
viewerDefaultMergeMethod
viewerHasStarred
viewerPermission
viewerPossibleCommitEmails
viewerSubscription
watchers
This seems a more sustainable model, and opens up some additional questions:
- Where should we check in this script?
- Could we automate its execution at page build time?
- What other data do we want to include?
- What sort order/grouping is needed? (Currently the script's output is unsorted; I sorted in IntelliJ...)
For now I will:
- Sensibly merge the descriptions from the actual repository metadata into the text we have in the Egeria docs for the small number of repos listed
- Run this script one-off and update the current doc manually
Current script:

```python
#!/usr/bin/python3
import json
import subprocess

# Fetch all odpi repositories (name, description, topics, archived flag) via the gh CLI
repoinfo = subprocess.check_output(
    'gh repo list odpi --limit 200 --json name,description,repositoryTopics,isArchived',
    shell=True)
repojson = json.loads(repoinfo)

print('| Repository | Purpose |')
print('| --- | --- |')
for entry in repojson:
    name = entry['name']
    description = entry['description']
    topics = entry['repositoryTopics']
    archived = entry['isArchived']
    if topics is not None:
        # repositoryTopics is a list of objects, so compare against the topic names
        topic_names = [t['name'] for t in topics]
        if 'egeria' in topic_names and not archived:
            print("| [`%s` :material-github:](https://github.com/odpi/%s){ target=gh } | %s |"
                  % (name, name, description))
```
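On the sort order/grouping question: once the JSON is parsed, the rows could simply be sorted by repository name before rendering. A minimal sketch below, assuming the same shape as the `gh repo list --json` output; `sample` is hand-written illustrative data, not real gh output:

```python
def render_catalog(repos):
    """Return markdown table rows for non-archived repos tagged 'egeria', sorted by name."""
    lines = ['| Repository | Purpose |', '| --- | --- |']
    for entry in sorted(repos, key=lambda e: e['name']):
        # repositoryTopics may be None or a list of {'name': ...} objects
        topic_names = [t['name'] for t in (entry.get('repositoryTopics') or [])]
        if 'egeria' in topic_names and not entry['isArchived']:
            lines.append('| [`%s` :material-github:](https://github.com/odpi/%s){ target=gh } | %s |'
                         % (entry['name'], entry['name'], entry['description']))
    return '\n'.join(lines)

# Hand-written sample data for illustration only:
sample = [
    {'name': 'egeria-docs', 'description': 'Docs', 'repositoryTopics': [{'name': 'egeria'}], 'isArchived': False},
    {'name': 'egeria', 'description': 'Core', 'repositoryTopics': [{'name': 'egeria'}], 'isArchived': False},
    {'name': 'old-repo', 'description': 'Old', 'repositoryTopics': [{'name': 'egeria'}], 'isArchived': True},
]
print(render_catalog(sample))
```

Keeping the rendering in a pure function like this also makes the sort/grouping choice easy to change later without touching the gh call.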
@planetf1 Could we add this script to the Gradle build for the docs repo - i.e. run this Python in a step prior to the site build, so the documentation picks up this information on every build?
Yes, that sort of thing... I think we can do it for other similar content too. I'd probably put all generated content in one place, then refactor the docs a little to embed the dynamic content. I wanted to record the idea as I didn't have time to do it straight away, and in any case it's worth a discussion on a team call/f2f.
though I probably wouldn't introduce Gradle for this
run 1-off for v4
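The build-time idea above could be sketched as a small wrapper the site build runs first, writing the generated table into one generated-content location that the docs then embed. The helper below is a sketch only; the output path and file name are hypothetical, and in practice the markdown would come from the catalog script above:

```python
import tempfile
from pathlib import Path

def write_generated(markdown: str, out: Path) -> Path:
    """Write generated catalog markdown to the docs tree, creating parent dirs as needed."""
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(markdown, encoding='utf-8')
    return out

# Demo with a temporary directory (a real build would target e.g. a docs/_generated/ folder):
tmp = Path(tempfile.mkdtemp())
target = write_generated('| Repository | Purpose |\n| --- | --- |\n',
                         tmp / 'repo-catalog.md')
print(target.read_text(encoding='utf-8'))
```

Keeping all generated files under a single directory makes it easy to gitignore them and to refactor pages to include the dynamic content.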