hdaSprachtechnologie/odenet

Publish a release

For OMW we are now using GitHub releases as a distribution channel for the WN-LMF XML files (e.g., see the v1.3 release). Wn can then download those release files directly by URL, like this:

>>> import wn
>>> wn.download("https://github.com/bond-lab/omw-data/releases/download/v1.3/fra.tar.xz")

Wn can also index those URLs to make it even easier:

>>> wn.download("frawn")
>>> wn.download("pwn:3.0")
>>> wn.download("ewn:2020")

Would you be interested in publishing a release of OdeNet in a similar way? It doesn't need to be done through GitHub releases (EWN, for example, is not). If so, we have some additional suggestions about packaging.

Yes, of course, that'd be great.
What do I have to do to make this possible?

At a minimum, just create a release on GitHub, then attach the XML file as a release asset. You can do this through GitHub's web interface, or you can use something like GitHub CLI (for example, if v1.3 is the release tag, you could do gh release upload v1.3 odenet/odenet/deWordNet.xml).

You can improve on this by compressing the XML file to reduce the download time. Either .gz or .xz works.
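As a rough sketch of the compression step, assuming you would rather do it from Python than with the xz command-line tool: the helper name compress_xz below is made up for illustration, and it uses only the standard library.

```python
import lzma
import shutil
from pathlib import Path


def compress_xz(src: Path) -> Path:
    """Write an .xz-compressed copy of src next to it and return its path.

    Hypothetical helper for illustration; not part of Wn or OdeNet.
    """
    dest = src.with_name(src.name + ".xz")
    # Stream the copy so a large XML file is never held in memory at once.
    with src.open("rb") as fin, lzma.open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest
```

Calling compress_xz(Path("deWordNet.xml")) would leave a deWordNet.xml.xz file beside the original. Swapping lzma.open for gzip.open (and the suffix for ".gz") should give the .gz variant.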

Finally, if you want to provide more metadata than what's in the attributes of the <LexicalResource> element, you can create a package with a release README, a copy of the LICENSE file, and a citation.bib file for the canonical citation. Just create a folder containing these files and the resource file. E.g.:

odenet/
├── deWordNet.xml
├── LICENSE
└── README.md

Then tar and compress the folder so you have, e.g., an odenet.tar.xz file. Then upload this file as a release asset.
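If it helps, the tar-and-compress step can also be sketched in Python with the standard tarfile module. The function name make_package is hypothetical, and the sketch assumes the folder is laid out as in the listing above.

```python
import tarfile
from pathlib import Path


def make_package(folder: Path, dest: Path) -> Path:
    """Archive folder (including the folder itself) as an xz-compressed tarball.

    Hypothetical helper for illustration only.
    """
    with tarfile.open(dest, "w:xz") as tar:
        # arcname keeps only the folder name as the top-level entry,
        # so the archive unpacks to odenet/ rather than to a longer path.
        tar.add(folder, arcname=folder.name)
    return dest
```

For example, make_package(Path("odenet"), Path("odenet.tar.xz")) would produce the odenet.tar.xz file mentioned above.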

Does this help?

Since you've already merged #18, and since #17 was resolved, the release workflow should now work. You only need to create a release tagged v1.3. I have demonstrated that this works in my fork (see here for the release, and here for details of the workflow run that uploaded the asset). With the resource published as a release asset, I'm able to download it in Wn:

>>> wn.download('https://github.com/goodmami/odenet/releases/download/v1.3/odenet-1.3.tar.xz')
Download complete (2001072 bytes)
Added odenet:1.3 (Offenes Deutsches WordNet)
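The release asset URLs in this thread all share one shape: owner, repository, tag, and file name. A minimal sketch of building such a URL, with release_asset_url being a made-up helper name rather than anything provided by Wn:

```python
def release_asset_url(owner: str, repo: str, tag: str, asset: str) -> str:
    """Build a GitHub release-asset download URL (hypothetical helper)."""
    return f"https://github.com/{owner}/{repo}/releases/download/{tag}/{asset}"


url = release_asset_url("goodmami", "odenet", "v1.3", "odenet-1.3.tar.xz")
# With Wn installed, this string can be passed to wn.download(url).
```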

Two things to note:

  • If you want to release a new version, make sure you change the version attribute in the XML and tag the release to match
  • Currently the LMF package does not include the README or citation.bib files. The README.md file of this repository contains information irrelevant for the wordnet resource (related to installing the Python tooling), and there is no citation.bib file. Once there are appropriate files in the repository, you just need to add their paths to the environment variables in the workflow, and they will be copied into the LMF package. Here are the relevant lines in the workflow file:

env:
  XMLPATH: odenet/wordnet/deWordNet.xml
  DTD: WN-LMF-1.0.dtd
  LICENSE: LICENSE
  README:
  CITATION:
  BUILD_DIR: build
The values of those variables should be paths relative to the project root (see XMLPATH and LICENSE for examples).
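On the first note above (keeping the XML version and the release tag in sync), a quick pre-release check could look like the sketch below. It assumes the version lives in a version attribute on the <Lexicon> element, as in WN-LMF, and that tags are the version prefixed with "v" (as with v1.3 and odenet:1.3). The function name versions_match_tag is made up for illustration.

```python
import xml.etree.ElementTree as ET


def versions_match_tag(xml_path, tag: str) -> bool:
    """Return True if every <Lexicon> version equals the tag minus a leading 'v'.

    Hypothetical helper; assumes the WN-LMF layout described above.
    """
    expected = tag[1:] if tag.startswith("v") else tag
    root = ET.parse(xml_path).getroot()
    versions = [lex.get("version") for lex in root.iter("Lexicon")]
    # An empty list means no <Lexicon> was found, which counts as a mismatch.
    return bool(versions) and all(v == expected for v in versions)
```

Something like versions_match_tag("odenet/wordnet/deWordNet.xml", "v1.3") could be run before tagging. Note that this reads the whole file into memory, which should be fine for a resource of this size.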

The GWC paper is meant to be the official citation, so that's fine.

I've tried it out, and it works perfectly. Thank you!