Welcome to SARIT’s library of Indic e-texts. SARIT is short for “Search and Retrieval of Indic Texts” (cf. http://sarit.indology.info/).
This collection of files is special in that any changes you make to your files can, if you choose so, be shared with other people. So if you correct, annotate, or otherwise improve any file in this directory or one of the directories below it, you might consider sharing your work with others (and hope that they will share their work with you).
We, people administering SARIT and working with it, hope that this will lead to incremental but continuous improvement of these files, and of any files you wish to add. We plan to periodically update SARIT’s search interface with the texts changed or added in this repository. At the moment, this library is an experiment in online collaboration on this sort of material.
In order to make full use of the SARIT library, you should install git. You can find all you need to get started here: http://git-scm.com/downloads.
More details on the installation procedures are given in Pro Git, chap. 1.4.
If you do not like working on a command line, you can use a Graphical User Interface (GUI) for git. The standard GUI that is installed automatically by the steps above is “gitk” on Linux, and “git gui” on mac and windows.
These are stable but rather basic programs, and there are quite a few other options you might want to explore if you’re not happy with them. In the following, we will give examples for Smart Git, a Java program that is free of charge for non-commercial use, and that will run any platform that Java runs on (i.e., Linux, Windows, Mac).
Getting the first copy is called “cloning.” Depending on your platform, you can either
or consult the documentation of your git interface on how to point it at git://github.com/paddymcall/SARIT
For Smart Git, do this:
- Project –> Clone
- Under “Remote Git or SVN repository”, enter the name of the collection, e.g., “git://github.com/paddymcall/SARIT”, click Next.
- Choose a directory where to put your files. This should be somewhere where you usually store your files, i.e., not in “C:\Programs” or so. Click Next.
- Accept or choose the project name (SARIT should be the default).
The full guidelines can be found here: http://www.tei-c.org/Guidelines/P5/, and a simplified version, TEI Lite: http://www.tei-c.org/Guidelines/Customization/Lite/.
We have found the following introductions very helpful:
- Comparing git to a computer game: http://www-cs-students.stanford.edu/~blynn/gitmagic/ (note also the various translations)
- Basic git commands, short and to the point: http://www.kernel.org/pub/software/scm/git/docs/everyday.html
- More links can be found here: http://git-scm.com/documentation
That depends on the area you need help with:
You probably do need an XML aware editor. There are many, and with many different features, just as there are many tastes. Some popular choices include:
- Jedit, with appropriate plug-ins (runs on all computers where Java is installed)
- http://www.oxygenxml.com/ (on all platforms, not free)
- Emacs + nxml mode (all platforms, free)
For a list to choose from, cf. http://en.wikipedia.org/wiki/List_of_XML_editors or http://www.xml.com/pub/rg/XML_Editors.
This is done via the program git: see Learn about git above for links to documentation, and see How does sharing work? below for a general description.
Three steps are involved in sharing these files:
- Getting <<what other people changed>>.
- Letting other people <<get what you changed>>.
- <<Merging the changes>> together.
To do this in an organised fashion, we are using a program called git. It keeps track of changes to the files in this directory, and can `pull’ (point 1 above) and `push’ (point 2 above) from or to another instance of these files likewise controlled by git. What it pushes are the changes you have made to these files, and what it pulls are the changes another person (or a group of other persons) has made to these files.
When it does this, two things can happen:
When, say, Jane corrects paragraph 1, and Jack corrects paragraph 2 of the same file, git will be able to `merge’ (point 3 above) . So if Jack `pulls’ Jane’s changes, paragraph 1 of his file will automatically be changed to paragraph 1 of Jane’s file. Likewise, if Jane `pulls’ Jack’s changes, her file will automatically be changed in paragraph 2 according to Jack’s changes. So they each edited only one paragraph, but both have the same version of the file now, with both paragraphs corrected.
In case both Jane and Jack change the same part of a file, git will refuse to `merge’ the files (since it doesn’t know which change is the correct one). In this situation, either Jack or Jane will have to review the other person’s changes, and decide which version to keep (or make a third version that contains the changes of both). After making these changes, git will understand that either Jack or Jane have resolved the conflict, and they can continue to work in the normal fashion.
On github, there are two ways in which you can get your changes back to SARIT:
- by being a collaborator, or
- by creating your own copy of the SARIT library (forking) and telling us about your changes (pull request)
In both cases, you need to sign up on http://github.com.
Please send an email to pma@rdorte.org with your github account name and mentioning that you’d like to collaborate.
This is the preferred way to go if you want to be a little more independent from the main SARIT library, e.g., for working on your own set of files, or for general experimentation. Basically, you copy the whole project at a particular moment in its history, and then work independently on that copy. If you are happy with your changes, you can send us a pull request, and we will try to merge your changes back into the main repository.
These two pieces of information will get you started:
- forking http://help.github.com/fork-a-repo/
- Sending pull requests: http://help.github.com/pull-requests/
The files in this directory try to adhere to the Text Encoding Initiatives standards in version P5 (TEI P5). These standards define a vocabulary for describing things about a text: who is its author, which other texts is it referring to, which page of a printed edition is this paragraph on, who is “asya” referring to, etc.
This file is an exception in that it aggregates all the invidual text files into a corpus. Correspondingly, it has teiCorpus as its root element, instead of TEI like all the others.
You can use this to easily operate on the whole corpus, e.g.:
xmllint --encode UTF-8 --xinclude saritcorpus.xml > /temp/saritcorpus.xml jing -c schemas/tei_all.rnc /temp/saritcorpus.xml