web-verse
Verses for the web (deep linking to text). Inspired by Emphasis by Michael Donohoe, but built on the Range interface and the Selection object.
We fingerprint a paragraph by:
- Breaking the text into sentences [1]
- Taking the first and last sentences [2]
- Taking the first character from each of the first three words of each of those sentences [3]
These fingerprints have been shown to be sufficiently unique for reasonably sized documents. Because the method is deterministic yet does not depend on all of the content, it tolerates small changes to the content.
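The fingerprinting steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `splitSentences`, `sentenceKey`, and `fingerprint` are hypothetical, and the sentence splitter here is deliberately naive (it does not handle abbreviations as described in note 1).

```javascript
// Naive sentence splitter: breaks after '.', '!' or '?' followed by whitespace.
// Assumption: the real splitter is smarter about abbreviations such as "Dr.".
function splitSentences(text) {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// First character of each of the first three whitespace-delimited words.
function sentenceKey(sentence) {
  return sentence
    .split(/\s+/)
    .filter((w) => w.length > 0)
    .slice(0, 3)
    .map((w) => w[0])
    .join('');
}

// Hypothetical helper: key of the first sentence followed by key of the last.
// With a single sentence, the same key simply appears twice (see note 2).
function fingerprint(paragraph) {
  const sentences = splitSentences(paragraph);
  if (sentences.length === 0) return '';
  return sentenceKey(sentences[0]) + sentenceKey(sentences[sentences.length - 1]);
}

console.log(fingerprint('I am a paragraph with 2 sentences. I am the second sentence.'));
// prints "IaaIat"
```

Because only six characters of the paragraph feed into the result, edits elsewhere in the paragraph leave the fingerprint unchanged.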
Regions of text can be referenced from within a paragraph by using character ranges (counting from 1). For instance, in the following paragraph:
**I** **a**m **a** paragraph with 2 sentences.
**I** **a**m **t**he second sentence.
We can refer to the word "sentences" in the first sentence by using the range `25-33`. Together with the paragraph's fingerprint, this gives us an address of `IaaIat:25-33`.
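To make the addressing scheme concrete, here is a small sketch that parses such an address and pulls the referenced characters out of a plain string. The helpers `parseAddress` and `extractRange` are hypothetical names for illustration; the library itself works with DOM ranges rather than bare strings.

```javascript
// Hypothetical parser for addresses of the form "<fingerprint>:<start>-<end>".
function parseAddress(address) {
  const match = /^([^:]+):(\d+)-(\d+)$/.exec(address);
  if (!match) throw new Error(`Malformed address: ${address}`);
  return { fingerprint: match[1], start: Number(match[2]), end: Number(match[3]) };
}

// Ranges count from 1 and include both endpoints, so shift the start
// down by one for JavaScript's 0-based, end-exclusive slice().
function extractRange(paragraph, start, end) {
  return paragraph.slice(start - 1, end);
}

const paragraph = 'I am a paragraph with 2 sentences. I am the second sentence.';
const { start, end } = parseAddress('IaaIat:25-33');
console.log(extractRange(paragraph, start, end)); // prints "sentences"
```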
Try it!
npm run watch
Open index.html in a browser. Open the developer console. Select a section of a paragraph. The identifier appears in the console, and the selection is then regenerated from that identifier.
Build
npm run build
-> creates web-verse.min.js
Tests
npm run test
1: We attempt to be smart about handling full stops. We ignore things like "Dr. Who" and a few similar cases. This is generally enough to avoid getting single-word nonsense for our sentences.
2: It's OK if the first and last sentences are the same sentence.
3: A word is a token made up of a run of non-whitespace characters.
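The full-stop handling described in note 1 might look something like the sketch below. The abbreviation list and the function name `splitSentencesSkippingAbbreviations` are assumptions made for illustration; the cases the project actually recognises are not listed here.

```javascript
// Hypothetical abbreviation list; the project's actual list is not shown here.
const ABBREVIATIONS = new Set(['Dr.', 'Mr.', 'Mrs.', 'Ms.', 'Prof.']);

function splitSentencesSkippingAbbreviations(text) {
  // Words are runs of non-whitespace characters (note 3).
  const tokens = text.split(/\s+/).filter((t) => t.length > 0);
  const sentences = [];
  let current = [];
  for (const token of tokens) {
    current.push(token);
    // A token ends a sentence only if it ends in punctuation
    // and is not a known abbreviation.
    const endsSentence = /[.!?]$/.test(token) && !ABBREVIATIONS.has(token);
    if (endsSentence) {
      sentences.push(current.join(' '));
      current = [];
    }
  }
  // Keep any trailing text that lacks closing punctuation.
  if (current.length > 0) sentences.push(current.join(' '));
  return sentences;
}

console.log(splitSentencesSkippingAbbreviations('Dr. Who arrived. He left.'));
// yields two sentences: "Dr. Who arrived." and "He left."
```

Without the abbreviation check, "Dr." would become a one-word sentence and the fingerprint would be built from it.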