davidmerfield/Typeset

Cheerio Dependency

jakiestfu opened this issue · 12 comments

How integral is cheerio to the problem you're solving? What is its biggest benefit?

Cheerio provides a neat, fast, browser-free interface to parse an HTML string and modify its text nodes. When I adapt the library to run on the client, I'll use jQuery, or a minimal equivalent.

Do you have a suggestion for an alternate method?

Your library is very useful, but not everyone needs the text nodes parsed; they just want content in -> content out. If I had built this, I would have omitted the parsing entirely, since injecting the formatted content into their HTML is still up to them.

I guess I feel like it's an unnecessary addition, but if it's something you definitely wanted in the library, that's OK!

I'm struggling to work out how to implement some of the library's features without parsing the HTML. For instance, a naive find-and-replace on an HTML string would screw up pre-formatted content, tag attributes, script tags, inline CSS, etc. The punctuation replacement and hanging punctuation need to be limited to specific text nodes.

Can you think of a way of accomplishing this without some sort of HTML parser?
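To illustrate the problem described above, here is a minimal sketch of a naive find-and-replace. The function `naiveSmartQuotes` is hypothetical and is not part of Typeset; it only shows how a string-level replacement handles plain text correctly but corrupts a tag attribute.

```javascript
// Hypothetical helper, not Typeset's API: a naive smart-quote replacement
// that treats its input as one flat string.
function naiveSmartQuotes(input) {
  return input
    .replace(/"(?=\w)/g, '\u201C') // a quote directly before a word character opens
    .replace(/"/g, '\u201D');      // every remaining quote closes
}

// Plain text comes out as intended.
console.log(naiveSmartQuotes('"Hello," said the fox.'));
// prints: “Hello,” said the fox.

// The same replacement on HTML also rewrites the quotes around the
// attribute value, so the href is no longer valid markup.
console.log(naiveSmartQuotes('<a href="/fox">"Hello"</a>'));
// prints: <a href=”/fox”>“Hello”</a>
```

Restricting the replacement to text nodes, which is what an HTML parser makes possible, avoids the second case.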

IMO, the real solution is to not attempt to parse HTML to begin with. Content !== HTML. With the example below, the user could just convert the text and inject it into the DOM however they like. Alternatively, they could just send the content down from the server if this is used server side.

var typeset = require('typeset');
var text = '"Hello," said the fox.';
var output = typeset(text);

// Client
document.getElementById('content').innerText = output;

// OR Server
// Assuming an Express-style response object: render() takes a view and its locals
res.render('my-template.jade', {
  content: output
});

This means it could work server side and client side: all it does is take raw text and format it as you'd expect. It should still be up to the user to inject that content into their setup however they see fit.

This already works with your current implementation of Typeset, but the input is parsed as HTML as well. I think that's feature creep, in the sense that it either imposes HTML handling on plain text or uses cheerio unnecessarily.

If you simplify the concept of this library to just be straight text replacement for those characters, it'd be useful in many more places.

Again, only if you think that concept is relevant. I could see my company using this but we use Ember.js for example, and we don't want to inject HTML, just raw text content and we handle data binding ourselves. Again, the current form of this lib allows us to do that, but that just means Cheerio is unnecessary at that point.

Agreed, this would be great. However, what if someone wants to mix some code snippets into their content? Or embed a data visualization? Or video? That is going to involve mixing text and HTML. Perhaps the solution is to pass an option if you only have text, say typeset(text, {text: true}), that bypasses cheerio and gives you the resulting performance benefit?
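The option proposed above could be sketched roughly as follows. This is only an illustration under assumptions: `createTypeset`, `formatText`, and `formatHtml` are hypothetical names, and the two formatters passed in here are stand-ins rather than Typeset's real logic.

```javascript
// Hypothetical factory: the caller supplies a plain-text formatter and an
// HTML formatter. Neither is Typeset's actual implementation.
function createTypeset(formatText, formatHtml) {
  return function typeset(input, options) {
    // With {text: true} the HTML path (and its parser) is skipped entirely.
    if (options && options.text === true) {
      return formatText(input);
    }
    return formatHtml(input);
  };
}

// Stand-in formatters for the sketch.
var typeset = createTypeset(
  function (text) { return text.replace(/\.\.\./g, '\u2026'); }, // ellipsis only
  function (html) { return html; } // placeholder: a real version would parse with cheerio
);

console.log(typeset('Wait...', { text: true })); // prints: Wait…
console.log(typeset('<p>Wait...</p>'));          // prints: <p>Wait...</p>
```

Keeping the two paths behind one function means plain-text callers never pay for parsing, while HTML callers keep the existing behaviour.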

The library was designed to be another stage in the asset pipeline for my blogging platform, so being able to parse an entire HTML document was a requirement.

I would say that person is "doing it wrong" then but you obviously have to account for as many people as possible if you want better exposure.

I guess that idea where you have an API that supports both would work. It'd at least make it clear that you're not using a virtual DOM parser if you don't want that.

The other thing to be aware of is that the features which enable hanging punctuation and optical-margin-alignment must return HTML, since the technique involves the insertion of new nodes.
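As a rough sketch of why such a feature has to return HTML: hanging a leading quotation mark means wrapping it in an element that CSS can then pull into the margin. The function names and the `hang-quote` class below are hypothetical and are not taken from Typeset.

```javascript
// Escape characters that would otherwise be read as markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: wrap a leading opening double quote in a span.
// Input is plain text; output is necessarily an HTML string.
function hangLeadingQuote(text) {
  var escaped = escapeHtml(text);
  if (escaped.charAt(0) !== '\u201C') {
    return escaped; // nothing to hang
  }
  return '<span class="hang-quote">\u201C</span>' + escaped.slice(1);
}

console.log(hangLeadingQuote('\u201CHello,\u201D said the fox.'));
// prints: <span class="hang-quote">“</span>Hello,” said the fox.
```

Because the result contains a new element, it cannot be assigned as raw text without the span showing up literally.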

One possible alternative for people in your situation might be to use this library on the client side. This is not a feature I have added yet, but it is planned.

If I were to use this in a personal application, I'd probably prefer to send the data formatted from the server, but with this solution, everybody wins.

You're right about the hanging punctuation, so maybe some formatting can only be done in HTML.

I just wanted to share the idea is all! Great library, very useful. Keep up the good work!

Cool, I do appreciate your questions, and thanks for the kind words.

I recommend Sprint-js for the client-side implementation.