cheeriojs/cheerio

How to stop cheerio from escaping characters when I use the `.html()` function

fireairforce opened this issue · 3 comments

My demo case is as follows:

const cheerio = require('cheerio')

const contents = 'https://baidu.com/test?b=1&a=1'

const $ = cheerio.load(contents, {
  decodeEntities: false
})

console.log($.html({
  decodeEntities: false,
}));

The cheerio version was 1.0.0-rc.12.

The output was:

<html><head></head><body>https://baidu.com/test?b=1&amp;a=1</body></html>

I don't want & to be escaped to &amp;. I tried adding an option like decodeEntities: false, but it seems to have no effect.

How can I avoid this?

It's possible to pass false as the third argument, like this:

  const $ = cheerio.load(contents, { decodeEntities: false, }, false);

Reference: Section "Fragment mode" in Tutorial Advanced > Configuring Cheerio: https://cheerio.js.org/docs/advanced/configuring-cheerio

@fireairforce I'm using the options below and they work well for me. You can try them.

load(
  html,
  {
    xml: {
      decodeEntities: false,
    },
  },
  false
)

Actually, it seems to work with the second argument only, dropping the third one:

> cheerio.load('https://baidu.com/test?b=1&a=1').html()
'<html><head></head><body>https://baidu.com/test?b=1&amp;a=1</body></html>'
> cheerio.load('https://baidu.com/test?b=1&a=1', { xml: { decodeEntities: false } }).html()
'https://baidu.com/test?b=1&a=1'

The problem is in the second argument: it should be { xml: { decodeEntities: false } }.