antchfx/htmlquery

substring-after() is not being executed

I tried a few expressions with substring-after(), and it seems the function is not being executed at all. While debugging func.go, I saw that substringIndFunc is called and returns a callable, but that callable is never invoked.

Example expression:
substring-after(//span[@class="pageNumbersInfo"]//text(), "of ")
Node: Pages 1 of 25
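
For reference, here is a minimal sketch of the result this expression is expected to produce. It does not use htmlquery at all: `substringAfter` is a hypothetical helper written only to illustrate the XPath 1.0 semantics of substring-after() (the text following the first occurrence of the separator, or an empty string if the separator is absent). It assumes Go 1.18 or newer for `strings.Cut`.

```go
package main

import (
	"fmt"
	"strings"
)

// substringAfter is a hypothetical helper that mirrors XPath 1.0
// substring-after(): it returns the part of s that follows the first
// occurrence of sep, or "" when sep does not occur in s.
func substringAfter(s, sep string) string {
	_, after, found := strings.Cut(s, sep)
	if !found {
		return ""
	}
	return after
}

func main() {
	// The node text from the report above.
	fmt.Println(substringAfter("Pages 1 of 25", "of ")) // prints "25"
}
```

So for the node text above, the expected value is the string 25.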

You need to provide more information to help me debug this.

c4tz commented

Hi, I just came across this problem, too.

Here is a minimal (not) working example:

package main

import (
	"log"
	"strings"

	"github.com/antchfx/htmlquery"
)

func main() {
	s := `<html>
	<head></head>
	<body>
		<div id="foo-12345"></div>
	</body>
</html>`
	xpath := "substring-after(//div/@id, 'foo-')"
	doc, err := htmlquery.Parse(strings.NewReader(s))
	if err != nil {
		log.Fatal(err)
	}
	results, err := htmlquery.QueryAll(doc, xpath)
	if err != nil {
		log.Fatal(err)
	}
	for _, r := range results {
		log.Print(r)
	}
}

When I use the exact same XPath expression (and HTML) in the Chromium DevTools, I get 12345 back as a result, but htmlquery does not seem to find anything.

@c4tz, substring-after() is a function that returns a string value, not a node. Your expression substring-after(//div/@id, 'foo-') tells the package to execute the function and return a string, while htmlquery.QueryAll only returns nodes.

Compare the following two examples:

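// Requires the "fmt" and "github.com/antchfx/xpath" imports.
// Evaluate returns the expression's value; for substring-after() it is a string.
expr := xpath.MustCompile("substring-after(//div/@id, 'foo-')")
v := expr.Evaluate(htmlquery.CreateXPathNavigator(doc))
if s, ok := v.(string); ok {
    fmt.Println(s)  // output: 12345
}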

The code below returns node values instead:

	// The predicate keeps each div whose substring-after() result is a non-empty string.
	results, err := htmlquery.QueryAll(doc, "//div[substring-after(@id, 'foo-')]")
	if err != nil {
		log.Fatal(err)
	}
	for _, r := range results {
		log.Print(r)
	}
c4tz commented

Ahh 💡 Thank you for the clarification! Then it was just a misunderstanding on my side, sorry.