Lightweight HTML processor

Primary language: C#. License: MIT.

LtGt

LtGt is a minimalistic library for working with HTML. It can be used to parse HTML5-compliant code, traverse the resulting syntax tree, locate specific elements, and extract information. The library can also be used the other way around, to render HTML code from its document object model.

Download

Features

  • Parse and render HTML5-compliant code
  • Traverse object tree using convenient methods
  • Convert HTML DOM to a Linq2Xml representation
  • Easily extensible with custom workflows
  • Targets .NET Framework 4.5+ and .NET Standard 1.0+

Usage

Parse a document

To parse an HTML document, either create a new instance of HtmlParser or use the singleton HtmlParser.Default.

const string html = @"<!doctype html>
<html>
  <head>
    <title>Document</title>
  </head>
  <body>
    <div>Content</div>
  </body>
</html>";

var document = HtmlParser.Default.ParseDocument(html);

Parse a fragment

Besides parsing a full document, you can also parse any other type of node.

const string html = "<div id=\"some-element\"><a href=\"https://example.com\">Link</a></div>";

var node = HtmlParser.Default.ParseNode(html);

var element = (HtmlElement) node; // we assume we're dealing with an element

Find specific element

LtGt provides a number of extension methods for locating elements and reading their attributes and text.

var element1 = document.GetElementById("menu-bar");
var element2 = document.GetElementByTagName("div");
var element3 = document.GetElementByClassName("floating-button floating-button--enabled");

var element1Data = element1.GetAttribute("data")?.Value;
var element2Id = element2.GetId();
var element3Text = element3.GetInnerText();

Convert to Linq2Xml

You can convert LtGt's objects to System.Xml.Linq objects (XNode, XElement, etc.). This is useful if you need to convert HTML to XML or want to use XPath to select nodes.

var htmlDocument = HtmlParser.Default.ParseDocument(html);

var xmlDocument = htmlDocument.ToXDocument();

// XPathSelectElements is an extension method from the System.Xml.XPath namespace
var elements = xmlDocument.XPathSelectElements("//input[@type=\"submit\"]");
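
To see the XPath step on its own, here is a minimal sketch that uses only the standard System.Xml.Linq and System.Xml.XPath APIs. It assumes a C# 9+ project (top-level statements), and it builds the XDocument with XDocument.Parse as a stand-in for the result of ToXDocument(). The markup and the variable names are made up for illustration and are not part of LtGt.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // brings the XPathSelectElements extension method into scope

// Stand-in for htmlDocument.ToXDocument(): a well-formed XML tree
// parsed from hypothetical markup.
var xmlDocument = XDocument.Parse(
    "<html><body><form>" +
    "<input type=\"text\" name=\"query\" />" +
    "<input type=\"submit\" value=\"Search\" />" +
    "</form></body></html>");

// Select every <input> element whose type attribute equals "submit".
var submitButtons = xmlDocument
    .XPathSelectElements("//input[@type=\"submit\"]")
    .ToList();

foreach (var button in submitButtons)
{
    // Attribute() returns null when the attribute is missing, hence the ?. guard.
    Console.WriteLine(button.Attribute("value")?.Value);
}
```

Without the `using System.Xml.XPath;` directive, the call to XPathSelectElements does not compile, because the method is defined as an extension rather than on XDocument itself.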

Render nodes

You can convert any node or hierarchy of nodes to HTML code.

var element = new HtmlElement("div",
    new HtmlAttribute("id", "main"),
    new HtmlText("Hello world"));

var html = HtmlRenderer.Default.RenderNode(element); // <div id="main">Hello world</div>

Libraries used

Donate

If you really like my projects and want to support me, consider donating to me on Patreon or BuyMeACoffee. All donations are optional and are greatly appreciated. 🙏