DotNetExpose

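DotNetExpose is a .NET library that helps you scrape web pages. It reports information about a page, such as the number of CSS and JS files, HTML elements, META elements, forms, and onclick events, as well as the page size.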

Notes

Version 1.0.5:

  • Upgrade to .NET 6

Installation

Use the NuGet Package Manager Console to install the package.

Install-Package DotNetExpose -Version 1.0.5

Usage

After installing the package, import the namespace:

using Expose.Main;

Create an instance of ExposeHtmlDocument. The constructor takes a URL, which is the page that will be scraped.

const string URL = "https://www.google.com.br/";

ExposeHtmlDocument expose = new ExposeHtmlDocument(URL);

Return the total number of CSS files referenced in the HTML page

int countCSS = expose.CountCSSAsync();

Return the total number of JS files referenced in the HTML page

int countJS = expose.CountJSAsync();

Return the total number of HTML elements

int countHtmlElements = expose.CountHtmlElementsAsync();

Return the total number of META elements

int countMetaTags = expose.CountMetaAsync();

Return all the JS content

HashSet<string> hsJS = expose.GetJSContentAsync();

Return all the CSS content

HashSet<string> hsCSS = expose.GetCSSContentAsync();

Return the total number of onclick events across all elements in the HTML page

int countOnclickEvents = expose.CountOnclickEventsAsync();

Return the total number of forms in the HTML page

int countForms = expose.CountFormsAsync();

Return the action and HTTP method of each form

Dictionary<string,string> dicFormInfo = expose.FormsInfoAsync();
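
The returned dictionary can be turned into readable lines with a small helper. This is a minimal sketch, not part of DotNetExpose: FormInfoFormatter and Describe are hypothetical names, and the sketch assumes each key is a form's action and each value is its HTTP method, which this README does not confirm.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the DotNetExpose package.
public static class FormInfoFormatter
{
    // Assumption: key = form action, value = HTTP method.
    // Produces one line per form, e.g. "GET /search".
    public static List<string> Describe(IReadOnlyDictionary<string, string> formsInfo)
    {
        return formsInfo
            .Select(pair => $"{pair.Value.ToUpperInvariant()} {pair.Key}")
            .ToList();
    }
}
```

Under those assumptions, passing dicFormInfo to FormInfoFormatter.Describe would give one line per form found on the page.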

Return the size of the page in KB

long? pageSize = expose.GetSizeOfPageAsync();
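
Because the size is declared as a nullable long, callers may want to handle the case where no value is returned. The sketch below shows one way to do that. PageSizeFormatter is a hypothetical helper, and treating null as "size not available" is an assumption rather than documented behavior.

```csharp
// Hypothetical helper, not part of the DotNetExpose package.
public static class PageSizeFormatter
{
    // Assumption: a null size means the size could not be determined.
    public static string Describe(long? sizeInKb)
    {
        return sizeInKb.HasValue
            ? $"{sizeInKb.Value} KB"
            : "size unknown";
    }
}
```

For example, PageSizeFormatter.Describe(pageSize) could be used when printing the size, so that a missing value does not cause an exception.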

Return a JSON report with the counts of the information found

string report = expose.GetReportAsync();
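
The report string can be read with System.Text.Json. The exact shape of the report is not described here, so the sketch below assumes only that it is a JSON object and collects whatever top-level integer properties it finds. ReportReader and ReadCounts are hypothetical names, not part of the package.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper, not part of the DotNetExpose package.
public static class ReportReader
{
    // Collects every top-level integer property of a JSON object.
    // Assumption: the report is a JSON object; other shapes yield an empty result.
    public static Dictionary<string, long> ReadCounts(string reportJson)
    {
        var counts = new Dictionary<string, long>();
        using JsonDocument doc = JsonDocument.Parse(reportJson);

        if (doc.RootElement.ValueKind != JsonValueKind.Object)
        {
            return counts;
        }

        foreach (JsonProperty prop in doc.RootElement.EnumerateObject())
        {
            // Skip strings, nested objects, and non-integer numbers.
            if (prop.Value.ValueKind == JsonValueKind.Number
                && prop.Value.TryGetInt64(out long value))
            {
                counts[prop.Name] = value;
            }
        }

        return counts;
    }
}
```

Calling ReportReader.ReadCounts(report) would then give a name-to-count lookup without depending on specific property names.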

Return whether the page contains an AJAX call (true/false)

bool hasAjaxCall = expose.HasAjaxCallAsync();

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT