HTML Agility Pack

A .NET library to parse and manipulate HTML documents.

HTML Agility Pack is a popular .NET library for parsing and manipulating HTML documents. Combined with an HttpClient that routes its requests through a HypeProxy.io proxy, it lets you scrape websites while reducing the risk of IP blocks.

Setting Up HypeProxy.io with HTML Agility Pack

Install the Package

dotnet add package HtmlAgilityPack

Configure HttpClient with Proxy

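using System;
using System.Net;
using System.Net.Http;
using HtmlAgilityPack;

// Replace "port", "username", and "password" with your own proxy details.
var proxy = new WebProxy("http://fr.hypeproxy.host:port")
{
    Credentials = new NetworkCredential("username", "password")
};

// Every request sent through this handler goes via the proxy.
var handler = new HttpClientHandler
{
    Proxy = proxy,
    UseProxy = true
};

// Create the client once and reuse it for all requests.
var httpClient = new HttpClient(handler);
var html = await httpClient.GetStringAsync("https://example.com");

// Parse the downloaded markup.
var doc = new HtmlDocument();
doc.LoadHtml(html);

// SelectSingleNode returns null when nothing matches, hence the ?. operator.
var title = doc.DocumentNode.SelectSingleNode("//title")?.InnerText;
Console.WriteLine($"Page title: {title}");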
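
Once the page is loaded, the same document can be queried for more than the title. The following is a minimal sketch of collecting link targets with XPath. The helper name ExtractLinks and the sample markup are illustrative assumptions, not part of HTML Agility Pack or HypeProxy.io; the markup is inlined so the sketch runs without a network call.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper: collects absolute link targets from an HTML string.
// baseUri is used to resolve relative href values.
static List<string> ExtractLinks(string html, Uri baseUri)
{
    var doc = new HtmlDocument();
    doc.LoadHtml(html);

    var links = new List<string>();

    // SelectNodes returns null (not an empty collection) when nothing matches.
    var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
    if (anchors == null) return links;

    foreach (var anchor in anchors)
    {
        // Decode entities such as &amp; before resolving the URL.
        var href = HtmlEntity.DeEntitize(anchor.GetAttributeValue("href", ""));
        if (Uri.TryCreate(baseUri, href, out var absolute))
            links.Add(absolute.ToString());
    }
    return links;
}

// Sample markup standing in for a downloaded page.
var sample = "<html><body><a href=\"/about\">About</a>" +
             "<a href=\"https://example.org/\">External</a><p>no link</p></body></html>";

foreach (var link in ExtractLinks(sample, new Uri("https://example.com")))
    Console.WriteLine(link);
```

In a real scraper you would pass the html string returned by httpClient.GetStringAsync instead of the sample markup.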

With SOCKS5

For SOCKS5 proxies, use a library such as SocksSharp, or configure a handler that supports SOCKS5. To install SocksSharp:

dotnet add package SocksSharp
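
As an alternative to a third-party package, the sketch below relies on the built-in SOCKS support that HttpClient gained in .NET 6, so it assumes .NET 6 or later and does not use SocksSharp. The helper name CreateSocks5Proxy is hypothetical, and the host, port, and credentials are placeholders rather than real HypeProxy.io values.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Hypothetical helper: builds a WebProxy whose address uses the socks5:// scheme.
// On .NET 6 and later, HttpClient recognizes this scheme natively.
static WebProxy CreateSocks5Proxy(string host, int port, string username, string password)
{
    return new WebProxy($"socks5://{host}:{port}")
    {
        Credentials = new NetworkCredential(username, password)
    };
}

// Placeholder values: substitute your own proxy host, port, and credentials.
var socksProxy = CreateSocks5Proxy("proxy.example.invalid", 1080, "username", "password");

var socksHandler = new SocketsHttpHandler
{
    Proxy = socksProxy,
    UseProxy = true
};

// No connection is opened until the first request is sent.
using var socksClient = new HttpClient(socksHandler);
Console.WriteLine($"Client configured for {socksProxy.Address}");
```

The resulting client is used exactly like the HTTP proxy example: call GetStringAsync and pass the result to HtmlDocument.LoadHtml.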

Tips

  • Use HttpClient with a proxy handler instead of HTML Agility Pack's built-in HtmlWeb class for more control over proxy settings.
  • Rotate your IP using the HypeProxy.io API between scraping batches.
  • Implement retry logic for failed requests, since the target site may temporarily block your IP.
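
The retry tip above could be sketched as follows, using exponential backoff between attempts. This is one possible approach rather than a prescribed one: the names BackoffDelay and FetchWithRetryAsync, the 30-second cap, and the simulated failure are all assumptions made for illustration. The demo uses a fake fetch so it runs without network access.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Delay before the next try after failed attempt number `attempt` (1-based):
// baseDelay * 2^(attempt - 1), capped at maxDelay.
static TimeSpan BackoffDelay(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
{
    var factor = Math.Pow(2, attempt - 1);
    var ms = Math.Min(baseDelay.TotalMilliseconds * factor, maxDelay.TotalMilliseconds);
    return TimeSpan.FromMilliseconds(ms);
}

// Runs `fetch` up to maxAttempts times. Retries on HTTP failures and timeouts;
// the exception from the final attempt is not caught and reaches the caller.
static async Task<string> FetchWithRetryAsync(
    Func<Task<string>> fetch, int maxAttempts, TimeSpan baseDelay)
{
    for (var attempt = 1; ; attempt++)
    {
        try
        {
            return await fetch();
        }
        catch (Exception ex) when (attempt < maxAttempts &&
                                   (ex is HttpRequestException || ex is TaskCanceledException))
        {
            var delay = BackoffDelay(attempt, baseDelay, TimeSpan.FromSeconds(30));
            Console.WriteLine($"Attempt {attempt} failed ({ex.Message}); waiting {delay.TotalMilliseconds} ms");
            await Task.Delay(delay);
        }
    }
}

// Demo: a fake fetch that fails twice before succeeding.
var calls = 0;
var page = await FetchWithRetryAsync(() =>
{
    calls++;
    if (calls < 3) throw new HttpRequestException("simulated block");
    return Task.FromResult("<html><title>ok</title></html>");
}, maxAttempts: 5, baseDelay: TimeSpan.FromMilliseconds(10));

Console.WriteLine($"Succeeded after {calls} calls");
```

With a real client, the fetch argument would be something like () => httpClient.GetStringAsync(url). An IP rotation call could be placed in the catch block before the delay, but the HypeProxy.io API is not shown here.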
