LangChain

A framework for building AI applications with LLMs, supporting proxy-enabled web loading.

LangChain is a popular framework for building AI-powered applications with large language models (LLMs). Its web loaders and document retrievers can be configured with proxies for collecting data from the web.

Setting Up HypeProxy.io with LangChain

Using WebBaseLoader with Proxy

import os

from langchain_community.document_loaders import WebBaseLoader

# Set the proxy environment variables before loading. WebBaseLoader fetches
# pages with requests, which reads them. Replace username, password, and port
# with your own values.
os.environ['HTTP_PROXY'] = 'http://username:password@fr.hypeproxy.host:port'
os.environ['HTTPS_PROXY'] = 'http://username:password@fr.hypeproxy.host:port'

loader = WebBaseLoader("https://example.com")
documents = loader.load()
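
Proxy credentials that contain characters such as "@", ":" or "/" need to be percent-encoded before they are placed in the proxy URL, otherwise the URL may be parsed incorrectly. The following is a minimal sketch using only the Python standard library; the helper names build_proxy_url and set_proxy_env, the sample credentials, and the port 8080 are illustrative assumptions, not part of LangChain or HypeProxy.io.

```python
import os
from urllib.parse import quote


def build_proxy_url(username: str, password: str, host: str, port: int) -> str:
    """Return an HTTP proxy URL with percent-encoded credentials."""
    # safe="" forces characters such as "@", ":" and "/" to be encoded too.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"http://{user}:{secret}@{host}:{port}"


def set_proxy_env(proxy_url: str) -> None:
    """Export the proxy for libraries that read HTTP_PROXY / HTTPS_PROXY."""
    os.environ["HTTP_PROXY"] = proxy_url
    os.environ["HTTPS_PROXY"] = proxy_url


if __name__ == "__main__":
    # Hypothetical credentials and port, for illustration only.
    url = build_proxy_url("user", "p@ss:word", "fr.hypeproxy.host", 8080)
    set_proxy_env(url)
    print(url)
```

Calling set_proxy_env before creating the loader should have the same effect as setting the two environment variables by hand.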

Using AsyncHtmlLoader with Proxy

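from langchain_community.document_loaders import AsyncHtmlLoader

# Replace username, password, and port with your own values
proxy_url = "http://username:password@fr.hypeproxy.host:port"

urls = ["https://example.com", "https://example.org"]
# In recent langchain_community releases AsyncHtmlLoader has no `proxy`
# argument. Extra request options go through requests_kwargs, which is passed
# to the underlying aiohttp request. Check the signature of your installed version.
loader = AsyncHtmlLoader(
    urls,
    requests_kwargs={"proxy": proxy_url},
)
documents = loader.load()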

Tips

  • Use proxies when loading data from websites that enforce rate limits or geo-restrictions.
  • Mobile proxies from HypeProxy.io are well suited to scraping social media and other dynamic content.
  • Combine LangChain's text splitters with proxy-loaded content for RAG pipelines.
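
To illustrate the last tip, here is a plain-Python stand-in for what a text splitter does to loaded page text before it is embedded for retrieval. It is not LangChain's implementation: in a real pipeline you would normally use one of LangChain's own splitters, such as RecursiveCharacterTextSplitter. The function name split_text and the default sizes are assumptions chosen for the example.

```python
from typing import List


def split_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> List[str]:
    """Split text into fixed-size character chunks that overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # Each new chunk starts chunk_overlap characters before the previous one ends.
    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


if __name__ == "__main__":
    # Stand-in for the page_content of a document loaded through the proxy.
    page_text = "LangChain loads pages through a proxy. " * 40
    for chunk in split_text(page_text, chunk_size=200, chunk_overlap=20):
        print(len(chunk))
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side, at the cost of storing some text twice.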
