Description by the Publisher
The web scraper for C# allows .NET developers to create logic that extracts content from web applications and turns it into JSON, spreadsheets, C# objects or even SQL using simple C# and LINQ code.
This leaves the developer with clean, efficient web-scraping applications which are easy to understand and debug.
The C# Web Scraping Library is extremely polite, ensuring that no domain or IP address receives too many concurrent requests. It intelligently throttles on both the client and the server side, watching for excessive CPU usage and slowing to an appropriate pace. In addition, it can obey robots.txt directives, including bot-specific crawl rates and limitations. The exact URLs and content types to be scraped can be set using logical workflows and regex/wildcard rules.
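As a rough illustration of how regex/wildcard URL rules of this kind can work, the sketch below converts wildcard patterns into regular expressions and checks URLs against allow and ban lists. It uses only the standard .NET Regex class. The names UrlRules, FromWildcard and IsAllowed and the example.com URLs are hypothetical and are not taken from the library's API.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the C# Web Scraper API.
public static class UrlRules
{
    // Converts a wildcard pattern ('*' = any run of characters, '?' = one character)
    // into an anchored, case-insensitive regular expression.
    public static Regex FromWildcard(string pattern)
    {
        string body = Regex.Escape(pattern)
            .Replace("\\*", ".*")
            .Replace("\\?", ".");
        return new Regex("^" + body + "$", RegexOptions.IgnoreCase);
    }

    // A URL is allowed when it matches at least one allow rule and no ban rule.
    public static bool IsAllowed(string url, Regex[] allow, Regex[] ban)
    {
        return allow.Any(r => r.IsMatch(url)) && !ban.Any(r => r.IsMatch(url));
    }
}

public static class Program
{
    public static void Main()
    {
        // Assumed rule set: crawl the blog, skip PDFs and numeric tag pages.
        var allow = new[] { UrlRules.FromWildcard("https://example.com/blog/*") };
        var ban = new[] { UrlRules.FromWildcard("*.pdf"), new Regex(@"/tag/\d+$") };

        Console.WriteLine(UrlRules.IsAllowed("https://example.com/blog/post-1", allow, ban));    // True
        Console.WriteLine(UrlRules.IsAllowed("https://example.com/blog/guide.pdf", allow, ban)); // False
        Console.WriteLine(UrlRules.IsAllowed("https://example.com/shop/item", allow, ban));      // False
    }
}
```

Wildcard and regex rules can be mixed freely here because both end up as Regex instances. Checking ban rules after allow rules means a ban always wins, which is one reasonable design choice among several.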
Screen-scraping is made easier with identity control, which automatically manages threads, rate limits, URLs, duplicates, retries, proxies, headers and cookies as an army of virtual browsers that can mimic human behavior and even click buttons, fill in forms or log in behind security walls. This is useful for migrating legacy systems, populating enterprise search facilities and for statistical competitive analysis.
Full documentation, support and downloadable DLLs for the C# Web Scraper are available from http://ironsoftware.com/csharp/webscraper/ , in addition to links to a .NET 4.5+ NuGet package with full Azure and Mono compatibility.
Limitations in the Downloadable Version
Free C# developer license for testing and evaluation before deployment: http://ironsoftware.com/csharp/webscraper/licensing/iron-webscraper-eula-license.html
Unique Product ID: PID-F90074B71E87
Unique Publisher ID: BID-88001F048887