Monday, February 2, 2009

Offline Crawling - Resolved the issue with the WebBrowser control in C#


The WebBrowser control in C# seems to be very useful for scraping content from websites, and also for automating logins and posting content.

Offline scraping saves a lot of time. In offline scraping, we crawl the website once and download all of its web pages in that single pass. We can then scrape the required content from the downloaded pages as many times as we like, without worrying about network constraints.
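As a rough illustration of the download-once, scrape-many-times idea, here is a minimal console sketch. It is not the crawler used for this post: `SavePage` and `ExtractTitle` are hypothetical helper names, the HTML string is a stand-in for a page fetched during the one-time crawl, and the title regex is just a simple example of "scraping" a saved file.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

static class OfflineScrapeSketch
{
    // Hypothetical helper: stores a page that was fetched during the one-time crawl.
    public static void SavePage(string path, string html)
    {
        File.WriteAllText(path, html);
    }

    // Hypothetical helper: reads a saved page from disk and returns its <title> text,
    // or an empty string when the page has no title.
    public static string ExtractTitle(string path)
    {
        string html = File.ReadAllText(path);
        Match m = Regex.Match(html, @"<title>\s*(.*?)\s*</title>",
                              RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    static void Main()
    {
        // Assumption: this literal stands in for HTML downloaded from the site.
        string path = Path.Combine(Path.GetTempPath(), "page1.html");
        SavePage(path, "<html><head><title>Sample page</title></head><body>Hello</body></html>");

        // The saved file can be scraped repeatedly with no network access.
        Console.WriteLine(ExtractTitle(path));
        Console.WriteLine(ExtractTitle(path));
    }
}
```

In a real crawl the saved HTML would come from the network, for example via `WebClient.DownloadFile`, and only the extraction step would be repeated afterwards.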

However, we ran into many issues when using the WebBrowser control for offline scraping.
After searching the net, I found that many people are facing similar issues.
The code below solves the problem by writing the HTML of a saved page directly into the control.

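// HTML of the page that was saved during the crawl
string partHtmlpage = "Your html page";
webBrowser1.AllowNavigation = true;
if (webBrowser1.Document != null)
{
    // A document is already loaded: open a fresh one, replacing it in the history
    webBrowser1.Document.OpenNew(true);
}
else
{
    // Nothing loaded yet: load a blank page so there is a document to write into
    webBrowser1.Navigate("about:blank");
}
// Write the saved HTML into the control's document
webBrowser1.Document.Write(partHtmlpage);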