Crawlers | searchivarius.org

Heritrix  
HTTrack WebSite copier  
Library to scrape and clean web pages to create massive datasets.  
Norconex HTTP Collector  
Nutch - Lucene-based open source search engine
Scrapy - a screen scraping and web crawling framework
UbiCrawler