Hrefer is a search engine parser that helps webmasters collect their own databases of links.

Main Hrefer features:

- Collects proxies for parsing (HTTP, SOCKS)
- Can parse 10 main search engines: Yandex, Google, MSN, Yahoo, Altavista, Blogsearch.Google, Boardreader, Blogs.Yandex, Baidu
- Parses Google without proxies and without bans from the search engine
- Can parse several search engines at the same time
- Easy setup, with the possibility of adding new search engines
- Using parsing templates, you can easily collect databases of “all forums”, “only phpBB”, “only Invision Power Board”, “only YaBB”, “only vBulletin”, “guestbooks”, “blogs”, “WiKi”, etc.
- All collected links can be sorted by Google PageRank (PR) in 3 different ways
- Automatic tool for collecting keywords for parsing
- Can collect millions of links using an advanced search engine anti-ban system
- System for collecting “Bonus” (high-PR) free hosts for your satellites
- After installation, parsing can be started within minutes

Minimal system requirements: Windows 2003/2008/XP/Vista/Win7, 1 GHz CPU, 512 MB RAM
Here is a cracking Hrefer tutorial video covering sieve filters, proxies, footprints, and Hrefer word lists.
Hrefer opening screen
A look at the program's flexible settings:
Convert all links to index.
This function converts all harvested links to their index page (it concerns forums only) as soon as you start parsing.
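As a rough illustration of what "convert to index" means, the sketch below reduces a harvested link to its site root. This is a hypothetical reimplementation, assuming the index is simply the root URL; Hrefer's actual conversion rules for forum engines are not documented here.

```python
from urllib.parse import urlparse

def to_index(url: str) -> str:
    """Illustrative only: reduce a harvested forum link to its index page.
    Assumes the index is the site root, which real forum layouts may not match."""
    parts = urlparse(url)
    return f"{parts.scheme}://{parts.netloc}/"

print(to_index("http://example.com/forum/viewtopic.php?t=42"))
# http://example.com/
```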
Reject domains with level lower than 2.
If this option is enabled, only second-level domains (e.g. example.com) are added to the database; deeper subdomains are filtered out.
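A minimal sketch of the domain-level check, assuming "second-level" simply means two dot-separated labels. Note that this naive rule misclassifies multi-part public suffixes such as example.co.uk; how Hrefer handles those is not documented here.

```python
def is_second_level(host: str) -> bool:
    """Illustrative check: keep hosts with exactly two labels (example.com).
    Naive: does not account for multi-part TLDs like co.uk."""
    return len(host.split(".")) == 2

print(is_second_level("example.com"))        # True
print(is_second_level("forum.example.com"))  # False
```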
Check all links “200 OK” response (will work SLOWLY).
If enabled, each harvested link is checked for an HTTP 200 OK response. Because every link is requested, the link-picking process slows down considerably.
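The idea behind this check can be sketched as follows. This is a hypothetical illustration using a HEAD request; Hrefer's actual request method, timeouts, and retry behavior are not documented here.

```python
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

def returns_200(url: str, timeout: float = 5.0) -> bool:
    """Illustrative: keep a link only if it answers HTTP 200 OK.
    Uses a HEAD request to avoid downloading the page body."""
    try:
        req = Request(url, method="HEAD")
        with urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except (HTTPError, URLError, OSError):
        # Unreachable hosts and non-2xx responses both fail the check.
        return False
```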
Log found high-PR free hostings into FreeBonus.txt.
When this exclusive option is on, the program saves subdomains of free hosting services with high PageRank into the FreeBonus.txt file while parsing (the file is saved in the program's root folder).
Enable filtering of duplicated links by hostnames.
When enabled, duplicate domains are excluded from the compiled database; links are filtered by matching hostnames.
Enable filtering of duplicated links on loading links database.
If enabled, duplicate domains are removed from the database every time the program starts (this runs on every database load, which slows the program down).
By hostnames and by entire URL.
These two options control how duplicates are detected: by hostname only, or by the entire URL.
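The difference between the two modes can be sketched like this. This is an illustrative reimplementation, not Hrefer's actual code: deduplicating by hostname keeps only one link per host, while deduplicating by entire URL only removes exact repeats.

```python
from urllib.parse import urlparse

def dedupe(links, by="hostname"):
    """Illustrative dedup. by='hostname' keeps one link per host;
    by='url' removes only exact duplicate URLs."""
    seen, result = set(), []
    for url in links:
        key = urlparse(url).netloc if by == "hostname" else url
        if key not in seen:
            seen.add(key)
            result.append(url)
    return result

links = [
    "http://example.com/thread1",
    "http://example.com/thread2",
    "http://other.com/thread1",
]
print(dedupe(links, by="hostname"))  # one link per host: 2 results
print(dedupe(links, by="url"))       # all 3 survive (no exact repeats)
```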
Depth of parsing (pages).
This option limits the number of search result pages parsed for each query.
Do not use additive words.
If this option is selected, additive words are not used for parsing.
Disable filtering harvested links by Sieve-filter.
This option disables the Sieve-filter, so harvested links are saved without being filtered against the template.
This option lets you specify how the required words are “glued” together when additive words are sent to the search engines as part of a query.
When parsing multiple search engines, you can choose to set a separate query for each search engine or use the same query for all of them.
Auto-resume parsing after program start.
This option makes Hrefer begin parsing immediately after the program is started.
This option lets you preset the intervals between queries for each search engine.
Save ‘query -> URL’ into filename_query.txt. This option saves a file recording, for each harvested URL, the query keywords it was found with.
Search engine parsing – basic principles
The Words and Additive Words databases are required to parse any search engine effectively.
Additive Words. These are the footprints (identifying tags) of the websites we are querying for; this database defines the parsing structures.
Words. This database broadens the list of queries: each keyword is combined with the website footprints, making it possible to collect a far more complete database.
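The way the two databases combine can be sketched as a simple cross product: each additive word (footprint) is glued to each keyword to form a query. This is an illustrative sketch; the footprint string used here is a common example, and Hrefer's actual query templating may differ.

```python
def build_queries(additive_words, words):
    """Illustrative: glue each additive word (footprint) to each keyword,
    producing the full list of search engine queries."""
    return [f"{footprint} {word}" for footprint in additive_words for word in words]

# Hypothetical example data: one phpBB footprint, two keywords.
queries = build_queries(['"powered by phpBB"'], ["fishing", "travel"])
print(queries)
# ['"powered by phpBB" fishing', '"powered by phpBB" travel']
```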
A closer look at the Hrefer Words database tab
The “Create New!” button is used to create a new Hrefer Words database.
Creating a Words database is very simple. There are several ways to do it: