In Progress

Article extractor

Features:

The script will extract articles from [url removed, login to view]

It will extract all the articles from a given category. For example, I will input the category:

[url removed, login to view]:Vacation-Rentals

The script will then extract the 30 articles listed on that page by following their links.

It will then follow the "Next 30" link and extract the next 30 articles, and so on, until there is no "Next 30" link left, which means the category has no more articles.
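The pagination described above could be sketched roughly like this in Python, using only the standard library. The names `LinkCollector`, `find_next_link` and `iter_category_pages` are made up for illustration, `fetch` is any function you supply that returns the HTML of a URL, and the sketch assumes the "Next 30" link is an ordinary `<a>` element whose text starts with "Next 30". The real page markup is not known here and may differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs for every <a> element on a page."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_next_link(html, base_url, label="Next 30"):
    """Return the absolute URL of the first link whose text starts with
    `label`, or None when the page has no such link (assumed markup)."""
    parser = LinkCollector()
    parser.feed(html)
    for href, text in parser.links:
        if text.startswith(label):
            return urljoin(base_url, href)
    return None


def iter_category_pages(start_url, fetch, max_pages=1000):
    """Yield (url, html) for each listing page, stopping when there is no
    "Next 30" link, when a page repeats, or after max_pages pages."""
    url, seen = start_url, set()
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        html = fetch(url)
        yield url, html
        url = find_next_link(html, url)
```

The `seen` set and `max_pages` limit are only a safety net so that a page linking back to itself cannot cause an endless loop.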

Content to collect from every article:

Let's use the following article as an example:

[url removed, login to view];id=34371

The script will collect:

Title: YES

Author in the "By" field: NO

Article word count: NO

Article body: YES

AdSense ads: NO

Anything starting with http or www: NO (I don't want any URLs to be collected)

Article Source: http://EzineArticles ..... NO

Pictures: NO
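As a rough illustration of the text rules in the list above (no URLs, no "Article Source" line), here is a minimal Python sketch. It assumes the article body has already been extracted as plain text. `clean_body` and the two patterns are hypothetical names, and the patterns are a guess at what "anything starting with http or www" should cover.

```python
import re

# Any run of non-space characters that starts with http://, https:// or www.
URL_TOKEN = re.compile(r"\b(?:https?://|www\.)\S+", re.IGNORECASE)
# A line that begins with the "Article Source:" footer label.
SOURCE_LINE = re.compile(r"^\s*Article Source:", re.IGNORECASE)


def clean_body(text):
    """Drop "Article Source:" lines and remove URL-like tokens from the rest."""
    kept = []
    for line in text.splitlines():
        if SOURCE_LINE.match(line):
            continue  # skip the source footer entirely
        line = URL_TOKEN.sub("", line)
        # Collapse the gap a removed URL leaves behind.
        line = re.sub(r"[ \t]{2,}", " ", line).strip()
        kept.append(line)
    return "\n".join(kept).strip()
```

Punctuation attached directly to a URL (for example a trailing full stop) is removed together with it, which may or may not be what is wanted.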

The script will save all the articles as txt files in 2 ways:

First, separately: one txt file per article, named from 001 onward.

Second, in files of 40 articles each. So if there are 120 articles in a category, it will save them in 3 files. If there are 130 articles, it will use 4 files, and so on.
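The two output modes could look something like the following Python sketch. The function name `save_articles`, the `single` and `batches` folder names, the `batch_001.txt` naming and the blank-line separator between articles are all assumptions; only the 001 numbering and the 40-per-file grouping come from the description above.

```python
from pathlib import Path


def save_articles(articles, out_dir, batch_size=40):
    """Write each (title, body) pair to its own numbered txt file and also
    to combined files of batch_size articles. Returns the batch file paths."""
    out = Path(out_dir)
    single = out / "single"      # hypothetical folder for one-per-article files
    batches = out / "batches"    # hypothetical folder for the 40-article files
    single.mkdir(parents=True, exist_ok=True)
    batches.mkdir(parents=True, exist_ok=True)

    # Mode 1: 001.txt, 002.txt, ...
    for i, (title, body) in enumerate(articles, start=1):
        (single / f"{i:03d}.txt").write_text(f"{title}\n\n{body}\n", encoding="utf-8")

    # Mode 2: slices of batch_size articles; the last file may hold fewer.
    batch_files = []
    for n, start in enumerate(range(0, len(articles), batch_size), start=1):
        chunk = articles[start:start + batch_size]
        path = batches / f"batch_{n:03d}.txt"
        text = "\n\n".join(f"{title}\n\n{body}" for title, body in chunk)
        path.write_text(text + "\n", encoding="utf-8")
        batch_files.append(path)
    return batch_files
```

With 120 articles this produces 3 batch files, and with 130 it produces 4, the last one holding the remaining 10 articles.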

The script should be tested extensively to show that EzineArticles does not ban the IP.

I have a dedicated server available, so if your solution is not feasible to run on my PC, we will host it there.
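One common way to lower the risk of an IP ban is to space requests out with a randomised pause. The Python sketch below shows that idea only. The class name `PoliteThrottle` and the delay values are assumptions, and no particular delay is known to be safe for this site, so the real limits would still have to be found by testing.

```python
import random
import time


class PoliteThrottle:
    """Enforces a minimum, slightly randomised gap between requests."""

    def __init__(self, min_delay=5.0, jitter=3.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._min_delay = min_delay   # seconds that must pass between requests
        self._jitter = jitter         # extra random seconds, 0..jitter
        self._clock = clock           # injectable so the class can be tested
        self._sleep = sleep
        self._next_allowed = float("-inf")  # first request never waits

    def wait(self):
        """Block until the next request is allowed, then schedule the one after."""
        remaining = self._next_allowed - self._clock()
        if remaining > 0:
            self._sleep(remaining)
        self._next_allowed = (self._clock() + self._min_delay
                              + random.uniform(0, self._jitter))
```

Calling `wait()` immediately before each page download keeps downloads at least `min_delay` seconds apart.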

Skills: C Programming, Perl, PHP, Visual Basic


About the Employer:
( 33 reviews ) Albatera, Spain

Project ID: #138842