Cancelled

Finish Perl Script

Hi,

I have a Perl script that uses a spider to crawl certain websites, gather information from them, and insert that information into a database. The script is almost done; it needs only a few small finishing touches, which are outlined below.

Remember, this script is about 85-90% complete. We just need someone to put the finishing touches on it.

The remaining tasks are:

1) Grabbing the category we are searching and inserting it into the database.

We search through a Perl admin section where we first select the site, then enter the category (for example, books) and then the location. When the script gathers the results from the site, we want it to also insert the category we entered into the corresponding row in the database.

2) Removing the special characters from some of the words/cities.

Some of the cities/words are being pulled in with special characters. For example, a city such as Montreal may come through spelled Montre@l. We need these characters not to show up and the actual letters to be used instead.
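As a minimal sketch of one way item 2 could be approached: assuming the stray characters come from accented letters being read with the wrong encoding (for example "Montréal"), the scraped bytes can be decoded, split into base letters plus accents, and reduced to plain ASCII. The helper name clean_name and the UTF-8 default are assumptions for illustration, not part of the existing script. If a source site literally contains a character such as "@" in place of a letter, this would not recover the letter.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Unicode::Normalize qw(NFD);

# Hypothetical helper: convert scraped bytes to plain ASCII text.
# Assumption: the page is UTF-8 unless another charset is passed in.
sub clean_name {
    my ($raw_bytes, $charset) = @_;
    $charset //= 'UTF-8';
    my $text = decode($charset, $raw_bytes);  # bytes -> Perl characters
    $text = NFD($text);                       # split accented letters into letter + combining mark
    $text =~ s/\p{Mn}+//g;                    # drop the combining marks (the accents)
    $text =~ s/[^\x20-\x7E]//g;               # drop any remaining non-ASCII or control characters
    return $text;
}
```

Calling clean_name on each city or word just before the database insert would keep the stored values consistent; a site served as Latin-1 would pass 'ISO-8859-1' as the second argument.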

3) Fixing certain sites that only search one state/province.

For a few sites, the scraper only searches one state or province, even if the city we enter isn't in that state or province. We need this fixed.

4) If the website address ends in .[url removed, login to view] or [url removed, login to view] etc., then it should not be inserted into the database.

We also check whether the listing has a website, and if it does, the address is inserted into the database. The problem is that some listings have generic yellow pages addresses that are actually online advertisements, because the business doesn't have a website of its own. We don't want these inserted into the database; we only want actual websites inserted.
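The filter described in item 4 could be sketched roughly as follows. The blocked domains shown here are placeholders, since the real ones are not visible in this posting, and the function name is_real_website is hypothetical. The check compares the host part of the address against a blocklist and rejects the listed domains and any of their subdomains.

```perl
use strict;
use warnings;

# Placeholder blocklist: replace with the real directory/advertisement domains.
my @BLOCKED_DOMAINS = ('yellowpages.example', 'yp.example');

# Hypothetical helper: returns 1 for an address worth storing, 0 otherwise.
sub is_real_website {
    my ($url) = @_;
    return 0 unless defined $url && length $url;

    # Pull out the host, tolerating a missing scheme and optional user info.
    my ($host) = $url =~ m{^(?:[a-z][a-z0-9+.-]*://)?(?:[^/@]*@)?([^/:?#]+)}i;
    return 0 unless defined $host;
    $host = lc $host;

    for my $domain (@BLOCKED_DOMAINS) {
        # Reject the blocked domain itself and any subdomain of it.
        return 0 if $host eq $domain || $host =~ /\.\Q$domain\E$/;
    }
    return 1;
}
```

The spider would then insert the website field only when is_real_website returns 1. Matching on the host rather than the whole address avoids rejecting a genuine site that merely mentions a directory name in its path.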

5) We have a PHP programmer building a browser-based admin section in PHP. One of the problems we run into is that some of the addresses don't have a postal/zip code. We want to set it up so that when we select certain addresses (in the PHP admin section) to download, the script runs another spider against the sites below, grabs the zip/postal codes, and inserts them into the database rows corresponding to the addresses we are about to download.

[url removed, login to view]

[url removed, login to view]

[url removed, login to view];pageId=pcaf_pc_search&gear=postcode

[url removed, login to view]

6) Scraping [url removed, login to view]

If possible, we would like the site above to be scraped as well.

If you have any questions or need any clarification, please PM me. Thanks.

--Anthony

Skills: Perl

About the Employer:
( 23 reviews ) Bedford, Canada

Project ID: #176850

1 freelancer is bidding an average of $55 for this job

gangabass

I can do this.

$55 USD in 3 days
(147 Reviews)
5.9