In Progress

Website-Crawler and information extraction

Hi,

I need a crawler script that reads a list of URLs from a website and crawls each of the linked websites. The information I need from those websites is:

- Salutation: Mr/Ms (Herr/Frau)

- Forename

- Name

- Position

- Name of organization

- Street

- Address

- Phone and fax

- Email

- Website

- link title leading to that website

The crawler should look for this information first under "contact" and then under "disclaimer". It is also possible that the crawler lands on an intro page, which it has to skip.

If there are several data records on one webpage, they should be saved in the same line.

The output must be a CSV or Excel file.
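
A minimal sketch of the "several records in the same line" requirement, written in Python purely for illustration (the posting lists PHP as the skill). The column names in `FIELDS`, the function name `write_rows`, and the semicolon delimiter are assumptions, not part of the brief. Each page becomes one CSV row, and additional contacts on the same page extend that row to the right.

```python
import csv
import io

# Hypothetical column names derived from the field list in the posting.
FIELDS = ["salutation", "forename", "name", "position", "organization",
          "street", "address", "phone", "fax", "email", "website", "link_title"]


def write_rows(pages, out):
    """Write one CSV row per page; extra contacts extend the row to the right."""
    widest = max((len(records) for records in pages), default=0)
    header = [f"{field}_{i + 1}" for i in range(widest) for field in FIELDS]
    writer = csv.writer(out, delimiter=";")  # assumed delimiter
    writer.writerow(header)
    for records in pages:
        row = []
        for record in records:
            row.extend(record.get(field, "") for field in FIELDS)
        # Pad shorter rows so every line has the same number of columns.
        row.extend([""] * (len(header) - len(row)))
        writer.writerow(row)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_rows([[{"name": "Wagner"}, {"name": "Muster"}], [{"name": "Beispiel"}]], buffer)
    print(buffer.getvalue())
```

Padding every row to the widest page keeps the file rectangular, which spreadsheet programs generally handle better than ragged rows.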

Because these websites are in German, the corresponding words are:

contact -> Kontakt

disclaimer -> Impressum
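
The lookup order (Kontakt first, then Impressum) could be sketched as follows. This is an illustrative Python sketch using only the standard library, not a finished crawler; the names `LinkCollector` and `pick_contact_links` are hypothetical, and matching the keyword in either the link text or the href is an assumption. Skipping intro pages is not handled here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Keywords to look for, in priority order (from the posting).
PRIORITY = ("kontakt", "impressum")


class LinkCollector(HTMLParser):
    """Collects (href, visible text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def pick_contact_links(html, base_url):
    """Return absolute URLs of Kontakt links first, then Impressum links."""
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for keyword in PRIORITY:
        for href, text in collector.links:
            if keyword in text.lower() or keyword in href.lower():
                url = urljoin(base_url, href)
                if url not in found:
                    found.append(url)
    return found
```

The caller would fetch the returned URLs in order and stop at the first page that yields a usable record.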

Furthermore, the crawler should recognize whether there is a position description for the contact person, for example "Stadtwehrführer", "Kommandant" or "Stadtwehrleiter". A suffix such as "*leiter" or "*führer" indicates a position.
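
One possible way to express this rule is a regular expression, sketched here in Python for illustration. The pattern and the name `find_position` are assumptions; the optional feminine ending "in" is an addition not mentioned in the posting, and ordinary words that happen to end in "leiter" (such as "Begleiter") would also match, so results may need a manual check.

```python
import re

# "Kommandant" plus any word ending in "leiter" or "führer",
# each with an optional feminine "in" ending (assumed extension).
POSITION_RE = re.compile(r"\b(Kommandant(?:in)?|\w+(?:leiter|führer)(?:in)?)\b")


def find_position(text):
    """Return the first word that looks like a position title, or None."""
    match = POSITION_RE.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(find_position("Herr Uwe Wagner\nStadtwehrleiter"))
```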

The crawler should also recognize the name of the organization. "Feuerwehr" is the indicator.
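
A simple reading of this rule is "take the first line that mentions Feuerwehr". The Python sketch below illustrates that assumption; `find_organization` is a hypothetical name, and the input is assumed to be the plain text of the page with one item per line.

```python
def find_organization(text):
    """Return the first line that mentions 'Feuerwehr' (case-insensitive), or None."""
    for line in text.splitlines():
        if "feuerwehr" in line.lower():
            return line.strip()
    return None


if __name__ == "__main__":
    print(find_organization("Verantwortlich:\nFeuerwehr Bitterfeld-Wolfen\nOrtsteil Wolfen"))
```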

An example:

________________________________________

Verantwortlich:

Feuerwehr Bitterfeld-Wolfen (Name of organization)

PD-Chemiemark Areal A, Geb. 046

Ortsteil Wolfen

06766 Bitterfeld-Wolfen (postal code + city; German postal codes have 5 digits)

Vertreten durch:

Herr Uwe Wagner (Mister Forename Name)

Stadtwehrleiter (Position)

Kontakt:

Telefon+49 (0) 03494 6660564 (phone number)

E-Mail: abcd(at)[url removed, login to view] (needs an intelligent scan; a correct email address is the most important thing)

____________________________________________________
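
Two parts of the example lend themselves to a short sketch: undoing the "(at)" obfuscation in the email address, and picking out the five-digit postal code with its city. The Python code below is illustrative only. The list of obfuscation spellings, the function names, and the placeholder domain `example.org` used in the demo are assumptions, since the real domain is not shown in the posting.

```python
import re

# Assumed spellings of obfuscated "@" and "." on German pages.
AT_RE = re.compile(r"\s*(?:\(at\)|\[at\]|\{at\})\s*", re.IGNORECASE)
DOT_RE = re.compile(r"\s*(?:\(dot\)|\[dot\]|\(punkt\)|\[punkt\])\s*", re.IGNORECASE)
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# A line that starts with exactly five digits, followed by the city name.
PLZ_CITY_RE = re.compile(r"^[ \t]*(\d{5})[ \t]+(\S.*?)[ \t]*$", re.MULTILINE)


def find_emails(text):
    """Replace obfuscated separators, then return all email-like strings."""
    cleaned = DOT_RE.sub(".", AT_RE.sub("@", text))
    return EMAIL_RE.findall(cleaned)


def find_postal_code_and_city(text):
    """Return (postal_code, city) from the first matching line, or None."""
    match = PLZ_CITY_RE.search(text)
    return (match.group(1), match.group(2)) if match else None


if __name__ == "__main__":
    sample = "06766 Bitterfeld-Wolfen\nE-Mail: abcd(at)example.org"
    print(find_emails(sample))
    print(find_postal_code_and_city(sample))
```

The email pattern is deliberately loose. Because a correct address matters most here, a stricter validation step could be added after extraction.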

The list of links can be found here:

[url removed, login to view]

Besides a CSV file with all the extracted data, I also need the script itself so that I can modify and tune it a little afterwards.

All phone and fax numbers must be in the same format!
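
The posting does not say which format is wanted, so the Python sketch below assumes one: "+49" followed by digits only, with no spaces. The function name `normalize_phone` and the handling of "(0)", "00" prefixes, and leading labels such as "Telefon" are likewise assumptions made for illustration.

```python
import re


def normalize_phone(raw, country_code="49"):
    """Normalize a phone or fax number to '+<country code><digits>' (assumed format)."""
    # Drop the optional trunk-prefix hint "(0)" and any leading label text.
    text = re.sub(r"\(\s*0\s*\)", "", raw)
    text = re.sub(r"^[^\d+]*", "", text)
    has_plus = text.startswith("+")
    digits = re.sub(r"\D", "", text)
    if not digits:
        return None
    if has_plus:
        international = digits
    elif digits.startswith("00"):
        international = digits[2:]
    else:
        international = None
    if international is None:
        national = digits  # national notation such as "03494 ..."
    elif international.startswith(country_code):
        national = international[len(country_code):]
    else:
        return "+" + international  # foreign number: keep its own country code
    # Remove the national trunk zero(s) before prepending the country code.
    return "+" + country_code + national.lstrip("0")


if __name__ == "__main__":
    print(normalize_phone("Telefon+49 (0) 03494 6660564"))
```

With this approach the three common spellings of the same number (national, "+49", and "0049") all collapse to one string, which also makes duplicates easy to spot.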

If you need further information, don't hesitate to contact me by email.

Best regards

Sebastian

Skills: PHP

About the Employer:
(1 review) Köln, Germany

Project ID: #586442

Awarded to:

techsolsoftwares

I can develop this crawler.

$100 USD in 7 days
(60 reviews)
5.8

19 freelancers are bidding an average of $174 for this job

phpexp

Please check the PMB.

$199 USD in 4 days
(186 reviews)
8.0
SigmaVisual

We can help in your project, please check PMB to see our related experience.

$250 USD in 4 days
(248 reviews)
7.9
sureshdevi

I can do this work. Thanks, Suresh

$150 USD in 3 days
(707 reviews)
7.5
srinichal

I can deliver the script with Perl Mechanize.

$145 USD in 3 days
(101 reviews)
7.1
ASYanush

Good day Sebastian! We can do this job for you. Our team has good experience in software development. The task is clear to us, so we can start on it.

$80 USD in 3 days
(11 reviews)
6.4
webmagics

Hi, We can do [url removed, login to view] see PMB for details. Thanks

$300 USD in 5 days
(47 reviews)
6.2
alexander2007

Please check PM. Thanks.

$160 USD in 4 days
(20 reviews)
5.8
Zuprem

I am interested in your project.

$200 USD in 5 days
(53 reviews)
5.5
silv3rm00n

can do [url removed, login to view]

$200 USD in 2 days
(25 reviews)
5.4
psxl88

I am ready for it.

$130 USD in 3 days
(52 reviews)
5.3
subratparida

My name is Subrat Parida. I'm the owner of CADS Inc (Computer Applications & Database Solutions Inc), located in Bhubaneswar, India. I have 8 years of software development experience in the USA and 2 years in ...

$250 USD in 10 days
(9 reviews)
4.7
cberescu

I have extensive experience with online crawlers. It can be done easily.

$100 USD in 2 days
(2 reviews)
3.3
kymar1n

Hello, we've studied your requirements. The project you described is quite simple in functionality. We have more than 5 years of experience in similar projects. We are interested in the work and we would like to star...

$99 USD in 4 days
(3 reviews)
2.4
ccticommx

We can do it

$200 USD in 20 days
(0 reviews)
0.0
clearth

Please see your inbox.

$250 USD in 5 days
(0 reviews)
0.0
rajtuhin8

Please check the PMB.

$250 USD in 3 days
(0 reviews)
0.0
CrystaltechEsol

Please open the PMB.

$146 USD in 2 days
(0 reviews)
0.0
kalion

I can do it in Ruby.

$100 USD in 3 days
(0 reviews)
0.0