In Progress

Unique Crawler Script Needed

I need a crawler script that will crawl 6 different website addresses I will provide. Each website contains a long, well-organized list of product names, divided into alphabetical groups by manufacturer. The script should then copy the entire list from each site into 6 different CSV files...

Read all the details at the end of this project description (see below).

## Deliverables

Project details:

I need you to run a script that will do the following:

1) **Crawl**: crawl 6 different website addresses I will provide. Each website contains a long, well-organized list of product names, divided into alphabetical groups by manufacturer. The script should copy the entire list from each site into 6 different CSV files and create **4 columns in each**: **one for the product name** (e.g. "Xerox monitor B5 G617Q"), another for the **manufacturer** it belongs to (e.g. "Xerox"), another for the **category** it belongs to (e.g. "Printers"), and a final column for the name of the **source website** the product name came from. The product name, manufacturer, and category are already given by each of the sites.

2) **Compare lists and find commons**: After the script has copied the 6 lists from the 6 sites, I need it to search and analyze all 6 lists and create the following 2 new lists from them:

a) **Commons**: whenever the script finds that a product name appears in more than 1 list, it will copy it to a new CSV file containing only product names that appear in more than 1 list. (Reason: I want to see which names appear on more than 1 site and are not unique to it.)

b) **Unique**: names that appear in only 1 list will be copied to a second new list containing only unique names.
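As a rough illustration of the two lists described above, here is a minimal Python sketch. It assumes each crawled row is a dict with the four columns from step 1; the field names (`product_name`, `manufacturer`, `category`, `source_site`), the function names, and the `key` parameter are all hypothetical and not part of the brief. Names are compared by exact string by default; a looser matching rule can be passed in as `key`.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names for the four columns requested in step 1.
FIELDS = ["product_name", "manufacturer", "category", "source_site"]


def split_common_unique(rows, key=lambda name: name):
    """Split rows into (common, unique) by how many sites list each name.

    `key` maps a product name to the value used for comparison; the
    default compares names exactly as written.
    """
    rows = list(rows)  # allow any iterable, since we pass over it twice
    sites_by_key = defaultdict(set)
    for row in rows:
        sites_by_key[key(row["product_name"])].add(row["source_site"])

    common, unique = [], []
    for row in rows:
        seen_on = sites_by_key[key(row["product_name"])]
        # A name listed twice on the same site still counts as one site.
        (common if len(seen_on) > 1 else unique).append(row)
    return common, unique


def to_csv_text(rows):
    """Render rows as CSV text with a header line for the four columns."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

In this sketch every row of a shared name is kept in the commons list, so the source website column still shows where each copy came from. Writing `to_csv_text(common)` and `to_csv_text(unique)` to two files would give the two requested CSVs.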

Important notes: You should run this script anonymously and in a way that looks natural, from a few different IPs if possible.

The script will ignore commas, hyphens, etc. when deciding whether 2 names from different lists should be considered the same. E.g. **HP Business Inkjet 1100dtn** and **HP BUSINESS INKJET 1100-dtn** will be considered the same. I am sure there will be other examples; I'll check and give you more.
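One possible reading of this matching rule, sketched in Python: reduce each name to a comparison key by lowercasing it and dropping everything that is not a letter or digit. The function name `normalize_name` is hypothetical, and stripping spaces as well as punctuation is an assumption made so that "1100dtn" and "1100-dtn" end up identical.

```python
import re


def normalize_name(name):
    """Return a comparison key: case-folded, letters and digits only."""
    # [\W_]+ matches runs of anything that is not a letter or digit
    # (punctuation, whitespace, underscores), including non-ASCII text.
    return re.sub(r"[\W_]+", "", name.casefold())


a = normalize_name("HP Business Inkjet 1100dtn")
b = normalize_name("HP BUSINESS INKJET 1100-dtn")
print(a == b)
```

The key is only for comparison; the original spelling would still be what goes into the CSV. A rule this aggressive can also merge names that differ only by spacing, so the further examples mentioned above would be worth checking against it.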

The manufacturer name should always be the first word of the product name. If it is not, the script should put it there.
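A minimal Python sketch of this rule, under a few assumptions that the brief does not state: the check is case-insensitive, a manufacturer name may span more than one word, and extra whitespace in the product name is collapsed. The function name `ensure_manufacturer_first` is hypothetical.

```python
def ensure_manufacturer_first(product_name, manufacturer):
    """Prefix the manufacturer unless the name already starts with it."""
    name = " ".join(product_name.split())  # collapse runs of whitespace
    maker = " ".join(manufacturer.split())
    if not maker:
        return name

    maker_words = [w.casefold() for w in maker.split(" ")]
    leading = [w.casefold() for w in name.split(" ")[: len(maker_words)]]
    if leading == maker_words:
        return name  # already leads with the manufacturer
    return f"{maker} {name}".strip()


print(ensure_manufacturer_first("Business Inkjet 1100dtn", "HP"))
```

This sketch only prepends the name; it does not remove a manufacturer that appears later in the product name, which would need its own rule.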

Skills: Engineering, Project Management, Script Install, Shell Script, Software Architecture, Software Testing


About the Employer:
(74 reviews) Israel

Project ID: #3009885

Awarded to:

rotfor

See private message.

$42.5 USD in 24 days
(62 reviews)
5.6

18 freelancers are bidding an average of $250 for this job

MuktoSoftware

See private message.

$153 USD in 24 days
(406 reviews)
7.3
schandraram

See private message.

$459 USD in 24 days
(137 reviews)
6.8
tzo

See private message.

$170 USD in 24 days
(246 reviews)
6.4
quickprogexpert

See private message.

$280.5 USD in 24 days
(142 reviews)
6.3
temp

See private message.

$255 USD in 24 days
(29 reviews)
6.0
webspiderinc

See private message.

$255 USD in 24 days
(42 reviews)
5.2
brainwithstorm

See private message.

$297.5 USD in 24 days
(25 reviews)
4.6
shajijohnc

See private message.

$85 USD in 24 days
(48 reviews)
4.6
waisol

See private message.

$170 USD in 24 days
(12 reviews)
4.6
ramanujana

See private message.

$50.15 USD in 24 days
(34 reviews)
4.4
powzak

See private message.

$127.5 USD in 24 days
(25 reviews)
4.1
quaintek

See private message.

$297.5 USD in 24 days
(16 reviews)
4.0
onlinechamp

See private message.

$849.15 USD in 24 days
(12 reviews)
3.9
kseen

See private message.

$127.5 USD in 24 days
(31 reviews)
3.8
superphp

See private message.

$225 USD in 24 days
(10 reviews)
3.2
guardianpressinc

See private message.

$382.5 USD in 24 days
(3 reviews)
0.0
76east

See private message.

$272 USD in 24 days
(0 reviews)
0.0