Data collection scripts (repost)

This job requires creating scripts to gather data from the internet to be entered into a database. There are several PDF files available on the internet from different companies with information about products that I need in a database.

1. Write a script that automatically goes to particular web sites I will specify (a set of 10 companies), automatically downloads the PDF files, and automatically converts them to XML files.
2. Write an XSLT for each different XML file format in order to extract the required fields into a tab-delimited text file. There will be at least one XSLT for each company, as each company has a different file format, most with multiple products listed in each. One product should be listed on one line in the text file with the following fields (with specified field types):

   | Field | Type / unit |
   |---|---|
   | URL of PDF file | - |
   | Manufacturer | - |
   | Model | Part # |
   | Fan Type | Axial, Blower, etc. |
   | Width | mm |
   | Height | mm |
   | Thickness | mm |
   | Maximum Airflow | CFM |
   | Maximum Pressure | In Water |
   | Noise | dB-A |
   | Speed | R.P.M. |
   | Rated Voltage | VDC |
   | Operating Voltage | VDC |
   | Rated Current | Amp |
   | Rated Power | Watt |

3. Write a script to run the XSLT on the files in order to produce the text files.
4. Write a script to enter text files into a database.

Once this project is complete, I have other related tasks for you I will post as new projects. When a job is completed well, I give very high ratings and am happy to provide more work for you.
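As a rough illustration of the two ends of this pipeline, here is a minimal Python sketch of PDF link discovery (part of step 1) and of loading the tab-delimited output into a database (step 4). It is only a sketch under stated assumptions: the posting does not name a database, so SQLite is used here; the table name `fans`, the column names, and the helper names `find_pdf_links`, `to_number`, and `load_tsv` are all hypothetical. Downloading, PDF-to-XML conversion, and the XSLT runs (steps 1 to 3) depend on external tools and are not shown.

```python
import csv
import io
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical column names, derived from the field list in the posting.
COLUMNS = [
    "pdf_url", "manufacturer", "model", "fan_type",
    "width_mm", "height_mm", "thickness_mm",
    "max_airflow_cfm", "max_pressure_in_water", "noise_dba", "speed_rpm",
    "rated_voltage_vdc", "operating_voltage_vdc",
    "rated_current_amp", "rated_power_watt",
]
TEXT_COLUMNS = 4  # assumption: the first four fields are text, the rest numeric


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a> links whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string when checking the file extension.
        if href and href.lower().split("?")[0].endswith(".pdf"):
            self.links.append(urljoin(self.base_url, href))


def find_pdf_links(html, base_url):
    """Return the PDF links found in one page of HTML."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


def to_number(value):
    """Return a float, or None when the cell is empty or not a plain number."""
    try:
        return float(value)
    except ValueError:
        return None


def load_tsv(conn, tsv_text):
    """Insert tab-delimited product rows into the 'fans' table; return the row count."""
    text_cols = ", ".join(f"{c} TEXT" for c in COLUMNS[:TEXT_COLUMNS])
    num_cols = ", ".join(f"{c} REAL" for c in COLUMNS[TEXT_COLUMNS:])
    conn.execute(f"CREATE TABLE IF NOT EXISTS fans ({text_cols}, {num_cols})")
    placeholders = ", ".join("?" for _ in COLUMNS)
    count = 0
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != len(COLUMNS):
            continue  # skip malformed lines rather than guess at missing fields
        values = row[:TEXT_COLUMNS] + [to_number(v) for v in row[TEXT_COLUMNS:]]
        conn.execute(f"INSERT INTO fans VALUES ({placeholders})", values)
        count += 1
    conn.commit()
    return count
```

A caller would pass each company's product page through `find_pdf_links`, and later pass each generated text file's contents to `load_tsv` with an open `sqlite3` connection. Cells that are not plain numbers (a voltage range, for example) are stored as NULL in this sketch; a real loader would need a rule for them agreed with the buyer.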

## Deliverables

The first deliverable I want is a fully functional set of scripts for a *single* web site. After I receive this and am able to run it, then I would like you to continue and create the XSLT files for the additional web sites and modify the script to support multiple web sites.

1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.

2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):

a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.

b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.

3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).

## Platform

Windows XP with Mozilla Firefox and Internet Explorer

Skills: PHP

About the Employer:
(7 reviews) United States

Project ID: #2995408

3 freelancers are bidding an average of $160 for this job


See private message.

$55.25 USD in 7 days
(124 reviews)

See private message.

$84.15 USD in 7 days
(3 reviews)

See private message.

$340 USD in 7 days
(1 review)