In Progress

Simple HTML parsing project in PHP

The following is an outlined use case to follow when running the program against one of our applications.

Use case: "Screen scraping" means scraping data from files. You grab the HTML file and then parse it using regular expressions to retrieve the information you're looking for. You scrape data when you don't have access to the database and would like to gather data from a website.

For example, suppose we wanted to scrape data from the HBBH contacts database. Your program needs to:

1. Auto-login: [url removed, login to view] username: Leanne password: mulli0n
2. Go to [url removed, login to view], go through all the records, and view each record to retrieve its information.
3. [url removed, login to view] (Your script should pull in the HTML and then parse it to grab name, status, address, etc.: all data from the records. We need to store the data somewhere, so for now we have decided on a comma-separated file that we can load into any database.)

NOTE: Your script doesn't really need to go to the listing page. You can write a loop that goes through all 50000 records and just hits the following URLs: [url removed, login to view] [url removed, login to view] [url removed, login to view] That is what we figured is the easiest way, though a given record might not always exist, so checks need to be in place for, say, a 400 error.

NOTE: This should be usable for any site we choose, though for testing purposes you can run it against the HBBH site. We want the flexibility to run this script against any site without too much modification. I know that the parsing rules will depend on the website, so we expect some level of abstraction at that level.
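The workflow described above (log in, loop over record IDs, skip records that do not exist, parse with regular expressions, write CSV) could be sketched in PHP roughly as follows. This is a minimal sketch under stated assumptions, not a finished implementation: the function names (`parseRecord`, `makeCurlFetcher`, `scrapeToCsv`), the `example.invalid` URLs, the login field names, and the regex patterns are all hypothetical, since the real URLs and page markup are not given here. The site-specific part is kept in the `$patterns` array so that only it needs to change per site.

```php
<?php
// Sketch only: every URL, field name, and regex below is a hypothetical placeholder.

// Site-specific part: field name => regex with exactly one capture group.
$patterns = [
    'name'    => '/<td id="name">(.*?)<\/td>/s',
    'status'  => '/<td id="status">(.*?)<\/td>/s',
    'address' => '/<td id="address">(.*?)<\/td>/s',
];

/** Apply each field regex to the page; fields that do not match become ''. */
function parseRecord(string $html, array $patterns): array
{
    $row = [];
    foreach ($patterns as $field => $regex) {
        $found = preg_match($regex, $html, $m) === 1;
        $row[$field] = $found ? trim(html_entity_decode(strip_tags($m[1]))) : '';
    }
    return $row;
}

/**
 * Build a fetcher that posts the login form once, then returns page HTML,
 * or null for any response that is not HTTP 200 (missing record, error, ...).
 * Note: this sketch does not verify that the login itself succeeded.
 */
function makeCurlFetcher(string $loginUrl, array $loginFields): callable
{
    $ch  = curl_init();
    $jar = tempnam(sys_get_temp_dir(), 'cookies');
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEJAR      => $jar,   // keep the session cookie after login
        CURLOPT_COOKIEFILE     => $jar,
        CURLOPT_URL            => $loginUrl,
        CURLOPT_POSTFIELDS     => http_build_query($loginFields),
    ]);
    curl_exec($ch);

    return function (string $url) use ($ch): ?string {
        curl_setopt_array($ch, [CURLOPT_HTTPGET => true, CURLOPT_URL => $url]);
        $html = curl_exec($ch);
        $ok   = $html !== false && curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200;
        return $ok ? $html : null;
    };
}

/** Loop over record IDs, skip missing records, write one CSV row per record found. */
function scrapeToCsv(callable $fetch, string $urlTemplate, int $firstId, int $lastId,
                     array $patterns, string $csvPath): int
{
    $out = fopen($csvPath, 'w');
    fputcsv($out, array_merge(['id'], array_keys($patterns)), ',', '"', '\\');
    $written = 0;
    for ($id = $firstId; $id <= $lastId; $id++) {
        $html = $fetch(sprintf($urlTemplate, $id));
        if ($html === null) {
            continue; // record does not exist or the request failed
        }
        $row = array_merge([$id], array_values(parseRecord($html, $patterns)));
        fputcsv($out, $row, ',', '"', '\\');
        $written++;
    }
    fclose($out);
    return $written;
}

// Hypothetical usage (placeholder URLs and form field names):
// $fetch = makeCurlFetcher('https://example.invalid/login',
//                          ['username' => 'USER', 'password' => 'PASS']);
// scrapeToCsv($fetch, 'https://example.invalid/record?id=%d', 1, 50000, $patterns, 'records.csv');
```

Passing the fetcher in as a callable is a design choice made for this sketch: it keeps the network code separate from the loop and the parser, so the parsing rules can be tried against saved HTML files before running against a live site.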

## Deliverables

1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.

2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):

a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment: Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.

b) For all others, including desktop software or software the Buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.

3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).

## Platform

PHP/MySQL

Skills: Adobe Flash, Amazon Web Services, CSS, Engineering, MySQL, Perl, PHP, Ruby on Rails, Software Architecture, Software Testing, XML


About the Employer:
(374 reviews) Canada

Project ID: #3041430

Awarded to:

damirmarkovic

See private message.

$102 USD in 3 days
(71 reviews)
5.4

2 freelancers are bidding on average $125 for this job

digiwebsol

See private message.

$148.75 USD in 3 days
(9 reviews)
3.3