In Progress

Web scrap project

script language: PHP

front end: HTML/JavaScript

database: MySQL


Table Structure:


id (auto insert)

createddate (auto insert)

modifieddate (auto update)

Table A (entities)

name varchar

state varchar (2 letter US state code)

type varchar (city, county, school district, university, college)

url varchar

Table B (url)

datasource (url or query where the data came from)

url varchar


maxlayers (defaults to 2)

statusname (values would include "found match", "no match")

Table C (curl data)

url varchar

retrievedhtml largetext

match varchar

statusname (values would include "undefined" (default), "good", "bad", "review")
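The table structure above could be written as MySQL DDL roughly as follows. This is only a sketch, held in a PHP function so it can be run through PDO. The table names (entities, url, curl_data), the column lengths, the unique keys, and the googleposition column (taken from the 2nd script's description) are assumptions and not part of the posted spec. It also assumes MySQL 5.7 or later with InnoDB. The match column is quoted with backticks because MATCH is a reserved word in MySQL.

```php
<?php
// Sketch only: table names, lengths and unique keys are assumptions.
function schemaStatements(): array
{
    // Table A: one row per city, county, school district, university or college.
    $entities = <<<'SQL'
CREATE TABLE IF NOT EXISTS entities (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  createddate TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  modifieddate TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  name VARCHAR(255) NOT NULL,
  state VARCHAR(2) NULL,
  type VARCHAR(32) NOT NULL,
  url VARCHAR(512) NULL,
  UNIQUE KEY uq_entity (name, state, type)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
SQL;

    // Table B: candidate urls found for each entity.
    $url = <<<'SQL'
CREATE TABLE IF NOT EXISTS url (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  createddate TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  modifieddate TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  datasource VARCHAR(512) NULL,
  url VARCHAR(512) NOT NULL,
  maxlayers TINYINT UNSIGNED NOT NULL DEFAULT 2,
  statusname VARCHAR(32) NULL,
  googleposition TINYINT UNSIGNED NULL,
  UNIQUE KEY uq_url (url)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
SQL;

    // Table C: pages retrieved with curl that matched a keyword.
    $curlData = <<<'SQL'
CREATE TABLE IF NOT EXISTS curl_data (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  createddate TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  modifieddate TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  url VARCHAR(512) NOT NULL,
  retrievedhtml LONGTEXT NULL,
  `match` VARCHAR(64) NULL,
  statusname VARCHAR(32) NOT NULL DEFAULT 'undefined',
  UNIQUE KEY uq_curl_url (url)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
SQL;

    return [$entities, $url, $curlData];
}
```

With a PDO connection in $pdo, the statements could be applied with: foreach (schemaStatements() as $sql) { $pdo->exec($sql); }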

The 1st script will:

1 Parse the LEA_NAME column for unique "school district" names from here - [login to view URL], & get the state, school district name, & url. 25,000 results.

2 Parse the "county names" from here - [login to view URL], get the state name, & convert it to the 2 letter code. 3,098 records.

3 Parse the "city names" from here - [login to view URL], & grab the city name & usps (state). 29,000 records.

4 Parse the "US college/university" list from here - [login to view URL], & grab the college/university name & url. 2,073 records.

5 Populate table A with the name, type, & state code (2 letter), skipping duplicates. Convert any state name to its 2 letter code.
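Step 5 (state name conversion and duplicate skipping) might look something like the sketch below. The function names stateToCode and dedupeEntities and the row layout (name, state, type) are hypothetical, and treating name + state + type as the duplicate key is an assumption. The map covers the 50 states plus the District of Columbia.

```php
<?php
// Lowercase state name => 2 letter USPS code (50 states plus DC).
const US_STATE_CODES = [
    'alabama' => 'AL', 'alaska' => 'AK', 'arizona' => 'AZ', 'arkansas' => 'AR',
    'california' => 'CA', 'colorado' => 'CO', 'connecticut' => 'CT', 'delaware' => 'DE',
    'district of columbia' => 'DC', 'florida' => 'FL', 'georgia' => 'GA', 'hawaii' => 'HI',
    'idaho' => 'ID', 'illinois' => 'IL', 'indiana' => 'IN', 'iowa' => 'IA',
    'kansas' => 'KS', 'kentucky' => 'KY', 'louisiana' => 'LA', 'maine' => 'ME',
    'maryland' => 'MD', 'massachusetts' => 'MA', 'michigan' => 'MI', 'minnesota' => 'MN',
    'mississippi' => 'MS', 'missouri' => 'MO', 'montana' => 'MT', 'nebraska' => 'NE',
    'nevada' => 'NV', 'new hampshire' => 'NH', 'new jersey' => 'NJ', 'new mexico' => 'NM',
    'new york' => 'NY', 'north carolina' => 'NC', 'north dakota' => 'ND', 'ohio' => 'OH',
    'oklahoma' => 'OK', 'oregon' => 'OR', 'pennsylvania' => 'PA', 'rhode island' => 'RI',
    'south carolina' => 'SC', 'south dakota' => 'SD', 'tennessee' => 'TN', 'texas' => 'TX',
    'utah' => 'UT', 'vermont' => 'VT', 'virginia' => 'VA', 'washington' => 'WA',
    'west virginia' => 'WV', 'wisconsin' => 'WI', 'wyoming' => 'WY',
];

// Accepts a full state name or an existing 2 letter code; returns null if unknown.
function stateToCode(string $state): ?string
{
    $s = strtolower(trim($state));
    if (strlen($s) === 2) {
        $code = strtoupper($s);
        return in_array($code, US_STATE_CODES, true) ? $code : null;
    }
    return US_STATE_CODES[$s] ?? null;
}

// Keeps the first row seen for each (name, state code, type) combination.
function dedupeEntities(array $rows): array
{
    $unique = [];
    foreach ($rows as $row) {
        $row['state'] = stateToCode((string) ($row['state'] ?? ''));
        $key = strtolower(trim($row['name'])) . '|' . $row['state'] . '|' . $row['type'];
        if (!isset($unique[$key])) {
            $unique[$key] = $row;
        }
    }
    return array_values($unique);
}
```

A unique key in table A combined with INSERT IGNORE would be another way to skip duplicates. Doing it in PHP first simply keeps the number of insert statements down.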

The 1st script will be a one time script, run from linux cli.

The 2nd script will:

1 Loop through table A, & attempt to find the matching url with a google search, if one was not present in the datasource. The logic must skip certain false positives, such as a domain containing the word "weather", "census", "zillow", or "google", or a url containing ".jpg" or ".asp"

2 populate the record in table B, with :

datasource = the url of the data source above

url = url (skip duplicate)

statusname = null

googleposition = 1-20 (first page of google results only)
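The false positive rule from step 1 could be sketched as a small filter like the one below. The function name isFalsePositive and the two constant names are hypothetical. Checking the words against the host and the extensions against the path is one reading of the spec, and rejecting urls without a parseable host is an extra assumption. Note that a plain substring check for ".asp" also rejects ".aspx" pages.

```php
<?php
// Words that disqualify a result when they appear in the domain.
const BLOCKED_DOMAIN_WORDS = ['weather', 'census', 'zillow', 'google'];
// Substrings that disqualify a result when they appear in the url path.
const BLOCKED_EXTENSIONS = ['.jpg', '.asp'];

function isFalsePositive(string $url): bool
{
    $host = strtolower((string) parse_url($url, PHP_URL_HOST));
    $path = strtolower((string) parse_url($url, PHP_URL_PATH));

    // Assumption: a url without a usable host is not worth storing.
    if ($host === '') {
        return true;
    }
    foreach (BLOCKED_DOMAIN_WORDS as $word) {
        if (strpos($host, $word) !== false) {
            return true;
        }
    }
    foreach (BLOCKED_EXTENSIONS as $ext) {
        if (strpos($path, $ext) !== false) {
            return true;
        }
    }
    return false;
}
```

Keeping the exclusions in two arrays means a rerun after new exclusions are added only needs the arrays edited.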

The 2nd script will return 35,000 - 200,000 results.

The 2nd script will run periodically from the linux cli, on a crontab, & will be rerun in the future when additional exemptions are added.

The 2nd script should be multi threaded, & should cap out above a 100mb/second connection

The 3rd php script will:

1 Loop through table B, use curl to retrieve the web page

2 Loop through each of the child pages, for the value in the maxlayers column

3 Look for a particular pattern of text, including a case insensitive search for "bids", "request for proposal", "rfp", "rfq", "request for bids", "proposals"

4 Compare the curl returned html against the keywords

if there is a match - insert a record into table C with the url (skip non unique urls), the retrieved html, & the keyword that caused the match, & update table B statusname to "found match"

if there is no match - increment table B maxlayers by 1, & update the statusname to "no match"

Each record from table B may have multiple records in table C
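The keyword comparison in steps 3 and 4 could be sketched as below. The function name findKeyword is hypothetical. Replacing tags with spaces before matching, and requiring a word boundary only at the start of each keyword (so "RFPs" still matches "rfp" while "forbids" does not match "bids"), are assumptions about what counts as a match. The longer phrases are listed first so that they are tried before the shorter keywords at any given position.

```php
<?php
// Case insensitive pattern for the keywords listed in step 3.
const KEYWORD_PATTERN =
    '/\b(request\s+for\s+proposal|request\s+for\s+bids|proposals|bids|rfp|rfq)/i';

// Returns the first keyword found in the page (lowercased), or null if none.
function findKeyword(string $html): ?string
{
    // Replace tags with spaces so text from adjacent elements is not glued together.
    $text = preg_replace('/<[^>]*>/', ' ', $html);
    if ($text === null || preg_match(KEYWORD_PATTERN, $text, $m) !== 1) {
        return null;
    }
    // Collapse any run of whitespace inside a matched phrase to one space.
    return strtolower(preg_replace('/\s+/', ' ', $m[1]));
}
```

The returned value is what would go into the match column of table C. A null result corresponds to the "no match" branch.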

The 3rd script will be run periodically from the linux cli

The 3rd script should be multi threaded, & should cap out above a 100mb/second connection
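A default PHP build has no threads, so one common substitute for the multi threaded requirement is to run many transfers concurrently with the curl_multi functions. The sketch below takes that approach rather than true threading. The function name fetchAll, the batch size of 20, and the 15 second timeout are assumptions, and it needs the curl extension.

```php
<?php
// Fetches urls in concurrent batches; returns url => body, or null on failure.
function fetchAll(array $urls, int $concurrency = 20, int $timeout = 15): array
{
    $results = [];
    foreach (array_chunk($urls, max(1, $concurrency)) as $batch) {
        $multi = curl_multi_init();
        $handles = [];
        foreach ($batch as $url) {
            $ch = curl_init($url);
            curl_setopt_array($ch, [
                CURLOPT_RETURNTRANSFER => true,
                CURLOPT_FOLLOWLOCATION => true,
                CURLOPT_TIMEOUT => $timeout,
            ]);
            curl_multi_add_handle($multi, $ch);
            $handles[$url] = $ch;
        }

        // Drive all transfers in this batch until none are still running.
        do {
            $status = curl_multi_exec($multi, $running);
            if ($running) {
                curl_multi_select($multi, 1.0);
            }
        } while ($running && $status === CURLM_OK);

        foreach ($handles as $url => $ch) {
            $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
            $body = curl_multi_getcontent($ch);
            // Treat anything other than a 2xx response as a failed fetch.
            $results[$url] = ($code >= 200 && $code < 300) ? $body : null;
            curl_multi_remove_handle($multi, $ch);
        }
        curl_multi_close($multi);
    }
    return $results;
}
```

Raising the concurrency argument is the main knob for using more of the available bandwidth. Running several copies of the script on separate slices of table B would be another option.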

The 4th script will be ANDROID PHONE FRIENDLY:

1 Define an sql query which should return the top 10 rows from table C, sorted by modifieddate ASC, WHERE statusname != "bad" & statusname != "good"

2 Provide a simple html table view front end to review each url, with columns for all values from table B.

3 An additional column will show an update status button which, when pressed, shows the statusname values from table C as buttons; pressing one of them updates the record in table C

The 4th script view is intended for a quality check employee to review all results, & log whether each url matches our ultimate criteria or not.
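The review query and the status update behind the buttons might be sketched as follows. This assumes the filter in step 1 is meant to apply to table C's statusname column (table C has no type column in the structure above). The table name curl_data and the function names are hypothetical.

```php
<?php
// Status values listed for table C.
const REVIEW_STATUSES = ['undefined', 'good', 'bad', 'review'];

// The 10 oldest rows in table C that have not been marked good or bad.
function reviewQueueSql(): string
{
    return "SELECT id, url, `match`, statusname, modifieddate
            FROM curl_data
            WHERE statusname NOT IN ('good', 'bad')
            ORDER BY modifieddate ASC
            LIMIT 10";
}

function isReviewStatus(string $status): bool
{
    return in_array($status, REVIEW_STATUSES, true);
}

// Called when a reviewer presses one of the status buttons.
function updateStatus(PDO $pdo, int $id, string $status): bool
{
    if (!isReviewStatus($status)) {
        return false; // ignore anything that is not a known status
    }
    $stmt = $pdo->prepare('UPDATE curl_data SET statusname = ? WHERE id = ?');
    return $stmt->execute([$status, $id]);
}
```

Validating the status against a fixed list and using a prepared statement keeps the button handler safe against tampered form values.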

Skills: MySQL, PHP, Web Scraping


About the Employer:
(15 reviews) Kennesaw, United States

Project ID: #18227988

Awarded to:


I have FULL CONFIDENCE of completing this script WITHOUT USING ANY FRAMEWORK and I am ready to start immediately. I will only use functions/classes that are either provided by core PHP or provided by extensions that...

$277 USD in 6 days
(1 review)

19 freelancers are bidding an average of $514 for this job


Hello, Dear How are you? I have checked your project description and am ready to discuss the project with you now. I am experienced in PHP, Web Scraping, and MySQL. I will work very hard and best for y...

$500 USD in 10 days
(69 reviews)
$525 USD in 12 days
(46 reviews)

Hi. I am very interested in your project, because I have much experience in such projects. I have good skills with programming languages including C/C++, C#, java, php, asp.net, python, VB.NET. So I have expert and s...

$555 USD in 10 days
(105 reviews)
$350 USD in 10 days
(73 reviews)

Hello Sir, I am the expert freelancer here. I am in the 6th position throughout the world for delivering quality work. I have delivered more than 400 projects here with 100% client satisfaction. I have more than 5...

$600 USD in 10 days
(54 reviews)

[login to view URL] I read your project description carefully and I'm very interested in your project. But I have some questions about your project. If you have enough time to discuss your project with me, please contact me. An...

$555 USD in 10 days
(20 reviews)

Hi Nice to meet you. I'm a scraping expert. My past works: Youtube comment scraping, Real estate property list to csv, Job-site content to csv, and scraping posts from facebook, twitter, instagram using scrapy. In add...

$500 USD in 10 days
(59 reviews)

Hi, I can help you write scripts that parse the pages and save the data in the database based on the conditions and rules that you describe in the project description. I've read the description carefully that we need to wr...

$1000 USD in 10 days
(47 reviews)

Hey? How are you? I have reviewed "Web scrap project". I have good skills for these (MySQL, PHP, Web Scraping). I have been working for 7 yrs in this scope. While we contract and work in our jobs, I will get paid o...

$500 USD in 10 days
(65 reviews)

I can scrape any data you require. Please contact me and we can discuss getting started, I'm eager to begin working for you.

$333 USD in 10 days
(5 reviews)

Hi, dear! I am quite interested in your project - 'Web scrap project'. :) I am a skillful software developer who has rich experience in this field. If you contact me, you and I will be happy. Thank you in advance. Skil...

$555 USD in 3 days
(4 reviews)

Dear employer, My name is Xungxiao, I am an experienced web developer and web scraping expert. I am a new freelancer here, but I have good experience in web scraping using PHP, Python, Java and so on. I read yo...

$333 USD in 10 days
(12 reviews)

How are you today? I am a super expert in this area. If you contact me, I can show you my past work too. Please contact me. Thank you.

$555 USD in 10 days
(1 review)

Hi, **Being an expert graphic designer and full stack developer, I am applying to your requirement. I have 8+ years of work experience with website development and design**. **Please have a look at my recent w...

$555 USD in 10 days
(4 reviews)

Hi there I have seen your requirement and am ready to work with you right now. I am an expert in Web Design and Development like PHP, Core PHP, Laravel, Wordpress development. Android development: JDK, NDK, Android Studio, fireb...

$400 USD in 15 days
(2 reviews)

Hello, I have read your requirements that you need a new website. We are a Team of 25 Dedicated Professional developers with 6+ years of experience who are very good at designing and Building [login to view URL] have work...

$500 USD in 10 days
(3 reviews)
$555 USD in 10 days
(0 reviews)

"Hi, Hope you are doing well! Thanks for sharing your project requirement with us. It will be our great pleasure to work on your project. I have checked your requirement, yes we can do it, because we already work on si...

$616 USD in 7 days
(0 reviews)