I need a script that functions exactly as follows:
1. On a daily basis it shall crawl 3 of [url removed, login to view] main sections (for example: jobs, personals, housing) for all cities and countries, collect the email addresses from the posts, and automatically send each collected address an email with links to my website and some text. The content is the same for all recipients, but the subject should be RE: “actual post’s title”. Each day it shall perform the same action but crawl only new posts, so that the same poster is not emailed twice for the same ad.
2. It shall create a database of all collected email addresses that are not in the @[url removed, login to view] format. Each email shall include an opt-out option that lets recipients remove their address from the database. An opted-out address shall then be skipped automatically and not emailed, even if the crawler finds new posts with that address. (I will provide the copy for the email, and you handle the rest of its creation. It will be plain HTML with a logo.)
3. It shall not use, count, or store help/support/info/etc. addresses at @[url removed, login to view].
4. It shall generate a simple daily activity report and send it to my personal email every day. The report shall cover: emails found, each with its corresponding post title; messages sent; messages delivered successfully; delivery success rate in %; opt-outs; visits to my website resulting from the emails; signups resulting from the emails; the number of visitors per exit page (the page of my site from which they left); and the average number of pages viewed per visitor. An email delivery success rate below 95% will not meet the project specification.
5. I will tell you later which sub-sections of the 3 main sections I want crawled, but expect around 6.
6. Language: PHP/MySQL. The script should not be hosted on my website, but I should receive a backup copy for my records. It should be hosted on a server where this kind of operation is permitted, and you should provide that hosting. I do not mind if you use pre-existing scripts as a base and further develop or adapt them, as long as these project specifications are met. Of course, I must be legally entitled to use the script.
7. Most of the email addresses it finds and sends to will be in the job-#########@[url removed, login to view] and gigs-#######@[url removed, login to view] formats, so you should find a way to avoid having emails blocked or not delivered by the craigslist system, which, if I am not mistaken (you would have to double-check, but a craigslist expert should already know), does not allow more than X emails to be sent to @[url removed, login to view] addresses from the same sender’s email address.
8. Payment: 100% escrow released after I test your final work for 3 days.
9. Bids we will not consider: bids above budget, placeholder bids, bids that require different payment terms, and bids for services that do not completely fulfil this project’s requirements.
10. I should also be able to restrict the crawl by keyword: either exclude pages where certain keywords are mentioned, or crawl only pages where certain keywords are mentioned.
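
For bidders, three of the rules above can be pinned down as a rough sketch: the opt-out and help/support/info skipping from items 2 and 3, the 95% delivery threshold from item 4, and the keyword filter from item 10. It is written in Python purely to illustrate the logic; the deliverable itself is PHP/MySQL as stated in item 6. All function names, the role-prefix list, and the case-insensitive substring matching are assumptions of this sketch, not part of the specification.

```python
# Illustrative sketch only; every name below is hypothetical.

# Assumed role prefixes. Item 3 lists help/support/info "etc.", so the real
# list would need to be agreed on. For simplicity the check ignores the domain.
ROLE_LOCAL_PARTS = {"help", "support", "info"}


def is_suppressed(address, opted_out):
    """Return True if the address must never be stored or emailed.

    `opted_out` is assumed to be a set of lowercase addresses that used
    the opt-out link.
    """
    normalized = address.strip().lower()
    local, sep, domain = normalized.rpartition("@")
    if not sep or not local or not domain:
        return True  # malformed address: treat as unusable
    if normalized in opted_out:
        return True  # recipient opted out earlier
    return local in ROLE_LOCAL_PARTS


def passes_keyword_filter(text, include=(), exclude=()):
    """Apply the item 10 rule with case-insensitive substring matching.

    A page is rejected if it mentions any `exclude` keyword. If `include`
    is non-empty, the page must mention at least one `include` keyword.
    """
    lowered = text.lower()
    if any(word.lower() in lowered for word in exclude):
        return False
    if include:
        return any(word.lower() in lowered for word in include)
    return True


def delivery_success_rate(sent, delivered):
    """Return the delivery success rate in percent, or None if nothing was sent."""
    if sent == 0:
        return None
    return 100 * delivered / sent


def meets_delivery_spec(sent, delivered, threshold=95.0):
    """Item 4: a rate below 95% does not meet the specification."""
    rate = delivery_success_rate(sent, delivered)
    return rate is not None and rate >= threshold
```

For example, under these assumptions `meets_delivery_spec(200, 190)` is true (exactly 95%), while `meets_delivery_spec(200, 189)` is false. Whether keyword matching should be by substring or by whole word is left open here and would need to be confirmed.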