This project is re-posted because many providers said it should be listed as a small project rather than a medium one.
I am looking for someone to web-scrape a regulation from a U.S. Government website and cut, paste, and format the text, tables, and figures into MS Word.
The regulation to be scraped is Title 29 Code of Federal Regulations Part 1910, Occupational Safety and Health Standards (29 CFR 1910).
The regulation is located at:
[url removed, login to view];p_toc_level=1&p_keyvalue=1910
I want the entire standard (regulation) put into MS Word. The standard begins with 1910 - Table of Contents and ends with 1910.1450 App B - References (Non-Mandatory).
The formatting details will be finalized after the bidding ends, but they will include the text font, page setup, which text is bolded, the table format, the placement of the figures, and hyperlinks in the table of contents.
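For bidders estimating the work, here is a minimal sketch of how the text, tables, and figures on one page of the regulation might be separated before pasting into Word. It uses only Python's standard `html.parser` module. The class name `RegulationPageParser` and the sample HTML are hypothetical, the real pages may be structured differently, and nested tables are not handled.

```python
from html.parser import HTMLParser


class RegulationPageParser(HTMLParser):
    """Hypothetical helper: collects free text, table rows, and figure URLs."""

    def __init__(self):
        super().__init__()
        self.text_blocks = []  # free text found outside table cells
        self.tables = []       # each table: list of rows; each row: list of cell strings
        self.figures = []      # src values of <img> tags
        self._in_cell = False
        self._cell_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self.tables[-1].append([])
        elif tag in ("td", "th") and self.tables and self.tables[-1]:
            self._in_cell = True
            self._cell_parts = []
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.figures.append(src)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            # Close the current cell and store its text in the current row.
            self.tables[-1][-1].append(" ".join(self._cell_parts))
            self._in_cell = False

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if not text:
            return
        if self._in_cell:
            self._cell_parts.append(text)
        else:
            self.text_blocks.append(text)


# Made-up sample markup, not taken from the actual regulation pages.
sample = """
<h3>Sample heading</h3>
<p>Sample paragraph text.</p>
<table><tr><th>Item</th><th>Value</th></tr><tr><td>A</td><td>1</td></tr></table>
<img src="figure1.gif">
"""

parser = RegulationPageParser()
parser.feed(sample)
parser.close()
```

After running this, `parser.text_blocks`, `parser.tables`, and `parser.figures` hold the three kinds of content separately, so each can be formatted in Word according to its own rules.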
This project must be completed within 10 days of accepting the bid.
I estimate that the final Word document will be around 600-800 pages or more. Please look at the regulation before making a bid.
After the successful completion of this project, the service provider will be granted another project for another regulation.
Remember: Government regulations are not copyrighted, so you are not breaking any laws.
General Formatting Guidelines:
All fonts will be Verdana.
No text is in cells. Just free text.
Title Text is Verdana (12pt)
Section numbers are (10pt), royal blue, underlined
Terms are (10pt), black, bold
All table lines must be even.
See "Sample Format for OSHA Regulations.docx" for an example.
Here is the .doc file as well.
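The formatting guidelines above could be captured in a small lookup table so they are applied consistently across several hundred pages. The sketch below is only an illustration: the names `Style`, `STYLES`, and `split_section_number` are hypothetical, and the section-number pattern (for example `1910.1450(a)(1)`) is an assumption about how numbers appear in the source text. Anything the guidelines do not state, such as the title color, is left as `None`.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Style:
    font: str
    size_pt: int
    color: Optional[str] = None  # None means "not specified in the guidelines"
    bold: bool = False
    underline: bool = False


# Taken from the guidelines: all fonts are Verdana.
STYLES = {
    "title": Style("Verdana", 12),
    "section_number": Style("Verdana", 10, color="royal blue", underline=True),
    "term": Style("Verdana", 10, color="black", bold=True),
}

# Assumed pattern: "1910.<digits>" followed by optional "(x)" groups.
SECTION_RE = re.compile(r"^(1910\.\d+(?:\([A-Za-z0-9]+\))*)\s*(.*)$")


def split_section_number(line: str) -> Tuple[Optional[str], str]:
    """Split a leading section number from the rest of the line, if present."""
    match = SECTION_RE.match(line)
    if match:
        return match.group(1), match.group(2)
    return None, line


number, rest = split_section_number("1910.1450(a)(1) Scope")
```

Here `number` would receive the `section_number` style and `rest` the surrounding text style. Lines that do not begin with a section number come back unchanged with `None` as the first element.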