I need someone to write a script or program so I can extract specific data from HTML files.
E.g. suppose I have raw HTML code like this:
<p> John Brown, 1 Anytown Road, Anytown </p>
<p> DOB: 01/01/1980 </p>
<p> Kevin Jones, 23 Somewhere Street, Somewhere </p>
<p> DOB: 02/02/1973 </p>
So I only want to extract ALL the data between EACH <p> and </p>.
So the output will be:
John Brown, 1 Anytown Road, Anytown
DOB: 01/01/1980
Kevin Jones, 23 Somewhere Street, Somewhere
DOB: 02/02/1973
Note: there must be a <p> followed by a </p>, and we want what's in between EACH <p> and </p> pair.
In the example above there are 4 separate pairings of <p> & </p> to return data between!
Each line is treated individually, so we find the first <p>, then look for the first </p> after it, and return the data between them.
Then move on to the next one:
find the next <p> and then the next </p>, return the data between them, and repeat until no more can be found.
Note: if we find a <p> but there is NO </p> after it, then STOP.
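To make the rules above concrete, here is a minimal Python sketch of the scan. It is only an illustration under some assumptions of mine: the function name `extract_paragraphs` is made up, tags are matched as the literal strings `<p>` and `</p>` (no attributes, lowercase only), and surrounding whitespace is trimmed from each result.

```python
def extract_paragraphs(html):
    """Return the text between each <p> ... </p> pair, in document order.

    Assumptions (not from the job description): tags are the literal
    strings "<p>" and "</p>", and whitespace around each result is trimmed.
    """
    open_tag, close_tag = "<p>", "</p>"
    results = []
    pos = 0
    while True:
        # Find the next opening tag at or after the current position.
        start = html.find(open_tag, pos)
        if start == -1:
            break  # no more <p> tags
        start += len(open_tag)
        # Find the first closing tag after that opening tag.
        end = html.find(close_tag, start)
        if end == -1:
            break  # a <p> with NO </p> after it: STOP
        results.append(html[start:end].strip())
        # Continue scanning after this closing tag.
        pos = end + len(close_tag)
    return results


sample = """<p> John Brown, 1 Anytown Road, Anytown </p>
<p> DOB: 01/01/1980 </p>
<p> Kevin Jones, 23 Somewhere Street, Somewhere </p>
<p> DOB: 02/02/1973 </p>"""

for line in extract_paragraphs(sample):
    print(line)
```

For HTML that is less regular than the sample (attributes on the tags, nested tags, uppercase `<P>`), a real HTML parser would probably be a safer choice than plain string searching.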
74 freelancers are bidding an average of $80 for this job
Hi, I'm interested in your project and would like to develop a great scraper for you according to your requirements. Could you please tell me which programming language you prefer, PHP or Python? Thanks in advance.