Closed

RSS Parsing via XSLT

Our business is building a system to parse RSS feeds.

RSS is supposed to conform to a "standard", but we have found that there are at least four versions of RSS (0.91, 0.92, 1.0, and 2.0) and countless permutations.

RSS feeds typically do not follow the standards. One RSS feed may have elements from the 0.91 and 0.92 versions in it. This makes parsing difficult.
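To make the version problem concrete, here is a minimal sketch of how a feed's declared version might be detected before any transformation. It is written in Python purely for illustration (the project itself calls for XSLT and Perl), and the function name `detect_rss_version` is hypothetical. It assumes that RSS 0.91, 0.92, and 2.0 feeds use an `<rss>` root with a `version` attribute, while RDF-based feeds use an `<rdf:RDF>` root.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def detect_rss_version(xml_text):
    """Return a best-guess version label for a feed, or "unknown"."""
    root = ET.fromstring(xml_text)
    # RDF-based feeds have an <rdf:RDF> root instead of <rss>.
    # Assumption: we label all of them "1.0", although the older 0.90
    # format uses the same root element.
    if root.tag == "{%s}RDF" % RDF_NS:
        return "1.0"
    # RSS 0.91, 0.92 and 2.0 declare their version on the <rss> root.
    if root.tag == "rss":
        return root.get("version", "unknown")
    return "unknown"


sample = '<rss version="0.91"><channel><title>Example</title></channel></rss>'
print(detect_rss_version(sample))  # prints: 0.91
```

The declared version is only a hint: as noted above, a feed labelled 0.91 may still contain 0.92 elements, so the transformation probably should not branch too heavily on it.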

After doing some research, we concluded that XSLT is the way to go. We would like to use an XSLT document to transform an RSS feed into an XML document that we can then parse for information.

The problem is that nobody on our team has the experience needed to design this XSLT document.

That's where you come in!

## Deliverables

1) One well-formed XSLT document that transforms all versions of well-formed, validated RSS.

2) The output of the XSLT document must be a properly formed XML document containing the specified nodes (see deliverable number 4). Each node will contain the corresponding value from the RSS feed.

3) The XSLT document must make an attempt to handle malformed RSS. This is the most difficult part, but a reasonable attempt must be made to "guess" the values needed if the RSS is not properly formed. Ideally, the XSLT document will be able to successfully transform at least 95% of the RSS feeds it encounters. If the RSS is unusable, the resulting XML document must say so.
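One point worth keeping in mind for this deliverable: an XSLT processor can only run on input that is well-formed XML, so a feed that fails to parse has to be flagged by the calling code rather than by the stylesheet. The sketch below illustrates that fallback in Python (for illustration only; the real pipeline would be Perl). The function name `transform_or_flag`, the `<feed>` wrapper, the `status` attribute, and the `<note>` element are all hypothetical, not part of this specification.

```python
import xml.etree.ElementTree as ET


def transform_or_flag(xml_text):
    """Return an output <feed> element, flagged when the input cannot be parsed."""
    out = ET.Element("feed")
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Unusable input: still emit a document, but record why it failed.
        out.set("status", "unusable")
        ET.SubElement(out, "note").text = "Input is not well-formed XML: %s" % err
        return out
    out.set("status", "ok")
    # Only one field is copied here to keep the sketch short.
    title = root.find("./channel/title")
    has_text = title is not None and title.text
    ET.SubElement(out, "title").text = title.text if has_text else "NA"
    return out


broken = transform_or_flag("<rss><channel><title>Broken")
print(broken.get("status"))  # prints: unusable
```

Feeds that are well-formed but structurally odd (misplaced or misspelled elements) are a different case, and those are the ones the stylesheet itself can try to "guess" at.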

4) The fields below are a list of what information is needed for each RSS feed. The XML that is output from the XSLT document will be based on these fields. The structure of the output XML will use the fields in *italics* as the main nodes:

**RSS Feed Information**:

<*title*> Captured from: <title>

<*link*> Captured from: <link>

<*description*> Captured from: <description>

<*copyright*> Captured from: <copyright>, <dc:rights>

<*image*> Captured from: <image> and its sub-elements <title>, <url>, <link>, <width>, <height>, <description>

<*lastBuildDate*> Captured from: <lastBuildDate>, <dc:date>

**RSS Feed Items**:

<*link*> Captured from: <link>

<*title*> Captured from: <title>

<*description*> Captured from: <description>

<*content*> Captured from: <content:encoded>

<*date*> Captured from: <pubDate>, <dc:date>

<*creator*> Captured from: <dc:creator>

<*comments*> Captured from: <comments>

<*subject*> Captured from: <dc:subject>

5) Not all values listed above will be available for each RSS feed. If the value is not available, the node must be included, but its value will be "NA".
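The item field list and the "NA" rule can be sketched as a small lookup table with ordered fallbacks. This is a Python illustration of the intended logic, not the XSLT deliverable; the names `ITEM_FIELDS`, `first_value`, and `normalize_item` are hypothetical. Two assumptions go beyond the text above: core element names are compared case-insensitively (so `<pubdate>` is accepted for `<pubDate>`), and core elements are accepted in any non-module namespace, because RSS 1.0 places them in its own namespace.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
CONTENT = "http://purl.org/rss/1.0/modules/content/"

# Output node -> candidate source elements, in priority order.
# Each candidate is (namespace, local name); None marks a core RSS element.
ITEM_FIELDS = {
    "link": [(None, "link")],
    "title": [(None, "title")],
    "description": [(None, "description")],
    "content": [(CONTENT, "encoded")],
    "date": [(None, "pubDate"), (DC, "date")],
    "creator": [(DC, "creator")],
    "comments": [(None, "comments")],
    "subject": [(DC, "subject")],
}


def split_tag(tag):
    """Split ElementTree's '{ns}local' tag form into (ns, local)."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return None, tag


def first_value(item, candidates):
    """Return the first non-empty candidate value, or "NA" if none is found."""
    for want_ns, want_local in candidates:
        for child in item:
            ns, local = split_tag(child.tag)
            if want_ns is None:
                # Core element: any non-module namespace, case-insensitive name.
                matches = ns not in (DC, CONTENT) and local.lower() == want_local.lower()
            else:
                matches = ns == want_ns and local == want_local
            text = (child.text or "").strip()
            if matches and text:
                return text
    return "NA"


def normalize_item(item):
    """Map one <item> element to a dict holding every required output node."""
    return {name: first_value(item, cands) for name, cands in ITEM_FIELDS.items()}


sample_item = ET.fromstring(
    '<item xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<title>Hello</title><link>http://example.com/1</link>"
    "<dc:date>2008-01-01</dc:date></item>"
)
row = normalize_item(sample_item)
print(row["date"], row["creator"])  # prints: 2008-01-01 NA
```

The same table-driven approach would apply to the feed-level fields. In the stylesheet itself, an `xsl:choose` with one `xsl:when` per candidate and an `xsl:otherwise` emitting "NA" would be one way to express the same priority order.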

6) Perl will be used for the processing. More specifically, the XSLT document must work with XML::RSS::Tools. This should not be an issue, but it is a requirement.

## Platform

Because XSLT is platform independent, there is no OS compatibility requirement. See deliverable number 6 for environment requirements.

Skills: Engineering, Linux, MySQL, Perl, PHP, Software Architecture, Software Testing, Web Hosting, Website Management, Website Testing, XML, XSLT

About the Employer:
(0 reviews) United States

Project ID: #3015140

5 freelancers are bidding on average $282 for this job

| Freelancer | Bid | Delivery | Reviews | Rating | Proposal |
| --- | --- | --- | --- | --- | --- |
| fishersoftware | $850 USD | 7 days | 11 | 4.9 | See private message. |
| deletethisplease | $425 USD | 7 days | 33 | 4.7 | See private message. |
| cchrysostom | $42.5 USD | 7 days | 1 | 1.3 | See private message. |
| rakisg | $68 USD | 7 days | 0 | 0.0 | See private message. |
| binarypixelvw | $25.5 USD | 7 days | 0 | 0.0 | See private message. |