Our business is building a system to parse RSS feeds.
RSS is supposed to conform to a "standard", but we have found that there are at least four versions of RSS (two earlier versions, 1.0, and 2.0) and countless permutations.
RSS feeds typically do not follow the standards. A single RSS feed may mix elements from more than one version. This makes parsing difficult.
After doing some research, we have concluded that XSLT is the way to go. We would like to use an XSLT document to transform an RSS feed into an XML document that we can then parse for information.
The problem is that nobody on our team has the experience needed to design that XSLT document.
That's where you come in!
1) One well-formed XSLT document that transforms all versions of well-formed, valid RSS.
2) The output of the XSLT document must be a properly formed XML document containing the specified nodes (see deliverable number 4). Each node will contain the corresponding value from the RSS feed.
3) The XSLT document must make an attempt to handle malformed RSS. This is the most difficult part, but a reasonable attempt must be made to "guess" the values needed if the RSS is not properly formed. Ideally, the XSLT document will successfully transform at least 95% of the RSS feeds it encounters. If the RSS is unusable, the resulting XML document must say so.
4) The fields below are a list of what information is needed for each RSS feed. The XML that is output from the XSLT document will be based on these fields. The structure of the output XML will use the fields in *italics* as the main nodes:
**RSS Feed Information**:
<*copyright* >Captured from:
**RSS Feed Items**:
<*date* >Captured from:
5) Not all values listed above will be available for each RSS feed. If the value is not available, the node must be included, but its value will be "NA".
6) Perl will be used for the processing. More specifically, the XSLT document must work with XML::RSS::Tools. This should not be an issue, but it is a requirement.
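To make deliverables 3 and 5 concrete, here is a rough sketch of the behavior we expect. It is written in Python only because it is short to read; it is not the deliverable, and the real logic must live in the XSLT document. The output element names (`feed`, `item`), the `status="unusable"` marker, and the candidate source elements for `copyright` and `date` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed source elements for each output field, tried in order (local names).
FEED_FIELDS = {"copyright": ["copyright", "rights"]}  # RSS 2.0, then Dublin Core
ITEM_FIELDS = {"date": ["pubDate", "date"]}           # RSS 2.0, then Dublin Core


def local_name(tag):
    """Drop a '{namespace}' prefix so namespaced and plain elements compare equal."""
    return tag.rsplit("}", 1)[-1]


def first_text(parent, candidates):
    """Return the first non-empty matching child's text, or 'NA' if there is none."""
    for name in candidates:
        for child in parent:
            if local_name(child.tag) == name and (child.text or "").strip():
                return child.text.strip()
    return "NA"


def add_fields(target, source, fields):
    """Always emit every field node, even when its value is unavailable."""
    for field, candidates in fields.items():
        ET.SubElement(target, field).text = first_text(source, candidates)


def normalize(rss_text):
    """Build the normalized output document from raw RSS text."""
    out = ET.Element("feed")
    try:
        root = ET.fromstring(rss_text)
    except ET.ParseError:
        out.set("status", "unusable")  # hypothetical way to "make note"
        return out
    elements = list(root.iter())
    channel = next((e for e in elements if local_name(e.tag) == "channel"), None)
    if channel is None:
        out.set("status", "unusable")
        return out
    add_fields(out, channel, FEED_FIELDS)
    # Items are children of <channel> in RSS 2.0 but siblings of it in RSS 1.0,
    # so search the whole document rather than one fixed location.
    for entry in elements:
        if local_name(entry.tag) == "item":
            add_fields(ET.SubElement(out, "item"), entry, ITEM_FIELDS)
    return out
```

Calling `ET.tostring(normalize(text), encoding="unicode")` on a feed returns the normalized XML as a string. Matching on local names rather than fixed paths is one possible way to cope with feeds that mix versions; the XSLT document may take a different approach as long as the output is the same.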
Because XSLT is platform independent, there is no need for OS compatibility. See deliverable number 6 for environment requirements.