Atom or RSS1 generation valid? [solved]


[EDIT:]Please jump to my post #7, as I have solved part of my problem.[/EDIT]

 

I have written an RSS1 generator and an Atom generator in PHP, to publish my site's news from my "*.news" files.

 

Unfortunately, even though (to my knowledge) each feed conforms to its specification, I can't read them! The Firefox extension for RSS/Atom says that both are unrecognized feeds, and Thunderbird just hangs when adding either.

 

Could someone who knows these protocols/languages help me find what is wrong?

 

ATOM example:

http://yves.gablin.club.fr/_php/news-atom....epath=/all.news

 

related HTTP headers:

Content-type: application/atom+xml

Content-Disposition: attachment; filename="all.news.atom"

 

RSS1 example:

http://yves.gablin.club.fr/_php/news-rss1....epath=/all.news

(Note: I've tried removing the DOCTYPE part entirely, and thus using the fully qualified values everywhere instead of entities, but to no avail...)

 

related HTTP headers:

Content-type: application/rss+xml

Content-Disposition: attachment; filename="all.news.rdf"

 

If I can solve the Atom problem, I'll consider myself happy :) and I'll remove RSS1. I won't remove Atom, though, because I like the ways it "fits" my needs.

Thanks,

 

Yves.

Edited by theYinYeti

Well, neither of them works here in anything.

 

Liferea just blurts out empty news items; Straw refuses to even display it.

 

wget couldn't even download it:

 

--21:59:11--  http://yves.gablin.club.fr/_php/news-rss1.php?nolog=1
 (try:20) => `news-rss1.php?nolog=1'
Connecting to yves.gablin.club.fr[195.36.164.12]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 701 [text/html]

0% [                                                              ] 0             --.--K/s

21:59:12 (0.00 B/s) - Connection closed at byte 0. Giving up.

and it loops on and on from there.

Straw doesn't display anything. I don't think it even downloads the feed; I think it suffers the same fate as wget.

 

It isn't a problem with the readers either: I've used them both before without any problems for multiple other feeds, some of them generated from PHP.

 

In the console, Liferea prints a lot of this, though:

unsupported entity: home
unsupported entity: xhtmlns

The news items have no title and no content, although it does recognise that you have <items>.

 

Besides, what's with the RSS1? It's old and kinda dirty.

 

iphitus


Iphitus, I thank you very much for testing!

 

RSS1 is the only feed format that is almost W3C-endorsed, as it is derived from, and explicitly linked to from, their RDF recommendation. RSS1 is anything but dirty, as its construction rules are very well specified, and it is based on RDF.

Yet the problem with RSS1 seems to be its versatility: it is an XML application that permits most if not all of XML's extensibility, so many valid constructions end up not being supported by readers. For example, the problem you had with 'home' and 'xhtmlns' comes from entities that I have declared (valid, but obviously unsupported).
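To illustrate the point about declared entities, here is a minimal Python sketch (not part of the PHP generators discussed in this thread). The value given to the 'home' entity is an assumption; only its name appears in the Liferea output above. A parser that reads the internal DOCTYPE subset resolves the reference, while the same reference without its declaration is a hard error. A reader that ignores the DOCTYPE is presumably in a situation close to the second case.

```python
import xml.etree.ElementTree as ET

# Hypothetical entity value; only the entity name 'home' comes from the thread.
WITH_DTD = (
    '<!DOCTYPE doc [<!ENTITY home "http://yves.gablin.club.fr/">]>'
    '<doc>&home;</doc>'
)
# Same document body, but the entity is never declared.
WITHOUT_DTD = '<doc>&home;</doc>'


def try_parse(xml_text):
    """Return the root element's text, or a short error string if parsing fails."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError as exc:
        return "ParseError: %s" % exc


print(try_parse(WITH_DTD))     # entity expanded from the internal subset
print(try_parse(WITHOUT_DTD))  # undefined entity is rejected
```

Removing the DOCTYPE and writing the values out in full, as described in this post, avoids depending on how each reader treats the declaration.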

 

Previous versions of RSS are badly designed. Newer versions are under-documented, and non-standard, though *not* badly designed. So my preference will probably go to Atom, which is well-specified, well-designed, and in the process of getting approved as an IETF standard.

 

Back on topic, I have posted a new version of the RSS1 script, that completely removes the DOCTYPE declaration (no custom entities left).

 

Here are two new links with fewer news items in them (smaller test case):

http://yves.gablin.club.fr/_php/new.../art/index.news

http://yves.gablin.club.fr/_php/new.../art/index.news

(Links corrected. They were previously pointing to localhost!)

 

Also, in both scripts, I have removed the 'Content-Disposition' header (might help with some software...).

 

...Ooops! Child is ill. Have to leave. Thanks again. Bye,

 

Yves.

Edited by theYinYeti

... Child is better :) I'm back.

Well, I've changed the generator so that it emits NO named entity at all, not even &amp; or other common ones. Instead I use numeric character references: &#39; for example.

 

For me, the result is still... unrecognized! I can't believe that. What's wrong???

 

Yves.


The RSS1 feed isn't valid:

http://feedvalidator.org/check.cgi?url=htt...h%3D%2Fall.news

nor is the Atom feed:

http://feedvalidator.org/check.cgi?url=htt...h%3D%2Fall.news

 

But even with minor errors like those, they *should* display in any RSS reader. I can't see what the problem is.

 

What makes it even stranger is that when I piped the URLs into Mark Pilgrim's great Python feedparser, I was able to parse the RSS feed perfectly!

 

https://iphitus.no-ip.org/rss-test.tar.gz

 

Download it, extract it, and run the yvesparser.py script. It works fine :/

 

It doesn't make any sense whatsoever. Straw uses Mark Pilgrim's feedparser, which parsed the feed perfectly well above, yet Straw cannot load it.

 

iphitus

 

BTW, wget was fine: it was just getting backgrounded by the & in the URL. When I escaped it, it downloaded the feed perfectly fine.
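The backgrounding happens because an unquoted & ends the command in a POSIX shell, so wget only received the URL up to ?nolog=1, as the log above shows. Here is a small Python sketch of the quoting, using the standard shlex module. The URL is the channel URL that appears later in this thread and is used only as an illustration.

```python
import shlex

# Channel URL taken from the XML sample in this thread; used here only as an example.
url = ("http://yves.gablin.club.fr/_php/news-rss1.php"
       "?nolog=1&filepath=%2Fall.news&lang=fr")

# Unsafe: a shell would treat everything after the first '&' as a separate command.
unquoted = "wget " + url

# Safe: shlex.quote wraps the URL in single quotes so '&' and '?' stay literal.
quoted = "wget " + shlex.quote(url)

print(unquoted)
print(quoted)

# Splitting the quoted command the way a POSIX shell would gives back the full URL.
print(shlex.split(quoted))
```

Typing the URL between single quotes by hand on the command line has the same effect.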

Edited by iphitus

Wow! Iphitus, thank you for the link!

I did not know that such a validator existed. And I'm quite sure that those two errors can make a huge difference: one is a *unique* key appearing *twice*, and the other is about too new a namespace (I'll have to downgrade Atom), and the namespace is the definitive way to recognize the "grammar"!
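A duplicate key of this kind is easy to check for before publishing. The following is a rough Python sketch, not the validator's own logic and not part of the PHP generators. It assumes an Atom-style layout where each entry element has an id child, and the sample feed, its namespace, and the example.org identifiers are made up for the illustration. Elements are matched by local name, so the check does not depend on which Atom namespace the feed declares.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def duplicate_entry_ids(feed_xml):
    """Return the sorted list of entry ids that occur more than once."""
    root = ET.fromstring(feed_xml)
    ids = []
    for element in root.iter():
        if local_name(element.tag) != "entry":
            continue
        for child in element:
            if local_name(child.tag) == "id" and child.text:
                ids.append(child.text.strip())
    counts = Counter(ids)
    return sorted(value for value, count in counts.items() if count > 1)


# Made-up sample feed: the first and third entries share the same id.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><id>http://example.org/news/1</id></entry>
  <entry><id>http://example.org/news/2</id></entry>
  <entry><id>http://example.org/news/1</id></entry>
</feed>"""

print(duplicate_entry_ids(SAMPLE))
```

An empty list means every entry id is unique.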

 

Actually, one week ago, I did not even know what RSS and RDF really were, and I had never heard the name "Atom" :lol:

I jumped into feed generation even though I have never used feeds myself, nor do I know anyone who has...

 

Eh eh! ...me going to work... I'll be back!


Things have changed. Both feeds are now correct. Actually, only two changes had to be made:

- remove the duplicate ID,

- replace all named entity references with numeric character references (e.g. &amp; -> &#38;, IIRC).
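The second change can be sketched as follows. This is a Python illustration under assumptions, not the PHP code used on the site: the function name to_numeric_refs is hypothetical, and the lookup relies on the HTML 4 entity table shipped with Python's standard library, which may not match the set of entities the generator originally emitted.

```python
import re
from html.entities import name2codepoint

# Matches named references such as &amp; or &eacute;, but not numeric ones like &#39;.
NAMED_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric_refs(text):
    """Replace known named entity references with decimal numeric references."""
    def replace(match):
        codepoint = name2codepoint.get(match.group(1))
        if codepoint is None:
            # Unknown name: leave the reference untouched rather than guess.
            return match.group(0)
        return "&#%d;" % codepoint
    return NAMED_REF.sub(replace, text)


print(to_numeric_refs("Iris &amp; Yves"))
print(to_numeric_refs("caf&eacute;"))
```

Numeric references need no DOCTYPE declaration, which is why they keep working in readers that ignore or reject custom entities.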

 

Now my site has the Atom link on the index page, both on screen and in the header. Atom still uses too new a protocol for the validator, but it's OK with me: Firefox does recognize it, and displays the orange icon in the status bar. (FYI, having the RSS1 feed is just a matter of replacing the 'news-atom.php' part with 'news-rss1.php' in the URL).

 

So you could say I'm happy with the result... Well, I am, except that...

 

In order to inform people without a feed reader, I use an XSL stylesheet to transform the raw XML feed into an HTML page with explanations. To "please" Firefox, I had to:

- send the Atom (or RSS) content in application/xml instead of application/atom+xml (or rss+xml);

- remove the aforementioned header ('Content-Disposition');

- write a get.php script, to be able to choose the MIME header myself, because FF goes into "strict" mode, and does not accept the XSL file if it does not come in application/xml.

 

I'm not really happy with this. Is there any workaround?

 

Besides, even though I know that XSL works in IE, *my* XSL does not. Could someone help me spot where the problem is (or whether it is simply "my" IE that does not work)?

 

For a start, I'd like to hear from you IE users: does my feed (when you click on the "ATOM" icon) display as HTML or XML? In what version of IE?

 

Thanks to all testers out there. My whole PHP framework will soon be released under GPL, and it will be in part thanks to you :)

 

Yves.

Edited by theYinYeti

No IE users? Maybe my fault... I'll try and better explain what I need. It's really easy:

 

I'd like to know what you see when you click on one of the two links in the first post above. I need this from people using an XSL-aware browser that is *not* Firefox (that one is working OK). This includes Internet Explorer 6, for example (maybe also 5.5).

 

You should see:

 

-A- either XML code like that:

<?xml version="1.0" encoding="ISO-8859-1" ?> 
 <?xml-stylesheet type="text/xsl" media="screen" href="get.php?mimetype=text/xml&filepath=/_php/data/news-rss1.xsl" ?> 
- <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" ...>
- <channel rdf:about="http://yves.gablin.club.fr/_php/news-rss1.php?nolog=1&filepath=%2Fall.news&lang=fr">
 <title>! Nouvelles !</title> 
 <link>[url]http://yves.gablin.club.fr/[/url]</link> 
 <description>Nouvelles du site de Yves ET Iris</description> 
- <items>
- <rdf:Seq>
 <rdf:li rdf:resource="http://yves.gablin.club.fr/all.news/fr/20050404-233000" />
...

 

-B- or a nice HTML page, like that:

ATOM : ! News !

News for Iris and Yves web site

This is an Atom feed, styled with XSL and CSS

Atom is an alternative to HTML, that is dedicated to giving continuous information about ...

 

Please tell me if you see -A- or -B-, and with what version of what browser. Thank you.

 

Yves.


Thanks for testing!

 

I have spotted a bug in IE, and one in Firefox too. (BTW, same for me here at the office: "customized" IE 6). Well, I managed to make both IE and Firefox happy, and the changes are now committed.

For me, I see XSL-generated HTML with IE6 over the internet, and the same in Firefox locally (my FF is not allowed to go through the firewall/proxy...).

 

How is it now with you? Any news about Opera?

Thanks,

 

Yves.
