Ticket #476 (new defect)

Opened 19 months ago

Last modified 19 months ago

MagpieRSS: Failed to parse RSS file. (Empty document at line 1, column 1)

Reported by: reporter
Owned by: mbonetti
Priority: normal
Milestone:
Component: BUGS
Version: 0.5.4
Severity: normal
Keywords: magpie parsing
Cc:

Description

When importing

http://steffino.gedankenhabitat.de/feed/

and

http://www.libertas-cara.de/?feed=rss2

I get the error:

MagpieRSS: Failed to parse RSS file. (Empty document at line 1, column 1)

Why does this happen, and what can I do about it?

Change History

Changed 19 months ago by mbonetti

  • keywords magpie parsing added

Confirmed.

Could it be that the "Generated by" comment on line 1 messes up Magpie's internals?

Changed 19 months ago by mbonetti

It looks like the generating site prepends some extra data to the content (WordPress bug?):

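<?php
// Both feeds reproduce the bug; switch between them by (un)commenting.
$feed = 'http://www.veracode.com/blog/?feed=rss2'; // bug
//$feed = 'http://steffino.gedankenhabitat.de/feed/'; // bug

// Fetch the raw feed and dump its first 50 bytes to expose anything before the XML declaration.
$contents = file_get_contents($feed);
var_dump(substr($contents, 0, 50));
?>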

This yields:

(9:20)-[~/Desktop]:: php test.php     
string(50) "30a2
<?xml version="1.0" encoding="UTF-8"?>

The extra characters at the beginning of the XML feed break PHP's internal XML parser, which MagpieRSS uses, hence the error.
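
To illustrate the failure in isolation, here is a minimal sketch that feeds a clean document and a prefixed one to PHP's XML extension. The helper name parses_cleanly() and the sample feed body are made up for this example; only the "30a2" prefix is taken from the dump above. It assumes the xml extension is loaded.

```php
<?php
// Hypothetical helper (not part of MagpieRSS): true if PHP's XML
// extension accepts $xml as a complete, well-formed document.
function parses_cleanly($xml)
{
    $parser = xml_parser_create();
    // xml_parse() returns 1 on success and 0 on failure.
    return xml_parse($parser, $xml, true) === 1;
}

// Made-up minimal feed body, used only for illustration.
$clean = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
       . "<rss version=\"2.0\"><channel></channel></rss>";

// Same document with the stray prefix seen in the dump above.
$prefixed = "30a2\r\n" . $clean;

var_dump(parses_cleanly($clean));
var_dump(parses_cleanly($prefixed));
```

The first call should succeed and the second should fail, because nothing but whitespace may precede the document's first tag and nothing at all may precede the XML declaration.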

I guess we could strip any extra data that precedes the XML declaration before the content is passed to the parser, but I'm not sure this is the best way to handle this bug.
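
A rough sketch of that filtering idea, for discussion only. The function name strip_before_xml_declaration() is hypothetical and not part of MagpieRSS, and the sample string is made up apart from the "30a2" prefix.

```php
<?php
// Hypothetical pre-filter (not part of MagpieRSS): drop anything that
// precedes the XML declaration so the parser sees "<?xml" first.
function strip_before_xml_declaration($contents)
{
    $pos = strpos($contents, '<?xml');
    if ($pos === false || $pos === 0) {
        // No declaration found, or it is already at the start: leave as-is.
        return $contents;
    }
    return substr($contents, $pos);
}

// Made-up sample with the stray prefix from the dump above.
$raw = "30a2\r\n<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<rss version=\"2.0\"></rss>";
$filtered = strip_before_xml_declaration($raw);
var_dump(substr($filtered, 0, 5));
```

This only removes leading data. If the same kind of stray data also appears further down in the body, it would survive and could still break parsing, which is one reason this may not be the best fix.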

Changed 19 months ago by mbonetti

Also:

(11:07)-[~/Sites/dev/rss]:: wget -o /dev/null -O - 'http://www.veracode.com/blog/?feed=rss2'  |hexdump -C | head -5 
00000000  33 30 61 32 0d 0a 3c 3f  78 6d 6c 20 76 65 72 73  |30a2..<?xml vers|
00000010  69 6f 6e 3d 22 31 2e 30  22 20 65 6e 63 6f 64 69  |ion="1.0" encodi|
00000020  6e 67 3d 22 55 54 46 2d  38 22 3f 3e 0a 3c 21 2d  |ng="UTF-8"?>.<!-|
00000030  2d 20 67 65 6e 65 72 61  74 6f 72 3d 22 77 6f 72  |- generator="wor|
00000040  64 70 72 65 73 73 2f 32  2e 31 2e 33 22 20 2d 2d  |dpress/2.1.3" --|
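
One possible reading of the dump, offered as an unconfirmed assumption: the bytes 33 30 61 32 0d 0a ("30a2" followed by CRLF) have the shape of an HTTP chunked transfer-encoding size line (0x30a2 = 12450 bytes). If that is what is happening, the body would need de-chunking rather than just trimming. The sketch below shows what that could look like; decode_chunked_body() is a hypothetical helper, not an existing Magpie or PHP function, and it assumes well-formed framing.

```php
<?php
// Hypothetical helper: decode a body that still carries HTTP chunked
// framing, i.e. "<hex size>\r\n<data>\r\n" repeated, ending with size 0.
// Input that does not start with a hex size line is returned unchanged.
function decode_chunked_body($body)
{
    $decoded = '';
    $offset = 0;
    $length = strlen($body);

    while ($offset < $length) {
        $lineEnd = strpos($body, "\r\n", $offset);
        if ($lineEnd === false) {
            return $offset === 0 ? $body : $decoded;
        }
        $sizeLine = substr($body, $offset, $lineEnd - $offset);
        // Ignore optional chunk extensions after ';'.
        $sizeHex = trim(substr($sizeLine, 0, strcspn($sizeLine, ';')));
        if (!preg_match('/^[0-9a-fA-F]+$/', $sizeHex)) {
            // Not a size line: stop rather than guess.
            return $offset === 0 ? $body : $decoded;
        }
        $size = hexdec($sizeHex);
        if ($size === 0) {
            break; // terminating chunk
        }
        $decoded .= substr($body, $lineEnd + 2, $size);
        // Skip the size line's CRLF, the data, and the data's trailing CRLF.
        $offset = $lineEnd + 2 + $size + 2;
    }
    return $decoded;
}

// Made-up two-chunk example.
$chunked = "5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n";
var_dump(decode_chunked_body($chunked));
```

Whether the feeds in this ticket are really affected by undecoded chunking would need to be checked against the response headers of the sites in question.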