Reading Site Update Feeds (RSS 2.0), PHP Edition: Sina and Yahoo News

80酷酷网    80kuku.com


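[Preface]
When building a personal website, you often need to pull in large amounts of frequently changing content from other sites.
This article shows how to use a PHP program to read XML files in the RSS format and dynamically display a list of another site's latest entries.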

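[Demo]

Yahoo News: perl php (examples of reading Yahoo News, in English, with Perl/PHP XML::RSS)
My CSDN Blog: perl php (examples of reading a personal CSDN blog with Perl/PHP XML::RSS)
JLinux: perl php (examples of reading JLinux with Perl/PHP XML::RSS)
Sina News: General perl php, Sports perl php, Entertainment perl php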

[Prerequisites]
For PHP developers the setup is simple: any environment running PHP 4 or later can support this feature.
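
The code later in this article relies on PHP's XML parser functions and opens the feed URL with fopen(), which only works for http:// addresses when the allow_url_fopen setting is on. A minimal sketch for checking both is given here; the helper name check_rss_environment is made up for this illustration and is not part of the original program.

```php
<?php
// Hypothetical helper: reports whether this PHP build can run the examples.
function check_rss_environment()
{
    return array(
        // xml_parser_create() is provided by PHP's XML extension.
        'xml_parser' => function_exists('xml_parser_create'),
        // fopen() can only open http:// URLs when allow_url_fopen is enabled.
        'url_fopen'  => filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN),
    );
}

foreach (check_rss_environment() as $name => $ok) {
    echo $name, ': ', ($ok ? 'ok' : 'missing'), "\n";
}
```

If either line reports "missing", the feed cannot be fetched or parsed the way this article does it, and the PHP configuration needs attention first.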

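[Format of the XML/RSS file]
Most sites publish their RSS feeds in the format below, which is well-formed XML as defined by the W3C.
In brief, the tree is structured as follows:
The rss root contains one channel node.
  Under channel, the commonly used child elements are title, link, and description.
  After them come the item nodes, one for each of the most recently updated articles.
    Under each item, the commonly used child elements are title, link, pubDate, and description.
The basic layout looks like this: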

<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/">
<channel>
<title>Title of this site's channel</title>
<link>Link URL</link>
<description>Description of the channel</description>
<item>
<title>Article 1</title>
<link>Link to article 1</link>
<description>Summary of article 1</description>
</item>
<item>
<title>Article 2</title>
<link>Link to article 2</link>
<description>Summary of article 2</description>
</item>
</item>
</channel>
</rss>
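
To make the tree above more concrete, here is a small sketch of how PHP's xml_parse_into_struct() flattens such a feed into a list of records. The feed text and the example.com URLs are sample data invented for this illustration.

```php
<?php
// A tiny feed in the simple format shown above (sample data for illustration).
$xml = <<<XML
<rss version="2.0">
<channel>
<title>Channel title</title>
<link>http://example.com/</link>
<description>Channel description</description>
<item>
<title>Article 1</title>
<link>http://example.com/1</link>
<description>Summary of article 1</description>
</item>
</channel>
</rss>
XML;

$parser = xml_parser_create();
// Drop whitespace-only text between tags so it does not clutter the result.
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $xml, $values, $index);

// Each entry is one flat record: tag name, type (open/complete/close), depth.
foreach ($values as $v) {
    $text = isset($v['value']) ? $v['value'] : '';
    echo str_repeat('  ', $v['level'] - 1), $v['tag'], ' [', $v['type'], '] ', $text, "\n";
}
```

Tag names come back in upper case because the parser folds case by default, and only "complete" records carry a value. The $index array maps each tag name to the positions of its records in $values.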

Example:

<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/">
<channel>
  <title>邢晓宁专栏</title>
  <link>http://blog.csdn.net/thefirstwind/</link>
  <description>代码一生</description>
  <dc:language>af</dc:language>
  <generator>.Text Version 1.0.1.1</generator>
  <image>http://counter.csdn.net/pv.aspx?id=72</image>
  <item>
  <dc:creator>♂猜猜♂(邢晓宁)</dc:creator>
  <title>在 MS Windows 下建立 DocBook 的解譯環境</title>
  <link>http://blog.csdn.net/thefirstwind/archive/2006/12/21/1451714.aspx</link>
  <pubDate>Thu, 21 Dec 2006 13:50:00 GMT</pubDate>
  <guid>http://blog.csdn.net/thefirstwind/archive/2006/12/21/1451714.aspx</guid>
  <wfw:comment>http://blog.csdn.net/thefirstwind/comments/1451714.aspx</wfw:comment>
  <comments>http://blog.csdn.net/thefirstwind/archive/2006/12/21/1451714.aspx#Feedback</comments>
  <slash:comments>0</slash:comments>
  <wfw:commentRss>http://blog.csdn.net/thefirstwind/comments/commentRss/1451714.aspx</wfw:commentRss>
  <trackback:ping>http://tb.blog.csdn.net/TrackBack.aspx?PostId=1451714</trackback:ping>
  <description>在 MS Windows 下建立 DocBook 的解譯環境<img src ="http://blog.csdn.net/thefirstwind/aggbug/1451714.aspx" width = "1" height = "1" /></description>
  </item>
  <item>
  <dc:creator>邢晓宁</dc:creator>
  <title>程序员学习的革命-如何使用大脑</title>
  <link>http://blog.csdn.net/thefirstwind/archive/2006/12/13/1440965.aspx</link>
  <pubDate>Wed, 13 Dec 2006 09:41:00 GMT</pubDate>
  <guid>http://blog.csdn.net/thefirstwind/archive/2006/12/13/1440965.aspx</guid>
  <wfw:comment>http://blog.csdn.net/thefirstwind/comments/1440965.aspx</wfw:comment>
  <comments>http://blog.csdn.net/thefirstwind/archive/2006/12/13/1440965.aspx#Feedback</comments>
  <slash:comments>27</slash:comments>
  <wfw:commentRss>http://blog.csdn.net/thefirstwind/comments/commentRss/1440965.aspx</wfw:commentRss>
  <trackback:ping>http://tb.blog.csdn.net/TrackBack.aspx?PostId=1440965</trackback:ping>
  <description>很多人搞技术,还有很多转行搞技术,搞了一段时间终于发现,自己不适合作技术,要我说其实就是用脑方式的问题。真的学会适当的用脑方式,编程编起来得心应手。<img src ="http://blog.csdn.net/thefirstwind/aggbug/1440965.aspx" width = "1" height = "1" /></description>
  </item>
</channel>
</rss>
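
The pubDate values in this example use the RFC 822 style date format, which is awkward to show to readers as-is. As a rough sketch, such a value can be converted with strtotime() and reformatted; the helper name format_pub_date and the output format are choices made for this illustration only.

```php
<?php
// Hypothetical helper: turn an RSS pubDate string into "YYYY-MM-DD HH:MM" (UTC).
// Returns the input unchanged when PHP cannot parse it.
function format_pub_date($pubDate)
{
    $timestamp = strtotime($pubDate);
    if ($timestamp === false) {
        return $pubDate;
    }
    return gmdate('Y-m-d H:i', $timestamp);
}

// pubDate taken from the first item of the example feed.
echo format_pub_date('Thu, 21 Dec 2006 13:50:00 GMT'), "\n"; // prints 2006-12-21 13:50
```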
 

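[Core program]

<?php

// Feed to read; any RSS 2.0 URL can be used here.
$RSSURL = "http://blog.csdn.net/thefirstwind/Rss.aspx";

// Download the whole feed into $buff (requires allow_url_fopen).
$buff = "";
$fp = fopen($RSSURL, "r");
if (!$fp) {
    die("Cannot open $RSSURL");
}
while (!feof($fp)) {
    $buff .= fgets($fp, 4096);
}
fclose($fp);

// Parse the XML into a flat array of nodes (tag, type, level, value).
$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $buff, $values, $idx);
xml_parser_free($parser);

$in_item = 0;
$title = $link = $description = "";
foreach ($values as $node) {
    // Tag names arrive in upper case (case folding), so normalise them.
    $tag  = strtolower($node["tag"]);
    $type = $node["type"];
    // "open" and "close" nodes carry no text, so default to an empty string.
    $value = isset($node["value"]) ? $node["value"] : "";

    if ($tag == "item" && $type == "open") {
        // A new item starts: clear the fields of the previous one.
        $in_item = 1;
        $title = $link = $description = "";
    } else if ($tag == "item" && $type == "close") {
        // The item is complete: print what was collected for it.
        echo <<<EOM
$title
$link
$description

EOM;
        $in_item = 0;
    }
    if ($in_item) {
        // Inside an item, remember the fields we want to display.
        switch ($tag) {
            case "title":
                $title = $value;
                break;
            case "link":
                $link = $value;
                break;
            case "description":
                $description = $value;
                break;
        }
    }
}

?>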
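
The same loop can also be packaged as a function that takes the feed text and returns the items, which makes it possible to try the parsing step without a network connection. The following is only a sketch under that assumption; the function name rss_items and the sample feed string are invented for this illustration.

```php
<?php
// Hypothetical helper: parse an RSS 2.0 string and return its items as arrays.
function rss_items($xml)
{
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
    // Returns 1 on success and 0 when the text is not well-formed XML.
    $ok = xml_parse_into_struct($parser, $xml, $values);
    if (!$ok) {
        return array();
    }

    $items = array();
    $current = null; // the item being collected, or null when outside <item>
    foreach ($values as $node) {
        $tag = strtolower($node['tag']);
        if ($tag === 'item' && $node['type'] === 'open') {
            $current = array('title' => '', 'link' => '', 'description' => '');
        } elseif ($tag === 'item' && $node['type'] === 'close') {
            $items[] = $current;
            $current = null;
        } elseif ($current !== null && $node['type'] === 'complete'
                  && array_key_exists($tag, $current)) {
            // Only the three fields prepared above are kept.
            $current[$tag] = isset($node['value']) ? $node['value'] : '';
        }
    }
    return $items;
}

// Sample feed text (made up); the channel title is skipped, the item is kept.
$sample = '<rss version="2.0"><channel><title>Feed</title>'
        . '<item><title>Article 1</title><link>http://example.com/1</link>'
        . '<description>Summary 1</description></item></channel></rss>';

foreach (rss_items($sample) as $item) {
    echo $item['title'], ' -> ', $item['link'], "\n";
}
```

Returning an array instead of echoing inside the loop keeps parsing and display apart, so the same items can be rendered as plain text, HTML, or anything else.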

 

[Complete source code, following the explanation above]
The version below also adds CSS styling.

<?php

#$RSSURL = "http://www3.asahi.com/rss/index.rdf";
#$RSSURL = "http://rss.news.yahoo.com/rss/topstories";
$RSSURL = "http://blog.csdn.net/thefirstwind/Rss.aspx";
#$RSSURL = "http://jlinux.ddo.jp/bbs/rss.php?auth=0";
#$RSSURL = "http://rss.sina.com.cn/news/marquee/ddt.xml";
#$RSSURL = "http://rss.sina.com.cn/news/allnews/sports.xml";
#$RSSURL = "http://rss.sina.com.cn/news/allnews/ent.xml";

$buff = "";
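$fp = fopen($RSSURL,"r");
if (!$fp) {
    // Stop early instead of looping forever on an invalid handle.
    die("Cannot open $RSSURL");
}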
while ( !feof($fp) ) {
    $buff .= fgets($fp,4096);
}
fclose($fp);

$parser = xml_parser_create();
xml_parser_set_option($parser,XML_OPTION_SKIP_WHITE,1);
xml_parse_into_struct($parser,$buff,$values,$idx);
xml_parser_free($parser);
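// With whitespace skipped, node 0 is <rss>, node 1 is <channel>, node 2 is the channel <title>.
$channel_title = $values[2]["value"];
// lastBuildDate is optional, so look it up through the tag index and default to empty.
$channel_lastBuildDate = "";
if (isset($idx["LASTBUILDDATE"][0]) && isset($values[$idx["LASTBUILDDATE"][0]]["value"])) {
    $channel_lastBuildDate = $values[$idx["LASTBUILDDATE"][0]]["value"];
}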
echo <<<__HTML__
<html>
<head>
<meta http-equiv='content-type' content='text/html; charset=UTF-8'>
<title>$channel_title</title>
<link rel='stylesheet' type='text/css' id='css' href='/bbs/forumdata/cache/style_1.css'>
<script type='text/javascript' src='/bbs/include/common.js'></script>
<script type='text/javascript' src='/bbs/include/menu.js'></script>
</head>
<body>

<table border='1'>
<tr><td>
<img src='http://www.pushad.com/XrssFile/2007-1/30/2007130142039121.gif'>  
<!--
<img src='http://www.pushad.com/XrssFile/2007-1/30/2007130142039669.gif'>  
<img src='http://jlinux.ddo.jp/bbs/images/default/logo.gif'>  
<img src='http://www.pushad.com/XrssFile/2007-1/30/2007130142039970.gif'>  
//-->
</td>
<td>
$channel_title
$channel_lastBuildDate

</td>
</tr>
__HTML__;

$in_item = 0;
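foreach ($values as $node) {
    $tag  = $node["tag"];
    $type = $node["type"];
    // "open" and "close" nodes have no "value" key, so default to an empty string.
    $value = isset($node["value"]) ? $node["value"] : "";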

    $tag = strtolower($tag);
    if ($tag == "item" && $type == "open") {
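        $in_item = 1;
        // Start each item with empty fields so nothing leaks from the previous one.
        $title = $link = $pubDate = $description = "";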
    } else if ($tag == "item" && $type == "close") {
        echo <<<EOM
<tr>
  <td colspan='2' class='header' width='400'>
    <a href="$link">$title</a>
  </td>
</tr>
<tr>
  <td colspan='2' width='400' align='right'>
    $pubDate
  </td>
</tr>
<tr>
  <td colspan='2' width='400'>
    $description
  </td>
</tr>
<tr>
  <td>
     
  </td>
</tr>
EOM;
        $in_item = 0;
    }
    if ($in_item) {
        switch ($tag) {
            case "title":
                $title = $value;
                break;
            case "link":
                $link = $value;
                break;
            // $tag was lower-cased above, so match "pubdate" rather than "pubDate".
            case "pubdate":
                $pubDate = $value;
                break;
            case "description":
                $description = $value;
                break;
        }
    }
}

echo <<< __HTMLEND__
</table>
</body>
</html>
__HTMLEND__;

?>
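
The program above writes feed text straight into the page, so a title or link containing characters such as & or < can break the markup. One possible refinement, sketched here as an assumption rather than as part of the original program, is to escape those fields with htmlspecialchars() before output. The helper name render_item_row and the sample values are hypothetical.

```php
<?php
// Hypothetical helper: build the table rows for one feed item,
// escaping the fields so feed text cannot break the page markup.
function render_item_row($title, $link, $pubDate)
{
    $title   = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
    $link    = htmlspecialchars($link, ENT_QUOTES, 'UTF-8');
    $pubDate = htmlspecialchars($pubDate, ENT_QUOTES, 'UTF-8');

    return "<tr><td colspan='2' class='header'><a href=\"$link\">$title</a></td></tr>\n"
         . "<tr><td colspan='2' align='right'>$pubDate</td></tr>\n";
}

// Sample values (made up) containing characters that need escaping.
echo render_item_row('Tom & "Jerry"', 'http://example.com/?a=1&b=2', 'Thu, 21 Dec 2006 13:50:00 GMT');
```

The description field is left out on purpose: feeds like the CSDN example embed HTML in it, and whether to escape or display that HTML is a separate decision.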
                       


