Changes between Initial Version and Version 1 of jazz/12-01-04

+= 2012-01-04 =
+== AJAX Crawler / Crawling AJAX ==
+ * [wiki:jazz/10-10-17 2010-10-17]
+ * <Reference> [http://www.ajaxprojects.com/ajax/newsdetails.php?itemid=178 Crawling AJAX]
+{{{
+Shreeraj Shah's paper, Crawling Ajax-driven Web 2.0 Applications, does a nice job of
+describing the "event-driven" approach to web crawling.
+It has the following three key components:
+. Javascript analysis and interpretation with linking to Ajax
+. DOM event handling and dispatching
+. Dynamic DOM content extraction
+The easiest way to implement an AJAX-enabled, event-driven crawler is to use Watir and
+Crowbar, which let you control Firefox or IE from code and extract
+page data after the browser has processed any JavaScript.
+}}}
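As a rough illustration of the "DOM event handling" component described in the quote, the following sketch lists the inline event handlers on a page, which are the candidates an event-driven crawler would need to fire. It uses only Python's standard `html.parser`; the names `EventHandlerFinder`, `find_event_handlers`, and `SAMPLE_PAGE` are hypothetical, and the sketch assumes handlers are written as inline `on*` attributes. Handlers attached from script would only be visible inside a real browser, such as one driven by Watir or Crowbar.

```python
from html.parser import HTMLParser


class EventHandlerFinder(HTMLParser):
    """Collect (tag, event attribute, handler code) for inline on* handlers."""

    def __init__(self):
        super().__init__()
        self.handlers = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases attribute names, so "onClick" arrives as "onclick".
        for name, value in attrs:
            if name.startswith("on") and value:
                self.handlers.append((tag, name, value))


def find_event_handlers(html):
    """Return every inline event handler found in the given HTML string."""
    finder = EventHandlerFinder()
    finder.feed(html)
    finder.close()
    return finder.handlers


# Hypothetical page: two elements trigger JavaScript, one is a plain link.
SAMPLE_PAGE = """
<html><body>
  <a href="#" onclick="loadNews(1)">News</a>
  <div id="menu" onmouseover="showMenu()">Menu</div>
  <a href="/static.html">Plain link</a>
</body></html>
"""

if __name__ == "__main__":
    for tag, event, code in find_event_handlers(SAMPLE_PAGE):
        print(tag, event, code)
```

This covers only the discovery step. Dispatching the events and extracting the resulting DOM still requires a JavaScript-capable browser.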
+ * Usable tools include the Ruby-based [http://watir.com/ Watir], which can control IE, and [http://simile.mit.edu/wiki/Crowbar Crowbar], which controls Firefox through GET/PUT requests. Both are BSD-licensed.
+ * [http://code.google.com/intl/zh-TW/web/ajaxcrawling/ Making AJAX Applications Crawlable] - Google proposed a workaround specification so that AJAX applications and pages can be found by search engines.
+ * [http://crawljax.com/ crawljax] - an AJAX crawler written in Java, with [http://crawljax.com/documentation/publications/ many published papers]
+ * http://watij.com/ - Watij – Web Application Testing in Java
+ * http://htmlunit.sourceforge.net/ - HtmlUnit is a "GUI-Less browser for Java programs"
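The Google specification linked above maps a "hash-bang" URL (`#!state`) to a URL the server can see, by moving the state into an `_escaped_fragment_` query parameter. A minimal sketch of that mapping, as the crawler side would perform it, might look like the following. The function names and the example URLs are hypothetical, and the set of escaped bytes (%00..20, %23, %25..26, %2B, %7F..FF) follows my reading of the specification, so it should be checked against the linked page.

```python
from urllib.parse import urlsplit, urlunsplit

# Bytes the scheme requires to be percent-encoded inside _escaped_fragment_
# (assumption based on the specification: "#", "%", "&", "+", controls, space, non-ASCII).
_MUST_ESCAPE = {0x23, 0x25, 0x26, 0x2B}


def escape_fragment(state):
    """Percent-encode only the bytes the scheme lists; leave the rest as-is."""
    out = []
    for byte in state.encode("utf-8"):
        if byte <= 0x20 or byte >= 0x7F or byte in _MUST_ESCAPE:
            out.append("%{:02X}".format(byte))
        else:
            out.append(chr(byte))
    return "".join(out)


def to_escaped_fragment(url):
    """Rewrite a #! URL into its _escaped_fragment_ form; other URLs pass through."""
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url  # no hash-bang, nothing to rewrite
    param = "_escaped_fragment_=" + escape_fragment(parts.fragment[1:])
    query = parts.query + "&" + param if parts.query else param
    # Drop the fragment: the state now travels in the query string.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    print(to_escaped_fragment("http://example.com/app#!state=1&page=2"))
    print(to_escaped_fragment("http://example.com/app?lang=en#!inbox"))
```

A server that supports the scheme answers the rewritten URL with a static HTML snapshot of the state that the `#!` URL would render in a browser.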