| | 1 | = 2010-10-20 = |
| | 2 | |
| | 3 | == Hadoop == |
| | 4 | |
| | 5 | * http://github.com/tomwhite/hadoop-book/ - example code from Tom White's "Hadoop: The Definitive Guide" |
| | 6 | |
| | 7 | * [http://www.sys-con.com/node/1573226/print NoHadoop: Big Data Requires Not Only Hadoop] |
| | 8 | * This article covers a number of Hadoop alternatives, each aimed at a different use case: |
| | 9 | * [http://cloudscale.com CloudScale] - for realtime data-warehouse requirements |
| | 10 | * !CloudScale calls this Redoop - from [http://www.sys-con.com/node/1572508/print Hadoop and Realtime Cloud Computing] (2010-10-15) |
| | 11 | * MPI, BSP - for supercomputing requirements |
| | 12 | * Pregel (architecture only, no code released, Google) - for large-scale graph computing requirements |
| | 13 | * Percolator (architecture only, no code released, Google) - for incrementally updating the analytics on a massive data set continuously |
| | 14 | |
| | 15 | * It looks like a lot happened at this year's Hadoop World NYC. |
| | 16 | * First: Oracle and Hadoop integration - a new project called [http://www.questsoftware.in/Ora-Oop/ OraOop] |
| | 17 | * [http://www.ctoedge.com/content/helping-oracle-get-along-hadoop Helping Oracle Get Along with Hadoop] |
| | 18 | * [[Image(http://img.itbe.com/ctoedge/quest5.gif)]] |
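| | 19 | * Second: Membase and Hadoop integration - could this make Hadoop more real-time? It is similar to the approach I first had in mind (e.g. combining Hadoop with ActiveMQ). Memcache also seems like a good option; it might even be usable for passing parameters and holding global variables. |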
| | 20 | * [http://www.smartbrief.com/news/aaaa/industryMW-detail.jsp?id=838FF0EC-C664-4F16-9F77-F61CA011BB57 Membase-Cloudera Integration Joins Leading Hadoop Distribution and Real-Time NoSQL Database] |
| | 21 | * [http://blog.membase.com/membase-cloudera-integration Membase and Cloudera Integration] |
| | 22 | * [http://www.h-online.com/open/news/item/Bi-directional-connection-for-Membase-and-Cloudera-Hadoop-1106525.html Bi-directional connection for Membase and Cloudera Hadoop] |
| | 23 | {{{ |
| | 24 | The first consists of a Membase NodeCode module that streams data from Membase to CDH in real-time, |
| | 25 | while the second consists of a Sqoop-derived batch loader utility that allows for the loading of data |
| | 26 | to and from Membase and CDH. |
| | 27 | }}} |
| | 28 | * Third: Twitter and Hadoop |
| | 29 | * [http://siliconangle.com/blog/2010/10/12/hadoop-is-a-big-part-of-twitter%E2%80%99s-ecosystem/ Hadoop is a "Big part of Twitter’s ecosystem"] |
| | 30 | {{{ |
| | 31 | For something like Hadoop, the presence of a robust, public tool really helps |
| | 32 | to build a prosperous ecosystem. |
| | 33 | }}} |
| | 34 | * [http://www.computerworld.com/s/article/print/9191098/Twitter_solves_its_data_formatting_challenge?taxonomyName=Storage&taxonomyId=19 Twitter solves its data formatting challenge] |
| | 35 | {{{ |
| | 36 | While primary copies of user Tweets are kept in MySQL and Cassandra databases, |
| | 37 | the company is also building a second data repository, running on Hadoop, that |
| | 38 | can be used for analytics and applications. |
| | 39 | }}} |