= NoSQL =

== History ==

 * 2009-07-01: [http://www.computerworld.com/s/article/print/9135086/No_to_SQL_Anti_database_movement_gains_steam_?taxonomyId=173&taxonomyName=Databases No to SQL? Anti-database movement gains steam]
   * The NoSQL movement: a growing camp is pushing to stop using SQL databases. Are relational databases really going out of fashion? ([wiki:jazz/09-07-16 2009-07-16])
   * My own view is that no paradigm shift happens overnight; the change is usually gradual and almost imperceptible. ([wiki:jazz/09-07-27 2009-07-27])
 * Will NoSQL repeat the history of the OODBMS? - [http://developers.slashdot.org/article.pl?sid=01/05/03/1434242 Why Aren't You Using An OODMS?] (2010-07-25)
{{{
There was no great story on schema migration either.
}}}

== Trend Observation ==

 * [Trends] [http://www.google.com/trends?q=memcached%2Cmongodb%2C+couchdb%2C+hbase%2C+apache+cassandra&ctab=0&geo=all&date=2010&sort=0 Comparing memcached, MongoDB, CouchDB, HBase, and Apache Cassandra] ([wiki:jazz/10-06-29 2010-06-29])
   * [[Image(jazz/10-06-29:10-06-29_mongodb_couchdb_hbase_apache_cassandra_hypertable_2010.png,width=800)]]
 * [http://www.google.com.tw/trends?q=mongodb%2C+sqlite%2C+CouchDB%2C+google+gears&ctab=0&geo=all&date=all&sort=1 Google Trends] search volume ranks SQLite > Google Gears > CouchDB, which suggests that distributed databases have not yet gone mainstream. ([wiki:jazz/09-09-23 2009-09-23])
   * [[Image(wiki:jazz/09-09-23:sqlite_gears.jpg,width=600)]]
 * [http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores Anti-RDBMS: A list of distributed key-value stores] - lists a number of distributed key-value stores, but misses one ([wiki:jazz/09-09-23 2009-09-23])
   * Plurk's [http://opensource.plurk.com/LightCloud/ lightcloud].
 * [http://code.google.com/p/redis/ redis] - A persistent key-value database with built-in net interface written in ANSI-C for Posix systems
   * [http://www.linux-mag.com/cache/7496/1.html Redis: Lightweight key/value Store That Goes the Extra Mile] - another lightweight key/value database
 * [http://wiki.apache.org/couchdb/ CouchDB]
   * [http://packages.ubuntu.com/couchdb Ubuntu couchdb package]
   * [http://packages.debian.org/couchdb Debian couchdb package]
   * [http://wiki.apache.org/couchdb/EntityRelationship Modeling Entity Relationships in CouchDB]
   * [http://code.google.com/p/couchdb-fuse/ couchdb-fuse] - CouchDB FUSE File System
   * [Video] [http://video.yahoo.com/watch/2278623/7162319 Next Generation Data Storage with CouchDB (speaker: Jan Lehnardt) - Part 1]
   * [Video] [http://video.yahoo.com/watch/2278711/7162483 Next Generation Data Storage with CouchDB (speaker: Jan Lehnardt) - Part 2]
 * [Video] [http://video.yahoo.com/watch/2241669/7074711 MapReduce vs MySQL (speaker: Stu Hood) - Part 1]
 * [Video] [http://video.yahoo.com/watch/2241684/7074739 MapReduce vs MySQL (speaker: Stu Hood) - Part 2]
 * [Video] [http://video.yahoo.com/watch/2242180/7076074 MapReduce vs MySQL (speaker: Stu Hood) - Part 3]
 * While reading Wikipedia, I noticed that CouchDB is listed under both [http://en.wikipedia.org/wiki/Column-oriented_DBMS Column-oriented DBMS] and [http://en.wikipedia.org/wiki/Document-oriented_database Document-oriented database]. ([wiki:jazz/09-09-23 2009-09-23])
 * [http://labs.mudynamics.com/2009/04/03/interactive-couchdb/ Interactive CouchDB] - uses !JavaScript to demonstrate CouchDB !MapReduce implementations for several different purposes.
 * [http://dotcloud.org/ dot.Cloud] - an open-source cloud federation platform. The feature list looks quite powerful. ([wiki:jazz/09-09-23 2009-09-23])
   * Keep your servers under revision control
   * Stop worrying about maintaining state: just create and kill instances
   * Use tools you know: rsync, mercurial/git, ssh
   * Push a small upgrade to your images without moving gigabytes around
   * Replicate multi-server setups in just one command
   * Cleanly separate data (DB, logs, content) and code (OS, libraries, binaries, configuration)
   * Map data volumes to any available storage (NAS, EBS, S3)
   * Save bandwidth by delivering your app closer to the consumer
 * [http://code.google.com/p/gears-dblib/ gears-dblib] - A simple abstraction on top of the Database object in Gears ([wiki:jazz/09-09-23 2009-09-23])
 * [http://code.google.com/p/orient/ orient] - NoSQL document database light, portable and fast.
   Supports ACID Tx, Indexes, asynch queries, SQL layer, clustering, etc ([wiki:jazz/09-09-23 2009-09-23])

== Primer ==

 * [http://devblog.streamy.com/2009/08/24/cap-theorem/ CAP Theorem]
 * [http://en.wikipedia.org/wiki/ACID ACID]

== Comparison ==

 * [http://ossdbsurvey.org/survey_lca2010.pdf Which databases solve my problem? a survey of open source databases] - [http://diaspora.gen.nz/~rodgerd/archives/1215-Survey-of-Open-Source-Database.html Survey of Open Source Database] - http://ossdbsurvey.org/
 * [http://www.roadtofailure.com/2009/10/29/hbase-vs-cassandra-nosql-battle/ HBase vs. Cassandra: NoSQL Battle!]

== Open Source Projects ==

 * 2009-06-13: [http://blog.oskarsson.nu/2009/06/nosql-debrief.html NOSQL debrief] ([wiki:jazz/09-08-22 2009-08-22])
   * [http://vimeo.com/5288034 NOSQL - CouchDB] ([wiki:jazz/09-08-11 2009-08-11])
   * [http://vimeo.com/5198661 NOSQL - Hypertable] ([wiki:jazz/09-08-11 2009-08-11])
   * [http://vimeo.com/5198411 NOSQL - HBase] ([wiki:jazz/09-08-11 2009-08-11])

== NoSQL : HBase ==

 * 11:10–12:00 Upcoming improvements for HBase - Andrew Purtell (Trend Micro) ([wiki:jazz/10-04-24 2010-04-24])
   * Needed everywhere from Big Data down to Medium Data
   * Cloud Computing - Scale Free
   * Disk seek time remains nearly constant, so index (B-tree) lookups and seeks in an RDBMS are slow!
   * No distributed transactions, no complex locking, no waits or deadlocks
   * Don't think of HBase as a spreadsheet; thinking of it in terms of tags may work better.
   * HBase and !BigTable are both CP systems: they favor Consistency and Partition Tolerance, so by the CAP theorem they cannot guarantee Availability. They would rather interrupt service than return incorrect data.
   * HDFS-200 (working append): support for continuously appending data will be added in HBase 0.20.5.
   * ACID? - [http://en.wikipedia.org/wiki/ACID atomicity, consistency, isolation, durability]
   * New features:
     * Cross-datacenter backup - via log shipping
     * Security hardening - adds authentication and authorization. Yahoo! wrote much of the new security support, including Kerberos authentication, data isolation at the HDFS layer, and secure RPC. New roles are therefore needed for access control (Access Control Role).
     * Coprocessor - inspired by the new Coprocessor feature in !BigTable; adds !RegionObserver (I need to spend more time to understand what it is for).
   * waue recently used a new !MapReduce diagram in a slide deck, and I saw it again today in Andrew's talk. It comes from Lars George's blog post "[http://www.larsgeorge.com/2009/05/hbase-mapreduce-101-part-i.html HBase MapReduce 101 - Part I]".
     * [[Image(http://1.bp.blogspot.com/_Cib_A77V54U/ShJ8K99N0fI/AAAAAAAAACY/aFbcbtIK4nI/s400/MapReduce2.png)]]
 * [http://www.larsgeorge.com/2010/02/fosdem-2010-nosql-talk.html FOSDEM 2010 NoSQL Talk]
 * [http://fosdem.org/2010/schedule/tracks/nosql FOSDEM NoSQL session]
 * [http://www.larsgeorge.com/2009/03/hbase-vs-couchdb-in-berlin.html HBase vs. CouchDB in Berlin]
 * Trend observation for MongoDB, CouchDB, HBase, Cassandra, and Hypertable ([wiki:jazz/10-04-24 2010-04-24])
   * [[Image(wiki:jazz/10-04-24:mongodb_couchdb_hbase_cassandra_hypertable.png,width=800)]]
 * Even though MongoDB and CouchDB already rank high, they are still far behind SQLite. [http://www.google.com.tw/trends?q=mongodb%2C+sqlite%2C+CouchDB%2C+google+gears&ctab=0&geo=all&date=all&sort=1 Google Trends] search volume ranks SQLite > Google Gears > CouchDB, which suggests that distributed databases have not yet gone mainstream. ([wiki:jazz/09-09-23 2009-09-23])
   * [[Image(wiki:jazz/09-09-23:sqlite_gears.jpg,width=800)]]
 * [http://www.google.com.tw/trends?q=sqlite%2C+google+gears%2C+mongodb%2C+CouchDB&ctab=0&geo=all&date=2010&sort=1 Google Trends for 2010] shows that MongoDB has nearly caught up with Google Gears (which is of course related to Google announcing support for HTML5 Local Database). ([wiki:jazz/10-04-24 2010-04-24])
   * [[Image(wiki:jazz/10-04-24:sqlite_mongodb_couchdb.png,width=800)]]

== NoSQL : Cassandra ==

 * [http://incubator.apache.org/cassandra/ Cassandra] - a highly scalable, eventually consistent, distributed, structured key-value store. ([wiki:jazz/09-09-18 2009-09-18])
 * [http://www.ethiopianreview.com/scitech/5281 Looking to the future with Cassandra]
   * Cassandra was open sourced by Facebook in 2008
   * Cassandra brings together the distributed systems technologies from Dynamo and the data model from Google's !BigTable.
 * 09:00–09:50 nosql cassandra - Gasol (Pixnet) ([wiki:jazz/10-04-24 2010-04-24])
   * http://cassandra.apache.org/
   * Has built-in replication; writes are kept in memory and recorded in the commit log. It uses a fully peer-to-peer distributed architecture, so it has no single point of failure like the Hadoop !NameNode.
   * [http://en.wikipedia.org/wiki/CAP_theorem CAP theorem] - of Consistency, Availability, and Partition Tolerance, one of the three must be sacrificed.
 * 2010-03-02: [https://www.cloudkick.com/blog/2010/mar/02/4_months_with_cassandra/ 4 Months with Cassandra, a love story] - an experience report on using Cassandra ([wiki:jazz/10-03-04 2010-03-04])

== NoSQL : Voldemort ==

 * [http://project-voldemort.com/ Voldemort] - a distributed key-value storage system ([wiki:jazz/09-09-18 2009-09-18])
   * used at !LinkedIn for certain high-scalability storage problems

== NoSQL : Redis ==

 * [http://www.linux-mag.com/cache/7496/1.html Redis: Lightweight key/value Store That Goes the Extra Mile] - another lightweight key/value database ([wiki:jazz/09-09-02 2009-09-02])

== Use Case ==