[[PageOutline]]

= NoSQL =

== History ==

 * 2009-07-01: [http://www.computerworld.com/s/article/print/9135086/No_to_SQL_Anti_database_movement_gains_steam_?taxonomyId=173&taxonomyName=Databases No to SQL? Anti-database movement gains steam]
   * The NoSQL movement: a growing camp is pushing to stop using SQL databases. Are databases really going out of fashion? ([wiki:jazz/09-07-16 2009-07-16])
   * My personal view is that no paradigm shift happens overnight; change usually arrives gradually and almost imperceptibly. ([wiki:jazz/09-07-27 2009-07-27])
 * Will NoSQL repeat the history of the OODBMS? - [http://developers.slashdot.org/article.pl?sid=01/05/03/1434242 Why Aren't You Using An OODMS?] (2010-07-25)
 * [http://teddziuba.com/2010/03/i-cant-wait-for-nosql-to-die.html I can't wait for NoSql to die] - [http://www.javaworld.com.tw/roller/ingramchen/entry/nosql_to_die NoSQL to die (Chinese commentary)]
{{{
There was no great story on schema migration either.
}}}

== Definition ==

 * http://en.wikipedia.org/wiki/NoSQL - Wikipedia
 * [http://www.informationweek.com/news/development/architecture-design/showArticle.jhtml?articleID=224900559 The NoSQL Alternative]
 * [http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html Amazon's Dynamo]
 * [http://blog.xdite.net/?cat=91 Scaling Rails Site:Reading Material # 1~ # 5] (2010-07-26)
 * 2009-07-25 : [http://blog.gslin.org/archives/2009/07/25/2065/ Distributed Key-Value Database] - gslin introduces some background concepts behind distributed key-value datastores (2010-07-26)
 * 2009-11-06 : [http://plog.longwin.com.tw/news-technology/2009/11/06/key-value-system-category-2009 Key-Value Systems: A Categorized Roundup (NoSQL)]

== Primer ==

 * The A (ACID), B (BASE), and C (CAP theorem) of NoSQL
   * [http://en.wikipedia.org/wiki/ACID ACID]
   * [http://queue.acm.org/detail.cfm?id=1394128 BASE: An Acid Alternative]
   * [http://devblog.streamy.com/2009/08/24/cap-theorem/ CAP Theorem] - [http://cacm.acm.org/blogs/blog-cacm/83396-errors-in-database-systems-eventual-consistency-and-the-cap-theorem/fulltext Errors in Database Systems, Eventual Consistency, and the CAP Theorem]
   * [http://en.wikipedia.org/wiki/CAP_theorem CAP theorem] - Consistency, Availability, Partition Tolerance
(tolerance of network faults): one of the three must always be sacrificed.

== Papers and Websites ==

 * http://nosqlsummer.org/ - a reading club for papers on NoSQL and distributed systems
 * http://nosql-database.org/ - an important portal that catalogs the categories of NoSQL databases
 * http://www.nosqldatabases.com/
 * http://nosql.mypopescu.com/

== Trend Observation ==

 * [http://www.simplyhired.com/a/jobtrends/trend/q-Cassandra%2C+Redis%2C+Voldemort%2C+Simpledb%2C+Couchdb%2C+Mongodb%2C+Hbase%2C+Hypertable NoSQL demand as seen in Simply Hired job trends]: [http://aws.amazon.com/simpledb/ Amazon's SimpleDB] was at one point very high. 2010 ranking of NoSQL job postings: Cassandra > HBase > CouchDB =~ MongoDB > SimpleDB (2010-07-25)(2010-07-27)
   * [[Image(jazz/NoSQL:10-07-27_simpledb_hbase_couchdb_mongodb_cassandra_Hired_Trends_1.png,width=800)]]
 * [http://www.google.com/trends?q=memcached%2Cmongodb%2C+couchdb%2C+hbase%2C+apache+cassandra&ctab=0&geo=all&date=2010&sort=0 Comparing memcached, MongoDB, CouchDB, HBase, Apache Cassandra] ([wiki:jazz/10-06-29 2010-06-29])
   * [[Image(jazz/10-06-29:10-06-29_mongodb_couchdb_hbase_apache_cassandra_hypertable_2010.png,width=800)]]
 * Of course, even though MongoDB and CouchDB already rank high, they are still far behind SQLite. The search trends on [http://www.google.com.tw/trends?q=mongodb%2C+sqlite%2C+CouchDB%2C+google+gears&ctab=0&geo=all&date=all&sort=1 Google Trends] show SQLite > Google Gears > CouchDB, which suggests that distributed databases have not yet become widespread. ([wiki:jazz/09-09-23 2009-09-23])
   * [[Image(wiki:jazz/09-09-23:sqlite_gears.jpg,width=800)]]
 * [http://www.google.com.tw/trends?q=MongoDB%2C+CouchDB%2C+HBase%2C+apache+Cassandra%2C+Hypertable&ctab=0&geo=all&date=all&sort=0 MongoDB, CouchDB, HBase, Cassandra, Hypertable] trend observation ([wiki:jazz/10-04-24 2010-04-24])
   * [[Image(wiki:jazz/10-04-24:mongodb_couchdb_hbase_cassandra_hypertable.png,width=800)]]
 * The [http://www.google.com.tw/trends?q=sqlite%2C+google+gears%2C+mongodb%2C+CouchDB&ctab=0&geo=all&date=2010&sort=1 2010 Google Trends] search trends show that MongoDB has nearly caught up with Google Gears (which is of course related to Google announcing support for HTML5 Local Database).
([wiki:jazz/10-04-24 2010-04-24])
   * [[Image(wiki:jazz/10-04-24:sqlite_mongodb_couchdb.png,width=800)]]

== Comparison ==

 * [http://en.wikipedia.org/wiki/Structured_storage Structured storage] (2010-07-26)
 * [http://nosql.mypopescu.com/tagged/benchmark NoSQL benchmarks and performance evaluations]
 * [http://ossdbsurvey.org/survey_lca2010.pdf Which databases solve my problem? a survey of open source databases] - [http://diaspora.gen.nz/~rodgerd/archives/1215-Survey-of-Open-Source-Database.html Survey of Open Source Database] - http://ossdbsurvey.org/
 * [http://www.roadtofailure.com/2009/10/29/hbase-vs-cassandra-nosql-battle/ HBase vs. Cassandra: NoSQL Battle!]
 * 2010-07-04 : [http://www.mongodb.org/display/DOCS/MongoDB,+CouchDB,+MySQL+Compare+Grid MongoDB, CouchDB, MySQL Compare Grid] (2010-07-26)
 * 2010-06-30 : [http://newsicare.wordpress.com/2010/06/30/mongodb-vs-couchdb/ A brief comparison of MongoDB and CouchDB] (2010-07-26)
 * 2010-06-12 : [http://www.synchrosinteractive.com/blog/1-software/30-nifty-new-databases Nifty New Databases] (2010-07-26)
 * 2010-03-30 : [http://www.gonosql.com/couchdb-vs-mongodb-comparison/ CouchDB vs MongoDB Comparison] (2010-07-26)
 * 2010-02-25 : [http://www.cattell.net/datastores/Datastores.pdf Datastore Comparison: High Performance Scalable Data Stores] (PDF) - [http://highscalability.com/blog/2010/2/25/paper-high-performance-scalable-data-stores.html Paper: High Performance Scalable Data Stores] (2010-07-26)
 * 2010-01-05 : [http://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape.html NoSql Databases – Part 1 - Landscape]
|| Distributed || Not Distributed (responsibility on client) ||
|| * Amazon Dynamo[[BR]] * Amazon S3[[BR]] * Scalaris[[BR]] * Voldemort[[BR]] * CouchDB (thru Lounge)[[BR]] * Riak[[BR]] * MongoDB (in alpha)[[BR]] * !BigTable[[BR]] * Cassandra[[BR]] * !HyperTable[[BR]] * HBase || * Redis[[BR]] * Tokyo Tyrant[[BR]] * MemcacheDB[[BR]] * Amazon SimpleDB ||
 * 2009-11-25 : [http://robbin.javaeye.com/blog/524977
Exploring NoSQL Databases, Part 1: Why Use a Non-Relational Database?]
 * 2009-11-15 : [http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to-size-and-scaling-to-complexity.html NoSQL: scaling to size and scaling to complexity] - explains how Key-Value Stores, !BigTable Clones (aka "Column Family"), Document Oriented, and Graph Databases relate in terms of size and complexity.
   * [[Image(http://blogs.neotechnology.com/.a/6a0120a600b05e970b012875a1df18970c-500pi,width=600)]]
 * 2009-11-15 : [http://prajwal-tuladhar.net.np/2009/11/15/500/mongodbs-performance-as-compared-to-others-esp-couchdb/ MongoDB’s performance as compared to others]
   * [[Image(http://prajwal-tuladhar.net.np/wp-content/uploads/2009/11/mongo_performance.png,width=600)]]
 * 2009-11-09 : [http://www.rackspacecloud.com/blog/2009/11/09/nosql-ecosystem/ NoSQL Ecosystem] (2010-07-25)
   * [[Image(http://c0179631.cdn.cloudfiles.rackspacecloud.com/NoSQL_1_New.png,width=600)]]
   * [[Image(http://c0179631.cdn.cloudfiles.rackspacecloud.com/NoSQL_2_New.png,width=600)]]
   * [[Image(http://c0179631.cdn.cloudfiles.rackspacecloud.com/NoSQL_3.png,width=600)]]
 * 2009-10-21 : [http://www.viget.com/extend/nosql-misconceptions/ NoSQL Misconceptions] - misconceptions about NoSQL
 * [Video] [http://video.yahoo.com/watch/2241669/7074711 MapReduce vs MySQL (speaker Stu Hood) - Part 1] ([wiki:jazz/09-09-23 2009-09-23])
 * [Video] [http://video.yahoo.com/watch/2241684/7074739 MapReduce vs MySQL (speaker Stu Hood) - Part 2] ([wiki:jazz/09-09-23 2009-09-23])
 * [Video] [http://video.yahoo.com/watch/2242180/7076074 MapReduce vs MySQL (speaker Stu Hood) - Part 3] ([wiki:jazz/09-09-23 2009-09-23])

== Open Source Projects ==

 * 2009-06-13: [http://blog.oskarsson.nu/2009/06/nosql-debrief.html NOSQL debrief] ([wiki:jazz/09-08-22 2009-08-22])
   * [http://vimeo.com/5288034 NOSQL - CouchDB] ([wiki:jazz/09-08-11 2009-08-11])
   * [http://vimeo.com/5198661 NOSQL - Hypertable] ([wiki:jazz/09-08-11 2009-08-11])
   * [http://vimeo.com/5198411 NOSQL - HBase] ([wiki:jazz/09-08-11 2009-08-11])
 * 2009-01-19 :
[http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores Anti-RDBMS: A list of distributed key-value stores] - lists a number of distributed key-value databases ([wiki:jazz/09-09-23 2009-09-23])
 * Plurk's [http://opensource.plurk.com/LightCloud/ lightcloud].
 * [http://code.google.com/p/gears-dblib/ gears-dblib] - A simple abstraction on top of the Database object in Gears ([wiki:jazz/09-09-23 2009-09-23])
 * [http://code.google.com/p/orient/ orient] - NoSQL document database light, portable and fast. Supports ACID Tx, Indexes, asynch queries, SQL layer, clustering, etc ([wiki:jazz/09-09-23 2009-09-23])
 * 2009-03-30 : [http://www.harecoded.com/non-rdbm-distributed-databases-map-reduce-key-value-and-cloud-computing-227117 non-RDBM distributed databases, map/reduce, key/value and cloud computing] (2010-07-25)

== NoSQL : HBase ==

 * 11:10–12:00 Upcoming improvements for HBase - Andrew Purtell (Trend Micro) ([wiki:jazz/10-04-24 2010-04-24])
   * Needed everywhere from Big Data down to Medium Data
   * Cloud Computing - Scale Free
   * Disk seek time remains nearly constant -> index (B-tree) lookups and seeks in an RDBMS are slow
   * No distributed transactions, no complex locking, no waits or deadlocks
   * Do not think of HBase as a spreadsheet; thinking of it in terms of tags may work better.
   * HBase and !BigTable are both CP architectures: they favor Consistency and Partition Tolerance, so by the CAP theorem they cannot guarantee Availability. They would rather interrupt service than return incorrect data.
   * HDFS-200 (working append) will add support for continuously appending data in HBase 0.20.5.
   * ACID? - [http://en.wikipedia.org/wiki/ACID atomicity, consistency, isolation, durability]
   * New features:
     * Cross-data-center backup via log shipping: [https://issues.apache.org/jira/browse/HBASE-1295 HBASE-1295 : Multi data center replication]
     * Security hardening - support for authentication and authorization. Yahoo! has written a lot of new security support, including Kerberos authentication, data isolation at the HDFS layer, and secure RPC. This requires adding roles for access control (Access Control Role).
     * Coprocessor - inspired by !BigTable's new Coprocessor feature; adds !RegionObserver (I need to spend more time to understand what it is for)
   * Waue recently used a new !MapReduce diagram in his slides, and today I saw it again in Andrew's talk. It comes from Lars George's blog post "[http://www.larsgeorge.com/2009/05/hbase-mapreduce-101-part-i.html HBase MapReduce 101 - Part I]".
     * [[Image(http://1.bp.blogspot.com/_Cib_A77V54U/ShJ8K99N0fI/AAAAAAAAACY/aFbcbtIK4nI/s400/MapReduce2.png)]]
 * [http://www.larsgeorge.com/2010/02/fosdem-2010-nosql-talk.html FOSDEM 2010 NoSQL Talk]
   * [http://fosdem.org/2010/schedule/tracks/nosql FOSDEM NoSQL session]
 * [http://www.larsgeorge.com/2009/03/hbase-vs-couchdb-in-berlin.html HBase vs. CouchDB in Berlin]

== NoSQL : Cassandra ==

 * [http://incubator.apache.org/cassandra/ Cassandra] - a highly scalable, eventually consistent, distributed, structured key-value store. ([wiki:jazz/09-09-18 2009-09-18])
 * [http://www.ethiopianreview.com/scitech/5281 Looking to the future with Cassandra]
   * Cassandra was open sourced by Facebook in 2008
   * Cassandra brings together the distributed systems technologies from Dynamo and the data model from Google's !BigTable.
 * 09:00–09:50 nosql cassandra - Gasol (Pixnet) ([wiki:jazz/10-04-24 2010-04-24])
   * http://cassandra.apache.org/
   * Has a replication mechanism; data is kept in memory first and written to the commit log afterwards. It uses a fully peer-to-peer distributed architecture, so it has no single point of failure like the Hadoop !NameNode.
 * 2010-03-02: [https://www.cloudkick.com/blog/2010/mar/02/4_months_with_cassandra/ 4 Months with Cassandra, a love story] - lessons learned from using Cassandra ([wiki:jazz/10-03-04 2010-03-04])

=== Use Case ===

 * 2010-07-09 : [http://engineering.twitter.com/2010/07/cassandra-at-twitter-today.html Cassandra at Twitter Today] (2010-07-30)
 * 2010-03-07 : [http://about.digg.com/node/564 Saying Yes to NoSQL; Going Steady with Cassandra] - Digg decides to keep using Cassandra
 * 2010-02-23 : [http://nosql.mypopescu.com/post/407159447/cassandra-twitter-an-interview-with-ryan-king Cassandra @ Twitter: An Interview with Ryan King] - Twitter had wanted to use Cassandra early on
 * 2009-07-02 : [http://blog.evanweaver.com/articles/2009/07/06/up-and-running-with-cassandra/ up and running with cassandra] (content updated on 2010-05-11)
 * 2009-09-09 :
[http://about.digg.com/blog/looking-future-cassandra Looking to the future with Cassandra] - Digg adopts Cassandra
{{{
After considering HBase, Hypertable, Cassandra, Tokyo Cabinet/Tyrant, Voldemort, and Dynomite, we settled on Cassandra.
}}}

=== Installation ===

 * [http://wiki.apache.org/cassandra/DebianPackaging Cassandra Debian package] (2010-07-25)
 * [http://www.javaworld.com.tw/roller/ingramchen/entry/consistency Consistency (Chinese)] (2010-07-25)
 * 2010-05-30 : [http://coderjournal.com/2010/03/cassandra-jump-start-for-the-windows-developer/ Cassandra Jump Start For The Windows Developer] (2010-07-27)
 * 2010-05-24 : [http://www.unixmen.com/linux-tutorials/960-install-nosql-cassandra-db-in-ubuntu-via-ppa-repository How to install NOSQL cassandra DB in ubuntu and debian via ppa repository] (2010-07-27)

== NoSQL : MongoDB ==

 * 2010-02-28 : [http://blog.boxedice.com/2010/02/28/notes-from-a-production-mongodb-deployment/ Notes from a production MongoDB deployment] - advice for large MongoDB deployments
 * 2009-07-25 : [http://blog.boxedice.com/2009/07/25/choosing-a-non-relational-database-why-we-migrated-from-mysql-to-mongodb Choosing a non-relational database; why we migrated from MySQL to MongoDB]
 * 2009-11-19 : [http://rubyconf2009.confreaks.com/19-nov-2009-16-20-getting-non-relational-with-mongodb-michael-dirolf.html Getting Non-Relational with MongoDB]

== NoSQL : CouchDB ==

 * [http://wiki.apache.org/couchdb/ CouchDB] ([wiki:jazz/09-09-23 2009-09-23])
 * [http://packages.ubuntu.com/couchdb Ubuntu couchdb package] ([wiki:jazz/09-09-23 2009-09-23])
 * [http://packages.debian.org/couchdb Debian couchdb package] ([wiki:jazz/09-09-23 2009-09-23])
 * [http://wiki.apache.org/couchdb/EntityRelationship Modeling Entity Relationships in CouchDB] ([wiki:jazz/09-09-23 2009-09-23])
 * [http://code.google.com/p/couchdb-fuse/ couchdb-fuse] - CouchDB FUSE File System ([wiki:jazz/09-09-23 2009-09-23])
 * [Video] [http://video.yahoo.com/watch/2278623/7162319 Next Generation Data Storage with CouchDB (speaker: Jan Lehnardt) - Part
1] ([wiki:jazz/09-09-23 2009-09-23])
 * [Video] [http://video.yahoo.com/watch/2278711/7162483 Next Generation Data Storage with CouchDB (speaker: Jan Lehnardt) - Part 2] ([wiki:jazz/09-09-23 2009-09-23])
 * While reading Wikipedia, I noticed that CouchDB is classified under both [http://en.wikipedia.org/wiki/Column-oriented_DBMS Column-oriented DBMS] and [http://en.wikipedia.org/wiki/Document-oriented_database Document-oriented database] ([wiki:jazz/09-09-23 2009-09-23])
 * [http://labs.mudynamics.com/2009/04/03/interactive-couchdb/ Interactive CouchDB] - demonstrates, in JavaScript, CouchDB combined with MapReduce implementations for different purposes.
 * [http://dotcloud.org/ dot.Cloud] - an open-source cloud federation platform. Judging from its feature list, it looks quite powerful ([wiki:jazz/09-09-23 2009-09-23])
   * Keep your servers under revision control
   * Stop worrying about maintaining state: just create and kill instances
   * Use tools you know: rsync, mercurial/git, ssh
   * Push a small upgrade to your images without moving gigabytes around
   * Replicate multi-server setups in just one command
   * Cleanly separate data (DB, logs, content) and code (OS, libraries, binaries, configuration)
   * Map data volumes to any available storage (NAS, EBS, S3)
   * Save bandwidth by delivering your app closer to the consumer
 * 2009-11-04 : [http://blog.cloudant.com/benchmarking-couchdb-with-baracus Benchmarking CouchDB with Baracus]

== NoSQL : Voldemort ==

 * [http://project-voldemort.com/ Voldemort] - a distributed key-value storage system ([wiki:jazz/09-09-18 2009-09-18])
   * used at !LinkedIn for certain high-scalability storage problems

== NoSQL : Redis ==

 * [http://code.google.com/p/redis/ redis] - A persistent key-value database with built-in net interface written in ANSI-C for Posix systems
 * [http://www.linux-mag.com/cache/7496/1.html Redis: Lightweight key/value Store That Goes the Extra Mile] - another lightweight key/value database ([wiki:jazz/09-09-02 2009-09-02])

== Use Case ==
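
One common use case for key-value stores such as those listed on this page is caching the result of a slower backend lookup. The following is only a minimal in-memory sketch of that idea. It is not the API of Redis, Voldemort, or any other store mentioned here: `TinyKV`, `load_profile`, the `profile:` key prefix, and the 60-second expiry are all hypothetical names and values chosen for illustration.

```python
import time


class TinyKV:
    """Hypothetical in-memory key-value store with optional per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expiry time or None)
        self._clock = clock  # injectable clock so expiry can be tested

    def set(self, key, value, ttl=None):
        # Store the value together with the time at which it expires.
        expires = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires)

    def get(self, key, default=None):
        item = self._data.get(key)
        if item is None:
            return default
        value, expires = item
        if expires is not None and self._clock() >= expires:
            # Expired entries are dropped lazily, on read.
            del self._data[key]
            return default
        return value


def load_profile(store, user_id, fetch):
    """Cache-aside lookup: try the store first, fall back to fetch()."""
    key = "profile:%s" % user_id
    cached = store.get(key)
    if cached is not None:
        return cached
    value = fetch(user_id)          # the slow path, e.g. a relational query
    store.set(key, value, ttl=60)   # assumed 60-second freshness window
    return value
```

In this sketch the caller passes the slow lookup in as `fetch`, so the same pattern applies whether the backing data lives in MySQL or elsewhere; a real deployment would replace `TinyKV` with a client for an actual store.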