
Posts

Showing posts from May, 2014

Mongo Learning Series 6

Week 6: Application Engineering. Mongo application engineering covers three topics: 1. Durability of writes 2. Availability / fault tolerance 3. Scaling. WriteConcern: Traditionally, an insert or update is performed as fire-and-forget. The Mongo shell, however, wants to know whether the operation succeeded, so it calls getLastError after every operation. getLastError takes a couple of arguments that control how the operation is acknowledged. w:1 waits for a write acknowledgement. This is still not durable: it returns true once the change has been made in memory, not necessarily after it has been written to disk, so if the system fails before the write reaches disk the data is lost. j:1 (journal) returns an acknowledgement only after the write has been recorded on disk, so the operation can be replayed if it is lost. Api.mongodb...
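The difference between the two acknowledgement levels can be sketched with a toy model in plain Python. This is not MongoDB code: `ToyServer`, `write`, and `crash_and_recover` are hypothetical names, and the model assumes the crash happens before an unjournaled write would have reached disk.

```python
class ToyServer:
    """Toy model of a server with in-memory data and an on-disk journal."""

    def __init__(self):
        self.memory = {}    # lost on a crash
        self.journal = []   # assumed to survive a crash

    def write(self, key, value, j=False):
        # Like w:1, the write is applied in memory and acknowledged.
        self.memory[key] = value
        if j:
            # Like j:1, acknowledge only after the journal has the write.
            self.journal.append((key, value))
        return True

    def crash_and_recover(self):
        # Memory is wiped; only journaled writes can be replayed.
        self.memory = {}
        for key, value in self.journal:
            self.memory[key] = value


server = ToyServer()
server.write("a", 1)           # acknowledged, but only in memory
server.write("b", 2, j=True)   # acknowledged after journaling
server.crash_and_recover()
print(server.memory)
```

In this model both writes return true, but only the journaled one is still present after the simulated crash.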

Mongo Learning Series 5

Week 5: Aggregation Framework. The aggregation pipeline is a framework for performing aggregation tasks, modeled on the concept of data processing pipelines. Using this framework, MongoDB passes the documents of a single collection through a pipeline. Let's say there is a table:

Name    Category      Manufacturer    Price
iPad    Tablet        Apple           499
S4      Cell Phone    Samsung         350

If I wanted to find out how many products there are from each manufacturer, in SQL I would use a query: select manufacturer, count(*) from products group by manufacturer. In Mongo, we use the aggregation framework to do something similar to "group by": use agg db.products.aggregate([ {$group: { _id:"$manufacturer", num_products:{$sum:1} }}]) Aggregation pipeline: Aggregation uses a pipeline in MongoDB. The concept of pipes is similar to Unix. At the top is the collection...
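To make the effect of the $group stage concrete, here is a minimal sketch in plain Python that counts products per manufacturer the same way. It does not talk to MongoDB; the `count_by_manufacturer` helper is a hypothetical name, and the documents are the two sample rows from the table.

```python
from collections import Counter

# The two sample rows from the table, written as documents.
products = [
    {"name": "iPad", "category": "Tablet", "manufacturer": "Apple", "price": 499},
    {"name": "S4", "category": "Cell Phone", "manufacturer": "Samsung", "price": 350},
]


def count_by_manufacturer(docs):
    """Group documents by manufacturer and add 1 per document,
    mirroring {$group: {_id: "$manufacturer", num_products: {$sum: 1}}}."""
    counts = Counter(doc["manufacturer"] for doc in docs)
    return [{"_id": maker, "num_products": n} for maker, n in counts.items()]


result = count_by_manufacturer(products)
for row in result:
    print(row)
```

Each output document has the grouping key in `_id` and the count in `num_products`, which is the shape the shell query above produces.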

Mongo Learning Series 4

Week 4: Performance. Indexes: Database performance is driven by indexes, in MongoDB as in any other database. A database stores its data in large files on disk, which represent the collection. The documents are in no particular order on disk; a given document could be anywhere. When you query for a particular document, what the database has to do by default is scan through the entire collection to find the data. This is called a table scan in a relational DB and a collection scan in MongoDB, and it is death to performance: it will be extremely slow. Instead, the data is indexed so that queries perform better. How does indexing work? If something is ordered/sorted, then it is quick to find the data. MongoDB keeps the keys ordered. It does not keep them in a linear list, but uses a B-tree. To find an item, look up the key in the index, which has a pointer to the document, and retrieve the document through it. In MongoDB, indexes are ordered lists of keys. Example: (name, Hair_...
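The contrast between a collection scan and an index lookup can be sketched in plain Python. This is only an illustration under stated assumptions: a sorted list searched with `bisect` stands in for the B-tree, the sample documents are made up, and `collection_scan`, `build_index`, and `index_lookup` are hypothetical helper names, not MongoDB APIs.

```python
import bisect

# Made-up documents, stored in no particular order (as on disk).
documents = [
    {"name": "Zoe"},
    {"name": "Andrew"},
    {"name": "Carol"},
    {"name": "Bob"},
]


def collection_scan(docs, field, key):
    """Check every document in turn until one matches."""
    for doc in docs:
        if doc[field] == key:
            return doc
    return None


def build_index(docs, field):
    """Ordered list of (key, position) pairs; the position plays
    the role of the pointer back to the document."""
    return sorted((doc[field], pos) for pos, doc in enumerate(docs))


def index_lookup(docs, index, key):
    """Binary-search the ordered keys, then follow the pointer."""
    i = bisect.bisect_left(index, (key,))
    if i < len(index) and index[i][0] == key:
        return docs[index[i][1]]
    return None


name_index = build_index(documents, "name")
print(index_lookup(documents, name_index, "Carol"))
```

Both functions return the same document, but the scan may touch every document while the lookup only searches the ordered keys, which is the reason ordering makes the search quick.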