Introduction to MongoDB
Education & Research
2012 Infosys Limited, Bangalore, India. All rights reserved. Infosys believes the information in this document is accurate as of its publication date; such
information is subject to change without notice. Infosys acknowledges the proprietary rights of other companies to the trademarks, product names and such
other intellectual property rights mentioned in this document. Except as expressly permitted, neither this document nor any part of it may be reproduced, stored
in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, printing, photocopying, recording or otherwise, without the prior
permission of Infosys Limited and/or any named intellectual property rights holders under this document.
ER/CORP/CRS/ERCLD0008/003
Course Objectives
Performing basic operations through shell prompt
Session Plan
Querying Mongo
Aggregation
Indexing
Backup and Restore
Import and Export Data
Mongo tools
Querying Mongo
Examples
db.trainees.find({stream: "Java", track: "fast track"}, {name: 1, emp_id: 1})
This returns the name, emp_id and _id of every document whose stream equals
"Java" and whose track equals "fast track". The _id field is returned unless it
is explicitly excluded.
db.trainees.find({},{name:1, emp_id:1, _id:0})
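This returns the name and emp_id of all trainees, because the empty query
document {} places no condition on the documents (there is no "where" clause).
null can also be specified instead of {}.
{$nin: ["Java", "OS"]}, applied to a field, retrieves the trainees whose value
for that field is neither Java nor OS.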
db.collection_name.find({key: {$all: [v1, v2]}})
retrieves the documents whose key array contains all the values passed in the
argument
Example: db.trainees.find({module: {$all: ["JPA", "POJO"]}}) retrieves the
trainees whose module array contains both JPA and POJO
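To make the $all check concrete, here is a minimal plain-JavaScript sketch that applies the same test to an in-memory array instead of a collection. The helper name matchesAll and the sample trainee documents are made up for illustration; this is not how MongoDB implements the operator.

```javascript
// Hypothetical sample data; only the "module" array field comes from the example above.
const trainees = [
  { name: "A", module: ["JPA", "POJO", "JDBC"] },
  { name: "B", module: ["JPA"] },
  { name: "C", module: ["POJO", "JPA"] },
];

// True when the array contains every requested value, in any order.
function matchesAll(array, values) {
  return Array.isArray(array) && values.every((v) => array.includes(v));
}

// Roughly what {module: {$all: ["JPA", "POJO"]}} selects.
const selected = trainees.filter((t) => matchesAll(t.module, ["JPA", "POJO"]));
console.log(selected.map((t) => t.name)); // trainees A and C
```

Trainee B is left out because its array holds only one of the two requested values; the order of values inside the array does not matter.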
db.collection_name.find({key: {$size: value}})
retrieves the documents whose key array has exactly the number of elements
specified in value
Example: db.trainees.find({module: {$size: 4}}) selects the
trainees who have completed exactly 4 modules
The $size operator cannot take a range as its value, i.e., $gt, $lt, $gte, $lte
and $ne cannot be used inside $size; the value can only be an integer
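db.collection_name.find({}, {array_field: {$slice: n}})
returns only part of array_field in the result: the first n elements when n is
positive, the last n elements when n is negative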
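As a rough illustration of what the $slice projection hands back, the plain-JavaScript sketch below applies the same rule to an in-memory array. The helper name sliceProjection and the sample module list are assumptions made for this example, and only the single-number form {$slice: n} is covered.

```javascript
// Mimics the projection {array_field: {$slice: n}} for one array.
// Positive n keeps the first n elements; negative n keeps the last |n| elements.
function sliceProjection(array, n) {
  return n >= 0 ? array.slice(0, n) : array.slice(n);
}

// Hypothetical array value of a trainee's module field.
const modules = ["JPA", "POJO", "JDBC", "Servlets"];

const firstTwo = sliceProjection(modules, 2);  // first two modules
const lastOne = sliceProjection(modules, -1);  // last module only
console.log(firstTwo, lastOne);
```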
$elemMatch
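Problem Statement: Find all the employees who are certified in AWS with grade A
Expected output: Only the second document must be returned
A query that tests the name and the grade as two separate conditions also
returns the first document: it has one array element with name AWS and another
array element with grade A, so both selection conditions are satisfied, although
not by the same element. $elemMatch returns a document only when a single array
element satisfies all the conditions.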
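The difference can be sketched in plain JavaScript on two in-memory documents. The array field name certifications, the emp_id values and the sample data are assumptions (only name, grade, AWS and grade A come from the problem statement), and the filters merely imitate the matching rules rather than MongoDB's own code.

```javascript
// Hypothetical documents: the array field name "certifications" is assumed.
const employees = [
  { emp_id: 1, certifications: [{ name: "AWS", grade: "B" }, { name: "Java", grade: "A" }] },
  { emp_id: 2, certifications: [{ name: "AWS", grade: "A" }] },
];

// Like {"certifications.name": "AWS", "certifications.grade": "A"}:
// each condition may be satisfied by a different array element.
const separateConditions = employees.filter(
  (e) =>
    e.certifications.some((c) => c.name === "AWS") &&
    e.certifications.some((c) => c.grade === "A")
);

// Like {certifications: {$elemMatch: {name: "AWS", grade: "A"}}}:
// one and the same element must satisfy both conditions.
const elemMatch = employees.filter((e) =>
  e.certifications.some((c) => c.name === "AWS" && c.grade === "A")
);

console.log(separateConditions.map((e) => e.emp_id)); // both documents
console.log(elemMatch.map((e) => e.emp_id));          // only the second document
```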
Example:
db.trainees.find({$where: "this.currentCDP > this.previousCDP"})
db.trainees.find({$where: function() { return this.currentCDP > this.previousCDP; }})
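A small plain-JavaScript sketch of how a $where function is evaluated: the function runs once per document with `this` bound to that document, and the document is kept when the function returns true. The sample trainees and the function name improved are invented for this illustration.

```javascript
// Hypothetical documents; currentCDP and previousCDP are the fields from the example above.
const trainees = [
  { name: "A", currentCDP: 4, previousCDP: 3 },
  { name: "B", currentCDP: 2, previousCDP: 3 },
];

// The same kind of predicate that would be passed to $where.
function improved() {
  return this.currentCDP > this.previousCDP;
}

// Emulates the per-document evaluation: `this` is the document being tested.
const matched = trainees.filter((doc) => improved.call(doc));
console.log(matched.map((doc) => doc.name)); // only trainee A
```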
db.collection_name.stats()
gives the number of indexes on that collection, the total size of all indexes
and the individual size of each index, along with other information
db.getLastError()
gives the details of the last error that occurred during a write operation
if any
db.copyDatabase("mysourcedb", "mydestdb", "MYSGEC240748D:27017")
db.collection_name.find().sort({key: n})
sorts the result by key; n is 1 for ascending order and -1 for descending order
AGGREGATION
Distinct
db.collection_name.distinct(key) returns an array of the distinct values of the
passed key
db.collection_name.distinct(key, {JSON for where clause}) returns the distinct values of
the passed key among the documents that meet the search criteria
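The two forms of distinct can be imitated on an in-memory array, which also shows that the result is a list of values rather than a list of documents. This is only a sketch for scalar fields: the helper below, the sample trainees and the track value "regular" are assumptions, and the optional filter function stands in for the query document.

```javascript
// Hypothetical documents using the stream and track fields from the earlier examples.
const trainees = [
  { name: "A", stream: "Java", track: "fast track" },
  { name: "B", stream: "OS", track: "fast track" },
  { name: "C", stream: "Java", track: "regular" },
];

// Emulates distinct(key, query): filter first, then collect unique values of the key.
function distinct(docs, key, query = () => true) {
  const values = docs
    .filter(query)
    .map((doc) => doc[key])
    .filter((value) => value !== undefined); // documents without the key contribute nothing
  return [...new Set(values)];
}

const allStreams = distinct(trainees, "stream");
const regularStreams = distinct(trainees, "stream", (doc) => doc.track === "regular");
console.log(allStreams, regularStreams);
```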
Group
Assume there is an employee_details relational database
table with the fields emp_no, emp_name, role, experience and
resources_allocated.
db.employee_details.group({
    key: { role: 1 },
    cond: { experience: { $lt: 3 } },
    reduce: function (cur, result) {
        result.total_resources += cur.resources_allocated;
    },
    initial: { total_resources: 0 }
})
Now suppose we have to group the employee_details whose experience is less than 3 by both role and
experience, and then calculate the sum of resources allocated to each group. In MongoDB this is written as follows
db.employee_details.group({
    key: { role: 1, experience: 1 },
    cond: { experience: { $lt: 3 } },
    reduce: function (cur, result) {
        result.total_resources += cur.resources_allocated;
    },
    initial: { total_resources: 0 }
})
The group function cannot be used with a sharded cluster. For a sharded cluster,
the aggregation framework has to be used.
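How the key, cond, reduce and initial parts of a group call work together can be sketched in plain JavaScript on an in-memory array. The group helper below, its keyFields parameter and the sample rows are assumptions made for illustration; it mirrors the idea of the shell call, not MongoDB's implementation.

```javascript
// Hypothetical rows; the field names follow the employee_details example.
const employeeDetails = [
  { emp_no: 1, role: "developer", experience: 2, resources_allocated: 3 },
  { emp_no: 2, role: "developer", experience: 1, resources_allocated: 2 },
  { emp_no: 3, role: "tester", experience: 2, resources_allocated: 4 },
  { emp_no: 4, role: "tester", experience: 5, resources_allocated: 9 },
];

// Emulates group({key, cond, reduce, initial}) on an array of documents.
function group(docs, { keyFields, cond, reduce, initial }) {
  const groups = new Map();
  for (const cur of docs.filter(cond)) {          // cond: keep only matching documents
    const key = {};
    for (const field of keyFields) key[field] = cur[field];
    const id = JSON.stringify(key);               // one bucket per distinct key
    if (!groups.has(id)) groups.set(id, { ...key, ...initial }); // initial: fresh copy per group
    reduce(cur, groups.get(id));                  // reduce: fold the document into its group
  }
  return [...groups.values()];
}

const totals = group(employeeDetails, {
  keyFields: ["role"],
  cond: (doc) => doc.experience < 3,
  reduce: (cur, result) => { result.total_resources += cur.resources_allocated; },
  initial: { total_resources: 0 },
});
console.log(totals);
```

With this sample data the tester with 5 years of experience is filtered out by cond before grouping, so only three rows contribute to the two totals.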
Aggregation framework
Used to calculate aggregated values without map-reduce on a sharded cluster
Provides similar functionality to
Framework Components
A pipeline is a sequence of operators that the aggregation framework applies in order.
The operators can be chained, each one working on the output of the previous one.
The pipeline operators are $project, $match, $limit, $skip, $unwind, $group and $sort.
Expressions are the operators that calculate values when the pipeline operators
are executed. The available expressions are classified into
boolean, comparison, arithmetic, string, date and conditional operators.
A $project stage that selects role and experience retrieves the role, experience and _id from all the documents.
Here it is assumed that specialization is an array field and that the pipeline unwinds it with $unwind.
If a particular document has 2 specializations, the output has two entries with the same
emp_no and emp_name that differ only in specialization.
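The effect of unwinding an array field can be sketched in plain JavaScript. The unwind helper and the sample employee are hypothetical; the sketch only imitates the default behaviour of producing one output document per array element.

```javascript
// Hypothetical document with an array-valued specialization field.
const employee = { emp_no: 101, emp_name: "Asha", specialization: ["Java", "MongoDB"] };

// Emulates {$unwind: "$specialization"}: one copy of the document per array element,
// with the array replaced by that single element. Documents without the array yield nothing.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    (doc[field] || []).map((value) => ({ ...doc, [field]: value }))
  );
}

const unwound = unwind([employee], "specialization");
console.log(unwound);
```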
This $group stage groups the employees by role (given as the value of _id). It also gives
the total number of employees under each role, since it adds 1 to the group's count
each time it encounters a matching document, and the total resources allocated to
that role, since it adds up the individual resources of the group's employees.
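The counting and summing described here can be followed step by step in a plain-JavaScript sketch. The output field names count and total_resources and the sample rows are assumptions; the loop imitates what a $group stage with two $sum accumulators computes, one adding 1 per document and one adding the resources_allocated value.

```javascript
// Hypothetical rows; the field names follow the employee_details example.
const employeeDetails = [
  { emp_no: 1, role: "developer", resources_allocated: 3 },
  { emp_no: 2, role: "tester", resources_allocated: 4 },
  { emp_no: 3, role: "developer", resources_allocated: 2 },
];

// One accumulator object per role, keyed by the value that would become _id.
const byRole = new Map();
for (const doc of employeeDetails) {
  const groupDoc = byRole.get(doc.role) || { _id: doc.role, count: 0, total_resources: 0 };
  groupDoc.count += 1;                                  // like {$sum: 1}
  groupDoc.total_resources += doc.resources_allocated;  // like {$sum: "$resources_allocated"}
  byRole.set(doc.role, groupDoc);
}

const grouped = [...byRole.values()];
console.log(grouped);
```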
Aggregation framework
More Examples
SELECT SUM(resources_allocated) AS total_resources
FROM employee_details
db.employee_details.aggregate([
    { $group: { _id: null,
                total_resources: { $sum: "$resources_allocated" } } }
])
Indexing
Query performance can be improved by creating appropriate indexes
db.collection_name.ensureIndex({key:1})
1 represents ascending Index and -1 represents descending Index
The ensureIndex() method accepts an optional second parameter, a document of index options such as {unique: true}
db.collection_name.getIndexes() returns all the indexes created on the
collection
db.collection_name.reIndex() rebuilds all the indexes on the collection
db.collection_name.totalIndexSize() gives the total size in bytes of all the
indexes on the collection
Index Types
_id Index
_id index is a unique index on the _id field
MongoDB creates this index by default on all collections
Cannot delete the index on _id.
Secondary Indexes
All indexes in MongoDB are secondary indexes
Indexes can be created on any field within any document or sub-document
They can be indexes on sub-documents, indexes on embedded fields, or compound indexes
Summary
Querying Mongo
Aggregation
Indexing
Backup and Restore
Import and Export Data
Mongo tools