MongoDB Day2 PDF

Usage Guidelines
Do not forward this document to any non-Infosys mail ID. Forwarding this document to a
non-Infosys mail ID may lead to disciplinary action against you, including termination of
employment.
Contents of this material cannot be used in any other internal or external document
without explicit permission from E&R@infosys.com.
Introduction to MongoDB
Education & Research
2012 Infosys Limited, Bangalore, India. All rights reserved. Infosys believes the information in this document is accurate as of its publication date; such
information is subject to change without notice. Infosys acknowledges the proprietary rights of other companies to the trademarks, product names and such
other intellectual property rights mentioned in this document. Except as expressly permitted, neither this document nor any part of it may be reproduced, stored
in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, printing, photocopying, recording or otherwise, without the prior
permission of Infosys Limited and/or any named intellectual property rights holders under this document.
ER/CORP/CRS/ERCLD0008/003
Confidential Information
This Document is confidential to Infosys Limited. This document contains information and data that Infosys considers confidential
and proprietary (Confidential Information).
Confidential Information includes, but is not limited to, the following:
Corporate and Infrastructure information about Infosys;
Infosys project management and quality processes;
Project experiences provided included as illustrative case studies.
Any disclosure of Confidential Information to, or use of it by a third party, will be damaging to Infosys.
Ownership of all Infosys Confidential Information, no matter in what media it resides, remains with Infosys.
Confidential information in this document shall not be disclosed, duplicated or used in whole or in part for any purpose without
specific written permission of an authorized representative of Infosys.
Course Objectives
Perform basic operations through the shell prompt
Perform aggregation functions in the shell
Create and manage indexes
Import and export data
Session Plan
Querying Mongo
Aggregation
Indexing
Backup and Restore
Import and Export Data
Mongo tools
Querying Mongo
Querying Mongo : Selection

db.collection_name.find({JSON_for_where_clause})
Example: db.trainees.find({stream: "Java", track: "fast track"})

This returns all documents where stream equals "Java" and track equals "fast track".
Querying Mongo : Selection & Projection

db.collection_name.find({JSON_for_where_clause}, {JSON_for_select_clause})
Examples
db.trainees.find({stream: "Java", track: "fast track"}, {name: 1, emp_id: 1})
This returns the name, emp_id and document _id of all documents where stream equals "Java" and track equals "fast track". The _id field is included unless it is explicitly excluded.
db.trainees.find({}, {name: 1, emp_id: 1, _id: 0})
This returns the name and emp_id of all trainees, as there is no where clause. null can also be specified instead of {}.
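The selection and projection rules above can be sketched in plain JavaScript against an in-memory array. This is only an illustration of the semantics, not how MongoDB implements find(); the helper name findSketch and the sample trainees are assumptions made up for this example.

```javascript
// In-memory stand-in for the trainees collection (sample data is made up).
const trainees = [
  { _id: 1, name: "Asha", emp_id: "620101", stream: "Java", track: "fast track" },
  { _id: 2, name: "Ravi", emp_id: "620102", stream: "OS", track: "fast track" },
  { _id: 3, name: "Meena", emp_id: "610103", stream: "Java", track: "intermediate" },
];

// Hypothetical helper: keep documents whose fields equal every key in `where`,
// then copy only the fields flagged with 1 in `select`.
// _id is kept unless it is explicitly set to 0.
function findSketch(docs, where, select) {
  const matches = docs.filter((doc) =>
    Object.keys(where).every((key) => doc[key] === where[key])
  );
  if (!select) return matches;
  return matches.map((doc) => {
    const out = {};
    if (select._id !== 0) out._id = doc._id;
    for (const key of Object.keys(select)) {
      if (select[key] === 1 && key in doc) out[key] = doc[key];
    }
    return out;
  });
}

const fastJava = findSketch(
  trainees,
  { stream: "Java", track: "fast track" },
  { name: 1, emp_id: 1 }
);
const allNames = findSketch(trainees, {}, { name: 1, emp_id: 1, _id: 0 });
console.log(fastJava);
console.log(allNames);
```

The first call keeps _id because the projection does not exclude it; the second drops it because of _id: 0.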
Querying Mongo : Operators

db.collection_name.find({$or: [{key1: v1}, {key2: v2}]})
selects the documents that satisfy at least one of the conditions

Example: db.trainees.find({batch: "Jan12CS", $or: [{stream: "Java"}, {stream: "OS"}, {track: "intermediate"}]})
This selects the trainees from the Jan12CS batch who belong to the Java stream, the OS stream, or the intermediate track.
db.collection_name.find({key1: {$gt: v1}})
fetches all documents whose key1 value is greater than v1

Example: db.trainees.find({GPA: {$gt: 4, $lt: 4.9}})
This returns the details of all trainees whose GPA is greater than 4 and less than 4.9.
Similarly, $gte, $lt, $lte and $ne check greater than or equal, less than, less than or equal, and not equal respectively.
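As a rough illustration of how several comparison operators on one key combine, the sketch below evaluates a condition object against plain numbers. The names comparators and satisfies are hypothetical helpers for this example, not MongoDB functions.

```javascript
// Hypothetical evaluator for a condition such as { $gt: 4, $lt: 4.9 }.
const comparators = {
  $gt: (a, b) => a > b,
  $gte: (a, b) => a >= b,
  $lt: (a, b) => a < b,
  $lte: (a, b) => a <= b,
  $ne: (a, b) => a !== b,
};

// Every operator listed in the condition must hold for the value (implicit AND).
function satisfies(value, condition) {
  return Object.keys(condition).every((op) => comparators[op](value, condition[op]));
}

const gpas = [3.8, 4, 4.5, 4.9, 5];
const inRange = gpas.filter((gpa) => satisfies(gpa, { $gt: 4, $lt: 4.9 }));
console.log(inRange); // only values strictly between 4 and 4.9
```

Because both bounds are strict, the boundary values 4 and 4.9 are excluded; $gte and $lte would include them.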

db.collection_name.find({key: {$in: [v1, v2]}})
retrieves those documents whose key value is equal to either v1 or v2.
Example: db.trainees.find({stream: {$in: ["Java", "OS"]}}) retrieves the details of the Java and OS stream trainees
{$nin: ["Java", "OS"]} retrieves the details of trainees who are in neither the Java nor the OS stream.
db.collection_name.find({key: {$all: [v1, v2]}})
retrieves those documents whose key array contains all the values passed in the argument
Example: db.trainees.find({module: {$all: ["JPA", "POJO"]}}) retrieves the details of those trainees whose module array has both JPA and POJO in it

db.collection_name.find({key: {$size: value}})
retrieves the documents whose key array has the size specified in value
Example: db.trainees.find({module: {$size: 4}}) selects the trainees who have completed 4 modules
$size cannot take a range as its value, i.e., $gt, $lt, $gte, $lte and $ne cannot be used inside $size; the value can only be an integer
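The $in, $nin, $all and $size checks described above can be pictured with the following plain JavaScript sketch. It is a simplification: $in and $nin here handle a scalar field only, and the object arrayOps and the sample trainee are assumptions for illustration.

```javascript
// Hypothetical predicates mirroring the operator descriptions above.
const arrayOps = {
  $in: (value, list) => list.includes(value),
  $nin: (value, list) => !list.includes(value),
  $all: (value, list) => Array.isArray(value) && list.every((v) => value.includes(v)),
  $size: (value, n) => Array.isArray(value) && value.length === n,
};

// Made-up sample document.
const trainee = { stream: "Java", module: ["JPA", "POJO", "JDBC", "Servlets"] };

const inJavaOrOs = arrayOps.$in(trainee.stream, ["Java", "OS"]);
const notInJavaOrOs = arrayOps.$nin(trainee.stream, ["Java", "OS"]);
const hasJpaAndPojo = arrayOps.$all(trainee.module, ["JPA", "POJO"]);
const hasFourModules = arrayOps.$size(trainee.module, 4);
console.log(inJavaOrOs, notInJavaOrOs, hasJpaAndPojo, hasFourModules);
```

Note that the $size check is an exact length comparison, which matches the restriction that it cannot take a range.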
db.collection_name.find({}, {array_field: {$slice: n}})
n can be positive or negative

This returns only the first n values of the given array (if n is positive) or the last n values (if n is negative)
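The positive and negative cases of $slice behave much like JavaScript's own Array.prototype.slice. A minimal sketch, where sliceSketch is a hypothetical helper and the module list is made up:

```javascript
// Hypothetical helper: first n values for a non-negative n, last |n| values otherwise.
function sliceSketch(values, n) {
  return n >= 0 ? values.slice(0, n) : values.slice(n);
}

const modules = ["JPA", "POJO", "JDBC", "Servlets"];
const firstTwo = sliceSketch(modules, 2);
const lastOne = sliceSketch(modules, -1);
console.log(firstTwo, lastOne);
```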

db.collection_name.find({key: {$exists: true}})
useful to retrieve only those documents which have an entry for a particular key.
Example: db.trainees.find({certification: {$exists: true}}) selects the details of those trainees whose documents have a certification field.

db.collection_name.find({key: perl_compatible_regex})
selects those documents whose key has a value that matches the given Perl-compatible regular expression.
Example:
db.trainees.find({name: /an/i}) retrieves all trainees whose name contains the character sequence "an"
The i at the end specifies that the regular expression is case insensitive.
db.trainees.find({emp_id: /^620/}) retrieves all trainees whose emp_id starts with 620.
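The two patterns above can be tried directly in JavaScript, since the shell uses the same regular expression literal syntax. The sample names and IDs below are made up, and the IDs are assumed to be stored as strings, because a regular expression only matches string values.

```javascript
// Made-up sample values.
const names = ["Anand", "Kiran", "Meena", "Suman"];
const empIds = ["620145", "106201", "620999"];

// /an/i matches "an" anywhere in the value, ignoring case.
const withAn = names.filter((name) => /an/i.test(name));

// /^620/ matches only values that begin with 620.
const startsWith620 = empIds.filter((id) => /^620/.test(id));

console.log(withAn, startsWith620);
```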

db.collection_name.find({key: {$type: value}})
selects only those documents in which the data type of the key's value matches the data type passed.
Assume that for the field certification, a few documents hold the name of the course (a string), a few hold the number of certifications (a Double) and a few hold null. To select only those documents with the course name, use the query
db.trainees.find({certification: {$type: 2}})
Data types and the values to be passed:
Double: 1; String: 2; Array: 4; Object id: 7;
Boolean: 8; Date: 9; Null: 10; Regular Expression: 11
Accessing embedded document

db.collection_name.find({"parent_key.embedded_key": value}) is used to find those documents in which the embedded document's embedded_key is equal to the value passed. The dotted key must be quoted.
Eg: db.trainees.find({"project.IDE": "Eclipse"}) retrieves the trainees who use the Eclipse IDE for their project.
$elemMatch
Consider a collection CDP with the following documents
{ emp_id: 101, certification:
[ { name: "Big Data", grade: "A" },
{ name: "AWS", grade: "B" }
]
}
{ emp_id: 102, certification:
[ { name: "Hadoop", grade: "B" },
{ name: "AWS", grade: "A" }
]
}
Problem Statement: To find all the employees who are certified in AWS with grade A
Expected output: Only the second document must be returned

db.CDP.find({"certification.name": "AWS", "certification.grade": "A"})
This returns both documents:
The first document is returned because it has one array element with name "AWS" and another array element with grade "A", so each condition is satisfied by some element
The second document is returned because it has a single array element with both name "AWS" and grade "A"
To get the desired output (only the second document), a document should be selected only when both conditions are satisfied by the same array element
The $elemMatch operator does this:
db.CDP.find({certification: {$elemMatch: {name: "AWS", grade: "A"}}})
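The difference between the dotted-field query and $elemMatch can be reproduced on the two sample documents with plain JavaScript. This is a sketch of the matching semantics only; the variable names are chosen for this example.

```javascript
// The two CDP documents from the example above.
const CDP = [
  { emp_id: 101, certification: [{ name: "Big Data", grade: "A" }, { name: "AWS", grade: "B" }] },
  { emp_id: 102, certification: [{ name: "Hadoop", grade: "B" }, { name: "AWS", grade: "A" }] },
];

// Dotted-field semantics: each condition may be met by a different array element.
const dottedMatch = CDP.filter(
  (doc) =>
    doc.certification.some((c) => c.name === "AWS") &&
    doc.certification.some((c) => c.grade === "A")
);

// $elemMatch semantics: a single array element must meet both conditions.
const elemMatch = CDP.filter((doc) =>
  doc.certification.some((c) => c.name === "AWS" && c.grade === "A")
);

console.log(dottedMatch.map((d) => d.emp_id)); // both employees
console.log(elemMatch.map((d) => d.emp_id)); // only employee 102
```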

$where
Helps to use a JavaScript expression (as a string) or a JavaScript function in a query
The JavaScript expression or function is evaluated against each document
Each document is referred to as this or obj in the JavaScript
Example:
db.trainees.find({$where: "this.currentCDP > this.previousCDP"})
db.trainees.find({$where: function() { return this.currentCDP > this.previousCDP; }})
$where is always executed as the last filter during selection
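The way a $where function sees each document as this can be imitated with Function.prototype.call. The sketch below is only an analogy for the evaluation model; the sample scores and the function name improved are assumptions.

```javascript
// Made-up sample documents.
const cdpScores = [
  { name: "Asha", currentCDP: 8, previousCDP: 6 },
  { name: "Ravi", currentCDP: 5, previousCDP: 7 },
];

// The predicate refers to the document through `this`, as a $where function does.
function improved() {
  return this.currentCDP > this.previousCDP;
}

// Run the predicate once per document, binding `this` to that document.
const improvedTrainees = cdpScores.filter((doc) => improved.call(doc));
console.log(improvedTrainees.map((d) => d.name));
```

Since the predicate runs once per document, it pays to narrow the candidates with ordinary query conditions wherever possible.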
Querying Mongo : Functions

db.collection_name.find().count()
Number of documents in the given collection

db.collection_name.find().explain()
Number of objects scanned, time taken to scan and other useful information
db.collection_name.distinct(key)
Returns an array of distinct values for the key

db.collection_name.help()
gives all the commands that can be performed on the collection

db.help()
gives all the commands that can be performed on the database

db.stats()
gives information about the database such as its name, the number of collections and indexes, and the amount of memory used by it
db.collection_name.stats()
gives the number of indexes on that collection, the total size of all indexes and the individual size of each index, along with other information
db.getLastError()
gives the details of the last error, if any, that occurred during a write operation

db.serverStatus() gives details about the host server, the MongoDB version, the process (mongod / mongos), the memory used by the server, the number of client connections, the different operations executed by the server, and the cursor type used.
db.currentOp() returns a document whose inprog array contains information (such as operation ID, seconds running, operation name, namespace, the client that issued the operation, and lock status) about all currently executing operations.

To copy a database between two server instances, the copyDatabase() function can be used from the destination server instance
Example:
db.copyDatabase("mysourcedb", "mydestdb", "MYSGEC240748D:27017")
This copies the database mysourcedb from the server running at MYSGEC240748D:27017 to the destination server (the current server) under the name mydestdb
Querying Mongo : Limiting & Ordering

db.collection_name.find().limit(n)
limits the results to n documents.
db.collection_name.find().sort({key: n})
sorts the result on the field key
n can be either 1 (ascending) or -1 (descending)
Querying Mongo : Skipping and Chaining

db.collection_name.find().skip(n)
skips the first n documents of the result set of find()
limit(), sort() and skip() can be chained.
Example: db.trainees.find().sort({emp_id: 1}).limit(10).skip(5) displays ten trainees sorted by emp_id, after skipping the first five documents of the sorted result. The server applies sort, then skip, then limit, regardless of the order in which they are chained.
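The effect of sort, skip and limit in the example above can be pictured on a plain array. This is a sketch under the assumption that emp_id values are the numbers 1 to 20; the variable names are made up.

```javascript
// Made-up emp_id values in descending order: 20, 19, ..., 1.
const unsortedIds = Array.from({ length: 20 }, (_, i) => 20 - i);

const skipCount = 5;
const limitCount = 10;

// Sort ascending first, then skip, then limit.
const page = [...unsortedIds]
  .sort((a, b) => a - b)
  .slice(skipCount, skipCount + limitCount);

console.log(page); // ten ids, starting after the five smallest
```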
Quiz : Provide the Mongo equivalent

1. INSERT INTO users(user_id, age, status) VALUES ("bcd001", 45, "A")
Answer: db.users.insert( { user_id: "bcd001", age: 45, status: "A" } )
2. SELECT user_id, status FROM users

Answer: db.users.find( { }, { user_id: 1, status: 1, _id: 0 } )
3. SELECT user_id, status FROM users WHERE status = "A"

Answer: db.users.find( { status: "A" }, { user_id: 1, status: 1, _id: 0 } )
4. SELECT * FROM users WHERE status = "A" OR age = 50
Answer : db.users.find( { $or: [ { status: "A" } , { age: 50 } ] } )
5. SELECT * FROM users WHERE age > 25 AND age <= 50

Answer: db.users.find( { age: { $gt: 25, $lte: 50 } } )
Quiz : Provide the Mongo equivalent

6. SELECT * FROM users WHERE user_id like "%bc%"
Answer: db.users.find( { user_id: /bc/ } )
7. SELECT * FROM users WHERE status = "A" ORDER BY user_id DESC

Answer: db.users.find( { status: "A" } ).sort( { user_id: -1 } )
8. SELECT COUNT(*) FROM users

Answer: db.users.count() OR db.users.find().count()
9. SELECT COUNT(user_id) FROM users

Answer : db.users.count( { user_id: { $exists: true } } )
10. SELECT DISTINCT(status) FROM users

Answer: db.users.distinct( "status" )
AGGREGATION
Simple Aggregation Functions

Count
db.collection_name.count() gives the number of documents present in the collection.
db.collection_name.count({JSON for where clause}) will give the number of documents
with the specified selecting criteria.
Distinct
db.collection_name.distinct(key) returns an array of the distinct values of the passed key
db.collection_name.distinct(key, {JSON for where clause}) returns the distinct values of the passed key among the documents that meet the search criteria
Simple Aggregation Functions (Contd.)
Group
Assume there is an employee_details relational database table with fields emp_no, emp_name, role, experience and resources_allocated.
The MongoDB document equivalent will be like

{ emp_no: 6475, emp_name: "amit", role: "project lead", experience: 7, resources_allocated: 5 }

Now if we have to group the employee_details based on the role and calculate the sum of resources allocated to each role, the SQL query will be as follows
SELECT role, SUM(resources_allocated) as total_resources
FROM employee_details
GROUP BY role
The MongoDB equivalent will be

db.employee_details.group( {
key: { role: 1 },
reduce: function ( cur, result ) {
result.total_resources += cur.resources_allocated;
},
initial: { total_resources: 0 }
})

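Now if we have to group those employee_details whose experience is less than 3, based on the role, and calculate the sum of resources allocated to each role, the SQL query will be as follows
SELECT role, SUM(resources_allocated) as total_resources
FROM employee_details
WHERE experience < 3
GROUP BY role
The MongoDB equivalent will be
db.employee_details.group( {
key: { role: 1 },
cond: { experience: { $lt: 3 } },
reduce: function ( cur, result ) {
result.total_resources += cur.resources_allocated;
},
initial: { total_resources: 0 }
})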
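Now if we have to group those employee_details whose experience is less than 3, based on the role and experience, and then calculate the sum of resources allocated to each role and experience combination, the SQL query will be as follows
SELECT role, experience, SUM(resources_allocated) as total_resources
FROM employee_details
WHERE experience < 3
GROUP BY role, experience
The MongoDB equivalent will be
db.employee_details.group( {
key: { role: 1, experience: 1 },
cond: { experience: { $lt: 3 } },
reduce: function ( cur, result ) {
result.total_resources += cur.resources_allocated;
},
initial: { total_resources: 0 }
})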

Group Syntax:
db.collection_name.group({key, reduce, initial, [keyf,][cond,][finalize]})
key: specifies the key based on which grouping should be done.
reduce: a function that specifies what operation (like count, sum) has to be performed on the documents being grouped. The function takes two parameters: the current document and the result accumulated up to the previous document.
initial: the result of the aggregation operation is initialized with this value at the beginning of the operation.
keyf: an alternative to the key field. This function is defined when grouping has to be done based on derived values rather than on the fields themselves.
cond: specifies the selection criteria. Only the documents that satisfy this condition are considered for grouping.
finalize: a function that specifies the changes that need to be made to the final result set.
The group function cannot be used with a sharded cluster. For a sharded cluster, the aggregation framework has to be used.
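How key, cond, reduce and initial work together can be sketched in plain JavaScript. This is an illustration under assumptions, not the server's implementation: groupSketch is a hypothetical helper, cond is simplified to a predicate function instead of a query document, and the employee records are made up.

```javascript
// Hypothetical in-memory imitation of group(): one result object per distinct key.
function groupSketch(docs, { key, cond = () => true, reduce, initial }) {
  const groups = new Map();
  for (const doc of docs) {
    if (!cond(doc)) continue; // only qualifying documents are grouped
    const keyValues = {};
    for (const field of Object.keys(key)) keyValues[field] = doc[field];
    const id = JSON.stringify(keyValues);
    // Each new group starts from a copy of `initial`.
    if (!groups.has(id)) groups.set(id, { ...keyValues, ...initial });
    reduce(doc, groups.get(id)); // fold the current document into the running result
  }
  return [...groups.values()];
}

// Made-up sample records.
const employees = [
  { emp_no: 6475, role: "project lead", experience: 7, resources_allocated: 5 },
  { emp_no: 6476, role: "project lead", experience: 2, resources_allocated: 3 },
  { emp_no: 6477, role: "developer", experience: 1, resources_allocated: 2 },
  { emp_no: 6478, role: "developer", experience: 2, resources_allocated: 1 },
];

const sumResources = (cur, result) => {
  result.total_resources += cur.resources_allocated;
};

const byRole = groupSketch(employees, {
  key: { role: 1 },
  reduce: sumResources,
  initial: { total_resources: 0 },
});

const juniorByRole = groupSketch(employees, {
  key: { role: 1 },
  cond: (doc) => doc.experience < 3,
  reduce: sumResources,
  initial: { total_resources: 0 },
});

console.log(byRole, juniorByRole);
```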
Aggregation framework
Used to calculate aggregated values without map-reduce, including on a sharded cluster
Provides similar functionality to GROUP BY and related SQL operators
Provides simple forms of self joins
Has projection capabilities that reshape the result
Framework Components
Pipeline operators are the stages that the aggregation framework provides. These stages can be chained. The pipeline operators are $project, $match, $limit, $skip, $unwind, $group, $sort
Expressions are the operators that calculate values when the pipeline stages are performed. The available expressions are classified into Boolean, comparison, arithmetic, string, date and conditional operators.
Aggregation framework (Contd.)

$project: helps to select particular fields.
db.employee_details.aggregate(
{ $project : {
role : 1,
experience : 1
}}
);
This retrieves the role, experience and _id of all the documents.

$match: used to filter documents using a selection criterion
db.employee_details.aggregate(
{ $match : { role : "project lead" } }
);
This returns those documents whose role is "project lead"

$limit: used to limit the number of documents displayed
db.employee_details.aggregate(
{ $limit : 5 }
);
This displays only 5 documents from the collection.

$skip: skips the specified number of documents from the result set
db.employee_details.aggregate(
{ $skip : 5 }
);
This skips the first 5 documents in the result set.

$unwind: if an array field holds n values and $unwind is applied to that field, it creates n copies of the document, each copy having one value from the array
db.employee_details.aggregate(
{ $project : {
emp_no : 1,
emp_name : 1,
specialization : 1
}},
{ $unwind : "$specialization" }
);
In the above example, it is assumed that specialization is an array field. If a document has 2 specializations, the output will have two entries with the same emp_no and emp_name that differ only in the specialization.
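The copying behaviour of $unwind can be sketched with flatMap on an in-memory array. The helper unwindSketch and the sample employee are assumptions made for this illustration.

```javascript
// Hypothetical helper: emit one copy of each document per value in the array field.
function unwindSketch(docs, field) {
  return docs.flatMap((doc) =>
    doc[field].map((value) => ({ ...doc, [field]: value }))
  );
}

// Made-up sample document with two specializations.
const projected = [
  { emp_no: 6475, emp_name: "amit", specialization: ["Java", "MongoDB"] },
];

const unwound = unwindSketch(projected, "specialization");
console.log(unwound);
```

Each output entry carries the same emp_no and emp_name and a single specialization value instead of the array.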

$group: performs a grouping operation
db.employee_details.aggregate(
{ $group : {
_id : "$role",
tot_no_of_emp_in_this_role : { $sum : 1 },
tot_resources : { $sum : "$resources_allocated" }
}}
);
This groups the employees based on role (given as the value of _id). It also displays the total number of employees under each role, as it adds 1 to the group's count each time it encounters a matching document, and the total resources allocated to that role, as it adds up the individual resources of that group's employees.

$group uses any of the following aggregate functions to compute the composite value:
$addToSet, $first, $last, $max, $min, $avg, $push, $sum
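The two $sum accumulators in the $group example can be pictured as a loop that keeps one running result per role. This is a plain JavaScript sketch with made-up employee records; it reuses the output field names of the example above.

```javascript
// Made-up sample records.
const staffRecords = [
  { role: "project lead", resources_allocated: 5 },
  { role: "developer", resources_allocated: 2 },
  { role: "project lead", resources_allocated: 3 },
];

// One accumulator object per distinct role, keyed by the role value.
const groupsByRole = {};
for (const emp of staffRecords) {
  if (!groupsByRole[emp.role]) {
    groupsByRole[emp.role] = {
      _id: emp.role,
      tot_no_of_emp_in_this_role: 0,
      tot_resources: 0,
    };
  }
  const group = groupsByRole[emp.role];
  group.tot_no_of_emp_in_this_role += 1; // { $sum : 1 } counts documents
  group.tot_resources += emp.resources_allocated; // { $sum : "$resources_allocated" } adds the field
}

const grouped = Object.values(groupsByRole);
console.log(grouped);
```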

$sort: sorts the result set
db.employee_details.aggregate(
{ $sort : { experience : 1 } }
);
This sorts the result set in ascending order of experience.
More Examples
SQL:
SELECT SUM(resources_allocated) AS total_resources
FROM employee_details
MongoDB:
db.employee_details.aggregate([
{ $group: { _id: null, total_resources: { $sum: "$resources_allocated" } } }
])
Description: sums the resources_allocated field from employee_details
Indexing
Indexing
Performance can be improved by proper use of indexes
Indexes increase the speed of read operations

An index can be created on any field using the following syntax
db.collection_name.ensureIndex({key: 1})
1 creates an ascending index and -1 a descending index
An index can be dropped with
db.collection_name.dropIndex({key: 1})
Indexes are updated automatically after every insert
Indexing
The ensureIndex() method can take an optional second parameter
A few of the values it can take:
{unique: true} : creates a unique index
{background: true} : the system does not wait for the index to be created; the index is built in the background
{sparse: true} : indexes only those documents that have the indexed field
{dropDups: true} : deletes those documents that have duplicate values for the indexed fields
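The unique and sparse options can be pictured with a toy index built as a Map from field value to documents. This is only a conceptual sketch and says nothing about how MongoDB stores indexes; buildIndexSketch and the sample documents are assumptions.

```javascript
// Hypothetical helper: build a lookup from field value to the matching documents.
function buildIndexSketch(docs, field, { unique = false, sparse = false } = {}) {
  const index = new Map();
  for (const doc of docs) {
    const hasField = field in doc;
    if (sparse && !hasField) continue; // sparse: skip documents without the field
    const value = hasField ? doc[field] : null; // otherwise a missing field is indexed as null
    if (unique && index.has(value)) {
      throw new Error("duplicate key: " + value); // unique: reject repeated values
    }
    if (!index.has(value)) index.set(value, []);
    index.get(value).push(doc);
  }
  return index;
}

// Made-up sample documents; the second one has no certification field.
const staff = [
  { emp_id: 1, certification: "AWS" },
  { emp_id: 2 },
  { emp_id: 3, certification: "AWS" },
];

const sparseIndex = buildIndexSketch(staff, "certification", { sparse: true });

let uniqueRejected = false;
try {
  buildIndexSketch(staff, "certification", { unique: true });
} catch (err) {
  uniqueRejected = true; // two documents share the value "AWS"
}

console.log(sparseIndex.size, uniqueRejected);
```

A lookup through such a map avoids scanning every document, which is the intuition behind indexes speeding up reads.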
Indexing
db.collection_name.getIndexes() will get all the Indexes created on the particular
collection
db.collection_name.reIndex() rebuilds all the indexes on the particular collection
db.collection_name.totalIndexSize() will give the total size in bytes of all the
indexes
Index Types
_id Index
_id index is a unique index on the _id field
MongoDB creates this index by default on all collections
Cannot delete the index on _id.
Secondary Indexes
All indexes in MongoDB are secondary indexes
Indexes can be created on any field within any document or sub-document
These can be indexes on sub-documents, indexes on embedded fields, or compound indexes
Backup and Restore
Backup and Restore

Backups
Backups of the databases can be created by running the mongodump application (present in the bin folder)
The syntax is
mongodump --out path_to_store_backup

It can also be customized to back up a particular database or collection
mongodump --out path_to_store_backup --db database_name --collection collection_name
To restore a backup, the mongorestore application has to be run
mongorestore --collection collection_name --db database_name path_to_the_backup\collection_name.bson
Backup and Restore (Contd.)

MongoDB export
To export a collection from the server to a JSON or CSV file on the local machine, the mongoexport application can be used
mongoexport --db database_name --collection collection_name --out path_to_json\file_name.json
mongoexport --db database_name --collection collection_name --csv --out path_to_json\file_name.csv --fields field_name1,field_name2
Backup and Restore (Contd.)

MongoDB import
To import data from a JSON or CSV file into a collection, the mongoimport application can be used
mongoimport --db dest_database_name --collection dest_collection_name path_to_input_json
mongoimport --type csv --db dest_database_name --collection dest_collection_name path_to_input_csv --fields new_field_name1,new_field_name2
Summary
Querying Mongo
Aggregation
Indexing
Backup and Restore
Import and Export Data
Mongo tools
Thank You