MongoDB: The Definitive Guide
by Kristina Chodorow and Michael Dirolf
Copyright © 2010 Kristina Chodorow and Michael Dirolf. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions
are also available for most titles (http://my.safaribooksonline.com). For more information, contact our
corporate/institutional sales department: (800) 998-9938 or corporate@oreilly.com.
Editor: Julie Steele
Production Editor: Teresa Elsey
Copyeditor: Kim Wimpsett
Proofreader: Apostrophe Editing Services
Production Services: Molly Sharp
Indexer: Ellen Troutman Zaig
Cover Designer: Karen Montgomery
Interior Designer: David Futato
Illustrator: Robert Romano
Printing History:
September 2010: First Edition.
Nutshell Handbook, the Nutshell Handbook logo, and the O'Reilly logo are registered trademarks of
O'Reilly Media, Inc. MongoDB: The Definitive Guide, the image of a mongoose lemur, and related trade
dress are trademarks of O'Reilly Media, Inc.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as
trademarks. Where those designations appear in this book, and O'Reilly Media, Inc., was aware of a
trademark claim, the designations have been printed in caps or initial caps.
While every precaution has been taken in the preparation of this book, the publisher and authors assume
no responsibility for errors or omissions, or for damages resulting from the use of the information
contained herein.
ISBN: 978-1-449-38156-1
M
1283534198
Table of Contents
Foreword . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
A Rich Data Model 1
Easy Scaling 2
Tons of Features... 2
...Without Sacrificing Speed 3
Simple Administration 3
But Wait, That's Not All... 4
2. Getting Started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Documents 5
Collections 7
Schema-Free 7
Naming 8
Databases 8
Getting and Starting MongoDB 10
MongoDB Shell 11
Running the Shell 11
A MongoDB Client 12
Basic Operations with the Shell 12
Tips for Using the Shell 14
Data Types 15
Basic Data Types 16
Numbers 18
Dates 19
Arrays 19
Embedded Documents 20
_id and ObjectIds 20
3. Creating, Updating, and Deleting Documents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Inserting and Saving Documents 23
Batch Insert 23
Inserts: Internals and Implications 24
Removing Documents 25
Remove Speed 25
Updating Documents 26
Document Replacement 26
Using Modifiers 27
Upserts 36
Updating Multiple Documents 38
Returning Updated Documents 39
The Fastest Write This Side of Mississippi 41
Safe Operations 42
Catching Normal Errors 43
Requests and Connections 43
4. Querying . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Introduction to find 45
Specifying Which Keys to Return 46
Limitations 47
Query Criteria 47
Query Conditionals 47
OR Queries 48
$not 49
Rules for Conditionals 49
Type-Specific Queries 49
null 49
Regular Expressions 50
Querying Arrays 51
Querying on Embedded Documents 53
$where Queries 55
Cursors 56
Limits, Skips, and Sorts 57
Avoiding Large Skips 58
Advanced Query Options 60
Getting Consistent Results 61
Cursor Internals 63
5. Indexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Introduction to Indexing 65
Scaling Indexes 68
Indexing Keys in Embedded Documents 68
Indexing for Sorts 69
Uniquely Identifying Indexes 69
Unique Indexes 69
Dropping Duplicates 70
Compound Unique Indexes 70
Using explain and hint 70
Index Administration 75
Changing Indexes 76
Geospatial Indexing 77
Compound Geospatial Indexes 78
The Earth Is Not a 2D Plane 79
6. Aggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
count 81
distinct 81
group 82
Using a Finalizer 84
Using a Function as a Key 86
MapReduce 86
Example 1: Finding All Keys in a Collection 87
Example 2: Categorizing Web Pages 89
MongoDB and MapReduce 90
7. Advanced Topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
Database Commands 93
How Commands Work 94
Command Reference 95
Capped Collections 97
Properties and Use Cases 98
Creating Capped Collections 99
Sorting Au Naturel 99
Tailable Cursors 101
GridFS: Storing Files 101
Getting Started with GridFS: mongofiles 102
Working with GridFS from the MongoDB Drivers 102
Under the Hood 103
Server-Side Scripting 104
db.eval 104
Stored JavaScript 105
Security 106
Database References 107
What Is a DBRef? 107
Example Schema 107
Driver Support for DBRefs 108
When Should DBRefs Be Used? 108
8. Administration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Starting and Stopping MongoDB 111
Starting from the Command Line 112
File-Based Configuration 113
Stopping MongoDB 114
Monitoring 114
Using the Admin Interface 115
serverStatus 116
mongostat 118
Third-Party Plug-Ins 118
Security and Authentication 118
Authentication Basics 118
How Authentication Works 120
Other Security Considerations 121
Backup and Repair 121
Data File Backup 121
mongodump and mongorestore 122
fsync and Lock 123
Slave Backups 124
Repair 124
9. Replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Master-Slave Replication 127
Options 128
Adding and Removing Sources 129
Replica Sets 130
Initializing a Set 132
Nodes in a Replica Set 133
Failover and Primary Election 135
Performing Operations on a Slave 136
Read Scaling 137
Using Slaves for Data Processing 137
How It Works 138
The Oplog 138
Syncing 139
Replication State and the Local Database 139
Blocking for Replication 140
Administration 141
Diagnostics 141
Changing the Oplog Size 141
Replication with Authentication 142
10. Sharding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Introduction to Sharding 143
Autosharding in MongoDB 143
When to Shard 145
The Key to Sharding: Shard Keys 145
Sharding an Existing Collection 145
Incrementing Shard Keys Versus Random Shard Keys 146
How Shard Keys Affect Operations 146
Setting Up Sharding 147
Starting the Servers 147
Sharding Data 148
Production Configuration 149
A Robust Config 149
Many mongos 149
A Sturdy Shard 150
Physical Servers 150
Sharding Administration 150
config Collections 150
Sharding Commands 152
11. Example Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
Chemical Search Engine: Java 155
Installing the Java Driver 155
Using the Java Driver 155
Schema Design 156
Writing This in Java 158
Issues 159
News Aggregator: PHP 159
Installing the PHP Driver 160
Using the PHP Driver 161
Designing the News Aggregator 162
Trees of Comments 162
Voting 164
Custom Submission Forms: Ruby 164
Installing the Ruby Driver 164
Using the Ruby Driver 165
Custom Form Submission 166
Ruby Object Mappers and Using MongoDB with Rails 167
Real-Time Analytics: Python 168
Installing PyMongo 168
Using PyMongo 168
MongoDB for Real-Time Analytics 169
Schema 169
Handling a Request 170
Using Analytics Data 170
Other Considerations 171
A. Installing MongoDB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
B. mongo: The Shell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
C. MongoDB Internals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
CHAPTER 2
Getting Started
MongoDB is very powerful, but it is still easy to get started with. In this chapter we'll
introduce some of the basic concepts of MongoDB:
• A document is the basic unit of data for MongoDB, roughly equivalent to a row in
  a relational database management system (but much more expressive).
• Similarly, a collection can be thought of as the schema-free equivalent of a table.
• A single instance of MongoDB can host multiple independent databases, each of
  which can have its own collections and permissions.
• MongoDB comes with a simple but powerful JavaScript shell, which is useful for
  the administration of MongoDB instances and data manipulation.
• Every document has a special key, "_id", that is unique across the document's
  collection.
Documents
At the heart of MongoDB is the concept of a document: an ordered set of keys with
associated values. The representation of a document differs by programming language,
but most languages have a data structure that is a natural fit, such as a map, hash, or
dictionary. In JavaScript, for example, documents are represented as objects:
{"greeting" : "Hello, world!"}
This simple document contains a single key, "greeting", with a value of "Hello,
world!". Most documents will be more complex than this simple one and often will
contain multiple key/value pairs:
{"greeting" : "Hello, world!", "foo" : 3}
This example is a good illustration of several important concepts:
• Key/value pairs in documents are ordered; the earlier document is distinct from
  the following document:
{"foo" : 3, "greeting" : "Hello, world!"}
In most cases the ordering of keys in documents is not important.
In fact, in some programming languages the default representation
of a document does not even maintain ordering (e.g., dictionaries
in Python and hashes in Perl or Ruby 1.8). Drivers for those
languages usually have some mechanism for specifying documents
with ordering for the rare cases when it is necessary. (Those cases
will be noted throughout the text.)
• Values in documents are not just "blobs." They can be one of several different data
  types (or even an entire embedded document; see "Embedded Documents" on
  page 20). In this example the value for "greeting" is a string, whereas
  the value for "foo" is an integer.
The keys in a document are strings. Any UTF-8 character is allowed in a key, with a
few notable exceptions:
• Keys must not contain the character \0 (the null character). This character is used
  to signify the end of a key.
• The . and $ characters have some special properties and should be used only in
  certain circumstances, as described in later chapters. In general, they should be
  considered reserved, and drivers will complain if they are used inappropriately.
• Keys starting with _ should be considered reserved, although this is not strictly
  enforced.
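The key rules above can be expressed as a small check. The sketch below is illustrative only: keyWarnings is a hypothetical helper, not part of MongoDB or any driver, and real drivers may apply these rules differently.

```javascript
// Hypothetical helper (not a MongoDB API): lists the rules from the
// text that a proposed key name runs into.
function keyWarnings(key) {
    var warnings = [];
    if (key.indexOf("\0") !== -1) {
        warnings.push("contains the null character");
    }
    if (key.indexOf(".") !== -1 || key.indexOf("$") !== -1) {
        warnings.push("contains a reserved character (. or $)");
    }
    if (key.charAt(0) === "_") {
        warnings.push("starts with _, which should be considered reserved");
    }
    return warnings;
}

var plainKey = keyWarnings("greeting");   // no warnings
var dottedKey = keyWarnings("a.b");       // reserved-character warning
var idKey = keyWarnings("_id");           // leading-underscore warning
```

Note that "_id" is flagged here on purpose: it is exactly the kind of key the leading-underscore convention sets aside for MongoDB's own use.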
MongoDB is type-sensitive and case-sensitive. For example, these documents are
distinct:
{"foo" : 3}
{"foo" : "3"}
As are these:
{"foo" : 3}
{"Foo" : 3}
A final important thing to note is that documents in MongoDB cannot contain duplicate
keys. For example, the following is not a legal document:
{"greeting" : "Hello, world!", "greeting" : "Hello, MongoDB!"}
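The distinctions above can be mirrored with plain JavaScript objects. This is only an analogy, assuming a JavaScript engine that preserves the insertion order of string keys when serializing (as current engines do); sameDocument is a hypothetical helper and does not show how MongoDB itself compares documents.

```javascript
// Serialize two objects and compare the text; used here as a stand-in
// for "are these the same document?".
function sameDocument(a, b) {
    return JSON.stringify(a) === JSON.stringify(b);
}

var d1 = {"greeting" : "Hello, world!", "foo" : 3};
var d2 = {"foo" : 3, "greeting" : "Hello, world!"};

var orderMatters = !sameDocument(d1, d2);                     // key order differs
var typeMatters = !sameDocument({"foo" : 3}, {"foo" : "3"});  // number vs. string
var caseMatters = !sameDocument({"foo" : 3}, {"Foo" : 3});    // "foo" vs. "Foo"
```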
Collections
A collection is a group of documents. If a document is the MongoDB analog of a row
in a relational database, then a collection can be thought of as the analog to a table.
Schema-Free
Collections are schema-free. This means that the documents within a single collection
can have any number of different "shapes." For example, both of the following
documents could be stored in a single collection:
{"greeting" : "Hello, world!"}
{"foo" : 5}
Note that the previous documents not only have different types for their values (string
versus integer) but also have entirely different keys. Because any document can be put
into any collection, the question often arises: "Why do we need separate collections at
all?" It's a good question: with no need for separate schemas for different kinds of
documents, why should we use more than one collection? There are several good
reasons:
• Keeping different kinds of documents in the same collection can be a nightmare
  for developers and admins. Developers need to make sure that each query is only
  returning documents of a certain kind or that the application code performing a
  query can handle documents of different shapes. If we're querying for blog posts,
  it's a hassle to weed out documents containing author data.
• It is much faster to get a list of collections than to extract a list of the types in a
  collection. For example, if we had a type key in the collection that said whether
  each document was a "skim," "whole," or "chunky monkey" document, it would
  be much slower to find those three values in a single collection than to have three
  separate collections and query for their names (see "Subcollections"
  on page 8).
• Grouping documents of the same kind together in the same collection allows for
  data locality. Getting several blog posts from a collection containing only posts will
  likely require fewer disk seeks than getting the same posts from a collection
  containing posts and author data.
• We begin to impose some structure on our documents when we create indexes.
  (This is especially true in the case of unique indexes.) These indexes are defined
  per collection. By putting only documents of a single type into the same collection,
  we can index our collections more efficiently.
As you can see, there are sound reasons for creating a schema and for grouping related
types of documents together. MongoDB just relaxes this requirement and allows
developers more flexibility.
Naming
A collection is identified by its name. Collection names can be any UTF-8 string, with
a few restrictions:
• The empty string ("") is not a valid collection name.
• Collection names may not contain the character \0 (the null character) because
  this delineates the end of a collection name.
• You should not create any collections that start with system., a prefix reserved for
  system collections. For example, the system.users collection contains the database's
  users, and the system.namespaces collection contains information about all of the
  database's collections.
• User-created collections should not contain the reserved character $ in the name.
  The various drivers available for the database do support using $ in collection
  names because some system-generated collections contain it. You should not use
  $ in a name unless you are accessing one of these collections.
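As a rough illustration of the restrictions above, the following sketch checks a name for a user-created collection. isSafeCollectionName is a hypothetical helper written for this example; it is not a function provided by MongoDB or its drivers.

```javascript
// Hypothetical helper (not a MongoDB API): returns true when a name
// follows the naming guidance for user-created collections.
function isSafeCollectionName(name) {
    if (name === "") return false;                     // empty string
    if (name.indexOf("\0") !== -1) return false;       // null character
    if (name.indexOf("system.") === 0) return false;   // reserved prefix
    if (name.indexOf("$") !== -1) return false;        // reserved character
    return true;
}

var okName = isSafeCollectionName("blog.posts");      // true
var sysName = isSafeCollectionName("system.users");   // false
```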
Subcollections
One convention for organizing collections is to use namespaced subcollections
separated by the . character. For example, an application containing a blog might have a
collection named blog.posts and a separate collection named blog.authors. This is for
organizational purposes only; there is no relationship between the blog collection (it
doesn't even have to exist) and its "children."
Although subcollections do not have any special properties, they are useful and
incorporated into many MongoDB tools:
• GridFS, a protocol for storing large files, uses subcollections to store file metadata
  separately from content chunks (see Chapter 7 for more information about
  GridFS).
• The MongoDB web console organizes the data in its DBTOP section by
  subcollection (see Chapter 8 for more information on administration).
• Most drivers provide some syntactic sugar for accessing a subcollection of a given
  collection. For example, in the database shell, db.blog will give you the blog
  collection, and db.blog.posts will give you the blog.posts collection.
Subcollections are a great way to organize data in MongoDB, and their use is highly
recommended.
Databases
In addition to grouping documents by collection, MongoDB groups collections into
databases. A single instance of MongoDB can host several databases, each of which can
be thought of as completely independent. A database has its own permissions, and each
database is stored in separate files on disk. A good rule of thumb is to store all data for
a single application in the same database. Separate databases are useful when storing
data for several applications or users on the same MongoDB server.
Like collections, databases are identified by name. Database names can be any UTF-8
string, with the following restrictions:
• The empty string ("") is not a valid database name.
• A database name cannot contain any of these characters: ' ' (a single space), ., $, /,
  \, or \0 (the null character).
• Database names should be all lowercase.
• Database names are limited to a maximum of 64 bytes.
One thing to remember about database names is that they will actually end up as files
on your filesystem. This explains why many of the previous restrictions exist in the first
place.
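The database-name restrictions can be sketched as a single predicate. This is an illustration under stated assumptions: isValidDatabaseName is a hypothetical helper, the lowercase guideline is treated here as a hard rule, and TextEncoder (available in current Node.js and browsers) is assumed for counting UTF-8 bytes.

```javascript
// Hypothetical helper (not a MongoDB API): applies the database-name
// restrictions listed in the text.
function isValidDatabaseName(name) {
    if (name === "") return false;                        // empty string
    if (/[ .$\/\\\0]/.test(name)) return false;           // forbidden characters
    if (name !== name.toLowerCase()) return false;        // guideline: all lowercase
    return new TextEncoder().encode(name).length <= 64;   // at most 64 bytes
}

var goodDb = isValidDatabaseName("cms");       // true
var spacedDb = isValidDatabaseName("my db");   // false: contains a space
var mixedDb = isValidDatabaseName("Blog");     // false: not all lowercase
```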
There are also several reserved database names, which you can access directly but have
special semantics. These are as follows:
admin
This is the "root" database, in terms of authentication. If a user is added to the
admin database, the user automatically inherits permissions for all databases.
There are also certain server-wide commands that can be run only from the
admin database, such as listing all of the databases or shutting down the server.
local
This database will never be replicated and can be used to store any collections that
should be local to a single server (see Chapter 9 for more information about
replication and the local database).
config
When Mongo is being used in a sharded setup (see Chapter 10), the config database
is used internally to store information about the shards.
By prepending a collection's name with its containing database, you can get a fully
qualified collection name called a namespace. For instance, if you are using the
blog.posts collection in the cms database, the namespace of that collection would be
cms.blog.posts. Namespaces are limited to 121 bytes in length and, in practice, should
be less than 100 bytes long. For more on namespaces and the internal representation
of collections in MongoDB, see Appendix C.
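Building a namespace is plain string concatenation, as the short sketch below shows. The helper names namespaceOf and byteLength are hypothetical, and TextEncoder is assumed to be available for measuring the length in UTF-8 bytes.

```javascript
// Hypothetical helper: joins a database name and a collection name
// into a namespace, as described in the text.
function namespaceOf(dbName, collectionName) {
    return dbName + "." + collectionName;
}

// Length in UTF-8 bytes; TextEncoder is assumed to be available.
function byteLength(s) {
    return new TextEncoder().encode(s).length;
}

var ns = namespaceOf("cms", "blog.posts");   // "cms.blog.posts"
var withinLimit = byteLength(ns) <= 121;     // limit quoted in the text
```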
Getting and Starting MongoDB
MongoDB is almost always run as a network server that clients can connect to and
perform operations on. To start the server, run the mongod executable:
$ ./mongod
./mongod --help for help and startup options
Sun Mar 28 12:31:20 Mongo DB : starting : pid = 44978 port = 27017
dbpath = /data/db/ master = 0 slave = 0 64-bit
Sun Mar 28 12:31:20 db version v1.5.0-pre-, pdfile version 4.5
Sun Mar 28 12:31:20 git version: ...
Sun Mar 28 12:31:20 sys info: ...
Sun Mar 28 12:31:20 waiting for connections on port 27017
Sun Mar 28 12:31:20 web admin interface listening on port 28017
Or if you're on Windows, run this:
$ mongod.exe
For detailed information on installing MongoDB on your system, see
Appendix A.
When run with no arguments, mongod will use the default data directory, /data/db/ (or
C:\data\db\ on Windows), and port 27017. If the data directory does not already exist
or is not writable, the server will fail to start. It is important to create the data directory
(e.g., mkdir -p /data/db/), and to make sure your user has permission to write to the
directory, before starting MongoDB. The server will also fail to start if the port is not
available; this is often caused by another instance of MongoDB that is already running.
The server will print some version and system information and then begin waiting for
connections. By default, MongoDB listens for socket connections on port 27017.
mongod also sets up a very basic HTTP server that listens on a port 1,000 higher than
the main port, in this case 28017. This means that you can get some administrative
information about your database by opening a web browser and going to
http://localhost:28017.
You can safely stop mongod by typing Ctrl-C in the shell that is running the server.
For more information on starting or stopping MongoDB, see "Starting
and Stopping MongoDB" on page 111, and for more on the administrative
interface, see "Using the Admin Interface" on page 115.
MongoDB Shell
MongoDB comes with a JavaScript shell that allows interaction with a MongoDB
instance from the command line. The shell is very useful for performing administrative
functions, inspecting a running instance, or just playing around. The mongo shell is a
crucial tool for using MongoDB and is used extensively throughout the rest of the text.
Running the Shell
To start the shell, run the mongo executable:
$ ./mongo
MongoDB shell version: 1.6.0
url: test
connecting to: test
type "help" for help
>
The shell automatically attempts to connect to a MongoDB server on startup, so make
sure you start mongod before starting the shell.
The shell is a full-featured JavaScript interpreter, capable of running arbitrary JavaScript
programs. To illustrate this, let's perform some basic math:
> x = 200
200
> x / 5;
40
We can also leverage all of the standard JavaScript libraries:
> Math.sin(Math.PI / 2);
1
> new Date("2010/1/1");
"Fri Jan 01 2010 00:00:00 GMT-0500 (EST)"
> "Hello, World!".replace("World", "MongoDB");
Hello, MongoDB!
We can even define and call JavaScript functions:
> function factorial (n) {
... if (n <= 1) return 1;
... return n * factorial(n - 1);
... }
> factorial(5);
120
Note that you can create multiline commands. The shell will detect whether the
JavaScript statement is complete when you press Enter and, if it is not, will allow you to
continue writing it on the next line.
A MongoDB Client
Although the ability to execute arbitrary JavaScript is cool, the real power of the shell
lies in the fact that it is also a stand-alone MongoDB client. On startup, the shell
connects to the test database on a MongoDB server and assigns this database connection
to the global variable db. This variable is the primary access point to MongoDB through
the shell.
The shell contains some add-ons that are not valid JavaScript syntax but were
implemented because of their familiarity to users of SQL shells. The add-ons do not provide
any extra functionality, but they are nice syntactic sugar. For instance, one of the most
important operations is selecting which database to use:
> use foobar
switched to db foobar
Now if you look at the db variable, you can see that it refers to the foobar database:
> db
foobar
Because this is a JavaScript shell, typing a variable will convert the variable to a string
(in this case, the database name) and print it.
Collections can be accessed from the db variable. For example, db.baz returns the baz
collection in the current database. Now that we can access a collection in the shell, we
can perform almost any database operation.
Basic Operations with the Shell
We can use the four basic operations, create, read, update, and delete (CRUD), to
manipulate and view data in the shell.
Create
The insert function adds a document to a collection. For example, suppose we want
to store a blog post. First, we'll create a local variable called post that is a JavaScript
object representing our document. It will have the keys "title", "content", and
"date" (the date that it was published):
> post = {"title" : "My Blog Post",
... "content" : "Here's my blog post.",
... "date" : new Date()}
{
"title" : "My Blog Post",
"content" : "Here's my blog post.",
"date" : "Sat Dec 12 2009 11:23:21 GMT-0500 (EST)"
}
This object is a valid MongoDB document, so we can save it to the blog collection using
the insert method:
> db.blog.insert(post)
The blog post has been saved to the database. We can see it by calling find on the
collection:
> db.blog.find()
{
"_id" : ObjectId("4b23c3ca7525f35f94b60a2d"),
"title" : "My Blog Post",
"content" : "Here's my blog post.",
"date" : "Sat Dec 12 2009 11:23:21 GMT-0500 (EST)"
}
You can see that an "_id" key was added and that the other key/value pairs were saved
as we entered them. The reason for "_id"'s sudden appearance is explained at the end
of this chapter.
Read
find returns all of the documents in a collection. If we just want to see one document
from a collection, we can use findOne:
> db.blog.findOne()
{
"_id" : ObjectId("4b23c3ca7525f35f94b60a2d"),
"title" : "My Blog Post",
"content" : "Here's my blog post.",
"date" : "Sat Dec 12 2009 11:23:21 GMT-0500 (EST)"
}
find and findOne can also be passed criteria in the form of a query document. This will
restrict the documents matched by the query. The shell will automatically display up
to 20 documents matching a find, but more can be fetched. See Chapter 4 for more
information on querying.
Update
If we would like to modify our post, we can use update. update takes (at least) two
parameters: the first is the criteria to find which document to update, and the second
is the new document. Suppose we decide to enable comments on the blog post we
created earlier. We'll need to add an array of comments as the value for a new key in
our document.
The first step is to modify the variable post and add a "comments" key:
> post.comments = []
[ ]
Then we perform the update, replacing the post titled "My Blog Post" with our new
version of the document:
> db.blog.update({title : "My Blog Post"}, post)
Now the document has a "comments" key. If we call find again, we can see the new key:
> db.blog.find()
{
    "_id" : ObjectId("4b23c3ca7525f35f94b60a2d"),
    "title" : "My Blog Post",
    "content" : "Here's my blog post.",
    "date" : "Sat Dec 12 2009 11:23:21 GMT-0500 (EST)",
    "comments" : [ ]
}
Delete
remove deletes documents permanently from the database. Called with no parameters,
it removes all documents from a collection. It can also take a document specifying
criteria for removal. For example, this would remove the post we just created:
> db.blog.remove({title : "My Blog Post"})
Now the collection will be empty again.
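To make the semantics of the four operations concrete without a running server, here is a toy in-memory stand-in for a collection. It is a sketch only: ToyCollection, its matching rule (strict equality on top-level keys), and its counter-based "_id" values are assumptions made for illustration and say nothing about how MongoDB is implemented.

```javascript
// Toy in-memory model of a collection (hypothetical; not MongoDB code).
function ToyCollection() {
    this.docs = [];
    this.nextId = 1;
}

// A document matches when every key in the criteria has an equal value.
ToyCollection.prototype._matches = function (doc, criteria) {
    for (var key in criteria) {
        if (doc[key] !== criteria[key]) return false;
    }
    return true;
};

// insert: store a copy of the document, adding an "_id" if it lacks one.
ToyCollection.prototype.insert = function (doc) {
    var stored = {"_id" : doc._id !== undefined ? doc._id : this.nextId++};
    for (var key in doc) stored[key] = doc[key];
    this.docs.push(stored);
};

// find: return every document matching the (optional) criteria.
ToyCollection.prototype.find = function (criteria) {
    var self = this;
    return this.docs.filter(function (doc) {
        return self._matches(doc, criteria || {});
    });
};

// update: replace the first matching document, keeping its "_id".
ToyCollection.prototype.update = function (criteria, newDoc) {
    for (var i = 0; i < this.docs.length; i++) {
        if (this._matches(this.docs[i], criteria)) {
            var replaced = {"_id" : this.docs[i]._id};
            for (var key in newDoc) {
                if (key !== "_id") replaced[key] = newDoc[key];
            }
            this.docs[i] = replaced;
            return;
        }
    }
};

// remove: delete every document matching the (optional) criteria.
ToyCollection.prototype.remove = function (criteria) {
    var self = this;
    this.docs = this.docs.filter(function (doc) {
        return !self._matches(doc, criteria || {});
    });
};

var blog = new ToyCollection();
var post = {"title" : "My Blog Post", "content" : "Here's my blog post."};
blog.insert(post);
post.comments = [];
blog.update({"title" : "My Blog Post"}, post);
var afterUpdate = blog.find();      // one document, now with "comments"
blog.remove({"title" : "My Blog Post"});
var afterRemove = blog.find();      // empty again
```

The point of the model is the shape of each call: update swaps in a whole new document rather than patching individual keys, and remove with criteria deletes only the documents that match.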
Tips for Using the Shell
Because mongo is simply a JavaScript shell, you can get a great deal of help for it by
simply looking up JavaScript documentation online. The shell also includes built-in
help that can be accessed by typing help:
> help
HELP
show dbs show database names
show collections show collections in current database
show users show users in current database
show profile show recent system.profile entries w. time >= 1ms
use <db name> set current database to <db name>
db.help() help on DB methods
db.foo.help() help on collection methods
db.foo.find() list objects in collection foo
db.foo.find( { a : 1 } ) list objects in foo where a == 1
it result of the last line evaluated
Help for database-level commands is provided by db.help();, and help for
collection-level commands can be accessed with db.foo.help();.
A good way of figuring out what a function is doing is to type it without the parentheses.
This will print the JavaScript source code for the function. For example, if we are curious
about how the update function works or cannot remember the order of parameters, we
can do the following:
> db.foo.update
function (query, obj, upsert, multi) {
    assert(query, "need a query");
    assert(obj, "need an object");
    this._validateObject(obj);
    this._mongo.update(this._fullName, query, obj,
                       upsert ? true : false, multi ? true : false);
}
There is also an autogenerated API of all the JavaScript functions provided by the shell
at http://api.mongodb.org/js.
Inconvenient collection names
Fetching a collection with db.collectionName almost always works, unless the
collection name actually is a property of the database class. For instance, if we are trying to
access the version collection, we cannot say db.version because db.version is a database
function. (It returns the version of the running MongoDB server.)
> db.version
function () {
return this.serverBuildInfo().version;
}
db's collection-returning behavior is only a fallback for when JavaScript cannot find a
matching property. When there is a property with the same name as the desired
collection, we can use the getCollection function:
> db.getCollection("version");
test.version
This can also be handy for collections with invalid JavaScript in their names. For
example, foo-bar is a valid collection name, but it's variable subtraction in JavaScript.
You can get the foo-bar collection with db.getCollection("foo-bar").
In JavaScript, x.y is identical to x['y']. This means that subcollections can be accessed
using variables, not just literal names. That is, if you needed to perform some operation
on every blog subcollection, you could iterate through them with something like this:
var collections = ["posts", "comments", "authors"];
for (var i in collections) {
    doStuff(db.blog[collections[i]]);
}
Instead of this:
doStuff(db.blog.posts);
doStuff(db.blog.comments);
doStuff(db.blog.authors);
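Because x.y and x['y'] are interchangeable, the same lookup can be tried on any plain JavaScript object. The sketch below uses a stub object in place of db so that it runs without a server; fakeDb, names, and visited are hypothetical names used only for illustration.

```javascript
// Stub standing in for the shell's db object (hypothetical data).
var fakeDb = {
    "blog" : {"posts" : "blog.posts", "comments" : "blog.comments",
              "authors" : "blog.authors"},
    "foo-bar" : "foo-bar"
};

// Dot access and bracket access reach the same property.
var sameProperty = fakeDb.blog.posts === fakeDb["blog"]["posts"];

// Bracket access also handles names that are not valid identifiers;
// fakeDb.foo-bar would be parsed as the subtraction fakeDb.foo - bar.
var hyphenated = fakeDb["foo-bar"];

// Iterate over subcollection names held in variables.
var names = ["posts", "comments", "authors"];
var visited = [];
for (var i = 0; i < names.length; i++) {
    visited.push(fakeDb.blog[names[i]]);
}
```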
Data Types
The beginning of this chapter covered the basics of what a document is. Now that you
are up and running with MongoDB and can try things on the shell, this section will
dive a little deeper. MongoDB supports a wide range of data types as values in
documents. In this section, we'll outline all of the supported types.
Basic Data Types
Documents in MongoDB can be thought of as "JSON-like" in that they are conceptually
similar to objects in JavaScript. JSON is a simple representation of data: the
specification can be described in about one paragraph (http://www.json.org proves it) and lists
only six data types. This is a good thing in many ways: it's easy to understand, parse,
and remember. On the other hand, JSON's expressive capabilities are limited, because
the only types are null, boolean, numeric, string, array, and object.
Although these types allow for an impressive amount of expressivity, there are a couple
of additional types that are crucial for most applications, especially when working with
a database. For example, JSON has no date type, which makes working with dates even
more annoying than it usually is. There is a number type, but only one; there is no
way to differentiate floats and integers, never mind any distinction between 32-bit and
64-bit numbers. There is no way to represent other commonly used types, either, such
as regular expressions or functions.
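These limits are easy to observe with the standard JSON functions. The following sketch uses only built-in JavaScript (no MongoDB involved) to show what happens to a date, a regular expression, and a function when an object is serialized as JSON; the key names are arbitrary.

```javascript
// JSON has no date, regular expression, or function types, so
// JSON.stringify has to convert or drop such values.
var asJson = JSON.stringify({
    "when" : new Date(0),          // becomes an ISO 8601 string
    "pattern" : /foobar/i,         // becomes an empty object
    "fn" : function () {}          // dropped entirely
});

// Reading the text back yields a string, not a Date.
var roundTripped = JSON.parse(asJson);
var dateSurvived = roundTripped.when instanceof Date;   // false
```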
MongoDB adds support for a number of additional data types while keeping JSON's
essential key/value pair nature. Exactly how values of each type are represented varies
by language, but this is a list of the commonly supported types and how they are
represented as part of a document in the shell:
null
Null can be used to represent both a null value and a nonexistent field:
{"x" : null}
boolean
There is a boolean type, which will be used for the values 'true' and 'false':
{"x" : true}
32-bit integer
This cannot be represented on the shell. As mentioned earlier, JavaScript supports
only 64-bit floating point numbers, so 32-bit integers will be converted into those.
64-bit integer
Again, the shell cannot represent these. The shell will display them using a special
embedded document; see the section "Numbers" on page 18 for details.
64-bit floating point number
All numbers in the shell will be of this type. Thus, this will be a floating-point
number:
{"x" : 3.14}
As will this:
{"x" : 3}
string
Any stiing ol UTF-S chaiacteis can Le iepiesenteu using the stiing type:
{"x" : "foobar"}
symbol
This type is not supported by the shell. If the shell gets a symbol from the database, it will convert it into a string.
object id
An object id is a unique 12-byte ID for documents. See the section "_id and ObjectIds" on page 20 for details:
{"x" : ObjectId()}
date
Dates are stored as milliseconds since the epoch. The time zone is not stored:
{"x" : new Date()}
regular expression
Documents can contain regular expressions, using JavaScript's regular expression syntax:
{"x" : /foobar/i}
code
Documents can also contain JavaScript code:
{"x" : function() { /* ... */ }}
binary data
Binary data is a string of arbitrary bytes. It cannot be manipulated from the shell.
maximum value
BSON contains a special type representing the largest possible value. The shell does not have a type for this.
minimum value
BSON contains a special type representing the smallest possible value. The shell does not have a type for this.
undefined
Undefined can be used in documents as well (JavaScript has distinct types for null and undefined):
{"x" : undefined}
array
Sets or lists of values can be represented as arrays:
{"x" : ["a", "b", "c"]}
embedded document
Documents can contain entire documents, embedded as values in a parent document:
{"x" : {"foo" : "bar"}}
Numbers
JavaScript has one "number" type. Because MongoDB has three number types (4-byte integer, 8-byte integer, and 8-byte float), the shell has to hack around JavaScript's limitations a bit. By default, any number in the shell is treated as a double by MongoDB. This means that if you retrieve a 4-byte integer from the database, manipulate its document, and save it back to the database even without changing the integer, the integer will be resaved as a floating-point number. Thus, it is generally a good idea not to overwrite entire documents from the shell (see Chapter 3 for information on making changes to the values of individual keys).
Another problem with every number being represented by a double is that there are some 8-byte integers that cannot be accurately represented by 8-byte floats. Therefore, if you save an 8-byte integer and look at it in the shell, the shell will display it as an embedded document indicating that it might not be exact. For example, if we save a document with a "myInteger" key whose value is the 64-bit integer 3 and then look at it in the shell, it will look like this:
> doc = db.nums.findOne()
{
"_id" : ObjectId("4c0beecfd096a2580fe6fa08"),
"myInteger" : {
"floatApprox" : 3
}
}
The number is not changed in the database (unless you modify and resave the object from the shell, in which case it will turn into a float); the embedded document just indicates that the shell is displaying a floating-point approximation of an 8-byte integer. If this embedded document has only one key, it is, in fact, exact.
If you insert an 8-byte integer that cannot be accurately displayed as a double, the shell will add two keys, "top" and "bottom", containing the 32-bit integers representing the 4 high-order bytes and 4 low-order bytes of the integer, respectively. For instance, if we insert 9223372036854775807, the shell will show us the following:
> db.nums.findOne()
{
"_id" : ObjectId("4c0beecfd096a2580fe6fa09"),
"myInteger" : {
"floatApprox" : 9223372036854776000,
"top" : 2147483647,
"bottom" : 4294967295
}
}
The "floatApprox" embedded documents are special and can be manipulated as numbers as well as documents:
> doc.myInteger + 1
4
> doc.myInteger.floatApprox
3
All 4-byte integers can be represented exactly by an 8-byte floating-point number, so they are displayed normally.
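The precision limit described in this section can be reproduced in plain JavaScript. The following is a minimal sketch rather than the shell's actual implementation: splitInt64 is a hypothetical helper, and it assumes a runtime with BigInt support (such as a recent Node.js), which the shell described here does not have.

```javascript
// Hypothetical helper (not part of the MongoDB shell): describe a signed
// 64-bit integer the way the shell's embedded document does.
function splitInt64(value) {
  const unsigned = BigInt.asUintN(64, value);   // reinterpret as 64 raw bits
  return {
    floatApprox: Number(value),                 // nearest double, may be inexact
    top: Number(unsigned >> 32n),               // 4 high-order bytes
    bottom: Number(unsigned & 0xffffffffn),     // 4 low-order bytes
    exact: BigInt(Number(value)) === value      // true if the double lost nothing
  };
}

const small = splitInt64(3n);
const large = splitInt64(9223372036854775807n);

console.log(small.exact);        // true
console.log(large.exact);        // false
console.log(large.floatApprox);  // 9223372036854776000
console.log(large.top, large.bottom);
```

The values of top and bottom for the large integer correspond to the two extra keys the shell adds when a double cannot hold the integer exactly.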
Dates
In JavaScript, the Date object is used for MongoDB's date type. When creating a new Date object, always call new Date(...), not just Date(...). Calling the constructor as a function (that is, not including new) returns a string representation of the date, not an actual Date object. This is not MongoDB's choice; it is how JavaScript works. If you are not careful to always use the Date constructor, you can end up with a mishmash of strings and dates. Strings do not match dates, and vice versa, so this can cause problems with removing, updating, querying...pretty much everything.
For a full explanation of JavaScript's Date class and acceptable formats for the constructor, see ECMAScript specification section 15.9 (available for download at http://www.ecmascript.org).
Dates in the shell are displayed using local time zone settings. However, dates in the database are just stored as milliseconds since the epoch, so they have no time zone information associated with them. (Time zone information could, of course, be stored as the value for another key.)
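The difference between Date(...) and new Date(...) is easy to check in any JavaScript runtime. This short sketch uses only standard JavaScript; the variable names are ours.

```javascript
// Calling Date as a function returns a string; calling it with new returns a Date.
const asString = Date();
const asDate = new Date(0);            // 0 milliseconds since the epoch

console.log(typeof asString);          // "string"
console.log(asDate instanceof Date);   // true
console.log(asDate.getTime());         // 0

// Two Date objects for the same instant carry the same millisecond count,
// but a string is never strictly equal to a Date.
const sameInstant = new Date(asDate.getTime());
console.log(asDate.getTime() === sameInstant.getTime());  // true
console.log(String(asDate) === asDate);                   // false
```

Because only the millisecond count is stored, any time zone shown when a date is printed comes from the environment doing the printing.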
Arrays
Arrays are values that can be interchangeably used for both ordered operations (as though they were lists, stacks, or queues) and unordered operations (as though they were sets).
In the following document, the key "things" has an array value:
{"things" : ["pie", 3.14]}
As we can see from the example, arrays can contain different data types as values (in this case, a string and a floating-point number). In fact, array values can be any of the supported values for normal key/value pairs, even nested arrays.
One of the great things about arrays in documents is that MongoDB "understands" their structure and knows how to "reach inside" of arrays to perform operations on their contents. This allows us to query on arrays and build indexes using their contents. For instance, in the previous example, MongoDB can query for all documents where 3.14 is an element of the "things" array. If this is a common query, you can even create an index on the "things" key to improve the query's speed.
MongoDB also allows atomic updates that modify the contents of arrays, such as reaching into the array and changing the value "pie" to pi. We'll see more examples of these types of operations throughout the text.
Embedded Documents
Embedded documents are entire MongoDB documents that are used as the value for a key in another document. They can be used to organize data in a more natural way than just a flat structure.
For example, if we have a document representing a person and want to store his address, we can nest this information in an embedded "address" document:
{
"name" : "John Doe",
"address" : {
"street" : "123 Park Street",
"city" : "Anytown",
"state" : "NY"
}
}
The value for the "address" key in the previous example is another document with its own values for "street", "city", and "state".
As with arrays, MongoDB "understands" the structure of embedded documents and is able to "reach inside" of them to build indexes, perform queries, or make updates.
We'll discuss schema design in depth later, but even from this basic example, we can begin to see how embedded documents can change the way we work with data. In a relational database, the previous document would probably be modeled as two separate rows in two different tables (one for "people" and one for "addresses"). With MongoDB we can embed the address document directly within the person document. When used properly, embedded documents can provide a more natural (and often more efficient) representation of information.
The flip side of this is that we are basically denormalizing, so there can be more data repetition with MongoDB. Suppose "addresses" were a separate table in a relational database and we needed to fix a typo in an address. When we did a join with "people" and "addresses", we'd get the updated address for everyone who shares it. With MongoDB, we'd need to fix the typo in each person's document.
_id and ObjectIds
Every document stored in MongoDB must have an "_id" key. The "_id" key's value can be any type, but it defaults to an ObjectId. In a single collection, every document must have a unique value for "_id", which ensures that every document in a collection can be uniquely identified. That is, if you had two collections, each one could have a document where the value for "_id" was 123. However, neither collection could contain more than one document where "_id" was 123.
ObjectIds
ObjectId is the default type for "_id". It is designed to be lightweight, while still being easy to generate in a globally unique way across disparate machines. This is the main reason why MongoDB uses ObjectIds as opposed to something more traditional, like an autoincrementing primary key: it is difficult and time-consuming to synchronize autoincrementing primary keys across multiple servers. Because MongoDB was designed from the beginning to be a distributed database, dealing with many nodes is an important consideration. The ObjectId type, as we'll see, is easy to generate in a sharded environment.
ObjectIds use 12 bytes of storage, which gives them a string representation that is 24 hexadecimal digits: 2 digits for each byte. This causes them to appear larger than they are, which makes some people nervous. It's important to note that even though an ObjectId is often represented as a giant hexadecimal string, the string is actually twice as long as the data being stored.
If you create multiple new ObjectIds in rapid succession, you can see that only the last few digits change each time. In addition, a couple of digits in the middle of the ObjectId will change (if you space the creations out by a couple of seconds). This is because of the manner in which ObjectIds are created. The 12 bytes of an ObjectId are generated as follows:
0  1  2  3 | 4  5  6 | 7  8 | 9  10  11
Timestamp  | Machine | PID  | Increment
The first four bytes of an ObjectId are a timestamp in seconds since the epoch. This provides a couple of useful properties:
• The timestamp, when combined with the next five bytes (which will be described in a moment), provides uniqueness at the granularity of a second.
• Because the timestamp comes first, it means that ObjectIds will sort in roughly insertion order. This is not a strong guarantee but does have some nice properties, such as making ObjectIds efficient to index.
• In these four bytes exists an implicit timestamp of when each document was created. Most drivers expose a method for extracting this information from an ObjectId.
Because the current time is used in ObjectIds, some users worry that their servers will need to have synchronized clocks. This is not necessary because the actual value of the timestamp doesn't matter, only that it is often new (once per second) and increasing.
The next three bytes of an ObjectId are a unique identifier of the machine on which it was generated. This is usually a hash of the machine's hostname. By including these bytes, we guarantee that different machines will not generate colliding ObjectIds.
To provide uniqueness among different processes generating ObjectIds concurrently on a single machine, the next two bytes are taken from the process identifier (PID) of the ObjectId-generating process.
These first nine bytes of an ObjectId guarantee its uniqueness across machines and processes for a single second. The last three bytes are simply an incrementing counter that is responsible for uniqueness within a second in a single process. This allows for up to 256^3 (16,777,216) unique ObjectIds to be generated per process in a single second.
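The byte layout described in this section can be made concrete by splitting the 24-digit hexadecimal string into its four fields. This is a minimal sketch under the assumption that the ObjectId follows exactly the layout given here; parseObjectId is a hypothetical helper written for illustration, not a driver or shell function.

```javascript
// Hypothetical helper: split an ObjectId's hex string into the four fields
// described above (2 hex digits per byte).
function parseObjectId(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new Error("expected 24 hexadecimal digits");
  }
  return {
    timestamp: parseInt(hex.slice(0, 8), 16),    // bytes 0-3: seconds since the epoch
    machine: hex.slice(8, 14),                   // bytes 4-6: machine identifier
    pid: parseInt(hex.slice(14, 18), 16),        // bytes 7-8: process identifier
    increment: parseInt(hex.slice(18, 24), 16)   // bytes 9-11: counter
  };
}

// An ObjectId taken from the examples earlier in this chapter.
const parts = parseObjectId("4c0beecfd096a2580fe6fa08");
const created = new Date(parts.timestamp * 1000);  // implicit creation time

console.log(parts);
console.log(created.toISOString());
```

The timestamp is multiplied by 1000 because ObjectIds count seconds while JavaScript dates count milliseconds.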
Autogeneration of _id
As stated previously, if there is no "_id" key present when a document is inserted, one will be automatically added to the inserted document. This can be handled by the MongoDB server but will generally be done by the driver on the client side. There are a couple of reasons for that:
• Although ObjectIds are designed to be lightweight and easy to generate, there is still some overhead involved in their generation. The decision to generate them on the client side reflects an overall philosophy of MongoDB: work should be pushed out of the server and to the drivers whenever possible. This philosophy reflects the fact that, even with scalable databases like MongoDB, it is easier to scale out at the application layer than at the database layer. Moving work to the client side reduces the burden requiring the database to scale.
• By generating ObjectIds on the client side, drivers are capable of providing richer APIs than would be otherwise possible. For example, a driver might have its insert method either return the generated ObjectId or inject it directly into the document that was inserted. If the driver allowed the server to generate ObjectIds, then a separate query would be required to determine the value of "_id" for an inserted document.
CHAPTER 4
Querying
This chapter looks at querying in detail. The main areas covered are as follows:
• You can perform ad hoc queries on the database using the find or findOne functions and a query document.
• You can query for ranges, set inclusion, inequalities, and more by using $ conditionals.
• Some queries cannot be expressed as query documents, even using $ conditionals. For these types of complex queries, you can use a $where clause to harness the full expressive power of JavaScript.
• Queries return a database cursor, which lazily returns batches of documents as you need them.
• There are a lot of metaoperations you can perform on a cursor, including skipping a certain number of results, limiting the number of results returned, and sorting results.
Introduction to find
The find method is used to perform queries in MongoDB. Querying returns a subset of documents in a collection, from no documents at all to the entire collection. Which documents get returned is determined by the first argument to find, which is a document specifying the query to be performed.
An empty query document (i.e., {}) matches everything in the collection. If find isn't given a query document, it defaults to {}. For example, the following:
> db.c.find()
returns everything in the collection c.
When we start adding key/value pairs to the query document, we begin restricting our search. This works in a straightforward way for most types. Integers match integers, booleans match booleans, and strings match strings. Querying for a simple type is as easy as specifying the value that you are looking for. For example, to find all documents where the value for "age" is 27, we can add that key/value pair to the query document:
> db.users.find({"age" : 27})
If we have a string we want to match, such as a "username" key with the value "joe", we use that key/value pair instead:
> db.users.find({"username" : "joe"})
Multiple conditions can be strung together by adding more key/value pairs to the query document, which gets interpreted as "condition1 AND condition2 AND ... AND conditionN." For instance, to get all users who are 27-year-olds with the username "joe", we can query for the following:
> db.users.find({"username" : "joe", "age" : 27})
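To make the AND semantics concrete, here is a rough in-memory stand-in for exact-match query documents. It is only a sketch of the idea, not how the server evaluates queries: matchesAll is a hypothetical helper, and it handles only flat documents with simple values.

```javascript
// Hypothetical helper: a document matches when every key/value pair in the
// query is present in it (condition1 AND condition2 AND ...).
function matchesAll(doc, query) {
  return Object.keys(query).every(key => doc[key] === query[key]);
}

const users = [
  { username: "joe", age: 27 },
  { username: "joe", age: 45 },
  { username: "mary", age: 27 }
];

const everyone = users.filter(user => matchesAll(user, {}));
const joes = users.filter(user => matchesAll(user, { username: "joe" }));
const joesAt27 = users.filter(user => matchesAll(user, { username: "joe", age: 27 }));

console.log(everyone.length);  // 3
console.log(joes.length);      // 2
console.log(joesAt27.length);  // 1
```

Each key added to the query can only shrink the result set, which is why the empty query document matches everything.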
Specifying Which Keys to Return
Sometimes, you do not need all of the key/value pairs in a document returned. If this is the case, you can pass a second argument to find (or findOne) specifying the keys you want. This reduces both the amount of data sent over the wire and the time and memory used to decode documents on the client side.
For example, if you have a user collection and you are interested only in the "username" and "email" keys, you could return just those keys with the following query:
> db.users.find({}, {"username" : 1, "email" : 1})
{
"_id" : ObjectId("4ba0f0dfd22aa494fd523620"),
"username" : "joe",
"email" : "joe@example.com"
}
As you can see from the previous output, the "_id" key is returned by default, even if it isn't specifically listed.
You can also use this second parameter to exclude specific key/value pairs from the results of a query. For instance, you may have documents with a variety of keys, and the only thing you know is that you never want to return the "fatal_weakness" key:
> db.users.find({}, {"fatal_weakness" : 0})
This can even prevent "_id" from being returned:
> db.users.find({}, {"username" : 1, "_id" : 0})
{
"username" : "joe"
}
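The inclusion and exclusion rules above can be imitated for flat documents in a few lines of JavaScript. This is a simplified sketch, not the server's implementation: project is a hypothetical helper, the sample user document is invented for illustration, and nested keys are ignored.

```javascript
// Hypothetical helper imitating the second argument to find for flat documents.
function project(doc, fields) {
  const named = Object.keys(fields).filter(key => key !== "_id");
  const including = named.some(key => fields[key] === 1);  // inclusion or exclusion mode
  const result = {};
  for (const key of Object.keys(doc)) {
    if (key === "_id") {
      if (fields._id !== 0) result._id = doc._id;  // "_id" comes back unless excluded
    } else if (including ? fields[key] === 1 : fields[key] !== 0) {
      result[key] = doc[key];
    }
  }
  return result;
}

// Illustrative document (values are made up).
const user = { _id: 1, username: "joe", email: "joe@example.com", fatal_weakness: "kryptonite" };

const included = project(user, { username: 1, email: 1 });
const excluded = project(user, { fatal_weakness: 0 });
const withoutId = project(user, { username: 1, _id: 0 });

console.log(included);
console.log(excluded);
console.log(withoutId);
```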
Limitations
There are some restrictions on queries. The value of a query document must be a constant as far as the database is concerned. (It can be a normal variable in your own code.) That is, it cannot refer to the value of another key in the document. For example, if we were keeping inventory and we had both "in_stock" and "num_sold" keys, we could not compare their values by querying the following:
> db.stock.find({"in_stock" : "this.num_sold"}) // doesn't work
There are ways to do this (see "$where Queries" on page 55), but you will usually get better performance by restructuring your document slightly, such that a normal query will suffice. In this example, we could instead use the keys "initial_stock" and "in_stock". Then, every time someone buys an item, we decrement the value of the "in_stock" key by one. Finally, we can do a simple query to check which items are out of stock:
> db.stock.find({"in_stock" : 0})
Query Criteria
Queries can go beyond the exact matching described in the previous section; they can match more complex criteria, such as ranges, OR-clauses, and negation.
Query Conditionals
"$lt", "$lte", "$gt", and "$gte" are all comparison operators, corresponding to <, <=, >, and >=, respectively. They can be combined to look for a range of values. For example, to look for users who are between the ages of 18 and 30 inclusive, we can do this:
> db.users.find({"age" : {"$gte" : 18, "$lte" : 30}})
These types of range queries are often useful for dates. For example, to find people who registered before January 1, 2007, we can do this:
> start = new Date("01/01/2007")
> db.users.find({"registered" : {"$lt" : start}})
An exact match on a date is less useful, because dates are only stored with millisecond precision. Often you want a whole day, week, or month, making a range query necessary.
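The way several comparison conditionals combine on one key can be sketched in plain JavaScript. The comparisons table and the satisfies function below are hypothetical names used only for illustration; they mirror the operators' meaning for numbers and dates, not the server's code.

```javascript
// Each conditional maps to the JavaScript comparison it corresponds to.
const comparisons = {
  $lt:  (value, bound) => value < bound,
  $lte: (value, bound) => value <= bound,
  $gt:  (value, bound) => value > bound,
  $gte: (value, bound) => value >= bound
};

// All conditionals given for a key must hold at once.
function satisfies(value, conditions) {
  return Object.keys(conditions).every(op => comparisons[op](value, conditions[op]));
}

const ages = [17, 18, 25, 30, 31];
const inRange = ages.filter(age => satisfies(age, { $gte: 18, $lte: 30 }));
console.log(inRange);  // [ 18, 25, 30 ]

// Dates compare by their millisecond value, so ranges work the same way.
const start = new Date(2007, 0, 1);
console.log(satisfies(new Date(2006, 5, 15), { $lt: start }));  // true
console.log(satisfies(new Date(2008, 5, 15), { $lt: start }));  // false
```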
To query for documents where a key's value is not equal to a certain value, you must use another conditional operator, "$ne", which stands for "not equal." If you want to find all users who do not have the username "joe", you can query for them using this:
> db.users.find({"username" : {"$ne" : "joe"}})
"$ne" can be used with any type.
OR Queries
There are two ways to do an OR query in MongoDB. "$in" can be used to query for a variety of values for a single key. "$or" is more general; it can be used to query for any of the given values across multiple keys.
If you have more than one possible value to match for a single key, use an array of criteria with "$in". For instance, suppose we were running a raffle and the winning ticket numbers were 725, 542, and 390. To find all three of these documents, we can construct the following query:
> db.raffle.find({"ticket_no" : {"$in" : [725, 542, 390]}})
"$in" is very flexible and allows you to specify criteria of different types as well as values. For example, if we are gradually migrating our schema to use usernames instead of user ID numbers, we can query for either by using this:
> db.users.find({"user_id" : {"$in" : [12345, "joe"]}})
This matches documents with a "user_id" equal to 12345, and documents with a "user_id" equal to "joe".
If "$in" is given an array with a single value, it behaves the same as directly matching the value. For instance, {ticket_no : {$in : [725]}} matches the same documents as {ticket_no : 725}.
The opposite of "$in" is "$nin", which returns documents that don't match any of the criteria in the array. If we want to return all of the people who didn't win anything in the raffle, we can query for them with this:
> db.raffle.find({"ticket_no" : {"$nin" : [725, 542, 390]}})
This query returns everyone who did not have tickets with those numbers.
"$in" gives you an OR query for a single key, but what if we need to find documents where "ticket_no" is 725 or "winner" is true? For this type of query, we'll need to use the "$or" conditional. "$or" takes an array of possible criteria. In the raffle case, using "$or" would look like this:
> db.raffle.find({"$or" : [{"ticket_no" : 725}, {"winner" : true}]})
"$or" can contain other conditionals. If, for example, we want to match any of the three "ticket_no" values or the "winner" key, we can use this:
> db.raffle.find({"$or" : [{"ticket_no" : {"$in" : [725, 542, 390]}},
{"winner" : true}]})
With a normal AND-type query, you want to narrow your results down as far as possible in as few arguments as possible. OR-type queries are the opposite: they are most efficient if the first arguments match as many documents as possible.
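The three OR-style conditionals can be compared side by side with ordinary array operations. The following sketch uses an invented in-memory list of raffle documents; it illustrates what each conditional selects and says nothing about how the server executes the query.

```javascript
// Illustrative data; the documents are made up for this sketch.
const tickets = [
  { ticket_no: 725, winner: false },
  { ticket_no: 100, winner: true },
  { ticket_no: 390, winner: false },
  { ticket_no: 101, winner: false }
];
const winning = [725, 542, 390];

// "$in": the key's value equals any element of the array.
const inMatches = tickets.filter(t => winning.includes(t.ticket_no));

// "$nin": the key's value equals none of the elements.
const ninMatches = tickets.filter(t => !winning.includes(t.ticket_no));

// "$or": any one of several criteria, possibly on different keys.
const orMatches = tickets.filter(t => t.ticket_no === 725 || t.winner === true);

console.log(inMatches.map(t => t.ticket_no));   // [ 725, 390 ]
console.log(ninMatches.map(t => t.ticket_no));  // [ 100, 101 ]
console.log(orMatches.map(t => t.ticket_no));   // [ 725, 100 ]
```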
$not
"$not" is a metaconditional: it can be applied on top of any other criteria. As an example, let's consider the modulus operator, "$mod". "$mod" queries for keys whose values, when divided by the first value given, have a remainder of the second value:
> db.users.find({"id_num" : {"$mod" : [5, 1]}})
The previous query returns users with "id_num"s of 1, 6, 11, 16, and so on. If we want, instead, to return users with "id_num"s of 2, 3, 4, 5, 7, 8, 9, 10, 12, and so on, we can use "$not":
> db.users.find({"id_num" : {"$not" : {"$mod" : [5, 1]}}})
"$not" can be particularly useful in conjunction with regular expressions to find all documents that don't match a given pattern (regular expression usage is described in the section "Regular Expressions" on page 50).
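The arithmetic behind "$mod" and its negation is the ordinary remainder operator. This small sketch, with a hypothetical mod helper and made-up values, shows which numbers each form selects.

```javascript
// Hypothetical helper: true when value divided by divisor leaves the given remainder.
const mod = (value, [divisor, remainder]) => value % divisor === remainder;

const idNums = [1, 2, 5, 6, 10, 11, 12, 16];

const modMatches = idNums.filter(n => mod(n, [5, 1]));   // like {"$mod" : [5, 1]}
const notMatches = idNums.filter(n => !mod(n, [5, 1]));  // like {"$not" : {"$mod" : [5, 1]}}

console.log(modMatches);  // [ 1, 6, 11, 16 ]
console.log(notMatches);  // [ 2, 5, 10, 12 ]
```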
Rules for Conditionals
If you look at the update modifiers in the previous chapter and previous query documents, you'll notice that the $-prefixed keys are in different positions. In the query, "$lt" is in the inner document; in the update, "$inc" is the key for the outer document. This generally holds true: conditionals are an inner document key, and modifiers are always a key in the outer document.
Multiple conditions can be put on a single key. For example, to find all users between the ages of 20 and 30, we can query for both "$gt" and "$lt" on the "age" key:
> db.users.find({"age" : {"$lt" : 30, "$gt" : 20}})
Any number of conditionals can be used with a single key. Multiple update modifiers cannot be used on a single key, however. For example, you cannot have a modifier document such as {"$inc" : {"age" : 1}, "$set" : {age : 40}} because it modifies "age" twice. With query conditionals, no such rule applies.
Type-Specific Queries
As covered in Chapter 2, MongoDB has a wide variety of types that can be used in a document. Some of these behave specially in queries.
null
null behaves a bit strangely. It does match itself, so if we have a collection with the following documents:
> db.c.find()
{ "_id" : ObjectId("4ba0f0dfd22aa494fd523621"), "y" : null }
{ "_id" : ObjectId("4ba0f0dfd22aa494fd523622"), "y" : 1 }
{ "_id" : ObjectId("4ba0f148d22aa494fd523623"), "y" : 2 }
we can query for documents whose "y" key is null in the expected way:
> db.c.find({"y" : null})
{ "_id" : ObjectId("4ba0f0dfd22aa494fd523621"), "y" : null }
However, null not only matches itself but also matches "does not exist." Thus, querying for a key with the value null will return all documents lacking that key:
> db.c.find({"z" : null})
{ "_id" : ObjectId("4ba0f0dfd22aa494fd523621"), "y" : null }
{ "_id" : ObjectId("4ba0f0dfd22aa494fd523622"), "y" : 1 }
{ "_id" : ObjectId("4ba0f148d22aa494fd523623"), "y" : 2 }
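The two meanings of null can be separated in plain JavaScript with the in operator, which tests whether a key is present at all. The helper names below are hypothetical, and the sketch only models the matching rule described above for flat documents.

```javascript
const docs = [{ y: null }, { y: 1 }, { y: 2 }];

// Querying for null matches a stored null and also a missing key.
const matchesNull = (doc, key) => !(key in doc) || doc[key] === null;

// Requiring the key to be present leaves only documents that store null.
const matchesStoredNull = (doc, key) => key in doc && doc[key] === null;

console.log(docs.filter(d => matchesNull(d, "y")).length);        // 1
console.log(docs.filter(d => matchesNull(d, "z")).length);        // 3
console.log(docs.filter(d => matchesStoredNull(d, "z")).length);  // 0
```

The last line shows why an additional existence check is needed when only keys that actually hold null are wanted.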
If we only want to find keys whose value is null, we can check that the key is null and exists using the "$exists" conditional:
> db.c.find({"z" : {"$in" : [null], "$exists" : true}})
Unfortunately, there is no "$eq" operator, which makes this a little awkward, but "$in" with one element is equivalent.
Regular Expressions
Regular expressions are useful for flexible string matching. For example, if we want to find all users with the name Joe or joe, we can use a regular expression to do case-insensitive matching:
> db.users.find({"name" : /joe/i})
Regular expression flags (i) are allowed but not required. If we want to match not only various capitalizations of joe, but also joey, we can continue to improve our regular expression:
> db.users.find({"name" : /joey?/i})
MongoDB uses the Perl Compatible Regular Expression (PCRE) library to match regular expressions; any regular expression syntax allowed by PCRE is allowed in MongoDB. It is a good idea to check your syntax with the JavaScript shell before using it in a query to make sure it matches what you think it matches.
MongoDB can leverage an index for queries on prefix regular expressions (e.g., /^joey/), so queries of that kind can be fast.
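Checking a pattern before putting it in a query can be as simple as testing it against a few sample strings. The names below are invented for illustration, and the check uses JavaScript's own regular expression engine, which agrees with PCRE for simple patterns like these but is not identical to it in general.

```javascript
// Sample values to try the patterns against (made up for this sketch).
const names = ["joe", "Joe", "joey", "JOEY", "joseph"];

const flexible = /joey?/i;  // any capitalization of joe or joey
const prefix = /^joey/;     // case-sensitive prefix pattern

const flexibleMatches = names.filter(name => flexible.test(name));
const prefixMatches = names.filter(name => prefix.test(name));

console.log(flexibleMatches);  // [ 'joe', 'Joe', 'joey', 'JOEY' ]
console.log(prefixMatches);    // [ 'joey' ]
```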
Regular expressions can also match themselves. Very few people insert regular expressions into the database, but if you insert one, you can match it with itself:
> db.foo.insert({"bar" : /baz/})
> db.foo.find({"bar" : /baz/})
{
"_id" : ObjectId("4b23c3ca7525f35f94b60a2d"),
"bar" : /baz/
}
Querying Arrays
Querying for elements of an array is simple. An array can mostly be treated as though each element is the value of the overall key. For example, if the array is a list of fruits, like this:
> db.food.insert({"fruit" : ["apple", "banana", "peach"]})
the following query:
> db.food.find({"fruit" : "banana"})
will successfully match the document. We can query for it in much the same way as though we had a document that looked like the (illegal) document: {"fruit" : "apple", "fruit" : "banana", "fruit" : "peach"}.
$all
If you need to match arrays by more than one element, you can use "$all". This allows you to match a list of elements. For example, suppose we created a collection with three documents:
> db.food.insert({"_id" : 1, "fruit" : ["apple", "banana", "peach"]})
> db.food.insert({"_id" : 2, "fruit" : ["apple", "kumquat", "orange"]})
> db.food.insert({"_id" : 3, "fruit" : ["cherry", "banana", "apple"]})
Then we can find all documents with both "apple" and "banana" elements by querying with "$all":
> db.food.find({fruit : {$all : ["apple", "banana"]}})
{"_id" : 1, "fruit" : ["apple", "banana", "peach"]}
{"_id" : 3, "fruit" : ["cherry", "banana", "apple"]}
Order does not matter. Notice "banana" comes before "apple" in the second result. Using a one-element array with "$all" is equivalent to not using "$all". For instance, {fruit : {$all : ['apple']}} will match the same documents as {fruit : 'apple'}.
You can also query by exact match using the entire array. However, exact match will not match a document if any elements are missing or superfluous. For example, this will match the first document shown previously:
> db.food.find({"fruit" : ["apple", "banana", "peach"]})
But this will not:
> db.food.find({"fruit" : ["apple", "banana"]})
and neither will this:
> db.food.find({"fruit" : ["banana", "apple", "peach"]})
If you want to query for a specific element of an array, you can specify an index using the syntax key.index:
> db.food.find({"fruit.2" : "peach"})
Arrays are always 0-indexed, so this would match the third array element against the string "peach".
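The three ways of matching arrays described in this section (any listed elements, the exact array, and a single position) differ in how much of the array they inspect. The sketch below models them with hypothetical helpers over an in-memory copy of the example documents; it is an illustration of the rules, not the server's matching code.

```javascript
const food = [
  { _id: 1, fruit: ["apple", "banana", "peach"] },
  { _id: 2, fruit: ["apple", "kumquat", "orange"] },
  { _id: 3, fruit: ["cherry", "banana", "apple"] }
];

// "$all": every wanted element appears somewhere; order is irrelevant.
const hasAll = (array, wanted) => wanted.every(item => array.includes(item));

// Exact match: same elements in the same order, nothing missing or extra.
const isExact = (array, wanted) =>
  array.length === wanted.length && array.every((item, i) => item === wanted[i]);

const ids = docs => docs.map(doc => doc._id);

console.log(ids(food.filter(d => hasAll(d.fruit, ["apple", "banana"]))));            // [ 1, 3 ]
console.log(ids(food.filter(d => isExact(d.fruit, ["apple", "banana", "peach"]))));  // [ 1 ]
console.log(ids(food.filter(d => isExact(d.fruit, ["banana", "apple", "peach"]))));  // []
console.log(ids(food.filter(d => d.fruit[2] === "peach")));                          // [ 1 ]
```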
$size
A useful conditional for querying arrays is "$size", which allows you to query for arrays of a given size. Here's an example:
> db.food.find({"fruit" : {"$size" : 3}})
One common query is to get a range of sizes. "$size" cannot be combined with another $ conditional (in this example, "$gt"), but this query can be accomplished by adding a "size" key to the document. Then, every time you add an element to the array, increment the value of "size". If the original update looked like this:
> db.food.update(criteria, {"$push" : {"fruit" : "strawberry"}})
it can simply be changed to this:
> db.food.update(criteria, {"$push" : {"fruit" : "strawberry"}, "$inc" : {"size" : 1}})
Incrementing is extremely fast, so any performance penalty is negligible. Storing documents like this allows you to do queries such as this:
> db.food.find({"size" : {"$gt" : 3}})
Unfortunately, this technique doesn't work as well with the "$addToSet" operator.
The $slice operator
As mentioned earlier in this chapter, the optional second argument to find specifies the keys to be returned. The special "$slice" operator can be used to return a subset of elements for an array key.
For example, suppose we had a blog post document and we wanted to return the first 10 comments:
> db.blog.posts.findOne(criteria, {"comments" : {"$slice" : 10}})
Alternatively, if we wanted the last 10 comments, we could use -10:
> db.blog.posts.findOne(criteria, {"comments" : {"$slice" : -10}})
"$slice" can also return pages in the middle of the results by taking an offset and the number of elements to return:
> db.blog.posts.findOne(criteria, {"comments" : {"$slice" : [23, 10]}})
This would skip the first 23 elements and return the 24th through 33rd. If there are fewer than 33 elements in the array, it will return as many as possible.
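The three forms of "$slice" line up closely with JavaScript's own Array.prototype.slice, which makes the offsets easy to check. The comments array below is invented for illustration; the sketch shows which elements each form selects rather than how the server builds its result.

```javascript
// Forty illustrative comments: "comment 1" through "comment 40".
const comments = Array.from({ length: 40 }, (_, i) => "comment " + (i + 1));

const firstTen = comments.slice(0, 10);     // like {"$slice" : 10}
const lastTen = comments.slice(-10);        // like {"$slice" : -10}
const page = comments.slice(23, 23 + 10);   // like {"$slice" : [23, 10]}

console.log(firstTen[0], firstTen[9]);  // comment 1 comment 10
console.log(lastTen[0], lastTen[9]);    // comment 31 comment 40
console.log(page[0], page[9]);          // comment 24 comment 33

// With a shorter array, the page simply contains whatever is left.
const shortPage = comments.slice(0, 30).slice(23, 23 + 10);
console.log(shortPage.length);          // 7
```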
Unless otherwise specified, all keys in a document are returned when "$slice" is used. This is unlike the other key specifiers, which suppress unmentioned keys from being returned. For instance, if we had a blog post document that looked like this:
{
"_id" : ObjectId("4b2d75476cc613d5ee930164"),
"title" : "A blog post",
"content" : "...",
"comments" : [
{
"name" : "joe",
"email" : "joe@example.com",
"content" : "nice post."
},
{
"name" : "bob",
"email" : "bob@example.com",
"content" : "good post."
}
]
}
and we did a "$slice" to get the last comment, we'd get this:
> db.blog.posts.findOne(criteria, {"comments" : {"$slice" : -1}})
{
"_id" : ObjectId("4b2d75476cc613d5ee930164"),
"title" : "A blog post",
"content" : "...",
"comments" : [
{
"name" : "bob",
"email" : "bob@example.com",
"content" : "good post."
}
]
}
Both "title" and "content" are still returned, even though they weren't explicitly included in the key specifier.
Querying on Embedded Documents
There are two ways of querying for an embedded document: querying for the whole document or querying for its individual key/value pairs.
Querying for an entire embedded document works identically to a normal query. For example, if we have a document that looks like this:
{
"name" : {
"first" : "Joe",
"last" : "Schmoe"
},
"age" : 45
}
we can query for someone named Joe Schmoe with the following:
> db.people.find({"name" : {"first" : "Joe", "last" : "Schmoe"}})
However, if Joe decides to add a middle name key, suddenly this query won't work anymore; it doesn't match the entire embedded document! This type of query is also order-sensitive; {"last" : "Schmoe", "first" : "Joe"} would not be a match.
If possible, it's usually a good idea to query for just a specific key or keys of an embedded document. Then, if your schema changes, all of your queries won't suddenly break because they're no longer exact matches. You can query for embedded keys using dot-notation:
> db.people.find({"name.first" : "Joe", "name.last" : "Schmoe"})
Now, if Joe adds more keys, this query will still match his first and last names.
This dot-notation is the main difference between query documents and other document types. Query documents can contain dots, which mean "reach into an embedded document." Dot-notation is also the reason that keys in documents to be inserted cannot contain the . character. Oftentimes people run into this limitation when trying to save URLs as keys. One way to get around it is to always perform a global replace before inserting or after retrieving, substituting a character that isn't legal in URLs for the dot (.) character.
Embedded document matches can get a little tricky as the document structure gets more complicated. For example, suppose we are storing blog posts and we want to find comments by Joe that were scored at least a five. We could model the post as follows:
> db.blog.find()
{
"content" : "...",
"comments" : [
{
"author" : "joe",
"score" : 3,
"comment" : "nice post"
},
{
"author" : "mary",
"score" : 6,
"comment" : "terrible post"
}
]
}
Now, we can't query using db.blog.find({"comments" : {"author" : "joe", "score" :
{"$gte" : 5}}}). Embedded document matches have to match the whole document,
and this doesn't match the "comment" key. It also wouldn't work to do
db.blog.find({"comments.author" : "joe", "comments.score" : {"$gte" : 5}}),
because the author criteria could match a different comment than the score criteria.
That is, it would return the document shown earlier; it would match "author" :
"joe" in the first comment and "score" : 6 in the second comment.
To correctly group criteria without needing to specify every key, use "$elemMatch". This
vaguely named conditional allows you to partially specify criteria to match a single
embedded document in an array. The correct query looks like this:
> db.blog.find({"comments" : {"$elemMatch" : {"author" : "joe",
"score" : {"$gte" : 5}}}})
"$elemMatch" allows us to group our criteria. As such, it's only needed when you have
more than one key you want to match on in an embedded document.
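The difference between the dotted query and "$elemMatch" can be modeled in a few lines of plain JavaScript. This is only a sketch of the matching rule, with function names of our own choosing (matchesSeparately, matchesSameElement); it says nothing about how the server evaluates queries internally.

```javascript
var post = {
  "content" : "...",
  "comments" : [
    {"author" : "joe", "score" : 3, "comment" : "nice post"},
    {"author" : "mary", "score" : 6, "comment" : "terrible post"}
  ]
};

// Models {"comments.author" : "joe", "comments.score" : {"$gte" : 5}}:
// each criterion may be satisfied by a different array element.
function matchesSeparately(doc) {
  var someJoe = doc.comments.some(function (c) { return c.author === "joe"; });
  var someHighScore = doc.comments.some(function (c) { return c.score >= 5; });
  return someJoe && someHighScore;
}

// Models {"comments" : {"$elemMatch" : {"author" : "joe", "score" : {"$gte" : 5}}}}:
// a single element must satisfy every criterion.
function matchesSameElement(doc) {
  return doc.comments.some(function (c) {
    return c.author === "joe" && c.score >= 5;
  });
}

var dottedResult = matchesSeparately(post);      // joe's comment plus mary's score
var elemMatchResult = matchesSameElement(post);  // no single comment qualifies
```

The first function accepts the post because Joe's comment supplies the author and Mary's supplies the score; the second rejects it, which is the behavior we wanted.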
$where Queries
Key/value pairs are a fairly expressive way to query, but there are some queries that
they cannot represent. For queries that cannot be done any other way, there are
"$where" clauses, which allow you to execute arbitrary JavaScript as part of your query.
This allows you to do (almost) anything within a query.
The most common case for this is wanting to compare the values for two keys in a
document, for instance, if we had a list of items and wanted to return documents where
any two of the values are equal. Here's an example:
> db.foo.insert({"apple" : 1, "banana" : 6, "peach" : 3})
> db.foo.insert({"apple" : 8, "spinach" : 4, "watermelon" : 4})
In the second document, "spinach" and "watermelon" have the same value, so we'd like
that document returned. It's unlikely MongoDB will ever have a $ conditional for this,
so we can use a "$where" clause to do it with JavaScript:
> db.foo.find({"$where" : function () {
... for (var current in this) {
... for (var other in this) {
... if (current != other && this[current] == this[other]) {
... return true;
... }
... }
... }
... return false;
... }});
If the function returns true, the document will be part of the result set; if it returns
false, it won't be.
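You can try the logic of this function without a database by calling it on plain objects, with this bound to each object in turn. The sketch below gives the function a name of our own (hasRepeatedValue) and uses call to stand in for what the server does with each document; real stored documents also carry an "_id" key, which we leave out here.

```javascript
// Same logic as the "$where" function, given a name so it can be reused.
function hasRepeatedValue() {
  for (var current in this) {
    for (var other in this) {
      if (current != other && this[current] == this[other]) {
        return true;
      }
    }
  }
  return false;
}

var first = {"apple" : 1, "banana" : 6, "peach" : 3};
var second = {"apple" : 8, "spinach" : 4, "watermelon" : 4};

// Bind "this" to each document, as the query would.
var firstMatches = hasRepeatedValue.call(first);
var secondMatches = hasRepeatedValue.call(second);
```

Only the second object has two keys with the same value, so only it would be part of the result set.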
We used a function earlier, but you can also use strings to specify a "$where" query; the
following two "$where" queries are equivalent:
> db.foo.find({"$where" : "this.x + this.y == 10"})
> db.foo.find({"$where" : "function() { return this.x + this.y == 10; }"})
"$where" queries should not be used unless strictly necessary: they are much slower
than regular queries. Each document has to be converted from BSON to a JavaScript
object and then run through the "$where" expression. Indexes cannot be used to satisfy
a "$where", either. Hence, you should use "$where" only when there is no other way of
doing the query. You can cut down on the penalty by using other query filters in
combination with "$where". If possible, an index will be used to filter based on the
non-$where clauses; the "$where" expression will be used only to fine-tune the results.
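The saving from combining filters can be illustrated with a toy model that counts how many times the JavaScript predicate runs. Everything here is hypothetical: findWithWhere, the sample documents, and the xExceedsY predicate are ours, and a real server would narrow the candidates with an index rather than an array filter.

```javascript
// Toy model: apply the cheap criteria first, then the JavaScript predicate.
function findWithWhere(docs, cheapFilter, whereFunction) {
  var whereCalls = 0;
  var results = docs.filter(cheapFilter).filter(function (doc) {
    whereCalls++;                     // one JavaScript evaluation per surviving document
    return whereFunction.call(doc);   // "this" is the document, as in "$where"
  });
  return {"results" : results, "whereCalls" : whereCalls};
}

// 100 sample documents; every tenth one is an "order".
var docs = [];
for (var i = 0; i < 100; i++) {
  docs.push({"type" : (i % 10 === 0 ? "order" : "other"), "x" : i, "y" : 50});
}

function xExceedsY() { return this.x > this.y; }

// "$where" alone: the predicate runs against every document.
var whereOnly = findWithWhere(docs, function () { return true; }, xExceedsY);

// A regular criterion plus "$where": the predicate runs only on the orders.
var combined = findWithWhere(docs, function (doc) { return doc.type === "order"; }, xExceedsY);
```

In this model the predicate runs 100 times on its own but only 10 times when the regular criterion goes first.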
Another way of doing complex queries is to use MapReduce, which is covered in the
next chapter.
Cursors
The database returns results from find using a cursor. The client-side implementations
of cursors generally allow you to control a great deal about the eventual output of a
query. You can limit the number of results, skip over some number of results, sort
results by any combination of keys in any direction, and perform a number of other
powerful operations.
To create a cursor with the shell, put some documents into a collection, do a query on
them, and assign the results to a local variable (variables defined with "var" are local).
Here, we create a very simple collection and query it, storing the results in the cursor
variable:
> for(i=0; i<100; i++) {
... db.c.insert({x : i});
... }
> var cursor = db.c.find();
The advantage of doing this is that you can look at one result at a time. If you store the
results in a global variable or no variable at all, the MongoDB shell will automatically
iterate through and display the first couple of documents. This is what we've been
seeing up until this point, and it is often the behavior you want for seeing what's in a
collection but not for doing actual programming with the shell.
To iterate through the results, you can use the next method on the cursor. You can use
hasNext to check whether there is another result. A typical loop through results looks
like the following:
> while (cursor.hasNext()) {
... obj = cursor.next();
... // do stuff
... }
cursor.hasNext() checks that the next result exists, and cursor.next() fetches it.
The cursor class also implements the iterator interface, so you can use it in a forEach
loop:
> var cursor = db.people.find();
> cursor.forEach(function(x) {
... print(x.name);
... });
adam
matt
zak
When you call find, the shell does not query the database immediately. It waits until
you actually start requesting results to send the query, which allows you to chain
additional options onto a query before it is performed. Almost every method on a cursor
object returns the cursor itself so that you can chain them in any order. For instance,
all of the following are equivalent:
> var cursor = db.foo.find().sort({"x" : 1}).limit(1).skip(10);
> var cursor = db.foo.find().limit(1).sort({"x" : 1}).skip(10);
> var cursor = db.foo.find().skip(10).limit(1).sort({"x" : 1});
At this point, the query has not been executed yet. All of these functions merely build
the query. Now, suppose we call the following:
> cursor.hasNext()
At this point, the query will be sent to the server. The shell fetches the first 100 results
or first 4MB of results (whichever is smaller) at once so that the next calls to next or
hasNext will not have to make trips to the server. After the client has run through the
first set of results, the shell will again contact the database and ask for more results.
This process continues until the cursor is exhausted and all results have been returned.
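The lazy, batched behavior just described can be imitated with a small in-memory class. This is a sketch under our own assumptions, not the shell's actual cursor implementation: SketchCursor, fetchBatch, and roundTrips are invented names, the "server" is just an array, batches are counted by number of documents only (no size limit), and sort handles a single numeric key.

```javascript
// A toy client-side cursor over an in-memory array (not the real shell class).
function SketchCursor(docs, batchSize) {
  this.docs = docs;          // stands in for the collection on the server
  this.batchSize = batchSize;
  this.options = {};
  this.results = null;       // filled in when the query is first "sent"
  this.position = 0;         // index of the next result the server will send
  this.buffer = [];          // results fetched but not yet consumed
  this.roundTrips = 0;
}

// Option methods only record settings and return the cursor for chaining.
SketchCursor.prototype.sort = function (spec) { this.options.sort = spec; return this; };
SketchCursor.prototype.skip = function (n) { this.options.skip = n; return this; };
SketchCursor.prototype.limit = function (n) { this.options.limit = n; return this; };

SketchCursor.prototype.fetchBatch = function () {
  if (this.results === null) {
    // First contact: sort, then skip, then limit, whatever the chaining order was.
    var results = this.docs.slice();
    if (this.options.sort) {
      var key = Object.keys(this.options.sort)[0];
      var direction = this.options.sort[key];
      results.sort(function (a, b) { return direction * (a[key] - b[key]); });
    }
    var start = this.options.skip || 0;
    var end = this.options.limit ? start + this.options.limit : results.length;
    this.results = results.slice(start, end);
  }
  this.buffer = this.results.slice(this.position, this.position + this.batchSize);
  this.position += this.buffer.length;
  this.roundTrips++;
};

SketchCursor.prototype.hasNext = function () {
  var moreOnServer = this.results === null || this.position < this.results.length;
  if (this.buffer.length === 0 && moreOnServer) {
    this.fetchBatch();
  }
  return this.buffer.length > 0;
};

SketchCursor.prototype.next = function () {
  if (!this.hasNext()) { throw new Error("cursor exhausted"); }
  return this.buffer.shift();
};

var docs = [];
for (var i = 0; i < 250; i++) { docs.push({"x" : i}); }

var cursor = new SketchCursor(docs, 100).sort({"x" : 1});
var tripsBeforeIteration = cursor.roundTrips;   // nothing has been sent yet
var seen = 0;
while (cursor.hasNext()) { cursor.next(); seen++; }
```

With 250 documents and a batch size of 100, the loop visits every document while the sketch makes three trips to its pretend server, and none at all until hasNext is first called.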
Limits, Skips, and Sorts
The most common query options are limiting the number of results returned, skipping
a number of results, and sorting. All of these options must be added before a query is
sent to the database.
To set a limit, chain the limit function onto your call to find. For example, to only
return three results, use this:
> db.c.find().limit(3)
If there are fewer than three documents matching your query in the collection, only the
number of matching documents will be returned; limit sets an upper limit, not a lower
limit.
skip works similarly to limit:
> db.c.find().skip(3)
This will skip the first three matching documents and return the rest of the matches. If
there are fewer than three matching documents, it will not return any documents.
sort takes an object: a set of key/value pairs where the keys are key names and the
values are the sort directions. Sort direction can be 1 (ascending) or -1 (descending). If
multiple keys are given, the results will be sorted in that order. For instance, to sort the
results by "username" ascending and "age" descending, we do the following:
> db.c.find().sort({username : 1, age : -1})
These three methods can be combined. This is often handy for pagination. For example,
suppose that you are running an online store and someone searches for mp3. If you
want 50 results per page sorted by price from high to low, you can do the following:
> db.stock.find({"desc" : "mp3"}).limit(50).sort({"price" : -1})
If they click Next Page to see more results, you can simply add a skip to the query,
which will skip over the first 50 matches (which the user already saw on page 1):
> db.stock.find({"desc" : "mp3"}).limit(50).skip(50).sort({"price" : -1})
However, large skips are not very performant, so there are suggestions on avoiding
them in a moment.
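The arithmetic behind skip-based pagination is simple enough to sketch on an in-memory array. The function pageWithSkip and the sample stock list are hypothetical; the point is only that page n skips (n - 1) times the page size, after the results have been sorted.

```javascript
// Return one page of results, sorted by price from high to low.
function pageWithSkip(docs, pageNumber, perPage) {
  var sorted = docs.slice().sort(function (a, b) { return b.price - a.price; });
  var skip = (pageNumber - 1) * perPage;   // page 1 skips 0, page 2 skips 50, ...
  return sorted.slice(skip, skip + perPage);
}

// 120 sample items priced from 1 to 120.
var stock = [];
for (var i = 1; i <= 120; i++) {
  stock.push({"desc" : "mp3", "price" : i});
}

var page1 = pageWithSkip(stock, 1, 50);
var page2 = pageWithSkip(stock, 2, 50);
var page3 = pageWithSkip(stock, 3, 50);   // only 20 items are left for this page
```

The last page is simply shorter than the others, mirroring the way limit sets an upper bound rather than a required count.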
Comparison order
MongoDB has a hierarchy as to how types compare. Sometimes you will have a single
key with multiple types, for instance, integers and booleans, or strings and nulls. If you
do a sort on a key with a mix of types, there is a predefined order that they will be sorted
in. From least to greatest value, this ordering is as follows:
1. Minimum value
2. null
3. Numbers (integers, longs, doubles)
4. Strings
5. Object/document
6. Array
7. Binary data
8. Object ID
9. Boolean
10. Date
11. Timestamp
12. Regular expression
13. Maximum value
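The ordering above can be turned into a comparator for the types that plain JavaScript can represent. This is a sketch that follows the list literally: typeRank and compareMixed are our own names, the ranks are just the list positions, and binary data, object IDs, timestamps, and the minimum and maximum values are left out because plain JavaScript has no direct equivalent for them.

```javascript
// Rank of a value's type, using the positions from the list above.
function typeRank(value) {
  if (value === null) { return 2; }
  if (typeof value === "number") { return 3; }
  if (typeof value === "string") { return 4; }
  if (typeof value === "boolean") { return 9; }
  if (Array.isArray(value)) { return 6; }
  if (value instanceof Date) { return 10; }
  if (value instanceof RegExp) { return 12; }
  if (typeof value === "object") { return 5; }
  throw new Error("type not covered by this sketch");
}

// Compare by type first; within numbers and strings, compare by value.
function compareMixed(a, b) {
  var rankDifference = typeRank(a) - typeRank(b);
  if (rankDifference !== 0) { return rankDifference; }
  if (typeof a === "number") { return a - b; }
  if (typeof a === "string") { return a < b ? -1 : (a > b ? 1 : 0); }
  return 0; // other same-type comparisons are left out of this sketch
}

var mixed = [true, "b", null, 7, [1], {"k" : 1}, "a", 2];
var sorted = mixed.slice().sort(compareMixed);
```

Sorting the sample array puts null first, then the numbers, the strings, the embedded document, the array, and finally the boolean.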
Avoiding Large Skips
Using skip for a small number of documents is fine. For a large number of results,
skip can be slow (this is true in nearly every database, not just MongoDB) and should
be avoided. Usually you can build criteria into the documents themselves to avoid
having to do large skips, or you can calculate the next query based on the result from
the previous one.
Paginating results without skip
The easiest way to do pagination is to return the first page of results using limit and
then return each subsequent page as an offset from the beginning.
> // do not use: slow for large skips
> var page1 = db.foo.find(criteria).limit(100)
> var page2 = db.foo.find(criteria).skip(100).limit(100)
> var page3 = db.foo.find(criteria).skip(200).limit(100)
...
However, depending on your query, you can usually find a way to paginate without
skips. For example, suppose we want to display documents in descending order based
on "date". We can get the first page of results with the following:
> var page1 = db.foo.find().sort({"date" : -1}).limit(100)
Then, we can use the "date" value of the last document as the criteria for fetching the
next page:
var latest = null;
// display first page
while (page1.hasNext()) {
latest = page1.next();
display(latest);
}
// get next page: with a descending sort, later pages hold earlier dates
var page2 = db.foo.find({"date" : {"$lt" : latest.date}});
page2.sort({"date" : -1}).limit(100);
Now the query does not need to include a skip.
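The same idea can be tried on an in-memory array. In this sketch, pageAfter is a hypothetical helper of ours: because the sort is descending, each page consists of the documents whose "date" is earlier than the last date already displayed. It assumes the "date" values are distinct; documents sharing the boundary date would be missed.

```javascript
// Return the next page of documents, newest first, that come after lastDate
// in the descending order (that is, whose dates are earlier than lastDate).
function pageAfter(docs, lastDate, pageSize) {
  return docs
    .filter(function (doc) { return lastDate === null || doc.date < lastDate; })
    .sort(function (a, b) { return b.date - a.date; })
    .slice(0, pageSize);
}

// Ten sample documents dated January 1 through January 10, 2010.
var docs = [];
for (var day = 1; day <= 10; day++) {
  docs.push({"date" : new Date(2010, 0, day)});
}

var page1 = pageAfter(docs, null, 4);
var page2 = pageAfter(docs, page1[page1.length - 1].date, 4);
var page3 = pageAfter(docs, page2[page2.length - 1].date, 4);
```

Each call needs only the last date from the previous page, so the cost of fetching a page does not grow with the page number.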
Finding a random document
One fairly common problem is how to get a random document from a collection. The
naive (and slow) solution is to count the number of documents and then do a find,
skipping a random number of documents between 0 and the size of the collection.
> // do not use
> var total = db.foo.count()
> var random = Math.floor(Math.random()*total)
> db.foo.find().skip(random).limit(1)
It is actually highly inefficient to get a random element this way: you have to do a count
(which can be expensive if you are using criteria), and skipping large numbers of
elements can be time-consuming.
It takes a little forethought, but if you know you'll be looking up a random element on
a collection, there's a much more efficient way to do so. The trick is to add an extra
random key to each document when it is inserted. For instance, if we're using the shell,
we could use the Math.random() function (which creates a random number between 0
and 1):
> db.people.insert({"name" : "joe", "random" : Math.random()})
> db.people.insert({"name" : "john", "random" : Math.random()})
> db.people.insert({"name" : "jim", "random" : Math.random()})
Now, when we want to find a random document from the collection, we can calculate
a random number and use that as query criteria, instead of doing a skip:
> var random = Math.random()
> result = db.people.findOne({"random" : {"$gt" : random}})
There is a slight chance that random will be greater than any of the "random" values in
the collection, and no results will be returned. We can guard against this by simply
returning a document in the other direction:
> if (result == null) {
... result = db.people.findOne({"random" : {"$lt" : random}})
... }
If there aren't any documents in the collection, this technique will end up returning
null, which makes sense.
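The two-step lookup, including the fallback and the empty-collection case, can be checked with a plain array. This is a sketch only: findRandomDoc is a name we made up, the array stands in for the collection, and the "random" values are fixed by hand so that the outcome is repeatable.

```javascript
// Return the first document whose "random" key is above the given number,
// falling back to the first one below it, or null if there are no documents.
function findRandomDoc(docs, random) {
  var above = docs.filter(function (d) { return d.random > random; });
  if (above.length > 0) { return above[0]; }
  var below = docs.filter(function (d) { return d.random < random; });
  return below.length > 0 ? below[0] : null;
}

// Fixed "random" values, chosen by hand for this illustration.
var people = [
  {"name" : "joe", "random" : 0.25},
  {"name" : "john", "random" : 0.50},
  {"name" : "jim", "random" : 0.75}
];

var normalCase = findRandomDoc(people, 0.6);   // a document lies above 0.6
var fallbackCase = findRandomDoc(people, 0.9); // nothing above 0.9, so look below
var emptyCase = findRandomDoc([], 0.5);        // no documents at all
```

In real use the number passed in would come from Math.random(), as in the shell session above.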
This technique can be used with arbitrarily complex queries; just make sure to have an
index that includes the random key. For example, if we want to find a random plumber
in California, we can create an index on "profession", "state", and "random":
> db.people.ensureIndex({"profession" : 1, "state" : 1, "random" : 1})
This allows us to quickly find a random result (see Chapter 5 for more information on
indexing).
Advanced Query Options
There are two types of queries: wrapped and plain. A plain query is something like this:
> var cursor = db.foo.find({"foo" : "bar"})
There are a couple of options that "wrap" the query. For example, suppose we perform
a sort:
> var cursor = db.foo.find({"foo" : "bar"}).sort({"x" : 1})
Instead of sending {"foo" : "bar"} to the database as the query, the query gets wrapped
in a larger document. The shell converts the query from {"foo" : "bar"} to {"$query" :
{"foo" : "bar"}, "$orderby" : {"x" : 1}}.
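The wrapping step can be sketched as a small function that builds the larger document. The name wrapQuery is ours, and this is only an illustration of the document shapes described above, not the shell's or any driver's actual code.

```javascript
// Wrap a plain query together with its options; leave it plain if there are none.
function wrapQuery(query, options) {
  var keys = Object.keys(options || {});
  if (keys.length === 0) {
    return query;                     // plain query: sent as is
  }
  var wrapped = {"$query" : query};   // the original criteria move under "$query"
  keys.forEach(function (key) {
    wrapped[key] = options[key];      // for example "$orderby"
  });
  return wrapped;
}

var plain = wrapQuery({"foo" : "bar"});
var sorted = wrapQuery({"foo" : "bar"}, {"$orderby" : {"x" : 1}});
```

Here plain stays {"foo" : "bar"}, while sorted becomes the wrapped document with "$query" and "$orderby" keys.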
Most drivers provide helpers for adding arbitrary options to queries. Other helpful
options include the following:
$maxscan : integer
Specify the maximum number of documents that should be scanned for the query.
$min : document
Start criteria for querying.
$max : document
End criteria for querying.
$hint : document
Tell the server which index to use for the query.
$explain : boolean
Get an explanation of how the query will be executed (indexes used, number of
results, how long it takes, etc.), instead of actually doing the query.
$snapshot : boolean
Ensure that the query's results will be a consistent snapshot from the point in time
when the query was executed. See the next section for details.
Getting Consistent Results
A fairly common way of processing data is to pull it out of MongoDB, change it in some
way, and then save it again:
cursor = db.foo.find();
while (cursor.hasNext()) {
var doc = cursor.next();
doc = process(doc);
db.foo.save(doc);
}
This is fine for a small number of results, but it breaks down for large numbers of
documents. To see why, imagine how the documents are actually being stored. You
can picture a collection as a list of documents that looks something like Figure 4-1.
Snowflakes represent documents, because every document is beautiful and unique.
Figure 4-1. A collection being queried
Now, when we do a find, it starts returning results from the beginning of the collection
and moves right. Your program grabs the first 100 documents and processes them.
When you save them back to the database, if a document does not have the padding
available to grow to its new size, like in Figure 4-2, it needs to be relocated. Usually, a
document will be relocated to the end of a collection (Figure 4-3).
Figure 4-2. An enlarged document may not fit where it did before
Figure 4-3. MongoDB relocates updated documents that don't fit in their original position
Now our program continues to fetch batches of documents. When it gets toward the
end, it will return the relocated documents again (Figure 4-4)!
Figure 4-4. A cursor may return these relocated documents again in a later batch
The solution to this problem is to snapshot your query. If you add the "$snapshot"
option, the query will be run against an unchanging view of the collection. All queries
that return a single batch of results are effectively snapshotted. Inconsistencies arise
only when the collection changes under a cursor while it is waiting to get another batch
of results.
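The relocation problem is easy to reproduce in a toy model. In the sketch below, which rests entirely on our own assumptions, the collection is an array of slots, a relocated document leaves an empty slot behind and is appended at the end, and each document grows only the first time it is processed. The function processAll and its noPadding argument are hypothetical names.

```javascript
// Walk the collection from left to right, processing and saving each document.
// Documents whose ids are listed in noPadding cannot grow in place.
function processAll(ids, noPadding) {
  var slots = ids.map(function (id) { return {"_id" : id, "processed" : false}; });
  var visits = [];
  // slots.length is re-read on every pass, so relocated documents are reached too.
  for (var position = 0; position < slots.length; position++) {
    var doc = slots[position];
    if (doc === null) { continue; }   // hole left by a relocated document
    visits.push(doc._id);
    var grew = !doc.processed;        // processing adds data only the first time
    doc.processed = true;
    if (grew && noPadding.indexOf(doc._id) !== -1) {
      slots[position] = null;         // the document no longer fits here...
      slots.push(doc);                // ...so it moves to the end of the collection
    }
  }
  return visits;
}

// Documents 2 and 4 have no room to grow, so they are seen a second time.
var visits = processAll([1, 2, 3, 4, 5], [2, 4]);
```

Five documents produce seven visits in this run, because the two relocated documents turn up again at the end.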
Cursor Internals
There are two sides to a cursor: the client-facing cursor and the database cursor that
the client-side one represents. We have been talking about the client-side one up until
now, but we are going to take a brief look at what's happening on the server side.
On the server side, a cursor takes up memory and resources. Once a cursor runs out of
results or the client sends a message telling it to die, the database can free the resources
it was using. Freeing these resources lets the database use them for other things, which
is good, so we want to make sure that cursors can be freed quickly (within reason).
There are a couple of conditions that can cause the death (and subsequent cleanup) of
a cursor. First, when a cursor finishes iterating through the matching results, it will
clean itself up. Another way is that, when a cursor goes out of scope on the client side,
the drivers send the database a special message to let it know that it can kill that cursor.
Finally, even if the user hasn't iterated through all the results and the cursor is still in
scope, after 10 minutes of inactivity, a database cursor will automatically die.
This death by timeout is usually the desired behavior: very few applications expect
their users to sit around for minutes at a time waiting for results. However, sometimes
you might know that you need a cursor to last for a long time. In that case, many drivers
have implemented a function called immortal, or a similar mechanism, which tells the
database not to time out the cursor. If you turn off a cursor's timeout, you must iterate
through all of its results or make sure it gets closed. Otherwise, it will sit around in the
database hogging resources forever.
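The cleanup rules can be summarized in a toy bookkeeping class. Nothing here reflects the server's real data structures: ServerCursors and its methods are invented for illustration, time is passed in as plain millisecond numbers, and the immortal flag stands in for whatever no-timeout mechanism your driver offers.

```javascript
var TIMEOUT_MS = 10 * 60 * 1000;   // 10 minutes of inactivity

function ServerCursors() {
  this.open = {};                  // cursor id -> bookkeeping record
}

ServerCursors.prototype.create = function (id, now, immortal) {
  this.open[id] = {"lastUsed" : now, "immortal" : Boolean(immortal)};
};

// Called whenever the client asks the cursor for another batch.
ServerCursors.prototype.touch = function (id, now) {
  if (this.open[id]) { this.open[id].lastUsed = now; }
};

// Called when the cursor is exhausted or the client sends a kill message.
ServerCursors.prototype.kill = function (id) {
  delete this.open[id];
};

// Periodic sweep: remove cursors that have been idle too long, unless immortal.
ServerCursors.prototype.reap = function (now) {
  var ids = Object.keys(this.open);
  for (var i = 0; i < ids.length; i++) {
    var record = this.open[ids[i]];
    if (!record.immortal && now - record.lastUsed >= TIMEOUT_MS) {
      delete this.open[ids[i]];
    }
  }
};

var server = new ServerCursors();
server.create("normal", 0, false);
server.create("forever", 0, true);
server.reap(TIMEOUT_MS);           // "normal" has been idle for 10 minutes
var remaining = Object.keys(server.open);
```

After the sweep only the immortal cursor is left, and it stays until kill is called for it, which is why such cursors must be iterated to the end or closed explicitly.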