mongodb/mongo-php-driver

Get a random document from a very large collection efficiently. #772
Closed
saxenapawan800 opened this issue on Feb 27 · 5 comments

2 participants

saxenapawan800 commented on Feb 27 • edited

Description

How can I get a random document from a very large collection (10,000,000 documents) using the
new mongodb-php library?
So far I have tried:

    $limit = 1;
    $skip = mt_rand(0, $collection->count());
    $skip = $skip < 0 ? 0 : $skip;

    $options = array(
        "limit" => $limit,
        "skip" => $skip
    );
    $collection->find( $conditions, $options );

But it doesn't give me any record at all.

Environment

OS : centos 7
Mongo version : 3.6
mongo php drivers : 1.4

output of $ php -i | grep -E 'mongodb|libmongoc|libbson' :


/etc/php.d/60-mongodb.ini,
mongodb
libbson bundled version => 1.9.2
libmongoc bundled version => 1.9.2
libmongoc SSL => enabled
libmongoc SSL library => OpenSSL
libmongoc crypto => enabled
libmongoc crypto library => libcrypto
libmongoc crypto system profile => disabled
libmongoc SASL => disabled
libmongoc compression => enabled
libmongoc compression snappy => disabled
libmongoc compression zlib => enabled
mongodb.debug => no value => no value

jmikola (Member) commented on Feb 28

See https://stackoverflow.com/questions/2824157/random-record-from-mongodb for an existing
discussion on this topic. The stand-out solutions are:

- Using $sample aggregation operator
- Finding the min and max _id values (assuming they are ObjectId types) and computing
  a random, synthetic ObjectId between those values to use in a $gte query
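
As a minimal sketch of the first option, the pipeline below asks the server for random documents through the $sample stage (available since MongoDB 3.2). The helper name buildSamplePipeline() and the 'status' filter are made up for illustration; $collection is assumed to be a MongoDB\Collection from the mongodb/mongodb library.

```php
<?php
// Hypothetical helper: builds an aggregation pipeline that returns
// $size random documents, optionally restricted by $filter.
function buildSamplePipeline(array $filter = [], int $size = 1): array
{
    $pipeline = [];
    if ($filter !== []) {
        // Narrow the candidates first, like the $conditions passed to find().
        // An empty PHP array is skipped because it would not serialize as a document.
        $pipeline[] = ['$match' => $filter];
    }
    $pipeline[] = ['$sample' => ['size' => $size]];

    return $pipeline;
}

// Usage, assuming $collection is a MongoDB\Collection:
// foreach ($collection->aggregate(buildSamplePipeline(['status' => 'active'])) as $document) {
//     var_dump($document);
// }
```

aggregate() returns a cursor, so the result has to be iterated even when only one document is requested.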

There is also an old cookbook article on the subject, although I'm not sure if it's published
anywhere (that links to the RST source).
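
The second option (a synthetic ObjectId used in a $gte query) could look roughly like the sketch below. It relies on the first 4 bytes of an ObjectId being a timestamp: it picks a random timestamp between the two boundary IDs and pads the rest with zeros. The helper name randomObjectIdHex() is hypothetical, and the result is not uniformly random, since documents that follow a long gap in insertion time are picked more often.

```php
<?php
// Hypothetical helper: returns a 24-character hex string usable as a
// synthetic ObjectId. $minHex and $maxHex are the hex forms of the
// smallest and largest _id in the collection (assumed to be ObjectIds,
// with $minHex <= $maxHex).
function randomObjectIdHex(string $minHex, string $maxHex): string
{
    // The first 8 hex digits of an ObjectId encode its creation timestamp.
    $minTimestamp = (int) hexdec(substr($minHex, 0, 8));
    $maxTimestamp = (int) hexdec(substr($maxHex, 0, 8));
    $timestamp = random_int($minTimestamp, $maxTimestamp);

    // Zero padding keeps the value at or below $maxHex, so the $gte
    // query always has at least one document to return.
    return sprintf('%08x', $timestamp) . str_repeat('0', 16);
}

// Usage, assuming $collection is a MongoDB\Collection:
// $min = $collection->findOne([], ['sort' => ['_id' => 1], 'projection' => ['_id' => 1]]);
// $max = $collection->findOne([], ['sort' => ['_id' => -1], 'projection' => ['_id' => 1]]);
// $id  = new MongoDB\BSON\ObjectId(randomObjectIdHex((string) $min['_id'], (string) $max['_id']));
// $doc = $collection->findOne(['_id' => ['$gte' => $id]], ['sort' => ['_id' => 1]]);
```

Because the query walks the _id index from a computed starting point, it avoids scanning skipped documents.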

Note that using skip with a large offset is very inefficient, as it forces the server to scan through
all skipped documents. This applies to your example as well as using limit and skip to
implement crude pagination.

saxenapawan800 commented on Mar 1 • edited by jmikola

@jmikola thanks mate. I appreciate the time you took to look at my issue.
I did resolve it via $sample aggregation.
Though I am wondering now:

    $this->collection->update(
        array( 'field1' => $field1_value ),
        array(
            array('some_field' => $some_field_value),
            array('field_usage_timestamp' => time()),
            array('$inc' => array('field_usage' => 1))
        )
    );

Basically, I want to find a document using some_field and then increment field_usage if the
document has this field; if not, insert the field field_usage with a value of 1.
Any ideas would be highly appreciated. :)

jmikola (Member) commented on Mar 1 • edited

MongoDB\Collection does not have an update() method, so please make sure you're up to date
on the library documentation. Differences from the legacy mongo extension's API are discussed in
the library's Upgrade Guide. While the MongoDB shell does have a multi-purpose
db.collection.update() method to allow for updateOne, updateMany, and replaceOne
operations, the drivers currently provide explicit methods for those three variations of
the update command.
Furthermore, the second argument to update() in your example would not be a valid argument
for either an atomic update or full-document replacement. I would suggest you read through
the Update Documents tutorial in the MongoDB manual. You can select "PHP" at the top of the
document to display syntax examples for the PHP library.
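
To make the three explicit methods concrete, here is a small sketch of how each one is called. The function name demoExplicitMethods() and the field names 'sku', 'qty', and 'in_stock' are invented for illustration; $collection is assumed to be a MongoDB\Collection.

```php
<?php
// Sketch only: shows the argument shape of the three explicit methods.
// $collection is assumed to be a MongoDB\Collection; all field names
// below are made up.
function demoExplicitMethods(object $collection): void
{
    // updateOne(): applies update operators to the first matching document.
    $collection->updateOne(['sku' => 'a1'], ['$set' => ['qty' => 5]]);

    // updateMany(): applies update operators to every matching document.
    $collection->updateMany(['qty' => ['$lt' => 1]], ['$set' => ['in_stock' => false]]);

    // replaceOne(): swaps the whole document (no $ operators), keeping its _id.
    $collection->replaceOne(['sku' => 'a1'], ['sku' => 'a1', 'qty' => 0]);
}
```

The first two methods expect an update document whose top-level keys are operators such as $set, while replaceOne() expects a plain document.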
> I find a document using some_field

In that case, some_field should be used as the first argument, rather than the second. You're
currently matching a document only by the value of field1.
To explain why the second argument is invalid, I'll expand it to how it would serialize in BSON:

    [
        0 => [ 'some_field' => $some_field_value ],
        1 => [ 'field_usage_timestamp' => time() ],
        2 => [ '$inc' => [ 'field_usage' => 1 ] ],
    ]

This would not be a valid replacement document because $inc appears as a field key (document
fields cannot start with "$"). If $inc did not appear, the replacement document would be valid,
but you'd be overwriting the original document so that it only kept its _id field followed by the
three numerically named fields (the integer keys would end up as strings in BSON when added to
the root document).
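
The expansion above can be reproduced in plain PHP without a server. In this sketch, json_encode() on an object cast is used only as a stand-in for BSON serialization of a root document, the helper findDollarKeys() is hypothetical, and the literal values replace $some_field_value and time().

```php
<?php
// Hypothetical helper: returns the dotted paths of all keys that start
// with "$", which are not allowed as field names in a stored document.
function findDollarKeys(array $document, string $path = ''): array
{
    $found = [];
    foreach ($document as $key => $value) {
        $here = $path === '' ? (string) $key : $path . '.' . $key;
        if (is_string($key) && $key !== '' && $key[0] === '$') {
            $found[] = $here;
        }
        if (is_array($value)) {
            $found = array_merge($found, findDollarKeys($value, $here));
        }
    }

    return $found;
}

// The second argument from the question, with placeholder values.
$secondArgument = [
    ['some_field' => 'abc'],
    ['field_usage_timestamp' => 1519862400],
    ['$inc' => ['field_usage' => 1]],
];

// Casting to object shows the list keys becoming field names "0", "1", "2".
echo json_encode((object) $secondArgument), PHP_EOL;

// Lists the offending key nested under field "2".
print_r(findDollarKeys($secondArgument));
```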
That said, if you were intending to instead set some_field and field_usage_timestamp and
increment field_usage, the following update document could be used:

    [
        '$set' => [
            'some_field' => $some_field_value,
            'field_usage_timestamp' => time(),
        ],
        '$inc' => [ 'field_usage' => 1 ],
    ]

> try to increment field_usage if document has this field and if not insert this field field_usage with
> a value of 1.

Per $inc: Behavior in the MongoDB manual, that is exactly what already happens.
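
Putting the pieces together, a complete call might look like the sketch below. The function names buildUsageUpdate() and recordUsage() are hypothetical, $collection is assumed to be a MongoDB\Collection, and the filter on field1 mirrors the original example.

```php
<?php
// Hypothetical helper: builds the update document described above.
function buildUsageUpdate($someFieldValue, int $timestamp): array
{
    return [
        '$set' => [
            'some_field' => $someFieldValue,
            'field_usage_timestamp' => $timestamp,
        ],
        // $inc adds 1 to field_usage, or creates the field with the
        // value 1 when the matched document does not have it yet.
        '$inc' => ['field_usage' => 1],
    ];
}

// Hypothetical wrapper: updates the first document matching field1.
// $collection is assumed to be a MongoDB\Collection.
function recordUsage(object $collection, $field1Value, $someFieldValue): void
{
    $collection->updateOne(
        ['field1' => $field1Value],
        buildUsageUpdate($someFieldValue, time())
    );
}
```

To match on some_field instead, move it from $set into the filter array passed as the first argument.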

saxenapawan800 commented on Mar 7

@jmikola thanks mate. Finding the document by some_field was a typo in my question.
I got it working this way, but I think I should have asked you about the main issue in the first
place. Could you kindly take a look?
https://stackoverflow.com/questions/49053539/how-to-group-mongodb-documents-in-php

jmikola (Member) commented on Mar 7 • edited

I replied in the Stack Overflow thread. Closing this out as I believe the question here has been
addressed.

jmikola closed this on Mar 7

