
Handling a whole collection

I'm wondering what the recommended way of handling a large collection would be.


Let's try this scenario:


I have a custom runtime collection that holds transactions in the form TransactionID, PlayerID, Item and Date.


And I want to know the totals of each item over given periods of time.
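
To make that concrete, here's roughly the report I mean, sketched in Python over a few invented transactions (the field names match my collection; the data is made up):

# Totals per item per month, over invented sample data.
from collections import Counter
from datetime import date

transactions = [
    {"TransactionID": 1, "PlayerID": "p1", "Item": "sword",  "Date": date(2015, 1, 3)},
    {"TransactionID": 2, "PlayerID": "p2", "Item": "sword",  "Date": date(2015, 1, 9)},
    {"TransactionID": 3, "PlayerID": "p1", "Item": "shield", "Date": date(2015, 2, 1)},
]

# Group by (month, item) and count, e.g. {("2015-01", "sword"): 2, ...}
totals = Counter((t["Date"].strftime("%Y-%m"), t["Item"]) for t in transactions)
print(totals)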


Now, I can imagine basically two solutions.


1) I get the data and process it myself (that is, I write a small program like the sketch above, or use a spreadsheet).

2) I process the data using Cloud Code.


Now, I know Cloud Code isn't supposed to take too long to execute (about 1 second), and I imagine that, given the number of reports and the quantity of data, 1 second may not be enough.


On the other hand, I don't see any way of exporting data other than the NoSQL interface, which limits results to 1,000 per query. So if I have a million transactions, that's a thousand separate queries, which would make retrieval a real pain.


So I'm asking for some advice here.


Thanks in advance!


Best Answer

Hi Omar,


The optimal solution largely depends on the expected volume of data you intend to deal with, the indexes you set up on the collection, and the aggregation queries you run. As long as your queries don't exceed the script timeout and don't violate the SLA, the Cloud Code approach should be fine, though again it depends on how often you intend to run these queries. We generally don't recommend querying large bodies of data every minute or so, for example.
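
To illustrate the index-plus-aggregation pattern: Cloud Code itself is JavaScript, but runtime collections are Mongo-backed, so a standard pipeline applies. Below is a minimal sketch in Python against a plain MongoDB collection; the collection and field names follow your schema, and the connection details are assumptions for illustration only.

from datetime import datetime
from pymongo import ASCENDING, MongoClient

# Stand-in for the runtime collection; db/collection names are assumed.
coll = MongoClient("mongodb://localhost:27017")["game"]["transactions"]

# An index on Date lets the $match stage avoid a full collection scan,
# which is what keeps a query of this shape inside a tight script timeout.
coll.create_index([("Date", ASCENDING)])

# Totals per item for January 2015.
totals = coll.aggregate([
    {"$match": {"Date": {"$gte": datetime(2015, 1, 1),
                         "$lt": datetime(2015, 2, 1)}}},
    {"$group": {"_id": "$Item", "total": {"$sum": 1}}},
    {"$sort": {"total": -1}},
])
for row in totals:
    print(row["_id"], row["total"])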


If your data needs are more demanding, we recommend a daily or weekly export of the data into a local database using the NoSQL REST API. In this case the maximum number of documents you can retrieve per find query is 10,000. By retrieving the data periodically, with the query parameters set in a small script or program, you can automate the retrieval of this data with relative ease.
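
As a rough sketch of what that periodic export could look like (note the endpoint, credential handling and parameter names below are placeholders, not the actual REST API; the paging pattern is the point):

import requests
from pymongo import MongoClient

FIND_URL = "https://example.com/nosql/find"  # placeholder, not a real endpoint
PAGE = 10000  # maximum documents per find query

# Local database the export is copied into.
local = MongoClient("mongodb://localhost:27017")["backup"]["transactions"]

def export(query):
    skip = 0
    while True:
        resp = requests.post(FIND_URL, json={
            "collection": "transactions",
            "query": query,       # e.g. restrict Date to yesterday
            "sort": {"Date": 1},  # stable order so pages don't overlap
            "skip": skip,
            "limit": PAGE,
        })
        resp.raise_for_status()
        docs = resp.json()
        if not docs:
            break
        local.insert_many(docs)
        skip += PAGE

# Run from a daily scheduler (cron or similar) with a Date filter for the
# previous day, so each export only pulls new transactions and stays small
# even as the collection grows.
export({"Date": {"$gte": "2015-01-01", "$lt": "2015-01-02"}})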


If you would like more information, please let us know your estimates of how large the collection will be and how you intend to maintain and use it when going live.


-Pádraig
