We have a seasonEnd() call that will reward the entire playerbase. The reward call writes to multiple collections.
Instead of running into the 30-second timeout, I was thinking of having the seasonEnd() call create a Spark.getScheduler().inSeconds() timer for each player. That timer would then reward the player.
If I set secondsToExpire = 1 for all players, what happens?
I'm mostly curious about the concurrency of timer execution. If it's one at a time, setting them all to the same value might not be a problem.
If it does create a problem, what's an acceptable range for secondsToExpire?
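Roughly what I have in mind is the sketch below. The Spark object is stubbed so the loop can run outside the platform, and REWARD_PLAYER is a placeholder module short code, not something we have yet:

```javascript
// Sketch of scheduling one timer per player, assuming the documented
// signature inSeconds(shortCode, delaySeconds, data). Spark is stubbed
// here so the logic is runnable outside GameSparks.
var scheduled = [];
var Spark = {
  getScheduler: function () {
    return {
      inSeconds: function (shortCode, delaySeconds, data) {
        scheduled.push({ shortCode: shortCode, delaySeconds: delaySeconds, data: data });
        return true; // assumed success flag
      }
    };
  }
};

// Hypothetical list of player IDs gathered inside seasonEnd()
var playerIds = ["player-1", "player-2", "player-3"];
playerIds.forEach(function (playerId) {
  Spark.getScheduler().inSeconds("REWARD_PLAYER", 1, { playerId: playerId });
});

console.log(scheduled.length); // 3 timers queued, all expiring in ~1 second
```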
SparkScheduler will run sequentially for all players, so setting the expiry to 1 will likely reward only a handful of players rather than the entire player base.
I can't recommend an appropriate value for secondsToExpire, as this approach won't scale well as the number of users grows.
I would recommend instead utilising Bulk Jobs to reward a set number of players at a time.
What if a loop was made to partition the data and schedule a bunch of identical jobs, each with a different set of player records? Would that affect scheduler performance? In other words: if the Scheduler is set to run 1000 instances of the same module at the same time, each handling a subset of 1000 players (so processing them in parallel), would that improve performance as expected, rather than waiting for one record to be processed at a time? I don't know enough about MongoDB... is there locking in MongoDB where, while one process is writing, no other process can write until it completes?
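The partitioning itself is the easy part. Something like this (plain JavaScript, batch size picked to match my example above):

```javascript
// Split a list of player IDs into fixed-size batches, with the idea
// of handing each batch to one scheduled job.
function partition(playerIds, batchSize) {
  var batches = [];
  for (var i = 0; i < playerIds.length; i += batchSize) {
    batches.push(playerIds.slice(i, i + batchSize));
  }
  return batches;
}

// e.g. 2500 players in batches of 1000 -> 3 batches (1000, 1000, 500)
var ids = [];
for (var n = 0; n < 2500; n++) { ids.push("player-" + n); }
var batches = partition(ids, 1000);
console.log(batches.length);    // 3
console.log(batches[2].length); // 500
```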
>> SparkScheduler will run sequentially for all players, so setting the expiry to 1 will likely reward only a handful of players rather than the entire player base.
Wait, is it possible that the Scheduler won't execute some of the timers? Let's say that, just by coincidence, a lot of the player base's timers expire at the same time. How do those execute?
>> I would recommend instead utilising Bulk Jobs to reward a set number of players at a time.
My understanding is that this feature is not available from GS_HOURLY, which is where my code runs. Is this still true?
You are unable to submit a bulk job from a System Script; I was unaware that you were using GS_HOURLY.
Can I get some clarification from you?
You've mentioned secondsToExpire. Where are you defining this? Is this your own parameter?
How is seasonEnd being called, and how does this fit in with the GS_HOURLY System Script?
Sorry, not secondsToExpire but delaySeconds, per:
signature inSeconds(string shortCode, number delaySeconds, JSON data)
GS_HOURLY calls checkSeasonEnd(), and if a season has ended (usually at midnight) it calls seasonEnd(), which attempts to reward the entire player base.
I was thinking of using SparkScheduler.inSeconds() to create a scheduled task for every player. The task would do the rewarding. This would help avoid the 30-second execution limit because each reward action happens independently. I'm also hoping that it will parallelize the rewarding. On that note, could you tell me whether SparkScheduler executes tasks sequentially or in parallel?
Thank you for the further clarification.
SparkScheduler runs sequentially, not in parallel; this is what our Bulk Jobs were designed for. Unfortunately, you can't use them in a System Script. You would have to queue up SparkScheduler tasks one after the other, and also fetch a list of playerIds in the module, since the GS_HOURLY script doesn't run in the context of any player.
Other clients have run into this issue as well. A solution that tends to work: instead of running a script that awards all the players at once, have your checkSeasonEnd() function write a document to a collection with a timestamp or ID, indicating that a particular season has ended. Then, in the Cloud Code for AuthenticationResponse, when a player has successfully authenticated (logged back in), do an additional check on this collection. If there is an entry with a new ID, award that player and write that ID or timestamp to a "history" collection, so the player has a record of which seasons they participated in (and is only awarded once).
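For illustration, the flow might look like the sketch below, with plain objects standing in for the two runtime collections so it can run anywhere (all collection and field names here are made up):

```javascript
// In-memory stand-ins for two runtime collections:
// seasonEnds: documents written by checkSeasonEnd() when a season closes
// rewardHistory: per-player record of seasons already awarded
var seasonEnds = [{ seasonId: "2019-03" }, { seasonId: "2019-04" }];
var rewardHistory = { "player-1": ["2019-03"] };

// Called from AuthenticationResponse Cloud Code after a successful login.
function awardPendingSeasons(playerId, grantReward) {
  var history = rewardHistory[playerId] || (rewardHistory[playerId] = []);
  seasonEnds.forEach(function (season) {
    if (history.indexOf(season.seasonId) === -1) {
      grantReward(playerId, season.seasonId); // perform the actual reward
      history.push(season.seasonId);          // record it, so it is granted only once
    }
  });
}

var granted = [];
awardPendingSeasons("player-1", function (p, s) { granted.push(s); });
console.log(granted); // ["2019-04"] -- 2019-03 was already in the history
```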
Does this seem like an acceptable solution?
Yes, that solution will work, but I do feel we will run into concurrency issues. The rewarding action won't be uniform, and if enough people log in at the same time, it might overload the database.
I will try it this way for now.
Is there a way to work around the Bulk Jobs and GS_HOURLY limitation? Perhaps schedule a task that then sets up the bulk job?
Also, let's assume I did decide to schedule 100 SparkScheduler tasks. If they all had a delaySeconds of 1 and each took 2 seconds to complete, how would that behave?
Would all 100 tasks be done after 200 seconds? Would some tasks be dropped?
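For concreteness, here is the arithmetic I'm assuming if the tasks really do run one at a time (this toy model says nothing about whether the platform drops overdue timers, which is the part I'm unsure of):

```javascript
// Toy model: all tasks become due at t = dueAt seconds, but a strictly
// sequential scheduler runs them one at a time, each taking
// durationSeconds of execution.
function sequentialFinishTimes(taskCount, dueAt, durationSeconds) {
  var finishTimes = [];
  var clock = 0;
  for (var i = 0; i < taskCount; i++) {
    clock = Math.max(clock, dueAt) + durationSeconds;
    finishTimes.push(clock);
  }
  return finishTimes;
}

var finishes = sequentialFinishTimes(100, 1, 2);
console.log(finishes[0]);  // 3   (due at 1s, done at 3s)
console.log(finishes[99]); // 201 (last task finishes ~200s after the first starts)
```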
According to this, bulk jobs are no longer a problem. Do you recommend I implement bulk jobs for this task?
I have to be honest, I wasn't aware that this change was coming to the platform. I actually laughed out loud when I read the Release notes myself :D
Yes, I think using Bulk Jobs for this task should work fine for now. The only limitation is that only one bulk job can run at a time for any given script; so, for example, if the job for one hour doesn't finish by the next hour, it will fail. I don't see this happening, though; Bulk Jobs are generally quite fast.
Again, just be aware that this will generally run for every player at around the same time; if every player is trying to update the same Mongo document, there will be concurrency problems, so have a think about how best to handle simultaneous reads and writes of the same resource.
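To illustrate the lost-update problem with a toy model (plain JavaScript, no Mongo involved): two writers that each read a shared counter before either writes back will lose an update, which is what MongoDB's atomic single-document update operators such as $inc avoid by applying the change server-side.

```javascript
var doc = { totalRewards: 0 };

// Non-atomic read-modify-write: both "players" read before either writes.
var readA = doc.totalRewards;
var readB = doc.totalRewards;
doc.totalRewards = readA + 1;
doc.totalRewards = readB + 1;
var racyResult = doc.totalRewards;
console.log(racyResult); // 1, not 2 -- one update was lost

// Atomic-style: each update is applied against the current state,
// analogous to a server-side $inc.
doc.totalRewards = 0;
doc.totalRewards += 1;
doc.totalRewards += 1;
console.log(doc.totalRewards); // 2
```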
>> Yes, I think using Bulk Jobs for this task should work fine for now. The only limitation is that only one bulk job can run at a time for any given script; so, for example, if the job for one hour doesn't finish by the next hour, it will fail.
Can you describe this in more detail? How does it fail? What exactly fails?
I assume he meant that if the GS_HOURLY run extends past an hour, it will be killed to make way for the next GS_HOURLY update.
Also, bulk jobs require a query on the player collection. I would like to run a query on a different collection; it would still produce playerIds.
In MongoDB, any kind of join is done via $lookup, but it is only available through aggregate: https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/
Do you have suggestions for how I could run a bulk job query on a different collection? Or some kind of hacky workaround?
I should also mention that, in an ideal world, we want to be able to query a leaderboard to populate the playerIds for a bulk job.
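For example, something like this is what I'd hope to do: gather the IDs from the leaderboard first, then build the player-collection query from them. The _id shape is a guess on my part, so it would need checking against how the player collection actually keys its documents:

```javascript
// Build a player-collection query from IDs obtained elsewhere,
// e.g. the top N entries of a leaderboard.
function playerQueryFromIds(playerIds) {
  return { _id: { $in: playerIds } };
}

var topPlayers = ["id-1", "id-2", "id-3"]; // would come from the leaderboard
var query = playerQueryFromIds(topPlayers);
console.log(JSON.stringify(query)); // {"_id":{"$in":["id-1","id-2","id-3"]}}
```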