Inefficiency when first populating the Cloud with local data #208
@CapoChino I'll take a look. Perhaps we can enhance performance by adding a predicate to the pull request: find objects with simperiumKey set to nil, and fault them in. Thanks for reporting this bottleneck, much appreciated!
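The predicate-based fetch Jorge describes could look roughly like the following sketch. This is illustrative only, not Simperium's actual code; `entityName` and `context` are placeholders for whatever entity and managed object context the surrounding method uses:

```objc
// Sketch (assumption, not Simperium's actual implementation): fetch only
// the objects that still need a simperiumKey, and pull them in fully
// (not as faults) so that assigning keys later doesn't fire one
// individual fault per object.
NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
request.predicate = [NSPredicate predicateWithFormat:@"simperiumKey == nil"];
request.returnsObjectsAsFaults = NO;

NSError *error = nil;
NSArray *unkeyedObjects = [context executeFetchRequest:request error:&error];
// ...assign a new simperiumKey to each object in unkeyedObjects...
```

Objects that already have a key are never materialized at all, which is where the savings would come from on an already-synced database.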
Hi Jorge, I'm having trouble with a decent-sized database: it's ~50MB, with 7 entities and ~260K objects. I'm running the latest develop branch, clean and without the improvement suggested above. The result is now worse than before, perhaps due to change d187911 ("Added new 'markObjectWithPendingChanges' method"). When I run my app on the simulator with this DB, even without signing into Simperium, it consumes 1.15GB of memory and then crashes with a failure to allocate memory in the new function mentioned above. This is not a stress test; this is a normal user's database. Thanks for your help. Here's the log with stack trace:

2014-03-10 16:34:58.479 RaceTrace[10233:70b] Reachability Flag Status: -R -----l- networkStatusForFlags
@CapoChino thanks for reporting this! I've added issue #229 to track this crash. Let's get that low-memory crash fixed ASAP!
@CapoChino hi there, I've just profiled the
Results:
May I ask if this is still an issue for you (with the latest code)? Thanks!
Hi @jleandroperez, I've updated to the latest code and run a careful test, and I have a few findings. The top-level summary is:
Here are the nitty-gritty details:
The entire log file for my test is here: https://www.dropbox.com/s/a88048kunv3srgv/SimperiumIssue208Log.txt

18 seconds elapsed before the first log line; this is the time it took to migrate my database to add the ghostData and simperiumKey fields. 103 seconds elapsed while Simperium indexed the new objects, added keys, etc. That's 1340 entities/s, not quite the 3.3k entities/s that you got, but close enough. Not great, but it's fast enough for me. I did receive a couple of memory warnings, but it didn't crash.

I also saw a few different types of warning messages that didn't make sense. For example:

2014-03-25 17:45:31.881 RaceTrace[18383:9903] Simperium warning: couldn't processLocalObjectWithKey 5e88c9ef1559432a9b8e0134f71a6963 because the object no longer exists
2014-03-25 17:45:34.488 RaceTrace[18383:a003] Simperium POST returned error 409 for change {
2014-03-25 17:45:34.755 RaceTrace[18383:7f03] Simperium error: transform diff for a ghost member (ghost <SPGhost: 0x24044250>, memberData (null)) that doesn't exist (bLon): {

Are any of these alarming?

The device went along uploading data to Simperium very slowly for about 8 more minutes, and then my device suddenly rebooted completely! After the reboot, I ran my app again. This time, Simperium took only 4s to start up, since all the objects already had keys. Good. The bad news is that it uploaded for 10 minutes and then my device crashed/rebooted again. Let's ignore the reboots for a moment and come back to them later when we know more.

During this second run, I concurrently ran a Python script to query the number of objects in the Simperium bucket every 10 seconds. This showed how fast (er, slow) the object uploads were happening:

GpsPoint : count: 1670 time: 1395808679.18

After this the device rebooted, so obviously no new entities were pushed up. Two things stand out:
For reference: I've had a quick look with Instruments, and it looks like most of the upload time is spent in the new saveAndWait function and in some WebSocket function. I'll look again more carefully and report back with findings. Please let me know if you have any questions. As always, thank you for your diligent work responding to concerns and improving your awesome product.
@CapoChino Can you send us your app-id so we can check to make sure everything is OK server side? Also, we think you should find a way to consolidate your data points (perhaps batch them in groups of 1000 or 2000). This will cut down the number of requests, and you should see better response times. As I mentioned in the email, the stack isn't currently optimized for the use case of hundreds of thousands of very small objects.
@CapoChino thanks for sharing the details! I've been working on this issue for a couple of hours. So far I've managed to replicate the slow sync (but no phone reboots, so far). A couple of notes:
- Error 409
- Simperium warning: couldn't processLocalObjectWithKey... because the object no longer exists
- Simperium error: transform diff for a ghost member (ghost, memberData (null)) that doesn't exist (bLon)
- Profiling

I'll keep you posted,
Hi Jorge, I've done some profiling too. You're right; it does look like Core Data access is one of the bigger bottlenecks. That's actually good news, as Core Data provides lots of opportunities for optimization; it can be really fast, actually. I've been digging into the code, and I do see some good usage of Core Data. However, my profiling results pointed me to one alarming thing: processLocalObjectWithKey is taking about 25% of the total time. That function does a Core Data fetch for a single object. It's called in an enumeration of all pending changes, which, in the case we're looking at here, is ~150k local objects being pushed to the cloud. I think enumerateQueuedChangesForBucket could be reworked to fetch all (or a batch) of the changed objects and then operate on the chunk instead of fetching one at a time. The whole thing might be exacerbated by thread thrashing, since it looks like two queues operate in unison to do this push: the object processor queue and the persistent mutable set queue responsible for handling keysForObjectsWithMoreChanges. The saveAndWait function of the latter queue is also taking about 25% of the total time. It looks like save is being called an extra time with each object pushed. Also, maybe we could batch the changes and save only every X objects. I'll send you an email with the Instruments sample data. What do you think?
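The batched-fetch rework suggested above could be sketched like this. Everything here is hypothetical (the batch size, `pendingKeys`, `entityName`, and `context` are illustrative placeholders, not Simperium's actual internals); the point is replacing ~150k single-object fetches with one `IN`-predicate fetch per chunk:

```objc
// Sketch (assumption): instead of one Core Data fetch per pending key,
// fetch a chunk of changed objects in a single round trip, process the
// whole chunk, and save once per batch rather than once per object.
static const NSUInteger kBatchSize = 500; // illustrative tuning knob

NSArray *allKeys = [pendingKeys allObjects]; // e.g. keysForObjectsWithMoreChanges
for (NSUInteger i = 0; i < allKeys.count; i += kBatchSize) {
    NSRange range = NSMakeRange(i, MIN(kBatchSize, allKeys.count - i));
    NSArray *chunk = [allKeys subarrayWithRange:range];

    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
    request.predicate = [NSPredicate predicateWithFormat:@"simperiumKey IN %@", chunk];
    request.returnsObjectsAsFaults = NO;

    NSError *error = nil;
    NSArray *objects = [context executeFetchRequest:request error:&error];
    // ...diff and enqueue each object in the chunk...
    [context save:&error]; // one save per batch, not per object
}
```

This also addresses the saveAndWait observation: the save amortizes over the whole batch instead of firing once per pushed object.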
Oh, a couple more notes: I ran on my phone and didn't get the device reboot. Maybe it only happens when debugging. So that's good. It means that worst-case, maybe I could just tolerate the slow upload. I hesitate to bundle objects on my end for a few reasons:
But if there's not enough development time, or we can't do much better in the Simperium code, I can make do somehow: either tolerate slow uploads or bundle objects.
@CapoChino hey there! I've just pushed a couple of optimizations in this branch. Whenever possible, would you please give it a shot? Remote changes are now effectively processed in batches, and local changes as well! Thanks in advance!
That is so awesome, Jorge! I can't wait to try it out. I'll report back. Thank you!!
No problem! In simulator tests, memory peaks at 100MB maximum. Syncing 50k objects takes between 15-35 minutes, but I'm unable to trigger any stalling behavior (or low-memory crashes, so far). Thanks for testing this out!
OK, running the same test as before, I got a very steady 5.5 objects/s. In an ideal world, the upload would be faster, but I'm satisfied with this.
@CapoChino thanks for the feedback!! I've merged a couple of changes into the 208 branch that will prevent weird scenarios. As soon as a couple of critical bugs are solved, I'll take a look at how we could further improve performance. If you have any issues with the current performance, please just let us know. I'll be merging those changes into develop shortly! (By the way, congratulations on your app, it looks awesome!)
Issue #208: Performance Improvements
@CapoChino please, if you're still having any kind of issues, let us know. Let's track further optimizations in new, specific issues. Thank you!
Just for reference/information, I pulled the latest code yesterday and tested again with large and small databases, on both an actual iPhone 5 and the simulator. The general finding is that the small database uploaded at a much faster RATE (objects per second) than the large one. I haven't profiled again since last time, but I'm betting much of the time is spent saving the SPMutableSet-keysForObjectsWithMoreChanges-*.dat files. One of my .dat files is 5MB. Detailed results:
Note that the rate was 6 times faster on the smaller database. Again, this is for future reference purposes; I'm not requesting further work on this specific issue.
@CapoChino thanks for the update on this! Let's track new speed enhancements in this new ticket: #274. Thank you!!
I was testing with a small sample database of ~3000 objects. On the simulator, the startWithAppID function was taking about 45s. I noticed it was firing a fault for every object as it was creating and setting a new simperiumKey for each object.
The code responsible for this is in validateObjectsForBucketName in SPCoreDataStorage.m. It tries to be nice and prefetch only the objectId and simperiumId, which I think would be sufficient in most cases. However, when simperiumKey is nil, this causes a fault to fire for the object.
In objectKeysAndIdsForBucketName I added the following line to fetch all objects in one trip to the DB:

```objc
[NSClassFromString(entity.managedObjectClassName) MR_findAll];
```
This uses MagicalRecord but could easily be written as a plain Core Data fetch. I understand that this fetch may not be necessary in other cases. However, for the first-time push to the cloud, it dropped the 45s delay to 10s. For a much larger database, I had previously waited hours before giving up; with this change, it finished, I think in a matter of 30s or so. I didn't time that one, nor do I remember exactly. All I know is that it did actually finish.
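For readers not using MagicalRecord, a plain Core Data equivalent of that one-liner might look like the sketch below (`entity` and `context` are assumed to come from the surrounding method, as in the MagicalRecord version):

```objc
// Sketch of a plain Core Data replacement for MR_findAll: fetch every
// object of the entity in one trip to the store, fully materialized,
// so the subsequent per-object simperiumKey assignment doesn't fire
// an individual fault for each of the ~3000 objects.
NSFetchRequest *request = [[NSFetchRequest alloc] init];
request.entity = entity;
request.returnsObjectsAsFaults = NO;

NSError *error = nil;
NSArray *allObjects = [context executeFetchRequest:request error:&error];
```

Note the memory trade-off: materializing every object up front is exactly what you want for a first-time push, but on very large databases it may be worth combining this with a predicate (e.g. simperiumKey == nil) or batching so only unkeyed objects are loaded.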