Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inefficiency when first populating the Cloud with local data #208

Closed
CapoChino opened this issue Feb 10, 2014 · 17 comments
Closed

Inefficiency when first populating the Cloud with local data #208

CapoChino opened this issue Feb 10, 2014 · 17 comments
Assignees
Milestone

Comments

@CapoChino
Copy link

I was testing with a small sample database of ~3000 objects. On the simulator, the startWithAppID function was taking about 45s. I noticed it was firing a fault for every object as it was creating and setting a new simperiumKey for each object.

The code responsible for this is in validateObjectsForBucketName in SPCoreDataStorage.m. It tries to be nice and prefetch only the objectId and simperiumId, which I think would be sufficient in most cases. However, when simperiumKey is nil, this causes a fault to fire for the object.

In objectKeysAndIdsForBucketName I added the following line to fetch all objects in one trip to the db:

[NSClassFromString(entity.managedObjectClassName) MR_findAll];

This uses MagicalRecord but could easily be written as a plain Core Data fetch. I understand that this fetch may not be necessary in other cases. However, for the first-time push to the cloud, it dropped the 45s delay to 10s. For a much larger database, I waited hours before giving up. With this change, it finished, I think in a matter of 30s or so. Didn't time that one nor do I remember. All I know is that it did actually finish.

@jleandroperez
Copy link
Contributor

@CapoChino I'll take a look, perhaps we can enhance the performance by adding a predicate to the pull request: find objects with simperiumKey set to nil, and return fault them.

Thanks for reporting this bottleneck, much appreciated!.

@jleandroperez jleandroperez added this to the v0.7.0 milestone Feb 11, 2014
@jleandroperez jleandroperez self-assigned this Feb 11, 2014
@CapoChino
Copy link
Author

Hi Jorge,

I'm having trouble with a decent-sized database. It's ~50MB. There are 7 entities. There are ~260K objects in the DB. I'm running with the latest develop branch, clean and without the improvement suggested above. Now the result is worse than before, perhaps due to change d187911 ("Added new ‘markObjectWithPendingChanges’ method").

When I run my app on the simulator with this DB, even without signing into Simperium, it consumes 1.15GB of memory then crashes with failure to alloc mem in the new function mentioned above. This is not a stress test. This is a normal user's database.

Thanks for your help.

Here's the log with stack trace:

2014-03-10 16:34:58.479 RaceTrace[10233:70b] Reachability Flag Status: -R -----l- networkStatusForFlags
2014-03-10 16:34:58.488 RaceTrace[10233:70b] Simperium managing 152 Session object instances
2014-03-10 16:34:58.494 RaceTrace[10233:70b] Simperium managing 16 Track object instances
2014-03-10 16:34:58.489 RaceTrace[10233:f03] Simperium starting...
2014-03-10 16:34:58.497 RaceTrace[10233:f03] Simperium starting...
2014-03-10 16:34:58.499 RaceTrace[10233:f03] Simperium loaded 6 entity definitions
2014-03-10 16:34:58.656 RaceTrace[10233:70b] Simperium managing 1061 Sector object instances
2014-03-10 16:34:59.081 RaceTrace[10233:70b] Simperium managing 1061 Lap object instances
2014-03-10 16:35:18.802 RaceTrace[10233:70b] Simperium managing 255876 GpsPoint object instances
2014-03-10 16:36:08.287 RaceTrace[10233:70b] Simperium managing 17 SectorLine object instances
2014-03-10 16:36:08.289 RaceTrace[10233:3d17] Simperium didn't find an existing auth token (username (null); token (null); appID: prop-carriers-18f)
2014-03-10 16:36:13.273 RaceTrace[10233:70b] Unbalanced calls to begin/end appearance transitions for <UINavigationController: 0xb7a0e90>.
RaceTrace(10233,0xb0115000) malloc: *** mach_vm_map(size=151552) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
2014-03-10 16:37:04.140 RaceTrace[10233:380b] * Terminating app due to uncaught exception 'NSMallocException', reason: '* NSAllocateObject(): attempt to allocate object of class '__NSArrayI' failed'
*** First throw call stack:
(
0 CoreFoundation 0x035635e4 __exceptionPreprocess + 180
1 libobjc.A.dylib 0x02bd48b6 objc_exception_throw + 44
2 CoreFoundation 0x03515823 __CFAllocateObject2 + 275
3 CoreFoundation 0x035e00fb +[__NSArrayI __new:::] + 43
4 CoreFoundation 0x03517271 -[__NSPlaceholderArray initWithObjects:count:] + 225
5 CoreFoundation 0x03529d52 +[NSArray arrayWithObjects:count:] + 66
6 CoreFoundation 0x0354ae03 -[NSSet allObjects] + 243
7 RaceTrace 0x000d8e84 -[SPPersistentMutableSet save] + 84
8 RaceTrace 0x0010561d -[SPChangeProcessor markObjectWithPendingChanges:bucket:] + 653
9 RaceTrace 0x0014d672 __40-[SPWebSocketChannel sendObjectChanges:]_block_invoke + 306
10 libdispatch.dylib 0x02ef97f8 _dispatch_call_block_and_release + 15
11 libdispatch.dylib 0x02f0e4b0 _dispatch_client_callout + 14
12 libdispatch.dylib 0x02efc07f _dispatch_queue_drain + 452
13 libdispatch.dylib 0x02efbe7a _dispatch_queue_invoke + 128
14 libdispatch.dylib 0x02efce1f _dispatch_root_queue_drain + 83
15 libdispatch.dylib 0x02efd137 _dispatch_worker_thread2 + 39
16 libsystem_pthread.dylib 0x0329adab _pthread_wqthread + 336
17 libsystem_pthread.dylib 0x0329ecce start_wqthread + 30
)
libc++abi.dylib: terminating with uncaught exception of type NSException

@jleandroperez
Copy link
Contributor

@CapoChino thanks for reporting this!. Added issue #229 to track this crash. Let's get that low mem crash fixed ASAP!

@jleandroperez
Copy link
Contributor

@CapoChino hi there,

I've just profiled the validateObjectsForBucketName method. Scenario:

  1. An app with no simperium sync, inserts 10k objects.
  2. Updated the data model (added simperiumKey and simperiumGhost + initialized simperium).

Results:

  • With Verbose Logging disabled, it's taking 2.9 seconds to process 10k objects (iPhone 5).
  • Verbose logging enabled, it took 14 seconds.

May i ask you if this is still an issue for you (with the latest code!)?.

Thanks!

@CapoChino
Copy link
Author

Hi @jleandroperez,

I've updated to the latest code and run a careful test. I’ve run a careful test and have a few findings. The top level summary is:

  1. Low Memory Issues with SPPersistentMutableSet #229 (low memory crash) seems fixed
  2. Inefficiency when first populating the Cloud with local data #208 (very slow upload) still happens for me
  3. Simperium printed some mysterious error/warning messages
  4. My device rebooted spontaneously after running Simperium for awhile.

Here’s the nitty-gritty details:

  • I tested on an iPhone 5 (iOS 7.0.4).
  • My test database 19MB (sqlite file), 138k objects, starts with no simperium fields
  • Simperium Verbose Logging disabled

The entire log file for my test is here: https://www.dropbox.com/s/a88048kunv3srgv/SimperiumIssue208Log.txt

18 seconds elapsed before the first log line. This is the time it took to migrate my database to add the ghostData and simperiumKey fields. 103 seconds elapsed while simperium indexed the new objects, added keys, etc. This is 1340 entities/s. Not quite the 3.3k entities/s that you got, but close enough. That’s not great, but it’s fast enough for me.

I did receive a couple memory warnings, but it didn’t crash. I also saw a few different types of warning messages that didn’t make sense. For example:

2014-03-25 17:45:31.881 RaceTrace[18383:9903] Simperium warning: couldn't processLocalObjectWithKey 5e88c9ef1559432a9b8e0134f71a6963 because the object no longer exists

2014-03-25 17:45:34.488 RaceTrace[18383:a003] Simperium POST returned error 409 for change {
ccids = (
12de982082e24a32a4180a85efd91394
);
clientid = "ios-07d936cf75674871b88fcfb32e24ac77";
error = 409;
id = 6822518fa4544cc398f07670a4e3fffa;
}

2014-03-25 17:45:34.755 RaceTrace[18383:7f03] Simperium error: transform diff for a ghost member (ghost <SPGhost: 0x24044250>, memberData (null)) that doesn't exist (bLon): {
o = "+";
v = "-122.330326782822";
}

Are any of these alarming?

The device went along uploading data to Simperium very slowly for about 8 more minutes and then my device suddenly rebooted completely! After the reboot, I ran my app again. This time, Simperium only took 4s to start up since all the objects had keys this time. Good.

The bad news is that it uploaded for 10 minutes then my device crashed/rebooted again. Let’s ignore the reboots for a moment and come back that later when we know more.

During this second run, I concurrently ran a Python script to query the number of objects in the Simperium bucket every 10 seconds. This showed how fast (er slow) the object uploads were happening:

GpsPoint : count: 1670 time: 1395808679.18
GpsPoint : count: 1736 time: 1395808694.83 delta: 66 new entities in 15.6530220509 seconds = 4.21643819229 entities/s
GpsPoint : count: 1808 time: 1395808709.61 delta: 138 new entities in 30.4364309311 seconds = 4.53404015446 entities/s
GpsPoint : count: 1887 time: 1395808725.33 delta: 217 new entities in 46.1517241001 seconds = 4.70188284904 entities/s
GpsPoint : count: 1961 time: 1395808741.65 delta: 291 new entities in 62.4713189602 seconds = 4.65813760368 entities/s
GpsPoint : count: 2037 time: 1395808757.6 delta: 367 new entities in 78.4273719788 seconds = 4.67948868795 entities/s
GpsPoint : count: 2110 time: 1395808774.35 delta: 440 new entities in 95.1741731167 seconds = 4.62310294475 entities/s
GpsPoint : count: 2189 time: 1395808790.78 delta: 519 new entities in 111.599216938 seconds = 4.65057026599 entities/s
GpsPoint : count: 2264 time: 1395808807.4 delta: 594 new entities in 128.220112085 seconds = 4.63265856143 entities/s
GpsPoint : count: 2315 time: 1395808823.13 delta: 645 new entities in 143.956870079 seconds = 4.48050863877 entities/s
GpsPoint : count: 2315 time: 1395808839.05 delta: 645 new entities in 159.871942997 seconds = 4.03447902058 entities/s
GpsPoint : count: 2341 time: 1395808854.38 delta: 671 new entities in 175.200814009 seconds = 3.82989088148 entities/s
GpsPoint : count: 2380 time: 1395808871.39 delta: 710 new entities in 192.216583014 seconds = 3.69374998176 entities/s
GpsPoint : count: 2421 time: 1395808887.08 delta: 751 new entities in 207.899301052 seconds = 3.61232575675 entities/s
GpsPoint : count: 2461 time: 1395808903.75 delta: 791 new entities in 224.572595119 seconds = 3.52224633455 entities/s
GpsPoint : count: 2502 time: 1395808919.92 delta: 832 new entities in 240.747452974 seconds = 3.45590364393 entities/s
GpsPoint : count: 2542 time: 1395808935.79 delta: 872 new entities in 256.61115098 seconds = 3.39813759718 entities/s
GpsPoint : count: 2582 time: 1395808951.41 delta: 912 new entities in 272.233130932 seconds = 3.35006983492 entities/s
GpsPoint : count: 2585 time: 1395808967.62 delta: 915 new entities in 288.443203926 seconds = 3.17220162426 entities/s
GpsPoint : count: 2585 time: 1395808984.24 delta: 915 new entities in 305.067434072 seconds = 2.99933686066 entities/s

After this the device rebooted so obviously no new entities were pushed up. Two things stand out:

  1. Best case here is about 4.7 object uploads per second. These entities have 7 numeric attributes and 1 string (key) attribute. Given 8 bytes per number and 32 bytes for the key string, that’s 88 bytes/object or 414 bytes/second. (Not a very impressive data transfer rate, needless to say.)
  2. The transfer rate slows down over time, and has pretty much stopped even before the device rebooted. In the final 53 seconds before the crash, only 3 new entities were uploaded.

For reference:
Time @912 entities: 21:42:31.41
Time @reboot (915 entities): 21:43:24.204

I’ve had a quick look with Instruments and it looks like most of the upload time is spent in the new saveAndWait function and in some WebSocket function. I’ll look again more carefully and report back again with findings.

Please let me know if you have any questions. As always, thank you for your diligent work responding to concerns and improving your awesome product.

@chengfred
Copy link
Contributor

@CapoChino Can you send us your app-id so we can check to make sure everything is ok server side?

Also, we think you should find a way to consolidate your data points (perhaps batch them in groups of 1000 or 2000). This will cut down the number of requests and you should see better response times. As I mentioned in the email, the stack isn't currently optimized for the use case of hundreds of thousands of very small objects.

@jleandroperez
Copy link
Contributor

@CapoChino thanks for sharing the details!.

I've been working on this issue for a couple hours. So far i managed to replicate the slow-sync (but no phone reboots, so far).

A couple notes:

Error 409:
That's triggered because of changes posted more than once. It's expected, due to the phone-reboot

Simperium warning: couldn't processLocalObjectWithKey.. because the object no longer exists:
This log is expected when processing objects that got deleted. Is it possible that, at some point, one of the object was deleted?.

Simperium error: transform diff for a ghost member (ghost , memberData (null)) that doesn't exist (bLon):
I'll be hunting down this one. The flow that triggers this is not clear, yet.

Profiling:
So far, i found that (one of the) bottlenecks is object-retrieval itself, from CoreData, to process diffs. As @fredrocious mentioned, that would be greatly alleviated if it was possible to consolidate data points in groups (I.E. "snapshot" of 1k points?).

I'll keep you posted,
Thanks again!

@CapoChino
Copy link
Author

Hi Jorge,

I've done some profiling too. You're right; it does look like Core Data access is one of the bigger bottlenecks. That's actually good news, as Core Data provides lots of opportunities for optimization. It can be really fast, actually. I've been digging into the code, and I do see some good usage of CoreData.

However, my profiling results pointed me to one alarming thing. processLocalObjectWithKey is taking about 25% of the total time. That function does a Core Data fetch for a single object. It's called with an enumeration of all pending changes, which, in the case we're looking at here is ~150k local objects being pushed to the cloud.

I think the enumerateQueuedChangesForBucket could be reworked to fetch all (or a batch) of changed objects and then operate on the chunk instead of fetching one-at-a-time.

The whole thing might be being exacerbated by thread thrashing since it looks like 2 queues operate in unison to do this push: the object processor queue and the persistentmutableset queue responsible for handling keysForObjectsWithMoreChanges. The saveAndWait function of the latter queue is also taking about 25% of the total time. It looks like save is being called an extra time with each object pushed. Also maybe we could batch the changes and save only every X objects.

I'll send you an email with the instruments sample data.

What do you think?

@CapoChino
Copy link
Author

Oh, a couple more notes:

I ran on my phone and didn't get the device reboot. Maybe it only happens when debugging. So that's good. It means that worst-case, maybe I could just tolerate the slow upload.

I hesitate to bundle objects on my end for a few reasons:

  1. I think it might require a lot of work. It would probably need to permeate to my Core Data model and across all my code. For example, we query on properties of individual points. That wouldn't work if individual Points weren't rows, but within a bundle.
  2. It would mostly just help with this one case: the initial upload. I.e. a lot of work, little payoff.
  3. It seems more beneficial if we can improve Simperium performance for all users. Even users with a moderate number of objects might benefit from our optimization efforts.

But if there's not enough development time or we can't do much better in the Simperium code, I can make due somehow, either tolerate slow uploads or bundle objects.

@jleandroperez
Copy link
Contributor

@CapoChino hey there!

I've just pushed a couple optimizations in this branch.

Whenever possible, would you please give it a shot?. Remote changes are now being processed effectively in batch, and local changes as well!.

Thanks in advance!

@CapoChino
Copy link
Author

That is so awesome, Jorge! I can't wait to try it out. I'll report back
as soon as I get a chance.

Thank you!!

On Mon, Apr 7, 2014 at 2:48 PM, Jorge Leandro Perez <
[email protected]> wrote:

@CapoChino https://github.com/CapoChino hey there!

I've just pushed a couple optimizations in this branchhttps://github.com/Simperium/simperium-ios/tree/issues/208-performance
.

Whenever possible, would you please give it a shot?. Remote changes are
now being processed effectively in batch, and local changes as well!.

Thanks in advance!

Reply to this email directly or view it on GitHubhttps://github.com//issues/208#issuecomment-39788200
.

@jleandroperez
Copy link
Contributor

No problem!. In Simulator Tests memory peaks at 100MB maximum.

Sync'ing 50k objects is taking between 15-35 minutes, but i'm unable to trigger any stall'ment behavior (or low memory crashes, so far).

Thanks for testing this out!

@CapoChino
Copy link
Author

Ok, running the same test as before, I got a very steady 5.5 objects/s one
time. Same test different day I got a steady 4.0 objects/s upload. And
again, my device would reboot after uploading for several minutes while
debugging. But just running the app normally, no reboot. And I didn't get
any low memory crashes or even warnings that I know of. So your
enhancement doubled the upload rate for me, thanks!

In an ideal world, the upload would be faster, but I'm satisfied with this
performance for now. If sometime in the future I want to improve things,
I'll profile again and take a look. Thanks again for your work on this! I
think Simperium is ready to ship with my app. I just have a few minor
things to do on my end. Exciting!

On Mon, Apr 7, 2014 at 3:02 PM, Jorge Leandro Perez <
[email protected]> wrote:

No problem!. In Simulator Tests memory peaks at 100MB maximum.

Sync'ing 50k objects is taking between 15-35 minutes, but i'm unable to
trigger any stall'ment behavior (or low memory crashes, so far).

Thanks for testing this out!

Reply to this email directly or view it on GitHubhttps://github.com//issues/208#issuecomment-39789461
.

@jleandroperez
Copy link
Contributor

@CapoChino thanks for the feedback!!

I've merged a couple changes into 208 branch, that will prevent weird scenarios. As soon as a couple critical bugs are solved, i'll be taking a look at how we could further improve performance.

If you have any issues with the current performance, please, just let us know. I'll be merging those changes into develop shortly!

(By the way, congratulations on your app, it looks awesome!).

jleandroperez added a commit that referenced this issue Apr 14, 2014
Issue #208: Performance Improvements
@jleandroperez
Copy link
Contributor

@CapoChino please, if you're still having any kind of issues, let us know.

Let's track further optimizations in new specific issues. Thank you!

@jleandroperez jleandroperez modified the milestones: v0.6.6, v0.7.0 May 9, 2014
@CapoChino
Copy link
Author

Just for reference/information, I pulled the latest code yesterday and tested again with large and small databases on both actual iPhone 5 and the simulator. The general finding is that the small database uploaded at a much faster RATE (objects per second) than the large one. I haven't profiled again since the last time, but I'm betting much of the time is saving SPMutableSet-keysForObjectsWithMoreChanges-*.dat files. One of my .dat files is 5MB.

Detailed results:

  • smalltest (11k objects)
    • Full upload on Simulator: 5 minutes to upload sample database to cloud, which is 37 objects/second, so that's not bad, actually! That completed successfully.
    • Full upload on iPhone5: 6 minutes = 30 objects/s, still pretty good.
  • bigtest (138k objects)
    • Full upload on Simulator: Never completed due to bug I'll file shortly, but rate was ~15-20 objects/s.
    • Full upload on iPhone5: 7.5 hrs = 5.1 objects/s

Note that rate was 6 times faster on smaller database.

Again, this is for future reference purposes. I'm not requesting further work on this specific issue.

@jleandroperez
Copy link
Contributor

@CapoChino thanks for the update on this!.

Let's track new speed-enhancements in this new ticket: #274
Perhaps we could target the collection that stores sets for an upgrade.

Thank you!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants