-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add multithreaded parsing #108
Comments
Evan, do you mind elaborating on why you don't want to split the file into chunks? Is it a concern about reduced performance due to searching for valid chunk boundaries and not being able to evaluate global variables as they are parsed? We could probably make it do none of that if the file is a small subninja. (But if it's small, does the performance matter?) Is the concern there the increased complexity? |
Thanks for being so thoughtful about this. I guess first as a general statement of principle I am a fan of (1) making single-threaded code as fast as possible, since that also improves multithreaded performance and (2) when possible, making code fast by making it simpler, especially with architectural simplifications. So it's appealing to me in general if there's an arrangement of files such that they can be safely independently evaluated. (Originally ninja only had 'include' but then I added 'subninja' when I realized it could allow for parallelism, not that we ever benefitted...) Similarly this is why I tried (and failed, as you discovered, whoops) to evaluate EvalStrings aggressively. Bigger picture, n2 was mostly a toy for me to play around with, so I don't really have any strong justification for ... really any sort of technical judgement. I wrote a kind of introspective post about Ninja and one response that I found pretty insightful was someone pointing out I ought to encourage forks: here, I found it, I am 'evmar' there. I don't mention this in any way to try tell you to go away! But rather that I wonder if you might be able to make better progress if you cut me out of your decisionmaking process, given this is kind of a side hobby thing for me... PS: re mmap, n2 currently relies on a trailing nul in the input files (see this improvement for loading) so mmap won't work. That is possible to fix (by changing how EOF is handled) but it will be some work. |
Yeah, maybe it would be better if I completed a fork with all of android's requirements, then upstreamed afterwards. I was just somewhat hesitant to do that at first because of the added work rebasing on top of upstream changes. I would like to ensure we're going in the same direction though, especially with regards to multithreading / parse time in general. Android also uses a trailing nul byte, and does combine that with mmap. Judging based on n2's tracing output, just reading the file into memory takes over twice as long as the android fork's entire ninja invocation, so I think mmap will definitely be necessary. |
Oh that is coool! I had seen some claims that mmap could be counterintuitively slower than read so I hadn't thought about this but from the code there it looks like a good idea! |
I have a WIP branch that implements multithreading (without mmap), but it doesn't actually improve the load times at all. It does seem like parsing finishes faster, but there's a lot of time spent evaluating the parsed elements afterwards. So I may have to look into some more fundamental performance improvements. |
Android has ninja files that total ~3GB in size. Android's fork of ninja has a multithreaded parser that is able to parse the ninja files in just under a second, but n2 takes over 14 seconds to parse the same files. All these numbers are for AOSP, the internal branch is roughly 50% larger.
I tried to get numbers for how fast the parsing was before 909ac60, but n2 fails to parse without that commit:
Android's multithreaded implementation can be seen here. (unfortunately it's not indexed on cs.android.com)
In #94 Evan expressed interest in only multithreading individual subninja files, but not chunks of a single file. This would be more work for android, as we'd have to change soong and kati to split up their ninja files, but may be possible, I need to look into it.
In addition, Android mmap's the file instead of reading it into memory, which probably helps read times. I'm not sure if this would become less effective if many smaller files had to be mmap'd / read individually.
The text was updated successfully, but these errors were encountered: