-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
genhtml is too slow compared to old version #284
Comments
In general, parallel execution is faster if you have enough work that can be done independently, have sufficient resources to allow parallel execution without thrashing, and the overhead of parallelism is less than time taken by the particular task. Also, there is a Probably worth to use current TOT, as there have been some performance optimizations since the 2.0 release. |
Here is some summary info generated from genhtml.json( without {"test1": {"overall": 69.693657875061, "html": {"cnt": 884, "min": 0.00220179557800293, "max": 0.715844869613647, "sum": 37.969685792922974, "avg": 0.042952133249912865}, "load": {"cnt": 699, "min": 0.000123023986816406, "max": 0.0527210235595703, "sum": 2.5745861530303955, "avg": 0.0036832419928904086}, "categorize": {"cnt": 699, "min": 0.000234842300415039, "max": 0.246188163757324, "sum": 8.29749059677124, "avg": 0.011870515875209214}, "dir": {"cnt": 184, "min": 0.00324487686157227, "max": 0.0333459377288818, "sum": 1.4431145191192625, "avg": 0.007843013690865557}, "file": {"cnt": 699, "min": 0.00352787971496582, "max": 1.24869990348816, "sum": 56.45982718467713, "avg": 0.08077228495661964}}, "test2": {"overall": 53.7329921722412, "html": {"cnt": 681, "min": 0.00214409828186035, "max": 0.708955049514771, "sum": 28.689658641815186, "avg": 0.04212872047256268}, "load": {"cnt": 533, "min": 0.00011897087097168, "max": 0.0503799915313721, "sum": 1.939279317855835, "avg": 0.003638422735189184}, "categorize": {"cnt": 533, "min": 0.000257968902587891, "max": 0.220337152481079, "sum": 6.480504274368286, "avg": 0.012158544604818548}, "dir": {"cnt": 147, "min": 0.00304913520812988, "max": 0.028817892074585, "sum": 1.0736997127532961, "avg": 0.007304079678593851}, "file": {"cnt": 533, "min": 0.00348591804504395, "max": 1.21855282783508, "sum": 42.94275641441345, "avg": 0.08056802329158246}}, "test3": {"overall": 28.3487250804901, "html": {"cnt": 379, "min": 0.00214195251464844, "max": 0.499402046203613, "sum": 15.58631682395935, "avg": 0.04112484650121201}, "load": {"cnt": 288, "min": 0.000129938125610352, "max": 0.0341091156005859, "sum": 1.0401902198791504, "avg": 0.0036117715968026053}, "categorize": {"cnt": 288, "min": 0.000262975692749023, "max": 0.122973203659058, "sum": 3.4219408035278325, "avg": 0.011881738901138307}, "dir": {"cnt": 90, "min": 0.00344610214233398, "max": 0.0301930904388428, "sum": 0.6219146251678466, "avg": 0.006910162501864962}, "file": {"cnt": 288, "min": 0.00318288803100586, "max": 0.779124021530151, "sum": 23.067707538604736, "avg": 0.08009620673126644}}, "test4": {"overall": 18.4163129329681, "html": {"cnt": 211, "min": 0.00220489501953125, "max": 0.738096952438354, "sum": 9.605504512786867, "avg": 0.04552371807007994}, "load": {"cnt": 166, "min": 0.000113964080810547, "max": 0.0519740581512451, "sum": 0.649959087371826, "avg": 0.003915416188986903}, "categorize": {"cnt": 166, "min": 0.000352144241333008, "max": 0.218640089035034, "sum": 2.2901556491851807, "avg": 0.013796118368585426}, "dir": {"cnt": 44, "min": 0.00253701210021973, "max": 0.0253028869628906, "sum": 0.33654904365539556, "avg": 0.00764884190125899}, "file": {"cnt": 166, "min": 0.0033719539642334, "max": 1.25581407546997, "sum": 14.650397777557371, "avg": 0.08825540829853838}}, "test5": {"overall": 8.58937692642212, "html": {"cnt": 127, "min": 0.00346803665161133, "max": 0.148995161056519, "sum": 4.690181016921997, "avg": 0.03693055918836218}, "load": {"cnt": 106, "min": 0.00026392936706543, "max": 0.0105841159820557, "sum": 0.32061481475830084, "avg": 0.003024668063757555}, "categorize": {"cnt": 106, "min": 0.000406026840209961, "max": 0.0425150394439697, "sum": 1.0230870246887207, "avg": 0.009651764383855855}, "dir": {"cnt": 20, "min": 0.00366497039794922, "max": 0.0290079116821289, "sum": 0.1894917488098145, "avg": 0.009474587440490725}, "file": {"cnt": 106, "min": 0.00561189651489258, "max": 0.248773097991943, "sum": 6.9603416919708225, "avg": 0.06566360086764927}}, "test6": {"overall": 7.45630288124084, "html": {"cnt": 98, "min": 0.00310301780700684, "max": 0.227667808532715, "sum": 4.2659528255462655, "avg": 0.043530130872921076}, "load": {"cnt": 76, "min": 0.000192880630493164, "max": 0.0157678127288818, "sum": 0.2864232063293456, "avg": 0.0037687263990703367}, "categorize": {"cnt": 76, "min": 0.000235080718994141, "max": 0.0471210479736328, "sum": 0.8142759799957274, "avg": 0.01071415763152273}, "dir": {"cnt": 21, "min": 0.0032801628112793, "max": 0.0259180068969727, "sum": 0.15669584274292, "avg": 0.007461706797281905}, "file": {"cnt": 76, "min": 0.00458002090454102, "max": 0.345099925994873, "sum": 6.142184734344482, "avg": 0.08081822018874318}}} with {"test1": {"overall": 139.099180936813, "html": {"cnt": 884, "min": 0.00317001342773438, "max": 0.79960298538208, "sum": 38.1168737411499, "avg": 0.04311863545378948}, "load": {"cnt": 699, "min": 0.000245094299316406, "max": 0.0875239372253418, "sum": 3.3733892440795894, "avg": 0.00482602180841143}, "categorize": {"cnt": 699, "min": 0.000663042068481445, "max": 0.232724905014038, "sum": 11.837536096572876, "avg": 0.01693495865031885}, "directory": {"cnt": 184, "min": 1.12537908554077, "max": 3.34291887283325, "sum": 444.03566646575933, "avg": 2.413237317748692}, "file": {"cnt": 699, "min": 1.06501793861389, "max": 3.74806904792786, "sum": 1511.8968088626862, "avg": 2.1629425019494795}, "child": {"cnt": 884, "min": 0.0118467807769775, "max": 1.38333296775818, "sum": 87.0812258720398, "avg": 0.09850817406339343}}, "test2": {"overall": 87.1324789524078, "html": {"cnt": 681, "min": 0.00324201583862305, "max": 0.801820993423462, "sum": 31.7707302570343, "avg": 0.04665305470930147}, "load": {"cnt": 533, "min": 0.000234127044677734, "max": 0.0558700561523438, "sum": 2.3206124305725098, "avg": 0.004353869475745797}, "categorize": {"cnt": 533, "min": 0.000843048095703125, "max": 0.249961853027344, "sum": 8.851559162139893, "avg": 0.016607052837035445}, "directory": {"cnt": 147, "min": 0.778333902359009, "max": 3.31218600273132, "sum": 280.12529325485224, "avg": 1.9056142398289269}, "file": {"cnt": 533, "min": 0.792023897171021, "max": 3.2999119758606, "sum": 893.5714342594147, "avg": 1.6764942481414908}, "child": {"cnt": 681, "min": 0.0102710723876953, "max": 1.3947548866272, "sum": 66.63386845588684, "avg": 0.097847090243593}}, "test3": {"overall": 27.4819920063019, "html": {"cnt": 379, "min": 0.0034639835357666, "max": 0.458012104034424, "sum": 16.648162126541138, "avg": 0.043926549146546534}, "load": {"cnt": 288, "min": 0.000232934951782227, "max": 0.0298130512237549, "sum": 1.1898078918457031, "avg": 0.0041312774022420245}, "categorize": {"cnt": 288, "min": 0.000811100006103516, "max": 0.153878927230835, "sum": 4.552825689315795, "avg": 0.01580842253234651}, "directory": {"cnt": 90, "min": 0.458359956741333, "max": 1.46185898780823, "sum": 86.64420795440677, "avg": 0.9627134217156308}, "file": {"cnt": 288, "min": 0.456877946853638, "max": 1.90998005867004, "sum": 264.52932047843933, "avg": 0.918504584994581}, "child": {"cnt": 379, "min": 0.00936508178710938, "max": 0.746864080429077, "sum": 32.218116521835334, "avg": 0.08500822301275814}}, "test4": {"overall": 12.3653080463409, "html": {"cnt": 211, "min": 0.00300908088684082, "max": 0.815544128417969, "sum": 9.816606521606444, "avg": 0.046524201524201156}, "load": {"cnt": 166, "min": 0.000207901000976562, "max": 0.0422279834747314, "sum": 0.664856195449829, "avg": 0.004005157803914632}, "categorize": {"cnt": 166, "min": 0.000793933868408203, "max": 0.219452857971191, "sum": 2.8899414539337154, "avg": 0.017409285867070575}, "directory": {"cnt": 44, "min": 0.32911205291748, "max": 1.22095799446106, "sum": 31.903474807739258, "avg": 0.7250789729031649}, "file": {"cnt": 166, "min": 0.310504913330078, "max": 2.15262699127197, "sum": 111.91055321693419, "avg": 0.6741599591381577}, "child": {"cnt": 211, "min": 0.00918817520141602, "max": 1.33370518684387, "sum": 18.484768152236935, "avg": 0.08760553626652576}}, "test5": {"overall": 5.10910105705261, "html": {"cnt": 127, "min": 0.00406289100646973, "max": 0.173205852508545, "sum": 5.150458574295043, "avg": 0.040554791923583015}, "load": {"cnt": 106, "min": 0.000497817993164062, "max": 0.0174529552459717, "sum": 0.38253927230834966, "avg": 0.0036088610595127325}, "categorize": {"cnt": 106, "min": 0.000803947448730469, "max": 0.0517008304595947, "sum": 1.4405694007873533, "avg": 0.013590277365918427}, "directory": {"cnt": 20, "min": 0.178601980209351, "max": 0.47684907913208, "sum": 7.307308197021484, "avg": 0.36536540985107424}, "file": {"cnt": 106, "min": 0.158262968063354, "max": 0.692163944244385, "sum": 38.51706671714783, "avg": 0.3633685539353569}, "child": {"cnt": 127, "min": 0.00975990295410156, "max": 0.29651403427124, "sum": 9.54466986656189, "avg": 0.07515488083907}}, "test6": {"overall": 3.60685014724731, "html": {"cnt": 98, "min": 0.00437092781066895, "max": 0.213585138320923, "sum": 4.359879493713379, "avg": 0.044488566262381415}, "load": {"cnt": 76, "min": 0.000296115875244141, "max": 0.0146920680999756, "sum": 0.2980520725250243, "avg": 0.003921737796381899}, "categorize": {"cnt": 76, "min": 0.00069880485534668, "max": 0.0452780723571777, "sum": 1.024108409881592, "avg": 0.013475110656336737}, "directory": {"cnt": 21, "min": 0.151760101318359, "max": 0.423412084579468, "sum": 6.200369358062746, "avg": 0.2952556837172736}, "file": {"cnt": 76, "min": 0.127248048782349, "max": 0.550164937973022, "sum": 23.31259274482727, "avg": 0.3067446413793062}, "child": {"cnt": 98, "min": 0.0101768970489502, "max": 0.323693990707397, "sum": 7.506403207778928, "avg": 0.07659595109978498}}} It seems that |
I fear, not helpful. If 'file' times are going up: that suggests that your filesystem or network cannot keep up with the demand, and is a bottleneck. |
All files are processed locally rather than retrieved from the network. As you concerned, disk storage might be slow, so this time the output directory was mounted as tmpfs, but that didn't help. .json files: By the way |
The data suggests that you are running out of resources/thrashing. Note that we do expect the 'child' processing time to be longer than the single-threaded processing time for the same file or directory object. It is also interesting that your data seems to show that lcov 2.0/2.1 is slower than 1.14 in single-threaded mode. I wonder if that is still true, if you enable branch coverage. |
Have written a script to generate sample project, the results may be reproducible.
Results tested on another system with 32cores, 64G mem. I looked at the resource usage and that the memory is definitely enough:
We see that in single-threaded mode, 2.1 is slower than 1.14~1.16. |
Thanks for the testcase generator...quite helpful. Serial execution with lcov/2.* is slower than previous versions when generating very simple reports because of new features: the implementation is looking for additional information, branching around code that isn't used in those reports, checking for errors, etc. None of that is free. Parallel execution is not helping in your example for essentially the same reason/the same problem which motivated 40f4b04: when the child process has relatively little to do, then completed children can pile up, waiting in the queue to be reaped. The result is that their idle/wait time is longer than their process time. However: I don't have a strong motivation to implement that at present, as we almost always use more complicated reports - typically, annotated differential coverage reports - and the consequent additional child workload means that completed tasks don't linger on the queue. Your example behaves similarlly when I modify it to use some additional features (e.g., annotations). If this means that a particular job should use serial processing because parallel execution does not help: I think that is OK. Similarly: if this means that some jobs should use an old lcov release: I think that that is OK too. My last question is what your goal/your application is. |
In my case, several .info files were processed one by one, and "--parallel" increased the overall time. Generating the coverage report took almost half of the total time. I noticed that older versions seemed to be faster, so I'm wondering if there was a bug. |
Does "processed one by one" mean "aggregation" - set union? Or a different operation? |
By "processed one by one" I mean:
I need to process total 6 .info files, the elapsed time is the sum of each .info processing time, which gives me about 276.68 seconds. process the .info files simultaneously(each without parallelism):
The elapsed time is determined by the longest processing time among all the .info files processed simultaneously, which gives me about 70.51 seconds. |
running on ubuntu20.04, with 16 cores, 32G memory.
tested using the same tracefile(about 700 files, 300k+ lines, 20k+ functions, 100k+ branches), genhtml command:
with ubuntu20.04 default lcov 1.14:
with lcov 2.0:
with lcov 2.0(--parallel):
It seems that adding
--parallel
makes things worse, which consumes more CPU and memory, but have no advantage.(I thought parallelism should be faster?)The text was updated successfully, but these errors were encountered: