Open
Bug 901132
Opened 12 years ago
Updated 2 years ago
Tool to summarize build dependencies
Categories
(Firefox Build System :: General, defect)
Firefox Build System
General
Tracking
(Not tracked)
NEW
People
(Reporter: gps, Unassigned)
References
Details
Attachments
(6 files, 2 obsolete files)
We know we have a header dependency problem in Gecko with some headers getting included into thousands of files and causing mass dependency invalidation.
I'd like to put a tool in developer's hands that allows them to summarize dependency information.
I'm thinking for the first incantation we can just search for .pp files in the objdir and print summaries or dump lists of dependent targets.
I wrote a tool to do this for http://gregoryszorc.com/presentations/2012-11-29-firefox-build-system/. I'm sure I have it somewhere in my patch queue. I'll resurrect it as a mach command or something and get it checked in.
![]() |
||
Comment 1•12 years ago
|
||
I hope you're using the -H option supported by GCC and clang! It prints out the headers that are read and makes this kind of thing really easy.
Crucially, both compilers avoid reading .h files that are #included multiple times by a .cpp file if they can (assuming the .h files have the appropriate #ifndef guards -- see bug 881579 for details), and if you try to compute how many headers are #included without accounting for that you'll probably get the wrong number.
Comment 2•12 years ago
|
||
Nick did a bunch of work in bug 634839 and bug 888768 on reducing includes in SpiderMonkey, which is worth a read for anybody looking to do that in the browser.
![]() |
||
Comment 3•12 years ago
|
||
I also posted a detailed summary to the related dev.build/dev.platform thread. It's not visible yet at https://groups.google.com/forum/#!topic/mozilla.dev.builds/sQgQtSDr6yI but hopefully will be soon.
Reporter | ||
Comment 4•12 years ago
|
||
Turns out I had an old patch in my patch queue that prints counts of
dependencies. It does this by parsing all the .pp dependency files in
the object directory created during builds. The files listed in these
files come from the compiler, so I assume the list is as robust as
passing -H to the compiler.
We probably want to make this more useful before landing. What other
features do people want?
Reporter | ||
Updated•12 years ago
|
Assignee: nobody → gps
![]() |
||
Comment 5•12 years ago
|
||
Can you give some example output? I'm still not clear exactly what this tool does.
Comment 6•12 years ago
|
||
My computer helpfully crashed just as I was uploading this the first time :-(
This a simple python script that tries to read all of the files in the source and object directories of mozilla-central/comm-central [I ran it on a ccrework'd comm-central locally] and build up a complete directed graph of all inclusions in the tree. It's not perfect--it can't resolve about 2500 of the 115000 inclusions to specific files--but the heuristics I used were:
1. Find the header in dist/include[/nspr]? Use that!
2. Is it in the local srcdir? Use that!
3. Is it in the relevant objdir? Use that!
4. Is there only one file with that name? Use that!
5. Either there's no file or too many files... we're screwed :-)
The output is a DOT file written to stdout (yes, I'm lazy). The graph is way too big for DOT to handle, but the format is well-known enough that other graph libraries can read or write it.
Comment 7•12 years ago
|
||
As an example of what this map can tell us, here is a list of all simple cycles in the inclusion graph. It's amusing that the number of cycles with 1 node in them is not 0...
Comment 8•12 years ago
|
||
I ran this python code with my full includes list:
import networkx as nx
includesG = nx.DiGraph(nx.read_dot('/src/mozilla-tools/full-includes.dot'))
files = dict()
for n in includesG.nodes():
if n.endswith('.h'):
continue
for x in nx.descendants(includesG, n):
files[x] = files.get(x, 0) + 1
includeCount = list((v,k) for k,v in files.items())
includeCount.sort(reverse=True)
for count, name in includeCount:
print '%5d %s' % (count, name)
And that made this file. The results are somewhat surprising--the most heavily included non-XPCOM/MFBT/JS/mozalloc header is PSpdyPush3.h. No, I am not kidding, and I looked it up, and the rationale actually makes sense.
Comment 9•12 years ago
|
||
This is another data point to correlate with: the number of times .h/.idl/.ipdl files have been changed in the last 10000 changesets before a11cd50edb0d (getting us back to the beginning of May, so about 3 months worth).
Conclusion, we're rebuilding more than 3000 files daily on average because of jsapi.h alone.
![]() |
||
Comment 10•12 years ago
|
||
A lot of those JS headers have seen a lot of churn recently because of all the header/#include refactoring I've done in the past couple of months.
Having said that, jsapi.h is enormous and does get a lot of churn.
Comment 11•12 years ago
|
||
This is a list of all include relations, ranked by their "criticality": the number of source files that would no longer include this header if the #include directive were dropped. Note that I'm scraping by the source code here, so this is going to assume that every #ifdef foo/#include/#endif is being activated. I've also eliminated the 92% of the file where the criticality is 1 or 0.
Cycles are broken in arbitrary places, and I've changed my definition of source files from "anything not named .h or .idl" to "any filename that contains .c or .C or .m and is never included by anyone else"
![]() |
||
Comment 12•12 years ago
|
||
At first glance I wonder if templatizing XPCOM strings would be a net win over the file I/O incurred by all those includes...
Comment 13•12 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #12)
> At first glance I wonder if templatizing XPCOM strings would be a net win
> over the file I/O incurred by all those includes...
Bug 717356? I don't know if C-style templating is slower or faster than C++-style templating.
![]() |
||
Comment 14•12 years ago
|
||
> This is a list of all include relations, ranked by their "criticality"
This line appears to be bogus:
707 js/src/jit/shared/IonFrames-shared.h from ipc/chromium/src/base/hash_tables.h
Comment 15•12 years ago
|
||
I was cooking up a patch as part of bug 858927 to limit the inclusion of TimeStamp.h because every time I touched that header it seemed that I had to rebuild half of my tree. Shall I make a separate bug for it or is it enough to track that work there?
Comment 16•12 years ago
|
||
(In reply to comment #15)
> I was cooking up a patch as part of bug 858927 to limit the inclusion of
> TimeStamp.h because every time I touched that header it seemed that I had to
> rebuild half of my tree. Shall I make a separate bug for it or is it enough to
> track that work there?
Please file a separate bug, thanks!
Comment 17•12 years ago
|
||
(In reply to Joshua Cranmer [:jcranmer] from comment #11)
> Created attachment 789006 [details]
> List of inclusion directives by criticality
Might it be worthwhile to put this on an Etherpad or something so people can investigate the top problems and annotate them with bug numbers/reasons why they can't be removed/etc.?
![]() |
||
Comment 18•12 years ago
|
||
> Might it be worthwhile to put this on an Etherpad or something so people can
> investigate the top problems and annotate them with bug numbers/reasons why
> they can't be removed/etc.?
That would be useful. Having said that, we'll need to re-run the analysis every so often (e.g. I just landed bug 904962 which removed a few critical ones). I'm not sure how to combine those two needs. I guess we could get the diff each time it's re-run, and then somehow incorporate the changes into the etherpad?
![]() |
||
Comment 19•12 years ago
|
||
jcranmer, a bunch of header improvements have been made. Can you regenerate the "transitively" and "criticality" lists? Thanks!
Flags: needinfo?(Pidgeot18)
Comment 20•12 years ago
|
||
Attachment #785612 -
Attachment is obsolete: true
Flags: needinfo?(Pidgeot18)
Comment 21•12 years ago
|
||
And the updated version of criticality. These are up-to-date as of 4ccfffb6262a.
Attachment #789006 -
Attachment is obsolete: true
Updated•12 years ago
|
Attachment #798117 -
Attachment is patch: false
![]() |
||
Comment 22•12 years ago
|
||
Here are the top 5 transitively included files, before:
7915 mozilla/obj-x86_64-unknown-linux-gnu/dist/include/nspr/prcpucfg.h
7726 mozilla/obj-x86_64-unknown-linux-gnu/dist/include/nspr/prtypes.h
7726 mozilla/obj-x86_64-unknown-linux-gnu/dist/include/nspr/obsolete/protypes.h
7584 mozilla/obj-x86_64-unknown-linux-gnu/dist/include/nspr/prinrval.h
7327 mozilla/obj-x86_64-unknown-linux-gnu/dist/include/nspr/prthread.h
and after:
5922 --GENERATED--/dist/include/nspr/prcpucfg.h
5733 --GENERATED--/dist/include/nspr/prtypes.h
5733 --GENERATED--/dist/include/nspr/obsolete/protypes.h
5516 --GENERATED--/dist/include/nspr/prinrval.h
5257 --GENERATED--/dist/include/nspr/prlong.h
Four of the five are the same files, but the numbers are way down. Looks like the work that various people are doing is having an effect!
Comment 23•12 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #22)
> Four of the five are the same files, but the numbers are way down. Looks
> like the work that various people are doing is having an effect!
The two lists aren't comparable: I excluded only files ending in .h for the old version, but both .h and .idl in the second version. I think I may also have used comm-central for the old version, but all recent stats are computed with mozilla-central.
Reporter | ||
Updated•11 years ago
|
Assignee: gps → nobody
Updated•7 years ago
|
Product: Core → Firefox Build System
Updated•2 years ago
|
Severity: normal → S3
You need to log in
before you can comment on or make changes to this bug.
Description
•