Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Do not perform typechecking if files are unchanged when compiling with -p #40721

Open
5 tasks done
TimvdLippe opened this issue Sep 23, 2020 · 8 comments
Open
5 tasks done
Labels
Domain: --incremental The issue relates to incremental compilation Domain: tsc -b Issues related to build mode In Discussion Not yet reached consensus Suggestion An idea for TypeScript

Comments

@TimvdLippe
Copy link
Contributor

TimvdLippe commented Sep 23, 2020

Search Terms

incremental, composite

Chrome DevTools and TypeScript

TLDR: integrate/improve incremental build functionality into -p

Please see the summary at the bottom for the actual feature request in this issue. The rest of it is (important) background information as to why we are making this feature request.

As you might be aware, Chrome DevTools is migrating from the Closure Compiler to the TypeScript compiler.
As part of the integration of TypeScript with GN/Ninja, we have written a desugaring Python script to eventually call tsc.

The high-level process is as follows:

  1. GN/Ninja figures out which GN actions need to run, based on the files that have been changed
  2. One of these GN actions could be calling ts_library.py
  3. The Python script first generates a tsconfig.json, based on its file inputs and general configuration. This tsconfig.json file is written to the filesystem, see below for an example
  4. We call tsc with pinned versions of both Node and TypeScript and point it to the tsconfig.json file with the -p compiler flag

This setup is similar to tsc -b.
However, since Chrome DevTools is part of the Chromium codebase, we have to integrate with GN/Ninja.
As such, GN/Ninja is "running the world", rather than a tool like TypeScript.
Therefore, we are not able to use tsc -b, as it assumes that tsc is the tool "running the world".

In general, this setup works.
Sadly, one area that we do have some issues is related to the performance of the TypeScript compiler.

Performance investigation

There are two areas of interest for our integration: the performance of both a clean and an incremental build.

For a clean build, we are mostly bound by the performance of the TypeScript compiler itself.
Since we have no prior information, we can only take advantage of compiler options that improve performance.
For example, we have been using --skipLibCheck for all targets except one, as we can assume that libs generally don't have problems across multiple different subfolders.

For an incremental build, the situation is a bit different.
Since GN/Ninja is quite smart at figuring out when (not) to run a GN action, we have optimized our TypeScript integration to only run if strictly necessary.
To do so, we are taking advantage of .tsbuildinfo files and general caching of results.

Sadly, even for incremental builds we are observing quite long compilation times.
Therefore, I decided to do a performance investigation in the TypeScript compiler explicitly for its incremental build performance.

Incremental build analysis

The base assumption that I operated on was the following:

Given two consecutive invocations of tsc without any file changes, the second tsc invocation should perform minimal (if at all any) work

However, I quickly realized that this assumption is not the case.
The scenario that I tested was the following:

  1. Given that I have performed a fresh build of DevTools
  2. Verify that a rebuild with GN/Ninja shows "no work to do"
  3. Call tsc manually as if it were part of a normal GN action and observe its performance

The command I used to analyze its performance was the following:

$ time third_party/node/node.py --output --trace-ic node_modules/typescript/lib/tsc.js -p out/Default/gen/front_end/sdk/sdk-tsconfig.json --extendedDiagnostics --generateCpuProfile profile.cpuprofile

Example output (collapsed for brevity):

$ time third_party/node/node.py --output node_modules/typescript/lib/tsc.js -p out/Default/gen/front_end/sdk/sdk-tsconfig.json --extendedDiagnostics
Files:                         170
Lines:                       87519
Nodes:                      322413
Identifiers:                112717
Symbols:                     83238
Types:                       22788
Instantiations:              24303
Memory used:               148140K
Assignability cache size:     3899
Identity cache size:          1478
Subtype cache size:            597
Strict subtype cache size:     499
I/O Read time:               0.02s
Parse time:                  0.98s
ResolveTypeReference time:   0.00s
ResolveModule time:          0.06s
Program time:                1.14s
Bind time:                   0.53s
Check time:                  2.54s
transformTime time:          0.91s
Total time:                  4.20s
 
real    0m5.524s
user    0m10.460s
sys     0m0.342s

Since DevTools has a lot of files/LoC, the summation of the invocation times adds up to minutes. In this analysis, I chose the sdk folder, as Ninja reports that it is the slowest part of the DevTools build (log collapsed for brevity):

$ NINJA_SUMMARIZE_BUILD=1 autoninja -C out/Release -w dupbuild=err
depot_tools/ninja -C out/Release -w dupbuild=err -j 10 -d stats
ninja: Entering directory `out/Release'
[1 processes, 1/1 @ 3.2/s : 0.312s ] Regenerating ninja files
[1 processes, 1502/1502 @ 4.7/s : 322.149s ] STAMP obj/generate_devtools_grd.stamp
metric                count   avg (us)  total (ms)
.ninja parse          4       48223.8   192.9
canonicalize str      50764   0.2       7.8
canonicalize path     51274   0.1       4.3
lookup node           57528   0.2       9.8
.ninja_log load       2       14999.0   30.0
.ninja_log recompact  1       322624.0  322.6
node stat             24605   17.1      421.2
.ninja_deps load      2       175.5     0.4
depfile load          2       435.0     0.9
StartEdge             1504    1378.1    2072.7
FinishCommand         1503    149.8     225.2
 
path->node hash load 0.78 (9599 entries / 12289 buckets)
   Longest build steps:
          2.3 weighted s to build (38 items) gen/front_end/perf_ui/perf_ui-tsconfig.json, gen/front_end/perf_ui/perf_ui-tsconfig.json.tsbuildinfo, ... (13.2 s elapsed time)
          2.3 weighted s to build (29 items) gen/front_end/console/console-tsconfig.json, gen/front_end/console/console-tsconfig.json.tsbuildinfo, ... (13.7 s elapsed time)
          3.2 weighted s to build (65 items) gen/front_end/profiler/profiler-tsconfig.json, gen/front_end/profiler/profiler-tsconfig.json.tsbuildinfo, ... (18.0 s elapsed time)
          3.3 weighted s to build (77 items) gen/front_end/network/network-tsconfig.json, gen/front_end/network/network-tsconfig.json.tsbuildinfo, ... (18.2 s elapsed time)
          3.5 weighted s to build (104 items) gen/front_end/sources/sources-tsconfig.json, gen/front_end/sources/sources-tsconfig.json.tsbuildinfo, ... (18.3 s elapsed time)
          3.5 weighted s to build (62 items) gen/front_end/resources/resources-tsconfig.json, gen/front_end/resources/resources-tsconfig.json.tsbuildinfo, ... (16.7 s elapsed time)
          3.6 weighted s to build (119 items) gen/front_end/elements/elements-tsconfig.json, gen/front_end/elements/elements-tsconfig.json.tsbuildinfo, ... (20.2 s elapsed time)
          3.7 weighted s to build (65 items) gen/front_end/timeline/timeline-tsconfig.json, gen/front_end/timeline/timeline-tsconfig.json.tsbuildinfo, ... (17.4 s elapsed time)
          3.8 weighted s to build (179 items) gen/front_end/ui/ui-tsconfig.json, gen/front_end/ui/ui-tsconfig.json.tsbuildinfo, ... (14.4 s elapsed time)
          4.7 weighted s to build (191 items) gen/front_end/sdk/sdk-tsconfig.json, gen/front_end/sdk/sdk-tsconfig.json.tsbuildinfo, ... (13.1 s elapsed time)
   Time by build-step type:
          0.1 s weighted time to generate 7 .css files (0.7 s elapsed time sum)
          0.2 s weighted time to generate 6 .html files (1.5 s elapsed time sum)
          0.4 s weighted time to generate 1 .grd files (0.4 s elapsed time sum)
          1.2 s weighted time to generate 800 .stamp files (7.5 s elapsed time sum)
          2.4 s weighted time to generate 84 .prebundle.ts files (14.3 s elapsed time sum)
          2.8 s weighted time to generate 95 .json files (16.2 s elapsed time sum)
         44.5 s weighted time to generate 187 .js files (272.7 s elapsed time sum)
        270.6 s weighted time to generate 322 .d.ts files (1811.4 s elapsed time sum)
   322.1 s weighted time (2124.7 s elapsed time sum, 6.6x parallelism)
   1502 build steps completed, average of 4.66/s

After analyzing the flamecharts produced by tsc, I observed that TypeScript was indeed checking the source files, even though technically no files had changed.
Yet in its flamechart, I found references to the incremental build, which we have turned on via --composite (which in turn implies --incremental).

The callstack included:

  1. performIncrementalCompilation
  2. createIncrementalProgram
  3. createIncrementalCompilerHost
  4. changeCompilerHostLikeToUseCache

Based on these functions, I ventured further and eventually found references to a function called tryReuseStructureFromOldProgram.
This function sounded very interesting, so I decided to figure out its callstack (console.log(new Error().stack)):

Error
   at tryReuseStructureFromOldProgram (devtools-frontend/node_modules/typescript/lib/tsc.js:85780:25)
   at Object.createProgram (devtools-frontend/node_modules/typescript/lib/tsc.js:85464:30)
   at Object.getBuilderCreationParameters (devtools-frontend/node_modules/typescript/lib/tsc.js:88599:29)
   at createEmitAndSemanticDiagnosticsBuilderProgram (devtools-frontend/node_modules/typescript/lib/tsc.js:88876:107)
   at Object.createIncrementalProgram (devtools-frontend/node_modules/typescript/lib/tsc.js:90295:16)
   at Object.performIncrementalCompilation (devtools-frontend/node_modules/typescript/lib/tsc.js:90254:33)
   at performIncrementalCompilation (devtools-frontend/node_modules/typescript/lib/tsc.js:92484:29)
   at executeCommandLineWorker (devtools-frontend/node_modules/typescript/lib/tsc.js:92356:17)
   at devtools-frontend/node_modules/typescript/lib/tsc.js:92401:99
   at devtools-frontend/node_modules/typescript/lib/tsc.js:4422:25

tryReuseStructureFromOldProgram returns 0 (which implies its program could not be reused), as oldProgram does not exist.

However, when analyzing createBuilderProgramState I discovered that it was correctly deducing that there were no files changed.
console.log(state.changedFilesSet); logged an empty set.
This is correct, as no files had changed and the full program information from the .tsbuildinfo could be used.

At this point, I was a bit puzzled.
It seemed like tsc was able to figure out nothing had changed, yet it was still doing work.
Based on the content of the .tsbuildinfo file, I continued searching for its content.
There were two interesting fieldnames: signature and version.

When searching for \.\bsignature\b, I found two interesting functions:

const computeHash = host.createHash || generateDjb2Hash;.

Sadly computeHash is passed in as a method parameter into a lot of functions.
Therefore, it is difficult to figure out where it is actually used.

/**
* Returns if the shape of the signature has changed since last emit
*/
export function updateShapeSignature

The second function was a lot more interesting and also had references to computeHash.
Based on my reading of these functions, tsc can figure when (not) to compile a particular project.
This is (as expected) based on file hashes and checking (among other things) the compiler version it was previously compiled with.

While these functions seemed what I was looking for, adding logging to either of those showed that they were not called at all.
I added additional logging to numerous callsides of updateShapeSignature, yet none of these were called.

At this point, I was a bit confused as to how/why the .tsbuildinfo was seemingly used, but not used determining whether it should compile at all.

TLDR: integrate/improve incremental build functionality into -p

Eventually I realized the following: tsc -b and tsc -w can make efficient decisions about recompilation.
These two modes can figure out whether recompilation is necessary and bail out if the above mentioned functions determine that nothing has changed.

However, tsc -p does not take advantage of this functionality.
To improve the incremental build performance of DevTools (where the assumption is that tsc is not "running the world"), can we extend tsc -p to prevent unnecessary checking when no files have changed?

Essentially, my expectation would be that the time third_party/node/node.py command I posted all the way at the top would do no (or near zero) work, if nothing has changed.
This would have significant performance improvements for DevTools, where a majority of the files rarely change and rebuilds are very frequent.

Checklist

My suggestion meets these guidelines:

  • This wouldn't be a breaking change in existing TypeScript/JavaScript code
  • This wouldn't change the runtime behavior of existing JavaScript code
  • This could be implemented without emitting different JS based on the types of the expressions
  • This isn't a runtime feature (e.g. library functionality, non-ECMAScript syntax with JavaScript output, etc.)
  • This feature would agree with the rest of TypeScript's Design Goals.
@DanielRosenwasser DanielRosenwasser added In Discussion Not yet reached consensus Suggestion An idea for TypeScript Domain: --incremental The issue relates to incremental compilation Domain: tsc -b Issues related to build mode labels Sep 23, 2020
@DanielRosenwasser
Copy link
Member

Reading this more, I would think that this is actually a bug rather than a suggestion, but I'm conservatively marking it as a suggestion. Maybe @sheetalkamat can speak to whether the current behavior was intentional or not.

@TimvdLippe
Copy link
Contributor Author

Classifying it as a bug seems fine to me as well 😄

@sheetalkamat
Copy link
Member

I think the behaviour is intentional..
tsc -b means you are expecting the file upto date check to detemine if things need to be compiled or not.
tsc -p doesnt mean that. What it means is that construct program and then do incremental emit/check if incremental flag is on.
The main difference is tsc -b checks the timestamps on input and output to determine if things need to be compiled at all.
tsc -p doesnt and it has to create the program because it doesnt know if world has changed (eg. module resolution files are available or not so it needs to parse those files.
Having said that apart from having to parse those files, time is also spent in module resolution so we have been thinking about how we can optimize that part (it is for both modes) and #40356 will help in that.

@TimvdLippe
Copy link
Contributor Author

We initially attempted to use tsc -b, however we ran into the problem where TypeScript expects its "running the world" in builder mode. This meant that if we declared two targets in GN (A and B) where A depends on B, GN would first compile B and then compile A. However, during compilation of A, it would recheck whether B is up-to-date. This led to non-determinism in our build system, as B was now also being compiled during A's compilation, leading to files changed and timestamp changes.

Basically, we want the functionality of tsc -b, but without the introduction of non-determinism as GN should be running the world, rather than tsc. Would you be open to extending tsc -b to not perform the recursive checks and assume that its project references are up-to-date?

@sheetalkamat
Copy link
Member

Basically, we want the functionality of tsc -b, but without the introduction of non-determinism as GN should be running the world, rather than tsc. Would you be open to extending tsc -b to not perform the recursive checks and assume that its project references are up-to-date?

I am not sure what mode that would be.. Note that if project has references to another project, if there are changes in the referenced project, it needs to be built. So that upto date check is correct and because tsc -b means build solution, it will build that referenced solution. So i think tsc -b not building whole world is confusing.
With tsc -p doing time checks is tricky since we dont want to do this unconditionally for sure so this will have to be under some flag.Even with incremental, people building when there are changes are more compared to when things are upto date so that check is just added overhead. It also raises question as to which files are input files (potentially add files in the program?) which is not what build does.. it only relies on config file specified input files and ignores eg node_modules and such dependencies for upto date check. Basically its not very clear who and how much this check adds as overhead (it builds up if you have large program to check file timestamps) vs perf

@TimvdLippe
Copy link
Contributor Author

Basically, we want the functionality of tsc -b, but without the introduction of non-determinism as GN should be running the world, rather than tsc. Would you be open to extending tsc -b to not perform the recursive checks and assume that its project references are up-to-date?

I am not sure what mode that would be.. Note that if project has references to another project, if there are changes in the referenced project, it needs to be built. So that upto date check is correct and because tsc -b means build solution, it will build that referenced solution. So i think tsc -b not building whole world is confusing.

If you solely use tsc -b, then it would indeed need to verify that the referenced project is up-to-date. However, we are operating in a build system where that is a guarantee. But I understand that tsc -b is aimed towards a "tsc runs the world", which makes sense imo.

With tsc -p doing time checks is tricky since we dont want to do this unconditionally for sure so this will have to be under some flag.Even with incremental, people building when there are changes are more compared to when things are upto date so that check is just added overhead. It also raises question as to which files are input files (potentially add files in the program?) which is not what build does.. it only relies on config file specified input files and ignores eg node_modules and such dependencies for upto date check. Basically its not very clear who and how much this check adds as overhead (it builds up if you have large program to check file timestamps) vs perf

Adding a flag would be okay for us. We have full control over tsc, so that is quite easy to do.

I am not really following the other parts of your comment, I am sorry. With regards to our input files, we specify all input files and disable all other resolution. E.g. we also remove the @types directory resolution. Typically our programs are small, at most 10-15 files per program.

I understand your concerns about additional overhead for the majority of tsc -p invocations. Putting it behind a flag would maybe make that work? Adding a --trust-me-i-am-an-engineer (name TBD 😉) where tsc -p assumes that all of its project references are up-to-date, but only performs the timestamp checks for its current input files, tsc version, etc...

If you want, I can help out prototyping to figure out what would work for us. If you could provide me pointers in the code to where I should be looking, I can help debugging next week.

@TimvdLippe
Copy link
Contributor Author

As a small update: given that this particular use case does not seem to and will not be supported by the TypeScript compiler, we have since been looking at mitigating the impact with GOMA: https://bugs.chromium.org/p/chromium/issues/detail?id=1139220 We are currently in discussion with the GOMA team to figure out an implementation. Sadly, this solution is not available for non-Googlers, which means that Chromium builds for non-Googlers will remain slow.

@Misaka-0x447f

This comment was marked as off-topic.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Domain: --incremental The issue relates to incremental compilation Domain: tsc -b Issues related to build mode In Discussion Not yet reached consensus Suggestion An idea for TypeScript
Projects
None yet
Development

No branches or pull requests

4 participants