[nll] fewer allocations in liveness, use dirty list #51819

Closed
nikomatsakis opened this issue Jun 26, 2018 · 11 comments
Labels
NLL-performant: Working towards the "performance is good" goal
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@nikomatsakis
Contributor

@nnethercote reports that the liveness computations do a lot of allocations. Indeed, the liveness results are stored in a Vec<IdxSetBuf>:

pub ins: IndexVec<BasicBlock, LocalSet>,

pub type LocalSet = IdxSetBuf<Local>;

This means that we allocate a separate bitset for the ins (and outs!) of each basic block, which is rather inefficient. The other dataflow implementations use a different setup: they have just one big index set that holds the bits for every basic block in a single allocation. They also avoid allocating both ins and outs sets -- since liveness is a reverse analysis, they would basically keep only the single outs vector (what is live on exit).
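For concreteness, here is a minimal sketch of that single-allocation layout, using a flat Vec<u64> and hypothetical names rather than rustc's actual IdxSetBuf/IndexVec types; it only illustrates the idea of one contiguous bit matrix indexed by (block, local).

// Hypothetical sketch, not rustc's API: one flat allocation holds a row of
// bits per basic block, instead of one heap-allocated bitset per block.
struct FlatLivenessBits {
    words_per_block: usize,
    words: Vec<u64>,
}

impl FlatLivenessBits {
    fn new(num_blocks: usize, num_locals: usize) -> Self {
        let words_per_block = (num_locals + 63) / 64;
        FlatLivenessBits {
            words_per_block,
            words: vec![0; num_blocks * words_per_block],
        }
    }

    // Is `local` set in the row belonging to `block`?
    fn contains(&self, block: usize, local: usize) -> bool {
        let word = self.words[block * self.words_per_block + local / 64];
        word & (1 << (local % 64)) != 0
    }

    // Mark `local` as live in the row belonging to `block`.
    fn insert(&mut self, block: usize, local: usize) {
        self.words[block * self.words_per_block + local / 64] |= 1 << (local % 64);
    }
}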

The ins vector is only used during generation of the liveness results here (as well as some assertions and debug printouts later on):

// outs[b] = ∪ {ins of successors}
bits.clear();
for &successor in mir.basic_blocks()[b].terminator().successors() {
    bits.union(&ins[successor]);
}
outs[b].clone_from(&bits);

In fact, you don't really need it there -- instead, when you process a block X, you would compute take outs[X], subtract the kills and add the defs, and then take that resulting bit set and propagate it to each predecessor of X.
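A hedged sketch of that scheme, using plain Vec<bool> sets and hypothetical names instead of rustc's bitset types: compute the live-on-entry bits for a block from its own out set and union them directly into each predecessor's out set, with no ins vector at all.

// Hypothetical illustration of processing one block without an `ins` vector.
fn propagate_without_ins(
    x: usize,
    outs: &mut [Vec<bool>],      // outs[b][l]: local l is live on exit from b
    defs: &[Vec<bool>],          // defs[b][l]: local l is defined in b
    uses: &[Vec<bool>],          // uses[b][l]: local l is used in b
    predecessors: &[Vec<usize>], // predecessors[b]: blocks that jump to b
) {
    let num_locals = outs[x].len();

    // bits = use ∪ (outs[x] - def), i.e. what is live on entry to x.
    let mut bits = vec![false; num_locals];
    for l in 0..num_locals {
        bits[l] = uses[x][l] || (outs[x][l] && !defs[x][l]);
    }

    // Whatever is live on entry to x is live on exit from every predecessor of x.
    for &pred in &predecessors[x] {
        for l in 0..num_locals {
            outs[pred][l] |= bits[l];
        }
    }
}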

@nikomatsakis added the T-compiler, WG-compiler-nll, and NLL-performant labels on Jun 26, 2018
@nikomatsakis changed the title from "[nll] fewer allocations in liveness" to "[nll] fewer allocations in liveness, use dirty list" on Jun 26, 2018
@nikomatsakis
Contributor Author

In addition, while we are modifying liveness, we can probably adjust the main computation to be less naive:

while changed {
    changed = false;
    for b in mir.basic_blocks().indices().rev() {
        // outs[b] = ∪ {ins of successors}
        bits.clear();
        for &successor in mir.basic_blocks()[b].terminator().successors() {
            bits.union(&ins[successor]);
        }
        outs[b].clone_from(&bits);
        // bits = use ∪ (bits - def)
        def_use[b].apply(&mut bits);
        // update bits on entry and flag if they have changed
        if ins[b] != bits {
            ins[b].clone_from(&bits);
            changed = true;
        }
    }
}

Here you can see that we repeatedly iterate over all the basic blocks and propagate their bits. We should avoid re-propagating bits from basic blocks whose "out set" did not change in the previous iteration.
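A hedged sketch of the dirty-list idea, again with hypothetical names and plain Vec<bool> sets rather than rustc's bitsets: a block is re-queued only when the live-on-entry set of one of its successors actually changes.

use std::collections::VecDeque;

// Hypothetical illustration, not rustc's liveness code.
fn liveness_fixpoint(
    num_blocks: usize,
    num_locals: usize,
    successors: &[Vec<usize>], // successors[b]: blocks that b can jump to
    defs: &[Vec<bool>],        // defs[b][l]: local l is defined in b
    uses: &[Vec<bool>],        // uses[b][l]: local l is used in b
) -> (Vec<Vec<bool>>, Vec<Vec<bool>>) {
    // Invert the successor edges so we know which blocks to re-queue when a
    // block's live-on-entry set grows.
    let mut predecessors = vec![Vec::new(); num_blocks];
    for b in 0..num_blocks {
        for &succ in &successors[b] {
            predecessors[succ].push(b);
        }
    }

    let mut ins = vec![vec![false; num_locals]; num_blocks];
    let mut outs = vec![vec![false; num_locals]; num_blocks];

    // Seed the dirty list with every block, in reverse order since this is a
    // backward analysis.
    let mut dirty: VecDeque<usize> = (0..num_blocks).rev().collect();
    let mut queued = vec![true; num_blocks];

    while let Some(b) = dirty.pop_front() {
        queued[b] = false;

        // outs[b] = ∪ {ins of successors}
        let mut bits = vec![false; num_locals];
        for &succ in &successors[b] {
            for l in 0..num_locals {
                bits[l] |= ins[succ][l];
            }
        }
        outs[b].clone_from(&bits);

        // bits = use ∪ (bits - def)
        for l in 0..num_locals {
            bits[l] = uses[b][l] || (bits[l] && !defs[b][l]);
        }

        // Only if the entry set changed do the predecessors need another look.
        if ins[b] != bits {
            ins[b] = bits;
            for &pred in &predecessors[b] {
                if !queued[pred] {
                    queued[pred] = true;
                    dirty.push_back(pred);
                }
            }
        }
    }

    (ins, outs)
}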

@nnethercote
Contributor

nnethercote commented Jun 27, 2018

@nikomatsakis: are you able to make these changes?

@nikomatsakis
Copy link
Contributor Author

@nnethercote I haven't been able to get to it yet; if you want to take a stab at it, please feel free!

@nnethercote
Contributor

Ok, I've started looking at this. I see various possibilities, most of which don't overlap with the things you've identified above :)

The other dataflow implementations use a different setup.

I think I see why this code does it this way -- it has various uses of solo bitsets, and it makes the code simpler if the per-BB bitsets can have the same form.

The clone_from calls are silly -- each one is replacing a heap-allocated bitset of length N with a clone of another heap-allocated bitset of length N. It should be possible to do an in-place overwrite.
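For illustration, an in-place overwrite on a word-backed bitset might look like the following; the type and method names here are hypothetical, not rustc's IdxSetBuf API.

// Hypothetical sketch: copy the source's words into the existing backing
// storage rather than building a fresh clone and dropping the old buffer.
struct BitSetBuf {
    words: Vec<u64>,
}

impl BitSetBuf {
    // Overwrite `self` with the contents of `other`; both sets are assumed
    // to cover the same domain and therefore have the same word count.
    fn overwrite(&mut self, other: &BitSetBuf) {
        debug_assert_eq!(self.words.len(), other.words.len());
        self.words.copy_from_slice(&other.words);
    }
}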

you would compute take outs[X]

I don't understand this sentence... is there a typo, or does "take" have a meaning in this context that I'm not familiar with?

I can also see how to avoid some more allocations in defs_uses().

Also, in the common case liveness_of_locals() is called twice in a row, with a different mode each time (regular and drop). I wonder if the two calls can be combined somehow.

@nnethercote
Contributor

#51869 and #51870 implement some of these ideas, getting rid of a chunk of the liveness allocations. The speed-ups were smaller than I'd hoped for, though. I'll do some more profiling tomorrow.

@nikomatsakis
Contributor Author

OK, @nnethercote I left a few comments on those PRs. Good catch in both cases. (Indeed I may have been wrong about which allocations are most important! =)

I still think it'd be good to make the other changes I described, in particular perhaps the dirty list (which isn't necessarily about allocations -- I hope we're not allocating per block in that inner loop!), so let's not close the bug until we're fully satisfied.

@nikomatsakis
Contributor Author

@nnethercote I'm going to take a stab at introducing a dirty list into the propagation, unless you are doing so already. It seems orthogonal enough from your existing PRs. Would that step on your toes at all?

@nikomatsakis
Contributor Author

@nikomatsakis
Contributor Author

The branch now contains (a) a dirty list and (b) removal of the ins sets. I am going to measure the performance effects of those two changes separately.

@nnethercote
Contributor

Looks good; #51869 and #51870 are all I have done on this stuff.

@nikomatsakis
Contributor Author

Now that #51896 etc. has landed, I'm going to close this, as we don't have immediately actionable stuff in here.
