btdu collects data using subprocesses. Each subprocess is started with its own random sub-seed, and prints its results to its standard output. The main process collects the data from the subprocesses and arranges it into a tree.
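For illustration, here is a rough D sketch of what a worker conceptually does. The resolveOffset helper and the line-oriented output format are hypothetical stand-ins; the real sampling loop lives in sample.d, and the real parent-child protocol is defined in proto.d.

```d
import std.random : Random, uniform;
import std.stdio : stdout;

// Hypothetical stand-in for resolving a random byte offset to the path(s)
// that use it (the real implementation does this via btrfs ioctls).
string resolveOffset(ulong offset)
{
    return "/@/usr/lib/example";
}

// Conceptual worker loop: pick uniformly random offsets using the
// per-subprocess sub-seed, and report each result on standard output.
void workerLoop(uint subSeed, ulong totalBytes)
{
    auto rng = Random(subSeed);
    while (true)
    {
        auto offset = uniform(0, totalBytes, rng);
        stdout.writefln("%d\t%s", offset, resolveOffset(offset));
        stdout.flush(); // the main process reads results as they arrive
    }
}

void main()
{
    workerLoop(42, 1UL << 40); // hypothetical sub-seed, 1 TiB of data
}
```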
btdu's source code is organized into the following modules:

- alloc.d - Memory allocation utility code.
- browser.d - The ncurses-based user interface.
- common.d - Small module with common definitions.
- impexp.d - Import/export code.
- main.d - Entry point and event loop.
- paths.d - Implements a trie for efficiently storing paths and associated hierarchical data (see the sketch after this list).
- proto.d - Describes the protocol used between the main process and subprocesses.
- sample.d - The main loop for subprocesses. Performs sample acquisition.
- state.d - Global variables.
- subproc.d - Code to manage subprocesses from the main process.
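To illustrate the idea behind paths.d, here is a minimal D sketch of a path trie. It is not btdu's actual data structure; it only shows how a tree keyed on path components can accumulate per-subtree sample counts.

```d
// Minimal path-trie sketch; not btdu's actual implementation.
class PathNode
{
    PathNode[string] children; // path component -> child node
    ulong samples;             // samples attributed to this subtree

    // Record one sample for the given path, creating nodes as needed.
    void addSample(string[] components)
    {
        samples++;
        if (components.length == 0)
            return;
        auto child = children.require(components[0], new PathNode);
        child.addSample(components[1 .. $]);
    }
}

void main()
{
    import std.stdio : writeln;

    auto root = new PathNode;
    root.addSample(["usr", "lib", "libc.so.6"]);
    root.addSample(["usr", "bin", "ls"]);
    writeln(root.children["usr"].samples); // 2: both samples share "usr"
}
```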
Here are a few potential questions about decisions made when writing btdu, whose answers may not be obvious.
Why does btdu's user interface mimic ncdu's?

Many users who would like to use btdu have probably already used ncdu. Mimicking a familiar user interface allows users to be immediately productive in the new software. Since btdu aims to be a tool which solves problems, as opposed to creating them, removing the obstacle of learning a new user interface was seen as a non-negligible benefit.
Why does btdu use subprocesses instead of threads?

To collect data, btdu issues ioctls to the kernel. These ioctls can take a very long time to execute, and are non-interruptible. The main reason this becomes an issue is D's garbage collector, which operates according to a stop-the-world model and needs to suspend all threads while it scans reachable objects. A thread executing an ioctl can only be "suspended" once the ioctl finishes: even though no userspace code runs during the ioctl, the suspension signal is processed only after the ioctl returns. This means that a GC cycle would last as long as the longest in-flight ioctl, which can take up to a few seconds. As the UI can neither be redrawn nor process keyboard input during this time, this would make it unpleasantly unresponsive.
There are some other, similar but less obvious reasons, such as allowing the user to instantly exit back to the shell even though an ioctl would otherwise need to take a few more seconds to finish.
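As a rough illustration of the instant-exit point (a sketch, not btdu's actual shutdown code in subproc.d): because the workers are separate processes, the main process can signal them and exit immediately instead of waiting for a slow kernel call to return. The sleep command below is just a stand-in for a worker stuck in such a call.

```d
import core.sys.posix.signal : SIGTERM;
import std.process : spawnProcess, kill;
import std.stdio : writeln;

void main()
{
    // Stand-in for a worker that may spend several seconds inside an
    // uninterruptible ioctl; here it merely sleeps.
    auto worker = spawnProcess(["sleep", "60"]);

    // On quit, the main process signals the worker and returns right away;
    // the worker terminates once its current kernel call finishes, and the
    // user is back at the shell immediately.
    kill(worker, SIGTERM);
    writeln("main process exits without waiting for the worker");
}
```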
Why don't the subprocesses divide the work between themselves?

Given the use of multiple subprocesses which collect data in parallel, there are two general ways this could be done:

- Divide the volume into predefined sections, one per subprocess, and dedicate each section to a subprocess. Each subprocess then autonomously picks samples within its section using some deterministic algorithm.
- Have the main process decide on the order of samples to query, and dispatch these samples to subprocesses as they report that they are idle and ready for more work.
The problem with the first approach is that the time it takes to resolve a sample varies. Consider the simple hypothetical scenario where the first half of the disk is twice as fast as the second half of the disk. Then, with two subprocesses, the subprocess responsible for the first half of the disk could collect 200 samples in the time that the other subprocess would collect only 100. If the samples were treated equally, this would thus create the false impression that the first half of the disk contains two thirds of the data, instead of just half.
Though btdu could use the first approach and compensate by scaling the weight of each sample by the time it took to resolve, this would introduce additional inaccuracy, and cause the results to be further skewed by temporary effects such as the disk cache.
The second approach listed above does avoid this problem. However, its implementation requires more elaborate communication with subprocesses - currently, subprocesses do not read any data from the parent process at all.
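For contrast, here is a sketch of what the main-process side of the second approach could look like. This is not how btdu works; the ./btdu-worker program and its line-based request/response protocol are hypothetical, and serve only to show the extra bidirectional communication this approach would require.

```d
import std.process : pipeProcess, Redirect;
import std.random : Random, uniform;
import std.stdio : write;

void main()
{
    // Hypothetical worker that reads one offset per line on its stdin and
    // writes the resolved result back on its stdout.
    auto worker = pipeProcess(["./btdu-worker"], Redirect.stdin | Redirect.stdout);

    auto rng = Random(42);
    enum totalBytes = 1UL << 40; // hypothetical 1 TiB filesystem

    foreach (i; 0 .. 100)
    {
        // The main process decides which sample to query next...
        auto offset = uniform(0, totalBytes, rng);
        worker.stdin.writeln(offset);
        worker.stdin.flush();

        // ...and reads the worker's answer before dispatching more work.
        write(worker.stdout.readln());
    }
}
```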
How does sampling with a uniform random distribution avoid the problem of variable sample resolution time described above?
Let's consider a hypothetical extreme scenario where samples in the first half of the disk take 1 millisecond to resolve, but 10 milliseconds in the second half. For illustration, we'll use 1000 worker processes.
What will happen is that, even though the number of samples collected in the first half is initially higher, each time the workers sampling the fast first half finish their work, about half of them will pick their next sample in the second half. Thus, the workers quickly pile up in the slow second half. This process repeats as the workers in the second half finish resolving their samples. Most importantly, the difference between the two totals stays roughly constant, even as the total number of samples grows:
| Elapsed ms | Busy workers, 1st half | Busy workers, 2nd half | Total samples, 1st half | Total samples, 2nd half |
|---|---|---|---|---|
| 0 | 500 | 500 | 0 | 0 |
| 1 | 250 | 750 | 500 | 0 |
| 2 | 125 | 875 | 750 | 0 |
| 3 | 63 | 938 | 875 | 0 |
| 4 | 31 | 969 | 938 | 0 |
| 5 | 16 | 984 | 969 | 0 |
| 6 | 8 | 992 | 984 | 0 |
| 7 | 4 | 996 | 992 | 0 |
| 8 | 2 | 998 | 996 | 0 |
| 9 | 1 | 999 | 998 | 0 |
| 10 | 250 | 750 | 999 | 500 |
| 11 | 250 | 750 | 1250 | 750 |
| 12 | 188 | 812 | 1500 | 875 |
| 13 | 125 | 875 | 1687 | 938 |
| 14 | 78 | 922 | 1812 | 969 |
| 15 | 47 | 953 | 1891 | 984 |
| 16 | 27 | 973 | 1937 | 992 |
| 17 | 16 | 984 | 1965 | 996 |
| 18 | 9 | 991 | 1980 | 998 |
| 19 | 5 | 995 | 1989 | 999 |
| 20 | 128 | 872 | 1994 | 1250 |
| 21 | 189 | 811 | 2122 | 1500 |
| 22 | 188 | 812 | 2311 | 1687 |
| 23 | 157 | 843 | 2499 | 1812 |
| 24 | 117 | 883 | 2656 | 1891 |
| 25 | 82 | 918 | 2773 | 1937 |
| ... | ... | ... | ... | ... |
| 100 | 94 | 906 | 9482 | 8671 |
| ... | ... | ... | ... | ... |
| 500 | 91 | 909 | 45826 | 45008 |
| ... | ... | ... | ... | ... |
| 1000 | 91 | 909 | 91281 | 90463 |
As you can see, each half's share of the total number of samples still converges towards 50%.
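The table above can be reproduced approximately with a small stand-alone simulation, sketched below in D. It is not part of btdu; it uses actual randomness rather than expected values, and the 1000 workers and 1 ms / 10 ms resolution times match the hypothetical scenario above.

```d
import std.random : Random, uniform;
import std.stdio : writefln;

void main()
{
    enum workers = 1000;
    enum fastMs = 1, slowMs = 10; // resolution time per half of the disk
    auto rng = Random(42);        // fixed seed for reproducibility

    long[2] totalSamples;         // completed samples per half
    int[workers] busyUntil;       // time at which each worker becomes free
    int[workers] half;            // which half each worker is currently sampling

    // Start every worker on a random sample at t = 0.
    foreach (w; 0 .. workers)
    {
        half[w] = uniform(0, 2, rng);
        busyUntil[w] = half[w] == 0 ? fastMs : slowMs;
    }

    foreach (t; 1 .. 1001)
    {
        foreach (w; 0 .. workers)
            if (busyUntil[w] == t)
            {
                totalSamples[half[w]]++;      // this sample has been resolved
                half[w] = uniform(0, 2, rng); // pick the next sample uniformly
                busyUntil[w] = t + (half[w] == 0 ? fastMs : slowMs);
            }

        if (t % 100 == 0)
            writefln("t=%4d ms: 1st half %6d samples, 2nd half %6d samples",
                t, totalSamples[0], totalSamples[1]);
    }
}
```

Its output shows each half's share of the samples converging towards 50%, as in the table.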
Why does btdu use a fixed random seed?

Using a fixed seed instead of a unique (unpredictable) seed enables the following workflow:
- Run btdu to collect initial information about disk usage.
- Quit btdu, and delete the biggest space hog (e.g. a batch of old snapshots).
- Re-run btdu to acquire fresh results.
Because the seed is fixed, the second run samples the same offsets in the same order as the first; and because the metadata that btdu accessed will now be in the operating system's cache, the second invocation is likely to be much faster, quickly "fast-forwarding" to the point where the first invocation stopped.
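As a final sketch, the snippet below shows why a fixed master seed makes re-runs deterministic: each subprocess's random sequence, and therefore the order of sampled offsets, is identical from run to run. The sub-seed derivation shown here is an assumption for illustration, not necessarily how btdu computes its sub-seeds.

```d
import std.random : Random, uniform;
import std.stdio : writeln;

void main()
{
    enum uint masterSeed = 0x5EED; // fixed across runs (hypothetical value)

    foreach (uint worker; 0 .. 4)
    {
        // Hypothetical sub-seed derivation: mix the worker index into the
        // fixed master seed. Every run produces the same per-worker sequence,
        // so a repeated run samples the same offsets in the same order.
        auto rng = Random(masterSeed ^ (worker * 0x9E3779B9u));
        writeln("worker ", worker, ": ",
            uniform(0UL, 1UL << 40, rng), ", ",
            uniform(0UL, 1UL << 40, rng), ", ...");
    }
}
```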