#set text(font: "New Computer Modern", lang: "en", weight: "light", size: 11pt)
#set page(margin: 1.75in)
#set par(leading: 0.55em, spacing: 0.85em, first-line-indent: 1.8em, justify: true)
#set heading(numbering: "1.1")
#set math.equation(numbering: "(1)")
#show figure: set block(breakable: true)
#show figure.caption: set align(center)
#show heading: set block(above: 1.4em, below: 1em)
#show outline.entry.where(level: 1): it => { show repeat : none; v(1.1em, weak: true); text(size: 1em, strong(it)) }
#show raw.where(block: true): block.with(inset: 1em, width: 100%, fill: luma(254), stroke: (left: 5pt + luma(245), rest: 1pt + luma(245)))
#show figure.where(kind: raw): it => { set align(left); it }
#show raw: set text(font:"CaskaydiaCove NFM", lang: "en", weight: "light", size: 9pt)
#show sym.emptyset : sym.diameter
#let reft(reft) = box(width: 9pt, place(dy: -8pt, dx: -0pt,
box(radius: 100%, width: 9pt, height: 9pt, inset: 1pt, stroke: .5pt, // fill: black,
align(center + horizon, text(font: "CaskaydiaCove NFM", size: 7pt, repr(reft)))
)
))
#show raw: r => { show regex("t(\d)"): it => { reft(int(repr(it).slice(2, -1))) }; r }
// #let note(body) = block(width: 100%, inset: 1em, stroke: (paint: silver, dash: "dashed"), body)
#let note(body) = box(width: 100%, inset: 1em, fill: luma(254), stroke: (paint: silver, dash: "dashed"), body)
// #let note(body) = block(inset: 1em, fill: luma(254), stroke: (thickness: 1pt, paint: luma(245)), body)
#let docs(dest) = link(dest, [`(`#underline(offset: 2pt, `docs`)`)`])
#page(align(center + horizon, {
heading(outlined: false, numbering: none, text(size: 1.5em)[Software Engineering])
text(size: 1.3em)[
Cicio Ionuț \
]
align(bottom, datetime.today().display("[day]/[month]/[year]"))
}))
#page[The latest version of the `.pdf` and the referenced material can be found at the following link: #underline(link("https://github.com/CuriousCI/software-engineering")[https://github.com/CuriousCI/software-engineering])]
#page(outline(indent: auto, depth: 3))
#set page(numbering: "1")
= Software models
Software projects require *design choices* that often can't be driven by experience or reasoning alone. That's why a *model* of the project is needed to compare different solutions.
== The _"Amazon Prime Video"_ article
If you were tasked with designing the software architecture for Amazon Prime Video, which choices would you make? What if you had to keep the costs minimal? Would you use a distributed architecture or a monolithic application?
More often than not, monolithic applications are considered more costly and less scalable than their distributed counterparts, due to an inefficient usage of resources. But, in a recent article, a Senior SDE at Prime Video describes how they _"*reduced the cost* of the audio/video monitoring infrastructure by *90%*"_ @prime by using a monolith architecture.
There isn't a definitive way to answer these types of questions, but one way to go about it is to build a model of the system and compare the solutions. In the case of Prime Video, _"the audio/video monitoring service consists of three major components:"_ @prime
- the _media converter_ converts input audio/video streams
- the _defect detectors_ analyze frames and audio buffers in real-time
- the _orchestrator_ controls the flow in the service
#figure(caption: "audio/video monitoring system process")[#{
set text(font: "CaskaydiaCove NFM", weight: "light", lang: "en")
block(inset: 1em, stroke: 1pt + luma(245), fill: luma(254),
image("public/audio-video-monitor.svg", width: 100%)
)
}]
To answer questions about the system, it can be simulated by modeling its components as *Markov decision processes*.
== Models <traffic>
=== Markov chain <markov-chain>
In simple terms, a Markov chain $M$ is described by a set of *states* $S$ and the *transition probability* $p : S times S -> [0, 1]$ such that $p(s'|s)$ is the probability to transition from $s$ to $s'$. The transition probability $p$ is constrained by @markov-chain-constrain
$ forall s in S space.en sum_(s' in S) p(s'|s) = 1 $ <markov-chain-constrain>
A Markov chain (or Markov process) is characterized by memorylessness (the Markov property): predictions can be made based solely on the present state, and aren't influenced by the chain's history.
#figure(caption: [Example Markov chain with $S = {"rainy", "sunny"}$])[
#{
set text(font: "CaskaydiaCove NF", weight: "light", lang: "en")
block(width: 100%, inset: 1em, fill: luma(254), stroke: 1pt + luma(245),
image("public/weather-system.svg", width: 70%)
)
}
] <rainy-sunny>
#figure(caption: [Transition probability of @rainy-sunny])[
#table(
columns: (auto, auto, auto),
stroke: .1pt,
table.header([], [sunny], [rainy]),
[sunny], [0.8], [0.2],
[rainy], [0.5], [0.5]
)
] <rainy-sunny-transition-matrix>
If a Markov chain $M$ transitions at *discrete time* steps (i.e. the time steps $t_0, t_1, t_2, ...$ are countable) and the *state space* is countable, then it's called a DTMC (discrete-time Markov chain). There are other classifications for continuous state spaces and continuous time.
The Markov process is characterized by a *transition matrix*, which describes the probability of each transition, like the one in @rainy-sunny-transition-matrix. Later in the guide it will be shown that implementing transition matrices in `C++` is really simple with the `<random>` library.
=== Markov decision process <mdp>
A Markov decision process (MDP), despite sharing the name, is *different* from a Markov chain, because it interacts with an *external environment*. An MDP $M$ is a tuple $(U, X, Y, p, g, x_0)$ s.t.
- $U$ is the set of *input values*
- $X$ is the set of *states*
- $Y$ is the set of *output values*
- $p : X times X times U -> [0, 1]$ is such that $p(x'|x, u)$ is the probability to *transition* from state $x$ to state $x'$ when the *input value* is $u$
- $g : X -> Y$ is the *output function*
- $x_0 in X$ is the *initial state*
The same constraint as @markov-chain-constrain holds for MDPs, with an important difference: *for each input value*, the sum of the transition probabilities for *that input value* must be 1.
$ forall x in X space.en forall u in U space.en sum_(x' in X) p(x'|x, u) = 1 $
==== MDP example <mdp-example>
The development process of a company can be modeled as a MDP \ $M = (U, X, Y, p, g)$ s.t.
- $U = {epsilon}$ #footnote[If $U$ is empty $M$ can't transition, at least 1 input is required, i.e. $epsilon$], $X = {0, 1, 2, 3, 4}, Y = "Cost" times "Duration", x_0 = 0$
$
g(x) = cases(
(0, 0) & quad x in {0, 4} \
(20000, 2) & quad x in {1, 3} \
(40000, 4) & quad x = 2
)
$
#v(1em)
#align(center)[
#figure({
set text(font: "CaskaydiaCove NF", weight: "light", lang: "en")
block(inset: 1em, fill: luma(254), stroke: 1pt + luma(245),
image("public/development-process-markov-chain.svg"))
}, caption: "the model of a team's development process") <development-process>
]
#v(1em)
#grid(
columns: (auto, auto),
gutter: 1em,
[
#table(
columns: (auto, auto, auto, auto, auto, auto),
stroke: .1pt,
align: center + horizon,
table.header([$epsilon$], [*0*], [*1*], [*2*], [*3*], [*4*]),
[*0*], [0], [1], [0], [0], [0],
[*1*], [0], [.3], [.7], [0], [0],
[*2*], [0], [.1], [.2], [.7], [0],
[*3*], [0], [.1], [.1], [.1], [.7],
[*4*], [0], [0], [0], [0], [1],
)
],
[
Only *1 transition matrix* is needed, as $|U| = 1$ (there's 1 input value). If $U$ had multiple input values, like ${"on", "off", "wait"}$, then multiple transition matrices would have been required, one *for each input value*.
]
)
=== Network of MDPs
Let $M_1, M_2$ be two MDPs s.t.
- $M_1 = (U_1, X_1, Y_1, p_1, g_1)$
- $M_2 = (U_2, X_2, Y_2, p_2, g_2)$
Then the series connection of $M_1$ and $M_2$ (where the output of $M_1$ feeds the input of $M_2$, assuming $Y_1 subset.eq U_2$) is the MDP $M = (U_1, X_1 times X_2, Y_2, p, g)$ s.t.
- $p((x_1', x_2') | (x_1, x_2), u_1) = p_1(x_1'|x_1, u_1) dot.c p_2(x_2'|x_2, g_1(x_1))$
- $g((x_1, x_2)) = g_2(x_2)$
- the connection can also be partial, i.e. only part of $M_2$'s input comes from $M_1$'s output
#pagebreak()
== Other methods
=== Incremental average <incremental-average>
Given a set of values $X = {x_1, ..., x_n} subset RR$ the average $overline(x)_n = (sum_(i = 1)^n x_i) / n$ can be computed with a simple procedure
#figure(caption: `examples/average.cpp`)[
```cpp
float average(std::vector<float> X) {
float sum = 0;
for (auto x_i : X)
sum += x_i;
return sum / X.size();
}
```
]
The problem with this procedure is that, by adding up all the values before the division, the *numerator* could *overflow*, even if the value of $overline(x)_n$ fits within the IEEE-754 limits. Nonetheless, $overline(x)_n$ can be calculated incrementally.
$
overline(x)_(n + 1) = (sum_(i = 1)^(n + 1) x_i) / (n + 1) =
((sum_(i = 1)^n x_i) + x_(n + 1)) / (n + 1) =
(sum_(i = 1)^n x_i) / (n + 1) + x_(n + 1) / (n + 1) = \
((sum_(i = 1)^n x_i) n) / ((n + 1) n) + x_(n + 1) / (n + 1) =
(sum_(i = 1)^n x_i) / n dot.c n / (n + 1) + x_(n + 1) / (n + 1) = \
overline(x)_n dot.c n / (n + 1) + x_(n + 1) / (n + 1)
$
With this formula the numbers added up are smaller: $overline(x)_n$ is multiplied by $n / (n + 1) tilde 1$, and, if $x_(n + 1)$ fits in IEEE-754, then $x_(n + 1) / (n + 1)$ can also be encoded.
#figure(caption: `examples/average.cpp`)[
```cpp
float incr_average(std::vector<float> X) {
float average = 0;
for (size_t n = 0; n < X.size(); n++)
average =
average * ((float)n / (n + 1)) + X[n] / (n + 1);
return average;
}
```
]
In `examples/average.cpp` the procedure `incr_average()` successfully computes the average, but the simple `average()` returns `Inf`.
#pagebreak()
=== Welford's online algorithm <welford>
In a similar fashion, it could be faster and require less memory to calculate the *standard deviation* incrementally. Welford's online algorithm can be used for this purpose.
$
M_(2, n) = sum_(i=1)^n (x_i - overline(x)_n)^2 \
M_(2, n) = M_(2, n-1) + (x_n - overline(x)_(n - 1))(x_n - overline(x)_n) #reft(2) \
sigma^2_n = M_(2, n) / n \
// s^2_n = M_(2, n) / (n - 1)
$
Given $M_2$, if $n > 0$, the standard deviation is $sqrt(M_(2, n) / n)$ #reft(3) . The average can be calculated incrementally like in @incremental-average #reft(1) .
#figure(caption: `mocc/math.cpp`)[
```cpp
void Stat::save(real_t x_i) {
real_t next_mean =
mean_ * ((real_t)n / (n + 1)) + x_i / (n + 1); t1
m_2__ += (x_i - mean_) * (x_i - next_mean); t2
mean_ = next_mean;
n++;
}
real_t Stat::stddev() const {
return n > 0 ? sqrt(m_2__ / n) : 0; t3
}
```
]
=== Euler method // for ordinary differential equations
When an ordinary differential equation can't be solved analytically, the solution must be approximated. There are many techniques: one of the simplest (though less accurate and efficient) is the forward Euler method, described by the following equation:
$ y_(n + 1) = y_n + Delta dot.c f(x_n, y_n) $ <euler-method>
Let the function $y$ be the solution to the following problem
$
cases(
y(x_0) = y_0 \
y'(x) = f(x, y(x))
)
$
Let $y(x_0) = y_0$ be the initial condition of the system, and $y' = f(x, y(x))$ the known *derivative* of $y$ (a function of $x$ and $y(x)$). To approximate $y$, a step $Delta$ is chosen (the smaller, the more precise the approximation) s.t. $x_(n + 1) = x_n + Delta$. @euler-method then reads: the value of $y$ at the *next step* is the current value of $y$ plus its derivative $y'$ multiplied by $Delta$. The multiplication by $Delta$ accounts for the whole interval from $x_n$ to $x_(n + 1)$, approximating the accumulated change by
$ (x_(n + 1) - x_n) dot.c f(x_n, y_n) = Delta dot.c f(x_n, y_n) $
Where $y_n = y(x_n)$. Let's consider the example in @euler-method-example.
$
cases(y(x_0) = 0 \ y'(x) = 2x), quad "with" Delta = 1, 0.5, 0.3, 0.25
$ <euler-method-example>
The following program approximates @euler-method-example with different $Delta$ values.
#figure(caption: `examples/euler.cpp`)[
```cpp
#define SIZE 4
float derivative(float x) { return 2 * x; }
int main() {
float x[SIZE];
for (size_t i = 0; i < SIZE; i++) {
x[i] = 0;
float delta = 1. / (i + 1);
for (float t = 0; t <= 10; t += delta)
x[i] += delta * derivative(t);
}
return 0;
}
```
]
The approximation is close, but not very precise. However, the error analysis is beyond this guide's scope.
#figure(caption: "examples/euler.svg", image("examples/euler.svg", width: 77%))
=== Monte Carlo method
"Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results.
The underlying concept is to use randomness to solve problems that might be deterministic in principle [...] Monte Carlo methods are mainly used in three distinct problem classes: optimization, numerical integration, and generating draws from a probability distribution" @monte-carlo-method
#note[
The cost to develop a feature for a certain software is described by a uniform discrete distribution $cal(U){300, 1000}$. Determine the probability that the cost is less than $550$.
]
The problem above can be easily solved analytically, but let's use the Monte Carlo method to approximate its value.
#figure(caption: `examples/montecarlo.cpp`)[
```cpp
#include <iostream>
#include <random>
int main() {
std::random_device random_device;
std::default_random_engine rand_engine(random_device());
std::uniform_int_distribution<> rand_cost(300, 1000);
const size_t ITERATIONS = 10000;
size_t below_550 = 0;
for (size_t i = 0; i < ITERATIONS; i++) t1
if (rand_cost(rand_engine) < 550) t2
below_550++; t3
std::cout << (double)below_550/ITERATIONS t4
<< std::endl;
return 0;
}
```
]
The first step is to simulate the system for a certain number of iterations #reft(1) (in this example, "simulating the system" means generating a random integer cost between 300 and 1000 #reft(2) ). If an iteration satisfies the requested condition ($< 550$), it's counted #reft(3).
At the end of the simulations, the probability is calculated as #math.frac([iterations below 550], [total iterations]) #reft(4) . The larger the number of iterations, the more precise the approximation. This type of calculation can be very easily distributed in an HPC (high performance computing) context.
// === Simulated annealing would be cool
// The Monte Carlo method is used to approximate values. For example, if you want to approximate the area under a function, just scale the function to a $1 times 1$ square, generate $n$ random x points, count how many of them are under the curve (let that number be $s$) and the area is $s / n$. It can also be used to approximate $pi$ for example.
//
// The Monte Carlo method can be used to approximate probabilities:
// - run a simulation $n$ times (where $n$ is very big)
// - for each simulation
// - get the cost of the simulation
// - add $1$ to $s$ if the cost is below a certain threshold $t$
// - the probability that the cost of the system is below $t$ is $s / n$
// - TODO: this can be seen in `practice/etc...`
#pagebreak()
= How to `C++`
This section covers the basics assuming the reader already knows `C`.
== Prelude
`C++` is a strange language, and some of its quirks need to be pointed out to better understand what the code does in later sections.
=== Operator overloading <operator-overloading>
#figure(caption: `examples/random.cpp`)[
```cpp
#include <iostream>
#include <random>
int main() {
std::cout << random() t1 << std::endl;
std::random_device random_device;
std::cout << random_device() t2 << std::endl;
int seed = random_device();
std::default_random_engine random_engine(seed); t3
std::cout << random_engine() t4 << std::endl;
}
```
] <random-example-1>
In @random-example-1, to generate a random number, ```cpp random_device()``` #reft(2) and ```cpp random_engine()``` #reft(4) are used like functions, but they *aren't functions*, they're *instances* of a ```cpp class```. That's because in `C++` the behaviour of a certain operator (like `+`, `+=`, `<<`, `>>`, `[]`, `()` etc.) when used on an instance of a ```cpp class``` can be defined by the programmer.
It's called *operator overloading*, and it's a relatively common feature:
- in `Python` operator overloading is done by implementing methods with special names, like ```python __add__()``` @python-operator-overloading
- in `Rust` it's done by implementing the `Trait` associated with the operation, like ```rust std::ops::Add``` @rust-operator-overloading.
- `Java` and `C` don't have operator overloading
For example, ```cpp std::cout``` is an instance of the ```cpp std::basic_ostream class```, which overloads the method "```cpp operator<<()```" @basic-ostream; ```cpp std::cout << "Hello"``` is a valid piece of code which *doesn't do a bitwise left shift* like it would in `C`, but prints on the standard output the string ```cpp "Hello"```. // The same applies to to file streams.
```cpp random()``` #reft(1) _should_ be a regular function call, but, for example, ```cpp std::default_random_engine random_engine(seed);``` #reft(3) is something else altogether: a constructor call, where ```cpp seed``` is the parameter passed to the constructor to instantiate the ```cpp random_engine``` object.
// === Instantiating objects
#pagebreak()
== The ```cpp random``` library <random-library>
The `C++` standard library offers tools to easily implement MDPs.
=== Random engines
In `C++` there are many ways to *generate random numbers* @pseudo-random-number-generation. Generally it's not recommended to use ```cpp random()``` #reft(1). It's better to use a *random engine* #reft(5): it's fast, deterministic (given a seed, the sequence of generated numbers is the same) and can be used with distributions. A `random_device` is a non-deterministic generator: it uses a *hardware entropy source* (if available) to generate random numbers.
#figure(caption: `examples/random.cpp`)[
```cpp
#include <iostream>
#include <random>
int main() {
std::cout << random() t1 << std::endl;
std::random_device random_device; t2
std::cout << random_device() t3 << std::endl;
int seed = random_device(); t4
std::default_random_engine random_engine(seed); t5
std::cout << random_engine() t6 << std::endl;
}
```
] <random-example>
The typical course of action is to instantiate a `random_device` #reft(2), and use it to generate a seed #reft(4) for a `random_engine`. Given that random engines can be used with distributions, they're really useful to implement MDPs. Also, #reft(3) and #reft(6) are examples of *operator overloading* (@operator-overloading).
From this point on, ```cpp std::default_random_engine``` will be referred to as ```cpp urng_t``` (uniform random number generator type).
#figure()[
```cpp
#include <random>
// works like typedef in C
using urng_t = std::default_random_engine;
int main() {
urng_t urng(190201); // constructor call with parameter 190201
}
```
]
=== Distributions <distributions>
Just the capability to generate random numbers isn't enough: these numbers need to be manipulated to fit certain needs. Luckily, `C++` covers *basically all of them*. For example, the MDP in @development-process can be easily simulated with the following code:
#figure(caption: `examples/transition_matrix.cpp`)[
```cpp
#include <iostream>
#include <random>
using urng_t = std::default_random_engine;
int main() {
std::random_device random_device;
urng_t urng(random_device());
std::discrete_distribution<> transition_matrix[] = {
{0, 1},
{0, .3, .7},
{0, .1, .2, .7},
{0, .1, .1, .1, .7},
{0, 0, 0, 0, 1},
};
size_t state = 0;
for (size_t step = 0; step < 15; step++) {
state = transition_matrix[state](urng);
std::cout << state << std::endl;
}
return 0;
}
```
] <transition-matrix>
==== Uniform discrete distribution #docs("https://en.cppreference.com/w/cpp/numeric/random/uniform_int_distribution") <uniform-int>
#note[
To test a system $S$ it's required to build a generator that sends a value $v_t$ to $S$ every $T_t$ seconds. For each send, the value of $T_t$ is an *integer* chosen uniformly in the range $[20, 30]$.
]
The `C` code to compute $T_t$ would be ```cpp T = 20 + rand() % 11;```, which is very *error prone*, hard to remember and has no semantic value. In `C++` the same can be done in a *simpler* and *cleaner* way:
```cpp
std::uniform_int_distribution<> random_T(20, 30); t1
size_t T = t2 random_T(urng);
```
The interval $T_t$ can be easily generated #reft(2) without needing to remember any formula or trick. The behaviour of $T_t$ is defined only once #reft(1), so it can be changed without introducing bugs or inconsistencies. It's also worth taking a look at the implementation of the exercise above (with the addition that $v_t = T_t$), as it comes up very often in software models.
#figure(caption: `examples/interval_generator.cpp`)[
```cpp
#include <iostream>
#include <random>
using urng_t = std::default_random_engine;
int main() {
std::random_device random_device;
urng_t urng(random_device());
std::uniform_int_distribution<> random_T(20, 30);
size_t T = random_T(urng), next_request_time = T;
for (size_t time = 0; time < 1000; time++) {
if (time < next_request_time)
continue;
std::cout << T << std::endl;
T = random_T(urng);
next_request_time = time + T;
}
return 0;
}
```
]
The ```cpp uniform_int_distribution``` has many other uses, for example, it could uniformly generate a random state in a MDP. Let ```cpp STATES_SIZE``` be the number of states
```cpp
uniform_int_distribution<> random_state(0, STATES_SIZE-1 t1);
```
```cpp random_state``` generates a random state when used. Be careful! Remember to use ```cpp STATES_SIZE-1``` #reft(1), because ```cpp uniform_int_distribution``` is inclusive on both ends. Forgetting the ```cpp -1``` can lead to very sneaky bugs, like random segfaults at different instructions, which are very hard to debug without `gdb`. The ```cpp uniform_int_distribution``` can also generate negative integers, for example $z in { x | x in ZZ and x in [-10, 15]}$.
==== Uniform continuous distribution #docs("https://en.cppreference.com/w/cpp/numeric/random/uniform_real_distribution") <uniform-real>
It's the same as above, with the difference that it generates *real* numbers in the range $[a, b) subset RR$.
==== Bernoulli distribution #docs("https://en.cppreference.com/w/cpp/numeric/random/bernoulli_distribution") <bernoulli>
#note[
To model a network protocol $P$ it's necessary to model requests. When sent, a request can randomly fail with probability $p = 0.001$.
]
Generally, a random fail can be simulated by generating $r in [0, 1]$ and checking whether $r < p$.
```cpp
std::uniform_real_distribution<> uniform(0, 1);
if (uniform(urng) < 0.001)
fail();
```
A `std::bernoulli_distribution` is a better fit for this specification, as it generates a boolean value and its semantics represent "an event that happens with a certain probability $p$".
```cpp
std::bernoulli_distribution random_fail(0.001);
if (random_fail(urng))
fail();
```
==== Normal distribution #docs("https://en.cppreference.com/w/cpp/numeric/random/normal_distribution") <normal>
A typical normal distribution; it requires the mean #reft(1) and the standard deviation #reft(2) .
#figure(caption: `examples/normal.cpp`)[
```cpp
#include <iomanip>
#include <iostream>
#include <map>
#include <random>
using urng_t = std::default_random_engine;
int main() {
std::random_device random_device;
urng_t urng(random_device());
std::normal_distribution<> normal(12 t1, 2 t2);
std::map<long, unsigned> histogram{};
for (size_t i = 0; i < 10000; i++)
++histogram[(size_t)normal(urng)];
for (const auto [k, v] : histogram)
if (v / 200 > 0)
std::cout << std::setw(2) << k << ' '
<< std::string(v / 200, '*') << '\n';
return 0;
}
```
]
```bash
8 **
9 ****
10 *******
11 *********
12 *********
13 *******
14 ****
15 **
```
==== Exponential distribution #docs("https://en.cppreference.com/w/cpp/numeric/random/exponential_distribution") <exponential>
#note[
A server receives requests at a rate of 5 requests per minute from each client. You want to rebuild the architecture of the server to make it cheaper. To test whether the new architecture can handle the load, it's required to build a model of a client that sends requests at random intervals with an expected rate of 5 requests per minute.
]
It's easier to simulate the system in seconds (to have more precise measurements). If the client sends 5 requests per minute, the rate in seconds is $lambda = 5 / 60 approx 0.083$ requests per second.
#figure(caption: `examples/exponential.cpp`)[
```cpp
/* ... */
int main() {
std::random_device random_device;
urng_t urng(random_device());
std::exponential_distribution<> random_interval(5./60.);
real_t next_request_time = 0;
std::vector<real_t> req_per_min = {0};
for (real_t time_s = 0; time_s < HORIZON; time_s++) {
if (((size_t)time_s) % 60 == 0)
req_per_min.push_back(0); t1
if (time_s < next_request_time)
continue;
req_per_min.back()++; t2
next_request_time = time_s + random_interval(urng);
}
real_t mean = 0;
for (auto x : req_per_min) t3
mean += x;
std::cout << mean / req_per_min.size() << std::endl;
}
```
]
The code above has a counter to measure how many requests were sent each minute. A new counter is added every 60 seconds #reft(1) , and it's incremented by 1 each time a request is sent #reft(2) . At the end, the average of the counts is calculated #reft(3) , and it comes out to be about 5 requests every 60 seconds (or 5 requests per minute).
==== Poisson distribution #docs("https://en.cppreference.com/w/cpp/numeric/random/poisson_distribution") <poisson>
The Poisson distribution is closely related to the exponential distribution, as it randomly generates the number of events occurring in a time unit given the average rate.
```cpp
#include <iomanip>
#include <iostream>
#include <map>
#include <random>
using urng_t = std::default_random_engine;
using real_t = float;
const size_t HORIZON = 10000;
int main() {
std::random_device random_device;
urng_t urng(random_device());
std::poisson_distribution<> poisson(4);
std::map<long, unsigned> histogram{};
for (size_t i = 0; i < 10000; i++)
++histogram[(size_t)poisson(urng)];
for (const auto [k, v] : histogram)
if (v / 100 > 0)
std::cout << std::setw(2) << k << ' '
<< std::string(v / 100, '*') << '\n';
}
```
```bash
0 *
1 *******
2 **************
3 *******************
4 ********************
5 ***************
6 **********
7 *****
8 **
9 *
```
==== Geometric distribution #docs("https://en.cppreference.com/w/cpp/numeric/random/geometric_distribution") <geometric>
A typical geometric distribution; it has the same API as the other distributions.
=== Discrete distribution & transition matrices #docs("https://en.cppreference.com/w/cpp/numeric/random/discrete_distribution") <discrete>
#note[
To choose the architecture for an e-commerce it's necessary to simulate realistic purchases. After interviewing 678 people it's determined that 232 of them would buy a shirt from your e-commerce, 158 would buy a hoodie and the other 288 would buy pants.
]
The objective is to simulate random purchases reflecting the results of the interviews. One way to do it is to calculate the percentage of buyers for each item, generate $r in [0, 1]$, and do some checks on $r$. However, this specification can be implemented very easily in `C++` by using a ```cpp std::discrete_distribution```, without having to do any calculation or write complex logic.
// #align(center)[
#figure(caption: `examples/TODO.cpp`)[
```cpp
enum Item { Shirt = 0, Hoodie = 1, Pants = 2 };
int main() {
std::discrete_distribution<>
rand_item = {232, 158, 288}; t1
for (size_t request = 0; request < 1000; request++) {
switch (rand_item(urng)) {
case Shirt: t2 std::cout << "shirt"; break;
case Hoodie: t2 std::cout << "hoodie"; break;
case Pants: t2 std::cout << "pants"; break;
}
std::cout << std::endl;
}
return 0;
}
```
]
The `rand_item` instance generates a random integer $x in {0, 1, 2}$ (because 3 weights were specified in the list #reft(1) ; if the items were 10, then $x$ would be s.t. $0 <= x <= 9$). The ```cpp = {a, b, c}``` syntax can be used to initialize the discrete distribution because `C++` constructors can accept a brace-enclosed initializer list @std-array.
The ```cpp discrete_distribution``` uses the weights in the array to generate the probability for each integer. For example, the probability to generate `0` would be calculated as $232 / (232 + 158 + 288)$, the probability to generate `1` would be $158 / (232 + 158 + 288)$ and the probability to generate `2` would be $288 / (232 + 158 + 288)$. This way, the sum of the probabilities is always 1, and each probability is proportional to its weight.
To map the integers to the actual items #reft(2) an ```cpp enum``` is used: for simple enums each entry can be converted automatically to its integer value (and vice versa). In `C++` there is another construct, the ```cpp enum class```, which doesn't allow implicit conversions (the conversion must be done with a function or with ```cpp static_cast```), but is more type-safe (see @enum).
The ```cpp discrete_distribution``` can also be used for transition matrices, like the one in @rainy-sunny-transition-matrix. It's enough to assign each state a number (e.g. ```cpp sunny = 0, rainy = 1```), and model the transition probability of *each state* as a discrete distribution.
```cpp
std::discrete_distribution<> transition_matrix[] = {
    /* 0 */ { /* 0 */ 0.8, /* 1 */ 0.2},
    /* 1 */ { /* 0 */ 0.5, /* 1 */ 0.5}
};
```
In the example above the probability to go from state ```cpp 0 (sunny)``` to ```cpp 0 (sunny)``` is 0.8, the probability to go from state ```cpp 0 (sunny)``` to ```cpp 1 (rainy)``` is 0.2, and so on.
The `discrete_distribution` can also be initialized from a pair of iterators, which is useful when the weights aren't already known and must be calculated at runtime.
#figure(caption: `practice/2025-01-09/1/main.cpp`)[
```cpp
for (auto &weights t1 : matrix)
transition_matrix.push_back(
std::discrete_distribution<>(
weights.begin(), t2 weights.end() t3 )
);
```
]
The weights are stored in a ```cpp vector``` #reft(1) , and the ```cpp discrete_distribution``` for each state is initialized by passing an iterator to the beginning #reft(2) and to the end #reft(3) of the vector. This works with dynamic arrays too (for plain arrays, the iterators are just pointers).
== Data
=== Manual memory allocation
If you allocate with ```cpp new```, you must deallocate with ```cpp delete```; you can't mix them up with ```c malloc()``` and ```c free()```.
To avoid manual memory allocation, most of the time it's enough to use the structures in the standard library, like ```cpp std::vector<T>```.
=== Data structures
==== Vectors <std-vector>
// === ```cpp std::vector<T>()``` <std-vector>
You almost never have to allocate memory manually! The structures implemented in the standard library cover most use cases, and they are really easy to use.
Vectors can be used as stacks.
==== Deques <std-deque>
// === ```cpp std::deque<T>()``` <std-deque>
Deques are very common: they are like vectors, but can be pushed and popped at both ends, so they can be used as queues.
==== Sets <std-set>
Not needed as often; works like the Python set. It can be either a set (ordered) or an unordered set (uses hashes).
==== Maps <std-map>
Could be useful. Can be either a map (ordered) or an unordered map (uses hashes)
== I/O
Input/output is very simple in `C++`.
=== Standard I/O <iostream>
=== Files <files>
Working with files is much easier in `C++`:
```cpp
#include <fstream>
int main(){
  std::ofstream output("output.txt");
  std::ifstream params("params.txt");
  int number;
  while (params >> number) // reads until EOF or a parse failure
    output << number << std::endl;
  output.close();
  params.close();
}
```
== Code structure
=== Classes
- TODO:
- Maybe constructor
- Maybe operators? (more like nah)
- virtual stuff (interfaces)
=== Structs
- basically like classes, but with everything public by default
=== Enums <enum>
- enum vs enum class
- an example maybe
- they are useful enough to model a finite domain
=== Inheritance
#pagebreak()
= Debugging with `gdb`
It's super useful! Trust me, if you learn this everything gets way easier (you won't need `printf` debugging anymore).
First of all, use the `-ggdb3` flag when compiling the code, and remember not to use any optimization like `-O3`: optimizations make the program harder to debug.
```makefile
DEBUG_FLAGS := -ggdb3 -Wall -Wextra -pedantic
```
Then it's as easy as running `gdb ./main`
- TODO: could be useful to write a script if too many args
- TODO: just bash code to compile and run
- TODO (just the most useful stuff, technically not enough):
  - `r` (run the program from the start)
  - `c` (continue until the next breakpoint)
  - `n` (next: execute the current line, stepping over function calls)
  - `c 10` (continue, stopping only at the 10th hit of the breakpoint)
  - enter (repeats the last command)
  - `b` (set a breakpoint)
    - on lines (`b 42`)
    - on symbols (`b main`)
    - on specific files (`b main.cpp:42`)
  - `clear` (remove breakpoints)
  - `display` (print an expression after every stop)
  - `set print pretty on` (indented, readable output for structures)
#pagebreak()
= Examples
Each example has 4 digits `xxxx` that are the same as the ones in the `software` folder in the course material. The code will be *as simple as possible* to better explain the core functionality, but it's *strongly suggested* to try to add structure _(classes, etc.)_ where it *seems fit*.
== First examples
This section puts together the *formal definitions* and the `C++` knowledge to implement some simple MDPs.
=== A simple MDP `[1100]` <a-simple-markov-chain>
The first MDP $M = (U, X, Y, p, g)$ is such that
- $U = {epsilon}$ #footnote[See @mdp-example]
- $X = [0,1] times [0,1]$, each state is a pair #reft(3) of *real* numbers #reft(2)
- $Y = [0,1] times [0,1]$
// - $p : X times X times U -> X = cal(U)(0, 1) times cal(U)(0, 1)$, the transition probability is a *uniform continuous* distribution #reft(1)
- $p : X times X times U -> X = (cal(U)(0, 1), cal(U)(0, 1))$ the transition probability #reft(1)
- $g : X -> Y : (r_0, r_1) |-> (r_0, r_1)$ outputs the current state #reft(4)
- $x_0 = (0, 0)$ is the initial state #reft(3)
#figure(caption: `software/1100/main.cpp`)[
```cpp
#include <fstream>
#include <random>
#include "../../mocc/mocc.hpp"
const size_t HORIZON = 10;
int main() {
std::random_device random_device;
urng_t urng(random_device());
std::uniform_real_distribution<real_t> uniform(0, 1); t1
std::vector<real_t t2 > state(2, 0); t3
std::ofstream file("logs");
for (size_t time = 0; time <= HORIZON; time++) {
for (auto &r_i : state)
r_i = uniform(urng); t1
file << time << ' ';
for (auto r_i : state)
file << r_i << ' '; t4
file << std::endl;
}
file.close(); return 0;
}
```
]
=== MDPs network pt.1 `[1200]` <simple-mdps-connection-1>
This example has 2 MDPs $M_0, M_1$ s.t.
- $M_0 = (U^0, X^0, Y^0, p^0, g^0)$
- $M_1 =(U^1, X^1, Y^1, p^1, g^1)$
$M_0$ and $M_1$ are similar to the MDP in @a-simple-markov-chain, with the difference that $U^i = [0, 1] times [0, 1]$, so that $U^i = X^i$, meaning $p$ must be redefined:
$ p^i (x'|x, u) = cases(1 & "if" x' = u \ 0 & "otherwise") $
Then the 2 MDPs can be connected
$
& U^0_(t + 1) = (r_0 dot.c cal(U)(0, 1), r_1 dot.c cal(U)(0, 1)) "where" g^1 (X^1_t) = (r_0, r_1) \
& U^1_(t + 1) = (r_0 + cal(U)(0, 1), r_1 + cal(U)(0, 1)) "where" g^0 (X^0_t) = (r_0, r_1)
$ <mdps-connection-1>
Given that $g^i (X^i_t) = X^i_t$ and $U^i_t = X^i_t$, the connection in @mdps-connection-1 can be simplified:
$
& X^0_(t + 1) = (r_0 dot.c cal(U)(0, 1), r_1 dot.c cal(U)(0, 1)) "where" X^1_t = (r_0, r_1) \
& X^1_(t + 1) = (r_0 + cal(U)(0, 1), r_1 + cal(U)(0, 1)) "where" X^0_t = (r_0, r_1)
$ <mdps-connection-2>
With @mdps-connection-2 the code is easier to write, but this approach only works for small examples like this one. For more complex systems it's better to design a module for each component and handle the connections more explicitly.
#figure(caption: `software/1200/main.cpp`)[
```cpp
/* ... */
const size_t HORIZON = 100;
struct MDP { real_t state[2]; };
int main() {
/* ... */
std::vector<MDP> mdps(2, {0, 0});
for (size_t time = 0; time <= HORIZON; time++)
for (size_t r = 0; r < 2; r++) {
mdps[0].state[r] =
mdps[1].state[r] * uniform(urng);
mdps[1].state[r] =
mdps[0].state[r] + uniform(urng);
}
/* ... */
}
```
]
=== MDPs network pt.2 `[1300]`
This example is similar to the one in @simple-mdps-connection-1, with a few notable differences:
- $U^i = X^i = Y^i = RR times RR$
- the initial states are $x^0_0 = (1, 1), x^1_0 = (2, 2)$
- the connections are slightly more complex.
- no probability is involved
Having
$ p((x_0 ', x_1 ')|(x_0, x_1), (u_0, u_1)) = cases(1 & "if" ... \ 0 & "otherwise" ) $
The implementation would be
#figure(caption: `software/1300/main.cpp`)[
```cpp
/* ... */
int main() {
/* ... */
std::vector<MDP> mdps({{1, 1}, {2, 2}});
for (size_t time = 0; time <= HORIZON; time++) {
mdps[0].state[0] =
.7 * mdps[0].state[0] + .7 * mdps[0].state[1];
mdps[0].state[1] =
-.7 * mdps[0].state[0] + .7 * mdps[0].state[1];
mdps[1].state[0] =
mdps[1].state[0] + mdps[1].state[1];
mdps[1].state[1] =
-mdps[1].state[0] + mdps[1].state[1];