-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy path30c3-5542.txt
468 lines (335 loc) · 19.9 KB
/
30c3-5542.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
Here, the subtitles for talk Strange machines and reflections are supposed to be created
Revisiting "Trusting Trust" for binary toolchains
(en)
sergeybratus, Julian Bangert,bx
Saal 2
Link and further information can be found here: https://events.ccc.de/congress/2013/wiki/Static:Projects
or: www.twitter.com/c3subtitles (most up to date infos)
The language is supposed to be:
[ ] German
[ X] English
(the orignal talk-language)
Amara Link: http://www.amara.org/de/videos/3mLEkbrxB4aO/info/
-------------------------------------------------------------------------------------------------------------
Hello, it's great to be here. we're going to build on our previous year's talk, which some of your may have heard. if not, we'll catch you up. I'm sergey, we're from darkmouth. Julian was from darkmouth as well, but he defected. Traitor.
Okay. So, this is what we're going to talk about. Imagine, if you will, that we could make bugs dissappear, magically, for some definition of dissappear and some definition of bugs. Would we still be able to, will we be able to build chains of trust that wouldn't suck so much as they do now.
We're going to explain why not. and why the standard practices of constructing chains of trust these days have to change before we can actually trust them. if you want a tongue twister to go and take home, "bugless bable breaks badly". because this is going to be about the tower of babel and the problems that you get when you have things disagreeing on components of you your trust chains disagreeing on dialects. And we're going to show you some case studies. In particular, l-signing and a bit of macao(?). And we're going to show you how linux kernel disagrees with the linux loader about what is where in the loaded image. isn't that fun?
i'll stuck with theory, and then hand it to bx and then he'll hand it to juli. so 1984, ken thompson and his famous "reflections on trusting trust", almost
30 years ago, 29, but that's an off-by-one, came up with this maxim, that
you can't trust code that you did'nt completely write yourself. he planted
a bug in the compiler, and the compiler, having compiled whatever was
built with that trust chain, propagated the back door into everything
compiled. and you shouldn't really trust your compiler.
He went on an predicted the future and by now evertything that he predicted
came true. including well placed microcode bugs. And really these days
you'll look at software and wonder which software doesn't have planted bugs,
right?. Certain orgainsation has something to do with it, but honestly 30
years ago i never imagined that i ever encounter a deliberately planted bug
and here we go.
what if we could wave a magic wand and make it all go away. what if we could
complete trust the author. and what if the author could completely trust him
or herself to have the code do exactly what the author intetded. would we
solve trusting trust then? and the answer: hell no! and the reason is that
trust chains are built up and up and up and up and all the components of
those trust chains must talk to each other and agree on what they're saying
and agreeing on what they're interpreting. this is what software engireering
pretty much looks today, right? if you need the narrative [unverständlich]
reminder, people decided to build a tower to reach the sky. they, the
powers-that-be did not like it, so they make the dialects of thebuilders
divert. nad suddenly one builder asked for rope, he was instead handed a
saw. or somebody said 'give me a saw', and the other heard 'your father was
a donkey'.
As you can imagine the construction didn't go well afterwalrds. if we were
to explain that with cats, the components of the trust chain must speak
precisely the same dialect of the inputs or else you're going to have, well,
we're going to call this the bable-cat, the ceiling which is over your moral
health. the babel-cat is in you rtrust chains.
well, let's make this a bit more theoreectial before we get to the practice.
what is a chain of trust? a chain of trust is really a chain of execution
environments of increasing power. you start at boot, you execute some code,
you load your os, you execute some code. and eventually you'll get to the
executable that you want to run, construct a process out of it and run it.
and hopefully throughout that entire process no unexpected computation,
otherwise known as 'pwnage', occurs. it's the same code, it's the same bytes
that are being interpreted by a chain of parsers and handlel by a chain of
loaders. and so the meaning of those bytes has to be completely agreed upon
between all of those trust links.
now, when i say trust, what exactly do i mean, right? how do we trust, why
do we trust things, why do we trust bytes to run? there are two approaches
to this. you cannot analyze turing complete binary code. what is of course
that you compile down to. and so instead you sign it. you trust the process
of your developers, you trust their good intentions, you trust their secops
and so on. so you get this frozen bit of code and you hope that it would be
immutable(?). or ther's another way. you can actually check the code, do
static analysis of it and convince yourself that the execution of it would
be exactly as intended. we can do that as well. and then you can trust it
because youv'e analyzed it. it doesnt work with generic code, no matter
what anti-virus vendors will tell you how many times a day they will solve
the halting problem. you know.
so let's see which kinds of trust are there in trust chains. and the answer
is 'both', right? you freeze some bits in your binary but you also want to
do composition.
[stoped at 8min19s of the official recording]
You also want to have signed code,
in the end your trusted binary ends up being this sausage thing
it tells you where these trusted bits are
enter stage right, tables
it decides where.. signatures are.
tables are simple
it's full of ofsetes,
youcan put data about how to relocate the code, how to patch the code
it makes aslr possible
any input is a program,
any table is a program
periodic table, this is prbably the top of my gfx abilties
it's atable that describes the elements of the trustes sausage
I could say ABI
there are pieces of code, pieces of logic baked into the processor
they are in your, loader, kernel, RTLD
they consume the talbes just as your processor consumes it input
hopefully it's simple enough that your computation doens't run away
tables are something of a byte code.
how you locate the signed pieces, you are doing arthinmetic, lops, brankces
you get a turing machin
any input is a program as a matter of princicple
any complex data is indistinguasjble from bytcode driving a virtual machine
a parser code for suff complex input is indist from a cplex machine running its bytecode
so what can go wrong
your parser may be buggy
your parser may overflow
may be corrupted
some other logic will take it and run away with it
the input can be completely well formt. but so complex there is no predicing how it will do
relocatens sumbo tables have that
different pices of the trust chains vwill disagree.
the three cats of the apocalypse
the two cats who get the gold star and the red start
input and parser differentials
letting the turing cat out of the bag
and the cat turns out to be quite an impressive capt
medeival writes knew their cat
there is your runtime after boot
here is the original Ken Thompson plant
then comes the loader, the linker, the relocator and then of course there is memory management
..
these pieces that have the gold stars are turing complete on their inputs.
you can compile any computation into those tables,
the favorite is the weird machin in the page fault handling logic,
it not only reads from memory but it writes to meory itself
keep your processor alternating between the page fault and the double page fault
you continue getting page faults you can still compute anything you like and write interesting patterns into meomody
ulia wrote the compiler for both
some academic has had an unnerving sense of taste.
the relocating engine that build program in meory, is actually turing complete
The RTLD and the loader can be made to disagree what isn't a program
see different program
The second kind of cat.
first we gave talk on hacker confierence.
why would you know what is inside your machine, when you can solve interesting problems like having a graph looking like
finally published at woot after having it rejectd from som high an mighty venues
tables are programs, they interpret their ...
there are bad things in the input, but you cans suppres them byg checking
this is wrong.
validation is teh same as verificdation as that data causes computation
validation of tables are
even in the absence of bugs you can have complex, impossible analyisi
you will solve teh antivirus vendors halting problem
let's look at the specifics of the machines that execute tables
there we go
prepared students
code signing is being worked into the trust chain more and more
we see it popping up in open sourece, kernel modules, driver # windows
it is used for trust evidence
this binary has integrit, unchanged. since developer handed it to you
have some attrubition, the binary came from these people
it is easy to implement poorly, it's not an algorithm,
so many components involved, in order to have a good amount of trust, you need proper key mgmt,
the program that is loaded in memory is the one we really want to trust, not the one on disk
program in meory is only infuenced by the program on disk
does signing give us what we wanted
we feel that we can trust it because tehre is some kind of integrit
there are many "mahines" involved
we end up with some sort of rube goldberg machine
so there's maby different componets working together.
we have to think about it. when considering code signing, a bunch of questions come to mind
are the machines correct, bug-free, do we understand what they are capable of
making the loader do something
two machines operating on the same data, do they agree.
are the tables correct and complete.
can we trust the tranformations done to the process loaded into memry after the static analysis,
another question is enforecemetn, how and what do we want to enforce.
I've looked at may elf code signing mechanisms and hope we can learn somthing
we have validator machines, a lot working together.
looking at different code signing for userland and kernel binaryes
creative names for code signing.
Had trouble telling them apart.
writtein 2003-2005 and are obsolete.
a resurgence of work in the kernel module area
questions I want to tevisit, are machines in codesigning correctly implemented.
in Mach0 XML is impletentein signing.
In ELF I saw everything was written in C
one thing but one, one thing was written in C++
I'm sure there won't be any other bugs because of taht
how powerful are our machines
things that look like tables, we found to drive unexpected computation
we need to understand what codesigning is sable to door.
this is old worki with a backdoor in ping
the same backdor in Mach-O
nothing was changed or added and removed to the file
I would argue that hiding computation in metasdata, you can see it but can be difficult.
two different interpreters parse it differntly
codesigning for elf implmentations. be trusted?
In this example from bsign I love this example
I don't recall why we need to check more than teh elf header
the intersting pard is the author thought we cans stop here and be ok
I see this as a red flag, we need to think more when developing code signing
in pares differentials ,
this pariticular asn1 ...
ELF has multiple ways to interpret it
there are multiple ways to load an elf
ifyour signature is in the same section.
A proposed patch looking for the signature section looking for the first section and looking for the correct name and .
Skip over the sign section. I can give other sections the same name. is there anything preventing me from inserting nw sections
signing the singatures, who watches the watchers, this segment
A little bit about mach o code signing.
Had our graphic designer
Macho-o codesignign blob, ther'es a bunch of metadata
internal requirements, what I wrote here is a text interpretation
these are encoded into a bytecode w
ther'es a bunch of hashes and all these different components that must work together and so much room for parsers to disagree.
Everything that's signing, relocating,
you cannot trust code taht you didn't totally load yourself
and now for the third part we bring out the thin cat
that elephant is parser differntials,
histioricall there have been issues like x509,
issues with androids code signing and master key avuln
one implmented in C and one implemented in java
signature matches on install, but you can run something completely different when executing.
I'm going to talk about it happeins in userspace
How may elf parsers are involved in typical program exec
there tends to be two.
elfoader int the kernel binfrmt elf and userland ld.so library
all of these parsers have to see the same sections, and rely on the same metadata to enorce this trust.
they do not share any code or formal grammar to .
in particular the PTLOAD entry, it is there to map the byaes from the elf file into me..
So other parts of the tool chain.
unless you look at them as automaton it is unlikely you will catch all these bugs
This is part o the linux kernel 3.4 there is no grammar, there is no bnf, just randomly throing aaround pointers and..
the typical layout of the elf header is folowed by the program header table, describes how it is to be loaded into memory
I performes one memory map,
it finds out if it's dynamically linked
it loades teh binary from .. special section
then to tell deh loader whee to start parsing, it has to give it the virtual address
the userpacle loader is already mapped into the address space
it does that b lokoing at the address of first ptload and adding the offset to that address.
this is where the loader will look and begin parsing
it turns out adding a file offset to ..
it always places program in first segment and works fine
you can add virtual adress to file offsets as you please
if you place it somehwere else in memory, the kernel
If you control the program memory you can map arbitra memory to arb adresse, no other tool will see it as surch
which is nice if you want to hide a program.
we place the program header table at the end fo the file
if you craft it exaclty right, ..
if you have one of .. code sign g schemes you can point it into middle of signing
in linux you can execute a library
it give you an email where you can bug them about bugs.
it passes control to separate parser.
when you load it as a shared library the kerel never touches it.
which loads it by ld.so which gets a consistent and completely different view.
I hope the demo wokrs correctly
it's straigtforawrdm it calles anohter function, which prints hello world
in order to demonstrate this library,
i
it's loading libgood,so and libpoc.so everything is perfectl yfind
when we execute libpoc direclyt it has a shell,
if yo're running on debuan you might want to figure out what is going on
it's only libgoot, lets look into libgoot, it does not even metion shell how does it for somthing
..
you'll notice that if you take the pain of lokoing at the maps it points to libevil.,so
If I want to load another library I could lok at the tricks used by..
for having a clear understanding of ..
if you run this on a sane system it will tell you it links libevil and not libgood.
the debugging tool will not .
feel free to play around, I'm sure there are many pother parsing ambiguities in ELF.
I will hand it back to sergey
as I metioned loading a shared library is just a trick from PFC?
you can rewrite your binary at will
to cover the last point
we have used relocation entries to .
reloc entries are powerful they are there for editng loaded coad, pachitn rewriting address,
that's cheating right
how about just symbols. dnamic symbol table
I can pretend all of your program is the globa offset table
GOT is something your linkr writes into
you want printf, your first jump will be into the stub thtough the GOT, pointing at that stub
that stub brings you into the dynamic loader, brings you into ..
your next jump, call to pring gost straigt to the library
my enitre program is the GOT, it will take a large symobl table
so to conclude, chains of trust is something taht we absoltely want to have
all those littel steps, links in them andyhitng involving finding pieces we trust
peices we trust conating evidemce, say crypograph. evidence are as importan as s original development porcess.
even though you have no bugs there, you will likely have disagreem in the links of the chain of trust
you will also have powerflu softare engineering hacks, that can rewrite your pgram and deprive you of the ability to trust your poggram
you can start signing everything, but composistion is the sould of software engeenring so you will find it ncessary to combine ..
sure emory corruption is fun, sure shellcode is loverly, but parser differentials are equally important
complex data formats that don't allow checkinf for first order of trust kill trust.
diversion of dialects kills trust
there can be no chain of trust in Babel
all those cats building up balbes toswer, not a good software engineering practis
chains fo trust, when do we want them
NOW right.
simple formats that take away compultaniall power fomr the..
starting with a boot , we hat a look of the top of the trust chain, how about the bottom
EFI how simple is that
you dhould not ru
the installer is an linux installer and what it sees is not what the java crypot ses.
what good is a signature mister anderson if you can not even see the document.
It's all about computation, all about input begin program
you ahve to hobble the power hese can induce in your code
one of them is caled elfbac. we will be relasing code next year
julian is the main develper
that breaks bx's weird machins
it depriveds the reloc automaton of some of the
new hardware that will contain our parsers
bugs are not a necess. for getting pwned
the parses will not see the same thing
we will hold an academinc workshop on this thing
finally we are going to have a workshop where thiese topics will be considered.
may 18th you will find the CFP at this URL
dedicated to the memory of Len Sasserman? (sp) who togheter with his wife invented this
topic
topics. language ambiguity, parser difeerentiols, machines who run away on their input
with that I thank you
[applause]
so we have littel time for QA
please line up at the microfphones.
if you leave please tak your trash with you
please put all teh bottles at the create
we have a q from internet
If you could implement relocation without turing complete semantings.
rpt
well yes and you should
you should imol linking and otehr sptes wihtout turing completenes.s
if you have accidental or designed turing co,pleteess you deplrive yourfself of the ability of checing what result is other than runnning ti, alsso known as halting
this is a very good question
we see turing completeness being sowed everyhwe
downloadin a file from dropbox e have ot run javacript
I need to run javscript why
html 5 togheter by ccss can be tricked int torung completeness.
It's actually proven turing complete, you can find a paper in te links section
we are surrounded by unceccasyr Truring comple, it's tehre to .. the atacer
computation is also a privelige, it is given to the attacher. TC where it doesn't belong, TC on boundaries where data arrives is a relaly bad ide.
Have I responded?
Mike 1 quickly
Did I understand, teh tales to rewrite the code, are they part or not part of the signing process.
many schemse do not signe metadata
hooray
[applause]
the answer is again, we've been akse BX ahs done this for elf PE is acousin of elf with all sots of poperties
mach-o does reloc with bytecode, dedicated bytecode, not just c structs like ELF
that is actual bytecode, teh VM is uncluded for it. turing complete as well
our time is up. thank you very much
If you want to give feedback do so in teh ...FOEBUD? system
and reduce your computational privilege of power.
,