Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Adds support for fuzzing via AFL #100

Merged
merged 15 commits into from
Apr 18, 2017
Merged

Adds support for fuzzing via AFL #100

merged 15 commits into from
Apr 18, 2017

Conversation

pjsg
Copy link
Contributor

@pjsg pjsg commented Aug 24, 2016

This adds a new test (afl_test) which is not run by default, but only if you include it with -f afl_test

This reads from stdin a list of operations (like open, close, read, write, flush, delete, rename etc) and then executes those operations against the SPIFFS api.

Some non-fatal errors have now been made fatal errors (write out of bounds, write which tries to set bits, etc).

AFL can then drive this test and discover interesting combinations of operations that cause failures. The command line:

afl-fuzz -i afltests -o findings ./build/linux_spiffs_test -f afl_test

takes test files from the afltests directory and then puts the results into the findings directory. It may take several hours to find a problem.

@pjsg
Copy link
Contributor Author

pjsg commented Aug 24, 2016

In particular, the two files 100 and 200 in the afltests directory cause crash as follows:

build/linux_spiffs_test -f afl_test < afltests/100
adding suites...
adding test afl_test
1 tests added
running tests...
TEST 1/1 : running test afl_test
trying to write 06 to 61 at addr 0000042d
page 0004  DATA 8001:0000  FIN del IDX USD idl   OBJ_IX_HDR  '6file6.xxxxxxxxxxxxxx'  292 bytes  type:01
    00000400: 01800000f800000024010000013666696c65362e787878787878787878787878  ........$....6file6.xxxxxxxxxxxx
    00000420: 7878000000000000000000000061006200ffffffffffffffffffffffffffffff  xx...........a.b................
    00000440: ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  ................................
    00000460: ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  ................................
    00000480: ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  ................................
    000004a0: ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  ................................
    000004c0: ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  ................................
    000004e0: ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  ................................
Aborted
build/linux_spiffs_test -f afl_test < afltests/200
adding suites...
adding test afl_test
1 tests added
running tests...
TEST 1/1 : running test afl_test
FATAL write addr too high 00ffe03c + 00000002 > 00008000
Aborted

@pellepl
Copy link
Owner

pellepl commented Aug 24, 2016

@pjsg thanks for that Philip! I'll have a thorough look at it tomorrow, at a quick glance it looks good.

@pjsg
Copy link
Contributor Author

pjsg commented Aug 24, 2016

The only extra thing that I might do is to put together a short doc explaining how to do the fuzzing....

@pellepl
Copy link
Owner

pellepl commented Aug 24, 2016

If you find the time to do that it'd be great. Otherwise it is pretty clear from the testcase. The test stuff does not have to be as lucid and documented as the actual spiffs code.

@pjsg pjsg changed the title Adds support for fizzing via AFL Adds support for fuzzing via AFL Aug 24, 2016
@pellepl
Copy link
Owner

pellepl commented Aug 25, 2016

It all looks good to me. If you feel finished I'll merge it.

The only comment I had was about 'fizzing' which in retrospect I gather was a misspelling. Not having English as native language I'm always keen to learn new words. I googled it in hope to learn something new and useful. Something new I learnt... Useful, I hope not :)

@pjsg
Copy link
Contributor Author

pjsg commented Aug 25, 2016

Actually, I realized that I should restructure the way that this test works -- to make it easy to include specific test cases.... I'll do this tonight.

@pjsg
Copy link
Contributor Author

pjsg commented Aug 26, 2016

I think that this is done.... I checked in a couple of tests that cause problems:

fuzzer_found_1 and fuzzer_found_2

I added some code to handle what happens when there is a platform reboot -- i.e. I dump the current state of the file system and then reload it into a new file system.

@pjsg
Copy link
Contributor Author

pjsg commented Aug 28, 2016

I realized that all the failing cases were with multiple open files at the same time. I added a test case 'afl_single' which only uses a single open FD at a time. AFL still finds failures. One of these is test afl_fuzzer_single_1

@pellepl
Copy link
Owner

pellepl commented Aug 28, 2016 via email

@pjsg
Copy link
Contributor Author

pjsg commented Aug 28, 2016

The only downside to merging is that the default tests no longer pass.....

@pellepl
Copy link
Owner

pellepl commented Aug 29, 2016

Ah yes, you have a point. Well - I'll use it locally then, and fix the errors. When all is done, I'll merge this PR and directly after push the fixes. I am however a bit busy at work for the moment, so things must calm down first before I can dig into it.

@pjsg
Copy link
Contributor Author

pjsg commented Jan 24, 2017

I updated to fix a merge conflict. I also added some prettier logging in the crash cases. Typical output is:

TEST 2/4 : running test fuzzer_found_2
  open("6file6.xxxxxxxxxxxxxx", 0x1e) -> 4097
  write(4097, , 1014) -> 1014
  write(4097, , 63) -> 63
  remove("6file6.xxxxxxxxxxxxxx")
  close(4097)
  open("6file6.xxxxxxxxxxxxxx", 0x1e) -> 4097
  write(4097, , 1014)FATAL write addr too high 01fff00c + 00000002 > 00007000

My feeling is that removing a file while the file is open is probably the issue here....

I'm not enough of a spiffs expert to be able to say whether the remove should have been allowed or not.

@pjsg
Copy link
Contributor Author

pjsg commented Mar 17, 2017

The issue appears to have been fixed. This PR can now be merged and doesn't break the tests.

@pjsg
Copy link
Contributor Author

pjsg commented Apr 17, 2017

@pellepl Since the tests now pass, can we just get this merged so that it doesn't rot......

@pellepl pellepl merged commit bd5bdd9 into pellepl:master Apr 18, 2017
@pellepl
Copy link
Owner

pellepl commented Apr 18, 2017

@pjsg of course, sorry for the long delay and thanks for the efforts. Busy times! :)

@pellepl
Copy link
Owner

pellepl commented Apr 19, 2017

@psjg. Odd. Test 81 and 82 breaks on my local computer. Works on yours? 82 goes fatal so everything shuts down.... Ah. Error in test_spiffs.c. exit(0) on out of bounds write. That's probably why build is passing...

@pjsg
Copy link
Contributor Author

pjsg commented Apr 21, 2017

I just pulled the branch and checked out master and this is what I get:

TEST 82/84 : running test fuzzer_found_2
  open("6file6.xxxxxxxxxxxxxx", 0x1e) -> -1
  remove("6file6.xxxxxxxxxxxxxx")
  open("6file6.xxxxxxxxxxxxxx", 0x1e) -> -1
  .. ok
  free blocks     : 6 of 7
  pages allocated : 0
  pages deleted   : 0
  gc runs         : 0
  cache hits      : 37 (sum 37)
  cache misses    : 7 (sum 7)
  cache utiliz    : 0.840909
  RD:          2 reads          90 bytes         45 avg bytes/read
  WR:          0 writes          0 bytes          0 avg bytes/write
  fs consistency check output begin
   check: LOOKUP DELETE PAGE 0001
   check: LOOKUP ERROR -1
   check: INDEX  DELETE ORPHANED INDEX 0002
   check: INDEX  ERROR -1
  fs consistency check output end
  locks : 0
TEST 83/84 : running test fuzzer_found_3
  open("6file6.xxxxxxxxxxxxxx", 0x1e) -> -1
  open("6file6.xxxxxxxxxxxxxx", 0x18) -> -10002
  open("6file6.xxxxxxxxxxxxxx", 0x1e) -> -1
  .. ok
  free blocks     : 8 of 9
  pages allocated : 0
  pages deleted   : 0
  gc runs         : 0
  cache hits      : 1 (sum 1)
  cache misses    : 109 (sum 109)
  cache utiliz    : 0.009091
  RD:         92 reads       11693 bytes        127 avg bytes/read
  WR:          0 writes          0 bytes          0 avg bytes/write
  fs consistency check output begin
   check: LOOKUP DELETE PAGE 0003
   check: LOOKUP ERROR -1
  fs consistency check output end
  locks : 0

I hope that it isn't compiler dependant. I'm using gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC) which is quite an old version.....

Ah -- I just tried on a different system, and I get:

TEST 82/84 : running test fuzzer_found_2
  open("6file6.xxxxxxxxxxxxxx", 0x1e) -> 4097
  write(4097, , 1014) -> 1014
  write(4097, , 63) -> 63
  remove("6file6.xxxxxxxxxxxxxx")
  close(4097)
  open("6file6.xxxxxxxxxxxxxx", 0x1e) -> 4097
  write(4097, , 1014)FATAL write addr too high 01fff00c + 00000002 > 00600000

@pjsg
Copy link
Contributor Author

pjsg commented Apr 21, 2017

Argh! My bad entirely. I had an uncommitted change (I was trying to simulate random power-offs) which made these tests pass......

I checked out a clean copy on my original machine and it still fails... Sorry.

@pjsg
Copy link
Contributor Author

pjsg commented Apr 22, 2017

When I look at the implementation, I'm not even sure that I know what ought to happen in this case.

The issue appears to be that the sequence open/write/remove/close ends up corrupting the file system. There are two versions of _remove, once which takes a name and the other an open file handle. This clearly envisages calling it when there is an open file.

In a POSIX file system, the remove would just orphan the existing file (which would continue to exist) until the last person closes it. At this point it would be deleted.

On windows, AFAIR, you can't delete an open file (I'm not sure if you can rename it either).

For SPIFFS, it isn't clear which model it is trying to adopt (and it might be a third model).

I suspect that the simplest implementation is the windows model where you just return an error code on removes/renames if the file is open. Of course, this means that the _fremove will always return an error.

If I change the fuzzing code to not try and remove (or rename) a file which is open, then the test 82 passes (as it skips the removes). Test 83 still fails:

  open("6file6.xxxxxxxxxxxxxx", 0x1e) -> 4097
  open("6file6.xxxxxxxxxxxxxx", 0x18) -> 4098
  write(4098, , 581) -> 581
  write(4098, , 2007) -> 2007
  write(4098, , 2007) -> 2007
  write(4098, , 79) -> 79
  close(4097)
  open("6file6.xxxxxxxxxxxxxx", 0x1e) -> 4097
  write(4098, , 2007) -> 2007
  write(4097, , 1943)FATAL write addr too high 007fc0fa + 00000002 > 00600000

I suspect that this is a similar issue -- the third open of the file has O_TRUNC set and I'll bet that that confuses something. Maybe there should be a restriction that each file can only be opened once at a time.

@pjsg
Copy link
Contributor Author

pjsg commented Apr 22, 2017

If I implement those restrictions (in the test layer), then I can still provoke a crash with the use of a simulated power cycle (not in the middle of writing, but between calls into SPIFFS). This is true even if I call spiffs_check after mounting....

 open("4file4.xxxxxxxxxx", 0x1e) -> 4097
  write(4097, , 1983) -> 1983
  write(4097, , 1983) -> 1983
  write(4097, , 1983) -> 1983
  write(4097, , 1983) -> 1983
  write(4097, , 1983) -> 1983
  close(4097)
  open("6file6.xxxxxxxxxxxxxx", 0x1e) -> 4098
  write(4098, , 1983) -> 1983
  unmount and remount
  forcecheck() -> 0
  open("4file4.xxxxxxxxxx", 0x18) -> 4097
  write(4097, , 127) -> 127
  close(4097)
  open("4file4.xxxxxxxxxx", 0x18) -> 4098
  write(4098, , 581) -> 581
  close(4098)
  open("4file4.xxxxxxxxxx", 0x1e) -> 4098
  write(4098, , 63) -> 63
  close(4098)
trying to write 1c to 17 at addr 0000322d

@pjsg
Copy link
Contributor Author

pjsg commented Apr 22, 2017

I've dug into this last case, and I think that this is a result of a bad interaction with GC happening at a bad time. The file system is almost full and it first writes the 63 bytes to data to page 17, and then it does some more GC and now has the 63 bytes in page 1c and tries to write that. This doesn't work.

I tried turning off caching, but it doesn't make any difference (except that the error then happens as part of the write() call rather than the close() call).

@pellepl
Copy link
Owner

pellepl commented Apr 23, 2017

I checked out a clean copy on my original machine and it still fails... Sorry.

No problem, I tend to have a mess with my branches my self :)

In a POSIX file system, the remove would just orphan the existing file (which would continue to exist) until the last person closes it. At this point it would be deleted.
On windows, AFAIR, you can't delete an open file (I'm not sure if you can rename it either).
For SPIFFS, it isn't clear which model it is trying to adopt (and it might be a third model).

Umm, that is a good question. I honestly cannot remember, I'll need to scrutinize the code regarding that. I do remember I (though) solved multiple modifications on same file, however caching might mess things up.

I've dug into this last case, and I think that this is a result of a bad interaction with GC happening at a bad time.

I'll have to check this in depth. With a stroke of luck, I might get some time actually doing it soon also. I do recall looking into one of them some time ago - might have been the one you refer to. That particular test had a really on-the-edge config with only four blocks. I am not even sure spiffs should support that. Perhaps it should reject such a mount instead.

The fuzzing tests are a great addition, and I really want to keep them. However, perhaps we should temporarily #if 0 them until it is sorted out - what do you think? I totally I agree I should get my thumbs out and fix it asap, but in the mean time ... :)

@pjsg
Copy link
Contributor Author

pjsg commented Apr 23, 2017

I'd be happy making those two tests non-default tests....

I'll submit a PR to do that....

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants