
Rewrite of file header slows down as file size increases #27

Open
rocuh opened this issue Feb 12, 2018 · 9 comments


@rocuh

rocuh commented Feb 12, 2018

I'm curious whether my understanding of how littlefs works when "overwriting" existing data at the start of a file is correct. For example, rewriting a header at the start of a largish file, e.g. something like the following:

    elcofs_lseek(testFd, 0, SEEK_SET);
    elcofs_write(testFd, headerData, sizeof(headerData));

What currently seems to occur is that if you rewrite the beginning of a file, the entire file gets rewritten, specifically in lfs_file_flush:

    while (file->pos < file->size) {
        // copy over a byte at a time, leave it up to caching
        // to make this efficient
        uint8_t data;
        lfs_ssize_t res = lfs_file_read(lfs, &orig, &data, 1);
        if (res < 0) {
            return res;
        }

        res = lfs_file_write(lfs, file, &data, 1);
        if (res < 0) {
            return res;
        }

        // keep our reference to the rcache in sync
        if (lfs->rcache.block != 0xffffffff) {
            orig.cache.block = 0xffffffff;
            lfs->rcache.block = 0xffffffff;
        }
    }

Is that expected behavior, due to the need to rebuild the reverse linked list of blocks? I just want to be sure this is expected before I rework my code to put the header data in a separate file.

@geky

geky commented Feb 12, 2018

Good detective work; you are completely correct. littlefs does not have good performance when it comes to random writes. The underlying copy-on-write (COW) structure is append-only, so littlefs needs to write out the entire file if a header at the beginning is modified.

This was a tradeoff for code size, since a more complicated COW structure would likely require more code to implement. (Though if there's a better small-footprint COW structure out there, I'd be excited to know.)

How large are your files? If they're <4KiB (1 erase block) it may not be worth optimizing, since most filesystems have to rewrite the block.


Does the header get rewritten more often than the data? If you're looking for a workaround, you could put the header at the end of the file; littlefs would then be able to reuse the file's prefix without copying:

    elcofs_lseek(testFd, -(long)sizeof(headerData), SEEK_END);  // cast before negating; sizeof is unsigned
    elcofs_write(testFd, headerData, sizeof(headerData));
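
Reading the header back on open would be the mirror image (elcofs_read is assumed here by analogy with the wrappers above):

    elcofs_lseek(testFd, -(long)sizeof(headerData), SEEK_END);
    elcofs_read(testFd, headerData, sizeof(headerData));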

Though I don't know if this is too littlefs-specific if you're planning on targeting other filesystems as well.

@rocuh

rocuh commented Feb 12, 2018

Yes, like you said, putting the header at the end is another solution.

Basically the file is a database of readings (worst case up to 10000, typically much lower) that are incrementally added at a worst-case rate of 1 or 2 readings a second (typically much slower), where a reading is around 24 bytes. The header is incremental statistical information (amongst other things) on those reading values, so that when the file is reopened the statistics can continue to be updated (fast resuming is required).

Thanks for your input; either the separate-header-file or the header-at-end option looks workable. From tests on our NOR flash, performance is fine and remains fairly consistent when using a separate header file.
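
For reference, a minimal sketch of that separate-header-file pattern against the raw littlefs API; the file names, header layout, and the append_reading helper are all illustrative, not part of littlefs:

    #include <stdint.h>
    #include "lfs.h"

    // Illustrative header: running statistics that must survive a reopen.
    struct header {
        uint32_t count;  // readings written so far
        int32_t min;     // running minimum
        int32_t max;     // running maximum
    };

    // Append one ~24-byte reading to the data file, then rewrite the small
    // header file. The header file fits in a single block, so only that
    // block is copied, independent of the data file's size.
    int append_reading(lfs_t *lfs, const void *reading, lfs_size_t size,
            const struct header *hdr) {
        lfs_file_t f;
        int err = lfs_file_open(lfs, &f, "readings.bin",
                LFS_O_WRONLY | LFS_O_CREAT | LFS_O_APPEND);
        if (err) {
            return err;
        }
        lfs_ssize_t res = lfs_file_write(lfs, &f, reading, size);
        err = lfs_file_close(lfs, &f);
        if (res < 0 || err) {
            return (res < 0) ? (int)res : err;
        }

        err = lfs_file_open(lfs, &f, "readings.hdr",
                LFS_O_WRONLY | LFS_O_CREAT);
        if (err) {
            return err;
        }
        res = lfs_file_write(lfs, &f, hdr, sizeof(*hdr));
        err = lfs_file_close(lfs, &f);
        return (res < 0) ? (int)res : err;
    }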

@rocuh rocuh closed this as completed Feb 12, 2018
@rocuh

rocuh commented Mar 3, 2018

Could the linked/skip list of blocks in a file be held in separate metadata blocks rather than in the data blocks? I suppose the tradeoff, besides additional complexity, would be that if you are just appending data to a file there would be 3 erases/copies (data block, file metadata block, directory metadata block) instead of the current 2.

Random writes would be improved, as you would not have to rewrite the later data blocks when you modify an earlier block. I suppose with a large enough file you would have to do something similar for the file metadata blocks, as they would have to be stored in a similar linked/skip list, but it would still be much faster.
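
To make that concrete, a purely hypothetical layout for such an indirect metadata block might look like the following. This is only a sketch of the proposal, not littlefs's actual on-disk format:

    #include <stdint.h>

    // One metadata block holds the file's block list; data blocks carry no
    // pointers. Rewriting an early data block then costs one data block
    // plus one metadata block, instead of every block after it.
    #define BLOCK_SIZE          4096
    #define POINTERS_PER_BLOCK  ((BLOCK_SIZE - 8) / 4)

    struct file_metadata_block {
        uint32_t next;   // next metadata block, for files too large for one
        uint32_t count;  // number of valid entries in blocks[]
        uint32_t blocks[POINTERS_PER_BLOCK];  // data blocks, in file order
    };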

@geky

geky commented Mar 12, 2018

Sorry for the late response. Feel free to reopen closed issues even if it's just for an idea/question : )

I think you pretty much nailed the tradeoff in your comment.

It should also be noted that the inline list actually does a lot to simplify the logic around writing files. If you had separate blocks for metadata and data, you would need to buffer both the metadata and data blocks, or jump between the two with rewrites when your program size is larger than a pointer. At the moment I can't think of a good way to handle this without 2x RAM usage per file.

Other notes:

  • Separate metadata blocks are a tradeoff of complexity for improved random writes.
  • You could also have a special case for single-block files, so at least small files don't pay for the extra metadata block.
  • With indirect metadata blocks you could also drop the CTZ skip-lists, since files would need to be >8MiB before the skip-list begins helping (see the sketch below).
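
For context on that last point, here is a sketch of the pointer count in the CTZ skip-list, based on the littlefs DESIGN notes (__builtin_ctz is GCC/Clang-specific):

    #include <stdint.h>

    // In the CTZ skip-list, data block index n carries ctz(n)+1 back
    // pointers (block 0, the head of the list, carries none), so a
    // backwards seek can skip up to 2^ctz(n) blocks in one hop.
    static inline uint32_t skiplist_pointers(uint32_t n) {
        return (n == 0) ? 0 : (uint32_t)__builtin_ctz(n) + 1;
    }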

@geky geky reopened this Mar 12, 2018
@sn00pster

Is this true for any write?

Let's say I have a 512KiB file, and I position the file back 4KiB and write 512 bytes. Does this cause a copy of the entire file, or just of the blocks after the point of the rewrite?

@geky

geky commented Apr 19, 2018

Just the blocks after the point of rewrite. So in your case that's 4KiB that will need to be written out (aligned to the nearest erase block minus the ~8B of skip-list overhead per block, so maybe a bit more).

In the general case, a random write copies everything after the write point, which is half the file on average, so littlefs is only ~2x better than a filesystem that writes out the entire file on every seek-and-write.
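
A back-of-envelope version of that cost, with the per-block pointer overhead simplified to a constant 8 bytes (a sketch, not littlefs internals):

    #include <stdint.h>

    // Approximate bytes copied when rewriting at offset `off` in a file of
    // `size` bytes: everything from the block containing `off` to the end
    // of the file.
    static uint32_t rewrite_cost(uint32_t off, uint32_t size,
            uint32_t block_size) {
        uint32_t data_per_block = block_size - 8;   // simplified overhead
        uint32_t first_block = off / data_per_block;
        return size - first_block * data_per_block;
    }

    // e.g. rewrite_cost(508*1024, 512*1024, 4096) == 5112 bytes: roughly
    // the last 4KiB plus alignment, not the whole 512KiB.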

@sn00pster

Sweet, that's what I wanted to hear!

I've just got littlefs running in our boot loader. I've obviously had to remove mass storage, which means I've had to add our file transfer protocol into it; I'm just testing now.

Then I need to add support into the actual device firmware to detect whether it's littlefs or FAT and use the appropriate filesystem drivers. Loving littlefs!

@geky

geky commented Apr 19, 2018

Very cool! Glad to hear it's working for you!

@Aircoookie

First of all, huge congrats to @geky and the other maintainers of LittleFS!
It is the best FS out there for small embedded platforms when it comes to read speed and reliability.

Sorry for "reviving" a 2-year-old issue, but given that this behavior is still current and my issue is the same, I feel it is more appropriate than opening a new issue.
Like perhaps many others, I'm migrating from SPIFFS and am really glad about the improved read speeds; I can comfortably say LittleFS is the best out there for most use cases!

The Achilles' heel of this FS is unfortunately random writes: having to copy almost the entire file when only a single byte is modified at the start or middle of the file is not acceptable in a lot of use cases (see also esp8266/Arduino#7271).
The proposed workaround of splitting into multiple small files is also not suitable for everything, as reading and concatenating all these files on every read is prohibitively expensive.

Random writes have more than just one drawback:

  • With an ESP8266, quite a popular platform, I can achieve write speeds of roughly 40KB/s. If it is a design goal that a file operation normally shouldn't block other code for more than 1 sec, that limits the possible file size when random writes are needed to roughly 40KB.
  • Despite the wear leveling, having to rewrite that many blocks can cause quicker wear if a file is updated frequently.
  • I haven't verified this, but logically it would seem that a randomly writable file must not be bigger than the currently free blocks (half the FS, if there are no other files) to leave room for the copy.

Even given the associated challenges, implementing the proposed separate metadata blocks would make LittleFS practically usable for all purposes and would be highly welcome!
If that sort of thing can be implemented in a backwards-compatible way, there could even be a per-file option for whether to use separate metadata blocks, so that append-only uses (logfiles etc.) would not have to take the performance hit of the third block write!

Another option, simpler to add and even less demanding, would be to optionally allow in-place block updates when the number of blocks that would need to be rewritten exceeds some threshold x.
This would however jeopardize the design goals of both power-failure resilience and good wear leveling, so the above solution would of course be preferred.
Still, for applications where power loss during a write is very unlikely (e.g. all writes are user-initiated rather than scheduled), it would be preferable to the file close operation taking several seconds after each write.

Thank you once again for the amazing filesystem!
