
Rewrite of file header slows down as file size increases #27

Open
rocuh opened this issue Feb 12, 2018 · 9 comments


@rocuh

rocuh commented Feb 12, 2018

I'm curious whether my understanding of how littlefs works when "overwriting" existing data at the start of a file is correct. For example, rewriting a header at the start of a largish file, e.g. something like the following:

    elcofs_lseek(testFd, 0, SEEK_SET);
    elcofs_write(testFd, headerData, sizeof(headerData));

What currently seems to occur is that if you rewrite the beginning of a file, the entire file gets rewritten, specifically in lfs_file_flush:

    while (file->pos < file->size) {
        // copy over a byte at a time, leave it up to caching
        // to make this efficient
        uint8_t data;
        lfs_ssize_t res = lfs_file_read(lfs, &orig, &data, 1);
        if (res < 0) {
            return res;
        }

        res = lfs_file_write(lfs, file, &data, 1);
        if (res < 0) {
            return res;
        }

        // keep our reference to the rcache in sync
        if (lfs->rcache.block != 0xffffffff) {
            orig.cache.block = 0xffffffff;
            lfs->rcache.block = 0xffffffff;
        }
    }

Is that expected behavior, due to the need to rebuild the reverse linked list of blocks? I just want to be sure this is expected before I rework my code to put the header data in a separate file.

@geky

geky commented Feb 12, 2018

Good detective work; you are completely correct. littlefs does not have good performance when it comes to random writes. The underlying copy-on-write (COW) structure is append-only, so littlefs needs to write out the entire file if a header at the beginning is modified.

This was a tradeoff for code size, since a more complicated COW structure would likely require more code to implement. (Though if there's a better small-footprint COW structure out there, I'd be excited to know.)

How large are your files? If they're <4KiB (1 erase block) it may not be worth optimizing, since most filesystems have to rewrite the block.


Does the header get rewritten more often than the data? If you're looking for a workaround, you could put the header at the end of the file; littlefs would then be able to reuse the file's prefix without copying:

    elcofs_lseek(testFd, -(long)sizeof(headerData), SEEK_END);  // cast before negating; sizeof is unsigned
    elcofs_write(testFd, headerData, sizeof(headerData));
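
Reading the header back on open would be the mirror image (elcofs_read is assumed here by analogy with the wrappers above):

    elcofs_lseek(testFd, -(long)sizeof(headerData), SEEK_END);
    elcofs_read(testFd, headerData, sizeof(headerData));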

Though I don't know if this is too littlefs-specific if you're planning on targeting other filesystems as well.

@rocuh

rocuh commented Feb 12, 2018

Yes, like you said, putting the header at the end is another solution.

Basically the file is a database of readings (worst case up to 10000, typically much lower) that are incrementally added at a worst-case rate of 1 or 2 readings a second (typically much slower), where a reading is around 24 bytes. The header is incremental statistical information (amongst other things) on those reading values, so that when the file is reopened the statistics can continue to be updated (fast resuming is required).

Thanks for your input; either the separate-header-file or the header-at-end option looks workable. From tests on our NOR flash, performance is fine and remains fairly consistent when using a separate header file.
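
For reference, a minimal sketch of that separate-header-file pattern against the raw littlefs API; the file names, header layout, and the append_reading helper are all illustrative, not part of littlefs:

    #include <stdint.h>
    #include "lfs.h"

    // Illustrative header: running statistics that must survive a reopen.
    struct header {
        uint32_t count;  // readings written so far
        int32_t min;     // running minimum
        int32_t max;     // running maximum
    };

    // Append one ~24-byte reading to the data file, then rewrite the small
    // header file. The header file fits in a single block, so only that
    // block is copied, independent of the data file's size.
    int append_reading(lfs_t *lfs, const void *reading, lfs_size_t size,
            const struct header *hdr) {
        lfs_file_t f;
        int err = lfs_file_open(lfs, &f, "readings.bin",
                LFS_O_WRONLY | LFS_O_CREAT | LFS_O_APPEND);
        if (err) {
            return err;
        }
        lfs_ssize_t res = lfs_file_write(lfs, &f, reading, size);
        err = lfs_file_close(lfs, &f);
        if (res < 0 || err) {
            return (res < 0) ? (int)res : err;
        }

        err = lfs_file_open(lfs, &f, "readings.hdr",
                LFS_O_WRONLY | LFS_O_CREAT);
        if (err) {
            return err;
        }
        res = lfs_file_write(lfs, &f, hdr, sizeof(*hdr));
        err = lfs_file_close(lfs, &f);
        return (res < 0) ? (int)res : err;
    }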

@rocuh rocuh closed this as completed Feb 12, 2018
@rocuh

rocuh commented Mar 3, 2018

Could the linked/skip list of blocks in a file be held in separate metadata blocks rather than in the data blocks? I suppose the tradeoff, besides additional complexity, would be that if you are just appending data to a file there would be 3 erases/copies (data block, file metadata block, directory metadata block) instead of the current 2.

Random writes would be improved, as you would not have to rewrite the later data blocks when you modify an earlier block. I suppose with a large enough file you would have to do something similar for the file metadata blocks, as they would have to be stored in a similar linked/skip list, but it would still be much faster.
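
To make that concrete, a purely hypothetical layout for such an indirect metadata block might look like the following. This is only a sketch of the proposal, not littlefs's actual on-disk format:

    #include <stdint.h>

    // One metadata block holds the file's block list; data blocks carry no
    // pointers. Rewriting an early data block then costs one data block
    // plus one metadata block, instead of every block after it.
    #define BLOCK_SIZE          4096
    #define POINTERS_PER_BLOCK  ((BLOCK_SIZE - 8) / 4)

    struct file_metadata_block {
        uint32_t next;   // next metadata block, for files too large for one
        uint32_t count;  // number of valid entries in blocks[]
        uint32_t blocks[POINTERS_PER_BLOCK];  // data blocks, in file order
    };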

@geky

geky commented Mar 12, 2018

Sorry for the late response. Feel free to reopen closed issues even if it's just for an idea/question : )

I think you pretty much nailed the tradeoff in your comment.

It should also be noted that the inline list actually does a lot to simplify the logic around writing files. If you had separate blocks for metadata and data, you would need to buffer both the metadata and data blocks, or jump between the two with rewrites when your program size is larger than a pointer. At the moment I can't think of a good way to handle this without 2x RAM usage per file.

Other notes:

  • Separate metadata blocks are a tradeoff of complexity for improved random writes.
  • You could also have a special case for single-block files, so at least small files don't pay for the extra metadata block.
  • With indirect metadata blocks you could also drop the CTZ skip-lists, since files would need to be >8MiB before the skip-list begins helping (see the sketch below).
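
For context on that last point, here is a sketch of the pointer count in the CTZ skip-list, based on the littlefs DESIGN notes (__builtin_ctz is GCC/Clang-specific):

    #include <stdint.h>

    // In the CTZ skip-list, data block index n carries ctz(n)+1 back
    // pointers (block 0, the head of the list, carries none), so a
    // backwards seek can skip up to 2^ctz(n) blocks in one hop.
    static inline uint32_t skiplist_pointers(uint32_t n) {
        return (n == 0) ? 0 : (uint32_t)__builtin_ctz(n) + 1;
    }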

@geky geky reopened this Mar 12, 2018
@sn00pster

Is this true for any write?

Let's say I have a 512KiB file, and I position the file back 4KiB and write 512 bytes. Does this cause a copy of the entire file, or just of the blocks after the point of the rewrite?

@geky

geky commented Apr 19, 2018

Just the blocks after the point of rewrite. So in your case that's 4KiB that will need to be written out (aligned to the nearest erase block minus the ~8B of skip-list overhead per block, so maybe a bit more).

In the general case, a random write copies everything after the write point, which is half the file on average, so littlefs is only ~2x better than a filesystem that writes out the entire file on every seek-and-write.
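
A back-of-envelope version of that cost, with the per-block pointer overhead simplified to a constant 8 bytes (a sketch, not littlefs internals):

    #include <stdint.h>

    // Approximate bytes copied when rewriting at offset `off` in a file of
    // `size` bytes: everything from the block containing `off` to the end
    // of the file.
    static uint32_t rewrite_cost(uint32_t off, uint32_t size,
            uint32_t block_size) {
        uint32_t data_per_block = block_size - 8;   // simplified overhead
        uint32_t first_block = off / data_per_block;
        return size - first_block * data_per_block;
    }

    // e.g. rewrite_cost(508*1024, 512*1024, 4096) == 5112 bytes: roughly
    // the last 4KiB plus alignment, not the whole 512KiB.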

@sn00pster

Sweet, that's what I wanted to hear!

I've just got littlefs running in our boot loader. I've obviously had to remove mass storage, which means I've had to add our file transfer protocol into it; I'm just testing now.

Then I need to add support into the actual device firmware to detect whether it's littlefs or FAT and use the appropriate filesystem drivers. Loving littlefs!

@geky

geky commented Apr 19, 2018

Very cool! Glad to hear it's working for you!

@Aircoookie

First of all, huge congrats to @geky and the other maintainers of LittleFS!
It is the best FS out there for small embedded platforms when it comes to read speed and reliability.

Sorry for "reviving" a 2-year-old issue, but given that this behavior is still current and my issue is the same, I feel it is more appropriate than opening a new issue.
Like perhaps many others, I'm migrating from SPIFFS and am really glad about the improved read speeds; I can comfortably say LittleFS is the best out there for most use cases!

The Achilles' heel of this FS is unfortunately random writes: having to copy almost the entire file when only a single byte is modified at the start or middle of the file is not acceptable in a lot of use cases (see also esp8266/Arduino#7271).
The proposed workaround of splitting into multiple small files is also not suitable for everything, as reading and concatenating all these files on every read is prohibitively expensive.

Random writes have more than just one drawback:

  • With an ESP8266, quite a popular platform, I can achieve write speeds of roughly 40KB/s. If it is a design goal that a file operation normally shouldn't block other code for more than 1 sec, that limits the possible file size when random writes are needed to roughly 40KB.
  • Despite the wear leveling, having to rewrite that many blocks can cause quicker wear if a file is updated frequently.
  • I haven't verified this, but logically it would seem that a randomly writable file must not be bigger than the currently free blocks (half the FS, if there are no other files) to leave room for the copy.

Even given the associated challenges, implementing the proposed separate metadata blocks would make LittleFS practically usable for all purposes and would be highly welcome!
If that sort of thing can be implemented in a backwards-compatible way, there could even be a per-file option for whether to use separate metadata blocks, so that append-only uses (logfiles etc.) would not have to take the performance hit of the third block write!

Another option, simpler to add and even less demanding, would be to optionally allow in-place block updates when the number of blocks that would need to be rewritten exceeds some threshold x.
This would however jeopardize the design goals of both power-failure resilience and good wear leveling, so the above solution would of course be preferred.
Still, for applications where power loss during a write is very unlikely (e.g. all writes are user-initiated rather than scheduled), it would be preferable to the file close operation taking several seconds after each write.

Thank you once again for the amazing filesystem!
