
Open and Close File Performance #214

Open
joel-felcana opened this issue Jun 25, 2019 · 15 comments
@joel-felcana

joel-felcana commented Jun 25, 2019

Hi,

I just implemented a small PoC of littleFS running on our custom board. The board uses a Silicon Labs Blue Gecko (Cortex-M4 with 32KB of RAM) and a Winbond W25Q128JV. We have been using the flash directly without issues. My littleFS config struct is this one:

#define READ_WRITE_BUF_SIZE 256
#define LOOKAHEAD_BUF_SIZE 64
#define MAX_FILE_SIZE 64 * 1024
#define MAX_FILE_NAME 15

uint8_t read_buffer[READ_WRITE_BUF_SIZE];
uint8_t write_buffer[READ_WRITE_BUF_SIZE];
uint8_t file_buffer[READ_WRITE_BUF_SIZE];
// the lookahead vector has to be 64bit-aligned (8B) (see below)
uint8_t lookahead_buffer[LOOKAHEAD_BUF_SIZE] __attribute__((aligned(8)));

// configuration of the filesystem is provided by this struct
const struct lfs_config cfg = {
    // block device operations
    .read  = block_device_read,
    .prog  = block_device_prog,
    .erase = block_device_erase,
    .sync  = block_device_sync,

    // block device configuration

    // Minimum read and write
    .read_size = READ_WRITE_BUF_SIZE,    // good tradeoff memory / performance
    .prog_size = READ_WRITE_BUF_SIZE,
    // Size of an erasable block. The smaller, the more efficient the FS is
    // as each file takes at least one block, but can be made bigger if
    // needed
    .block_size = 4096,
    .block_count = 4096,
    // Size of block caches. The FS has a read, a write and an additional
    // cache per file. They must be a multiple of the read/write size, and
    // a factor of the block size.
    .cache_size = READ_WRITE_BUF_SIZE,
    // Number of erase cycles before moving to a new block
    // The W25Q128JV is rated min 100k erase cycles.
    // 75k cycles is writing ten times per day in every block, every day, for
    // the next 20 years.
    .block_cycles = 75000,  // 3/4 of the min guaranteed value
    // Number of blocks to lookahead during block allocation.
    // Uses 1 bit per block and must be multiple of 4B
    .lookahead_size = LOOKAHEAD_BUF_SIZE,

    // Optional statically allocated buffers. Read and write must be
    // cache_size and lookahead must be lookahead_size and 64-bit aligned.
    // By default lfs_malloc is used to allocate these buffers.
    .read_buffer = read_buffer,
    .prog_buffer = write_buffer,
    .lookahead_buffer = lookahead_buffer,

    // Optional upper limit on length of file names in bytes
    .name_max = MAX_FILE_NAME,

    // Optional upper limit on files in bytes
    .file_max = MAX_FILE_SIZE,

    // Optional upper limit on custom attributes in bytes.
    // No benefit of using smaller than default
    // .attr_max =
};
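For completeness, the config is used with the usual format-on-first-mount pattern, roughly like this (a minimal sketch, error handling trimmed):

#include "lfs.h"

lfs_t lfs;

int fs_init(void)
{
    int err = lfs_mount(&lfs, &cfg);
    if (err) {
        // first boot (or corrupted FS): reformat and retry
        lfs_format(&lfs, &cfg);
        err = lfs_mount(&lfs, &cfg);
    }
    return err;
}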

I've done a number of tests: one of them erases the flash, creates a new FS and proceeds to create files of a constant size on the FS until we run out of space. The throughputs I'm getting are, for example:

  • For 1KB files, 80% of files are created in 200ms or less (which is pretty good!). The problem is that the other 20% of files have times ranging from 1 to 40s!
  • For 64KB files, I get similar results, as in 80% of files complete in between 2.1 and 2.5s (understandable, as there is more erasing to do to allocate space for the files), but equally around 20% of files get very high creation times of up to 30s.

Writing to my NOR flash is pretty fast and erasing is, as expected, more costly (45ms typical per 4KB sector, but 400ms max!). I also timed the flash operations (the complete process: issuing write enable, writing, polling the busy flag...), and found that they behave as expected, with erase times of <40ms and write times of ~6ms.
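For reference, the littleFS-level open/write/close times I quote are measured by wrapping each call with a millisecond timestamp, roughly like this (a sketch only; get_ms() stands in for our platform tick source):

#include <stdio.h>
#include "lfs.h"

// Sketch of the per-operation timing (get_ms() is a placeholder for the
// platform millisecond timer; lfs is the mounted filesystem handle).
// (With static per-file buffers, lfs_file_opencfg and file_buffer would be
// used instead of lfs_file_open.)
static void timed_create(const char *path, const void *data, lfs_size_t len)
{
    lfs_file_t file;

    uint32_t t0 = get_ms();
    lfs_file_open(&lfs, &file, path, LFS_O_WRONLY | LFS_O_CREAT);
    uint32_t t_open = get_ms() - t0;

    t0 = get_ms();
    lfs_file_write(&lfs, &file, data, len);
    uint32_t t_write = get_ms() - t0;

    t0 = get_ms();
    lfs_file_close(&lfs, &file);
    uint32_t t_close = get_ms() - t0;

    printf("open %lu ms, write %lu ms, close %lu ms\n",
           (unsigned long)t_open, (unsigned long)t_write, (unsigned long)t_close);
}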

In my particular case, I don't need to write particularly fast (I will generate ~200B per second, max), but I do need to write reliably every second, so it's not a matter of throughput but of making sure that the next file open or close operation doesn't take 30s, because I can't hold all that data in a buffer. How can I improve the reliability of the system (or make it more "deterministic")?

PS: I haven't added my flash functions, but if you think they are relevant, please tell me and I will. I can also dump the flash if needed.

@safirxps

Did you try the same test with a reduced READ_WRITE_BUF_SIZE?

@joel-felcana
Author

Hi,

I'm running some experiments at the moment so all this is WIP.

I tried changing the buffer size from the original 256B to 16B, 32B and 64B. All three improved the situation, but none of them solved it.

  • 64B read and write buffers: I moved from 80% of files with good values to 97%. 97% of files open in under 200ms, and 97% of files close in under 4ms (the files that had a higher open time are not the same as the ones that have a high close time, which reduces to 95% the share of files that open and close in time). Write times are quite stable too, with 97% of files being written in less than 3s. I have an average throughput of ~27KB/s (compared to ~50KB/s when writing to flash directly). The worst write time is 7.1s, but I can deal with that.

  • 32B read and write buffers: slightly less write throughput caused by the increase in write ops, but not by much (median write time of 2.6s, median write throughput of 24KB/s). Abnormal opens and closes dropped dramatically, to 3 abnormal opens and one abnormal close, but they are at the 40s level, so quite a deal-breaker there!

  • 16B read and write buffers: similar to 32B, with only two abnormal opens and three abnormal closes. The three abnormal closes and one of the opens are at the 40s level. Throughput is impacted more by the decrease, with a median write time of 3.4s and a median throughput of 18.7KB/s.

  • 16KB files vs 64KB files: it doesn't change the results much. I've only tried with 16B buffers, but write throughput is similar, open times increase with the now much bigger number of files on disk (>800), and I still experience abnormal open and close times.

There's no pattern to the occurrence of the abnormal open, write or close times; they just seem to happen quite randomly. I can observe that the file creation time increases as the flash fills up (I guess it's proportional to the number of files present in the FS). In the attached graph, both lines represent the same data at two different scales.


What surprises me is that an abnormal open or close can take up to 40s. I could erase the complete flash in that time!

I have some graphs and tables in the Excel sheet that I attach. The table contains the time it takes to open, write and close each individual file. The test consists of creating files until we run out of space (sketched below), so that's 240 files for 64KB files and 812 for 16KB files.
littleFS - File Creation Times.xlsx
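For reference, the test loop is essentially the following sketch (payload contents and the timing code are stripped out; the F%05d naming matches the file names that show up in my logs later on):

#include <stdio.h>
#include "lfs.h"

// Sketch of the fill-the-flash loop (lfs is the mounted filesystem handle).
// Returns the number of files created before littlefs runs out of space.
static int fill_flash(lfs_t *lfs, lfs_size_t file_size)
{
    static uint8_t payload[256];
    int count = 0;

    for (;;) {
        char name[16];
        snprintf(name, sizeof(name), "F%05d", count);

        lfs_file_t file;
        if (lfs_file_open(lfs, &file, name, LFS_O_WRONLY | LFS_O_CREAT) < 0) {
            break;  // most likely LFS_ERR_NOSPC: the flash is full
        }

        int err = 0;
        for (lfs_size_t n = 0; n < file_size && err >= 0; n += sizeof(payload)) {
            err = lfs_file_write(lfs, &file, payload, sizeof(payload));
        }
        lfs_file_close(lfs, &file);

        if (err < 0) {
            break;  // ran out of space mid-file
        }
        count++;
    }

    return count;
}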

@joel-felcana joel-felcana changed the title littleFS performance Open and Close File Performance Jun 26, 2019
@safirxps

Could you try 1 byte for the read and write buffers, please?

@Johnxjj

Johnxjj commented Jun 27, 2019

Hi @joel-felcana, when I first started testing I also ran into this anomaly: closing a small file (under 10K) could take a few seconds to complete. I'm not focusing on this issue right now, but it still needs to be solved.

@FreddieChopin
Contributor

I'm almost certain that what you see is exactly what I noticed some time ago in #75 (read more than the first post; the problem with the allocator and the free list is discussed further down).

The problem here is that without a significant redesign of littlefs this problem is unsolvable, and it only gets worse the more storage you have and the more data is saved on it. For an SD card with a lot of capacity you can easily get to an unusable state - just have a few dozen megabytes of data saved and any attempt to write anything (which triggers a rescan of the lookahead buffer) can take minutes or even hours.

@joel-felcana
Author

joel-felcana commented Jun 27, 2019

Thanks @safirxps for your support with this issue.

EDIT: I think there's been a flaw in my buffer size tests (the 1B buffer-size runs I did today). While I certainly changed the read and write buffers, I still had the cache size set to 256B. If I change the read, write and cache sizes all to 1B, I get a write time of 22s for 64KB; if I set read and write to 1B and leave the cache size at 256B, I get a write time of 2s for 64KB (potentially going out of bounds, I know!)... I will repeat my tests tomorrow, this one is a mess...
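In other words, for the sweeps to be apples-to-apples the three sizes have to move together; something like this excerpt (a sketch only; cfg_sweep is just an illustrative name, and the callbacks and geometry stay as in my first post):

// Read, prog and cache sizes changed together per experiment: cache_size
// must be a multiple of read_size and prog_size, and a factor of block_size.
#define BUF_SIZE 16   // 1, 16, 32, 64, ... per experiment

const struct lfs_config cfg_sweep = {
    // ... block device callbacks and geometry as in my first post ...
    .read_size  = BUF_SIZE,
    .prog_size  = BUF_SIZE,
    .cache_size = BUF_SIZE,
    .lookahead_size = LOOKAHEAD_BUF_SIZE,
};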

I tried with 1B buffers, writing the full flash with 64KB, 16KB and 4KB files, and I see no difference: maybe there's a relatively smaller number of unusual opening times, and maybe more stable close times, but the results are more or less the same.
The tables showing the opening, writing and close times for the different tests are attached as screenshots.

And here is the Excel file with all the data.
littleFS - File Creation Times.xlsx

There's something odd, though. Yesterday, reducing the buffer size had a detrimental effect on write times (as expected, since more overhead is added to the write operation), but today the flash has performed much better than yesterday, with a negligible effect of the buffer size on the write throughput...

@joel-felcana
Author

So, I've done a couple more tests.

  1. Increasing the lookahead size to hold all the blocks in RAM and avoid hitting the "search for more blocks" penalty, as suggested by @FreddieChopin. My flash is a 16MB part and the minimum erasable sector is 4KB, so that makes 4096 blocks -> 512B of lookahead (the arithmetic is sketched after this list).
  2. Using a buffer size of 1B, as suggested by @safirxps.
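For reference, the arithmetic behind the 512B figure (using the 16MB / 4KB geometry above):

// 1 bit per block: 16MB of flash with 4KB erase sectors -> 4096 blocks -> 512B.
#define FLASH_SIZE          (16 * 1024 * 1024)
#define SECTOR_SIZE         4096
#define BLOCK_COUNT         (FLASH_SIZE / SECTOR_SIZE)   // 4096 blocks
#define LOOKAHEAD_BUF_SIZE  (BLOCK_COUNT / 8)            // 512 bytes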

Sheets 2 and 3 of the attached Excel spreadsheet include the results. Regarding storing the complete block allocation info in RAM: I see no difference whatsoever. With a lookahead of 512B and a buffer size of 64B, I still see six files that take between 10 and 35 seconds to open, and 5 files that take from 4 to 34s to close.

Combining the two tests, with a lookahead of 512B and a buffer size of 1B, my write throughput falls to a meagre 3KB/s, with a 64KB file taking 22.5s to be written to flash. The impact of having files on flash is even bigger: for example, with a 64B buffer size it takes around 1ms per file already present in the FS to create a new file, but with a 1B buffer size it seems to be a little less than 6ms per file!
Finally, even though I see fewer cases of unusually high file open times, I see at least three cases where opening a file takes an absurd 150-200s!

littleFS - File Creation Times.xlsx

@joel-felcana
Author

> Hi @joel-felcana, when I first started testing I also ran into this anomaly: closing a small file (under 10K) could take a few seconds to complete. I'm not focusing on this issue right now, but it still needs to be solved.

I don't see a particular correlation between file size and close times, and you can see that I had a test with 4KB files where all the files actually closed in less than 2ms (and there were almost 3k of them!).

@joel-felcana
Author

One last test that I did and want to share before I leave the office today.

The flash I am using (see the part number in my first post) has a minimum erasable sector of 4KB. I tried using the other two erase commands it accepts (erase a 32KB and a 64KB block). With 64KB blocks, not only does the performance increase (I get around 32KB/s of throughput, a 5KB/s improvement over erasing 4KB sectors), but I don't get any weird open and close times, I have a much smaller lookahead buffer to handle, and I can squeeze in 253 files (which, if I am not mistaken, is the complete flash minus 4 blocks used by littleFS), compared to 240 files when erasing 4KB sectors and the 171 files I can get when erasing 32KB blocks.
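For anyone reproducing this, the only change on my side is in the erase callback. Here's a sketch of the 64KB variant for the W25Q128JV (spi_select/spi_transfer/spi_deselect and flash_wait_not_busy are placeholders for our driver; 0xD8 is the chip's 64KB block erase opcode, 0x52 the 32KB one and 0x20 the 4KB sector erase):

#include "lfs.h"

// Sketch of a 64KB-block erase callback. With .block_size = 64 * 1024,
// littlefs passes 64KB-aligned block indices, so one 0xD8 command per block.
static int block_device_erase(const struct lfs_config *c, lfs_block_t block)
{
    uint32_t addr = block * c->block_size;
    uint8_t we = 0x06;                    // write enable
    uint8_t cmd[4] = {
        0xD8,                             // 64KB block erase (0x20 = 4KB, 0x52 = 32KB)
        (uint8_t)(addr >> 16),
        (uint8_t)(addr >> 8),
        (uint8_t)(addr >> 0),
    };

    spi_select();
    spi_transfer(&we, 1);
    spi_deselect();

    spi_select();
    spi_transfer(cmd, sizeof(cmd));
    spi_deselect();

    // poll the status register until the erase finishes
    return flash_wait_not_busy() ? 0 : LFS_ERR_IO;
}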

@joel-felcana
Author

Hi,

I'm still facing some issues with the performance of littleFS. I would like to understand what's going on so that I can at least work around it. @geky, I don't want to be rude by tagging you this way, but I would like to get your opinion on the issue, if it's not asking too much.

The config that I ended up with after a lot of testing is as follows. I chose writes of 64B because that is the smallest value that doesn't massively impact the write performance, and reads of 256B because the bigger the reads, the better the open and close file performance is; at 64B reads I was facing long opening times. I also size the lookahead buffer to cover all the blocks, to avoid any seek penalties.

#define WRITE_BUF_SIZE 64
#define READ_BUF_SIZE 256
#define BLOCK_SIZE (64 * 1024)
#define BLOCK_COUNT (16 * 1024 / 64)           // 16MB flash / 64KB blocks = 256
#define LOOKAHEAD_BUF_SIZE (BLOCK_COUNT / 8)   // 1 bit per block = 32B
#define MAX_FILE_SIZE (64 * 1024)
#define MAX_FILE_NAME 19

uint8_t read_buffer[READ_BUF_SIZE];
uint8_t write_buffer[WRITE_BUF_SIZE];
uint8_t file_buffer[WRITE_BUF_SIZE];
// the lookahead buffer has to be 64-bit aligned, as in my first config
uint8_t lookahead_buffer[LOOKAHEAD_BUF_SIZE] __attribute__((aligned(8)));

const struct lfs_config cfg = {
    // block device operations
    .read  = block_device_read,
    .prog  = block_device_prog,
    .erase = block_device_erase,
    .sync  = block_device_sync,

    // block device configuration

    // Minimum read and write
    .read_size = READ_BUF_SIZE,
    .prog_size = WRITE_BUF_SIZE,
    // Size of an erasable block. The smaller, the more efficient the FS is
    // as each file takes at least one block, but can be made bigger if
    // needed
    .block_size = BLOCK_SIZE,
    .block_count = BLOCK_COUNT,
    // Number of erase cycles before moving to a new block
    // The W25Q128JV is rated min 100k erase cycles.
    // 75k cycles is writing ten times per day in every block, every day, for
    // the next 20 years.
    .block_cycles = 75000,  // 3/4 of the min guaranteed value

    .cache_size = READ_BUF_SIZE,

    .lookahead_size = LOOKAHEAD_BUF_SIZE,

    .read_buffer = read_buffer,
    .prog_buffer = write_buffer,
    .lookahead_buffer = lookahead_buffer,

    .name_max = MAX_FILE_NAME,
    .file_max = MAX_FILE_SIZE,
};

My test is as follows: I start with a clean flash. I create 64KB files until the flash is full (I can squeeze in 253 files). Then I delete them all with lfs_remove and start creating files again.

I will paste a log here. The format is: debug level (one letter), time of the event in ms (in parentheses), and then the message. Some messages follow the pattern FC: t = 8 + 1535 + 5 = 1548, which means that creating the file took 1548ms in total: 8ms to open it, 1535ms to write 64KB and 5ms to close it. Another message of interest is the one starting with EBX, which means that block X was erased (EraseBlockX).

My issue is that after writing some files in the FS, when the master block is full, my performance is completely destroyed.

  1. The test erases the flash chip, and at t=27s I mount a new FS and start creating files. My opening times increase with the number of files, at around 1.5ms per file in the FS. Deleting the files doesn't reduce the opening times (I guess their entries are still in the master block), and deletion times seem to be the same as opening times and increase at the same pace.
  2. At t=463.889s, the flash is full, and I proceed to recursively delete all the files. It takes littleFS almost 40s to realise that the FS is full!
  3. After that, looping through the dir twice (first to count the files, then again to delete them, sketched after this list) takes around 220s.
  4. The process carries on at around t=802s.
  5. At t=1067.473s, I experience what I think is the deletion of a master block: block 1 is erased (see the message timestamped 1275921). Closing that particular file takes 400s!!
  6. The process continues. At t=1681s, the flash is full again. It takes littleFS around 33s to find out that the flash is full, and then 240s to delete all the files.
  7. And finally, the process continues until the master block is completely filled again. This time, at t=2421386 (ms), I try to create a file, block 0 is erased 250s later, and after another 250s the test fails. It complains about the name of the file being too long, which I find surprising given that the file is called "F00743", so I suspect there has to be a memory leak somewhere, or an unterminated string. File 743 means the flash is quite full, but not completely full, so I think there should be enough space to do a swap and keep the master block safe.
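For reference, the delete-everything pass from step 3 is essentially this (a sketch; it assumes the root directory only contains the test files and relies on the F%05d naming my test uses):

#include <stdio.h>
#include "lfs.h"

// Sketch of the two-pass deletion: first count the regular files in the
// root directory, then remove them by name.
static int delete_all(lfs_t *lfs)
{
    lfs_dir_t dir;
    struct lfs_info info;
    int count = 0;

    int err = lfs_dir_open(lfs, &dir, "/");
    if (err) {
        return err;
    }
    while (lfs_dir_read(lfs, &dir, &info) > 0) {
        if (info.type == LFS_TYPE_REG) {
            count++;
        }
    }
    lfs_dir_close(lfs, &dir);

    for (int i = 0; i < count; i++) {
        char name[16];
        snprintf(name, sizeof(name), "F%05d", i);
        err = lfs_remove(lfs, name);
        if (err) {
            return err;
        }
    }
    return 0;
}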

I'm happy to share my code, but I think I'm already abusing this ticket enough, so I won't unless requested.

Thanks!!!!

@geky
Member

geky commented Jul 9, 2019

> @geky, I don't want to be rude by tagging you this way, but I would like to get your opinion on the issue, if it's not asking too much.

No worries, though tagging me doesn't actually change the notifications I see. I'm subscribed to all littlefs notifications, but have been swamped and working through a long TODO list. I'm at least making positive progress though 👍

I can give some impressions, but it would take quite a bit more time to fully understand exactly where the performance issues are coming from. There's a whole set of issues around scalability/performance that need attention (such as #75 and #27). I'm hoping to tackle them after building up an appropriate system of tests to actually verify performance, but I haven't been able to get on top of them yet.

Note also that NAND support (and other storage with >8KiB block sizes) is new and not mature. It's possible there are some surprising runtime issues there.

Oh, it may also be worth a test with v2.0.2 as there was a patch after that which may be causing some performance issues (#158).

> My opening times increase with the number of files, at around 1.5ms per file in the FS. Deleting the files doesn't reduce the opening times (I guess their entries are still in the master block), and deletion times seem to be the same as opening times and increase at the same pace.

The architecture of littlefs is that directories are small logs (well, technically linked-lists of logs), files are COW linked-lists, and all of them sit on top of a global block allocator (more info in DESIGN.md).

The directory logs are appended to on every update with what is called a "commit", which covers file creation/write/delete. Reading a directory log requires what is internally called a "fetch", which must traverse the log and check for consistency issues. This is why the open time increases with every create/delete.

Eventually the log will be full and undergo a garbage collection. This will be expensive and worth profiling.

To combat this cost, these logs must be kept very small. Because of limitations on how we move things around when erasing/garbage collecting, the smallest log we can use is 2-blocks. After a 2-block log is full, the directory is split into a linked-list of two 2-block logs.

> At t=463.889s, the flash is full, and I proceed to recursively delete all the files. It takes littleFS almost 40s to realise that the FS is full!

This sounds like it might be related to #75, but since you reduced your block count, it may just be another side-effect of the logs becoming large.

> After that, looping through the dir twice (first to count the files, then again to delete them) takes around 220s.

Not really sure; reading should be much, much cheaper.

> At t=1067.473s, I experience what I think is the deletion of a master block: block 1 is erased (see the message timestamped 1275921). Closing that particular file takes 400s!!

That would be garbage collection hitting. Your wording makes it sound like a scientist recording the movements of some sort of colossus, which I think is quite fitting. Garbage collection is O(n^2) in the block size, and these blocks are very large.

> It complains about the name of the file being too long, which I find surprising given that the file is called "F00743", so I suspect there has to be a memory leak somewhere, or an unterminated string. File 743 means the flash is quite full, but not completely full, so I think there should be enough space to do a swap and keep the master block safe.

I was worried this situation could happen. There's some internal logic that relies on ENOSPC to know whether or not a filename can fit in a directory. I suspect there is a bug where it's catching an honest ENOSPC caused by all blocks being in use. This will require more investigation.


This is kind of a catch-22, isn't it? The workaround for allocator scalability is large blocks, but larger blocks expose a scalability issue in the directory logs...

One option may be to increase the read_size/prog_size. This will reduce the number of commits in a directory log, decreasing the granularity of log operations. It may be worth a try.
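Concretely, something along these lines (an excerpt of the relevant lfs_config fields; the values are illustrative only, not a recommendation):

// Illustrative only: commits are written in prog_size-aligned units, so a
// larger prog_size means fewer commits fit in a directory log before it
// compacts, which also bounds how much work each fetch has to do.
.read_size  = 256,
.prog_size  = 256,   // e.g. up from 64
.cache_size = 256,   // must remain a multiple of read_size/prog_size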

Another option would be a modification of littlefs that limits the number of commits in a directory log to some performance threshold. This would waste the storage in those logs, but may make the runtime bearable.


Hopefully this helps a bit, I know performance is very inconsistent right now, sorry about that.

@joel-felcana
Author

joel-felcana commented Jul 11, 2019

Ok, a bit more progress on my side.

First, the name-too-long failure: I had refactored (pretty badly) the erase block function, and that is what caused the test to stop with that error. It was, indeed, a case of the FS asking to erase the index, the driver reporting it as erased although no operation was executed on the flash, littlefs then overwriting its contents and interpreting whatever came back. So my bad, and sorry.

On the other hand, I think it's then a matter of trade-offs. Running with 4KB sectors, I make worse use of the flash, with fewer files created before I get the out-of-space error, and I get hit by long operation times pretty often, but they are always under a minute. Running with 64KB blocks, I don't get those performance spikes, but opening times can become unmanageable given the big size of the index. I could test more buffer sizes and other config bits, but I'm afraid my time for running tests is (no pun intended) running out and I have to carry on with my work. So it will be 4KB sectors.

I'll keep your comments in consideration and get back to this ticket eventually to update my progress. Thanks for your help!

@diederikvdv

After running into watchdog timeouts I recognised some of the behaviour mentioned in this post regarding the opening (and closing) of files: at random moments the time needed to open/close files was significantly larger.

After detecting this I tried to reproduce it with an older version of LFS: the effect disappeared. The setup and detailed data can be found in 2020-11-18 LFS2 VS LFS1.xlsx.

The difference in behaviour between LFS v2 and LFS v1 can be seen in the attached graphs, for repeatedly opening, writing 4352 bytes to, and closing a new file:

LFSV2: 0x00020002

LFS1: 0x00010006

In an old setup I used the attributes and got interesting behaviour in the open and close timings: either the opening took long or the closing took long. Sadly, I don't have the settings I used any more.
LFSV2: 0x00020002

For my purpose, LFS v1 with the CRC written to the end of the file - instead of as an LFS2 attribute - currently appears to be sufficient. Therefore I haven't (yet) looked much into the cause and/or solution, but I wanted to share the results.

@kyrreaa

kyrreaa commented Nov 1, 2023

Just to note that this "issue" still persists in 2.8.1, tested as of today. It is absolutely caused by the garbage collection / compacting of the logs, but I don't think it is solvable as-is?
I have implemented the lfs_fs_gc() call upon mounting my filesystem, and it does indeed take care up front of the initial scan that builds the map (I use a full block map, i.e. one bit per block for the entire fs, 4096 blocks in my case). The result is that creating the first file no longer has surprising delays due to the initial search for free space.
It does, however, still experience compacting if triggered... If one creates and deletes files regularly this happens quite frequently, and I wonder if this can be mitigated in some way, like lfs_fs_gc()? If, for example, I could set littlefs to use multiple blocks for the initial directory, I could postpone compacting during the creation sequence where I log data. Then, if there was a way to trigger it after offloading data and deleting a lot of files, the compacting could be avoided during logging next time as well.
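For reference, my mount sequence is essentially this (a minimal sketch using the lfs_fs_gc() API):

// Mount, then let lfs_fs_gc() do the free-block scan up front so the first
// file created after boot doesn't pay for it.
int mount_and_scan(lfs_t *lfs, const struct lfs_config *cfg)
{
    int err = lfs_mount(lfs, cfg);
    if (err) {
        return err;
    }
    return lfs_fs_gc(lfs);   // populates the lookahead before logging starts
}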

@geky
Member

geky commented Nov 14, 2023

Hi @kyrreaa, yes, this is still an issue. If it's any consolation, this has been a priority and I at least have a theoretical solution to bring metadata compaction down from $O(n^2)$ to $O(n \log{n})$. But actually bringing this into littlefs is a different story and there's still more work to be done...

I think since this issue was created, the metadata_max (#502) configuration option has been merged, which provides a temporary workaround by artificially limiting how large the metadata logs can get. It may be worth considering, though it involves a significant storage tradeoff.
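A sketch of the relevant lfs_config excerpt (the value is illustrative; metadata_max must be <= block_size and defaults to block_size when zero):

// Caps how much of each metadata block gets used before compaction,
// trading storage for a shorter worst-case compaction/fetch.
.block_size   = 4096,
.metadata_max = 1024,   // compact once a metadata log reaches 1KB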


> I use a full block map, i.e. one bit per block for the entire fs, 4096 blocks in my case

It's worth noting that even if the lookahead buffer is large enough to hold all blocks, littlefs still does a filesystem scan after one pass to figure out what blocks are free. littlefs just doesn't know when blocks become free without this scan.

In theory, in the current implementation, it might be possible to update free blocks and avoid scanning, but this would involve a bit of work to implement and would be limited in usefulness while the block map/lookahead buffer is stuck in RAM.

But at least when you scan can be controlled by lfs_fs_gc now. Glad to hear that has helped a bit.

> I wonder if this can be mitigated in some way, like lfs_fs_gc()?

In theory, it can, at least a bit. Something I'm thinking of adding is a compaction threshold to lfs_fs_gc, to tell it to compact metadata logs that exceed a given threshold. This would not be too hard to implement and would allow moving compaction into lfs_fs_gc, as long as the total amount of logging between calls to lfs_fs_gc never exceeds block_size - compaction_threshold.

Unfortunately this wouldn't let you delay compaction arbitrarily... Currently metadata logs are limited to a block, so if metadata exceeds a block compaction is forced to trigger. This is also where metadata splitting happens, which is how metadata can span multiple blocks.

I suppose you could set the block_size large enough to delay compaction until after your creation sequence. But doing so may be cumbersome...
