Open and Close File Performance #214
Did you try the same test after reducing READ_WRITE_BUF_SIZE?
Hi, I'm running some experiments at the moment, so all of this is WIP. I tried changing the buffer size from the original 256B to 16B, 32B and 64B. All three improved the situation, but none of them solved it.
There's no pattern in the occurrence of the abnormal open, write or close times; they just seem to happen quite randomly. I can observe that the file creation time increases as the flash gets more and more occupied (I guess it is proportional to the number of files present in the FS). In this graph, both lines represent the same data, but at two different scales. What surprises me is that an abnormal open or close can take up to 40s. I could erase the complete flash in that time! I have some graphs and tables in the Excel sheet that I attach. The table contains the time it takes to open, write and close an individual file. The test consists of creating files until we run out of space, so that's 240 files for 64KB files and 812 for 16KB files.
Could you try 1 byte for the read and write buffers, please?
Hi @joel-felcana, when I first started testing I also encountered this anomaly: closing a small file (smaller than 10K) could take a few seconds to complete. I am not paying attention to this issue right now, but it still needs to be solved.
I'm almost certain that what you see is exactly what I noticed some time ago. The problem here is that without a significant redesign of littlefs, this problem is unsolvable and only gets worse as you have more storage and more data saved on it. For an SD card with a lot of capacity you can easily get to an unusable state - just have a few dozen megabytes of data saved and any attempt to write anything (which triggers a rescan of the lookahead buffer) can take minutes or even hours.
Thanks @safirxps for your support with this issue.

EDIT: I think there's been a flaw in my buffer size tests (the ones I did today with 1B buffer sizes). While I certainly changed the read and write buffers, I had the cache size set to 256B. If I change the read, write and cache sizes to 1B, I get a write time of 22s for 64KB, and if I set read and write to 1B and leave the cache size at 256B, I get a write time of 2s for 64KB (potentially going out of bounds, I know!)... I will repeat my tests tomorrow, this one is a mess...

I tried with 1B buffers, writing the full flash with 64KB, 16KB and 4KB files, and I see no difference: maybe there's a relatively smaller number of unusual opening times, and maybe more stable close times, but the results are more or less the same. And here is the Excel file with all the data. There's something odd, though. Yesterday, reducing the buffer size had a detrimental effect on writing times (as expected, since more overhead is added to the writing operation), but today the flash has performed much better than yesterday, with a negligible effect of the buffer size on the write throughput...
So, I've done a couple more tests.
Sheets 2 and 3 of the attached Excel spreadsheet include the results. Regarding storing the complete data-block info in RAM, I see no difference whatsoever: with a lookahead of 512 and a buffer size of 64B, I still see six files that take between 10 and 35 seconds to open, and 5 files that take from 4 to 34s to close. Combining the two tests, with a lookahead of 512 and a buffer size of 1B, my write throughput falls to a meagre 3KB/s, with a 64KB file taking 22.5s to be written to flash. The impact of having files on flash is even bigger: for example, with a 64B buffer size it takes around 1ms per file already present in the FS to create a new file, but with a 1B buffer size it seems to be a little less than 6ms per file!
I don't see a particular correlation between file size and close times, and you can see that I got a test of 4KB files where all the files actually closed in less than 2 ms (and there were almost 3k of them!).
One last test I did and want to share before I leave the office today. The flash I am using (see the datasheet in my first post) has a minimum erasable sector of 4KB. I tried using the other two erase commands it accepts (erase a 32KB and a 64KB block). With 64KB sectors, not only does the performance increase (I get around 32KB/s of throughput, a 5KB/s improvement vs erasing 4KB sectors), but I don't get any weird open and close times, I have a much smaller lookahead buffer to handle, and I can squeeze in 253 files (which, if I am not mistaken, is the complete flash minus 4 blocks used by littleFS), compared to the 240 files I get erasing 4KB sectors and the 171 files I get erasing 32KB sectors.
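For reference, this is roughly what the erase hook looks like once it is switched to the 64KB Block Erase command of the W25Q128JV. The spi_flash_* helpers are hypothetical board-specific functions, so treat this as a sketch rather than the code actually used in the test; the 0x20/0x52/0xD8 opcodes are the chip's 4KB/32KB/64KB erase commands.

#include <stdint.h>
#include "lfs.h"

// Hypothetical SPI helpers provided by the board support code.
extern void spi_flash_write_enable(void);
extern void spi_flash_cmd_addr(uint8_t cmd, uint32_t addr);
extern void spi_flash_wait_not_busy(void);

static int block_device_erase(const struct lfs_config *c, lfs_block_t block)
{
    // With block_size = 64 * 1024, one littlefs block maps to one 64KB
    // erase unit of the W25Q128JV.
    uint32_t addr = block * c->block_size;

    spi_flash_write_enable();        // 0x06 Write Enable
    spi_flash_cmd_addr(0xD8, addr);  // 0xD8 = 64KB Block Erase
                                     // (0x52 = 32KB block, 0x20 = 4KB sector)
    spi_flash_wait_not_busy();       // poll the BUSY bit in status register 1

    return 0;
}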
Hi, I'm still facing some issues with the performance of littleFS. I would like to understand what's going on, so that at least I can work around it. @geky, I don't want to be rude by tagging you this way, but I would like to get your opinion on the issue, if it's not asking too much. The config I ended up with after a lot of testing is as follows. I chose reads of 64B because they are the minimum value that doesn't massively impact the write performance, and writes of 256B because the bigger the reads are, the better the open and close performance is, and at 64B I was facing long opening times. I also store all the blocks in the lookahead buffer to avoid any seek penalties.

#define WRITE_BUF_SIZE 64
#define READ_BUF_SIZE 256
#define BLOCK_SIZE (64 * 1024)
#define BLOCK_COUNT (16 * 1024 / 64)
#define LOOKAHEAD_BUF_SIZE (BLOCK_COUNT / 8)
#define MAX_FILE_SIZE (64 * 1024)
#define MAX_FILE_NAME 19
uint8_t read_buffer[READ_BUF_SIZE];
uint8_t write_buffer[WRITE_BUF_SIZE];
uint8_t file_buffer[WRITE_BUF_SIZE];
uint8_t lookahead_buffer[LOOKAHEAD_BUF_SIZE];
const struct lfs_config cfg = {
// block device operations
.read = block_device_read,
.prog = block_device_prog,
.erase = block_device_erase,
.sync = block_device_sync,
// block device configuration
// Minimum read and write
.read_size = READ_BUF_SIZE,
.prog_size = WRITE_BUF_SIZE,
// Size of an erasable block. The smaller, the more efficient the FS is
// as each file takes at least one block, but can be made bigger if
// needed
.block_size = BLOCK_SIZE,
.block_count = BLOCK_COUNT,
// Number of erase cycles before moving to a new block
// The W25Q128JV is rated min 100k erase cycles.
// 75k cycles is writing ten times per day to every block, every day, for
// the next 20 years.
.block_cycles = 75000, // 3/4 of the min guaranteed value
.cache_size = READ_BUF_SIZE,
.lookahead_size = LOOKAHEAD_BUF_SIZE,
.read_buffer = read_buffer,
.prog_buffer = write_buffer,
.lookahead_buffer = lookahead_buffer,
.name_max = MAX_FILE_NAME,
.file_max = MAX_FILE_SIZE,
};

My test is as follows: I start with a clean flash. I create files of 64KB until the flash is full (I can squeeze in 253 files). Then I delete them all with lfs_remove and start creating files again. I will paste a log here. The format is the debug level (one letter), the time of the event in ms (in parentheses) and then the message. Some messages follow the pattern

My issue is that after writing some files to the FS, when the master block is full, my performance is completely destroyed.
I'm happy to share my code, but I think I'm abusing this enough with such a massive ticket, so I won't share it unless requested. Thanks!!!!
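For context, this is roughly the shape of the test loop being timed, assuming the config above and an lfs_t that is already mounted. millis() is a hypothetical millisecond counter, and the per-file cache handed to lfs_file_opencfg is sized to cache_size as littlefs requires; this is a sketch of the described test, not the actual test code.

#include <stdint.h>
#include <stdio.h>
#include "lfs.h"

extern uint32_t millis(void);               // hypothetical ms tick counter

static uint8_t file_cache[READ_BUF_SIZE];   // per-file cache, must be cache_size bytes
static const struct lfs_file_config file_cfg = { .buffer = file_cache };
static uint8_t payload[WRITE_BUF_SIZE];     // dummy data written to each file

void run_fill_test(lfs_t *lfs)
{
    char name[MAX_FILE_NAME + 1];

    for (int i = 0; ; i++) {
        snprintf(name, sizeof(name), "file_%03d.bin", i);

        uint32_t t0 = millis();
        lfs_file_t file;
        int err = lfs_file_opencfg(lfs, &file, name,
                                   LFS_O_WRONLY | LFS_O_CREAT, &file_cfg);
        if (err < 0) {
            break;                          // LFS_ERR_NOSPC once the flash is full
        }
        uint32_t t_open = millis() - t0;

        t0 = millis();
        for (lfs_size_t n = 0; n < MAX_FILE_SIZE; n += sizeof(payload)) {
            lfs_file_write(lfs, &file, payload, sizeof(payload));
        }
        uint32_t t_write = millis() - t0;

        t0 = millis();
        lfs_file_close(lfs, &file);
        uint32_t t_close = millis() - t0;

        printf("%s: open %lums, write %lums, close %lums\n", name,
               (unsigned long)t_open, (unsigned long)t_write,
               (unsigned long)t_close);
    }
}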
No worries, though tagging me doesn't actually change the notifications I see. I'm subscribed to all littlefs notifications, but have been swamped and working through a long TODO list. I'm at least making positive progress though 👍 I can give some impressions, but it would take quite a bit more time to fully understand exactly where the performance issues are coming from. There's a whole set of issues around scalability/performance that need attention (such as #75 and #27). I'm hoping to tackle them after building up an appropriate system of tests to actually verify performance, but I haven't been able to get on top of them yet. Note also that NAND support (and other storage with >8KiB block sizes) is new and not mature. It's possible there are some surprising runtime issues there. Oh, it may also be worth a test with v2.0.2 as there was a patch after that which may be causing some performance issues (#158).
The architecture of littlefs is that directories are small logs (well, technically linked lists of logs), files are COW linked lists, and all of them are part of a global block allocator (more info in DESIGN.md). The directory logs are appended to on every update with what is called a "commit", which includes file creation/write/delete. Reading a directory log requires what is internally called a "fetch", which must traverse the log and check for consistency issues. This is why the open time increases with every create/delete. Eventually the log will be full and undergo a garbage collection. This will be expensive and is worth profiling. To combat this cost, these logs must be kept very small. Because of limitations on how we move things around when erasing/garbage collecting, the smallest log we can use is 2 blocks. After a 2-block log is full, the directory is split into a linked list of two 2-block logs.
This might be related to #75, but since you reduced your block count, it may just be another side effect of the logs becoming large.
Not really sure, reading should be much much cheaper.
That would be garbage collection hitting. Your wording makes it sound like a scientist recording the movements of some sort of colossus, which I think is quite fitting. Garbage collection is O(n^2) in the block size, and these blocks are very large.
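Back-of-the-envelope (my own arithmetic, not a figure from the thread): if a single garbage collection costs on the order of the block size squared, then moving from 4KB to 64KB metadata blocks changes the cost roughly as

per-collection cost ratio ≈ (64 KB / 4 KB)^2 = 16^2 = 256
amortized cost ratio      ≈  64 KB / 4 KB    = 16    (collections trigger ~16x less often)

so the total work only grows by ~16x, but it is concentrated in individual pauses that are ~256x longer, which would be consistent with the multi-second spikes reported above.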
I was worried this situation could happen. There's some internal logic that relies on ENOSPC to know whether or not a filename can fit in a directory. I suspect there is a bug where it's catching an honest ENOSPC due to all blocks being in use. This will require more investigation. This is kind of a catch-22, isn't it? The workaround for allocator scalability is large blocks, but larger blocks expose a scalability issue in the directory logs... One option may be to increase the read_size/prog_size. This will reduce the number of commits in a directory log, decreasing the granularity of log operations. It may be worth a try. Another option would be a modification of littlefs that limits the number of commits in a directory log to some performance threshold. This would waste the storage in those logs, but may make the runtime bearable. Hopefully this helps a bit. I know performance is very inconsistent right now, sorry about that.
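As a rough illustration of that first suggestion (coarser read_size/prog_size), these are the lines that would change relative to the config posted above; 512B is an arbitrary example value, not something tested in this thread, and the block device driver has to be able to handle the larger program unit.

// Sketch only: coarser write/read granularity means fewer, larger commits
// per directory log, so fetch and compaction have less to traverse.
#define WRITE_BUF_SIZE 512   // was 64
#define READ_BUF_SIZE  512   // was 256

    .read_size  = READ_BUF_SIZE,
    .prog_size  = WRITE_BUF_SIZE,
    .cache_size = READ_BUF_SIZE, // cache_size must be a multiple of read_size/prog_size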
Ok, a bit more progress on my side. First, I had refactored (pretty badly) the erase-block function, and that caused the issue where the test stops and returns the name-too-long error. It was, indeed, a case of the FS asking to erase the index, thinking it was erased although no operation was actually executed on the flash, then overwriting its contents and interpreting whatever came back from it. So my bad, and sorry. On the other hand, I think it's then a matter of choices. Running with 4KB sectors, I make worse use of the flash, with fewer files created before I get the out-of-space error, and I get hit by long operation times pretty often, but they are always under a minute. Running with 64KB blocks, on the other hand, I don't get those performance spikes, but opening times can become unmanageable given the big size of the index. I could test more with buffer sizes and other config bits, but I'm afraid my time for running tests is (no pun intended) running out and I have to carry on with my work. So it will be 4KB sectors. I'll keep your comments in consideration and get back to this ticket eventually to update my progress. Thanks for your help!
After running into watchdog timeouts I recognised some of the behaviour mentioned in this post regarding the opening (and closing) of files. Namely, at random moments the time needed to open/close files was significantly larger. After detecting this I tried to reproduce it with an older version of LFS: the effect disappeared. The setup and detailed data can be found in 2020-11-18 LFS2 VS LFS1.xlsx. The difference in behaviour between LFSv2 and LFSv1 can be seen below for repeatedly opening, writing 4352 bytes and closing a new file:

In an old setup I used the attributes and got interesting behaviour regarding the timing of the opening and closing: either the opening took long or the closing took long. Sadly, I no longer have the settings I used. For my purpose, LFSv1 with the CRC written to the end of the file - instead of as an LFS2 attribute - currently appears to be sufficient. Therefore I haven't (yet) looked much into the cause and/or solution, but wanted to share the results.
Just to note that this "issue" still persists in 2.8.1, tested as of today. It is absolutely caused by the garbage collection / compacting of the logs, but I don't think it is solvable as-is?
Hi @kyrreaa, yes, this is still an issue. If it's any consolation, this has been a priority and I at least have a theoretical solution to bring metadata compaction down from … I think since this PR was created, the …
It's worth noting that even if the lookahead buffer is large enough to hold all blocks, littlefs still does a filesystem scan after one pass to figure out which blocks are free. littlefs just doesn't know when blocks become free without this scan. In theory, in the current implementation, it might be possible to update free blocks and avoid scanning, but this would involve a bit of work to implement and would be limited in usefulness while the block map/lookahead buffer is stuck in RAM. But at least when you scan can be controlled by …
In theory, it can. At least a bit. Something I'm thinking of adding is a compaction threshold to … Unfortunately this wouldn't let you delay compaction arbitrarily... Currently metadata logs are limited to a block, so if metadata exceeds a block, compaction is forced to trigger. This is also where metadata splitting happens, which is how metadata can span multiple blocks. I suppose you could set the …
Hi,
I just implemented a small PoC of littleFS working on our custom board. The board uses a Silicon Labs Blue Gecko (Cortex-M4 with 32KB of RAM) and a Winbond W25Q128JV. We have been using the flash directly with success. My littleFS config struct is this one:
I've done a number of tests: one of them erases the flash, creates a new FS and proceeds to create files of a constant size on the FS until we run out of space. The throughputs I'm getting are, for example:
Writing on my NOR flash is pretty fast, and deleting is, as expected, more costly (45ms typical per 4KB sector, but 400ms max!). I also timed the flash operations (the complete process: setting write enable, writing, checking that the flash is not busy...), and found that they behave as expected, with erase times of <40ms and write times of ~6ms.
In my particular case, I don't need to write particularly fast (I will generate ~200B per second, max), but I need to reliably write every second, so it's not a matter of throughput, but of making sure that the next file open or close operation doesn't take 30s, because I can't store all that data in a buffer. How can I improve the reliability of the system (or make it more "deterministic")?
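Purely to illustrate the workload described here (a hypothetical sketch, not the actual application code): roughly 200B appended every second, with lfs_file_sync bounding how much data a reset can lose, and the file rotated once it reaches MAX_FILE_SIZE from the config above. The rotation is exactly where an unexpectedly long close/open would stall the logger. collect_sample() and wait_one_second() are made-up helpers.

#include <stdint.h>
#include <stdio.h>
#include "lfs.h"

#define RECORD_SIZE 200

extern lfs_t lfs;                                          // mounted elsewhere
extern void collect_sample(uint8_t *buf, lfs_size_t len);  // hypothetical data source
extern void wait_one_second(void);                         // hypothetical 1s scheduler hook

void logger_task(void)
{
    uint8_t record[RECORD_SIZE];
    char name[MAX_FILE_NAME + 1];
    lfs_file_t file;
    int n = 0;

    snprintf(name, sizeof(name), "log_%03d.bin", n);
    lfs_file_open(&lfs, &file, name, LFS_O_WRONLY | LFS_O_CREAT | LFS_O_APPEND);

    for (;;) {
        collect_sample(record, sizeof(record));
        lfs_file_write(&lfs, &file, record, sizeof(record));
        lfs_file_sync(&lfs, &file);         // bound data loss to ~1s of samples

        if (lfs_file_size(&lfs, &file) >= MAX_FILE_SIZE) {
            lfs_file_close(&lfs, &file);    // the worrying close/open cost hits here
            snprintf(name, sizeof(name), "log_%03d.bin", ++n);
            lfs_file_open(&lfs, &file, name,
                          LFS_O_WRONLY | LFS_O_CREAT | LFS_O_APPEND);
        }
        wait_one_second();
    }
}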
PS: I haven't added my flash functions but if you think they are relevant, please tell me and I will do. I can also dump the flash if needed.