Is lfs_config.block_cycles the endurance of the device? #660
I'm just a user of LFS, and my recollection is that LFS uses read verification to start marking blocks as corrupted. It doesn't store this information in flash, so it would need to relearn it after each power cycle. To my knowledge, LFS doesn't use any static information for block wear; it is all dynamic.
Ah, that simulation is unfortunately quite out of date. It's still using v1, which relied on bad-block reporting but didn't provide true dynamic wear-leveling. v2 now has dynamic wear-leveling (which introduced the block_cycles option).

The first thing to note is that littlefs can use "bad-block" information such as the NVM's operation time! (This is the first time I've heard of this failure condition; what sort of flash device is this, if you don't mind me asking?) If you return LFS_ERR_CORRUPT from the block device erase or prog functions, littlefs will assume the block is now bad and move any data to a new block. However, as @thrasher8390 mentioned, it doesn't store this information anywhere, so the allocator may attempt to use the block again later for new data. This may change with a rework to the allocator (#75), but it's not clear yet what that will look like.

So, the reason this changed in v2 is that relying only on bad-block information turned out not to be a good general-purpose strategy for deteriorating storage. One issue is that not all storage devices can detect bad blocks, but even on those that can detect them with a heuristic, intentionally creating blocks with excessive wear can lead to unexpected behavior and data loss. One of the bigger problems is that blocks with excessive wear retain data for a shorter period of time, physically leaking electrons because the insulation has worn down.

The way dynamic wear-leveling works in littlefs is by keeping a rough idea of how many times each block has been written, and after a certain number of cycles relocating the data to a new block. Unfortunately the act of relocating the data is relatively expensive, so we don't want to relocate it on every single write. This relatively arbitrary number of cycles is what the block_cycles option configures. Higher numbers mean data is relocated less often, but wear is less evenly distributed. You could turn it off and rely only on bad-block information to move data, but this would risk the problems above.
That's why the suggested numbers are ~100-1000; if you raised it to 100k, that would effectively be the same as disabling wear-leveling completely.
Currently, no. One option is increasing the configured block size to 4KiB; it only needs to be a multiple of the physical erase size. But the block allocator isn't smart enough to find sequential blocks. Consider also that such a file would be copied-on-write and wear-leveled, so its location would change whenever you write to it. Requirements like this can often be solved with a partition table such as MBR: one partition for the location-explicit data, and one littlefs partition containing the rest of the system's data.
No, though I'll take these comments as feedback that this is wanted. Thoughts on what type of forum would be best?
Thanks for the input @geky.
It's serial NOR flash. For example https://www.infineon.com/dgdl/Infineon-AN202731_Understanding_Typical_and_Maximum_Program_Erase_Performance-ApplicationNotes-v03_00-EN.PDF?fileId=8ac78c8c7cdc391c017d0cf9df6c576b which says…
Also see figure 17 in "Introduction to flash memory" (Proceedings of the IEEE). So, as I understand it, an area on a device taking longer than the specified maximum to erase or program is an indicator of it being "worn".
That's quite interesting, thanks for sharing. It certainly makes sense that an exceptional program/erase time would indicate a block is unreliable.
@geky: If the usage of a block has reached the count of block_cycles, and the allocator then decides to move the data in that block to a different block to implement wear-levelling, can the original block be reused in the future? If we have block_cycles set to 500, then it has only partially lived its 100k P/E-cycle lifetime. So, assuming a block can keep getting freed up and subsequently reused, how does the FS keep track of the total cumulative use of that or any block?

As you mention, the retention time of a block decreases as the number of P/E cycles increases, so a simple write-then-read-verify check is not sufficient to ensure the data is retained for the required design lifetime. So how would one ensure that use of the FS meets a design's data-retention needs? There is information in the Cypress/Infineon app note AN217979 (https://www.infineon.com/dgdl/Infineon-AN217979_Endurance_and_Data_Retention_Characterization_of_Infineon_Flash_Memory-ApplicationNotes-v03_00-EN.pdf?fileId=8ac78c8c7cdc391c017d0d30d6b064f5) that covers retention as a function of temperature and P/E cycles.
Hi @AdvelC, sorry about the late response.

Yes, blocks can be reused. block_cycles just determines how many erase operations we allow on a block per allocation, trading performance and fewer metadata updates for less evenly distributed wear. After this and other discussions I'll probably change the name to "aggressiveness" or something similar the next time we make breaking changes to the API.

On top of this, LittleFS provides a form of statistical wear-leveling by allocating blocks in a uniform distribution. This is done by allocating blocks linearly, with a random starting position chosen at boot. This won't perfectly level wear, but LittleFS is only able to provide dynamic wear-leveling in any case, and the behavior of flash as it approaches end-of-life is already a probabilistic system.

If you need tighter wear-leveling, LittleFS may not fit your use case. You could put LittleFS on top of an FTL such as Dhara, but this comes with its own complexity. In theory LittleFS could also store the wear for each block and choose the optimal block, but this would have both code-size and runtime costs without providing static wear-leveling. It could be interesting to explore, though.

An interesting side note: this is similar to how most log-based filesystems/FTLs work. By allocating/writing all blocks in a linear cycle, you know all blocks are within ±1 erase of each other without needing to store any other metadata.

It's also worth noting this scheme is more important for storage types such as SD/eMMC, which don't support partial block writes. Without the ability to write multiple updates to metadata blocks, performance degrades catastrophically, and LittleFS behaves no differently than a traditional CoW filesystem, where all wear ends up concentrated in the root block.
Is block_cycles the same as the endurance of an NVM device? The online simulation seems to show an area no longer being used once block_cycles is reached. A serial NOR flash device may have an endurance of 100k program/erase cycles, so I wondered why the littlefs documentation suggests a range of 100 to 1000. If it is not the endurance of a block, how does littlefs know when a block on the device, or an area within a block, is worn (there is no other related item in lfs_config)? The erase and program times of an area on a device tend to go up as that area becomes more worn. If we design the erase and write interface functions to return an error indicating that the operation failed because the NVM manufacturer's maximum operation time was exceeded, can LFS use that to tag a block as worn?
Is it possible to steer a logical file into a physical area of the device, such as allocating a 3KB file to reside in a single physical 4KB block on a device rather than it spanning more than 1 physical block or being fragmented across multiple blocks?
Is there an active user forum that discusses LFS other than the ticket system on github?
Thanks.