ESP32 - Optimizing SD Card I/O Performance

Date: Jun 16, 2024
Tags: ESP32, Hardware
Summary: By organizing files into subdirectories, using binary formats, maintaining in-memory metadata, and optimizing file structures for both writing and reading, you can significantly improve the I/O performance of your ESP32 data logger.

Why SD Card I/O Slows Down with Many Files

When you accumulate a large number of files in a single folder on an SD card, the file system (usually FAT32 or exFAT for SD cards) has to search through the directory entries to find or open a file. This directory searching process can become slow as the number of files increases. Here's why:
  • Directory Entries: FAT32 and exFAT store directory entries in a linear list with no sorted index. Looking up a file means scanning that list, and creating a file means scanning all of it first to confirm the name is not already in use, so the time per operation grows roughly linearly with the number of files in the directory.
  • Fragmentation: As files are created, deleted, and modified, their clusters can end up scattered across the card. An SD card has no seek penalty, but a fragmented file needs more FAT lookups to follow its cluster chain and cannot be read in long sequential bursts, which increases read time.
  • Cluster Allocation: Each non-empty file occupies at least one cluster, and a directory grows one cluster at a time as entries are added. On FAT32 these allocations are tracked in the File Allocation Table (FAT), so every new file also means finding a free cluster and updating the table, on top of the directory scan.
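
To get a feel for the size of that linear scan, the arithmetic can be sketched as below. It uses the FAT32 on-disk layout (32-byte directory entries, 13 name characters per long-name entry) and assumes, for simplicity, that every file name needs long-name entries; names that fit the 8.3 format use a single entry, and exFAT lays out its entries differently. The function names are illustrative.

```cpp
#include <cstdint>

constexpr std::uint32_t kDirEntryBytes    = 32;  // size of one FAT directory entry
constexpr std::uint32_t kLfnCharsPerEntry = 13;  // name characters per long-name entry
constexpr std::uint32_t kSectorBytes      = 512; // SD card sector size

// Entries used by one file: one short (8.3) entry plus its long-name entries.
// Assumption: the name does not fit 8.3, so long-name entries are always present.
constexpr std::uint32_t entriesPerFile(std::uint32_t nameLength) {
    return 1 + (nameLength + kLfnCharsPerEntry - 1) / kLfnCharsPerEntry;
}

// Total bytes of directory data for `fileCount` files with names of `nameLength` characters.
constexpr std::uint64_t directoryBytes(std::uint32_t fileCount, std::uint32_t nameLength) {
    return std::uint64_t{fileCount} * entriesPerFile(nameLength) * kDirEntryBytes;
}

// Sectors that a full scan of the directory has to read.
constexpr std::uint64_t sectorsToScan(std::uint64_t bytes) {
    return (bytes + kSectorBytes - 1) / kSectorBytes;
}
```

For 10,000 files with 15-character names such as file_000042.txt, each file takes 3 entries (96 bytes), so the directory holds 960,000 bytes, or 1,875 sectors that must be read before a new file can be created.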

Testing I/O

Here's an outline for two test programs to compare the speed difference between logging all files in a single directory and using a subfolder strategy. These programs will log the time taken for each write operation and store this information in a log file. The ESP32 will initiate a file write operation every 500 ms.
Qiwei Mao - SD Card I/O Performance Comparison
The performance analysis of SD card I/O on the ESP32 reveals distinct trends between storing files in a single folder versus organizing them into subfolders by minute intervals.
Storing files sequentially in a single folder shows a noticeable linear increase in file I/O time as the number of files grows. This suggests that accessing and managing files within a large directory becomes progressively slower due to increased search and retrieval times. In contrast, the subfolder approach initially exhibits similar increases in I/O time, but these are intermittently punctuated by drops in access times as new files are directed into fresh subfolders.
This pattern suggests that write time tracks the number of entries in the target directory rather than the total number of files on the card: each new subfolder starts nearly empty, so the directory scan is short again. Keeping directories small therefore limits the degradation seen in single-folder storage.
Program 1: Single Directory Logging
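
A minimal sketch of how a single-directory test of this kind could be structured, written in portable C++ so the logic can be checked off-device. The path scheme /log/file_NNNNNN.txt, the CSV row format, and all function names are assumptions for illustration, not taken from the test that produced the results above.

```cpp
#include <chrono>
#include <cstdio>
#include <string>

// Hypothetical naming scheme: every file lands in one flat directory.
std::string singleDirPath(unsigned fileIndex) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "/log/file_%06u.txt", fileIndex);
    return std::string(buf);
}

// Runs `op` once and returns the elapsed wall-clock time in microseconds.
template <typename Op>
long long timeOperationUs(Op op) {
    const auto start = std::chrono::steady_clock::now();
    op();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
}

// One CSV row for the timing log: "index,path,elapsed_us".
std::string timingRow(unsigned fileIndex, const std::string& path, long long elapsedUs) {
    return std::to_string(fileIndex) + "," + path + "," + std::to_string(elapsedUs);
}
```

On the ESP32 Arduino core, the operation passed to timeOperationUs would typically wrap an SD.open(path, FILE_WRITE), a write, and a close(), called once every 500 ms, with each timingRow appended to the log file.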
Program 2: Subfolder Strategy Logging
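
A minimal sketch of the subfolder variant, assuming one folder per minute of logging. With a write every 500 ms that is 120 files per folder. The folder naming (/log/min_NNNN) and the function names are illustrative assumptions.

```cpp
#include <cstdio>
#include <string>

constexpr unsigned kWriteIntervalMs = 500;                       // one file every 500 ms
constexpr unsigned kFilesPerMinute  = 60000 / kWriteIntervalMs;  // 120 files per folder

// True when this file is the first one in a new minute folder,
// i.e. the folder has to be created before the file is opened.
bool startsNewFolder(unsigned fileIndex) {
    return fileIndex % kFilesPerMinute == 0;
}

// Hypothetical folder name for the minute that `fileIndex` falls into.
std::string minuteDir(unsigned fileIndex) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "/log/min_%04u", fileIndex / kFilesPerMinute);
    return std::string(buf);
}

// Full path of the file inside its minute folder.
std::string subfolderPath(unsigned fileIndex) {
    char buf[48];
    std::snprintf(buf, sizeof(buf), "%s/file_%06u.txt", minuteDir(fileIndex).c_str(), fileIndex);
    return std::string(buf);
}
```

Because no folder ever holds more than 120 entries, the directory scan before each file creation stays short, which matches the periodic drops in write time described above.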

Strategies for Optimizing I/O Performance

File Organization

  • Subdirectories: Organize files into subdirectories to reduce the number of files in a single directory. This can significantly speed up directory searches.
    • For example, create a new folder for each day or hour.
  • File Rotation: Implement a file rotation scheme where older files are archived or deleted periodically.
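
As a sketch of both ideas, the helper below maps a Unix timestamp to a per-day, per-hour folder, and a second helper picks which day folders to delete when only the newest few should be kept. The /logs root, the folder naming, and the function names are assumptions; the timestamp is taken to be UTC.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <ctime>
#include <string>
#include <vector>

// Builds "/logs/YYYY-MM-DD/HH" from a Unix timestamp, interpreted as UTC.
std::string hourDir(std::time_t unixSeconds) {
    std::tm utc{};
    if (const std::tm* p = std::gmtime(&unixSeconds)) {
        utc = *p;  // copy out of gmtime's shared static buffer
    }
    char buf[40];
    std::snprintf(buf, sizeof(buf), "/logs/%04d-%02d-%02d/%02d",
                  utc.tm_year + 1900, utc.tm_mon + 1, utc.tm_mday, utc.tm_hour);
    return std::string(buf);
}

// Returns the day folders to delete so that only the newest `keep` remain.
// ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
std::vector<std::string> dirsToDelete(std::vector<std::string> dayDirs, std::size_t keep) {
    std::sort(dayDirs.begin(), dayDirs.end());
    if (dayDirs.size() <= keep) {
        return {};
    }
    dayDirs.resize(dayDirs.size() - keep);  // drop the newest `keep`, leaving the oldest
    return dayDirs;
}
```

Zero-padded, date-ordered names keep the rotation logic trivial: the oldest data is always the first name in sorted order.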

Indexing

  • Metadata Index: Maintain an in-memory index of file names and their corresponding directory entries. This index can help quickly locate files without having to search through the directory structure.
  • Time-based Indexing: Create an index based on timestamps to quickly access files relevant to a specific time range.
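
One possible shape for such an index is an ordered map from each file's start time to its path, which answers time-range queries without touching the card. The class and method names are hypothetical. On an ESP32, RAM is limited, so this sketch assumes one index entry per file or log segment rather than per record.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// In-memory index: start timestamp of each file -> path on the SD card.
class TimeIndex {
public:
    void add(std::uint32_t startTime, const std::string& path) {
        byStart_[startTime] = path;
    }

    // Paths of all files whose start time lies in [from, to], oldest first.
    std::vector<std::string> filesInRange(std::uint32_t from, std::uint32_t to) const {
        std::vector<std::string> out;
        for (auto it = byStart_.lower_bound(from);
             it != byStart_.end() && it->first <= to; ++it) {
            out.push_back(it->second);
        }
        return out;
    }

    std::size_t size() const { return byStart_.size(); }

private:
    std::map<std::uint32_t, std::string> byStart_;  // kept sorted by start time
};
```

The lookup costs O(log n) in memory, compared with a linear directory scan on the card for every open.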

File Structure Optimization

  • Single File for Time Series Data: Instead of creating a new file for each log entry, consider appending data to a single file or a small number of files. Use a binary format for efficient storage and retrieval.
  • Log Segmentation: Segment the log file into smaller chunks that represent a fixed time period (e.g., one hour or one day).
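
A sketch of a fixed-size binary record and hourly segment naming follows. The record layout (a 32-bit timestamp plus one 32-bit float, little-endian) and all names are assumptions chosen for illustration. Fixed-size records mean the byte offset of record i is simply i times the record size, so no parsing is needed to seek.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical log record: Unix timestamp plus one sensor reading.
struct Sample {
    std::uint32_t timestamp;
    float value;
};

constexpr std::size_t kRecordSize = 8;  // 4 bytes timestamp + 4 bytes float
static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

// Encodes a sample as 8 little-endian bytes, independent of struct padding.
std::array<std::uint8_t, kRecordSize> encode(const Sample& s) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &s.value, sizeof(bits));  // raw bit pattern of the float
    std::array<std::uint8_t, kRecordSize> out{};
    for (int i = 0; i < 4; ++i) {
        out[i]     = static_cast<std::uint8_t>(s.timestamp >> (8 * i));
        out[4 + i] = static_cast<std::uint8_t>(bits >> (8 * i));
    }
    return out;
}

// Decodes 8 little-endian bytes back into a sample.
Sample decode(const std::uint8_t* in) {
    std::uint32_t ts = 0;
    std::uint32_t bits = 0;
    for (int i = 0; i < 4; ++i) {
        ts   |= static_cast<std::uint32_t>(in[i]) << (8 * i);
        bits |= static_cast<std::uint32_t>(in[4 + i]) << (8 * i);
    }
    Sample s{ts, 0.0f};
    std::memcpy(&s.value, &bits, sizeof(bits));
    return s;
}

// Byte offset of record `index` within a segment file.
constexpr std::uint64_t recordOffset(std::uint32_t index) {
    return std::uint64_t{index} * kRecordSize;
}

// Hypothetical segment name: one file per hour since the Unix epoch.
std::string segmentName(std::uint32_t timestamp) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "/logs/seg_%lu.bin",
                  static_cast<unsigned long>(timestamp / 3600));
    return std::string(buf);
}
```

At one record every 500 ms, an hourly segment in this layout holds 7,200 records, or 57,600 bytes, in a single directory entry instead of 7,200 separate files.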

Metadata Management and Efficient Data Retrieval

In-Memory Metadata

  • Maintain metadata in memory for quick access. This metadata can include information about which files contain unsent data and the positions within those files.
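
A minimal sketch of such metadata: for each log file, track how many bytes have been written and how many have already been sent, so the next unsent position is known without opening anything. The class is hypothetical, and it assumes file names sort chronologically (for example, zero-padded or date-based names).

```cpp
#include <cstdint>
#include <map>
#include <string>

struct SegmentState {
    std::uint32_t bytesWritten = 0;
    std::uint32_t bytesSent = 0;
};

// Tracks, per log file, how much data still has to be transmitted.
class UnsentTracker {
public:
    void recordWrite(const std::string& path, std::uint32_t n) {
        segments_[path].bytesWritten += n;
    }

    // Marks up to `n` more bytes as sent, never beyond what has been written.
    void recordSent(const std::string& path, std::uint32_t n) {
        auto it = segments_.find(path);
        if (it == segments_.end()) {
            return;
        }
        SegmentState& s = it->second;
        const std::uint32_t pending = s.bytesWritten - s.bytesSent;
        s.bytesSent += (n < pending) ? n : pending;
    }

    // Finds the first file (in name order) with unsent data and the offset to resume from.
    bool nextUnsent(std::string& path, std::uint32_t& offset) const {
        for (const auto& entry : segments_) {
            if (entry.second.bytesSent < entry.second.bytesWritten) {
                path = entry.first;
                offset = entry.second.bytesSent;
                return true;
            }
        }
        return false;
    }

private:
    std::map<std::string, SegmentState> segments_;  // ordered by file name
};
```

Because this state lives only in RAM, it is lost on reset; writing the offsets to a small file from time to time is one way to recover them.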

Efficient File Reads

  • When retrieving data to send over LoRa, read large chunks of data at once and then process these chunks in memory to extract the needed rows. This minimizes the number of I/O operations.
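
The idea can be sketched as one large read followed by in-memory decoding. The example uses a standard input stream so it runs anywhere; the 8-byte Sample record is a hypothetical layout, and the sketch assumes the file was written on the same platform in the struct's native layout (the ESP32 is little-endian).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical fixed-size log record.
struct Sample {
    std::uint32_t timestamp;
    float value;
};

constexpr std::size_t kRecordSize = 8;
static_assert(sizeof(Sample) == kRecordSize, "sketch assumes an unpadded 8-byte record");

// Reads up to `maxRecords` records with a single read() call,
// then decodes them from the buffer. A trailing partial record is ignored.
std::vector<Sample> readChunk(std::istream& in, std::size_t maxRecords) {
    std::vector<char> buf(maxRecords * kRecordSize);
    in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
    const std::size_t got = static_cast<std::size_t>(in.gcount()) / kRecordSize;

    std::vector<Sample> out(got);
    for (std::size_t i = 0; i < got; ++i) {
        std::memcpy(&out[i], buf.data() + i * kRecordSize, kRecordSize);
    }
    return out;
}
```

With the Arduino SD library, the same pattern would use one File::read(buffer, length) call per chunk instead of one call per row. The chunk size is a trade-off between the number of I/O operations and the RAM used by the buffer.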

File Segmentation for LoRa Transfer

  • Segment files in a way that aligns with the data transfer needs. For example, store data in chunks that correspond to the maximum LoRa packet size, making it easier to retrieve and send.
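
A sketch of that alignment: split a buffer of fixed-size records into payloads that each hold a whole number of records and never exceed a payload limit. The function name is illustrative, and the limit is a parameter because the usable LoRa payload depends on the region, data rate, and protocol overhead.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Splits `data` into packets of at most `maxPayload` bytes.
// Each packet contains only whole records; a trailing partial record is dropped.
std::vector<std::vector<std::uint8_t>> packetize(const std::vector<std::uint8_t>& data,
                                                 std::size_t recordSize,
                                                 std::size_t maxPayload) {
    std::vector<std::vector<std::uint8_t>> packets;
    if (recordSize == 0 || maxPayload < recordSize) {
        return packets;  // nothing can be packed
    }
    const std::size_t perPacket = (maxPayload / recordSize) * recordSize;
    const std::size_t usable = (data.size() / recordSize) * recordSize;

    for (std::size_t off = 0; off < usable; off += perPacket) {
        const std::size_t len = std::min(perPacket, usable - off);
        const auto first = data.begin() + static_cast<std::ptrdiff_t>(off);
        packets.emplace_back(first, first + static_cast<std::ptrdiff_t>(len));
    }
    return packets;
}
```

For example, ten 8-byte records with a 50-byte limit yield one 48-byte packet (six records) and one 32-byte packet (four records), so no record is ever split across two transmissions.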

Rolling Logs

  • Use a rolling log mechanism where old data is periodically moved to archive storage, keeping the active log files smaller and more manageable.
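
One way to sketch a rolling log is a small state machine that decides which segment file the next write goes to and reports when an old segment has fallen out of the active window. The size limit, the number of segments kept, and the file naming are all assumptions.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

class RollingLog {
public:
    RollingLog(std::uint32_t maxBytes, unsigned keepSegments)
        : maxBytes_(maxBytes), keep_(keepSegments) {}

    // Returns the file the next `n` bytes should be appended to,
    // starting a new segment first if they would not fit in the current one.
    std::string segmentFor(std::uint32_t n) {
        if (used_ > 0 && std::uint64_t{used_} + n > maxBytes_) {
            ++index_;
            used_ = 0;
        }
        used_ += n;
        return name(index_);
    }

    // If more than `keepSegments` segments are active, writes the oldest
    // one's name to `out`, removes it from the active set, and returns true.
    bool oldestToArchive(std::string& out) {
        if (index_ - firstKept_ < keep_) {
            return false;
        }
        out = name(firstKept_);
        ++firstKept_;
        return true;
    }

    // Hypothetical naming scheme: zero-padded so names sort chronologically.
    static std::string name(unsigned i) {
        char buf[32];
        std::snprintf(buf, sizeof(buf), "/logs/log_%04u.bin", i);
        return std::string(buf);
    }

private:
    std::uint32_t maxBytes_;
    unsigned keep_;
    unsigned index_ = 0;      // segment currently being written
    unsigned firstKept_ = 0;  // oldest segment still in the active set
    std::uint32_t used_ = 0;  // bytes written to the current segment
};
```

The caller moves or deletes whatever oldestToArchive returns, so the active directory never holds more than keepSegments files.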

Conclusion

By organizing files into subdirectories, using binary formats, maintaining in-memory metadata, and optimizing file structures for both writing and reading, you can significantly improve the I/O performance of your ESP32 data logger. These strategies will help manage large numbers of files and ensure efficient data retrieval and transfer over your LoRa network.

About Me

Hi, I'm Qiwei Mao, a geotechnical engineer with a passion for IoT systems. I'm exploring low-power microcontrollers and LoRa communication systems to enable both hobbyist remote monitoring solutions and industrial-grade monitoring or control systems.
 
Qiwei Mao
 

© Qiwei Mao 2024