The source code of the distributed file system 3FS (Fire-Flyer File System) has been published. 3FS was developed for infrastructure that trains and runs large machine learning models. It is part of the Fire-Flyer AI platform and is used by the Chinese company DeepSeek, which develops language models with more than 600 billion parameters. The goal of 3FS is to provide a shared storage layer that simplifies the development of distributed applications. The file system is optimized for RDMA networks and for storing data on SSDs. The 3FS code is written in C++ (the chunk engine is written in Rust) and is released under the MIT license.
The file system is designed to serve a large number of random read operations at high throughput, a workload for which traditional caching and read-ahead techniques are not effective. This access pattern is typical of AI model training systems, which request small portions of unrelated, non-repeating data in batch mode. To bypass the file cache and access the storage devices directly, 3FS uses Direct I/O mode together with the Linux io_uring and AIO kernel interfaces. The requirement to align buffer sizes, pointers, and offsets for direct access to the drive is handled inside the file system itself.
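The alignment arithmetic behind a direct read can be sketched as follows. This is only an illustration, not code from 3FS: the function name `aligned_window` and the 4096-byte alignment are assumptions, and the actual alignment depends on the device and file system. The sketch widens an arbitrary byte range to aligned boundaries and reports how many leading bytes the caller has to skip.

```python
def aligned_window(offset: int, length: int, align: int = 4096):
    """Widen the byte range [offset, offset + length) to aligned boundaries.

    Returns (start, size, skip):
      start - aligned offset at which to issue the read
      size  - aligned number of bytes to read
      skip  - bytes to drop from the front of the buffer to reach `offset`
    The 4096-byte default is an assumed value for illustration.
    """
    if align <= 0 or align & (align - 1):
        raise ValueError("align must be a positive power of two")
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")

    mask = ~(align - 1)
    start = offset & mask                       # round the start down
    end = (offset + length + align - 1) & mask  # round the end up
    return start, end - start, offset - start


if __name__ == "__main__":
    # A 100-byte request at offset 5000 becomes one aligned 4096-byte read.
    print(aligned_window(5000, 100))
```

With this approach the caller can request any range, while the layer that talks to the drive only ever sees aligned offsets and sizes.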
File contents are split into blocks of equal size, which are distributed across several replication chains. The block size, the replication chains, and the chain tables can be set per directory. Each block has a unique identifier and is stored independently in different storage services, and the final file is formed by logically concatenating its blocks. When a file is created, block placement is chosen so that the load is spread evenly across storage nodes and SSDs.
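A minimal sketch of how a byte offset could be mapped to a block and a replication chain is shown below. The names `FileLayout` and `locate` are hypothetical, and the round-robin assignment of blocks to chains is an assumption made for illustration; the text above only states that blocks are spread across several chains.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FileLayout:
    """Hypothetical per-file layout: fixed block size plus an ordered list of chains."""
    block_size: int
    chains: Tuple[str, ...]


def locate(layout: FileLayout, offset: int):
    """Return (block_index, chain_id, offset_in_block) for a byte offset.

    Blocks are assigned to chains round-robin (an assumed policy).
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if layout.block_size <= 0 or not layout.chains:
        raise ValueError("layout needs a positive block size and at least one chain")

    index = offset // layout.block_size
    chain = layout.chains[index % len(layout.chains)]
    return index, chain, offset % layout.block_size


if __name__ == "__main__":
    layout = FileLayout(block_size=1024, chains=("chain-a", "chain-b", "chain-c"))
    for off in (0, 1500, 3 * 1024 + 5):
        print(off, locate(layout, off))
```

Because the mapping is a pure function of the layout and the offset, a client that knows the layout can compute where a block lives without asking a central service for every read.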
The file system consists of a cluster manager, a metadata service, a storage service, and a client, all connected over an RDMA network (InfiniBand or RoCE). The cluster manager is responsible for handling changes in cluster membership and for delivering configuration to services and clients. Several cluster managers run at the same time, one of which has primary status; if it fails, another instance takes over as primary.
Metadata services implement the file system semantics and are responsible for providing metadata and the lists of blocks associated with files and directories. FoundationDB is used for distributed storage of the metadata and the cluster configuration. The metadata services are stateless and equivalent to one another, so a client can connect to any of them.
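Since the metadata services are interchangeable, a client can pick any of them and fall back to another if one is unreachable. The sketch below illustrates that idea only; `call_any`, `ServiceUnavailable`, and the callable-based service interface are hypothetical and do not reflect the real 3FS client API.

```python
import random


class ServiceUnavailable(Exception):
    """Raised by a (hypothetical) service stub when it cannot be reached."""


def call_any(services, request, rng=random):
    """Send `request` to the services in random order and return the first reply.

    Any service may answer because none of them keeps per-client state.
    """
    candidates = list(services)
    rng.shuffle(candidates)  # spread load instead of always hitting the first one
    last_error = None
    for service in candidates:
        try:
            return service(request)
        except ServiceUnavailable as exc:
            last_error = exc  # remember the failure and try the next service
    raise ServiceUnavailable("no metadata service reachable") from last_error


if __name__ == "__main__":
    def down(request):
        raise ServiceUnavailable("connection refused")

    def up(request):
        return ("ok", request)

    print(call_any([down, up, down], "stat /data/file"))
```

Randomizing the order is one simple way to distribute clients across the services; a real client might instead use the configuration delivered by the cluster manager.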