.. _rank2file_file:

Rank2file map
-------------

The rank2file map tracks which files were written by which ranks
for a particular dataset.
This map contains information for every rank and file.
For large jobs, it may be larger than any single MPI process can load.
The information is therefore spread across multiple files that are organized as a tree.
These files are stored in the dataset directory on the parallel file system.
Internally, the data of the rank2file map is organized as a hash.

There is always a root file named ``rank2file.scr``.
Here are the contents of an example root rank2file map.

::

  LEVEL
    1
  RANKS
    4
  RANK
    0
      OFFSET
        0
      FILE
        .scr/rank2file.0.0.scr

Note that there is no ``VERSION`` field.
The version is implied by the summary file for the dataset.
The ``LEVEL`` field gives the level at which the current rank2file map
is located in the tree.
The leaves of the tree are at level 0.
The ``RANKS`` field specifies the number of ranks for which the current file
(and its associated subtree) contains information.

For levels above level 0, the ``RANK`` hash contains information
about other rank2file map files to be read.
Each entry in this hash is identified by a rank id,
and for each rank, a ``FILE`` and an ``OFFSET`` are given.
The rank id specifies which rank is responsible for reading content at the next level.
The ``FILE`` field specifies the name of the file to be read,
and the ``OFFSET`` field gives the starting byte offset within that file.

A process reading a file at the current level scatters the hash info
to the designated "reader" ranks, and those processes read data for the next level.
In this way, the task of reading the rank2file map is distributed
among multiple processes in the job.
The SCR library limits the amount of data that any process reads
in any single step (currently 1 MB).
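The lookup a process performs on a non-leaf map can be sketched in Python.
This is only an illustration, not SCR code: it assumes the hash is available
as an indented text dump with two spaces per nesting level, and the helper
names ``parse_hash``, ``only_key``, and ``next_level_reads`` are hypothetical.
How SCR actually encodes these files on disk is not covered here.

```python
import textwrap


def parse_hash(text, indent=2):
    """Parse an indented dump into nested dicts (assumed text layout)."""
    root = {}
    stack = [root]  # stack[d] is the dict that receives keys at depth d
    for line in text.splitlines():
        if not line.strip():
            continue
        depth = (len(line) - len(line.lstrip(" "))) // indent
        del stack[depth + 1:]          # drop deeper levels we have left
        child = {}
        stack[depth][line.strip()] = child
        stack.append(child)
    return root


def only_key(h):
    """Return the single key of a one-entry hash, e.g. the value of LEVEL."""
    (key,) = h
    return key


def next_level_reads(root):
    """List (reader_rank, file, offset) for each entry under RANK."""
    reads = []
    for rank, entry in root.get("RANK", {}).items():
        reads.append((int(rank),
                      only_key(entry["FILE"]),
                      int(only_key(entry["OFFSET"]))))
    return sorted(reads)


# The example root map from this section, as an indented dump.
ROOT = textwrap.dedent("""\
    LEVEL
      1
    RANKS
      4
    RANK
      0
        OFFSET
          0
        FILE
          .scr/rank2file.0.0.scr
    """)

root = parse_hash(ROOT)
level = int(only_key(root["LEVEL"]))
reads = next_level_reads(root)
print(level, reads)
```

Each tuple in ``reads`` names the rank that would read the next level,
the file it opens, and the byte offset at which it starts.
In a real job the reading process would scatter these tuples to the reader ranks
instead of printing them.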
Files at levels below the root have names of the form
``rank2file.<level>.<rank>.scr``,
where ``level`` is the level number within the tree
and ``rank`` is the rank of the process that wrote the file.

Finally, level 0 contains the data that maps a rank to a list of file names.
Here are the contents of an example rank2file map file at level 0.

::

  RANK2FILE
    LEVEL
      0
    RANKS
      4
    RANK
      0
        FILE
          rank_0.ckpt
            SIZE
              524294
            CRC
              0x6697d4ef
      1
        FILE
          rank_1.ckpt
            SIZE
              524295
            CRC
              0x28eeb9e
      2
        FILE
          rank_2.ckpt
            SIZE
              524296
            CRC
              0xb6a62246
      3
        FILE
          rank_3.ckpt
            SIZE
              524297
            CRC
              0x213c897a

Again, the number of ranks for which this file contains information
is recorded under the ``RANKS`` field.

There are entries for specific ranks under the ``RANK`` hash,
which is indexed by rank id within ``scr_comm_world``.
For a given rank, each file that rank wrote as part of the dataset
is indexed by file name under the ``FILE`` hash.
The file name specifies the relative path to the file starting from the dataset directory.
For each file, SCR records the size of the file in bytes under ``SIZE``,
and SCR may also record the CRC32 checksum of the file contents under the ``CRC`` field.

On restart, the reader rank that reads this hash scatters the information to the owner rank,
so that by the end of processing the tree, all processes know which files to read.
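As an illustration of how an owner rank might use its level 0 entry,
the following Python sketch looks up the files recorded for a rank
and checks each one against its ``SIZE`` and optional ``CRC``.
It is not SCR code. The hash is modeled as nested dicts,
the helper names ``files_for_rank`` and ``verify`` are hypothetical,
and the sketch assumes that ``CRC`` holds the standard CRC-32 value
computed by ``zlib.crc32``.

```python
import os
import tempfile
import zlib


def files_for_rank(level0, rank):
    """Return {file_name: (size, crc_or_None)} for one rank id."""
    entry = level0["RANK"].get(str(rank), {})
    result = {}
    for name, meta in entry.get("FILE", {}).items():
        size = int(next(iter(meta["SIZE"])))
        # CRC is optional, so fall back to None when it is absent.
        crc = int(next(iter(meta["CRC"])), 16) if "CRC" in meta else None
        result[name] = (size, crc)
    return result


def verify(dataset_dir, name, size, crc):
    """Check a file's size and, if given, its CRC-32 (assumed zlib variant)."""
    path = os.path.join(dataset_dir, name)  # names are relative to the dataset dir
    if os.path.getsize(path) != size:
        return False
    if crc is None:
        return True
    value = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            value = zlib.crc32(chunk, value)
    return value == crc


# Demo with a made-up file so that the recorded values are known to match.
with tempfile.TemporaryDirectory() as d:
    payload = b"example checkpoint bytes"
    with open(os.path.join(d, "rank_0.ckpt"), "wb") as f:
        f.write(payload)
    level0 = {
        "LEVEL": {"0": {}},
        "RANKS": {"1": {}},
        "RANK": {"0": {"FILE": {"rank_0.ckpt": {
            "SIZE": {str(len(payload)): {}},
            "CRC": {hex(zlib.crc32(payload)): {}},
        }}}},
    }
    files = files_for_rank(level0, 0)
    ok = all(verify(d, n, s, c) for n, (s, c) in files.items())
    bad = verify(d, "rank_0.ckpt", len(payload) + 1, None)
    missing = files_for_rank(level0, 3)
    print(ok, bad, missing)
```

Reading the file in 1 MiB chunks keeps memory use bounded for large checkpoint files;
the chunk size is an arbitrary choice for this sketch.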