Transfer file

When using the asynchronous flush, the library creates a dataset directory within the prefix directory, and then it relies on an external task to actually copy data from the cache to the parallel file system. The library communicates when and what files should be copied by updating the transfer file. A scr_transfer daemon process running in the background on each compute node periodically reads this file to check whether any files needs to be copied. If so, it copies data out in small bursts, sleeping a short time between bursts in order to throttle its CPU and bandwidth usage. The code for this daemon is in scr_transfer.c. Here is what the contents of a transfer file look like:

FILES
  /tmp/user1/scr.1001186/index.0/dataset.1/rank_0.ckpt
    DESTINATION
      /p/lscratchb/user1/simulation123/scr.dataset.1/rank_0.ckpt
    SIZE
      524294
    WRITTEN
      524294
  /tmp/user1/scr.1001186/index.0/dataset.1/rank_0.ckpt.scr
    DESTINATION
      /p/lscratchb/user1/simulation123/scr.dataset.1/rank_0.ckpt.scr
    SIZE
      124
    WRITTEN
      124
PERCENT
  0.000000
BW
  52428800.000000
COMMAND
  RUN
STATE
  STOPPED
FLAG
  DONE

The library specifies the list of files to be flushed by absolute file name under the FILES hash. For each file, the library specifies the size of the file (in bytes) under SIZE, and it specifies the absolute path where the file should be written to under DESTINATION.

The library also specifies limits for the scr_transfer process. The PERCENT field specifies the percentage of CPU time the scr_transfer process should spend running. The daemon monitors how long it runs for when issuing a write burst, and then it sleeps for an appropriate amount of time before executing the next write burst so that it stays below this threshold. The BW field specifies the amount of bandwidth the daemon may consume (in bytes/sec) while copying data. The daemon process monitors how much data it has written along with the time taken to write that data, and it adjusts its sleep periods between write bursts to keep below its bandwidth limit.

Once the library has specified the list of files to be transferred and set any limits for the scr_transfer process, it sets the COMMAND field to RUN. The scr_transfer process does not start to copy data until this RUN command is issued. The library may also specify the EXIT command, which causes the scr_transfer process to exit.

The scr_transfer process records its current state in the STATE field, which may be one of: STOPPED (waiting to do something) and RUNNING (actively flushing). As the scr_transfer process copies each file out, it records the number of bytes it has written (and fsync’d) under the WRITTEN field. When all files in the list have been copied, scr_transfer sets the DONE flag under the FLAG field. The library periodically looks for this flag, and once set, the library completes the flush by writing the summary file in the dataset directory and updating the index file in the prefix directory.