FFI Sync

In the Random Reproducibility blog entry, I mentioned that I was running the sync command in order to invoke the sync system call. That approach starts a new process every time the program syncs, which is wasteful. I have since changed the code to call sync directly through the FFI instead.
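
For comparison, the process-based approach might look roughly like the following. This is a minimal sketch, not the original code: the syncViaProcess name is hypothetical, and it assumes the process package and a sync executable on the PATH.

```haskell
module Main (main) where

-- Assumes the "process" package, which ships with GHC.
import System.Process (callProcess)

-- | Hypothetical helper: flush filesystem buffers by running the external
-- sync command.  Every call spawns a new process, which is the overhead
-- that an FFI call avoids.  callProcess throws if the command exits with
-- a non-zero status.
syncViaProcess :: IO ()
syncViaProcess = callProcess "sync" []

main :: IO ()
main = do
  syncViaProcess
  putStrLn "sync command finished"
```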

Implementation

The implementation is trivial. To test it, I copied the code to a test program, which is available on GitHub. I do not have an environment to test the build on Windows, where sync does nothing, but I organized the code so that it should work there. The FFI binding lives in its own module, which is only built when not on Windows, and the API module uses CPP to import that module only in that case. The library is configured as follows:

library
  hs-source-dirs: src
  exposed-modules:
      FFISync
  if !os(windows)
    other-modules:
        FFISync.Unistd
  build-depends:
      base >=4.7 && <5
  default-language: Haskell2010
  ghc-options: -Wall

The FFISync.Unistd module binds the sync system call as a c_sync function. This is about the simplest possible use of the FFI: the function takes no arguments, the type is IO (), and the call cannot even fail!

{-# LANGUAGE ForeignFunctionInterface #-}

module FFISync.Unistd (c_sync) where

foreign import ccall "unistd.h sync"
  c_sync :: IO ()
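
To see why this case is so simple, it may help to contrast it with a call that can fail. The following sketch is not part of the test program: it binds fsync, which returns -1 and sets errno on failure, so the wrapper has to check the result. The c_fsync and fsyncFd names are hypothetical, and the example assumes a POSIX system.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

module Main (main) where

import Control.Exception (IOException, try)
import Foreign.C.Error (throwErrnoIfMinus1_)
import Foreign.C.Types (CInt (..))

-- Unlike sync, fsync reports failure by returning -1 and setting errno.
foreign import ccall "unistd.h fsync"
  c_fsync :: CInt -> IO CInt

-- | Hypothetical wrapper: throw an IOError built from errno when fsync
-- returns -1.
fsyncFd :: CInt -> IO ()
fsyncFd fd = throwErrnoIfMinus1_ "fsync" (c_fsync fd)

main :: IO ()
main = do
  -- -1 is never a valid file descriptor, so this call is expected to fail.
  result <- try (fsyncFd (-1)) :: IO (Either IOException ())
  case result of
    Left err -> putStrLn ("fsync failed: " ++ show err)
    Right () -> putStrLn "fsync succeeded"
```

With sync, none of this error handling is needed, which is why c_sync can be exported as is.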

The FFISync module provides the public API: the sync function. This function calls c_sync when not on Windows and does nothing otherwise.

{-# LANGUAGE CPP #-}

module FFISync (sync) where

-- (ffi-sync)
#ifndef mingw32_HOST_OS
import FFISync.Unistd (c_sync)
#endif

sync :: IO ()
#ifdef mingw32_HOST_OS
sync = pure ()
#else
sync = c_sync
#endif

The ffi-sync executable simply calls sync.

module Main (main) where

import FFISync (sync)

main :: IO ()
main = sync
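
For a quick experiment without the package structure, the same pieces can be combined into a single file. This is only a sketch under that assumption: the file name ffi-sync-demo.txt is hypothetical, and the program writes it to the current directory before syncing.

```haskell
{-# LANGUAGE CPP #-}
{-# LANGUAGE ForeignFunctionInterface #-}

module Main (main) where

-- The binding is only compiled when not on Windows.
#ifndef mingw32_HOST_OS
foreign import ccall "unistd.h sync"
  c_sync :: IO ()
#endif

-- | Flush filesystem buffers; does nothing on Windows.
sync :: IO ()
#ifdef mingw32_HOST_OS
sync = pure ()
#else
sync = c_sync
#endif

main :: IO ()
main = do
  -- Hypothetical demo file, written to the current directory.
  writeFile "ffi-sync-demo.txt" "hello\n"
  sync
  putStrLn "write and sync finished"
```

Keeping the binding in a separate module, as the library does, has the advantage that the module with the foreign import is never built on Windows at all.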

Test

To test this implementation, I used a particularly slow USB memory stick. I created two test files with 32 MB of random data each.

$ dd if=/dev/urandom of=test-random1 bs=1M count=32
$ dd if=/dev/urandom of=test-random2 bs=1M count=32

When simply copying a file to the USB memory, the command finishes very quickly, but the data has not yet been written to the device. The kernel buffers the writes and flushes them in the background.

$ time cp test-random1 /mnt/stickB

real    0m0.079s
user    0m0.001s
sys     0m0.055s

The following script copies the second test file to the USB memory and then runs the ffi-sync executable.

#!/bin/sh

cp test-random2 /mnt/stickB
stack exec ffi-sync

Running this script takes much longer, about 4.5 seconds compared to 0.08 seconds for the plain copy, indicating that it is indeed synchronizing.

$ time ./cp-and-sync.sh

real    0m4.532s
user    0m0.274s
sys     0m0.078s
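
The time reported above includes the copy and the startup cost of stack exec. To measure only the sync call, one option is to time it from within Haskell. The following is a sketch, not part of the test program: the timed helper is hypothetical, it assumes a non-Windows system, and it relies on GHC.Clock from base 4.11 or later.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

module Main (main) where

import GHC.Clock (getMonotonicTime)

foreign import ccall "unistd.h sync"
  c_sync :: IO ()

-- | Hypothetical helper: run an action and return its result together
-- with the elapsed wall-clock time in seconds.
timed :: IO a -> IO (a, Double)
timed action = do
  start <- getMonotonicTime
  result <- action
  end <- getMonotonicTime
  pure (result, end - start)

main :: IO ()
main = do
  (_, seconds) <- timed c_sync
  putStrLn ("sync took " ++ show seconds ++ " seconds")
```

Running this right after copying a file to the slow device should show most of the delay inside the sync call itself.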

Author

Travis Cardwell

Published

December 7, 2021