Explanation: why wc on Haskell turned out to be β€œfaster” than the analogue on C


After recent articles ( No. 10xd34df00d, β„–2chapuza, β„–3picul) comparing the speed of implementations of the simplified wc utility , I had only one question - How did a simple implementation in Haskell turn out to be faster than a simple implementation in C ?!


2020-02-27 : The results and conclusions forghc 8.8.3and on the Shakespeare texts are confirmed(at the end under the spoiler).


Briefly recall TK


For a 1.8Gb text file (1_871_822_210 bytes), you need to count the number of lines, words and characters. With a number of simplifications: only 8-bit character encoding, end of lines only in Unix style, word boundaries are spaces and all characters are less Char(14), reading only from a file (mapping to memory is acceptable).


Sample Test Data

Region,Country,Item Type,Sales Channel,Order Priority,Order Date,Order ID,Ship Date,Units Sold,Unit Price,Unit Cost,Total Revenue,Total Cost,Total Profit
Sub-Saharan Africa,South Africa,Fruits,Offline,M,7/27/2012,443368995,7/28/2012,1593,9.33,6.92,14862.69,11023.56,3839.13
Middle East and North Africa,Morocco,Clothes,Online,M,9/14/2013,667593514,10/19/2013,4611,109.28,35.84,503890.08,165258.24,338631.84
Australia and Oceania,Papua New Guinea,Meat,Offline,M,5/15/2015,940995585,6/4/2015,360,421.89,364.69,151880.40,131288.40,20592.00
Sub-Saharan Africa,Djibouti,Clothes,Offline,H,5/17/2017,880811536,7/2/2017,562,109.28,35.84,61415.36,20142.08,41273.28
Europe,Slovakia,Beverages,Offline,L,10/26/2016,174590194,12/4/2016,3973,47.45,31.79,188518.85,126301.67,62217.18
Asia,Sri Lanka,Fruits,Online,L,11/7/2011,830192887,12/18/2011,1379,9.33,6.92,12866.07,9542.68,3323.39
Sub-Saharan Africa,Seychelles ,Beverages,Online,M,1/18/2013,425793445,2/16/2013,597,47.45,31.79,28327.65,18978.63,9349.02
Sub-Saharan Africa,Tanzania,Beverages,Online,L,11/30/2016,659878194,1/16/2017,1476,47.45,31.79,70036.20,46922.04,23114.16
Sub-Saharan Africa,Ghana,Office Supplies,Online,L,3/23/2017,601245963,4/15/2017,896,651.21,524.96,583484.16,470364.16,113120.00
Sub-Saharan Africa,Tanzania,Cosmetics,Offline,L,5/23/2016,739008080,5/24/2016,7768,437.20,263.33,3396169.60,2045547.44,1350622.16
Asia,Taiwan,Fruits,Offline,M,2/9/2014,732588374,2/23/2014,8034,9.33,6.92,74957.22,55595.28,19361.94
Middle East and North Africa,Algeria,Cosmetics,Online,M,2/18/2011,761723172,2/24/2011,9669,437.20,263.33,4227286.80,2546137.77,1681149.03
Asia,Singapore,Snacks,Online,C,1/28/2013,176461303,2/7/2013,7676,152.58,97.44,1171204.08,747949.44,423254.64
Australia and Oceania,Papua New Guinea,Clothes,Offline,L,6/20/2011,647164094,7/14/2011,9092,109.28,35.84,993573.76,325857.28,667716.48
Asia,Vietnam,Personal Care,Online,M,4/4/2010,314505374,5/6/2010,7984,81.73,56.67,652532.32,452453.28,200079.04
Sub-Saharan Africa,Uganda,Personal Care,Online,M,6/19/2014,539471471,7/21/2014,451,81.73,56.67,36860.23,25558.17,11302.06
Sub-Saharan Africa,Zimbabwe,Office Supplies,Offline,C,3/28/2011,953361213,4/8/2011,9623,651.21,524.96,6266593.83,5051690.08,1214903.75
Sub-Saharan Africa,Ethiopia,Cosmetics,Online,M,7/7/2011,807785928,7/25/2011,662,437.20,263.33,289426.40,174324.46,115101.94
Europe,France,Cosmetics,Online,M,12/7/2015,324669444,1/18/2016,5758,437.20,263.33,2517397.60,1516254.14,1001143.46


Previous Results


"" , . i7-4600U CPU @ 2.10GHz. β€” . , /dev/shm/ (tmpfs).


" ", Haskell- ( , CPU):


wc-19.416
Haskellghc 8.0.2, -O22.811
gcc 8, -Ofast3.067
gcc 8, -Ofast0.637

, () - Haskell. - "". ghc (2.811 < 3.067), Haskell 2:3 Haskell. .


, . , AVX2:


Haskellghc 8.0.2, -O2 -mavx22,789
gcc 8, -Ofast -march=haswell2.914
clang 8, -Ofast -march=haswell1.124
gcc 8, -Ofast -march=haswell0.634
AVX2gcc 8, -Ofast -march=haswell0.269

, AVX2 Haskell, clang ( CompilerExplorer), . , AVX2 .


- Haskell


. , .


, -ddump-asm. wc.hs 24 . - , grep cmp : cmpq $33,%rbx; cmpq $32,%rbx; cmpq $10,%rbx; cmpb $4,%bl. .


- . C:


static _Bool process_chunk(const unsigned char *text, const size_t bytes,
                           _Bool prev) {
  result.chars += bytes;
  for (size_t i = 0; i < bytes;) {
    if (text[i] > ' ') {
      // -  ""
      result.words += !prev;
      prev = 1;
      while (++i < bytes && text[i] > ' ')
        ;
    } else {
      // -  
      prev = 0;
      do {
        _Bool non_space = text[i] != ' ' && text[i] - 9 > 4;
        result.words += !prev && non_space;
        result.lines += text[i] == '\n';
        prev = non_space;
      }
      while (++i < bytes && text[i] <= ' ');
    }
  }
  return prev;
}

Haskell-, . , "" :


Haskellgcc 8, -Ofast1.331
Haskellghc 8.0.2, -O22.811
gcc 8, -Ofast3.067
gcc 8, -Ofast0.637

"" Haskell- , ! , , ghc .


Haskell C, ( ):


  • ghc decision tree , . -: " " " ".
  • - " " , -. , , "" 1.8 ;)
  • , . , "" - ( ). β€” .

, 1871822210 / 44774631 = 41.8 , 44774631 / 15000000 = 2.98 .
-. - , . ?


1.8 (3068561 * 610 = 1871822210):


$ dd if=/dev/urandom of=/dev/shm/rand.dat bs=3068561 count=610

, " " , "" . 1871822210 / 210195110 = 8.9, . , "" . , "" 41 . "" :


Haskellgcc 8, -Ofast1.331 => 3.532
Haskellghc 8.0.2, -O22.811 => 4.812
gcc 8, -Ofast3.062 => 3.071

- β€” . , C. .


Haskell "" , . , , C - ;)


ghc 8.8.3 (2020-02-27)

:


  1. - LLVM.
  2. gcc clang.
  3. , .
  4. , "" Haskell () .
  5. wc .

. i7-7820HQ CPU @ 2.90GHz 64- Fedora 31.


5_777_367 . " " 1.8:


$ wget http://www.gutenberg.org/files/100/100-0.txt
$ for i in `seq 1 333`; do cat 100-0.txt >> test.txt; done

1_923_863_211 . C :


wc-12.687
Haskellgcc 9.2.1, -Ofast4.947
Haskellclang 9.0.1, -Ofast4.689
Haskellghc 8.8.3, -O36.339 (!)
Haskellghc 8.8.3, -O3 -fllvm4.893 (!)
gcc 9.2.1, -Ofast2.445
clang 9.0.1, -Ofast2.347

/dev/urandom 1_871_822_210 :


wc-16.878
Haskellgcc 9.2.1, -Ofast3.322
Haskellclang 9.0.1, -Ofast3.344
Haskellghc 8.8.3, -O33.976
Haskellghc 8.8.3, -O3 -fllvm2.513
gcc 9.2.1, -Ofast2.376
clang 9.0.1, -Ofast2.244

ghc 8.8.3 8.0.2, LLVM ( 9.0.1) . , C - , . " " /dev/urandom, .


, 0xd34df00d, Haskell . , .


, , .


!

: "" "" C ( ), : "" . , zero cost abstraction , zero cost abstraction .


All Articles