Parallel Binary Sorter [RESULTS]
The tables below give the timings. Sorting 128 million keys on 2 processors took a wall-clock time of 15 min 0.42 sec, and for that key count the time keeps falling as processors are added, up to 64. For the 64 million and 32 million key runs the wall-clock time falls up to 32 processors and then rises again at 64. I also report the user time (time spent by the processors scanning the keys and putting them into buckets) in ticks (on SGI, 1 tick = 1 ms). The communication cost, also in ticks, is in the COMM column; for example, for 128 million keys with N=2, COMM = 5560, which is 5.560 sec. N is the number of processors. The second and third tables abbreviate the column headers as U, C, C.U and WALL.

=======================================
KEYS=134217728 (128 million), each 16 bytes
---------------------------------------
     N    USER  COMM  USER.COMM  WALL.CLOCK
1    2  770922  5560     776482    15:00.42
2    4  471196  4272     475468     8:29.62
3    8  314281  3456     317737     5:17.65
4   16  135562  2148     137710     2:21.83
5   32   82414   950      83364     1:28.87
6   64   75160   559      75719     1:21.54
---------------------------------------

========================================
KEYS=67108864 (64 million), each 16 bytes
----------------------------------------
     N       U     C     C.U     WALL
1    2  340653  2348  343001  6:18.77
2    4  240861  1931  242792  4:14.57
3    8  153782  1781  155563  2:39.18
4   16   64408   804   65212  1:10.91
5   32   46659   486   47145  0:53.32
6   64   61618   369   61987  1:08.19
----------------------------------------

====================================
KEYS=33554432 (32 million), each 16 bytes
------------------------------------
     N       U     C     C.U     WALL
1    2  148070  1219  148070  2:42.66
2    4   99067   677   99744  1:48.60
3    8   47319   322   47641  0:55.41
4   16   17936   135   18071  0:25.64
5   32    9973   191   10164  0:17.55
6   64   10330    58   10388  0:28.04
------------------------------------
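The tick-to-seconds conversion and the scaling across processor counts can be checked with a few lines of Python. This is only a sketch: the helper names (ticks_to_seconds, wall_to_seconds, speedups) are illustrative and not part of the sorter, the data is copied from the 128 million key table, and the 1 tick = 1 ms factor is the one stated above for SGI. Speedup is computed relative to the 2-processor run, since no single-processor timing is reported.

```python
# Sketch: convert reported ticks to seconds and compute wall-clock speedup
# relative to the 2-processor run (no 1-processor timing is reported).
# Assumption taken from the text: on the SGI machine, 1 tick = 1 ms.
TICK_SECONDS = 0.001

# (processors, USER ticks, COMM ticks, wall clock "M:SS.ss"),
# copied from the 128 million key table.
RUNS_128M = [
    (2, 770922, 5560, "15:00.42"),
    (4, 471196, 4272, "8:29.62"),
    (8, 314281, 3456, "5:17.65"),
    (16, 135562, 2148, "2:21.83"),
    (32, 82414, 950, "1:28.87"),
    (64, 75160, 559, "1:21.54"),
]


def ticks_to_seconds(ticks):
    """Convert a tick count to seconds using the assumed 1 ms tick."""
    return ticks * TICK_SECONDS


def wall_to_seconds(wall):
    """Parse a wall-clock string of the form 'M:SS.ss' into seconds."""
    minutes, seconds = wall.split(":")
    return int(minutes) * 60 + float(seconds)


def speedups(runs):
    """Return (processors, speedup) pairs relative to the first run."""
    base = wall_to_seconds(runs[0][3])
    return [(n, base / wall_to_seconds(wall)) for n, _, _, wall in runs]


if __name__ == "__main__":
    for (n, user, comm, wall), (_, s) in zip(RUNS_128M, speedups(RUNS_128M)):
        print(
            f"N={n:2d}  user={ticks_to_seconds(user):8.2f}s  "
            f"comm={ticks_to_seconds(comm):5.2f}s  "
            f"wall={wall_to_seconds(wall):7.2f}s  speedup={s:5.2f}x"
        )
```

Swapping in the rows of the 64 million or 32 million key tables gives the same comparison for those runs.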