Too many missed branches

user1 于 2023-02-20 10:36:30 发表,最后修改于 2023-02-20 10:38:36


#include <algorithm>

void sort(unsigned *a, int n) {
    for (int d=n; d=d*.6; ) {
        for (int i=d; i<n; ++i) {
            int j = i;
            do {
                if (a[j-d]<=a[j]) break;
                std::swap (a[j-d], a[j]);
                j -=d;
            } while (j>=d);
        }
    }
}

unsigned X[1<<27];
int main() {
    unsigned t=1;
    for (int i=0; i<1<<27; ++i) X[i] = t*=3;
    sort (X, 1<<27);
}

Performance counter stats for './a.out':

         22,789.61 msec task-clock                #    0.994 CPUs utilized          
             4,205      context-switches          #  184.514 /sec                   
             1,046      cpu-migrations            #   45.898 /sec                   
           131,119      page-faults               #    5.753 K/sec                  
    91,640,809,004      cycles                    #    4.021 GHz                      (83.35%)
     4,061,525,860      stalled-cycles-frontend   #    4.43% frontend cycles idle     (83.33%)
     3,348,830,565      stalled-cycles-backend    #    3.65% backend cycles idle      (83.35%)
    68,127,471,038      instructions              #    0.74  insn per cycle         
                                                  #    0.06  stalled cycles per insn  (83.31%)
    18,599,476,046      branches                  #  816.138 M/sec                    (83.32%)
     2,573,416,358      branch-misses             #   13.84% of all branches          (83.35%)

      22.918302903 seconds time elapsed

      22.636324000 seconds user
       0.151680000 seconds sys

Performance counter stats for './a.out' using std::sort:

         10,913.15 msec task-clock                #    0.988 CPUs utilized          
             2,676      context-switches          #  245.209 /sec                   
               512      cpu-migrations            #   46.916 /sec                   
           131,119      page-faults               #   12.015 K/sec                  
    43,757,262,426      cycles                    #    4.010 GHz                      (83.43%)
     2,633,857,164      stalled-cycles-frontend   #    6.02% frontend cycles idle     (83.33%)
       814,202,475      stalled-cycles-backend    #    1.86% backend cycles idle      (83.32%)
    31,297,521,560      instructions              #    0.72  insn per cycle         
                                                  #    0.08  stalled cycles per insn  (83.25%)
     7,801,813,632      branches                  #  714.900 M/sec                    (83.36%)
     1,447,240,904      branch-misses             #   18.55% of all branches          (83.31%)

      11.046072689 seconds time elapsed

      10.540893000 seconds user
       0.371749000 seconds sys

Judge Duck Online | 评测鸭在线
Server Time: 2025-01-18 15:57:10 | Loaded in 16 ms | Server Status
个人娱乐项目,仅供学习交流使用 | 捐赠