    a9473f9c
    middle-end: For multiplication try swapping operands when matching complex multiply [PR116463] · a9473f9c
    Tamar Christina authored
    This commit fixes the failures of complex.exp=fast-math-complex-mls-*.c on the
    GCC 14 branch and some of those on master.
    
    The current matching looks for only one operand order for multiplication and
    relies on canonicalization to always give the right order because of the
    TWO_OPERANDS.
    
    However, for multiplication, trying only one order is fragile because the
    operands can be flipped.
    
    The failing tests on the branch are:
    
    void fms180snd(_Complex TYPE a[restrict N], _Complex TYPE b[restrict N],
                   _Complex TYPE c[restrict N]) {
      for (int i = 0; i < N; i++)
        c[i] -= a[i] * (b[i] * I * I);
    }
    
    void fms180fst(_Complex TYPE a[restrict N], _Complex TYPE b[restrict N],
                   _Complex TYPE c[restrict N]) {
      for (int i = 0; i < N; i++)
        c[i] -= (a[i] * I * I) * b[i];
    }
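
    As a sanity check on what these kernels compute: multiplying by I * I is
    multiplying by -1, so both loops are equivalent to c[i] += a[i] * b[i] and
    differ only in which multiplicand carries the rotation.  The sketch below
    instantiates them with assumed values TYPE = double and N = 4 (the testsuite
    supplies its own), and kernels_agree and check_kernels are hypothetical
    helpers that are not part of the testsuite.

```c
#include <assert.h>
#include <complex.h>

/* Assumed values for illustration; the testsuite defines its own.  */
#define TYPE double
#define N 4

void fms180snd(_Complex TYPE a[restrict N], _Complex TYPE b[restrict N],
               _Complex TYPE c[restrict N]) {
  for (int i = 0; i < N; i++)
    c[i] -= a[i] * (b[i] * I * I);
}

void fms180fst(_Complex TYPE a[restrict N], _Complex TYPE b[restrict N],
               _Complex TYPE c[restrict N]) {
  for (int i = 0; i < N; i++)
    c[i] -= (a[i] * I * I) * b[i];
}

/* Hypothetical helper: returns 1 when both kernels equal c + a * b.
   Small integer inputs keep every intermediate result exact.  */
int kernels_agree(void) {
  _Complex TYPE a[N], b[N], c1[N], c2[N], ref[N];
  for (int i = 0; i < N; i++) {
    a[i] = (i + 1) + (i + 2) * I;
    b[i] = (2 * i + 1) - (i + 3) * I;
    c1[i] = c2[i] = ref[i] = 1.0 + 1.0 * I;
    ref[i] += a[i] * b[i];
  }
  fms180snd(a, b, c1);
  fms180fst(a, b, c2);
  for (int i = 0; i < N; i++)
    if (c1[i] != ref[i] || c2[i] != ref[i])
      return 0;
  return 1;
}

/* Hypothetical usage: aborts if the two kernels ever disagree.  */
void check_kernels(void) {
  assert(kernels_agree());
}
```

    Since the two forms are mathematically the same, which operand ends up on
    which side of the multiply is an artifact of how the expression was written.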
    
    The issue is just a small difference in commutative operations: we look for
    {R,R} * {R,I} but find {R,I} * {R,R}.
    
    Since the DF analysis is cached, we should be able to swap operands and retry
    for multiply cheaply.
    
    vect_validate_multiplication checks a constraint on the data flow of the
    operands feeding the multiplications.  For example, between the nodes:
    
    note:   node 0x4d1d210 (max_nunits=2, refcnt=3) vector(2) double
    note:   op template: _27 = _10 * _25;
    note:      stmt 0 _27 = _10 * _25;
    note:      stmt 1 _29 = _11 * _25;
    note:   node 0x4d1d060 (max_nunits=2, refcnt=2) vector(2) double
    note:   op template: _26 = _11 * _24;
    note:      stmt 0 _26 = _11 * _24;
    note:      stmt 1 _28 = _10 * _24;
    
    we require the lanes to come from the same source, which
    vect_validate_multiplication checks.  As such, it doesn't make sense to flip
    them individually, because that would invalidate the earlier linear_loads_p
    checks, which have validated that the arguments all come from the same
    datarefs.
    
    This patch therefore flips the operands in unison, which maintains this
    invariant while still honoring the commutative nature of multiplication.
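
    The idea of retrying with both nodes swapped together can be sketched with a
    toy model.  This is not the code in tree-vect-slp-patterns.cc: struct
    mul_node, is_splat, same_source, match_one_order and match_with_swap are
    hypothetical names, and the "operand 0 is a splat, operand 1 is a pair from
    the same source" rule is an assumed simplification of the real checks.  The
    lane values are taken from the SSA names in the dump above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a two-lane multiply node: op[operand][lane] = SSA id.  */
struct mul_node { int op[2][2]; };

bool is_splat(const int lanes[2]) { return lanes[0] == lanes[1]; }

/* True when both operands draw their lanes from the same two ids.  */
bool same_source(const int x[2], const int y[2]) {
  return (x[0] == y[0] && x[1] == y[1]) || (x[0] == y[1] && x[1] == y[0]);
}

/* Assumed single-order rule: operand 0 is a splat and operand 1 is a
   pair in both nodes, and the two pairs come from the same source.  */
bool match_one_order(const struct mul_node *m0, const struct mul_node *m1) {
  return is_splat(m0->op[0]) && is_splat(m1->op[0])
         && !is_splat(m0->op[1]) && !is_splat(m1->op[1])
         && same_source(m0->op[1], m1->op[1]);
}

void swap_operands(struct mul_node *n) {
  for (int lane = 0; lane < 2; lane++) {
    int tmp = n->op[0][lane];
    n->op[0][lane] = n->op[1][lane];
    n->op[1][lane] = tmp;
  }
}

/* Try the given order, then retry with BOTH nodes swapped together.
   Nodes are never flipped individually; on failure they are restored.  */
bool match_with_swap(struct mul_node *m0, struct mul_node *m1) {
  if (match_one_order(m0, m1))
    return true;
  swap_operands(m0);
  swap_operands(m1);
  if (match_one_order(m0, m1))
    return true;
  swap_operands(m0);
  swap_operands(m1);
  return false;
}

/* Hypothetical usage with the two nodes from the dump.  */
void demo_swap_in_unison(void) {
  struct mul_node m0 = { { {10, 11}, {25, 25} } };  /* _10*_25, _11*_25 */
  struct mul_node m1 = { { {11, 10}, {24, 24} } };  /* _11*_24, _10*_24 */
  assert(!match_one_order(&m0, &m1));
  assert(match_with_swap(&m0, &m1));
}
```

    In this model a pair of nodes whose operands are flipped relative to each
    other is rejected rather than repaired, mirroring the decision not to flip
    nodes individually.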
    
    gcc/ChangeLog:
    
    	PR tree-optimization/116463
    	* tree-vect-slp-patterns.cc (complex_mul_pattern::matches,
    	complex_fms_pattern::matches): Try swapping operands on multiply.