Name

    AMD_shader_ballot

Name Strings

    GL_AMD_shader_ballot

Contact

    Qun Lin, AMD (quentin.lin 'at' amd.com)

Contributors

    Qun Lin, AMD
    Graham Sellers, AMD
    Daniel Rakos, AMD
    Rex Xu, AMD
    Dominik Witczak, AMD

Status

    Shipping

Version

    Last Modified Date:         10/19/2016
    Author Revision:            4

Number

    ???

Dependencies

    This extension is written against the OpenGL Shading Language
    Specification, Version 4.50.

    This extension requires ARB_shader_group_vote and ARB_shader_ballot.

    This extension interacts with ARB_gpu_shader_int64.

    This extension interacts with AMD_gpu_shader_half_float.

Overview

    The extensions ARB_shader_group_vote and ARB_shader_ballot introduced the
    concept of sub-groups and a set of operations that allow data exchange
    across shader invocations within a sub-group.

    This extension further extends the capabilities of these extensions with
    additional sub-group operations.

IP Status

    None.

New Procedures and Functions

    None.

New Tokens

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.50

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_AMD_shader_ballot : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_AMD_shader_ballot 1

Additions to Chapter 8 of the OpenGL Shading Language (GLSL) Specification,
Version 4.50 (Built-in Functions)

    Add Section 8.18, Shader Invocation Group Functions

    The <min>, <max>, and <add> group invocation functions combine the values
    of <v> across all active shader invocations in the sub-group using one of
    three group operations, described in the following table:

    Group Operation   Description
    ---------------   ---------------------------------------------------------
    Reduce            Combines the values of <v> from all active invocations in
                      the sub-group into a single result.

    InclusiveScan     A binary operation with an identity <I> and <n> (where
                      <n> is the size of the sub-group) elements { a[0], a[1],
                      .., a[n-1] } resulting in { a[0], (a[0] op a[1]), ..,
                      (a[0] op a[1] op .. op a[n-1]) }. <op> can be any of
                      <min>, <max>, <add>.

    ExclusiveScan     A binary operation with an identity <I> and <n> (where
                      <n> is the size of the sub-group) elements { a[0], a[1],
                      .., a[n-1] } resulting in { I, a[0], (a[0] op a[1]), ..,
                      (a[0] op a[1] op .. op a[n-2]) }. <op> can be any of
                      <min>, <max>, <add>.
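
    The three group operations can be illustrated with a CPU-side reference
    model. The following C sketch is not part of this extension. It assumes
    that every invocation in the sub-group is active, uses <add> as <op>, and
    models the sub-group as a plain array indexed by invocation. All function
    names are hypothetical.

```c
#include <assert.h>

/* Reduce: every invocation receives the combination of all n values. */
static void reduce_add(const int *a, int *out, int n)
{
    assert(n > 0); /* a sub-group has at least one invocation */
    int total = 0;
    for (int i = 0; i < n; i++)
        total += a[i];
    for (int i = 0; i < n; i++)
        out[i] = total;
}

/* InclusiveScan: out[i] = a[0] op a[1] op .. op a[i]. */
static void inclusive_scan_add(const int *a, int *out, int n)
{
    assert(n > 0);
    int acc = 0;
    for (int i = 0; i < n; i++) {
        acc += a[i];
        out[i] = acc;
    }
}

/* ExclusiveScan: out[0] = I, out[i] = a[0] op a[1] op .. op a[i-1]. */
static void exclusive_scan_add(const int *a, int *out, int n)
{
    assert(n > 0);
    int acc = 0; /* identity <I> for <add> */
    for (int i = 0; i < n; i++) {
        out[i] = acc;
        acc += a[i];
    }
}
```

    For the input { 3, 1, 4, 1 }, this model yields 9 for every invocation
    under Reduce, { 3, 4, 8, 9 } under InclusiveScan, and { 0, 3, 4, 8 } under
    ExclusiveScan.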

    The identity <I> used by the group operations <InclusiveScan> and
    <ExclusiveScan> is given by the following table:

    Function   Data Type                             Identity
    --------   -----------------------------------   ----------
    Min        32-bit signed integer                 INT_MAX
               64-bit signed integer                 INT64_MAX
               32-bit unsigned integer               UINT_MAX
               64-bit unsigned integer               UINT64_MAX
               16-bit/32-bit/64-bit floating-point   +INF

    Max        32-bit signed integer                 INT_MIN
               64-bit signed integer                 INT64_MIN
               32-bit/64-bit unsigned integer        0
               16-bit/32-bit/64-bit floating-point   -INF

    Add        32-bit/64-bit signed integer          0
               32-bit/64-bit unsigned integer        0
               16-bit/32-bit/64-bit floating-point   0
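
    The identity matters most for ExclusiveScan, where it is the value
    returned to the first invocation. The following C sketch, which is not
    part of this extension, models an exclusive <min> scan over 32-bit signed
    integers seeded with INT_MAX as listed in the table. It assumes that all
    invocations are active; the function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

static int min_int(int x, int y)
{
    return x < y ? x : y;
}

/* Exclusive <min> scan: out[0] is the identity, out[i] is the minimum of
 * a[0] .. a[i-1]. Seeding with INT_MAX works because min(INT_MAX, x) == x
 * for every 32-bit signed x. */
static void exclusive_scan_min(const int *a, int *out, int n)
{
    assert(n > 0);
    int acc = INT_MAX; /* identity <I> for Min on 32-bit signed integers */
    for (int i = 0; i < n; i++) {
        out[i] = acc;
        acc = min_int(acc, a[i]);
    }
}
```

    For the input { 5, 2, 7, 1 }, this model yields { INT_MAX, 5, 2, 2 }.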

    +------------------------------------------------------+-----------------------------------------------------------+
    | Syntax                                               | Description                                               |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  minInvocationsAMD(genType  v)               | Returns the minimum value of <v> across all active shader |
    | genIType minInvocationsAMD(genIType v)               | invocations in the sub-group with <Reduce> group          |
    | genUType minInvocationsAMD(genUType v)               | operation. These functions must be used in uniform        |
    | genDType minInvocationsAMD(genDType v)               | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  minInvocationsNonUniformAMD(genType  v)     | Returns the minimum value of <v> across all active shader |
    | genIType minInvocationsNonUniformAMD(genIType v)     | invocations in the sub-group with <Reduce> group          |
    | genUType minInvocationsNonUniformAMD(genUType v)     | operation. These functions can be used in non-uniform     |
    | genDType minInvocationsNonUniformAMD(genDType v)     | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  minInvocationsInclusiveScanAMD(genType  v)  | Returns the minimum value of <v> across all active shader |
    | genIType minInvocationsInclusiveScanAMD(genIType v)  | invocations in the sub-group with <InclusiveScan> group   |
    | genUType minInvocationsInclusiveScanAMD(genUType v)  | operation. These functions must be used in uniform        |
    | genDType minInvocationsInclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    |                                                      |                                                           |
    |                                                      |                                                           |
    |                                                      |                                                           |
    |                                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  minInvocationsInclusiveScanNonUniformAMD(   | Returns the minimum value of <v> across all active shader |
    |          genType  v)                                 | invocations in the sub-group with <InclusiveScan> group   |
    | genIType minInvocationsInclusiveScanNonUniformAMD(   | operation. These functions can be used in non-uniform     |
    |          genIType v)                                 | control flow. These functions operate component-wise.     |
    | genUType minInvocationsInclusiveScanNonUniformAMD(   |                                                           |
    |          genUType v)                                 |                                                           |
    | genDType minInvocationsInclusiveScanNonUniformAMD(   |                                                           |
    |          genDType v)                                 |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  minInvocationsExclusiveScanAMD(genType  v)  | Returns the minimum value of <v> across all active shader |
    | genIType minInvocationsExclusiveScanAMD(genIType v)  | invocations in the sub-group with <ExclusiveScan> group   |
    | genUType minInvocationsExclusiveScanAMD(genUType v)  | operation. These functions must be used in uniform        |
    | genDType minInvocationsExclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    |                                                      |                                                           |
    |                                                      |                                                           |
    |                                                      |                                                           |
    |                                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  minInvocationsExclusiveScanNonUniformAMD(   | Returns the minimum value of <v> across all active shader |
    |          genType  v)                                 | invocations in the sub-group with <ExclusiveScan> group   |
    | genIType minInvocationsExclusiveScanNonUniformAMD(   | operation. These functions can be used in non-uniform     |
    |          genIType v)                                 | control flow. These functions operate component-wise.     |
    | genUType minInvocationsExclusiveScanNonUniformAMD(   |                                                           |
    |          genUType v)                                 |                                                           |
    | genDType minInvocationsExclusiveScanNonUniformAMD(   |                                                           |
    |          genDType v)                                 |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  maxInvocationsAMD(genType  v)               | Returns the maximum value of <v> across all active shader |
    | genIType maxInvocationsAMD(genIType v)               | invocations in the sub-group with <Reduce> group          |
    | genUType maxInvocationsAMD(genUType v)               | operation. These functions must be used in uniform        |
    | genDType maxInvocationsAMD(genDType v)               | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  maxInvocationsNonUniformAMD(genType  v)     | Returns the maximum value of <v> across all active shader |
    | genIType maxInvocationsNonUniformAMD(genIType v)     | invocations in the sub-group with <Reduce> group          |
    | genUType maxInvocationsNonUniformAMD(genUType v)     | operation. These functions can be used in non-uniform     |
    | genDType maxInvocationsNonUniformAMD(genDType v)     | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  maxInvocationsInclusiveScanAMD(genType  v)  | Returns the maximum value of <v> across all active shader |
    | genIType maxInvocationsInclusiveScanAMD(genIType v)  | invocations in the sub-group with <InclusiveScan> group   |
    | genUType maxInvocationsInclusiveScanAMD(genUType v)  | operation. These functions must be used in uniform        |
    | genDType maxInvocationsInclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    |                                                      |                                                           |
    |                                                      |                                                           |
    |                                                      |                                                           |
    |                                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  maxInvocationsInclusiveScanNonUniformAMD(   | Returns the maximum value of <v> across all active shader |
    |          genType  v)                                 | invocations in the sub-group with <InclusiveScan> group   |
    | genIType maxInvocationsInclusiveScanNonUniformAMD(   | operation. These functions can be used in non-uniform     |
    |          genIType v)                                 | control flow. These functions operate component-wise.     |
    | genUType maxInvocationsInclusiveScanNonUniformAMD(   |                                                           |
    |          genUType v)                                 |                                                           |
    | genDType maxInvocationsInclusiveScanNonUniformAMD(   |                                                           |
    |          genDType v)                                 |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  maxInvocationsExclusiveScanAMD(genType  v)  | Returns the maximum value of <v> across all active shader |
    | genIType maxInvocationsExclusiveScanAMD(genIType v)  | invocations in the sub-group with <ExclusiveScan> group   |
    | genUType maxInvocationsExclusiveScanAMD(genUType v)  | operation. These functions must be used in uniform        |
    | genDType maxInvocationsExclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    |                                                      |                                                           |
    |                                                      |                                                           |
    |                                                      |                                                           |
    |                                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  maxInvocationsExclusiveScanNonUniformAMD(   | Returns the maximum value of <v> across all active shader |
    |          genType  v)                                 | invocations in the sub-group with <ExclusiveScan> group   |
    | genIType maxInvocationsExclusiveScanNonUniformAMD(   | operation. These functions can be used in non-uniform     |
    |          genIType v)                                 | control flow. These functions operate component-wise.     |
    | genUType maxInvocationsExclusiveScanNonUniformAMD(   |                                                           |
    |          genUType v)                                 |                                                           |
    | genDType maxInvocationsExclusiveScanNonUniformAMD(   |                                                           |
    |          genDType v)                                 |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  addInvocationsAMD(genType  v)               | Returns the sum of the value of <v> across all active     |
    | genIType addInvocationsAMD(genIType v)               | shader invocations in the sub-group with <Reduce> group   |
    | genUType addInvocationsAMD(genUType v)               | operation. These functions must be used in uniform        |
    | genDType addInvocationsAMD(genDType v)               | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  addInvocationsNonUniformAMD(genType  v)     | Returns the sum of the value of <v> across all active     |
    | genIType addInvocationsNonUniformAMD(genIType v)     | shader invocations in the sub-group with <Reduce> group   |
    | genUType addInvocationsNonUniformAMD(genUType v)     | operation. These functions can be used in non-uniform     |
    | genDType addInvocationsNonUniformAMD(genDType v)     | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  addInvocationsInclusiveScanAMD(genType  v)  | Returns the sum of the value of <v> across all active     |
    | genIType addInvocationsInclusiveScanAMD(genIType v)  | shader invocations in the sub-group with <InclusiveScan>  |
    | genUType addInvocationsInclusiveScanAMD(genUType v)  | group operation. These functions must be used in uniform  |
    | genDType addInvocationsInclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    |                                                      |                                                           |
    |                                                      |                                                           |
    |                                                      |                                                           |
    |                                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  addInvocationsInclusiveScanNonUniformAMD(   | Returns the sum of the value of <v> across all active     |
    |          genType  v)                                 | shader invocations in the sub-group with <InclusiveScan>  |
    | genIType addInvocationsInclusiveScanNonUniformAMD(   | group operation. These functions can be used in           |
    |          genIType v)                                 | non-uniform control flow. These functions operate         |
    | genUType addInvocationsInclusiveScanNonUniformAMD(   | component-wise.                                           |
    |          genUType v)                                 |                                                           |
    | genDType addInvocationsInclusiveScanNonUniformAMD(   |                                                           |
    |          genDType v)                                 |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  addInvocationsExclusiveScanAMD(genType  v)  | Returns the sum of the value of <v> across all active     |
    | genIType addInvocationsExclusiveScanAMD(genIType v)  | shader invocations in the sub-group with <ExclusiveScan>  |
    | genUType addInvocationsExclusiveScanAMD(genUType v)  | group operation. These functions must be used in uniform  |
    | genDType addInvocationsExclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    |                                                      |                                                           |
    |                                                      |                                                           |
    |                                                      |                                                           |
    |                                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  addInvocationsExclusiveScanNonUniformAMD(   | Returns the sum of the value of <v> across all active     |
    |          genType  v)                                 | shader invocations in the sub-group with <ExclusiveScan>  |
    | genIType addInvocationsExclusiveScanNonUniformAMD(   | group operation. These functions can be used in           |
    |          genIType v)                                 | non-uniform control flow. These functions operate         |
    | genUType addInvocationsExclusiveScanNonUniformAMD(   | component-wise.                                           |
    |          genUType v)                                 |                                                           |
    | genDType addInvocationsExclusiveScanNonUniformAMD(   |                                                           |
    |          genDType v)                                 |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  swizzleInvocationsAMD(                      | Swizzles data within a group of 4 consecutive invocations |
    |          genType data, uvec4 offset)                 | of the sub-group based on <offset> as described below:    |
    | genIType swizzleInvocationsAMD(                      |                                                           |
    |          genIType data, uvec4 offset)                | for (i = 0; i < gl_SubGroupSizeARB; i+=4) {               |
    | genUType swizzleInvocationsAMD(                      |     dataOut[i+0] = isActive[i+offset.x] ?                 |
    |          genUType data, uvec4 offset)                |                    dataIn[i+offset.x] : 0;                |
    |                                                      |     dataOut[i+1] = isActive[i+offset.y] ?                 |
    |                                                      |                    dataIn[i+offset.y] : 0;                |
    |                                                      |     dataOut[i+2] = isActive[i+offset.z] ?                 |
    |                                                      |                    dataIn[i+offset.z] : 0;                |
    |                                                      |     dataOut[i+3] = isActive[i+offset.w] ?                 |
    |                                                      |                    dataIn[i+offset.w] : 0;                |
    |                                                      | }                                                         |
    |                                                      |                                                           |
    |                                                      | Where:                                                    |
    |                                                      | - isActive[i] tells whether the invocation with the index |
    |                                                      |   <i> is currently active in the sub-group.               |
    |                                                      | - dataIn[i] is the value of <data> for invocation index   |
    |                                                      |   <i>.                                                    |
    |                                                      | - dataOut[i] is the return value of the function for      |
    |                                                      |   invocation index <i>.                                   |
    |                                                      |                                                           |
    |                                                      | Components of <offset> must be constant integer           |
    |                                                      | expressions with values in the range [0, 3].              |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  swizzleInvocationsMaskedAMD(                | Swizzles data within a group of 32 consecutive            |
    |          genType data, uvec3 mask)                   | invocations with a limited mask as described below:       |
    | genIType swizzleInvocationsMaskedAMD(                |                                                           |
    |          genIType data, uvec3 mask)                  | for (i = 0; i < gl_SubGroupSizeARB; i++) {                |
    | genUType swizzleInvocationsMaskedAMD(                |     j = (((i & 0x1f) & mask.x) | mask.y) ^ mask.z;        |
    |          genUType data, uvec3 mask)                  |     j |= (i & 0x20); // which group of 32                 |
    |                                                      |     dataOut[i] = isActive[j] ? dataIn[j] : 0;             |
    |                                                      | }                                                         |
    |                                                      |                                                           |
    |                                                      | Where:                                                    |
    |                                                      | - isActive[i] tells whether the invocation with the index |
    |                                                      |   <i> is currently active in the sub-group.               |
    |                                                      | - dataIn[i] is the value of <data> for invocation index   |
    |                                                      |   <i>.                                                    |
    |                                                      | - dataOut[i] is the return value of the function for      |
    |                                                      |   invocation index <i>.                                   |
    |                                                      |                                                           |
    |                                                      | Components of <mask> must be constant integer expressions |
    |                                                      | with values in the range [0, 31].                         |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType  writeInvocationAMD(                         | Returns <inputValue> for all active invocations in the    |
    |          genType  inputValue,                        | sub-group except for the invocation whose invocation      |
    |          genType  writeValue,                        | index within the sub-group is <invocationIndex> for which |
    |          uint     invocationIndex)                   | <writeValue> is returned as described below:              |
    | genIType writeInvocationAMD(                         |                                                           |
    |          genIType inputValue,                        | for (i = 0; i < gl_SubGroupSizeARB; i++) {                |
    |          genIType writeValue,                        |     out[i] = (i == invocationIndex) ?                     |
    |          uint     invocationIndex)                   |              writeValue:inputValue;                       |
    | genUType writeInvocationAMD(                         | }                                                         |
    |          genUType inputValue,                        |                                                           |
    |          genUType writeValue,                        | Where out[i] is the return value of the function for      |
    |          uint     invocationIndex)                   | invocation index <i>.                                     |
    |                                                      |                                                           |
    |                                                      | <writeValue> and <invocationIndex> must be dynamically    |
    |                                                      | uniform within the sub-group, otherwise the return value  |
    |                                                      | of the function is undefined.                             |
    +------------------------------------------------------+-----------------------------------------------------------+
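
    The pseudocode for the swizzle and write functions in the table above can
    be turned into a CPU-side reference model. The following C sketch is not
    part of this extension. It assumes a sub-group size of 64 (the constant
    SUBGROUP_SIZE stands in for gl_SubGroupSizeARB), models per-invocation
    values as arrays of int, and uses hypothetical function names.

```c
#include <assert.h>
#include <stdbool.h>

#define SUBGROUP_SIZE 64 /* assumed stand-in for gl_SubGroupSizeARB */

/* Model of swizzleInvocationsAMD: each group of 4 consecutive invocations
 * is permuted by <offset>; reads from inactive invocations return 0. */
static void swizzle_quad(const int *data_in, const bool *is_active,
                         const unsigned offset[4], int *data_out)
{
    for (int c = 0; c < 4; c++)
        assert(offset[c] <= 3u); /* components must lie in [0, 3] */
    for (int i = 0; i < SUBGROUP_SIZE; i += 4) {
        for (int c = 0; c < 4; c++) {
            int src = i + (int)offset[c];
            data_out[i + c] = is_active[src] ? data_in[src] : 0;
        }
    }
}

/* Model of swizzleInvocationsMaskedAMD: the low 5 bits of the invocation
 * index are ANDed with mask[0], ORed with mask[1], and XORed with mask[2];
 * bit 5 (which group of 32) is preserved. */
static void swizzle_masked(const int *data_in, const bool *is_active,
                           const unsigned mask[3], int *data_out)
{
    for (int c = 0; c < 3; c++)
        assert(mask[c] <= 31u); /* components must lie in [0, 31] */
    for (unsigned i = 0; i < SUBGROUP_SIZE; i++) {
        unsigned j = (((i & 0x1fu) & mask[0]) | mask[1]) ^ mask[2];
        j |= (i & 0x20u);
        data_out[i] = is_active[j] ? data_in[j] : 0;
    }
}

/* Model of writeInvocationAMD: <write_value> and <invocation_index> are
 * single values because they must be dynamically uniform. */
static void write_invocation(const int *input_value, int write_value,
                             unsigned invocation_index, int *out)
{
    for (unsigned i = 0; i < SUBGROUP_SIZE; i++)
        out[i] = (i == invocation_index) ? write_value : input_value[i];
}
```

    In this model, an <offset> of (3, 2, 1, 0) reverses each group of 4
    invocations, a <mask> of (0x1f, 0, 1) exchanges each pair of adjacent
    invocations, and a <mask> of (0, 0, 0) broadcasts the first invocation of
    each group of 32.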

Dependencies on ARB_gpu_shader_int64

    If the shader enables ARB_gpu_shader_int64, this extension adds additional
    shader invocation group functions.

    Add Section 8.18, Shader Invocation Group Functions

    +------------------------------------------------------+-----------------------------------------------------------+
    | Syntax                                               | Description                                               |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsAMD(genI64Type v)           | Returns the minimum value of <v> across all active shader |
    | genU64Type minInvocationsAMD(genU64Type v)           | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsNonUniformAMD(genI64Type v) | Returns the minimum value of <v> across all active shader |
    | genU64Type minInvocationsNonUniformAMD(genU64Type v) | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions can be used in non-uniform     |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsInclusiveScanAMD(           | Returns the minimum value of <v> across all active shader |
    |            genI64Type v)                             | invocations in the sub-group with <InclusiveScan> group   |
    | genU64Type minInvocationsInclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |            genU64Type v)                             | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsInclusiveScanNonUniformAMD( | Returns the minimum value of <v> across all active shader |
    |            genI64Type v)                             | invocations in the sub-group with <InclusiveScan> group   |
    | genU64Type minInvocationsInclusiveScanNonUniformAMD( | operation. These functions can be used in non-uniform     |
    |            genU64Type v)                             | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsExclusiveScanAMD(           | Returns the minimum value of <v> across all active shader |
    |            genI64Type v)                             | invocations in the sub-group with <ExclusiveScan> group   |
    | genU64Type minInvocationsExclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |            genU64Type v)                             | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsExclusiveScanNonUniformAMD( | Returns the minimum value of <v> across all active shader |
    |            genI64Type v)                             | invocations in the sub-group with <ExclusiveScan> group   |
    | genU64Type minInvocationsExclusiveScanNonUniformAMD( | operation. These functions could be used in non-uniform   |
    |            genU64Type v)                             | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsAMD(genI64Type v)           | Returns the maximum value of <v> across all active shader |
    | genU64Type maxInvocationsAMD(genU64Type v)           | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsNonUniformAMD(genI64Type v) | Returns the maximum value of <v> across all active shader |
    | genU64Type maxInvocationsNonUniformAMD(genU64Type v) | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsInclusiveScanAMD(           | Returns the maximum value of <v> across all active shader |
    |            genI64Type v)                             | invocations in the sub-group with <InclusiveScan> group   |
    | genU64Type maxInvocationsInclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |            genU64Type v)                             | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsInclusiveScanNonUniformAMD( | Returns the maximum value of <v> across all active shader |
    |            genI64Type v)                             | invocations in the sub-group with <InclusiveScan> group   |
    | genU64Type maxInvocationsInclusiveScanNonUniformAMD( | operation. These functions could be used in non-uniform   |
    |            genU64Type v)                             | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsExclusiveScanAMD(           | Returns the maximum value of <v> across all active shader |
    |            genI64Type v)                             | invocations in the sub-group with <ExclusiveScan> group   |
    | genU64Type maxInvocationsExclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |            genU64Type v)                             | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsExclusiveScanNonUniformAMD( | Returns the maximum value of <v> across all active shader |
    |            genI64Type v)                             | invocations in the sub-group with <ExclusiveScan> group   |
    | genU64Type maxInvocationsExclusiveScanNonUniformAMD( | operation. These functions could be used in non-uniform   |
    |            genU64Type v)                             | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsAMD(genI64Type v)           | Returns the sum of the value of <v> across all active     |
    | genU64Type addInvocationsAMD(genU64Type v)           | shader invocations in the sub-group with <Reduce> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsNonUniformAMD(genI64Type v) | Returns the sum of the value of <v> across all active     |
    | genU64Type addInvocationsNonUniformAMD(genU64Type v) | shader invocations in the sub-group with <Reduce> group   |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsInclusiveScanAMD(           | Returns the sum of the value of <v> across all active     |
    |            genI64Type v)                             | shader invocations in the sub-group with <InclusiveScan>  |
    | genU64Type addInvocationsInclusiveScanAMD(           | group operation. These functions must be used in uniform  |
    |            genU64Type v)                             | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsInclusiveScanNonUniformAMD( | Returns the sum of the value of <v> across all active     |
    |            genI64Type v)                             | shader invocations in the sub-group with <InclusiveScan>  |
    | genU64Type addInvocationsInclusiveScanNonUniformAMD( | group operation. These functions could be used in         |
    |            genU64Type v)                             | non-uniform control flow. These functions operate         |
    |                                                      | component-wise.                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsExclusiveScanAMD(           | Returns the sum of the value of <v> across all active     |
    |            genI64Type v)                             | shader invocations in the sub-group with <ExclusiveScan>  |
    | genU64Type addInvocationsExclusiveScanAMD(           | group operation. These functions must be used in uniform  |
    |            genU64Type v)                             | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsExclusiveScanNonUniformAMD( | Returns the sum of the value of <v> across all active     |
    |            genI64Type v)                             | shader invocations in the sub-group with <ExclusiveScan>  |
    | genU64Type addInvocationsExclusiveScanNonUniformAMD( | group operation. These functions could be used in         |
    |            genU64Type v)                             | non-uniform control flow. These functions operate         |
    |                                                      | component-wise.                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | uint mbcntAMD(uint64_t mask)                         | Returns the number of bits set in the bitwise AND of      |
    |                                                      | gl_SubGroupLtMaskARB and <mask>:                          |
    |                                                      |                                                           |
    |                                                      |   bitCount(gl_SubGroupLtMaskARB & mask).                  |
    +------------------------------------------------------+-----------------------------------------------------------+
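
    As an informal illustration (not part of this specification), the
    mbcntAMD computation can be modelled on the CPU in C. The helper names
    model_lt_mask, model_bit_count and model_mbcnt are hypothetical, and the
    sketch assumes a sub-group of at most 64 invocations in which
    gl_SubGroupLtMaskARB has one bit set for every lower-numbered invocation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical CPU-side model of gl_SubGroupLtMaskARB: one bit set for
 * every invocation index lower than `invocation`.
 * ASSUMPTION: the sub-group holds at most 64 invocations. */
static uint64_t model_lt_mask(unsigned invocation)
{
    assert(invocation < 64u);
    return (UINT64_C(1) << invocation) - 1u;
}

/* Portable population count (number of set bits in x). */
static unsigned model_bit_count(uint64_t x)
{
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;   /* clear the lowest set bit */
        ++n;
    }
    return n;
}

/* Model of mbcntAMD(mask) as evaluated by a single invocation:
 * bitCount(gl_SubGroupLtMaskARB & mask). */
static unsigned model_mbcnt(unsigned invocation, uint64_t mask)
{
    return model_bit_count(model_lt_mask(invocation) & mask);
}
```

    In this model, with <mask> equal to 0x16 (bits 1, 2 and 4 set),
    model_mbcnt returns 0, 1 and 2 for invocations 1, 2 and 4 respectively.
    When <mask> is a ballot of the invocations that passed some test, the
    result is therefore each passing invocation's rank among them, which is
    one plausible way to derive a densely packed output index.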

Dependencies on AMD_gpu_shader_half_float

    If the shader enables AMD_gpu_shader_half_float, this extension adds the
    following shader invocation group functions for 16-bit floating-point
    types.

    Add Section 8.18, Shader Invocation Group Functions

    +------------------------------------------------------+-----------------------------------------------------------+
    | Syntax                                               | Description                                               |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsAMD(genF16Type v)           | Returns the minimum value of <v> across all active shader |
    |                                                      | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsNonUniformAMD(genF16Type v) | Returns the minimum value of <v> across all active shader |
    |                                                      | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsInclusiveScanAMD(           | Returns the minimum value of <v> across all active shader |
    |            genF16Type v)                             | invocations in the sub-group with <InclusiveScan> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsInclusiveScanNonUniformAMD( | Returns the minimum value of <v> across all active shader |
    |            genF16Type v)                             | invocations in the sub-group with <InclusiveScan> group   |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsExclusiveScanAMD(           | Returns the minimum value of <v> across all active shader |
    |            genF16Type v)                             | invocations in the sub-group with <ExclusiveScan> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsExclusiveScanNonUniformAMD( | Returns the minimum value of <v> across all active shader |
    |            genF16Type v)                             | invocations in the sub-group with <ExclusiveScan> group   |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsAMD(genF16Type v)           | Returns the maximum value of <v> across all active shader |
    |                                                      | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsNonUniformAMD(genF16Type v) | Returns the maximum value of <v> across all active shader |
    |                                                      | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsInclusiveScanAMD(           | Returns the maximum value of <v> across all active shader |
    |            genF16Type v)                             | invocations in the sub-group with <InclusiveScan> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsInclusiveScanNonUniformAMD( | Returns the maximum value of <v> across all active shader |
    |            genF16Type v)                             | invocations in the sub-group with <InclusiveScan> group   |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsExclusiveScanAMD(           | Returns the maximum value of <v> across all active shader |
    |            genF16Type v)                             | invocations in the sub-group with <ExclusiveScan> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsExclusiveScanNonUniformAMD( | Returns the maximum value of <v> across all active shader |
    |            genF16Type v)                             | invocations in the sub-group with <ExclusiveScan> group   |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsAMD(genF16Type v)           | Returns the sum of the value of <v> across all active     |
    |                                                      | shader invocations in the sub-group with <Reduce> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsNonUniformAMD(genF16Type v) | Returns the sum of the value of <v> across all active     |
    |                                                      | shader invocations in the sub-group with <Reduce> group   |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsInclusiveScanAMD(           | Returns the sum of the value of <v> across all active     |
    |            genF16Type v)                             | shader invocations in the sub-group with <InclusiveScan>  |
    |                                                      | group operation. These functions must be used in uniform  |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsInclusiveScanNonUniformAMD( | Returns the sum of the value of <v> across all active     |
    |            genF16Type v)                             | shader invocations in the sub-group with <InclusiveScan>  |
    |                                                      | group operation. These functions could be used in         |
    |                                                      | non-uniform control flow. These functions operate         |
    |                                                      | component-wise.                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsExclusiveScanAMD(           | Returns the sum of the value of <v> across all active     |
    |            genF16Type v)                             | shader invocations in the sub-group with <ExclusiveScan>  |
    |                                                      | group operation. These functions must be used in uniform  |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsExclusiveScanNonUniformAMD( | Returns the sum of the value of <v> across all active     |
    |            genF16Type v)                             | shader invocations in the sub-group with <ExclusiveScan>  |
    |                                                      | group operation. These functions could be used in         |
    |                                                      | non-uniform control flow. These functions operate         |
    |                                                      | component-wise.                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
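
    The difference between the <Reduce>, <InclusiveScan> and <ExclusiveScan>
    group operations named in the tables above can be sketched with a small
    CPU-side model in C (informal, not part of this specification). Several
    things here are assumptions made for illustration: the function names,
    the sub-group size of 8, the use of float as a stand-in for float16_t
    (so half-precision rounding is not reproduced), summation in increasing
    invocation order, inactive invocations being skipped, and 0 as the
    identity handed to the first active invocation of the exclusive scan.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_SUBGROUP_SIZE 8u  /* hypothetical size, for illustration only */

/* Reduce: every active invocation receives the sum over all active ones. */
static void model_add_reduce(const float *v, uint32_t active, float *out)
{
    float sum = 0.0f;
    unsigned i;
    assert(v && out);
    for (i = 0; i < MODEL_SUBGROUP_SIZE; ++i)
        if (active & (1u << i))
            sum += v[i];
    for (i = 0; i < MODEL_SUBGROUP_SIZE; ++i)
        if (active & (1u << i))
            out[i] = sum;
}

/* InclusiveScan: each active invocation receives the sum over the active
 * invocations up to and including itself. */
static void model_add_inclusive_scan(const float *v, uint32_t active,
                                     float *out)
{
    float sum = 0.0f;
    unsigned i;
    assert(v && out);
    for (i = 0; i < MODEL_SUBGROUP_SIZE; ++i) {
        if (active & (1u << i)) {
            sum += v[i];
            out[i] = sum;
        }
    }
}

/* ExclusiveScan: each active invocation receives the sum over the active
 * invocations strictly before it; the first one gets the ASSUMED
 * identity 0. */
static void model_add_exclusive_scan(const float *v, uint32_t active,
                                     float *out)
{
    float sum = 0.0f;
    unsigned i;
    assert(v && out);
    for (i = 0; i < MODEL_SUBGROUP_SIZE; ++i) {
        if (active & (1u << i)) {
            out[i] = sum;   /* written before adding this invocation's value */
            sum += v[i];
        }
    }
}
```

    With values 1..8 in invocations 0..7 and only invocations 0, 2, 3 and 5
    active, this model yields 14 everywhere for the reduction, 1, 4, 8, 14
    for the inclusive scan and 0, 1, 4, 8 for the exclusive scan. Entries
    for inactive invocations are left untouched.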

Additions to the AGL/GLX/WGL Specifications

    None.

GLX Protocol

    None.

Errors

    None.

Issues


Revision History

    Rev.    Date      Author    Changes
    ----  ----------  --------  --------------------------------------------------
    4     10/19/2016  rexu      Add interactions with ARB_gpu_shader_int64 and
                                AMD_gpu_shader_half_float. New group invocation
                                functions are added to support 64-bit integer
                                types, 16-bit/64-bit floating-point types, and
                                group operations. Clarify that <mask> in
                                swizzleInvocationsMaskedAMD() should be a
                                constant integer expression with a value in the
                                range [0, 31].

    3     08/16/2016  rexu      Clarify that minInvocationsAMD, maxInvocationsAMD,
                                addInvocationsAMD, along with their non-uniform
                                versions, operate component-wise rather than on
                                the vector as a whole.

    2     08/11/2016  rexu      Add non-uniform versions of minInvocationsAMD,
                                maxInvocationsAMD, and addInvocationsAMD.
                                Support those operations in non-uniform control
                                flow.

    1     04/21/2016  qlin      Internal revisions.
