Argument {COMM_GROUP_CONFIG}
ASTRA-sim 2.0 supports communicator groups.
You can pass a communicator group configuration file by specifying the file path using --comm-group-configuration.
If no configuration file is passed, ASTRA-sim creates a single group that contains all GPUs.
A valid communicator group configuration file is a JSON file with the following format:
{
  "<communicator_group_id>" : [gpu_ids]
}
For example, you can create two communicator groups with the following configuration file. The first communicator group, with ID 0, includes GPU IDs from 0 to 3. The second communicator group, with ID 1, includes GPU IDs from 4 to 7.
{
  "0": [0, 1, 2, 3],
  "1": [4, 5, 6, 7]
}
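A file of this shape can also be generated and sanity-checked with a short script. The following is a minimal Python sketch and not part of ASTRA-sim: the helper names `build_comm_groups` and `validate_comm_groups` are hypothetical, and the checks only cover the shape described above (string group IDs mapped to lists of integer GPU IDs).

```python
import json


def build_comm_groups(num_gpus, group_size):
    """Split GPU IDs 0..num_gpus-1 into consecutive groups of group_size (hypothetical helper)."""
    if num_gpus <= 0 or group_size <= 0 or num_gpus % group_size != 0:
        raise ValueError("num_gpus must be a positive multiple of group_size")
    return {
        str(group_id): list(range(start, start + group_size))
        for group_id, start in enumerate(range(0, num_gpus, group_size))
    }


def validate_comm_groups(config):
    """Check that every key is a string ID and every value is a list of integer GPU IDs."""
    for group_id, gpu_ids in config.items():
        if not isinstance(group_id, str):
            raise TypeError(f"group ID {group_id!r} must be a string")
        if not isinstance(gpu_ids, list):
            raise TypeError(f"group {group_id!r} must map to a list of GPU IDs")
        for gpu_id in gpu_ids:
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(gpu_id, bool) or not isinstance(gpu_id, int):
                raise TypeError(f"GPU ID {gpu_id!r} in group {group_id!r} must be an integer")
    return config


# Reproduce the two-group example above: GPUs 0-3 in group "0", GPUs 4-7 in group "1".
config = validate_comm_groups(build_comm_groups(num_gpus=8, group_size=4))
config_text = json.dumps(config, indent=2)
print(config_text)
```

The printed text can be saved to a file and its path passed with --comm-group-configuration.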
When simulating the workload, ASTRA-sim reads the communicator group ID from each communication ET node, so different operators of the same rank may belong to different communicator groups. The ID is stored in the node's pg_name attribute.
The following excerpt from a Chakra ET shows a communication node whose pg_name is "17".
{
  "id": "4",
  "name": "in_emb_y@0_X1COMM",
  "type": "COMM_COLL_NODE",
  "attr": [
    {
      "name": "comm_size",
      "int64Val": "26843545600"
    },
    {
      "name": "comm_type",
      "int64Val": "2"
    },
    {
      "name": "pg_name",
      "stringVal": "17"
    },
    {
      "name": "is_cpu_op",
      "int32Val": 0
    }
  ]
}
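To make the lookup concrete, the sketch below reads pg_name from a node shaped like the excerpt above and matches it against a communicator group configuration. It is an illustration in Python under stated assumptions, not ASTRA-sim's implementation: `get_attr` and `resolve_group` are hypothetical helpers, the group contents for ID "17" are made up for the example, and returning `None` for a missing attribute or unknown ID is a choice of this sketch rather than documented simulator behavior.

```python
import json

# A trimmed copy of the ET node shown above (only the attributes needed here).
NODE_JSON = """
{
  "id": "4",
  "name": "in_emb_y@0_X1COMM",
  "type": "COMM_COLL_NODE",
  "attr": [
    {"name": "comm_type", "int64Val": "2"},
    {"name": "pg_name", "stringVal": "17"},
    {"name": "is_cpu_op", "int32Val": 0}
  ]
}
"""


def get_attr(node, name):
    """Return the value of the named attribute, or None if the node lacks it."""
    for attr in node.get("attr", []):
        if attr.get("name") == name:
            # The value sits under a type-specific key such as "stringVal" or "int64Val".
            for key, value in attr.items():
                if key != "name":
                    return value
    return None


def resolve_group(node, comm_groups):
    """Map the node's pg_name to a list of GPU IDs, or None if there is no match."""
    pg_name = get_attr(node, "pg_name")
    if pg_name is None:
        return None
    return comm_groups.get(pg_name)


node = json.loads(NODE_JSON)
# Hypothetical configuration: the members of group "17" are invented for this example.
comm_groups = {"17": [0, 1, 2, 3]}

print(get_attr(node, "pg_name"))
print(resolve_group(node, comm_groups))
```

Note that pg_name is a string value, so it is compared against the string keys of the configuration file without any numeric conversion.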