ROCm/Tensile

TypeError: string indices must be integers in BenchmarkStructs.py:62

powderluv opened this issue · 6 comments

I run into a TypeError with Python 3.7/3.8/3.9. I am trying to build and run Tensile with:

CC=/home/foo/rocm/aomp/bin/clang CXX=/home/foo/rocm/aomp/bin/clang++ ../Tensile/bin/Tensile ../Tensile/Configs/rocblas_sgemm_asm_only.yaml ./

....
[100%] Built target tensile_client

################################################################################
# Converting Config to BenchmarkProcess Object
################################################################################

# Filling in Parameters With Defaults
# Convert Parameters to Steps
# Benchmark Common Parameters
# Fork Parameters
# Benchmark Fork Parameters
# Join Parameters
# Benchmark Join Parameters
# Benchmark Final
# NumBenchmarkSteps: 2

################################################################################
# Done Creating BenchmarkProcess Object
################################################################################
# Empty winners - use fast initialization of hardcodedParameters


################################################################################
# BenchmarkStep: Cijk_Ailk_Bljk_SB_00 - 00_BenchmarkFork 108.684s
# NumProblems: 1
# BenchmarkParameters:
#     BenchmarkFork = { 0 }
Traceback (most recent call last):
  File "../Tensile/bin/Tensile", line 36, in <module>
    Tensile.main()
  File "/home/foo/github/Tensile/Tensile/Tensile.py", line 285, in main
    Tensile(sys.argv[1:])
  File "/home/foo/github/Tensile/Tensile/Tensile.py", line 242, in Tensile
    executeStepsInConfig(config)
  File "/home/foo/github/Tensile/Tensile/Tensile.py", line 51, in executeStepsInConfig
    BenchmarkProblems.main( config["BenchmarkProblems"] )
  File "/home/foo/github/Tensile/Tensile/BenchmarkProblems.py", line 866, in main
    problemSizeGroupConfig, problemSizeGroupIdx)
  File "/home/foo/github/Tensile/Tensile/BenchmarkProblems.py", line 238, in benchmarkProblemType
    benchmarkPermutations = constructForkPermutations(benchmarkStep.benchmarkParameters)
  File "/home/foo/github/Tensile/Tensile/BenchmarkStructs.py", line 62, in constructForkPermutations
    values = param[name]
TypeError: string indices must be integers
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/home/foo/lokal/lib/python3.7/subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "/home/foo/lokal/lib/python3.7/subprocess.py", line 1465, in _execute_child
    executable = os.fsencode(executable)
  File "/home/foo/lokal/lib/python3.7/os.py", line 812, in fsencode
    filename = fspath(filename)  # Does type-checking of `filename`.
TypeError: expected str, bytes or os.PathLike object, not NoneType
(pyenv_3.7) 1 foo@5950x:~/github/Tensile/build$ CC=/home/foo/rocm/aomp/bin/clang CXX=/home/foo/rocm/aomp/bin/clang++ ../Tensile/bin/Tensile ../Tensile/Configs/rocblas_sgemm_asm_only.yaml ./

This seems to be caused by 5791b7f.

Reverting that change gets things moving further.

Hi,

I think there is something out of date with rocblas_sgemm_asm_only.yaml. Try something like rocblas_sgemm_asm_single_kernel.yaml

Hi @sdquiring that doesn't help. It fails exactly the same way. Reverting 5791b7f makes it work for this kernel too.

After that I am still blocked on HIP not being supported for gfx1030. What version of HIP do you use? The open source version doesn't seem to detect gfx1030. See: ROCm/HIP#2238

I'm getting the same "TypeError: string indices must be integers" error from the same line (BenchmarkStructs.py:62) when using the same configuration file. I'm using a Vega 64 (gfx900) with ROCm 4.1.2.
I've tried other configuration files (rocblas_sgemm_asm_single_kernel.yaml, rocblas_sgemm_hip_lite.yaml), and with those I'm seeing this instead:

Tensile::WARNING: ClientWriter Benchmark Process exited with code 2
Tensile::WARNING: BenchmarkProblems: Benchmark Process exited with code 2
# Get Results from CSV
Tensile::FATAL: Can't open "/home/alvin/Downloads/Tensile-rocm-4.1.0/build/1_BenchmarkProblems/Cijk_Ailk_Bljk_SB_00/Data/00_Final.csv" to get results

This happens after the kernels are generated and compiled, just before "Writing Custom CMAKE". The .csv file does not exist.

tensile-output.txt

Any clue as to why the results are unable to be written?

Edit:
I just did some digging, and it seems that run.sh for tensile_client exits immediately because rocm-smi returns a non-zero exit code (2) and the script uses set -ex. I don't see why the script should not proceed when rocm-smi cannot get the status of some of these features (mclk range, voltage range, voltage curve points, etc.).

rocm-smi.out.txt
rocm-smi.err.txt
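To illustrate the set -e behaviour described above, here is a minimal sketch in Python that drives a POSIX sh. It is not the real run.sh (its contents are not shown in this thread): `(exit 2)` is a stand-in for a command that fails with exit code 2, and `run_sh` is a hypothetical helper. Appending `|| true` is one common way to let a script continue past such a failure; whether that is appropriate for run.sh is an open question.

```python
import subprocess


def run_sh(script):
    """Run a script with sh and return (exit code, captured stdout)."""
    result = subprocess.run(["sh", "-c", script], capture_output=True, text=True)
    return result.returncode, result.stdout.strip()


# With set -e, the first failing command aborts the script with its exit code,
# so the echo is never reached.
strict = run_sh("set -e; (exit 2); echo reached")

# '|| true' swallows the failure, so the script carries on.
tolerant = run_sh("set -e; (exit 2) || true; echo reached")

print(strict)    # (2, '')
print(tolerant)  # (0, 'reached')
```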

Edit 2:
Getting back to the original problem: I was still running into the issue with rocblas_sgemm_asm_only.yaml. It looks like BenchmarkProblems.py treats benchmarkStep.benchmarkParameters as a dictionary, but constructForkPermutations() expects a list of dictionaries, so the call at BenchmarkProblems.py:235 fails. I wrapped the dict in a list and got through the benchmarking phase.
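A minimal sketch of how that mismatch can produce this exact TypeError. The function below is a hypothetical stand-in, not Tensile's actual constructForkPermutations(); only the line `values = param[name]` is taken from the traceback, and the nested-loop shape and the parameter name "ExampleParam" are assumptions for illustration.

```python
def construct_fork_permutations(fork_params):
    """Hypothetical stand-in: expects a list of dicts mapping name -> values."""
    sizes = []
    for param in fork_params:      # each item is assumed to be a dict
        for name in param:         # iterate over the dict's keys
            values = param[name]   # the line shown in the traceback
            sizes.append((name, len(values)))
    return sizes


params_as_dict = {"ExampleParam": [0, 1]}

# Passing the bare dict: iterating it yields the key "ExampleParam" (a str),
# iterating that str yields single characters, and indexing a str with a str
# raises TypeError.
try:
    construct_fork_permutations(params_as_dict)
    failed_with_type_error = False
except TypeError:
    failed_with_type_error = True

# Wrapping the dict in a list gives the function the shape it expects.
wrapped_result = construct_fork_permutations([params_as_dict])

print(failed_with_type_error)  # True
print(wrapped_result)          # [('ExampleParam', 2)]
```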

The root cause of this issue is the use of certain benchmarking steps for which we have 'unofficially' dropped support: InitialSolutionParameters, BenchmarkForkParameters, JoinParameters, and BenchmarkJoinParameters.

Removing the problematic parameters from the configs and updating the code to communicate this dropped support clearly are both on our list of tasks to complete.

#1419 officially removes support for these steps.