aws-neuron/aws-neuron-sdk

Failing to load a traced model

brunonishimoto opened this issue · 3 comments

Env

AWS Instance: inf2.8xlarge

neuronx-cc --version
NeuronX Compiler version 2.13.72.0+78a426937
Python version 3.8.10
HWM version 2.13.72.0+78a426937
NumPy version 1.24.4
Running on AMI ami-01bd86df7ca5abf26
Running in region use1-az6

Venv

pip list
Package                       Version
----------------------------- -------------------
absl-py                       2.1.0
aws-neuronx-runtime-discovery 2.9
boto3                         1.34.110
botocore                      1.34.110
cachetools                    5.3.3
certifi                       2024.2.2
charset-normalizer            3.3.2
cloud-tpu-client              0.10
docutils                      0.20.1
ec2-metadata                  2.10.0
google-api-core               1.34.1
google-api-python-client      1.8.0
google-auth                   2.29.0
google-auth-httplib2          0.2.0
googleapis-common-protos      1.63.0
httplib2                      0.22.0
idna                          3.7
islpy                         2023.1
jmespath                      1.0.1
libneuronxla                  0.5.971
lockfile                      0.12.2
networkx                      2.6.3
neuronx-cc                    2.13.72.0+78a426937
numpy                         1.24.4
nvidia-cublas-cu11            11.10.3.66
nvidia-cuda-nvrtc-cu11        11.7.99
nvidia-cuda-runtime-cu11      11.7.99
nvidia-cudnn-cu11             8.5.0.96
oauth2client                  4.1.3
pgzip                         0.3.5
pip                           20.0.2
pkg-resources                 0.0.0
protobuf                      3.19.6
psutil                        5.9.8
pyasn1                        0.6.0
pyasn1-modules                0.4.0
pyparsing                     3.1.2
python-daemon                 3.0.1
python-dateutil               2.9.0.post0
PyYAML                        6.0.1
requests                      2.32.2
requests-unixsocket           0.3.0
rsa                           4.9
s3transfer                    0.10.1
scipy                         1.10.1
setuptools                    70.0.0
six                           1.16.0
torch                         1.13.1
torch-neuronx                 1.13.1.1.14.0
torch-xla                     1.13.1+torchneurone
typing-extensions             4.11.0
uritemplate                   3.0.1
urllib3                       2.2.1
wheel                         0.43.0

model.py

import torch.nn as nn

EMBEDDING_SIZE = 1536
NUM_CLASSES = 3264

class MLP(nn.Module):
    def __init__(self, h1, n_layers, dropout_prob, has_batch_norm):
        super(MLP, self).__init__()
        self.fc1 = nn.Linear(EMBEDDING_SIZE, h1)
        self.has_batch_norm = has_batch_norm

        if self.has_batch_norm:
            self.batchnorm1 = nn.BatchNorm1d(h1)

        self.layers = nn.ModuleList()
        self.batchnorms = nn.ModuleList()

        if n_layers > 1:
            for _ in range(n_layers - 1):
                self.layers.append(nn.Linear(h1, h1))
                if self.has_batch_norm:
                    self.batchnorms.append(nn.BatchNorm1d(h1))
        self.final_layer = nn.Linear(h1, NUM_CLASSES)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(p=dropout_prob)
        self.softmax = nn.Softmax()

    def forward(self, x):
        x = self.fc1(x)
        if self.has_batch_norm:
            x = self.batchnorm1(x)
        x = self.relu(x)
        x = self.dropout(x)

        for i, layer in enumerate(self.layers):
            x = layer(x)
            if self.has_batch_norm:
                x = self.batchnorms[i](x)
            x = self.relu(x)
            x = self.dropout(x)

        x = self.final_layer(x)

        return x

Convert file

import torch
import torch_neuronx
import json
import os
from model import MLP 

device = torch.device('cpu')

# Read the hyperparameters; the context manager closes the file afterwards
with open('config.json', 'r') as f:
    parameters = json.load(f)

model = MLP(parameters["h1"], parameters["n_layers"], parameters["dropout_prob"], parameters["has_batch_norm"])

print(device)
model.load_state_dict(torch.load('eid_classifier.pt', map_location=device))
model.to(device)
model.eval()

# models_context_ids = [4429164, 4313200, 9834555, 5292199]
emb0 = [...]
emb1 = [...]
emb2 = [...]
emb3 = [...]

ada_tensor = torch.tensor([emb0, emb1, emb2, emb3], dtype=torch.float32)

neuron_model = torch_neuronx.trace(model, ada_tensor, compiler_args=['--target=inf2'])

torch.jit.save(neuron_model, 'model_neuron.pt')

Output when running convert file

2024-05-22T17:01:35Z Running DoNothing
2024-05-22T17:01:35Z DoNothing finished after 0.000 seconds
2024-05-22T17:01:35Z Running AliasDependencyInduction
2024-05-22T17:01:35Z AliasDependencyInduction finished after 0.000 seconds
2024-05-22T17:01:35Z Running CanonicalizeIR
2024-05-22T17:01:35Z CanonicalizeIR finished after 0.000 seconds
2024-05-22T17:01:35Z Running LegalizeCCOpLayout
2024-05-22T17:01:35Z LegalizeCCOpLayout finished after 0.000 seconds
2024-05-22T17:01:35Z Running ResolveComplicatePredicates
2024-05-22T17:01:35Z ResolveComplicatePredicates finished after 0.001 seconds
2024-05-22T17:01:35Z Running AffinePredicateResolution
2024-05-22T17:01:35Z AffinePredicateResolution finished after 0.000 seconds
2024-05-22T17:01:35Z Running EliminateDivs
2024-05-22T17:01:35Z EliminateDivs finished after 0.001 seconds
2024-05-22T17:01:35Z Running PerfectLoopNest
2024-05-22T17:01:35Z PerfectLoopNest finished after 0.000 seconds
2024-05-22T17:01:35Z Running Simplifier
2024-05-22T17:01:35Z Simplifier finished after 0.001 seconds
2024-05-22T17:01:35Z Running GenericAccessSimplifier
2024-05-22T17:01:35Z GenericAccessSimplifier finished after 0.000 seconds
2024-05-22T17:01:35Z Running TCTransform
2024-05-22T17:01:35Z TCTransform finished after 0.001 seconds
2024-05-22T17:01:35Z Running CommuteConcat
2024-05-22T17:01:35Z CommuteConcat finished after 0.001 seconds
2024-05-22T17:01:35Z Running LowerTensorOp
2024-05-22T17:01:35Z LowerTensorOp finished after 0.005 seconds
2024-05-22T17:01:35Z Running ExpandBatchNorm
2024-05-22T17:01:35Z ExpandBatchNorm finished after 0.001 seconds
2024-05-22T17:01:35Z Running TCTransform
2024-05-22T17:01:35Z TCTransform finished after 0.002 seconds
2024-05-22T17:01:35Z Running EliminateDivs
2024-05-22T17:01:35Z EliminateDivs finished after 0.001 seconds
2024-05-22T17:01:35Z Running GenericAccessSimplifier
2024-05-22T17:01:35Z GenericAccessSimplifier finished after 0.001 seconds
2024-05-22T17:01:35Z Running CanonicalizeIR
2024-05-22T17:01:35Z CanonicalizeIR finished after 0.001 seconds
2024-05-22T17:01:35Z Running TensorOpFusion
2024-05-22T17:01:35Z TensorOpFusion finished after 0.001 seconds
2024-05-22T17:01:35Z Running TensorOpTransform
2024-05-22T17:01:35Z TensorOpTransform finished after 0.002 seconds
2024-05-22T17:01:35Z Running LateLowerTensorOp
2024-05-22T17:01:35Z LateLowerTensorOp finished after 0.001 seconds
2024-05-22T17:01:35Z Running MemcpyElimination
2024-05-22T17:01:35Z MemcpyElimination finished after 0.020 seconds
2024-05-22T17:01:35Z Running LoopFusion
2024-05-22T17:01:35Z LoopFusion finished after 0.009 seconds
2024-05-22T17:01:35Z Running Rematerialization
2024-05-22T17:01:35Z Rematerialization finished after 0.001 seconds
2024-05-22T17:01:35Z Running Simplifier
2024-05-22T17:01:35Z Simplifier finished after 0.005 seconds
2024-05-22T17:01:35Z Running Delinearization
2024-05-22T17:01:35Z Delinearization finished after 0.001 seconds
2024-05-22T17:01:35Z Running AliasDependencyElimination
2024-05-22T17:01:35Z AliasDependencyElimination finished after 0.001 seconds
2024-05-22T17:01:35Z Running DeadStoreElimination
2024-05-22T17:01:35Z DeadStoreElimination finished after 0.043 seconds
2024-05-22T17:01:35Z Running AliasDependencyInduction
2024-05-22T17:01:35Z AliasDependencyInduction finished after 0.000 seconds
2024-05-22T17:01:35Z Running Simplifier
2024-05-22T17:01:36Z Simplifier finished after 0.004 seconds
2024-05-22T17:01:36Z Running LICM
2024-05-22T17:01:36Z LICM finished after 0.002 seconds
2024-05-22T17:01:36Z Running Delinearization
2024-05-22T17:01:36Z Delinearization finished after 0.001 seconds
2024-05-22T17:01:36Z Running LoopFusion
2024-05-22T17:01:36Z LoopFusion finished after 0.000 seconds
2024-05-22T17:01:36Z Running SimplifySlice
2024-05-22T17:01:36Z SimplifySlice finished after 0.001 seconds
2024-05-22T17:01:36Z Running LICM
2024-05-22T17:01:36Z LICM finished after 0.001 seconds
2024-05-22T17:01:36Z Running Simplifier
2024-05-22T17:01:36Z Simplifier finished after 0.004 seconds
2024-05-22T17:01:36Z Running ValueNumbering
2024-05-22T17:01:36Z ValueNumbering finished after 0.001 seconds
2024-05-22T17:01:36Z Running LICM
2024-05-22T17:01:36Z LICM finished after 0.001 seconds
2024-05-22T17:01:36Z Running PadElimination
2024-05-22T17:01:36Z PadElimination finished after 0.000 seconds
2024-05-22T17:01:36Z Running Delinearization
2024-05-22T17:01:36Z Delinearization finished after 0.001 seconds
2024-05-22T17:01:36Z Running LoopFusion
2024-05-22T17:01:36Z LoopFusion finished after 0.001 seconds
2024-05-22T17:01:36Z Running GenericAccessSimplifier
2024-05-22T17:01:36Z GenericAccessSimplifier finished after 0.000 seconds
2024-05-22T17:01:36Z Running Simplifier
2024-05-22T17:01:36Z Simplifier finished after 0.004 seconds
2024-05-22T17:01:36Z Running LICM
2024-05-22T17:01:36Z LICM finished after 0.001 seconds
2024-05-22T17:01:36Z Running ValueNumbering
2024-05-22T17:01:36Z ValueNumbering finished after 0.001 seconds
2024-05-22T17:01:36Z Running TCTransform
2024-05-22T17:01:36Z TCTransform finished after 0.001 seconds
2024-05-22T17:01:36Z Running CommuteConcat
2024-05-22T17:01:36Z CommuteConcat finished after 0.001 seconds
2024-05-22T17:01:36Z Running RecognizeOpIdiom
2024-05-22T17:01:36Z RecognizeOpIdiom finished after 0.002 seconds
2024-05-22T17:01:36Z Running MaskPropagation
2024-05-22T17:01:36Z MaskPropagation finished after 0.001 seconds
2024-05-22T17:01:36Z Running Recompute
2024-05-22T17:01:36Z Recompute finished after 0.000 seconds
2024-05-22T17:01:36Z Running DeadCodeElimination
2024-05-22T17:01:36Z DeadCodeElimination finished after 0.001 seconds
2024-05-22T17:01:36Z Running DoNothing
2024-05-22T17:01:36Z DoNothing finished after 0.000 seconds
2024-05-22T17:01:36Z Running MutateDataType
2024-05-22T17:01:36Z MutateDataType finished after 0.001 seconds
2024-05-22T17:01:36Z Running AutoCastTCInputs
2024-05-22T17:01:36Z AutoCastTCInputs finished after 0.001 seconds
2024-05-22T17:01:36Z Running GenericAccessSimplifier
2024-05-22T17:01:36Z GenericAccessSimplifier finished after 0.001 seconds
2024-05-22T17:01:36Z Running Simplifier
2024-05-22T17:01:36Z Simplifier finished after 0.005 seconds
2024-05-22T17:01:36Z Running AliasDependencyElimination
2024-05-22T17:01:36Z AliasDependencyElimination finished after 0.001 seconds
2024-05-22T17:01:36Z Running DelinearIndices
2024-05-22T17:01:36Z DelinearIndices finished after 0.001 seconds
2024-05-22T17:01:36Z Running Delinearization
2024-05-22T17:01:36Z Delinearization finished after 0.001 seconds
2024-05-22T17:01:36Z Running DelinearIndices
2024-05-22T17:01:36Z DelinearIndices finished after 0.001 seconds
2024-05-22T17:01:36Z Running DeadCodeElimination
2024-05-22T17:01:36Z DeadCodeElimination finished after 0.001 seconds
2024-05-22T17:01:36Z Running InferIntrinsicOnCC
2024-05-22T17:01:36Z InferIntrinsicOnCC finished after 0.001 seconds
2024-05-22T17:01:36Z Running ResolveAccessConflict
2024-05-22T17:01:36Z ResolveAccessConflict finished after 0.002 seconds
2024-05-22T17:01:36Z Running LICM
2024-05-22T17:01:36Z LICM finished after 0.001 seconds
2024-05-22T17:01:36Z Running LocalLayoutOpt
2024-05-22T17:01:36Z LocalLayoutOpt finished after 0.004 seconds
2024-05-22T17:01:36Z Running DelinearIndices
2024-05-22T17:01:36Z DelinearIndices finished after 0.001 seconds
2024-05-22T17:01:36Z Running OrigLayoutTilingPipeline
2024-05-22T17:01:36Z Running GlobalLayoutOpt
2024-05-22T17:01:36Z GlobalLayoutOpt finished after 0.005 seconds
2024-05-22T17:01:36Z Running CanonicalizeDAG
2024-05-22T17:01:36Z CanonicalizeDAG finished after 0.001 seconds
2024-05-22T17:01:36Z Running FlattenAxesForTiling
2024-05-22T17:01:36Z FlattenAxesForTiling finished after 0.000 seconds
2024-05-22T17:01:36Z Running SundaSizeTiling
2024-05-22T17:01:36Z SundaSizeTiling finished after 0.020 seconds
2024-05-22T17:01:36Z OrigLayoutTilingPipeline finished after 0.044 seconds
2024-05-22T17:01:36Z Running TilingProfiler
2024-05-22T17:01:36Z TilingProfiler finished after 0.002 seconds
2024-05-22T17:01:36Z Running FlattenMacroLoop
2024-05-22T17:01:36Z FlattenMacroLoop finished after 0.001 seconds
2024-05-22T17:01:36Z Running InferNeuronTensor
2024-05-22T17:01:36Z InferNeuronTensor finished after 0.006 seconds
2024-05-22T17:01:36Z Running NeuronSimplifier
2024-05-22T17:01:36Z NeuronSimplifier finished after 0.005 seconds
2024-05-22T17:01:36Z Running LICM
2024-05-22T17:01:36Z LICM finished after 0.002 seconds
2024-05-22T17:01:36Z Running RewriteReplicationMatmul
2024-05-22T17:01:36Z RewriteReplicationMatmul finished after 0.001 seconds
2024-05-22T17:01:36Z Running FlattenMacroLoop
2024-05-22T17:01:36Z FlattenMacroLoop finished after 0.001 seconds
2024-05-22T17:01:36Z Running SimplifyMacroPredicates
2024-05-22T17:01:36Z SimplifyMacroPredicates finished after 0.055 seconds
2024-05-22T17:01:36Z Running DataLocalityOpt
2024-05-22T17:01:37Z DataLocalityOpt finished after 0.776 seconds
2024-05-22T17:01:37Z Running DMATilingProfiler
2024-05-22T17:01:37Z DMATilingProfiler finished after 0.002 seconds
2024-05-22T17:01:37Z Running NeuronSimplifier
2024-05-22T17:01:37Z NeuronSimplifier finished after 0.003 seconds
2024-05-22T17:01:37Z Running LegalizeSundaMacro
2024-05-22T17:01:37Z LegalizeSundaMacro finished after 0.002 seconds
2024-05-22T17:01:37Z Running NeuronSimplifier
2024-05-22T17:01:37Z NeuronSimplifier finished after 0.003 seconds
2024-05-22T17:01:37Z Running PerfectLoopNest
2024-05-22T17:01:37Z PerfectLoopNest finished after 0.002 seconds
2024-05-22T17:01:37Z Running FlattenMacroLoop
2024-05-22T17:01:37Z FlattenMacroLoop finished after 0.003 seconds
2024-05-22T17:01:37Z Running RewriteWeights
2024-05-22T17:01:38Z RewriteWeights finished after 0.985 seconds
2024-05-22T17:01:38Z Running ReshapeWeights
2024-05-22T17:01:38Z ReshapeWeights finished after 0.000 seconds
2024-05-22T17:01:38Z Running FlattenMacroLoop
2024-05-22T17:01:38Z FlattenMacroLoop finished after 0.002 seconds
2024-05-22T17:01:38Z Running SimplifyMacroPredicates
2024-05-22T17:01:38Z SimplifyMacroPredicates finished after 0.090 seconds
2024-05-22T17:01:38Z Running InferInitValue
2024-05-22T17:01:38Z InferInitValue finished after 0.038 seconds
2024-05-22T17:01:38Z Running NeuronSimplifier
2024-05-22T17:01:38Z NeuronSimplifier finished after 0.002 seconds
2024-05-22T17:01:38Z Running SimplifyTensor
2024-05-22T17:01:38Z SimplifyTensor finished after 0.003 seconds
2024-05-22T17:01:38Z Running LICM
2024-05-22T17:01:38Z LICM finished after 0.002 seconds
2024-05-22T17:01:38Z Running SundaISel
2024-05-22T17:01:38Z SundaISel finished after 0.015 seconds
2024-05-22T17:01:38Z Running PreprocessNkiKernels
2024-05-22T17:01:38Z PreprocessNkiKernels finished after 0.001 seconds
2024-05-22T17:01:38Z Running NeuronLoopInterchange
2024-05-22T17:01:38Z NeuronLoopInterchange finished after 0.001 seconds
2024-05-22T17:01:38Z Running NeuronSimplifyPredicates
2024-05-22T17:01:38Z NeuronSimplifyPredicates finished after 0.046 seconds
2024-05-22T17:01:38Z Running NeuronLoopFusion
2024-05-22T17:01:38Z NeuronLoopFusion finished after 0.004 seconds
2024-05-22T17:01:38Z Running NeuronLoopInterchange
2024-05-22T17:01:38Z NeuronLoopInterchange finished after 0.001 seconds
2024-05-22T17:01:38Z Running NeuronLICM
2024-05-22T17:01:38Z NeuronLICM finished after 0.001 seconds
2024-05-22T17:01:38Z Running FactorizeBlkDims
2024-05-22T17:01:38Z FactorizeBlkDims finished after 0.002 seconds
2024-05-22T17:01:38Z Running NeuronInstComb
2024-05-22T17:01:38Z NeuronInstComb finished after 0.004 seconds
2024-05-22T17:01:38Z Running NeuronValueNumbering
2024-05-22T17:01:38Z NeuronValueNumbering finished after 0.003 seconds
2024-05-22T17:01:38Z Running NeuronInstComb
2024-05-22T17:01:38Z NeuronInstComb finished after 0.002 seconds
2024-05-22T17:01:38Z Running VectorizeDMA
2024-05-22T17:01:38Z VectorizeDMA finished after 0.002 seconds
2024-05-22T17:01:38Z Running NeuronSimplifyPredicates
2024-05-22T17:01:38Z NeuronSimplifyPredicates finished after 0.035 seconds
2024-05-22T17:01:38Z Running LegalizePartitionReduce
2024-05-22T17:01:38Z LegalizePartitionReduce finished after 0.001 seconds
2024-05-22T17:01:38Z Running DeConcat
2024-05-22T17:01:38Z DeConcat finished after 0.001 seconds
2024-05-22T17:01:38Z Running PartialSimdFusion
2024-05-22T17:01:38Z PartialSimdFusion finished after 0.002 seconds
2024-05-22T17:01:38Z Running TritiumFusion
2024-05-22T17:01:38Z TritiumFusion finished after 0.047 seconds
2024-05-22T17:01:38Z Running CCOpFusion
2024-05-22T17:01:38Z CCOpFusion finished after 0.007 seconds
2024-05-22T17:01:38Z Running VectorizeMatMult
2024-05-22T17:01:38Z VectorizeMatMult finished after 0.001 seconds
2024-05-22T17:01:38Z Running PartialLoopFusion
2024-05-22T17:01:38Z PartialLoopFusion finished after 0.004 seconds
2024-05-22T17:01:38Z Running NeuronLICM
2024-05-22T17:01:38Z NeuronLICM finished after 0.002 seconds
2024-05-22T17:01:38Z Running LowerTranspose
2024-05-22T17:01:38Z LowerTranspose finished after 0.005 seconds
2024-05-22T17:01:38Z Running LateNeuronInstComb
2024-05-22T17:01:38Z LateNeuronInstComb finished after 0.002 seconds
2024-05-22T17:01:38Z Running SplitAccGrp
2024-05-22T17:01:38Z SplitAccGrp finished after 0.001 seconds
2024-05-22T17:01:38Z Running SpillPSum
2024-05-22T17:01:38Z SpillPSum finished after 0.005 seconds
2024-05-22T17:01:38Z Running LowerIntrinsics
2024-05-22T17:01:38Z LowerIntrinsics finished after 0.001 seconds
2024-05-22T17:01:38Z Running LegalizeType
2024-05-22T17:01:38Z LegalizeType finished after 0.214 seconds
2024-05-22T17:01:38Z Running NeuronLICM
2024-05-22T17:01:38Z NeuronLICM finished after 0.002 seconds
2024-05-22T17:01:38Z Running InferPSumTensor
2024-05-22T17:01:38Z InferPSumTensor finished after 0.004 seconds
2024-05-22T17:01:38Z Running WeightCoalescing
2024-05-22T17:01:38Z WeightCoalescing finished after 0.004 seconds
2024-05-22T17:01:38Z Running LegalizeSundaAccess
2024-05-22T17:01:38Z LegalizeSundaAccess finished after 0.033 seconds
2024-05-22T17:01:38Z Running RelaxPredicates
2024-05-22T17:01:38Z RelaxPredicates finished after 0.007 seconds
2024-05-22T17:01:38Z Running TensorInitialization
2024-05-22T17:01:38Z TensorInitialization finished after 0.002 seconds
2024-05-22T17:01:38Z Running NeuronSimplifyPredicates
2024-05-22T17:01:39Z NeuronSimplifyPredicates finished after 0.107 seconds
2024-05-22T17:01:39Z Running ExpandISAMacro
2024-05-22T17:01:39Z ExpandISAMacro finished after 0.003 seconds
2024-05-22T17:01:39Z Running SimplifyNeuronTensor
2024-05-22T17:01:39Z SimplifyNeuronTensor finished after 0.003 seconds
2024-05-22T17:01:39Z Running DMALocalityOpt
2024-05-22T17:01:39Z DMALocalityOpt finished after 0.001 seconds
2024-05-22T17:01:39Z Running DataStreaming
2024-05-22T17:01:39Z DataStreaming finished after 0.002 seconds
2024-05-22T17:01:39Z Running SFKVectorizer
2024-05-22T17:01:40Z SFKVectorizer finished after 0.912 seconds
2024-05-22T17:01:40Z Running LateLegalizeInst
2024-05-22T17:01:40Z LateLegalizeInst finished after 0.001 seconds
2024-05-22T17:01:40Z Running CoalesceCCOp
2024-05-22T17:01:40Z CoalesceCCOp finished after 0.001 seconds
2024-05-22T17:01:40Z Running SimpleAllReduceTiling
2024-05-22T17:01:40Z SimpleAllReduceTiling finished after 0.001 seconds
2024-05-22T17:01:40Z Running StaticProfiler
2024-05-22T17:01:40Z StaticProfiler finished after 0.003 seconds
2024-05-22T17:01:40Z Running SplitAPUnionSets
2024-05-22T17:01:40Z SplitAPUnionSets finished after 0.100 seconds
2024-05-22T17:01:40Z Running DumpGraphAndMetadata
2024-05-22T17:01:40Z DumpGraphAndMetadata finished after 0.004 seconds
2024-05-22T17:01:40Z Running BirCodeGenLoop
2024-05-22T17:01:40Z BirCodeGenLoop finished after 0.040 seconds
2024-05-22T17:01:40Z Running mod_parallel_pass
2024-05-22T17:01:40Z Running rewrite_matmult_sparse
2024-05-22T17:01:40Z rewrite_matmult_sparse finished after 0.000 seconds
2024-05-22T17:01:40Z Running birverifier
2024-05-22T17:01:40Z birverifier finished after 0.009 seconds
2024-05-22T17:01:40Z Running expand_replication
2024-05-22T17:01:40Z expand_replication finished after 0.000 seconds
2024-05-22T17:01:40Z Running unroll
2024-05-22T17:01:40Z unroll finished after 0.016 seconds
2024-05-22T17:01:40Z Running psum_legalization
2024-05-22T17:01:40Z psum_legalization finished after 0.000 seconds
2024-05-22T17:01:40Z Running error_injector
2024-05-22T17:01:40Z error_injector finished after 0.000 seconds
2024-05-22T17:01:40Z Running constant_propagate
2024-05-22T17:01:40Z constant_propagate finished after 0.001 seconds
2024-05-22T17:01:40Z Running vn_splitter
2024-05-22T17:01:40Z vn_splitter finished after 0.001 seconds
2024-05-22T17:01:40Z Running lower_ac
2024-05-22T17:01:40Z lower_ac finished after 0.000 seconds
2024-05-22T17:01:40Z Running input_dma_coalescing
2024-05-22T17:01:40Z input_dma_coalescing finished after 0.000 seconds
2024-05-22T17:01:40Z Running early_peephole_opts
2024-05-22T17:01:40Z early_peephole_opts finished after 0.000 seconds
2024-05-22T17:01:40Z Running pre_sched
2024-05-22T17:01:40Z pre_sched finished after 0.007 seconds
2024-05-22T17:01:40Z Running tensor_copy_elim
2024-05-22T17:01:40Z tensor_copy_elim finished after 0.001 seconds
2024-05-22T17:01:40Z Running mm_packing
2024-05-22T17:01:40Z mm_packing finished after 0.016 seconds
2024-05-22T17:01:40Z Running coloring_allocator_psum
2024-05-22T17:01:40Z coloring_allocator_psum finished after 0.008 seconds
2024-05-22T17:01:40Z Running dma_optimization_psum
2024-05-22T17:01:40Z dma_optimization_psum finished after 0.000 seconds
2024-05-22T17:01:40Z Running address_rotation_psum
2024-05-22T17:01:40Z address_rotation_psum finished after 0.002 seconds
2024-05-22T17:01:40Z Running coloring_allocator_sb
2024-05-22T17:01:40Z coloring_allocator_sb finished after 0.005 seconds
2024-05-22T17:01:40Z Running address_rotation_sb
2024-05-22T17:01:40Z address_rotation_sb finished after 0.001 seconds
2024-05-22T17:01:40Z Running dma_optimization_sb
2024-05-22T17:01:40Z dma_optimization_sb finished after 0.003 seconds
2024-05-22T17:01:40Z Running address_rotation_sb
2024-05-22T17:01:40Z address_rotation_sb finished after 0.003 seconds
2024-05-22T17:01:40Z Running coloring_allocator_dram
2024-05-22T17:01:40Z coloring_allocator_dram finished after 0.001 seconds
2024-05-22T17:01:40Z Running address_rotation_dram
2024-05-22T17:01:40Z address_rotation_dram finished after 0.000 seconds
2024-05-22T17:01:40Z Running tensorcopy_accel
2024-05-22T17:01:40Z tensorcopy_accel finished after 0.000 seconds
2024-05-22T17:01:40Z Running peephole_opts
2024-05-22T17:01:40Z peephole_opts finished after 0.000 seconds
2024-05-22T17:01:40Z Running lower_kernel
2024-05-22T17:01:40Z lower_kernel finished after 0.000 seconds
2024-05-22T17:01:40Z Running build_fdeps
2024-05-22T17:01:40Z build_fdeps finished after 0.002 seconds
2024-05-22T17:01:40Z Running remove_redundancies
2024-05-22T17:01:40Z remove_redundancies finished after 0.000 seconds
2024-05-22T17:01:40Z Running anti_dependency_analyzer
2024-05-22T17:01:40Z anti_dependency_analyzer finished after 0.008 seconds
2024-05-22T17:01:40Z Running tensor_copy_elim
2024-05-22T17:01:40Z tensor_copy_elim finished after 0.001 seconds
2024-05-22T17:01:40Z Running post_sched
2024-05-22T17:01:40Z post_sched finished after 0.047 seconds
2024-05-22T17:01:40Z Running address_rotation_sb
2024-05-22T17:01:40Z address_rotation_sb finished after 0.011 seconds
2024-05-22T17:01:40Z Running anti_dependency_analyzer
2024-05-22T17:01:40Z anti_dependency_analyzer finished after 0.008 seconds
2024-05-22T17:01:40Z Running dep_opt
2024-05-22T17:01:40Z dep_opt finished after 0.005 seconds
2024-05-22T17:01:40Z Running report_stats
2024-05-22T17:01:40Z report_stats finished after 0.000 seconds
2024-05-22T17:01:40Z Running assign_trigger_engine
2024-05-22T17:01:40Z assign_trigger_engine finished after 0.000 seconds
2024-05-22T17:01:40Z Running alloc_queues
2024-05-22T17:01:40Z alloc_queues finished after 0.000 seconds
2024-05-22T17:01:40Z mod_parallel_pass finished after 0.163 seconds
2024-05-22T17:01:40Z Running dep_reduction
2024-05-22T17:01:40Z dep_reduction finished after 0.011 seconds
2024-05-22T17:01:40Z Running bir_racecheck
2024-05-22T17:01:40Z bir_racecheck finished after 0.019 seconds
2024-05-22T17:01:40Z Running lower_dma
2024-05-22T17:01:40Z lower_dma finished after 0.002 seconds
2024-05-22T17:01:40Z Running coalesce_dma_blocks
2024-05-22T17:01:40Z coalesce_dma_blocks finished after 0.000 seconds
2024-05-22T17:01:40Z Running alloc_semaphores
2024-05-22T17:01:40Z alloc_semaphores finished after 0.001 seconds
2024-05-22T17:01:40Z Running expand_inst_late
2024-05-22T17:01:40Z expand_inst_late finished after 0.000 seconds
2024-05-22T17:01:40Z Running lower_sync
2024-05-22T17:01:40Z lower_sync finished after 0.000 seconds
2024-05-22T17:01:40Z Running lower_act
2024-05-22T17:01:40Z lower_act finished after 0.001 seconds
2024-05-22T17:01:40Z Running lower_dve
2024-05-22T17:01:40Z lower_dve finished after 0.004 seconds
2024-05-22T17:01:40Z Running lower_ap
2024-05-22T17:01:40Z lower_ap finished after 0.000 seconds
2024-05-22T17:01:40Z Running alloc_regs
2024-05-22T17:01:40Z alloc_regs finished after 0.000 seconds
2024-05-22T17:01:40Z Running birverifier
2024-05-22T17:01:40Z birverifier finished after 0.002 seconds
2024-05-22T17:01:40Z Running codegen
2024-05-22T17:01:40Z isa_gen finished after 0.012 seconds
2024-05-22T17:01:40Z dma_desc_gen finished after 0.003 seconds
2024-05-22T17:01:40Z debug_info_gen finished after 0.004 seconds
2024-05-22T17:01:40Z codegen finished after 0.020 seconds
2024-05-22T17:01:40Z Running neff_packager
2024-05-22T17:01:42Z neff_packager finished after 2.291 seconds
2024-05-22T17:01:43Z Compiler status PASS

When I run torch.jit.load('model_neuron.pt') to load the model, I receive the following error:

Traceback (most recent call last):
  File "test.py", line 3, in <module>
    model_neuron = torch.jit.load('model_neuron.pt')
  File "/opt/env/neuronx/lib/python3.8/site-packages/torch/jit/_serialization.py", line 162, in load
    cpp_module = torch._C.import_ir_module(cu, str(f), map_location, _extra_files)
RuntimeError:
Unknown type name '__torch__.torch.classes.neuron.Model':
  File "code/__torch__/torch_neuronx/xla_impl/trace.py", line 6
  training : bool
  _is_full_backward_hook : Optional[bool]
  model : __torch__.torch.classes.neuron.Model
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
  states : __torch__.torch.nn.modules.container.ParameterList
  weights : __torch__.torch.nn.modules.container.ParameterDict

Is there something that I am missing?

Thanks @brunonishimoto,

We're taking a look. We'll get back to you once we've been able to reproduce.

@aws-taylor thanks for replying.

I was able to solve it. I was missing import torch_neuronx in the file where I call torch.jit.load('model_neuron.pt'). I thought that import torch was enough.
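
For anyone who hits the same error, here is a minimal sketch of a loading script that guards against the missing import. It assumes the traced model was saved as model_neuron.pt, as in the convert file above. The helper names neuron_extension_installed and load_traced_model are hypothetical and are not part of torch or torch_neuronx. The comment about class registration is an assumption that fits the error message; it is not confirmed in this thread.

```python
import importlib
import importlib.util


def neuron_extension_installed() -> bool:
    """Return True if the torch_neuronx package can be found in this environment."""
    return importlib.util.find_spec("torch_neuronx") is not None


def load_traced_model(path: str):
    """Load a model that was traced with torch_neuronx.trace and saved with torch.jit.save."""
    if not neuron_extension_installed():
        raise ImportError(
            "torch_neuronx is not installed; torch.jit.load cannot resolve "
            "'__torch__.torch.classes.neuron.Model' without it."
        )
    # Assumption: importing torch_neuronx registers the Neuron custom class
    # with TorchScript as a side effect, so it must happen before the load.
    importlib.import_module("torch_neuronx")
    import torch

    return torch.jit.load(path)


if __name__ == "__main__":
    model_neuron = load_traced_model("model_neuron.pt")
    print(type(model_neuron))
```

In a plain script the same effect comes from putting import torch_neuronx next to import torch at the top of the file that calls torch.jit.load. The helper only makes the dependency explicit and fails with a clearer message when the package is absent.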

Oof, good catch. I'll work with our doc writers to get this added to our troubleshooting guide so that hopefully the next person doesn't stumble on it. Resolving for now, but don't hesitate to reach out if you have further issues.