nautobot/nautobot-app-ssot

ACI Sync Job fails with TypeError: 'str' object is not callable

kingfetty opened this issue · 24 comments

Environment

  • Python version: 3.10.12
  • Nautobot version: 2.2.7
  • nautobot-ssot version: 2.6.1

Expected Behavior

The job should synchronize data from the ACI fabric to the Nautobot instance.

Observed Behavior

Running the job connects to the ACI APIC and starts pulling data. On reaching load_devices, the job errors out with the stack trace below:

`Traceback (most recent call last):
File "/opt/nautobot/lib/python3.10/site-packages/diffsync/store/__init__.py", line 155, in get_or_instantiate
obj = self.get(model=model, identifier=ids)
File "/opt/nautobot/lib/python3.10/site-packages/diffsync/store/local.py", line 49, in get
raise ObjectNotFound(f"{modelname} {uid} not present in {str(self)}")
diffsync.exceptions.ObjectNotFound: device_type N9K-C93180YC-FX3__ not present in LocalStore

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/opt/nautobot/lib/python3.10/site-packages/celery/app/trace.py", line 477, in trace_task
R = retval = fun(*args, **kwargs)
File "/opt/nautobot/lib/python3.10/site-packages/celery/app/trace.py", line 760, in protected_call
return self.run(*args, **kwargs)
File "/opt/nautobot/lib/python3.10/site-packages/nautobot/extras/jobs.py", line 1136, in run_job
result = job(*args, **kwargs)
File "/opt/nautobot/lib/python3.10/site-packages/nautobot/extras/jobs.py", line 149, in __call__
return self.run(*args, **deserialized_kwargs)
File "/opt/nautobot/lib/python3.10/site-packages/nautobot_ssot/integrations/aci/jobs.py", line 90, in run
super().run(dryrun=self.dryrun, memory_profiling=self.memory_profiling, *args, **kwargs)
File "/opt/nautobot/lib/python3.10/site-packages/nautobot_ssot/jobs/base.py", line 317, in run
self.sync_data(memory_profiling)
File "/opt/nautobot/lib/python3.10/site-packages/nautobot_ssot/jobs/base.py", line 136, in sync_data
self.load_source_adapter()
File "/opt/nautobot/lib/python3.10/site-packages/nautobot_ssot/integrations/aci/jobs.py", line 75, in load_source_adapter
self.source_adapter.load()
File "/opt/nautobot/lib/python3.10/site-packages/nautobot_ssot/integrations/aci/diffsync/adapters/aci.py", line 438, in load
self.load_devices()
File "/opt/nautobot/lib/python3.10/site-packages/nautobot_ssot/integrations/aci/diffsync/adapters/aci.py", line 412, in load_devices
self.get_or_instantiate(
File "/opt/nautobot/lib/python3.10/site-packages/diffsync/__init__.py", line 813, in get_or_instantiate
return self.store.get_or_instantiate(model=model, ids=ids, attrs=attrs)
File "/opt/nautobot/lib/python3.10/site-packages/diffsync/store/__init__.py", line 159, in get_or_instantiate
obj = model(**ids, **attrs)
TypeError: 'str' object is not callable`

Steps to Reproduce

Create a Nautobot install.
Ensure SSOT[ACI] is installed and loaded with the correct environment variables.
Enable the sync job.
Execute the sync job against the controller.

Suspected Cause

Within the plugin code for the ACI adapter, load_devices() calls self.get_or_instantiate() and passes the string "device_type" as the model argument. Further down, diffsync tries to call that argument to construct the object (obj = model(**ids, **attrs)). Because a string is not callable, this raises the TypeError. I suspect get_or_instantiate should be receiving the model class rather than its name.
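The failure mode can be reproduced without diffsync. The following is a minimal sketch, assuming a simplified stand-in for the store: `TinyStore` and `DeviceType` (including its `model` and `part_nbr` fields) are hypothetical names for illustration, not diffsync's or the ACI integration's actual code. The only line taken from the traceback is `obj = model(**ids, **attrs)`.

```python
from dataclasses import dataclass


@dataclass
class DeviceType:
    """Hypothetical stand-in for the adapter's device_type model class."""

    model: str
    part_nbr: str
    comments: str = ""


class TinyStore:
    """Simplified stand-in for a store; NOT diffsync's implementation."""

    def __init__(self):
        self._objects = {}

    def get_or_instantiate(self, model, ids, attrs=None):
        attrs = attrs or {}
        key = tuple(ids.values())
        if key in self._objects:
            return self._objects[key], False
        # Same shape as the failing line in the traceback: `model` must be callable.
        obj = model(**ids, **attrs)
        self._objects[key] = obj
        return obj, True


store = TinyStore()
ids = {"model": "N9K-C93180YC-FX3", "part_nbr": ""}

error_text = None
try:
    # Passing the model's *name* reproduces the TypeError.
    store.get_or_instantiate(model="device_type", ids=ids)
except TypeError as err:
    error_text = str(err)

# Passing the class itself works.
obj, created = store.get_or_instantiate(model=DeviceType, ids=ids)
```

In the real adapter the equivalent change would be passing the class attribute (self.device_type) instead of the string, as described in the next paragraph.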

After changing the argument from the string to self.device_type, the code fails at load_interfaces() instead, because a block there reads the variable device_specs before it is assigned. The variable is created inside an if block that checks whether a YAML file exists, but it is referenced outside that if block.
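The read-before-assignment pattern looks roughly like the sketch below. This is an illustration only: the function names and the placeholder dictionary are hypothetical, and the real adapter parses a YAML file rather than returning a stub.

```python
import os


def load_specs_buggy(path):
    """Mirrors the reported pattern: the name is bound only inside the `if`."""
    if os.path.exists(path):
        device_specs = {"source": path}  # placeholder for the parsed YAML
    # Read outside the `if`: unbound when the file is missing.
    return device_specs


def load_specs_safe(path):
    """One possible fix: bind a default before the conditional."""
    device_specs = {}
    if os.path.exists(path):
        device_specs = {"source": path}  # placeholder for the parsed YAML
    return device_specs
```

With a missing file, the first function raises UnboundLocalError while the second returns an empty mapping that the caller can check.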

It also appears that the Sync Job uses the default soft time limit of 5 minutes, which is far too short for a sync job. This should be raised in the jobs.py Meta to at least several hours.

I had the same issue. If the issue is fixed, do we need to reinstall the ssot plugin?

@jdrew82 I have reinstalled the ssot plugin and run the ACI sync job again. The job is still failing, now with a new error:
"exc_message": [
"ltd"]
where "ltd" is the identifier used in my env variables. If I replace ltd with abc or any other name, that name becomes the error message.
These are the env vars used:
NAUTOBOT_APIC_BASE_URI_LTD
NAUTOBOT_APIC_USERNAME_LTD
NAUTOBOT_APIC_PASSWORD_LTD
NAUTOBOT_APIC_VERIFY_LTD
NAUTOBOT_APIC_SITE_LTD
NAUTOBOT_APIC_TENANT_PREFIX_LTD

@ShivaniMauryaIntel can you provide the full traceback? Without knowing where this is happening it's tough to track down the cause.

Also appears Sync Job is using default soft timeout limit of 5 minutes which is way too short for a sync job. This should be updated in the jobs.py meta to be at least several hours.

The global timeout is defined with the NAUTOBOT_CELERY_TASK_SOFT_TIME_LIMIT and NAUTOBOT_CELERY_TASK_TIME_LIMIT environment variables as hard limits. You can go into the Job settings and set a lower timeout if desired. Although we can set one in the Job itself it's difficult to know an appropriate timeout as it will vary quite a bit depending upon your deployment size and size of dataset being imported.
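As a rough illustration of how those two variables resolve, here is a minimal sketch. The helper name is hypothetical, the 5-minute soft default comes from the comment quoted above, and the 10-minute hard default is an assumption that should be checked against the Nautobot settings documentation for your version.

```python
import os


def celery_time_limits(environ=None):
    """Return (soft, hard) task time limits in seconds (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    # Soft limit: 5 minutes per the thread; hard limit default is an assumption.
    soft = int(environ.get("NAUTOBOT_CELERY_TASK_SOFT_TIME_LIMIT", 5 * 60))
    hard = int(environ.get("NAUTOBOT_CELERY_TASK_TIME_LIMIT", 10 * 60))
    return soft, hard
```

For example, setting the soft limit to 7200 and the hard limit to 7500 in the environment shared by the web and worker processes would give long-running sync jobs about two hours.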

@jdrew82 Based on your suggestion on Slack, I have defined my env vars in a .env file and referenced that file in the Nautobot service, following https://josh-v.com/nautobot-environment-file/
These are the env vars used:
NAUTOBOT_APIC_BASE_URI_LTD
NAUTOBOT_APIC_USERNAME_LTD
NAUTOBOT_APIC_PASSWORD_LTD
NAUTOBOT_APIC_VERIFY_LTD
NAUTOBOT_APIC_SITE_LTD
NAUTOBOT_APIC_TENANT_PREFIX_LTD

Now, when I run the ACI sync job from the Nautobot UI, I get an error on "ltd", which is the identifier used in my env variables. If I replace ltd with abc or any other name, that name shows in the error message instead.
Return Data:
{
"exc_message": [
"ltd"
],
"exc_module": "builtins",
"exc_type": "KeyError"
}

Traceback (most recent call last):
File "/opt/nautobot/lib64/python3.9/site-packages/celery/app/trace.py", line 477, in trace_task
R = retval = fun(*args, **kwargs)
File "/opt/nautobot/lib64/python3.9/site-packages/celery/app/trace.py", line 760, in protected_call
return self.run(*args, **kwargs)
File "/opt/nautobot/lib64/python3.9/site-packages/nautobot/extras/jobs.py", line 1136, in run_job
result = job(*args, **kwargs)
File "/opt/nautobot/lib64/python3.9/site-packages/nautobot/extras/jobs.py", line 149, in __call__
return self.run(*args, **deserialized_kwargs)
File "/opt/nautobot/lib64/python3.9/site-packages/nautobot_ssot/integrations/aci/jobs.py", line 90, in run
super().run(dryrun=self.dryrun, memory_profiling=self.memory_profiling, *args, **kwargs)
File "/opt/nautobot/lib64/python3.9/site-packages/nautobot_ssot/jobs/base.py", line 317, in run
self.sync_data(memory_profiling)
File "/opt/nautobot/lib64/python3.9/site-packages/nautobot_ssot/jobs/base.py", line 136, in sync_data
self.load_source_adapter()
File "/opt/nautobot/lib64/python3.9/site-packages/nautobot_ssot/integrations/aci/jobs.py", line 74, in load_source_adapter
self.source_adapter = AciAdapter(job=self, sync=self.sync, client=aci_creds[self.apic])
KeyError: 'ltd'
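A KeyError whose message is the bare key means the aci_creds mapping held no entry under exactly that key at the time of the lookup. The sketch below is a hypothetical illustration of how suffixed environment variables could be collected and how the lookup could report the known keys; `collect_apic_names` and `pick_creds` are made-up helpers, not the integration's code, and lower-casing the suffix is an assumption.

```python
PREFIX = "NAUTOBOT_APIC_BASE_URI_"


def collect_apic_names(environ):
    """Return the lower-cased APIC suffixes found in `environ` (hypothetical)."""
    return sorted(k[len(PREFIX):].lower() for k in environ if k.startswith(PREFIX))


def pick_creds(aci_creds, apic):
    """Look up one APIC's settings, naming the known keys on failure."""
    try:
        return aci_creds[apic]
    except KeyError:
        known = sorted(aci_creds)
        raise KeyError(f"APIC {apic!r} not configured; known: {known}") from None
```

Running `collect_apic_names(os.environ)` in the process that executes the job shows which suffixes that process can actually see.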

@jdrew82 Has this been raised as a separate bug against the ACI Sync Job? I am a bit concerned, as a bug in the ACI Sync job was fixed 2 weeks ago and now there is another one.

@ShivaniMauryaIntel @kingfetty Can you confirm if this has been resolved with the recent updates to the ACI integration?

@jdrew82 I will test this in a couple of days and will let you know.

@jdrew82 There is another issue now. I have upgraded the ssot package to 3.1.0. When running the ACI sync job, no "ACI APIC" appears in the dropdown:
[image]
I have validated that the APIC details are set as environment variables and are visible from the Nautobot shell:
[image]

This is the second time your team has fixed the issue, yet in practice it is still not resolved in Production. Can you please perform end-to-end testing before releasing a new version? As users, we keep reinstalling the package, testing, hitting another error, and raising a bug. This is going in a loop without adding any value. I hope you understand the pain points from the user's perspective.

@ShivaniMauryaIntel I'm sorry you're running into another bug, but please don't insinuate that we don't test these integrations. These integrations have been created through engagements with our clients, and if they give us approval we open-source them. There is no requirement for us to do so, but we do it to give back to the community. With that in mind, we do fully test and validate the functionality of an integration within one or more clients' environments before we open-source it. However, even then it is impossible to account for every use case or possible situation. That is why we encourage the community to open Issues like this one, notify us of bugs, or even submit PRs that resolve them.

Regarding the bug you're seeing, this isn't really a bug but a change in functionality. The ACI integration has been updated to utilize the Controller model. The code that would take the configuration from nautobot_config.py has been removed. Please review the documentation for specifics on what settings are required when creating the Controller object.

@jdrew82 I have made the required changes to the nautobot_config.py file. However, I still don't see any ACI APIC loaded in the dropdown menu.

I think you misunderstood my last comment. The code that pulls the configuration from nautobot_config.py is no longer in place. You must create the ExternalIntegration and Controller objects manually inside the Nautobot UI. The installation instructions for ACI denote the required pieces for ACI in the ExternalIntegration.

Can you share documentation on creating the ExternalIntegration object? I don't see it in the Nautobot UI. I am on Nautobot version 2.3.2.

@ShivaniMauryaIntel You can find ExternalIntegration under the Extensibility menu. Unfortunately, we don't have screenshots showing the process with the ACI integration but we do have some images and documentation on the ExternalIntegration portion in the Device42 documentation. The only piece you'd need to do after that is to create a Controller object that references that ExternalIntegration. You can find those under the Devices menu.

Hi @jdrew82 I have added the ACI as an ExternalIntegration and created a Controller object that references it. However, when running the job, I get the error "Undefined environment variable":
[image]

I have validated both the username and password from the Nautobot shell, and they show up:
[image]

The username also verifies after adding it as a Secret in the Nautobot UI:
[image]

What could be the issue here?

@ShivaniMauryaIntel Can you confirm that the environment variable is also available to the worker process? If you're deploying with service files, i.e. systemd, then I suggest using an EnvironmentFile that's shared by both processes. If you're using containers, just make sure that both the nautobot and worker containers are getting the same env file.
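One way to check this is to run the same small probe in both the web process's environment and the worker's. The helper below is a hypothetical sketch; the variable names passed to it are whatever your Secrets reference (the ones shown are taken from earlier in this thread).

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


REQUIRED = ["NAUTOBOT_APIC_USERNAME_LTD", "NAUTOBOT_APIC_PASSWORD_LTD"]
print("missing:", missing_env_vars(REQUIRED))
```

If the list is empty in the Nautobot shell but not in the worker's environment, the worker service is not loading the same environment file.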

OK, I have now shared the EnvironmentFile with both processes. Now I am getting the error below:
[image]

FYI, I am using a certificate signed by a trusted authority for the Nautobot server.
I have not modified the default location of the cert and key files. My cert file is under /etc/pki/tls/certs/ and the key file is under /etc/pki/tls/private. The Nautobot UI also works with this SSL certificate.

It looks like an SSL verification error for an internal site. You need to ensure that you have the CA cert and private key for that site installed and trusted. This is fairly standard when working with internal CA protected sites.
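For context, when a Python client verifies a server, what matters is the CA bundle the client trusts, not the certificate the Nautobot web server presents. A minimal standard-library sketch, assuming a hypothetical helper name and that the internal CA is available as a PEM file:

```python
import ssl
from typing import Optional


def build_verify_context(ca_bundle: Optional[str] = None) -> ssl.SSLContext:
    """Build a client-side TLS context that verifies the server certificate.

    `ca_bundle` is a path to a PEM file holding the internal CA chain;
    None falls back to the system's default trust store.
    """
    return ssl.create_default_context(cafile=ca_bundle)


context = build_verify_context()
```

Passing the path of the internal CA's PEM file instead of None lets the same context trust a site signed by that CA.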

Another error with the ACI job:
[image]

I have the Controller associated with a Managed Device Group:
[image]

Can you provide the traceback for where that particular error came from? It should be under the Advanced tab on the Job.

Actually, I think I might have spotted an issue here.

@jdrew82 Did you push these code changes to the main branch? I uninstalled and reinstalled the ssot[aci] plugin, and every time the older code gets installed:
[image]

I can see that the main branch doesn't have this code change.