Getting [signal SIGSEGV: segmentation violation code=0x1 addr=0x40 pc=0x7fe2df901be7] when run from container image in Kube onPrem.
afernandezod opened this issue · 4 comments
I get the above error when calling MQManager, err = ibmmq.Connx(qMgrName, cno). It is a TLS connection, and I have already verified that all parameters for cno, csp, sco and cd are correct. It only happens when the service runs in a Docker container in Kube. When run locally with Go, or locally from the container, it runs fine. Firewall restrictions have also been checked and look fine.
Actual code:
// MQconnect - to connect to MQ
func MQconnect(conf config.ConfigInterface) (bool, ibmmq.MQQueueManager) {
	var qMgrName string
	resp := true

	// Allocate the MQCNO structure needed for the CONNX call.
	cno := ibmmq.NewMQCNO()
	cd := ibmmq.NewMQCD()
	csp := ibmmq.NewMQCSP()
	sco := ibmmq.NewMQSCO()

	qMgrName = conf.GetString("MQ.MngrName")
	cd.ChannelName = conf.GetString("MQ.channel")
	cd.ConnectionName = conf.GetString("MQ.host")
	cd.SSLCipherSpec = conf.GetString("MQ.cipher")
	cd.SSLClientAuth = ibmmq.MQSCA_OPTIONAL
	sco.KeyRepository = conf.GetString("MQ.keyrepository")
	sco.CertificateLabel = conf.GetString("MQ.certificateLabel")
	cd.CertificateLabel = conf.GetString("MQ.certificateLabel")
	csp.AuthenticationType = ibmmq.MQCSP_AUTH_USER_ID_AND_PWD
	csp.UserId = conf.GetString("MQ.user")
	csp.Password = conf.GetString("MQ.pass")
	mqName := conf.GetString("MQ.name")
	MQname = mqName

	// Make the CNO refer to the CSP, CD and SCO structure so it gets used during the connection
	cno.SecurityParms = csp
	cno.ClientConn = cd
	cno.SSLConfig = sco

	// Indicate that we definitely want to use the client connection method.
	cno.Options = ibmmq.MQCNO_CLIENT_BINDING

	log.Info(cno)
	log.Info(cd)
	log.Info(csp)
	log.Info(sco)

	MQManager, err = ibmmq.Connx(qMgrName, cno)
	if err == nil {
		resp = true
	} else {
		resp = false
	}
	return resp, MQManager
}
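A side note on the function above: it collapses the connection error into a bool, so the caller never sees the reason code. A minimal sketch of returning the error instead is shown here. It does not call the real library: dialFunc is a hypothetical stand-in for the actual connect call, and errBroken is a made-up error value used only to exercise the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// dialFunc is a hypothetical stand-in for the real connect call.
// It is NOT part of any MQ library; it only keeps this sketch self-contained.
type dialFunc func(qMgrName string) (handle string, err error)

// errBroken is an illustrative error value, not one produced by a real client.
var errBroken = errors.New("MQRC_CONNECTION_BROKEN [2009]")

// connect returns the underlying error (wrapped with context) instead of a
// bool, so the caller can log or inspect the actual failure reason.
func connect(qMgrName string, dial dialFunc) (string, error) {
	h, err := dial(qMgrName)
	if err != nil {
		return "", fmt.Errorf("connect to queue manager %q: %w", qMgrName, err)
	}
	return h, nil
}

func main() {
	failing := func(string) (string, error) { return "", errBroken }
	if _, err := connect("QM1", failing); err != nil {
		fmt.Println("error:", err)
	}
}
```

With this shape the caller can print the error straight away, which makes a failure such as a broken connection visible in the container logs rather than hidden behind a false return value.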
Docker File entries (to install and use mq):
# copy/unpack mq client
RUN mkdir -p /opt/mqm
COPY 9.1.0.7-IBM-MQC-Redist-LinuxX64.tar.gz /opt/mqm/
RUN cd /opt/mqm \
 && tar -xvf ./*.tar.gz \
 && rm -f ./*.tar.gz \
 && chmod a+rx /opt/mqm
COPY --from=builder /opt/mqm /opt/mqm
RUN chmod -R a+rx /opt/mqm \
 && ls -lta /opt/mqm/
- MQ Version ibm.com_IBM_MQ_Client-9.1.0.swidtag
- Go compiler 1.15.2 darwin/amd64
Any help will be greatly appreciated!!!!
Difficult to know for sure.
But one problem that has been seen with some container configurations, particularly when running under restricted security configurations in OpenShift, is that the MQ client can't access the directory it needs to log errors. It's normally under $HOME, but that might not be set or available in some containers. See this PR, which made a change for those programs to explicitly create and set permissions on a directory. Or use the MQ_OVERRIDE_DATA_PATH env var as documented here.
I wouldn't expect that failure to access the log directories to cause a SEGV in current versions of MQ, as we fixed a problem in that area a few years ago. You should just get a more normal MQRC failure code. But if you're using a 9.1 fixpack level, then it's possible that change wasn't backported to the LTS version.
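One way to check this from inside the container is a small pre-flight probe run before connecting. The sketch below is only an illustration under assumptions: it tests whether the directory named by MQ_OVERRIDE_DATA_PATH (if set) and $HOME are writable by the current user, and it does not know which subdirectories the MQ client actually uses. The helper names checkWritable and candidateDirs are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// checkWritable creates dir if needed and then tries to create and remove a
// small probe file in it. A non-nil error means the current user cannot
// write there.
func checkWritable(dir string) error {
	if err := os.MkdirAll(dir, 0o777); err != nil {
		return fmt.Errorf("create %s: %w", dir, err)
	}
	probe := filepath.Join(dir, ".write-probe")
	f, err := os.OpenFile(probe, os.O_CREATE|os.O_WRONLY, 0o600)
	if err != nil {
		return fmt.Errorf("write to %s: %w", dir, err)
	}
	f.Close()
	return os.Remove(probe)
}

// candidateDirs lists the locations mentioned in this thread. Assumption:
// these are the base directories worth probing, not the exact paths the
// client writes to.
func candidateDirs() []string {
	var dirs []string
	if p := os.Getenv("MQ_OVERRIDE_DATA_PATH"); p != "" {
		dirs = append(dirs, p)
	}
	if home := os.Getenv("HOME"); home != "" {
		dirs = append(dirs, home)
	}
	return dirs
}

func main() {
	if os.Getenv("HOME") == "" {
		fmt.Println("HOME is not set")
	}
	for _, d := range candidateDirs() {
		if err := checkWritable(d); err != nil {
			fmt.Println("not writable:", err)
			continue
		}
		fmt.Println("writable:", d)
	}
}
```

Running this as the same user and with the same security context as the service should show whether the problem is a missing or read-only directory before the MQ client is involved at all.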
Thanks @ibmmqmet.
I think we got past the segmentation violation by using a newer version of the tar.gz file: we were using 9.1.0.7 and have now changed to 9.2.2.0. In addition, we added write permission to the opt/mqm and IBM folders.
Now, when run from KUBE-OnPrem, we are getting "MQDISC: MQCC = MQCC_FAILED [2] MQRC = MQRC_CONNECTION_BROKEN [2009]", which I believe already existed before and was just masked by the segmentation problem. Also, before the broken connection issue, we are getting the error "AMQ6300E: Directory '//.mqm' could not be created: 'EACCES - Permission denied'.", which I had hoped would be resolved by adding write permissions.
At this point it looks like there are more permission issues within the Docker container.
Here are the changes in the Dockerfile:
# copy/unpack mq client
RUN mkdir -p /opt/mqm \
 && mkdir -p /IBM/MQ/data/errors
# COPY 9.1.0.7-IBM-MQC-Redist-LinuxX64.tar.gz /opt/mqm/
COPY 9.2.2.0-IBM-MQC-Redist-LinuxX64.tar.gz /opt/mqm/
RUN cd /opt/mqm \
 && tar -xvf ./*.tar.gz \
 && rm -f ./*.tar.gz \
 && bin/genmqpkg.sh -b /opt/mqm \
 && chmod -R a+rwx /opt/mqm \
 && chmod -R a+rwx /IBM
.
.
.
COPY --from=builder /opt/mqm /opt/mqm
COPY --from=builder /IBM /IBM
RUN chmod -R a+rwx /opt/mqm \
 && ls -lta /opt/mqm/ \
 && chmod -R a+rwx /IBM \
 && ls -lta /IBM/
I'm still investigating from sources like ibm-messaging/mq-container#310 but again, any help will be greatly appreciated.
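The '//.mqm' path in the AMQ6300E message above is consistent with HOME being "/" inside the container. The sketch below only illustrates that reading. It assumes the path is built by joining $HOME and "/.mqm" as plain strings, which is a guess based on the error text and not confirmed client behaviour; userDir is a hypothetical helper.

```go
package main

import "fmt"

// userDir mimics plain string concatenation of $HOME and "/.mqm".
// Assumption: this is a guess at how the path in the error message is
// formed, not the MQ client's actual implementation.
func userDir(home string) string {
	return home + "/.mqm"
}

func main() {
	for _, home := range []string{"/", "/home/app", ""} {
		fmt.Printf("HOME=%q -> %q\n", home, userDir(home))
	}
}
```

If that reading is right, the directory ends up at the filesystem root, where a non-root container user normally cannot create it, so it would have to exist with suitable permissions before the service starts.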
Did you look at the PR I linked to before? It has this diff:
&& mkdir -p /IBM/MQ/data/errors \
&& mkdir -p /.mqm \
&& chmod -R 777 /IBM \
&& chmod -R 777 /.mqm
There should be no reason to change any permissions on /opt/mqm - that's for code, not data.
Once more, thanks @ibmmqmet.
After reviewing the code and discussing some security concerns, we changed the chmod from a+rwx to 777 for /.mqm and it worked.
Really appreciate it.