bggo/Zmbkpose

Using issue34 on ZCS 8.0.3

Opened this issue · 25 comments

Hello,

I'm using the latest version of Zmbkpose (issue34) but i'm having a problem with it. No matter if I make full or incremental backup constantly the script is hanging with the following reason:

2013-05-16 10:32:22 DEBUG: xxx@domain.com - Last backup from 09/05/2013 10:26:50 ...
2013-05-16 10:32:22 DEBUG: xxx@domain.com - START INC backup ...
2013-05-16 10:32:22 DEBUG: xxx@domain.com - Doing ldap backup ...
2013-05-16 10:32:22 DEBUG: xxx@domain.com - ldap backup complete
2013-05-16 10:32:22 DEBUG: xxx@domain.com - Doing mailbox backup ...
curl: (35) SSL connect error

Anyone knows anything about this issue?

I think the problem is caused by ssl negotiation. Show me the output off :
curl -v -S -k https://IP_MAIL_HOST:7071/
Replacing IP_MAIL_HOST by its corresponding value.

Thanks for your reply

Here is the output:

[root@mx sync]# curl -v -S -k https://mx.domain.com:7071/

  • About to connect() to mx.domain.com port 7071 (#0)
  • Trying 111.111.111.111... connected
  • Connected to mx.domain.com (111.111.111.111) port 7071 (#0)
  • Initializing NSS with certpath: sql:/etc/pki/nssdb
  • warning: ignoring value of ssl.verifyhost
  • skipping SSL peer certificate verification
  • SSL connection using TLS_DHE_RSA_WITH_AES_256_CBC_SHA
  • Server certificate:
  •   subject: CN=*.domain.com,OU=Web,O=Domain Ltd.,L=Sofia,ST=Bulgaria,C=BG,serialNumber=Oc8qHAKrVVR7Y7N-EmHsR-ZEBYe-cxQp
    
  •   start date: Jun 13 03:33:11 2012 GMT
    
  •   expire date: Aug 15 00:41:34 2013 GMT
    
  •   common name: *.domain.com
    
  •   issuer: CN=GeoTrust SSL CA,O="GeoTrust, Inc.",C=US
    

    GET / HTTP/1.1
    User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.13.1.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2
    Host: mx.domain.com:7071
    Accept: /

    < HTTP/1.1 302 Found
    < Date: Fri, 17 May 2013 13:20:23 GMT
    < Expires: Tue, 24 Jan 2000 20:46:50 GMT
    < Cache-Control: no-store, no-cache, must-revalidate, max-age=0
    < Pragma: no-cache
    < X-Frame-Options: SAMEORIGIN
    < Content-Type: text/html;charset=UTF-8
    < Location: https://mx.domain.com:7071/zimbraAdmin
    < Content-Length: 0
    <
  • Connection #0 to host mx.domain.com left intact
  • Closing connection #0
    [root@mx sync]#

There is no error here.
The curl command error , always occurs, or is a sporadic error ?

This is happening after we've upgraded from 8.0.0 to 8.0.3, and the error appears on incremental backup, today i'll try to start full backup, and will report. The Zimbra installation is working on CentOS 6.3 x64

after 28 hours of full backup, it showed 2 times this error and needed to be restarted. does Zmbkpose needs a timeout between backing up 2 accounts in order to prevent more connections to the server?

There is a "-e" argument for wait between 2 accounts backups , but I think
this is not the problem. I am testing 8.0.3 version with zmbkpose.

2013/5/20 ymarinov notifications@github.com

after 28 hours of full backup, it showed 2 times this error and needed to
be restarted. does Zmbkpose needs a timeout between backing up 2 accounts
in order to prevent more connections to the server?


Reply to this email directly or view it on GitHubhttps://github.com//issues/35#issuecomment-18133676
.

please let me know how's your test going on.

Sorry, but I can't reproduce this error testing on client host with :
curl-7.15.5-9.el5
openssl-0.9.8e-22.el5_8.4
And server host with:
ZCS Open Source Edition 8.0.3

You say : "after 28 hours of full backup, it showed 2 times this error..." ok, so, this is not always happening.

Are you running zmbkpose in same zimbra server? or in a different host, with a firewall (with a NAT configured) in the middle?

In the first lines off zmbkpose, change the value of "CURL_args" variable adding "-v" argument for get verbose info of curl. should look like this:
CURL_args="-v -s -S -k"
Let's wait for an error occurs.

OK, i'll try with those options. The script is running on the same machine, and it's running of working time in order not to load the server

Same problem here. Any solution?

curl-7.19.7-36.el6_4.x86_64
openssl-1.0.0-27.el6_4.2.x86_64
ZCS Open Source Edition 8.0.4

Error:
2013-08-21 05:15:05 DEBUG: user@domain.com - Last backup from 20/08/2013 05:14:11 ...
2013-08-21 05:15:05 DEBUG: user@domain.com - START INC backup ...
2013-08-21 05:15:05 DEBUG: user@domain.com - Doing ldap backup ...
2013-08-21 05:15:05 DEBUG: user@domain.com - ldap backup complete
2013-08-21 05:15:05 DEBUG: user@domain.com - Doing mailbox backup ...

  • About to connect() to mailbox.domain.com port 7071 (#0)
  • Trying 192.168.1.10... connected
  • Connected to mailbox.domain.com (192.168.1.10) port 7071 (#0)
  • Initializing NSS with certpath: sql:/etc/pki/nssdb
  • warning: ignoring value of ssl.verifyhost
  • skipping SSL peer certificate verification
  • NSS error -5961
  • Closing connection #0
  • SSL connect error
    curl: (35) SSL connect error

The script is running on the same machine

Hi.
If you run repeatedly the test. Could you tell me if the 5961 nss error code repeats itself or if it changes?

Hi, thanks for the help.

It's always the same error

grep "NSS error" /opt/zimbra/log/backup.log

  • NSS error -5961
  • NSS error -5961
  • NSS error -5961
  • NSS error -5961
  • NSS error -5961
  • NSS error -5961
  • NSS error -5961

because I can not reproduce the error (I'm trying to start a scenario with the same versions of packages), and I not know the inside of ssl ; I'm blind in the matter
I would like to do the following test.

In the first lines off zmbkpose, change the value of "CURL_args" variable adding "-v --sslv3" arguments for get verbose info and use of ssl3. should look like this:
CURL_args="-v --sslv3 -s -S -k"
Let's see if we have news

same mistake, centos 6.4 zimbra FOSS 8.0.4_GA_5737

if I try to change the scrip adding to the cul command "--sslv3 -v":

  • About to connect () to mydomain port 7071 (# 0)
  • Trying 127.0.0.1 ... connected
  • Connected to mydomain (127.0.0.1) port 7071 (# 0)
  • Initializing NSS with CertPath: sql :/ etc / pki / nssdb
  • Warning: ignoring value of ssl.verifyhost
  • Skipping SSL peer certificate verification
  • NSS error -5961
  • Closing connection # 0
  • SSL connect error

curl: (35) SSL connect error

yeap! same error adding the option "--sslv3 -v"

  • About to connect() to mailbox.domain.com port 7071 (#0)
  • Trying 192.168.1.1... connected
  • Connected to mailbox.domain.com (192.168.1.1) port 7071 (#0)
  • Initializing NSS with certpath: sql:/etc/pki/nssdb
  • warning: ignoring value of ssl.verifyhost
  • skipping SSL peer certificate verification
  • NSS error -5961
  • Closing connection #0
  • SSL connect error
    curl: (35) SSL connect error

Confirmed sporadic error in SSL negotiation :
Versions in tests:
curl-7.19.7-35.el6.x86_64
openssl-1.0.0-27.el6.x86_64
ZCS Open Source Edition 8.0.4

A simple test look ok :
curl -k https://192.168.200.15:7071

But, with multiple test:

i=1; while true ;do
curl -k https://192.168.200.15:7071 >/tmp/out 2>&1 ; r=$?
if grep -q error /tmp/out;then cat /tmp/out ;
echo "Try:$i Return code:$r"
break
fi
let i++
done

We get :
curl: (35) SSL connect error
Try:212 Return:35

Repeating test:
Try:137 Return code:35
Try:213 Return code:35
Try:144 Return code:35
Try:168 Return code:35
Try:105 Return code:35
Try:32 Return code:35
Try:386 Return code:35
Try:16 Return code:35

Trying against ZCS Open Source Edition 5.0.16, I end 7000 tests with no errors.

Now the code tried several times on error 35
Fixed in branch "issue35".
Please download this branch.

I tried the new branch but does not seem to work.

running as user zimbra :

usr/local/bin/zmbkpose_issue35 - admuser "admin @ localdomain " - admpw " password_admin "-f -u -a user @ localdomain

I get this output:

14.10.2013 17:47:39 DEBUG : Testing ldapserver ldap :/ / 127.0.0.1:389 , using user uid = zimbra , cn = admins , cn = zimbra
14.10.2013 17:47:39 DEBUG : 1 Analizing accounts
14.10.2013 17:47:39 DEBUG : user @ localdomain - START FULL backup ...
14.10.2013 17:47:39 DEBUG : user @ localdomain - Doing ldap backup ...
14.10.2013 17:47:39 DEBUG : user @ localdomain - ldap full backup
14.10.2013 17:47:39 DEBUG : user @ localdomain - Doing mailbox backup ...
14.10.2013 17:47:39 DEBUG : user @ localdomain - Doing mailbox backup ...
14.10.2013 17:47:40 DEBUG : user @ localdomain - mailbox full backup
14.10.2013 17:47:40 DEBUG : user @ localdomain - generated tar file / mnt / backup / alessandro.poli @ retesi.it/20131014174739 : FULL.tar
14.10.2013 17:47:40 DEBUG : user @ localdomain - backup size is 0
14.10.2013 17:47:40 DEBUG : user @ localdomain - END in 1secs ,
14.10.2013 17:47:40 DEBUG : Backup FINISHED 1 backups performed

user @ localdomain is a box that takes up about 4GB .

In the backup folder I find these files / links:

$ ls -las
total 40
4 drwxr-x--- 2 zimbra zimbra 4096 Oct 14 17:47 .
4 drwxr-xr-x 3 zimbra root 4096 Oct 14 17:47 ..
32 -rw-r----- 1 zimbra zimbra 30720 Oct 14 17:47 20131014174739:FULL.tar
0 lrwxrwxrwx 1 zimbra zimbra 23 Oct 14 17:47 LAST -> 20131014174739:FULL.tar

if I try an extraction of tar:

$ Tar-xf 20131014174739 : FULL.tar
tar: Can not connect to 20131014174739: resolve failed

so I did:

$ Mv 20131014174739: FULL.tar USER_AT_LOCA_FULL.TAR

and open the file

$ tar -xf USER_AT_LOCA_FULL.TAR

another ls

$ ls -las
total 64
4 drwxr-x--- 2 zimbra zimbra 4096 Oct 14 17:59 .
4 drwxr-xr-x 3 zimbra root 4096 Oct 14 17:47 ..
0 lrwxrwxrwx 1 zimbra zimbra 23 Oct 14 17:47 LAST -> 20131014174739:FULL.tar
32 -rw-r----- 1 zimbra zimbra 30720 Oct 14 17:47 USER_AT_LOCA_FULL.TAR
20 -rw-r----- 1 zimbra zimbra 19648 Oct 14 17:47 acct.ldif
4 -rw-r----- 1 zimbra zimbra 1346 Oct 14 17:47 mailbox.tgz

$ du -sh *
0 LAST
32K USER_AT_LOCA_FULL.TAR
20K acct.ldif
4.0K mailbox.tgz

an extraction mailbox.tgz

$ tar -zxvf mailbox.tgz

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

Ouch.  test code, remnant. Sorry.
Fixed. Please update you code from branch "issue35".
commit SHA: a5a9ea4

Great, it works! Thank you!!

One other question, any execution errors are logged?

Forget it,

the error appears again, and if the previous version of the script went up, now comes out and is not processing the boxes that remain.

[root@mail /]# tail backup_FULL_20131018.log
2013-10-18 16:54:20 DEBUG: user1@domain - mailbox backup complete
2013-10-18 16:54:20 DEBUG: user1@domain - generated tar file /mnt/backup/user1@domain/20131018165420:FULL.tar
2013-10-18 16:54:20 DEBUG:user2@domain - backup size is 0
2013-10-18 16:54:20 DEBUG: user2@domain - END in 0secs,
2013-10-18 16:54:20 DEBUG:user2@domain - START FULL backup ...
2013-10-18 16:54:20 DEBUG: user2@domain - Doing ldap backup ...
2013-10-18 16:54:20 DEBUG: user2@domain - ldap backup complete
2013-10-18 16:54:20 DEBUG: user2@domain - Doing mailbox backup ...
2013-10-18 16:54:20 DEBUG: user2@domain - Doing mailbox backup ...
curl: (35) SSL connect error
[root@mail /]#

Sorry braulio88, I don't understand your comments. The errors appears again with the last version?

yes

same error