closeio/sync-engine

Newer versions of MySQL driver dropped support for gevent

Opened this issue · 6 comments

The mysqlclient driver that we're using has removed gevent support in version 1.4.

The last version of the driver that supports it (1.3.4) supports Python up to 3.9 so this is blocking us from supporting newer Python versions and missing out on any potential improvements.

@squeaky-pl could we just switch to using the pure-Python driver PyMySQL ? It's by the same maintainer as mysqlclient and he recommends using PyMySQL instead of mysqlclient if you use gevent.

This is one of possible solutions.

We would need to measure performance impact. sync-engine's sync pods have a pretty tight loop where it polls all the IMAP uids in the database for a given account against the remote IMAP uids. For large email inboxes it can be easily hundreds of thousands of uids fetched and decoded (this time in Python instead of C) in one SQL query.

We still have semi-broken data with strings that are not valid in Python 3 that were inserted into the database with Python 2 with surrogates, MySQL doesn't validate strings against invalid surrogate pairs and Python 3 does. we are currently correcting this on the fly.

See: https://github.com/closeio/sync-engine/blob/c064bf8f55bc190298b7f43504442c8f535089a4/inbox/sqlalchemy_ext/util.py#L281-L320 and

We would need to see if this approach can be replicated on the new driver, or resync all the data that was synced with Python 2 and reinsert it with surrogate pairs corrected (fun project apart).

There are also some fun and weirds hacks here that access internal mysql C driver modules. See

import _mysql_exceptions
import gevent
from gevent import socket
from redis import TimeoutError
from sqlalchemy.exc import StatementError
from inbox.error_handling import log_uncaught_errors
from inbox.logging import create_error_log_context, get_logger
from inbox.models import Account
from inbox.models.session import session_scope
log = get_logger()
BACKOFF_DELAY = 30 # seconds to wait before retrying after a failure
TRANSIENT_NETWORK_ERRS = (socket.timeout, TimeoutError, socket.error, ssl.SSLError)
TRANSIENT_MYSQL_MESSAGES = (
"try restarting transaction",
"Too many connections",
"Lost connection to MySQL server",
"MySQL server has gone away",
"Can't connect to MySQL server",
"Max connect timeout reached",
)
def retry(
func,
retry_classes=None,
fail_classes=None,
exc_callback=None,
backoff_delay=BACKOFF_DELAY,
):
"""
Executes the callable func, retrying on uncaught exceptions matching the
class filters.
Arguments
---------
func : function
exc_callback : function, optional
Function to execute if an exception is raised within func. The exception
is passed as the first argument. (e.g., log something)
retry_classes: list of Exception subclasses, optional
Configures what to retry on. If specified, func is retried only if one
of these exceptions is raised. Default is to retry on all exceptions.
fail_classes: list of Exception subclasses, optional
Configures what not to retry on. If specified, func is /not/ retried if
one of these exceptions is raised.
"""
if fail_classes and retry_classes and set(fail_classes).intersection(retry_classes):
raise ValueError("Can't include exception classes in both fail_on and retry_on")
def should_retry_on(exc):
if fail_classes and isinstance(exc, tuple(fail_classes)):
return False
if retry_classes and not isinstance(exc, tuple(retry_classes)):
return False
return True
@functools.wraps(func)
def wrapped(*args, **kwargs):
while True:
try:
return func(*args, **kwargs)
except gevent.GreenletExit:
# GreenletExit isn't actually a subclass of Exception.
# This is also considered to be a successful execution
# (somebody intentionally killed the greenlet).
raise
except Exception as e:
if not should_retry_on(e):
raise
if exc_callback is not None:
exc_callback(e)
# Sleep a bit so that we don't poll too quickly and re-encounter
# the error. Also add a random delay to prevent herding effects.
gevent.sleep(backoff_delay + int(random.uniform(1, 10)))
return wrapped
. Not sure what this is even for but maybe we could just remove it.

If we find ourselves under time pressure we could also move to a distro with Python 3.9. Sync-engine works on Python 3.9 unmodified and it's even in the CI now. That would buy as one more year if using non LTS distro is an option. Ubuntu 21.04 shipped with 3.9.

@tsx was also talking to me about getting rid of Gevent first, which might be a smaller project and would align with the direction we are taking with other projects.

Also putting this on @wojcikstefan radar.

tsx commented

We still have semi-broken data with strings that are not valid in Python 3 that were inserted into the database with Python 2 with surrogates, MySQL doesn't validate strings against invalid surrogate pairs and Python 3 does. we are currently correcting this on the fly.

Sounds like this can be fixed with a standalone migration? We could come up with a clever enough select ... where field like '%surrogate%' and date_created < $python3_migration and then fix and rewrite those rows directly.

resync all the data that was synced with Python 2 and reinsert it with surrogate pairs corrected

I know it'll be more complex than a couple queries, but could be way easier than full resyncs.