Clarification about import via /docker-entrypoint-initdb.d
Closed this issue · 6 comments
I maintain the Alpine Linux version of MariaDB https://github.com/wodby/mariadb and use the entrypoint script from this repository for init. Recently I've identified the issue when MariaDB for some reason does not rebuild indexes and acts differently when a dump imported via /docker-entrypoint-initdb.d
bind mount as opposed to mysql db < file.sql
import after the full init/start. Described in details at wodby/docker4drupal#341. I was also able to reproduce this issue with vanilla WordPress.
So I'm not sure whether the import after init and before start is a safe method of importing. But maybe you have some explanation for this behavior.
If you look at the entrypoint, what you describe is exactly how the import behavior works today:
-
we start
mysqld
(normally, with all arguments and flags the user passed and an explicit--socket
value and--skip-networking
somysqld
doesn't become available outside the container prematurely): https://github.com/docker-library/mariadb/blob/e62e2b5bd3b34da3596bc80b40513e9bded8ec5a/10.3/docker-entrypoint.sh#L97-L99 -
import
*.sql
files viamysql foo < bar.sql
: https://github.com/docker-library/mariadb/blob/e62e2b5bd3b34da3596bc80b40513e9bded8ec5a/10.3/docker-entrypoint.sh#L170-L179 -
shut down our temporary
mysqld
instance: https://github.com/docker-library/mariadb/blob/e62e2b5bd3b34da3596bc80b40513e9bded8ec5a/10.3/docker-entrypoint.sh#L181-L184 -
fire up the "real"
mysqld
instance (now with networking appropriately enabled): https://github.com/docker-library/mariadb/blob/e62e2b5bd3b34da3596bc80b40513e9bded8ec5a/10.3/docker-entrypoint.sh#L192
Thank you for the response.
Yes, I know how it works, let me show you how to reproduce the issue with the latest MariaDB 10.2. I'm attaching a vanilla Drupal 7 clean db dump after a standard installation. d7.sql.gz, nothing special in this dump, just a real world example, it may be any db dump that have enough records in a table and the structure to build indexes.
Import via init bind mount
First, import the dump via the init bind mount:
docker run --rm -d -e MYSQL_DATABASE=drupal -e MYSQL_ROOT_PASSWORD=password -v ./d7.sql:/docker-entrypoint-initdb.d/d7.sql mariadb:10.2
docker exec -ti [CID] sh
mysql -uroot -ppassword drupal
show index from system;
# Cardinality column is all 0
# Run select from the table for a few times to trigger index rebuild:
select * from system;
# Run a few times.
show index from system;
# Cardinality is still 0
As you can see indexes weren't built.
Manual post-start import
Mount the db dump as .txt
to skip init
docker run --rm -d -e MYSQL_DATABASE=drupal -e MYSQL_ROOT_PASSWORD=password -v ./d7.sql:/docker-entrypoint-initdb.d/d7.txt mariadb:10.2
docker exec -ti [CID] sh
# Manually import dump after MariaDB will initialize and start.
mysql -uroot -ppassword drupal < /docker-entrypoint-initdb.d/d7.txt
mysql -uroot -ppassword drupal
Run SQL queries from the previous example, you'll see that after a few selects (and a few seconds) MariaDB will build indexes and cardinality column won't be 0.
I believe it may have something to do with the full start of MariaDB.
My docker version
:
Client:
Version: 18.06.1-ce
API version: 1.38
Go version: go1.10.3
Git commit: e68fc7a
Built: Tue Aug 21 17:21:31 2018
OS/Arch: darwin/amd64
Experimental: false
Server:
Engine:
Version: 18.06.1-ce
API version: 1.38 (minimum version 1.12)
Go version: go1.10.3
Git commit: e68fc7a
Built: Tue Aug 21 17:29:02 2018
OS/Arch: linux/amd64
Experimental: true
Kubernetes:
Version: v1.10.3
StackAPI: v1beta2
My docker info
:
Server Version: 18.06.1-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
seccomp
Profile: default
Kernel Version: 4.9.93-linuxkit-aufs
Operating System: Docker for Mac
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 1.952GiB
Name: linuxkit-025000000001
ID: Q4AW:SBZG:NWA4:TIRY:MLDW:TGVL:MWMH:H26Y:PEET:W2XJ:6TXM:4YEP
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
File Descriptors: 172
Goroutines: 187
System Time: 2018-11-14T05:08:29.0688101Z
EventsListeners: 2
HTTP Proxy: gateway.docker.internal:3128
HTTPS Proxy: gateway.docker.internal:3129
Registry: https://index.docker.io/v1/
Labels:
Experimental: true
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Ok, figured out what's going on here.
The dump you've provided explicitly locks the table and disables indexes while dumping data (which greatly speeds up the data insertion process), after which it then re-enables indexes. Normally, mysqld
would stay running after doing so, which gives it time to slowly rebuild those indexes over time.
In our case, we almost immediately stop (gracefully) mysqld
to start it back up again, and it seems to forget that it was trying to rebuild indexes and doesn't get back around to it.
If I run ANALYZE TABLE system;
(which appears to be the official way to ask MySQL to rebuild indexes; https://stackoverflow.com/a/30051747/433558), the indexes have appropriate Cardinality
values again.
So all-in-all, I'm thinking there's really not much we can do here except recommend that you run ANALYZE TABLE
after your imports (or adjust them to not disable indexes) to force mysqld
to actually build those indexes properly.
For mysqldump
, that'd be:
-K, --disable-keys '/*!40000 ALTER TABLE tb_name DISABLE KEYS */; and
'/*!40000 ALTER TABLE tb_name ENABLE KEYS */; will be put
in the output.
(Defaults to on; use --skip-disable-keys to disable.)
(But I think adding ANALYZE TABLE
lines explicitly is the better solution -- I wish mysqldump
had a flag for that.)
Closing, since I don't think there's anything else we can reasonably do here automatically (as far as I know, mysqld
doesn't have an alternative "clean shutdown" method that also allows any in-progress reindexing to complete first -- we're already using the only "clean shutdown" method we're aware of).
Thank you for looking at this.
Regarding the --disable-keys
option:
'/*!40000 ALTER TABLE tb_name DISABLE KEYS /; and '/!40000 ALTER TABLE tb_name ENABLE KEYS */; will be put in the output. This makes loading the dump file faster because the indexes are created after all rows are inserted. This option is effective only for non-unique indexes of MyISAM tables. Disable with --skip-disable-keys.
So assumming most people use InnoDB, ANALYZE TABLE
is the only good option.