Ed's Ramblings about MySql

Tuesday, October 12, 2010

MERGE Table Hint

To quickly check if a MERGE table is functional (i.e., no subtables are missing and their schema's all match), run a "SELECT COUNT(*)" on the MERGE table. The SELECT COUNT(*) will have to open and examine each and every subtable, and it will run quickly because "SELECT COUNT(*)" on a MYISAM table is dirt cheap because they maintain a total row count.

Thursday, July 29, 2010

5.1.46 Innodb Plugin WIN

This fix has measurably helped our db performance:

http://bugs.mysql.com/bug.php?id=51325

Expunging the cache of dropped table pages used to be one of the largest causes of blocking in our db's. It used to be right up there with the blocking caused by ROLLBACKs (see my earlier post). We've now been running the Percona build of 5.1.47 for a while, and things are way better. In fact, purging the page hash (the data structure that lets the db look up where a given page resides in the cache) of deleted table page entries now causes more blocking than cleaning up the cache itself.

Innodb ROLLBACK Bottleneck

ROLLBACKs are the number one bottleneck in some of our databases. ROLLBACKs are bad because they grab the global semaphore on the data dictionary while each row is deleted. First of all, this means ROLLBACKs are not scalable since only one at a time can get any work done. Second, many different paths through the Innodb code need the global data dictionary semaphore. So a ROLLBACK not only blocks other ROLLBACKs, it blocks all sorts of unrelated activity in the db.

The code that causes the problem in the Percona build of 5.1.47 is in storage/innodb_plugin/row0undo.c. The comment before the lock is grabbed says "Prevent DROP TABLE etc. while we are rolling back this row". It is very unfortunate that this is implemented by locking down the ENTIRE data dictionary when only ONE table (usually) needs to be protected from schema changes.

Sunday, July 25, 2010

Replacing a Bad Drive in a md Software Mirror

I have a lot of Linux "md" RAID-1 arrays in my set-up. Every so often, one of the two drives in a RAID-1 starts behaving badly and I want to replace it. The "bad" drive hasn't stopped working outright, but it's causing I/O errors to be logged in /var/log/messages, or it fails a SMART self-test. My favorite method is to put the bad drive in "write-mostly" mode, add a third drive to the RAID-1, let it fully sync, and then remove the bad drive from the RAID-1.

Why put the drive in "write-mostly" mode? Because the one remaining "good" drive in the RAID-1 may not be so good. It might have unreadable sectors. You will not discover that until the md software tries to read the entire "good" drive in order to copy it to the new third drive. If this happens, unless one's luck is extremely bad, the md software will be able to read the unreadable sectors from the "bad" drive. So instead of taking the bad drive out of the mirror immediately, I tell md to use it as little as possible.

The details; say our RAID-1 is /dev/md9, the bad drive is /dev/sdX, and our new drive is /dev/sdN.

Step 1. echo writemostly > /sys/block/md9/md/dev-sdX/state

This tells the md software to avoid reading from the bad drive. It still gets written to as usual.

Step 2. mdadm --grow /dev/md9 -n 3

Grow the RAID-1 to be able to handle 3 drives.

Step 3. mdadm /dev/md9 -a /dev/sdN

Add the new drive to the array. Syncing starts at once.

Step 4. grep md9 -A 3 /proc/mdstat

Monitor the progress of the sync and when it's done, continue. If you are unlucky, this step will never finish. You will see I/O errors in /var/log/messages and the md software will restart the sync over from scratch. This will happen in an endless loop. This is where backups come into play as the safety net.

Step 5. mdadm /dev/md9 -f /dev/sdX

Manually "fail" the bad drive.

Step 6. mdadm /dev/md9 -r /dev/sdX

Remove the bad drive from the array.

Step 7. mdadm --zero-superblock /dev/sdX

Erase the "md" superblock that's written towards the end of /dev/sdX to avoid any confusion that /dev/sdX might be part of an md array down the road.

Step 8. mdadm --grow /dev/md9 -n 2

Back to normal.

Saturday, July 24, 2010

How I Upgrade from Mysql 5.0 to 5.1

I've used this procedure to upgrade several db's from 5.0 to the Percona build of 5.1. I like the Percona build because it includes a lot of performance fixes for innodb not found in stock 5.1. For us, the performance fixes have made a measurable difference.

MySql advises dumping the db from 5.0 and reloading the entire thing into a new 5.1 db. That's not feasible for us. We have terabytes of data. So I employ an "in-place" upgrade strategy.

STEP 0. Try out all the steps below on an old backup copy of the db to get an idea of what you are in for before attempting them on the production db.

STEP 1. SHOW CREATE TABLE for every FEDERATED table.

Are you using the new FEDERATED storage engine? None of my 5.0 FEDERATED tables would work after the in-place upgrade to 5.1. So save the output of SHOW CREATE TABLE to be able to drop and recreate them later.

STEP 2. SET GLOBAL INNODB_FAST_SHUTDOWN = 0;

The documentation for the innodb plugin says to flush all the undo and empty the insert buffer before upgrading innodb in-place. (The Percona build is based on the innodb plugin.) Turning off fast shutdown will accomplish exactly that when the db is shut down.

STEP 3. Shutdown the instance.

This step can take a lot of time. To get some idea of how long it will take, first do a SHOW INNODB STATUS. Look for the "TRANSACTIONS" section of the output; near the top you will see a "History list length" value. The larger this number is, the more undo there will be to clear. Look for the "INSERT BUFFER AND ADAPTIVE HASH INDEX" section and the "Ibuf: size" value, which is the size of the insert buffer content in units of 16K pages. The larger that is, the more deferred index updates the db will have to complete before it shuts down.

STEP 4. Replace 5.0 binaries with 5.1 binaries.

I like to build from source. So for me this involves running "make uninstall" in my 5.0 mysql source tree and then "make install" in my 5.1 source tree.

STEP 5. Replace the my.cnf file with one appropriate for 5.1 and restart instance.

Look over the 5.1 options and decide what's right for you. One variable we had to think about was innodb_autoinc_lock_mode. I perform the rest of the steps during a maintenance outage. Since in my environment all db connections are instigated remotely, I just add "skip-networking" to the my.cnf file before restarting.

STEP 5. mysql_upgrade

This script will probably try to run REPAIR TABLE on innodb tables, but it will complain every time it tries that since REPAIR TABLE is unimplemented for innodb tables. This is harmless.

STEP 6. Run CHECK TABLE FOO FOR UPGRADE on every innodb table. For every innodb table indicated as needing repairing, run ALTER TABLE FOO ENGINE = INNODB.

I wrote a python script to handle this for me. ALTER TABLE FOO ENGINE = INNODB creates a fresh copy of the table. For large tables, this can take a long time. Hopefully only a relatively small fraction of tables will need rebuilding as has been the case with my db's; otherwise you are in mysql purgatory.

STEP 7. Drop and recreate any FEDERATED tables.

Tuesday, May 19, 2009

Beware: myqldump and FEDERATED tables

mysqldump in 5.1 is broken in my opinion with respect to FEDERATED tables.

mysqldump insists on dumping all the rows of a FEDERATED table. That's bad when you go to restore from the dump. The normal case is that the remote table our FEDERATED table points to is fine and needs nothing done to it. The restore will try to re-INSERT all the rows into the remote table. But they are already there.

A solution to this problem is to isolate all FEDERATED tables within a single database, say "my_federated" for example. Don't put anything but FEDERATED tables in "my_federated".

Then dump the database as follows:

mysqldump --opt --no-data --result-file=federated.dmp --verbose --databases my_federated

mysqldump --opt --result-file=data.dmp --verbose --databases list of all databases except my_federated and information_schema

Restore the db by loading both dump files:

mysql

mysql> source federated.dmp

mysql> source data.dmp

Wednesday, May 13, 2009

Building MySql From Sources on Redhat

UPDATE 7/2010: Caveat emptor, this information is out of date. I've learned some new things since originally writing this post as well. I wrote this when I was first figuring out how to build from source.

Building MySql from sources has worked very well for me, at least once I got over a few humps. First, there are several non-default options to "configure" that you probably want to use. For example, by default, the "FEDERATED" storage engine isn't included. Second, there is additional software that's not installed by default with Redhat that's needed.

The guidance here works for MySql 5.1.33.

Building MySql:

1. Get the source tree from mysql.com

2. cd mysql-X.XX

3. ./configure options

4. make

5. make install

Step (5) installs the mysql binaries to /usr/local/bin.

My "configure" options:

./configure --with-charset=utf8 --with-collation=utf8_general_ci --with-plugins=innobase,partition,archive,federated,heap,myisam,myisammrg --enable-local-infile --without-debug --enable-thread-safe-client --with-client-ldflags=-all-static

I'll explain why I use these particular options later on in this post.

Missing software problems:

1. "configure" fails complaining about "no curses/termcap library found"

Cure: yum install ncurses-devel

2. "make" fails because of missing g++

Cure: yum install gcc-c++

3. "make" (of innodb backend) fails due to missing file zlib.h

Cure: yum install zlib-devel

Configure options:

1. --with-charset=utf8 --with-collation=utf8_general_ci

Why: we got tired of having to jump through hoops when our data happens to contain

non-latin1 characters. These options make all db's utf8 by default.

2. --with-plugins=innobase,partition,archive,federated,heap,myisam,myisammrg

Why: don't futz with plug-ins. Link things directly in. Should yield a bit of performance.

3. --enable-local-infile

Why: without this, LOAD DATA LOCAL INFILE is disabled.

4. --without-debug

Why: should gain some performance.

5. --enable-thread-safe-client

Why: our clients can have multiple threads.

6. --with-client-ldflags=-all-static

Why: should gain a bit of performance