MySQL Notes
Additional MySQL configuration notes
Author: Andrea Brancatelli
Editor: Aaron Stone
Created on: July 31, 2007
DBMail requires that most (if not all) tables be in InnoDB format. We rely on InnoDB's transactions, foreign key relations, and other strong ACID features to maintain the integrity of your mail store. InnoDB is a fairly complex table type, and requires a number of tuning parameters for best performance.
Attachment Size Tuning
The attachment size of an email is bounded by max_allowed_packet size. The example large innodb configuration bundled with MySQL (my-innodb-4g) specified a maximum_allowed_packet size of 16M. Therefore attachments >=16M will not store in the database but DBMail will consider the email successfully received. The attachment will usually end up being 0 bytes. Specify a maximum_allow_packet as an upper bound to the attachment size, such as 128MB, or 256MB.
InnoDB Tuning
ToDo.
InnoDB disk space optimization
Optimizing the disk space allocation with an InnoDB engine always proved to be a pretty hard job. InnoDB allows the DB Administrator to choose different space allocation strategies controlled by the “innodb_file_per_table” switch inside of MySQL's my.cnf.
WITHOUT innodb_file_per_table
- you have one shared tablespace for all the tables. This gives you the advantage of limiting the space your db will be using (unless you activate the autoextend) but give you the disadvantage of preallocating all the space you'll be using
- the problem everyone is facing is caused by the autoextend. Actually it's conceptually wrong to use the autoextend when you have a single tablespace, because whenever your DB will grow, the tablespace will grow, but it will never get smaller, even when you delete a table or when you run an OPTIMIZE TABLE
- the OPTIMIZE TABLE issue is pretty simple: when you run an OPTIMIZE TABLE, MySQL will simple re-create the whole table you're optimizing, rewriting all the datas sequentially. This will give you a certain degree of speed when you'll be accessing datas later, but may be also a problem because at a certain moment (“one second” before the optimize finish) you'll end up having two copies of the same table: the old one, and the new temporary one that will get renamed when the operation completes. If you are using the shared tablespace with autoextend activated this will probably mean that your shared tablespace will GROW because the amount of datas will double. When the OPTIMIZE TABLE will be completed the old table will be deleted and all the hypothetically-free space will just sit there waiting to be used. This is the key point. InnoDB doesn't free the disk space because it just wait for the space to be used again, because it's designed for an environment where you preallocate the space for the db. So since you preallocated it, why should you care about freeing it?
WITH innodb_file_per_table
- you have one (well, actually two) file per each InnoDB table. Each table/index file will stay in the database directory (which to me looks like another big advantage itself)
- whenever you add a table you get another .idb file. Whenever your table grow, the idb file grows, and your filesystem space decrease. Whenever you delete the table, the .ibd file gets deleted and your filesystem space augments.
- the OPTIMIZE TABLE process here gets interesting. When you run an OPTIMIZE TABLE the InnoDB engine will start to create a NEW .idb file with a temporary name, using only the space it actually needs to store the real datas. When the optimize table has ended, it will simply delete the old .idb file and rename the temporary one to the correct name. This mean that if your old table's .idb file had grown up to 3, 4, 5, 100 GB but you have only 100 MB of real datas in it, the new .idb file will be 100MB while the one that will be deleted was 3, 4, 5, 100GB.
Easiness vs. Speed
innodb_file_per_table will probably make your life much easier in terms of space handling and in general easiness of administration, but in terms of speed or whatever else I can't say if there's any advantage as I haven't done any testing myself. What I can assure you is that, having every .idb in the corresponding DB's dir you can mount directory from different HDs or RAIDs for each DBs thus having better racing conditions within the same MySQL server.
Probably you could achieve the same result having more than one shared tablespace, but frankly I have no experience with this.
Rule of thumb
- If you have a single, stand-alone, DB server, use a shared InnoDB tablespace preallocating the space and disabling the autoextend. Using the optimize table will give you a better optimization of the tables, and you'll have no problem with the disk space as it's already allocated up to a fixed size.
- If you have a machine with various tasks going on, like a mail server, web server, mailing lists server, ftp server and whatever else, use the innodb_file_per_table switch. Using the OPTIMIZE TABLE mechanism you'll be able to reclaim your disk space back whenever you delete anything (the whole .idb file gets deleted so you istantly get your space back) or whenever any table will significantly decrease in size (by running an OPTIMIZE TABLE against the table in question).
Collation sequence / database character set mismatch errors
In the event one cannot use the commandline utilities to, say, add user and the log shows messages such as:
collation mismatch, your MySQL configuration specifies a different charset than the data currently in your DBMail database.
even though mySQL utilities show that the database is in UTF-8 charset and the configuration says the database is in UTF-8, the problem can be solved by issuing the following mySQL command:
ALTER DATABASE dbmail DEFAULT CHARACTER SET utf8;
Further Information
Doubt? Question? Fear? Panic? Good! MySQL is like that :-)