The world's most popular open source database
Jay Pipes is the Community Relations Manager for North America at MySQL. He joined MySQL in January, 2006, and is co-author of Pro MySQL (Apress, 2005). Frequently speaking at conferences about MySQL, Jay lives in Columbus, Ohio, with his wife Julie, and his extremely needy two cats and two dogs.
So, someone had a great idea to go back through the Developer Zone, which has become a bit disorganized of late, and put together an article with links and summaries for all the Dev Zone articles that discussed MySQL 5.1 features. I have done so, and also linked in many community member's blog articles and tutorials to give everyone a good dose of the features that make MySQL 5.1 worth investigating.
The main features that MySQL 5.1 brings to the table are table logging, events, partitioning and a new form of replication called row-based replication. However, there are many smaller features such as XML functionality and essential stability and bug fixes that were mentioned by the community that are very interesting.
With MySQL 5.1, Mikael Ronström shifted focus from MySQL Cluster to the development of advanced partitioning features. Prior to partitioning debuting in MySQL 5.1, a couple of options were available to MySQL developers and DBAs, but these options had a number of disadvantages. If you used MyISAM Merge tables, you were forced to use homogeneous MyISAM tables, which lacked certain features, and if you went the custom partitioning route (which many users did), you took the partitioning functional set out of the database and into your application. While there are some advantages to this, the disadvantage is that MySQL isn't as effective in optimizing queries across the entire data set, and there was still the need to figure out how to query across the entire data set.
Starting off the coverage of partitioning, Peter Gulutzan wrote a lengthy article on the feature. Over the past year and a half, a number of people have written about partitioning, which seems to be the most popular new feature in MySQL 5.1. Robin Schumacher wrote a series of articles on partitioning, beginning with Improving Database Performance with Partitioning and then More On MySQL Partitioning. His final installment was about Partitioning With Dates in MySQL 5.1. Partha Dutta, over at RightMedia, wrote a couple blog posts that confirmed some fairly massive performance improvements with partitioning -- MySQL 5.1 Partitioning (Part 3), MySQL 5.1 Partitioning (Part 4), and Matt Casters (Pentaho) wrote an article on using Kettle and MySQL 5.1 Partitioning for large data sets: Handling 500M Rows
Rounding out the blogging on partitioning, Giuseppe Maxia blogged about using partitions in production today via replication; and improving archive performance with partitioning to achieve the dream DW solution.
Along with MySQL 5.1 also came the Event Scheduler feature. Events are kind of a cron or Scheduler system for MySQL. You can have the events subsystem execute procedures or other code at various intervals, making them ideal for monitoring applications and other use cases.
Here is a summary of articles that showed how to use Events:
MySQL 5.1 supports three different kinds of replication modes now: the standard statement-based replication format (used prior to 5.1), row-based replication, and a mixed mode. Row-based replication is designed to address certain problems that crop up in replicated environments that make use of non-deterministic functions such as RAND() or NOW() within data modification statements and environments that make heavy use of temporary tables in data modifications. Additionally, row-based replication solves problems inherent in using user-defined functions, a topic which Mats Kindahl blogged about in detail in his article Row based replication and user defined functions. Eric Bergen also talked a bit about his desire to use row-based replication after seeing Lars Thalmann's talk at last year's Users Conference in his article Row based replication and application developers. For a great article on the future of MySQL replication and some of the issues surrounding it, I would recommend reading Kristian Köentopp's blog post Replication - Now and Then. Finally, Mark Smith had another good writeup about row-based replication.
One particular MySQL 5.1 feature that will come in very handy is the ability of MySQL 5.1 to use standard tables instead of log files for a variety of internal logs, including the Slow Query Log and the General Log. Giuseppe Maxia had a nice article entitled Hacking MySQL table logs Outrageous tricks with table logs that is certainly worth reading for background on the feature. Additionally, on the MySQL Forge, First impression on table logs
Another feature that made it's way into the MySQL 5.1 code base was certain XML functions. The developer zone has a nice introduction to MySQL XML functions. This is also an area to keep an eye on, as work from Erik Wetterberg and Arnold Daniels on XML functionality has been patched into the 5.2 tree.
Roland Bouman, our resident INFORMATION_SCHEMA fan, has posted a very nice graphical overview of the MySQL 5.1 Information schema. With MySQL 5.1 and an updated plugin interface, more and more of the MySQL server is becoming pluggable, including many of the INFORMATION_SCHEMA tables. For instance, you can now see the processlist (normally done through SHOW FULL PROCESSLIST) directly through the INFORMATION_SCHEMA.PROCESSLIST table. Very nice.
As far as major bug fixes of note, definitely the ones that take the cake are:
This last "bug fix", which modifies the behaviour of auto-increment in InnoDB, really should not be called a bug fix, as it is a substantial change in the behaviour of InnoDB, and represents a significant undertaking from Innobase/Oracle in allowing InnoDB to scale better under certain workloads. Therefore, I'll talk here a little bit about these changes, which have been well-documented by Ken Jacobs in the MYSQL 5.1 manual.
Traditionally, that is to say, up until MySQL 5.1.22, InnoDB handled newly inserted records into an InnoDB table with an AUTO_INCREMENT column by using a global counter which held the last value for the auto-incrementing column. A lock would be placed on this counter for the duration of the SQL statement which did the inserting. After the statement completed, the lock would be released and other statements which inserted new rows into the table could acquire a lock on the counter.
This "traditional mode" worked well for many scenarios, particularly
situations involving short, single-row inserts, as the time the lock was
held for the statement was minimized. However, some performance issues
would come into play when statements such as
my_innodb_table SELECT * FROM some_other_table were executed,
as the lock on the auto-increment global counter would be held for an
extended period of time, causing other activity to wait. To be sure,
there were reasons for this strategy having to do with the way MySQL
replication works — InnoDB needed to ensure that the replaying of
statements on a slave server would be identical to the master server.
Heikki Tuuri and the Innobase/Oracle team spent considerable resources
to develop a solution to the above scalability problem. The new
solution involves a configurable setting which can be changed
without a server restart in MySQL 5.1.22. The new server
innodb_autoinc_lock_mode controls how InnoDB
treats statements which insert rows into an InnoDB table with an
AUTO_INCREMENT column. Depending on your environment —
specifically, whether you are using the binlog for replication or
recovery purposes and whether you are executing "batched insert"
statements— you can set this variable to 0, 1, or 2. 0
corresponds to the traditional mode, and is not recommended except for
very specific scenarios (see the doc link above). 1 represents
"consecutive mode" and is the default. In this mode, only statements
where InnoDB cannot determine the number of rows to be inserted
will use the global auto-increment lock. All other "simple insert"
statements (even those inserting multiple records in batch mode) will
use a faster, lighter locking mechanism, which results in significant
scalability increases. The final setting, 2,
represents an "interleaved" mode and has even greater scalability
improvements, but cannot be used in scenarios where the binary log is
being used for recovery or statement-based replication.
Again, this is a large and important feature chance in MySQL 5.1.22, and I encourage anyone using InnoDB tables in a production environment to read up on the new documentation.
In the coming months, MySQL 5.1 will reach Release Candidate stage. Be sure to get up to speed on all the features mentioned in this article and start testing your installations on the next generation of the MySQL Server!
Read and post comments on this article in the MySQL Forums. There are currently 3 comments.