MySQL Tips and Tricks

Since most of us are not DBAs, I figured we could share some tips and tricks relevant to the databases we are using.

1.Indexes
You need to index your columns which may appear inside a where clause. However, having indexes that are never used is also bad. An example of this is in a wide historical table, with update times greater then 1 second. Most of the time, an auto incrementing index is created. This index can become very large, degrading table performance. For historical tables, consider using the timestamp as your primary key

2.InnoDB Indexes
InnoDB uses clustered indexes. These indexes read from left to right. If you index on say (timestamp, pointID), then an where clause that will use this index properly would look like this

where timestamp > [i]sometime[/i]

or

where timestamp > [i]sometime[/i] and pointID = [i]somepointID[/i]

or

where pointID = [i]somepointID[/i] and timestamp > [i]sometime[/i]

These 3 where clauses will use the indexes to find relevant rows, and not perform a full table scan

However, if you do this

where pointID = [i]somepointID[/i]

it will perform a full table scan because the index timestamp does not appear in the where clause

3. Learn to use EXPLAIN
EXPLAIN is extremely useful when writing queries. It will show you how the query will be executed, what indexes it will use, if it will use a range, full table scan, or indexes to find the rows, etc. EXPLAIN SELECT is your friend

4.Partition your large tables
Stay Tuned

edit
Modified to respond to Colby. If a wide table has an update rate >= 1 second, then the primary key can be the timestamp. If you are using a tall table, then your primary key would need to be like (timestamp,pointID) to work.

Hi Kyle-

Thanks for starting this thread, it’s a great idea! I wanted to mention 2 things that came to mind about your first point:

  1. Just as a point of clarification for those who want to follow this advice: true “primary keys” in MySQL need to be unique. It is not guaranteed that the timestamp will be unique (and actually kind of likely that it won’t be, especially since mysql doesn’t include milliseconds). That’s ok, you can just make a regular index. It’s ok if the table doesn’t have an actual “primary key”. I really wanted to point this out, because if you DO make the timestamp the primary key, anytime you try to insert two rows with the same timestamp, one will error out.
  2. Sometimes the auto incrementing key can be helpful for historical tables. The timestamp is arbitrary- it can be set to anything. However, the auto incrementing key is not, and provides a view as to the order in which rows were inserted. It’s rare that this is actually important, but it’s just something to think about before you go and delete the column. One concrete example (really happened): machine was logging data (machine state, “on”) when the user noticed the clock was wrong. Set it back one hour. Machine logs a second set of data with similar timestamps (the same hour twice, now machine state “off”). Coincidentally, in the mean time, the machine dies. When the boss looks at the chart, it looks like the machine turned on and off repeatedly for an hour, and gets mad. HOWEVER, by using the incrementing index column, the user was able to show what actually happened, and that the machine was operating normally and failed for a different reason. In this case, having that column really saved someone some problems.

Ok, so maybe that kind of thing doesn’t happen often enough to warrant using the space, but it’s something to think about.

Keep the tips flowing!

Yeah I meant to note this was for wide tables with an update rate >= 1 second