The scope of MySQL optimization is very wide. One aspect of this complex field is the query performance and this article deals with it. MySQL provides powerful tools, that allow you to have a look inside the query parser & optimizer. To understand this article some MySQL server & SQL knowledge is required. Please be aware, that this article is just an overview about query performance. It can’t cover all aspects of it. This article is a part of the web-application performance series.

A simple calculation

On MySQL.com the company behind MySQL (MySQL AB) published a simple calculation to find out the estimated query performance:

log(row_count)/log(index_block_length/3*2/(index_length+data_pointer_length))+1
  • row_count: number of rows in your table
  • index_block_length: normally 1024 bytes
  • index_length: key value length (usually 3 bytes)
  • data_pointer_length: data pointer (usually 4 bytes)

The result of this formula represents the number of disk seeks for a special query. This formula is not valid for all queries.

Using EXPLAIN

EXPLAIN is a powerful tool to analyze the query performance. You can use EXPLAIN for every query.

EXPLAIN SELECT * FROM employee WHERE id = 18273;

After sending this query to the server, the result set contains information of the MySQL Optimizer. The optimizer analyzes the query and decides which execution plan would be the fastest and/or cost-effective. Check approximatly all queries to make sure, that you don’t overlook a unindexed or slow query.

The output of the optimizer contains the following columns:

  • table: involved tables (if you do a join, then this table contains more than one entry)
  • type: type of query (e.g. SYSTEM | CONST | EQ_REF| REF | RANGE | INDEX | ALL, ordered by the fastest in a descending order)
  • possible_keys: all possible keys for this query
  • key: finally used key
  • key_len: the length of the key (the length correlates with the performance)
  • ref: shows which column or constant would used to compare the value of key
  • rows: estimated number of rows to check
  • Extra: additional information about how MySQL executes the query

If you want to use an index that MySQL don’t use in the execution plan, then you can force MySQL to use this index with:

SELCT lname, fname, birthday FROM employee FORCE INDEX(lname)...

LIMIT

You should use LIMIT, if you have a lot of rows in your result set. It saves a lot of time, if you’re using LIMIT. So, you can build your query, after your pagination variable is evaluated.

SELECT lname,fname FROM employee LIMIT 0, 30

or with the application of a pagination:

SELECT lname, fname FROM employee LIMIT ($page * $rows_per_page), $rows_per_page

Caching queries effectively

MySQL implements a query cache. This cache saves the execution plan for your queries and the results. Until the table is not changed, the query cache holds the queries in the cache (depends on the possible memory).  The most common fault with the query cache is the innocence of how the query cache works. It works case-sensitive! This can be tricky if you use the same query more than one times. Then, you have to establish query coding-standards in your application, to keep the cache smart, efficently and economical.

SELECT * FROM employee WHERE id > 10 AND id < 10000

is not the same like:

select * from employee where id > 10 AND id < 10000

Split up complex queries

At a certain level, some queries can perform really slow. This will happen especially by joins, correlated queries, large tables and such complex constructions. Then you achieve a better performance with splitting this one query into more queries. Test this queries against the EXPLAIN command to check which query performs well and which not. But, be aware that before splitting a query up , the first option should be to check the indexes and the query cache.

SQL_CACHE vs SQL_NO_CACHE

If you are performing statistical queries to update user statistics and so one you should use the SQL_NO_CACHE directive. This directive is a part of the query cache. If you set the value of query_cache to  2, then you can use this directive. Use SQL_NO_CACHE for all queries, that you don’t need in your application. Because the space of your query cache is limited and shouln’t be wasted.

LIKE with %

‘%’ is known as the wildcard operator in MySQL. If you have an index on a column it can perform fast.

SELECT * FROM employee WHERE lname LIKE('New%');

This works fast, if you have an index on lname. The example below don’t works fast, because the index couldn’t be used:

SELECT * FROM employee WHERE lname LIKE('%man');

So, try to avoid the wildcard at the beginning of the like-clause. An considerable option can be to reverse the values of this column.

Other considerations

  • avoid calculated comparisons between values
  • don’t use DISTINCT and GROUP BY together
  • avoid correlated subqueries
  • use the FULLTEXT search and avoid LIKE

no comment

Comments:

XHTML: You can use the following tags: