Bluehost Web Hosting Help

Optimizing MySQL: Queries and Indexes Article 4 of 4

Optimizing MySQL: Queries and Indexes Article 4 of 4

Courtesy of:Ian Gilfillan

Most systems need to be highly optimized for selects - take a news site which performs millions of queries per day, but where the data arrives in large batches of text files. So for parts of the day, inserts need to be optimal, without noticeably affecting the millions trying to access the data...

Assuming a nicely formatted '|' delimited text file that we want to insert into the table above, take this piece of PHP code:

if (!($fp = fopen("datafile.txt","r"))) {               // open the file for reading
              print "nUnable to open datafile.txt for writing";    // display error
              exit();                                       // end the running of the program
            }
            
            
            while (!feof ($fp)) {                   // loop through the file line by line
              $sline = fgets($fp, 4096);            // put the data into the variable $sline
              $sline = chop($sline);                        // remove the newline
              list($eno,$fname,$sname,$telno,$salary) = split("|",$code);
                                    // split the line on "|", populating the ind. variables
              $db->query("insert into employee(employee_number,firstname,surname,
            tel_no, salary 
               values($eno,'$fname','$sname','$tel_no', $salary)");
            }                                               // end while loop
            

This would work, but would be very slow. The index buffer would be flushed after every insert. Until recently, MyISAM tables (The MySQL default) did not allow data to be inserted at the same time as being read. The new format does, but only if there are no deleted records present (highly unlikely in a heavily used system). So the entire table is locked for the duration of each insert. Fine on a low volume site, but when you're getting hundreds or thousands of queries per second, you'll soon notice the backlog!

There's a solution however - the best way to insert the data is to use MySQL's "LOAD DATA INFILE". This is much faster (20 times according to MySQL), and the only way some systems I've seen are still hanging in there!

The syntax is simple, and the code becomes a lot simpler too:

$db->query("LOAD DATA INFILE 'datafile.txt' INTO TABLE employee (employee_number,firstname,surname,tel_no,salary) FIELDS TERMINATED BY '|'");

LOAD DATA INFILE has defaults of:

FIELDS TERMINATED BY 't' ENCLOSED BY '' ESCAPED BY ''

if you don't specify any of these clauses. And, just as with an ordinary insert, you need to specify a field list if the order of the fields is different, or, as in the example above, you're not inserting data for every field. Always specifying a field list is good practice for all queries anyway - if someone adds a field to the table at a later stage, you don't want to go back and have to fix all your previous INSERT and SELECT * statements.

If you can't get this to work properly, have a look at the format of your text file - every problem I've seen with LOAD DATA has been because of a corrupted text file. Every field in every row must be delimited correctly!

You may not always be inserting from a text file - perhaps your application needs to do many unrelated inserts continually. There are ways to make sure the mass of users selecting are not badly affected... The first is to use INSERT LOW PRIORITY. This waits until there are no more reads waiting to happen, waiting for the gap, and not pushing in as it were. Of course, if your database is a rush hour special, there may never be a gap, and the client performing the INSERT LOW PRIORITY may start to grow cobwebs! An alternative here is INSERT DELAYED. The client is immediately freed, and the insert put into a queue (with all the other INSERT DELAYED's still waiting for the queue to end). This means that there can be no meaningful information passed back to the client, (such as the auto_increment value), as the INSERT has not been processed when the client is freed. Also, be aware that a catastrophe such as an unexpected power failure here will result in the queued INSERT's being lost. For neither of these methods do you have any idea when the data will be inserted, if at all, so I suggest you use with caution.

Conclusion

It's not only getting the data in that needs to be quick - sometimes you need to get it out quickly too. (Say you've accidentally loaded yesterday's classified ads, for example). Don't do a:

DELETE FROM classifieds;

Rather, use:

TRUNCATE TABLE classifieds;

The difference here is that DELETE drops records one by one, and that can be 1 million one by one's too slow! Note that this does not apply before version 4.0 of MySQL. At time of writing, most of you will still be using 3.x versions (if you do a DELETE FROM tablename on a non-empty table, and get 0 records back as a result, you're running an earlier version. To fix this problem, MySQL made DELETE remove records one by one so as to return the number of records deleted, but TRUNCATE still did the quick delete. Also, earlier versions than 3.23.33 used TRUNCATE tablename, not TRUNCATE TABLE tablename)

This has only been a brief introduction to optimizing queries and indexes. Once you've mastered these tips, you'll want to look at your more complex joins that just never seem to perform quickly. The MySQL site has lots of useful information, and there are a few good books out there as well. But don't forget EXPLAIN! Often the best way is to try and rewrite the query in as many different ways as possible, and see which one runs more efficiently. You can learn a lot by trying to figure out why one alternative ran faster than the other. Good luck!

resource list:
www.mysql.com - complete documentation for MySQL

Knowledgebase Article 218,134 views bookmark tags: cpu exceeded index indexes mysql optimize optimizing queries query


Was this resource helpful?

Did this resolve your issue?


Please add any other comments or suggestions about this content:





Recommended Help Content

Optimizing MySQL: Queries and Indexes Article 3 of 4

Optimizing MySQL: Queries and Indexes Article 3 of 4 Courtesy of: Ian Gilfillan Ordering by surname is a common requirement, so it would make sense to create an index on surname. But in this example o

Optimizing MySQL: Queries and Indexes Article 1 of 4

Optimizing MySQL: Queries and Indexes Article 1 of 4 Courtesy of: Ian Gilfillan Badly defined or non-existent indexes are one of the primary reasons for poor performance, understanding and then fixing

Optimizing MySQL: Queries and Indexes Article 2 of 4

Optimizing MySQL: Queries and Indexes Article 2 of 4 Courtesy of: Ian Gilfillan Some knowledge of how indexes work allows you to use them more efficiently. Firstly, note that when you update a table w

Related Help Content

CPU Performance Issues

What is CPU Protection

MySQL Repair and Optimize Tables in phpMyAdmin

How to repair and optimize your database using phpMyAdmin:

How To Run SQL Queries in PHPMyAdmin

A guide on performing a standard SQL query using the phpMyAdmin interface

Upgrade CPU and Memory on CloudSites

This article explains how to add CPU and memory upgrades to a CloudSites account.

Restore MySQL databases using the Site Backup & Restore tool

How to restore MySQL databases using the Site Backup & Restore tool

Backup all MySQL databases using the Site Backup & Restore tool

How to backup all of your MySQL databases using the Site Backup & Restore tool

Optimizing Your Images

How can I optimize my images?

Avoiding index html

How do I display my home page which is using a name other than index.html?