As we all are aware, SQL optimizations are required for speeding up applications. There situations in my career where we felt like MySQL’s MyISAM engine is not fast enough! There are multiple factors that may slow down the SQL. They could be categorized into three categories.
- The database Software/Engine
- The database design/architecture
- The SQL used
We will get to each of this one by one.
1. The database Software/Engine
For normal websites including e-commerce sites, the most executed SQL statement is SELECT. There may be updates when someone buys a product. Admin will be doing INSERT and DELETE too. Still the most used one is SELECT statements. So experts recommend DBMS which are fast in SELECT statements and MySQL with MyISAM engine is always recommended for faster SELECT statements. Problem with MyISAM is the lack of foreign key support. InnoDB engine supports foreign Keys, but is slower compared to MyISAM engine.
Again we can see that MySQL 5.0x.xx is slower than MySQL 5.1x.xx. You can read more about it here.
So we need to choose a database/engine that fits our requirement. If you intend to have only few hundred SELECT statements per minute; it won’t be a big concern.
2. The database design/architecture
The design has a huge impact on performance. For those who have a computer science degree are aware of database normalization. Unfortunately, I have never heard a non CMM level company’s Project Manager or team lead talk about normalization. On the other hand I heard people saying that ‘normalization will slow down the application’. In reality, its both true and false.
When a database is designed in 3rd normal form, the database is considered ‘normalized’ and you can go to BCNF or 4th normal form.
When we normalize a database, we have four goals: 1. Arranging data into logical groupings such that each group describes a small part of the whole. 2. Minimizing the amount of duplicate data stored in a database 3. Organizing the data such that, when you modify it, you make the change in only one place. 4. Building a database in which you can access and manipulate the data quickly and efficiently without compromising the integrity of the data in storage.
When it’s mostly SELECT only system like a blog where few people insert or update data and thousands view it, it’s Ok to have denormalized DB design. But for ecommerce systems or similar that have extensive use of inserts/updates, it’s always better to reduce no. of places to update for each transaction. Denormalization may make the system to use less number of SELECT queries and make complex SQLs more simple, but when it’s time to insert or update, the application needs to execute multiple INSERT or UPDATE statements. In effect, normalization may make the system to use complex SELECT statements with joins, but denormalization may make the application slower by needing us to execute more INSERT/UPDATE statements.
We need to analyse the system first to see how normalized our system needs to be. A standard piece of database design guidance is that the designer should first create a fully normalized design; then selective denormalization can be performed for performance reasons.
Another important fact is that, all fields that appear in a WHERE clause should be indexed for better performance of SELECT statements. on the other hand, over indexing will slow down the INSERT/UPDATE statements.
3. SQL Used
We can optimize SQLs used in the application. Simplest tip is: never use * in a select statement. Select the fields you need. Another is to specify the table names if we join tables. For example
Let A and B be two tables in a database; and we need only field1 and field2 from table A and field1 from table B, then instead of
SELECT * FROM A, B WHERE A.id = B.t1id
SELECT A.field1. A.field2, B.field1 FROM A, B WHERE A.id = B.t1id`
There are cases where multiple SELECT statements and the language you use to write the application is faster than complex joins in SQLs. Think wisely 🙂
In MySQL, they support specifying which index to use using index hint statements like FORCE INDEX, USE INDEX etc. You have to look into the features provided by the DBMS being used to find out the best method to optimize your SQL.