December 24, 2011

MySQL database optimization

1. One of the first places to look to improve performance is queries, particularly the ones that run often.
2. Big gains can be achieved by analyzing these queries and rewriting them efficiently. You can use MySQL's slow query log to get an idea of which queries need to be fine-tuned.
3. If optimizing your queries is the first goal, the first thing to try is indexes. If a column is involved in searching (WHERE clause), grouping (GROUP BY clause), sorting (ORDER BY clause), or JOIN operations, creating an index on it is likely to result in a performance gain.
4. Another factor in creating an index is cardinality (i.e. the number of unique values): the higher the cardinality, the greater the chance that MySQL will use the index in queries.
5. Indexes speed up SELECT queries but slow down DELETE, INSERT, and UPDATE queries, so create indexes on the columns that are mostly used in SELECT queries.
6. Create a scheduled job to rebuild indexes (in MySQL this is typically done with OPTIMIZE TABLE, since there is no separate REBUILD INDEX statement).
7. If it does not affect business operations, schedule a weekly server restart at a time when no users are connected to the system.
8. Run the ANALYZE TABLE command frequently (schedule it weekly).
9. Run the OPTIMIZE TABLE command frequently (schedule it weekly).
10. SCHEMA OPTIMIZATION: optimize data types and use denormalization if required.
11.Design your tables to minimize their space on the disk. This can result in huge improvements by reducing the amount of data written to and read from disk. Smaller tables normally require less main memory while their contents are being actively processed during query execution. Any space reduction for table data also results in smaller indexes that can be processed faster.
12.If your application makes several database requests to perform related updates, combining the statements into a stored PROCEDURE can help performance. Similarly, if your application computes a single result based on several column values or large volumes of data, combining the computation into a UDF (user-defined function) can help performance.
13. Disk seeks are a huge performance bottleneck. This problem becomes more apparent when the amount of data grows so large that effective caching becomes impossible. If you hit this issue, increase the number of available disk spindles (and thereby reduce the seek overhead) by either symlinking files to different disks or striping the disks.
14. Check whether your tables need partitioning, either horizontal (row-based, which MySQL supports natively) or vertical (splitting columns into separate tables at the schema level).
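
As a sketch of tips 1 and 2, the slow query log can be enabled at runtime and a suspect query inspected with EXPLAIN. The table, columns, and log path below are hypothetical:

```sql
-- Enable the slow query log at runtime; capture queries slower than 2 seconds
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 2;
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';

-- Inspect the execution plan of a query found in the slow log
EXPLAIN SELECT customer_id, SUM(amount)
FROM orders
WHERE order_date >= '2011-01-01'
GROUP BY customer_id;
```

The EXPLAIN output's `key` and `rows` columns show whether an index was chosen and roughly how many rows MySQL expects to examine.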
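
Tips 3 to 5 might look like the following in practice; the `orders` table and its columns are illustrative:

```sql
-- Index the columns used in WHERE, GROUP BY, ORDER BY, and JOIN clauses
CREATE INDEX idx_orders_customer ON orders (customer_id);
CREATE INDEX idx_orders_date ON orders (order_date);

-- Check cardinality: higher values make the index more useful to the optimizer
SHOW INDEX FROM orders;
```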
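
One way to schedule the weekly maintenance from tips 8 and 9 without an external cron job is MySQL's event scheduler; the event name and table are hypothetical:

```sql
-- The event scheduler must be running for scheduled jobs to fire
SET GLOBAL event_scheduler = ON;

DELIMITER //
CREATE EVENT weekly_table_maintenance
ON SCHEDULE EVERY 1 WEEK
DO
BEGIN
  ANALYZE TABLE orders;   -- refresh index statistics
  OPTIMIZE TABLE orders;  -- defragment the table and rebuild indexes
END //
DELIMITER ;
```

Note that result sets produced inside an event body are discarded, so the ANALYZE/OPTIMIZE status rows are not sent anywhere.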
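
Tip 12's point about batching related updates into one server round trip could be sketched like this; the procedure, table, and columns are hypothetical:

```sql
DELIMITER //
CREATE PROCEDURE transfer_funds(IN from_acct INT, IN to_acct INT, IN amt DECIMAL(10,2))
BEGIN
  START TRANSACTION;
  UPDATE accounts SET balance = balance - amt WHERE account_id = from_acct;
  UPDATE accounts SET balance = balance + amt WHERE account_id = to_acct;
  COMMIT;
END //
DELIMITER ;

-- One round trip instead of several individual statements from the application
CALL transfer_funds(1, 2, 100.00);
```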
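
For tip 14, horizontal (row-based) partitioning can be declared directly on the table in MySQL 5.5; the schema below is a hypothetical sketch:

```sql
-- RANGE partitioning by year: each partition holds one year's rows,
-- so date-bounded queries can prune the partitions they don't need
CREATE TABLE orders_partitioned (
  order_id   INT NOT NULL,
  order_date DATE NOT NULL
)
PARTITION BY RANGE (YEAR(order_date)) (
  PARTITION p2009 VALUES LESS THAN (2010),
  PARTITION p2010 VALUES LESS THAN (2011),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```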

For more information, please use the following links.
http://dev.mysql.com/doc/refman/5.5/en/optimization.html
http://docs.oracle.com/cd/E17952_01/refman-5.5-en/mysql-tips.html
