April 14, 2009

SQL SERVER - Isolation levels

Every statement and every data modification in SQL Server runs under an isolation level. The isolation level directly affects both the performance of SQL Server and the integrity of the data.

Essentially, an isolation level determines the degree to which one operation is isolated from another: a data modification from another data modification, a data modification from a statement (such as a SELECT statement), or one statement from another statement.

More specifically, isolation levels are used to control the following:
Whether locks are taken when data is read, and what type of locks are requested.
How long read locks are held.
Whether a read operation that touches a row being modified by another transaction (1) blocks until the lock on the row is released, (2) retrieves the committed version of the row that existed when the reading statement or transaction started, or (3) reads the uncommitted data written by the modification.
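
The three possible outcomes in the last item can be sketched as a small Python model. This is only an illustration, not SQL Server code: the names `ReadBehavior`, `ReaderBlocked`, and `read_row` are hypothetical, and real blocking is modeled here simply by raising an exception.

```python
from enum import Enum


class ReadBehavior(Enum):
    BLOCK = 1                   # wait for the writer's lock to be released
    READ_COMMITTED_VERSION = 2  # return the last committed version of the row
    READ_UNCOMMITTED = 3        # return the in-flight (dirty) value


class ReaderBlocked(Exception):
    """Stands in for a reader waiting on a lock (assumption of this sketch)."""


def read_row(committed, in_flight, behavior):
    """Return what a reader sees for one row.

    committed: the last committed value of the row.
    in_flight: the uncommitted value written by another transaction,
               or None if nobody is modifying the row (so this toy model
               cannot represent an in-flight write of None).
    """
    if in_flight is None:
        # No concurrent modification: every behavior sees the same value.
        return committed
    if behavior is ReadBehavior.BLOCK:
        raise ReaderBlocked("row is locked by a concurrent modification")
    if behavior is ReadBehavior.READ_COMMITTED_VERSION:
        return committed
    return in_flight


# A writer has changed the balance from 100 to 250 but has not committed yet.
print(read_row(100, 250, ReadBehavior.READ_COMMITTED_VERSION))  # 100
print(read_row(100, 250, ReadBehavior.READ_UNCOMMITTED))        # 250
```
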
Traditionally, SQL Server has supported four isolation levels:
Read Uncommitted: This is the lowest isolation level. It only isolates transactions and activities to ensure that physically corrupt data is never read. It allows dirty reads, nonrepeatable reads, and phantom reads.
Read Committed: This isolation level does not permit dirty reads, but does allow nonrepeatable reads and phantom reads. This is the default isolation level for SQL Server, and it is used for every connection unless another isolation level has been set explicitly for that connection.
Repeatable Read: This isolation level does not permit dirty reads or nonrepeatable reads, but does allow phantom reads.
Serializable: This is the highest isolation level and ensures that all transactions and statements are completely isolated from each other. It does not allow dirty reads, nonrepeatable reads, or phantom reads.
Read Uncommitted provides the best concurrency, but the least data integrity. At the other extreme, Serializable provides the highest level of data integrity, but the worst concurrency. In general, higher concurrency means less blocking and better throughput, and lower concurrency means more blocking and lower throughput. As a DBA, it is your job to balance data integrity against concurrency by selecting an appropriate isolation level.
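
To make the trade-off concrete, here is a minimal Python sketch that encodes which read phenomena each of the four traditional levels allows, and picks the lowest level (and so the most concurrent one) that rules out the phenomena you cannot tolerate. The table `ALLOWED` and the helper `lowest_level_preventing` are hypothetical names used only for illustration; they are not part of SQL Server.

```python
ANOMALIES = ("dirty read", "nonrepeatable read", "phantom read")

# Ordered from lowest to highest isolation, following the descriptions above.
ALLOWED = {
    "READ UNCOMMITTED": {"dirty read", "nonrepeatable read", "phantom read"},
    "READ COMMITTED": {"nonrepeatable read", "phantom read"},
    "REPEATABLE READ": {"phantom read"},
    "SERIALIZABLE": set(),
}


def lowest_level_preventing(unwanted):
    """Return the lowest isolation level that allows none of `unwanted`."""
    unwanted = set(unwanted)
    unknown = unwanted - set(ANOMALIES)
    if unknown:
        raise ValueError(f"unknown read phenomena: {sorted(unknown)}")
    # Dicts preserve insertion order, so levels are checked lowest first.
    for level, allowed in ALLOWED.items():
        if not (allowed & unwanted):
            return level
    # Not reachable: SERIALIZABLE allows none of the three phenomena.
    raise AssertionError("no isolation level matched")


print(lowest_level_preventing({"dirty read"}))                          # READ COMMITTED
print(lowest_level_preventing({"dirty read", "nonrepeatable read"}))    # REPEATABLE READ
print(lowest_level_preventing({"phantom read"}))                        # SERIALIZABLE
```
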

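SQL Server 2005 introduced two new isolation levels, both of which use row versioning. Each one is enabled through a database option:
READ_COMMITTED_SNAPSHOT (statement level): makes the Read Committed isolation level use row versioning instead of read locks.
ALLOW_SNAPSHOT_ISOLATION (transaction level): allows transactions in the database to run under the Snapshot isolation level.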
The purpose of these new isolation levels is to give DBAs and developers more granularity in their choice of isolation levels, with the goal of improving read concurrency. Because readers that use row versioning do not take shared locks on the rows they read, SQL Server has fewer locks to manage. This means fewer locking resources are required and readers and writers block each other less, which in turn can boost SQL Server's performance.
When a row versioning-based isolation level (which includes the two new ones discussed here) is enabled at the database level, the database engine maintains versions of every row that is modified anywhere in that database. Whenever a transaction modifies a row, an image of the row as it was before the modification is copied into a page of what is called the version store. The version store is located in the tempdb database and is used for temporary storage of versioned rows for all of the databases on a single SQL Server instance.
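
The copy-before-modify idea behind the version store can be sketched with a toy Python model. This is a deliberate simplification and not how the engine is implemented: the class `ToyVersionedTable` and its methods are hypothetical, a plain dictionary stands in for the version store in tempdb, only one pre-image per row is kept, and the pre-image is discarded at commit (the real engine keeps versions for as long as active transactions may need them).

```python
class ToyVersionedTable:
    """Toy model: one table, one in-flight modification per row."""

    def __init__(self, rows):
        self.rows = dict(rows)   # current row images (may be uncommitted)
        self.version_store = {}  # key -> image of the row before modification

    def begin_update(self, key, new_value):
        # Copy the pre-modification image before overwriting the row.
        self.version_store.setdefault(key, self.rows[key])
        self.rows[key] = new_value

    def commit(self, key):
        # Simplification: the old version is no longer needed after commit.
        self.version_store.pop(key, None)

    def rollback(self, key):
        # Restore the row from its pre-modification image.
        self.rows[key] = self.version_store.pop(key)

    def read_versioned(self, key):
        # A row-versioning reader sees the pre-image while a write is in flight.
        return self.version_store.get(key, self.rows[key])

    def read_uncommitted(self, key):
        # A dirty reader sees whatever is currently in the row.
        return self.rows[key]


table = ToyVersionedTable({1: "old"})
table.begin_update(1, "new")
print(table.read_versioned(1))    # old
print(table.read_uncommitted(1))  # new
table.commit(1)
print(table.read_versioned(1))    # new
```

Note that the versioned reader never waits on the writer in this model, which is the source of the improved read concurrency described above.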
