AL updates and locking, Part I - Introduction to lock escalation

adrogin
Feb 19, 2023
9 min read

Updated: Feb 23, 2023

Recently Microsoft announced significant changes in the SQL Server locking architecture which can dramatically reduce the number of lock objects held by a transaction and therefore reduce the risk of escalation. Of course I was curious to try the new locking mechanism and test it from the Business Central and AL coding perspective - how this change can impact the way we design AL applications.

When I started testing and describing my experience, I thought that I need to run a few queries with a more traditional SQL locking system to illustrate the SQL Server locking behaviour and compare the results against the new experimental locking model. So, one thing led to another, the post grew big, and I decided to split it in two parts. In the first part, I am going to walk the reader through a few tests demonstrating the locking escalations in SQL Server, whilst the next post will focus on Business Central update statements on an on-prem database and Azure DB with optimized locking.

Locking escalation in SQL Server

First of all, let me give you a short introduction in SQL Server lock escalation model before talking about tests and queries.

This is explained in detail in the Transaction locking and row versioning guide, a comprehensive document, worth reading for anyone who wants to understand ins and outs of the locking.

I recommend reading this document, but at the moment, since I am talking about lock escalations, I am particularly interested in a specific section of it, Lock escalation thresholds, which sheds the light on the conditions behind the decision to escalate a lock.

Lock escalation is triggered when lock escalation is not disabled on the table by using the ALTER TABLE SET LOCK_ESCALATION option, and when either of the following conditions exists:

A single Transact-SQL statement acquires at least 5,000 locks on a single nonpartitioned table or index.

A single Transact-SQL statement acquires at least 5,000 locks on a single partition of a partitioned table and the ALTER TABLE SET LOCK_ESCALATION option is set to AUTO.

The number of locks in an instance of the Database Engine exceeds memory or configuration thresholds.

If locks cannot be escalated because of lock conflicts, the Database Engine periodically triggers lock escalation at every 1,250 new locks acquired.

So, based on the official Microsoft guide, the check for escalation condition is triggered on every 1250 locks acquired by the transaction, but the escalation happens if a single statement within the transaction holds at least 5000 locks. If the total number of locks exceeds 5000, but the locks are held by multiple statements, and no single statement holds 5000 locks, the database engine continues row-level locking.

One more thing I want to highlight here because of a popular misconception that SQL Server database escalates row locks to a page level. Another quote from the locking guide confirms that this statement is wrong.

Lock escalation cannot occur if a different SPID is currently holding an incompatible table lock. Lock escalation always escalates to a table lock, and never to page locks. Additionally, if a lock escalation attempt fails because another SPID holds an incompatible TAB lock, the query that attempted escalation does not block while waiting for a TAB lock. Instead, it continues to acquire locks at its original, more granular level (row, key, or page), periodically making additional escalation attempts.

And once again, the same document refines the definition of the escalation path.

The Database Engine does not escalate row or key-range locks to page locks, but escalates them directly to table locks. Similarly, page locks are always escalated to table locks. Locking of partitioned tables can escalate to the HoBT level for the associated partition instead of to the table lock.

For a partitioned table, the lock is escalated to a partition instead of locking the whole table.

In short, what is important for the following tests:

Lock is escalated once a single statement in a transaction acquires 5000 locks
Lock is escalated from the row level directly to the table level

Test preparation

The table which I used for the tests is the one from my previous post https://www.keytogoodcode.com/post/how-come-the-size-of-my-sales-receivable-setup-table-is-16kb

In AL source code this table has two fields, with the clustered primary key on the integer "No.":

fields
{
    field(1; "No."; Integer) { }
    field(2; Description; Text[50]) { }
}

keys
{
    key(PK; "No.")
    {
        Clustered = true;
    }
}

Or the same table from the SQL Server perspective (skipping constraints with default values on system fields).

CREATE TABLE [dbo].[MyTestTable](
	[timestamp] [timestamp] NOT NULL,
	[No_] [int] NOT NULL,
	[Description] [nvarchar](50) NOT NULL,
	[$systemId] [uniqueidentifier] NOT NULL,
	[$systemCreatedAt] [datetime] NOT NULL,
	[$systemCreatedBy] [uniqueidentifier] NOT NULL,
	[$systemModifiedAt] [datetime] NOT NULL,
	[$systemModifiedBy] [uniqueidentifier] NOT NULL,
 CONSTRAINT [MyTestTable$PK] PRIMARY KEY CLUSTERED 
(
	[No_] ASC
),
 
CONSTRAINT [MyTestTable$systemId] UNIQUE NONCLUSTERED 
(
	[$systemId] ASC
)

I removed the application GUID and the company name from the table name for readability, but whenever I refer to MyTestTable, I mean a BC table with all its complicated naming.

It happened that I created this table for another test and needed a large dataset, so 200 000 records were ready at hand. I did not recreate the set and just used available data (although had to drop it all and reinitialize again a few times as the testing went along). A much smaller set of 10 000 record would be sufficient for my purpose, but in the end, I thought it's good to show how an update of a relatively small fraction of a table can lead to an escalation that locks the whole table.

Test 1: Modify a range of values

Before starting any experiments with update statements generated by Business Central, I want to run a few updates directly from SQL Server Management Studio to establish a benchmark. The queries in the initial tests are important to demonstrate the escalation behaviour of SQL Server and compare the results with queries generated by AL statements.

In the first test, I run a simple UPDATE query modifying 3000 rows. The statement is executed inside a transaction for two reasons. First of all, because I don't really want these changes to persist, but rather interested in capturing the statistics during the statement execution. The other reason will be clear in the subsequent tests - this is needed to illustrate the importance of differentiation between locks held by a transaction and those acquired by a single statement within the transaction.

BEGIN TRANSACTION
	UPDATE [MyTestTable] SET [Description] = 'abcd' WHERE No_ <= 3000

	WAITFOR DELAY '00:00:30'
ROLLBACK

After updating the rows, the transaction waits for 30 seconds before the rollback to give enough time to observe the locking statistics. Once the transaction is completed, all its locks will be released, so want to have a bit of time to

query the stats. I use the sys.dm_tran_locks management view to collect the locks acquired by the transaction.

select * from sys.dm_tran_locks

After running the first update statement, the management view contains 3086 lock objects: 3000 exclusive keylocks, 82 intent exclusive (IX) page locks, 1 IX table lock, and 3 shared database locks.

Shared lock on the whole database is always acquired by every connection to prevent the database from being dropped while the connection is active. It does not affect locking on lower granularity levels, so we can ignore those.

3000 exclusive keylocks are quite obvious - these are the 3000 rows being updated. As for the remaining 83 intent exclusive locks, probably this bit needs an explanation and another reference to the SQL Server locking guide. Although, this part is not very clear in the guide, it gives an example of locking behaviour of an update statement.

The UPDATE statement acquires these locks:

Exclusive (X) locks on the updated data rows.

Intent exclusive (IX) locks on the clustered index pages containing those rows.

An IX lock on the clustered index and another on the table.

Besides holding an exclusive lock on the row being modified, UPDATE statement always acquires intent exclusive locks on objects containing this row: page and table (or partition for a partitioned table). These locks prevent other transactions from placing exclusive locks on the same objects. Note that two intent exclusive locks are compatible, so multiple transactions can place IX locks on one table, thus modifying different rows in the same table. But an IX lock held by another transaction will prevent escalation, because escalation converts an intent exclusive table lock to an exclusive one, which is incompatible with any other lock type.

Another useful source of locking statistics, which also collects data on the escalation events, is the sys.dm_db_index_operational_stats dynamic management view.

SELECT 
    index_id,
    row_lock_count, 
    page_lock_count, 
    index_lock_promotion_attempt_count, 
    index_lock_promotion_count
FROM sys.dm_db_index_operational_stats (
    DB_ID(),OBJECT_ID(N'dbo.MyTestTable', 'U'), NULL, NULL)

Here are the counters that we can see for the UPDATE statement above.

Numbers in this table seem to be contradicting the data received from the sys.dm_tran_locks view. Why is the row lock count twice the number of updated rows?

To interpret this number, we need to understand the actual locking sequence during the execution of an update statement. For example, looking into the sequence of locking events in the SQL Server profiler.

This is what happens when a single row is updated.

As you can see, lock is not a placed on a row in a single event. In a read-committed transaction, the UPDATE statement acquires an update (U) lock on rows and intent update (IU) lock on pages and the table while reading the data. These locks are converted to exclusive (X) for rows and intent exclusive (IX) for pages and the table when the statement updates the row data. This sequence can be illustrated on a query execution plan.

Conversion of the lock from U to X (and from IU to IX) triggers a separate lock event, therefore the dm_db_index_operational_stats doubles the total number of locks. But this event does not increase the number of locks held by the transaction, that's why we see only one (unsuccessful) lock promotion attempt in the statistics. This counter is incremented when the lock count reaches 2500.

The counter index_lock_promotion_attempt_count needs a side note as well. First, from the terminology perspective - terms "promotion" and "escalation" are both used in SQL Server documentation to mean the same thing. Second, this counter is a bit of a misnomer, because it does not actually mean that the database engine indeed attempts to escalate the lock whenever this counter ticks. In fact, it is incremented every time when the transaction lock counter is increased by 1250. Except precisely 1250. So the counter is incremented when the lock count is 2500, 3750, 5000, 6250, and so on. The first actual escalation attempt happens when the counter reaches 5000 - but only if these 5000 locks are held by one statement on the same table instance. Therefore the number 1 in the statistics means that the number of acquired locks exceeded 2500, and not a failed escalation attempt.

Test 2: Modifying a larger set

In the second test, I am running the same SQL statement, but increasing the number of records to 6100 (changed the WHERE condition in the update query to WHERE No_ <= 6100).

Why this number, 6100? It would be logical to run a test with 5000 rows to see how the lock is escalated when the threshold is reached. Well, I'm skipping 5000 because, despite the clear statement in the locking guide, the escalation does not happen at 5000 rows. If you want to understand the details, I recommend a great article The Lock Escalation Threshold on SQLPerformance.com. I will just skip the 5000 rows dataset and expand the number of affected rows to 6100.

Now, let's run the modified query and have a look at the locking statistics.

We see that now SQL Server released all row and page-level locks and holds one exclusive lock on the table instead.

sys.dm_db_index_operational_stats view shows 12167 rowlock events.

Considering that the numbers above are double of the actual lock count, we can assume that the actual number of row locks before the escalation was 6083 rowlocks + 165 page locks + 1 IX table lock = 6249. And the subsequent lock acquired on the row 6084 caused the database engine to escalate the lock to the table level. From this moment on, there are no more row or page locks, and the counter value does not change.

As I said before, index_lock_promotion_attempt_count does not reflect the actual escalation attempts, but rather marks every 1250 locks acquired by the transaction. So the value was increased when the number of locks reached 2500, 3720, 5000, and 6250 where the lock was escalated and the inded_lock_promotion_count was increased to indicate the successful escalation.

Test 3: Updating records in batches

Now I want to see what happens if I start updating records in the table, splitting the updates in smaller batches, but still keeping everything in one transaction. The following snippet updates all 200 000 rows in the table, one thousand at a time, keeping all updates within one transaction.

DECLARE @I int = 1, @Count int = (SELECT COUNT(*) FROM MyTestTable)
BEGIN TRANSACTION
	WHILE @I < = @Count
	BEGIN
		UPDATE MyTestTable SET [Description] = 'abcd'
		    WHERE No_ BETWEEN @I AND @I + 999
		SET @I = @I + 1000
	END
	WAITFOR DELAY '00:00:30'
ROLLBACK

Index operational stats now shows 400 000 in the rowlock count, quite in line with the locking guide and the previous tests.

row_lock_count reflects 200 000 rowlocks, each one triggering two lock events, along with 163 lock escalation attempts - and none successful. As it was said earlier, index_lock_promotion_attempt_count does not actually count escalation attempts, but rather keeps the number of times the total number of transaction locks could be divided by 1250. Regardless of the meaning of the promotion attempt counter, what we see here is that SQL Server continues holding all 200 000 granular rowlocks and 5599 page locks until the end of the transaction, never escalating them to the table level.

I emphasized a few times the importance of differentiation between the number of locks held by a transaction and locks acquired by a single SQL statement. In this test, we can clearly see the difference - although the transaction lock count is way beyond the escalation threshold, the lock is still not escalated and remains on the row level because each separate statement acquires less than 5000 locks.

This was the introduction into the SQL Server locking model. In the next post, I will map this experience on Business Central AL code and the optimized locking model in Azure SQL.

Subscribe and stay tuned!

AL updates and locking, Part I - Introduction to lock escalation

Locking escalation in SQL Server

Recent Posts

SIGN UP AND STAY UPDATED!