Error handling in T-SQL traditionally has been a sort of misterious voo-doo for most developers, with its mixture oferror severities, SET settings likeXACT_ABORT,ARITHABORTorARITHIGNOREand the options to handle the error on the server or at the client. For a long time now the best resource I know on this subject was, and perhaps still is, Erland Sommarskog set of articles at But the introduction of Service Broker activated procedures adds some new issues to consider when designing your application and this post is about what these issues are these and how to best cope with them.

Well first things first, lets get this out of our way: check@@erroror useBEGIN TRY BEGIN CATCH? In my opinion, the advantages offered by the TRY CATCH blocks are overwhelming and you should steer away from checking @@ERROR after each statement. Even if you may find some immediate scenario that would seem simpler using an explicit check for @@error, writing code using TRYCATCH blocks pays of on the long run. For me the most important differentiatin factor between the two is actually the human factor: code writen with @@error checks is difficult to read and impossible to maintain. Keeping the discipline in writing code to check for @@error after each statement is tedious and errorprone. @@error checks are not programming in the future tense as Scott Meyers would put it: future versions in the database engine could introduce behavior changes that raise new errors your application is not prepared to deal with. The TRYCATCH blocks solve so many of these issues that they are clearly the winner in my book.

Why is Service Broker error handling something special? Shouldnt the normal common wisdom of writing Transact-SQL apply to Service Broker as well? Well yes, but as it turns out there is a twist to the story. Service Broker is an integrated part of the SQL Server database engine and it follows all the normal Transact-SQL language parsing, compilation and execution rules like any other database engine componet, including everything related to errors and exception handling. But the typical Service Broker application is deployed in a quite different fashion than the non-broker applications: its an activated procedure that runs in the server background threads. The first major difference is that there is no client application to surface the error to, so any error that happens will have to be dealt with by the procedure. This means that the very purpose of error handling in a Service Broker application is different than in an ordinary procedure: Service Broker error handling has to captureanyerror and somehow react to this error at the level of Service Broker semantics, namely conversations. Most times this means that the conversation that received the message that triggered the error has to be ended with an error so that the partener service is notified about this problem and, sometimes, some compensation logic has to be run to undo the effects of this conversation had so far. The partener will receive the error message and in turn run its own compensation to complete (with error) the business transaction represented by the conversation..

Lets roll a quick example. Supose we have a trivial service whose role is to insert received message content into a target table (similar to what an audit service perhaps would do). Lets go ahead and build our DDL objects:

selecte.value(@id,int)as[id],

e.value(@key,nvarchar(256))as[key],

e.value(@value,nvarchar(256))as[value]

from@payload.nodes(//payload/entry)p(e);

select@error=ERROR_NUMBER(),@message=ERROR_MESSAGE();

endconversation@handlewitherror=@errordescription=@message;

attach the procedure to the target_queue as an activated procedure

As you can see, the error handling in the stored procedure is handled by a BEGIN TRY/BEGIN CATCH block, and the error handling consist of ending the conversation with an error, sending back the error number and message to the original sender. Of course, in a real application it would probably make sense to clearly define error numbers and messages reported by the target service, as is not really useful to send back to the sender some internal database engine codes and messages. So now lets go ahead and test our service:

toservicetarget_service,current database

select@payload=Npayloadentry id=1 key=A value=B//payload;

If we look into the target_table we find the entry we sent (1, A, B). The response that came back from the target service is sitting in the sender_queue and is a mundane EndDialog message, indicating the succesfull completion of the assignment.

Now if we simply run the very same SEND example, were gonna trigger an error, because the same values would violate the primary key constraint on the target_table. So go ahead and run this again:

toservicetarget_service,current database

select@payload=Npayloadentry id=1 key=A value=B//payload;

Now the sender_queue has received an error message:

DescriptionViolationofPRIMARYKEYconstraintPK__target_table__0AD2A005.Cannotinsertduplicatekeyinobjectdbo.target_table./Description

So our error handling works as we expected and the service is handling error conditions fine. So is this the end of the story?

As it turns out, weve only skimmed the surface and were about to open one nasty can of worms. To proove this lets produce the one most common mistake that happens during development: a malformed payload:

declare@handleuniqueidentifier,@payloadnvarchar(max);

toservicetarget_service,current database

select@payload=Npayloadentry id=1 key=A value=B/payload;

The payload in this case is incorrect, the entry tag is not closed. So what happens with our message? Nothing apparently, the message simply stays in the target_queue and refuses to be processed, as a simple check usingselect*from[target_queue]shows! The obvious culprit should be that message failed to be processed 5 times and the poison message detection has intervened and deactivated the target_queue so lets go ahead and check it:

select[name],is_receive_enabledfromsys.service_queues

Now I reckon that as I was writing this piece, I had a big surprise to see that the queue is not disabled! In fact I had to delete a paragraph I had already wrote,  and come back and revisit this behavior. So what happened? It turns out that the activation mechanism was actualy fooled by our erroneous message to put the queue in the NOTIFIED state, and leave it as such, as a check on therelevant DMVshows:select*fromsys.dm_broker_queue_monitors.The NOTIFIED state occurs when an activated procedure was launched but the RECEIVE verb has not been run on the activated queue. But our activated procedure was not running, as a quick check onselect*fromsys.dm_broker_activated_tasksshows. A second check on the ERRORLOG (or the system event viewer) revealed the problem:

2007-10-31 16:31:52.57 spid52s The activated proc [dbo].[usp_target] running on queue tempdb.dbo.target_queue output the following: The current transaction cannot be committed and cannot support operations that write to the log file. Roll back the transaction.

So our procedure wad run, had hit an exception, but our exception handler was not capable of dealing with the problem. The error in the ERRORLOG indicates that we are dealing with an uncommittable transaction issue. So apparently we need to augment our error handling CATCH block to deal with such problems, and theXACT_STATEdocumentation shows how this can be achieved. So were gonna modify our CATCH block to deal with doomed transactions and were also gonna modify our RECEIVE code slightly not to cast to XML during the RECEIVE statement, since that apparently is fooling the activation machinery which behaves as if the RECEIVE never occured. Here is our modified procedure:

selecte.value(@id,int)as[id],

e.value(@key,nvarchar(256))as[key],

e.value(@value,nvarchar(256))as[value]

from@payload.nodes(//payload/entry)p(e);

If -1, the transaction is uncommittable and should

Test whether the transaction is uncommittable.

Test wether the transaction is active and valid.

select@error=ERROR_NUMBER(),@message=ERROR_MESSAGE();

endconversation@handlewitherror=@errordescription=@message;

We alter the proceure, turn activation back on and yet again nothig, the message is still sitting in the target_queue. This time though the queue was correctly disabled, we check theselect[name],is_receive_enabledfromsys.service_queuesview and sure enough the queue is disabled. But wait a minute! Im sure TRY/CATCH has its place, but this is not quite what we wanted to achieve, a simple common XML formatting error is taking down our service! How come?

As it turns out, XML casting and validation errors are raised with severity 16, and this severity will always put the transaction into an uncommittable state. For Service Broker, this is quite bad, as parsing and shreding XML is a day to day operation for most service applications. Unfortunately, theres simply no workaround for this issue, which has the implication that one cannot use the SQL Server built-in XML methods tovalidatethe received XML programmatically, one has to ensure that the XML is valid upfront. For XML payloads fortunately there is a solution: declare the message validation on the message type and even enforce an XML schema. This will cause validation to occur prior to the message being enqueued and will ensure that the activate dprocedure has to deal only with valid input. In our case, this means to change the contract of the service to use an XML well formatted validation message:

createmessagetype[valid_payload] validation=well_formed_xml;

createcontract[validated_contract]([valid_payload] sentbyinitiator);

alterservice[target_service](addcontract[validated_contract]);

We also need to change our activated procedure to deal with the new message type name, the valid_payload type:

We can now reset the broker in tempdb (to get rif of the erroneous message that would prevent the target_queue from ever becomming enabled) and enable back the target_queue:

alterdatabasetempdbsetnew_brokerwithrollbackimmediate;

So now we can send again the erroneous message, this time using the validated contract and message type:

declare@handleuniqueidentifier,@payloadnvarchar(max);

toservicetarget_service,current database

select@payload=Npayloadentry id=1 key=A value=B/payload;

sendonconversation@handlemessagetype[valid_payload](@payload);

And, as expected, the Service Broker responds immedeately with an error on our conversation because the message did not pass the XML payload validation:

DescriptionA message of type valid_payload failed XML validation on the target service.XML parsing: line 1, character 51, end tag does not match start tag This occurred in the message with Conversation ID , Initiator: 1, and Message sequence number: 0./Description

So we were able to cope with the case of invalid XML, but how about other cases? What if the message is still valid XML, but has some invalid attributes? Ie. the payloadpayloadentry id=foobar key=A value=B//payloadwill pass the well_formed_xml test, but will still cause the service to disable itslef, because foobar cannot be casted to int. You could raise the bar even further by specifying an XML schema for validation, but not all of us are so confident in our XML schema knowledge to be sure that we covered everything. And besides, there are binary messages too, XML payload isnota requirement for Service Broker. The truth is that the severity of such trivial errors forcing the transaction to rollback is quite a burden on the application, since it means that pretty much any malformed message can disable a service. And there simply isnt any buletproof defense against it.

Many developers I talked with have been at this point already and realised that they simply cannot cover all the angles on this problem so they tried another approach: what if one rolls back, but then does some corrective action. Surely we could write some code in the catch block that remedies the problem, like dequeue the message and error the conversation immedeately, without trying to process its payload. Usually this looks something like the following, int he BEGIN CATCH block:

Test whether the transaction is uncommittable.

select@error=ERROR_NUMBER(),@message=ERROR_MESSAGE();

endconversationwitherror=@errordescription=@message;

But this is riddled with problems. The moment the transaction was rollbed back, the code has absolutely no guarantee over the state of this conversation and its messages. Another instance of the activated procedure might had already picked up the very same message and tried to process it. This problem is true for any processing of this kind, it is not specific to Service Broker, but the typical Service Broker environment, with its activated procedures running in parallel and trying to grab the next unlocked message it is very likely to happen in production. Whats worst is that is isunlikelyto happen in the test development environement, unless the test environment matches the concurrency of the production environment.

So it turns out error handling in Service Broker is a bit trickier than expected and presenting some challenges. Some recommendations stand valid:

Do use the BEGIN TRY/BEGIN CATCH blocks, they are simplifying the code tremendously

Use the Service Broker conversation semantics to represent business transactions, and use the END CONVERSATION WITH ERROR to signal errors to the other service at the end of a conversation when you catch

Do harden your service broker contract as much as possible using XML validation schemas to catch malformed messages as early as possible, before they enter your application queues

But other thant this, dealing with errors and the problems caused by uncommittable transactions and its effects on Service Broker queues is by no means trivial, and is a subject I actually still looking for an acceptable answer.

I cant test it right now, but can a doomed transaction become un-doomed by rolling back to a savepoint that was taken before the error that caused the transaction to become uncommittable? If so, the solution is obvious take a savepoint directly after reading a message from the queue, and rollback to that savepoint if an error turns out to be a poison message.

I have handled errors in a different way, similar to what Hugo suggested can you please let me know if this is a valid way of doing it.

Basically I begin a transaction before receiving any messages off the queue. When I get a message off the queue i also store the conversation_handle. I then do processing on the message in a try..catch block. In the catch block, i rollback the transaction (which puts the message back on the queue), begin a new transaction, and then get the message off the queue again using the same conversation_handle. I then log the error in a table and send an error message back to the initiating queue.

keep looping while we have a message to process

where conversation_handle = @conversationhandle), TIMEOUT 1000;

and send an error message back to the initiating queue

lets see if there are any other messages waiting to be processed and get that if there is

Unfortunately rollback to savepoints are not supported on doomed transactions. The funny story is that if you have *both* Rogers books about Service Broker (the beta and the 2005 versions) the beta one shows how to use rollback to savepoint, because in Beta 2 this was working! By RTM time though the functionality had changed.

As for rolling back an then receiving the same message it works only if theres one and only one queue reader. In a concurrent environment the rolled back message is up for grabs as soon as it is unlocked and a different instance of the procedure may grab it and start processing it anew, so the risk of deactivating the queue is still there.

This blog has saved me hours! Thanks muchly 🙂

[] were few and far between and depended upon user input, this was not trivial. You can see this article for a tutorial on []

I came across another site that talked about message poisoning. Was I misunderstanding them or should I end the conversation before/after the rollback in the code above, as well? Thanks.

@Chris: If you end *before*, it will be rolled back so is a no-op. If you end *after*, the conversation was already unlocked by the rollback and it can be already locked and changed by another thread. Getting poison message to be handled *automatically* is very tricky. My recommendation is to handle all cases possible in the commit path, and when faced with a true poison message that causes rollback, analyze it manually and fix the procedure to cause it to go on the commit path. Cycle, rinse repeat until all errors are properly handled.

Hi, Remus. I am having some issues with my deployment of service broker, and I am stumped. I am getting the error The conversation handle is not found. The service broker send/receive occurs on the same instance within the same db. I am using an activation procedure from the queue. I am using a trigger on a table to start the messaging. Below is my code.

SELECT @XML = (SELECT ID FROM inserted FOR XML AUTO);

Send the XML records to the Service Broker queue:

RECEIVE conversation_handle, message_body, message_type_name

If no messages exist, then break out of the loop:

INTO @ConversationID, @MessageBody, @MessageTypeName;

Lets only deal with messages of Message Type

IF @MessageTypeName = BrokerMessageType

SELECT @EmpID = tbl.rows.value(@ID, UNIQUEIDENTIFIER)

FROM @XML.nodes(/inserted) tbl(rows);

INTO @ConversationID, @MessageBody, @MessageTypeName;

select @error = ERROR_NUMBER(), @message = ERROR_MESSAGE();

end conversation @ConversationID with error = @error description = @message;

Can you see anything that would cause this error? Am I doing something wrong here? Any assistance/direction you can provide would be greatly appreciated. Thank you.

Interested in SQL Server monitoring and configuration management?

SQL Injection: casting can introduce additional single quotes

The cost of a transactions that has only applocks

Interested in SQL Server monitoring and configuration management?