Disclaimer: This post is not about detailed how-to and not about replication at Microsoft SQL Server at all, but is about my experience and faults.
Chapter 1. Never run alter column on replicated table (almost never).
Of course, there are many cases when you just do not have any other option. For example when you definitely know that changing from bit to smallint will take only a few minutes (it’s schema change only, just one table!). What is exactly happening in this case? You have OLTP server and store only monthly data there. Meanwhile, OLAP server stores all your data from the beginning. You thought about that a lot. Change bit to smallint. What can happen? I’m pretty sure that we’ll do it right before the end of the day and then go home. When I sat opposite my laptop in the middle of the night I had a lot of time to think about that twice.
Explanation: Yes, it took no more than 5 minutes on OLTP server, but when this transaction was sent to the OLAP, SQL Server started generating default values and updating transaction log. It took around a day to understand that something was going wrong and this just didn’t work.
Chapter 2 Choose correct type of the data
Yeah, I know that you do it all the time. I do it too. But I made a mistake. Everybody makes mistakes. In my particular example, I chose a bit type for a state property. I have records which I am checking by another source. If this another source has this record I raise a flag approved, otherwise set it to zero. Pretty simple, isn’t it? By default it is null. My assumption was that I have only two states of the flag: approved and not approved. And I missed null state. Then I added a default constraint with not approved value and I left only two states (one or zero). Then I was asked to add this third value. Hence I started this journey.
Advice: if you can use anything except null as value – use it.
Chapter 3 Re-initializing subscription is the worst option if you have OLTP-OLAP replication
The best solution is to wait until it is complete. You think about that first 24 hours. Then you start looking for other options. There are all options below which I found out:
- Wait until it is complete
- Re-initialize subscription
- Stop distribution and run the same query on the subscription database
- Rename column on the subscriber and create brand new with required settings
I put these options in an order which I thought how to choose them.
- I picked up first as the simplest one. I was waiting a lot of time when I started noticing that PID of the process was being changed each 5-6 hours. Finally, I found out what was going on. Your transaction can be interrupted and aborted and then replication tries to apply it again. And again and again. The reason can be any: timeout, server broke the session, the server was restarted (our sysadmin had a planned maintenance), anything.
- I walked around, investigate a bit and understood that if you are re-initializing replication whole data on the target machine will be replaced with data from the source. I suppose there are a couple of methods how to workaround this issue. However, I decided to go on.
- I ran the same query, and nothing changed. Still same enormous amount of time
- I was afraid that my replication will be broken
But when I tried all options I started researching.
Summary: try to find out as many applicable solutions as you can. Eventually, you will find out something.
Chapter 4 Rename column in the subscriber database can break your replication.
My investigation led me to only one promising idea. What will be if I rename a column to something like “myColumn_old” and create new but with old name? How does it work in general? I stuck to this idea.
When you are initializing, SQL Server creates for you following procedures for each table (an article) in the source:
When I found out that, I almost screamed. That is what I looked for! Want to go away from any delete on the OLAP? Here you are! Want to rename columns? It’s here too!
Replication with no-delete-on-subscriber policy was configured by my manager, and until now I haven’t known how it is configured! These procedures opened my eyes how replication does really work. It takes data from one server and runs procedures on the another. No black magic. No artificial intelligence. No undocumented APIs. Just simple as possible. And of course, after I renamed columns as I mentioned above and my transactions had flowed from publisher to subscriber immediately.