
Monday, September 10, 2012

push and pull subscription

push subscription

With a push subscription, the Publisher propagates changes to a Subscriber without a request from the Subscriber. Changes can be pushed to Subscribers on demand, continuously, or on a scheduled basis. The Distribution Agent or Merge Agent runs at the Distributor.

pull subscription

With a pull subscription, the Subscriber requests changes made at the Publisher. Pull subscriptions allow the user at the Subscriber to determine when the data changes are synchronized. The Distribution Agent or the Merge Agent runs at the Subscriber.

 ====================================================================================
 

A subscription can be configured in two ways:

1. Push subscription

In a push subscription, the Publisher decides when data is sent to the Subscriber.
The Subscriber plays a completely passive role in this scenario. Use this model when
you want control of the data flow to stay in the hands of the Publisher.

2. Pull subscription

In a pull subscription, the Subscriber requests new or changed data and decides
when to update itself. Use this model when you want control to be in the hands
of the Subscriber rather than the Publisher.

 

Replication is a very useful method of copying data from production systems to standby servers, reporting servers, and downstream data relay points. Unfortunately, it is an often misunderstood and/or overlooked technology, even among experienced DBAs. In my professional career, I have often seen replication misused in an attempt to achieve a goal for which it was not intended. When it is determined that replication is the correct solution (making that determination will be the subject of a future article), you must install and configure three basic components (these are extremely simplified definitions):

  1. Publisher: This is the source database or databases containing the information you want to replicate.
  2. Distributor: This is the database and set of jobs responsible for queuing replicated data from the publisher.
  3. Subscriber: This is the destination database or databases for data coming from the publisher.

Push and Pull, as named in the title of this article, are the two methods available for moving data from the Distributor to the Subscriber(s). Under the Push method, the Distributor is responsible for queuing data from the Publisher, then propagating it to the Subscriber(s). Under the Pull method, the Distributor is responsible for queuing data from the Publisher, and it is the job of each Subscriber to connect to the Distributor and grab all queued data ready for replication. Selecting the incorrect method can lead to serious performance issues, especially at peak times of database use.
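To make the difference concrete, here is a toy Python model of the two methods. It is only a sketch for illustration and has nothing to do with the actual SQL Server agents or APIs: the `Distributor` and `Subscriber` classes, `push_to`, `pull_from`, and the subscriber names are all made up for this example. In both cases the Distributor queues the data; the only thing that changes is which side initiates the transfer.

```python
class Distributor:
    """Toy stand-in for the distribution queue (not a SQL Server API)."""

    def __init__(self):
        self.changes = []    # changes queued from the Publisher, in order
        self.delivered = {}  # subscriber name -> count of changes already handed over

    def enqueue(self, change):
        # In both models the Distributor queues data coming from the Publisher.
        self.changes.append(change)

    def pending_for(self, name):
        # Return the changes this subscriber has not received yet
        # and mark them as delivered.
        start = self.delivered.get(name, 0)
        self.delivered[name] = len(self.changes)
        return self.changes[start:]


class Subscriber:
    def __init__(self, name):
        self.name = name
        self.rows = []

    def apply(self, changes):
        self.rows.extend(changes)

    def pull_from(self, distributor):
        # Pull: the Subscriber connects to the Distributor and grabs queued data.
        self.apply(distributor.pending_for(self.name))


def push_to(distributor, subscribers):
    # Push: the Distributor side propagates queued data to each Subscriber.
    for sub in subscribers:
        sub.apply(distributor.pending_for(sub.name))


dist = Distributor()
reporting, standby = Subscriber("reporting"), Subscriber("standby")
dist.enqueue("row 1")
dist.enqueue("row 2")
push_to(dist, [reporting])   # reporting is treated as a push subscriber
dist.enqueue("row 3")
standby.pull_from(dist)      # standby pulls on its own schedule
print(reporting.rows)  # ['row 1', 'row 2']
print(standby.rows)    # ['row 1', 'row 2', 'row 3']
```

Notice that the loop in `push_to` runs on the Distributor side, while `pull_from` runs on the Subscriber. That is the work you are placing on one server or the other when you choose a method.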

In my years of experience troubleshooting replication problems in production database environments I have discovered a common denominator: no thought, consideration or understanding of Push and Pull was given when the replication topology was designed and implemented. This leads to a highly disorganized and chaotic environment, featuring a random blend of Push and Pull methods and much consternation. Interviewing the designers of such systems about why some subscriptions are Pushed, and others Pulled, has yielded responses such as “I don’t know,” “What is Push/Pull?” and “Does it really matter?” Choosing whether to use Push or Pull methods for your replication topology matters a great deal. There are two key factors to consider when making this determination:

  1. What kind of load is expected on the Publisher? For example, are the published databases serving as the foundation for a high-traffic website? My current client provides online web advertising engines which serve ads to web sites and capture anywhere from thousands to millions of mouse clicks and ad impressions per hour, all loaded into a set of very busy databases that are published for replication.
  2. On which server will the Distributor reside? Distribution can be configured to run on the Publisher, or it can be configured to run remotely on its own server. Determining the amount of load on your Publisher will often answer the question of where your Distributor should reside.

Consider the following example scenarios, based on the key factors I just mentioned. They are by no means exhaustive:

1. Your Publisher receives millions of records per day in a high-transaction environment. Running Distribution remotely would be ideal, but budgetary constraints have forced you to run Distribution on the Publisher.

Solution: Strongly consider using the Pull method. The server running Publishing and Distribution will be under heavy load, so place the burden of retrieving replicated data on the Subscriber.

2. Your Publisher is moderately busy. Distribution is running remotely. Your Subscriber is a heavily used reporting system with heavy disk I/O and periods of high CPU usage.

Solution: Strongly consider using the Push method. A heavy reporting load on the Subscriber need not be further complicated by forcing the Subscriber to Pull replicated data from the Distributor. Let the Distributor handle that task.
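The two scenarios above can be written down as a small rule of thumb. The Python sketch below is only my illustration of that reasoning, not a tool that ships with SQL Server: the `Topology` fields, the "low"/"moderate"/"heavy" labels, and the `suggest_method` function are hypothetical names, and any real decision should rest on measured load rather than three labels.

```python
from dataclasses import dataclass


@dataclass
class Topology:
    publisher_load: str             # "low", "moderate", or "heavy" (illustrative labels)
    distributor_on_publisher: bool  # True if Distribution runs on the Publisher
    subscriber_load: str            # "low", "moderate", or "heavy"


def suggest_method(t: Topology) -> str:
    # Scenario 1: a heavily loaded Publisher that also hosts Distribution.
    # Move the delivery work to the Subscriber.
    if t.distributor_on_publisher and t.publisher_load == "heavy":
        return "pull"
    # Scenario 2: a remote Distributor and a heavily loaded Subscriber.
    # Let the Distributor handle delivery.
    if not t.distributor_on_publisher and t.subscriber_load == "heavy":
        return "push"
    # Anything else is not covered by the two scenarios.
    return "undecided"


print(suggest_method(Topology("heavy", True, "low")))       # pull
print(suggest_method(Topology("moderate", False, "heavy"))) # push
```

Cases that fall through to "undecided" are exactly the ones where you need to sit down and look at where the load really is before you pick.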

Selecting the correct method (Push vs. Pull) to move data from the Distributor to the Subscriber(s) is a key factor in deploying a successful replication topology. It is worth spending some extra time in the planning stages to work out the method you will use. Your client, end users and DBAs will thank you in the end.

 
