High Availability Strategies - InterSystems


An InterSystems Technology Guide

One Memorial Drive, Cambridge, MA 02142, USA
Tel: 1.617.621.0600  Fax: 1.617.494.1631
http://www.intersystems.com

HIGH AVAILABILITY STRATEGIES
HA Strategies for InterSystems Caché, Ensemble, and HealthShare Foundation

Introduction
Operating System Failover Clustering
Virtualization-Based High Availability
Caché Database Mirroring
Mirroring Failover Strategies
    Failover with Mirror Arbiter
    Failover with ISCAgent Only
    Failover with Custom Solution: Reliable Network Ping
Hybrid HA Strategy
General System Outages
    Planned Outage Types
    Unplanned Outage Types
Appendix A: Sample Reliable Network Configurations
Appendix B: Hybrid HA Solution
Appendix C: Sample ZMIRROR for Reliable Network Ping Failover
Appendix D: Manual Failover after Unplanned Outage of Primary

Ray Fucillo, Product Manager (ray.fucillo@intersystems.com)
Mark Bolinsky, Technology Architect (mark.bolinsky@intersystems.com)
April 6, 2015

General High Availability Strategies: InterSystems Caché, Ensemble, and HealthShare Foundation

INTRODUCTION

This document surveys the High Availability (HA) strategies that can be used in conjunction with InterSystems Caché, Ensemble, and HealthShare Foundation. It also gives an overview of the types of system outage that can occur and describes how each strategy handles a given outage, with the goal of helping you choose the right strategy for your specific deployment.

The strategies surveyed in this document are based on three different HA technologies: Operating System Failover Clusters, Virtualization-Based HA, and Caché Database Mirroring. Table 1 below highlights some key differences between these technologies.

Failover after Machine Power Loss or Crash
    Caché Database Mirroring: Handles machine failure seamlessly in version 2015.1 or later. Prior versions did not fail over automatically in this scenario; alternatives required careful planning.
    Operating System Failover Clustering: Handles machine failure seamlessly.
    Virtualization High Availability: Handles physical and virtual machine failures seamlessly.

Protection from Storage Failure and Corruption
    Caché Database Mirroring: Built-in replication protects against storage failure; logical replication avoids carrying forward many types of corruption.
    Operating System Failover Clustering: Relies on shared storage device, so failure is disastrous; storage-level redundancy optional, but can carry forward some types of corruption.
    Virtualization High Availability: Relies on shared storage device, so failure is disastrous; storage-level redundancy optional, but can carry forward some types of corruption.

Failover after Caché Shutdown, Hang, or Crash
    Caché Database Mirroring: Rapid detection and failover is built in.
    Operating System Failover Clustering: Can be configured to fail over after Caché outage.
    Virtualization High Availability: Can be configured to fail over after Caché outage.

Caché Upgrades
    Caché Database Mirroring: Allows for minimum-downtime Caché upgrades.*
    Operating System Failover Clustering: Caché upgrades require downtime.
    Virtualization High Availability: Caché upgrades require downtime.

Application Mean Time to Recovery
    Caché Database Mirroring: Failover time is typically seconds.
    Operating System Failover Clustering: Failover time can be minutes.
    Virtualization High Availability: Failover time can be minutes.

External File Synchronization
    Caché Database Mirroring: Only databases are replicated; external files need an external solution.
    Operating System Failover Clustering: All files are available to both nodes.
    Virtualization High Availability: All files are available after failover.

Table 1: General Feature Comparison

*Requires a configuration in which application code, routines, and classes are in databases separate from those that contain application data.

OPERATING SYSTEM FAILOVER CLUSTERING

A very common approach to achieving HA is to use failover solutions that are provided at the operating system level. Examples of such solutions exist on all platforms and include Microsoft Windows Clusters, HP Serviceguard, Veritas Cluster Server, and IBM SystemMirror (PowerHA), as well as the respective clustering packages from Red Hat and SUSE Linux. While the specifics of the configuration may differ slightly among the various platforms, the model is generally the same: two identical servers with a shared storage device (often a SAN or iSCSI targets) and a shared IP address, one actively serving production workload, and one standing by in case of failure. When an outage occurs on the active system, the failover technology transfers control of the shared disk and the shared IP address to the standby node, and then starts application services, including Caché.

Caché is designed to integrate easily with these failover solutions. The production instance of Caché is installed on the shared storage device so that both members of the failover cluster recognize the instance, and is then added to the failover cluster configuration so that it is started automatically as part of failover. When Caché starts on the newly active node during failover, it automatically performs the normal startup recovery from WIJ and journal files (again, located on the shared storage device); data integrity is preserved just as though Caché had simply been restarted on the original failed node.

Pros:
- Handles machine failure seamlessly
- Most common HA choice
- Available on all supported platforms through OS or 3rd party vendors
- All files (database and external) available to both nodes

Cons:
- Storage failure is disastrous
- Upgrades require downtime
- Application Mean Time to Recovery can be minutes

The appendixes in the Caché High Availability Guide contain detailed information on how to correctly configure Caché with some of the more popular OS failover clusters.

VIRTUALIZATION-BASED HIGH AVAILABILITY

Virtualization technologies, such as VMware vSphere ESX/ESXi, provide High Availability capabilities, which typically monitor the overall health and viability of the physical hardware, as well as the guest operating systems running on it. On failure, the Virtualization HA software automatically restarts the failed virtual machine on alternate surviving hardware. When Caché restarts, it automatically performs the normal startup recovery from WIJ and journal files; data integrity is preserved just as though Caché had simply been restarted on the original failed node.

In addition, guest operating systems can be relocated to other servers within the virtual environment, allowing a virtual machine to be moved to alternate physical infrastructure for maintenance purposes without downtime. This feature is available as VMware vMotion, IBM Live Partition Mobility, HP Live VM Migration, and others.

Pros:
- Handles machine failure seamlessly
- Most common HA choice in virtual environments
- All files are available after failover
- Planned physical hardware maintenance requires little or no application downtime

Cons:
- Storage failure is disastrous
- Software upgrades require downtime
- Application Mean Time to Recovery can be minutes for unplanned hardware failures

Proper infrastructure is required to effectively support high availability in a virtual environment. This includes storage, networking, and processor capacity. Please refer to your virtualization supplier's documentation for best practices.

CACHÉ DATABASE MIRRORING

A mirror consists of two physically independent Caché systems, called failover members. The mirror automatically assigns the role of primary to one of the failover members, while the other member automatically becomes the backup system. Data is replicated from the primary to the backup failover member, thus providing built-in data redundancy. Caché Database Mirroring (Mirroring) is designed to provide an economical solution for rapid, reliable, robust, automatic failover between two Caché systems for planned and unplanned outages.

Mirroring additionally allows asynchronous replication to other members called async members. Async members can be used to meet a variety of demands including disaster recovery, reporting, data warehousing, and business intelligence. Async members are not available for automatic failover, but async members that are designated for disaster recovery can be quickly promoted to take over as part of your disaster recovery procedures. For more information on the disaster recovery features of mirroring (specific to versions 2013.1 and later), see the Caché documentation sections on Promoting a DR Async Member to Failover Member and Mirror Outage Procedures. The remainder of the discussion of mirroring in this document pertains to failover members and the high availability features of mirroring.

Pros:
- Rapid, automatic, and safe failover for almost any type of hardware failure, operating system failure, or Caché failure
- Allows for minimum-downtime Caché upgrades
- Data replication protects against storage failure on the primary
- Failover time is typically seconds, providing fast application mean time to recovery
- Can be less expensive than clustering solutions
- Logical data replication can protect against physical corruption being carried forward to the other system
- Failover members may be in separate data centers, possibly allowing for HA and DR goals to be met with only two servers (allowable latency is dependent on the application)
- Async members for disaster recovery and reporting allow you to meet multiple needs with one technology

Cons:
- Only databases are automatically replicated; external files needed by the application (e.g., file streams and images) need a third-party replication solution
- Security and configuration management is currently decentralized

MIRRORING FAILOVER STRATEGIES

Mirroring can be used to meet a variety of high availability needs. The strategy for meeting these demands will encompass the mirroring settings, hardware configuration, data center configuration, and sometimes manual procedures.

In all cases, before the backup can take over as primary, it must be definitively determined, either automatically through software or through manual intervention, that the primary failover member is down and that the backup failover member has all of the journal data that the primary failover member has durably committed. The mechanism for making that determination differs for each of the mirroring failover strategies described. For more details, see the Caché documentation section on Automatic Failover Mechanics.

The remainder of this section describes the various mirroring failover strategies. The general mirroring pros and cons listed above apply to each of the failover strategies; specific pros and cons for each strategy are listed separately below.

FAILOVER WITH MIRROR ARBITER

Starting in version 2015.1, mirroring employs a separate system called the arbiter to provide safe, built-in, automatic failover under scenarios in which communication between the failover members themselves is not possible: when the primary's host has either failed or become network-isolated. If the arbiter is not configured, the arbiter is down, or the backup system was not up to date at the time of the failure, mirroring automatically falls back to the mode of operation described in Failover with ISCAgent Only until the failover members are connected to the arbiter and caught up.

Pros:
- Provides rapid failover in almost any failure scenario
- Completely safe failover; no risk of split-brain (that is, two servers both acting as primary)
- No specialized hardware or software needed
- Failover members may be in separate data centers, possibly allowing for HA and DR goals to be met with only two servers (allowable latency is dependent on the application)
- Mirror continues to operate normally if the arbiter fails (ISCAgent-based failover can still occur until the arbiter becomes available again)

Cons:
- If failover members are in separate data centers, a third location should be used for the arbiter in order to allow automatic failover after complete data center failure

To implement this strategy, identify and configure a host to act as arbiter as described in the Caché documentation section on Locating the Arbiter to Optimize Mirror Availability.
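The fallback rule described above can be illustrated with a minimal Python sketch. This is not InterSystems code or a Caché API: the `MirrorState` type, its field names, and the mode strings are hypothetical, chosen only to restate the three conditions in the text.

```python
from dataclasses import dataclass


@dataclass
class MirrorState:
    """Hypothetical snapshot of the conditions named in the text."""
    arbiter_configured: bool
    arbiter_reachable: bool   # failover members are connected to the arbiter
    backup_caught_up: bool    # backup was up to date at the time of failure


def failover_mode(state: MirrorState) -> str:
    """Return which failover mechanism applies under the stated rule.

    Arbiter-based failover applies only when all three conditions hold;
    otherwise the mirror falls back to ISCAgent-only behavior.
    """
    if (state.arbiter_configured
            and state.arbiter_reachable
            and state.backup_caught_up):
        return "arbiter"
    return "agent-only"


if __name__ == "__main__":
    print(failover_mode(MirrorState(True, True, True)))
    print(failover_mode(MirrorState(True, False, True)))
```

The point of the sketch is that the fallback is conservative: any single missing condition moves the mirror to the ISCAgent-only mode rather than blocking failover entirely.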

FAILOVER WITH ISCAGENT ONLY

When the backup mirror member detects a failure of the primary, it attempts to contact the ISCAgent on the primary machine. If the backup successfully contacts the ISCAgent, it can then confirm that the primary is down or force it down if it is unresponsive, download any journal information required for it to be fully caught up, and safely take over as primary.

If the ISCAgent cannot be contacted (for example, if the primary server is down), failover does not occur. Of course, the administrator can take manual steps to confirm that the primary is down and that the backup has the necessary journal data, then initiate failover. See Appendix D for instructions on Manual Failover After Unplanned Outage of Primary.

Pros:
- Completely safe failover; no risk of split-brain (that is, two servers both acting as primary)
- No specialized hardware or software needed
- Allows rapid failover after Caché shutdown, a hung Caché instance, and many hardware or software failures that prevent Caché from working, so long as the ISCAgent remains reachable from the backup member
- Failover members may be in separate data centers, possibly allowing for HA and DR goals to be met with only two servers (allowable latency is dependent on the application)

Cons:
- No automatic failover occurs after a failure that renders the ISCAgent unreachable, such as host failure
- If the primary host is unavailable, it can be difficult to determine whether the backup has all the required journal information in order to verify that it is safe to initiate manual failover

In 2015.1 and later this is the default strategy until you configure an arbiter. To implement this strategy in versions prior to 2015.1, leave the Agent Contact Required for Takeover configuration setting at YES (the default).
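As an illustration only, the takeover decision in this strategy can be sketched in Python. The `AgentReport` type and `backup_may_take_over` function are hypothetical names, not part of Caché or the ISCAgent; the sketch assumes the outcome of contacting the agent can be reduced to two facts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentReport:
    """Hypothetical result of a successful contact with the primary's agent."""
    primary_down: bool       # primary confirmed down, or forced down
    journal_complete: bool   # backup obtained all required journal data


def backup_may_take_over(report: Optional[AgentReport]) -> bool:
    """Decide whether automatic takeover is safe.

    `None` models an unreachable agent (for example, the primary host is
    down). In that case automatic failover does not occur, and a manual
    procedure is needed instead.
    """
    if report is None:
        return False
    return report.primary_down and report.journal_complete


if __name__ == "__main__":
    print(backup_may_take_over(AgentReport(True, True)))
    print(backup_may_take_over(None))
```

Both conditions must hold together: knowing the primary is down is not enough if the backup might be missing journal data the primary had durably committed.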

FAILOVER WITH CUSTOM SOLUTION: RELIABLE NETWORK PING

Important: Starting in version 2015.1, this strategy is no longer available, and sites using this strategy will, upon upgrading, need to switch to Failover with Mirror Arbiter, a simpler and safer way to achieve the same goals.

When the backup mirror member detects a failure of the primary, it attempts to contact the ISCAgent on the primary machine. If the backup successfully contacts the ISCAgent, it can then confirm that the primary is down or force it down if it is unresponsive, download any journal information required for it to be fully caught up, and safely take over as primary.

If the ISCAgent cannot be contacted (for example, if the primary server is down), network pings over the public and private networks are used to determine the status of the primary server (this requires custom programming, which is implemented in IsOtherNodeDown^ZMIRROR()). If the primary does not respond to the pings on either the public or the private network, the backup assumes that the primary is down and takes over as primary. Because the lack of ping response from the primary server does not strictly guarantee that the server is down, there is a risk of split-brain (two servers simultaneously acting as primary) that cannot be completely eliminated with this strategy. Other mirroring failover strategies discussed in this document carry no risk of split-brain. To minimize the risk, the following is required:

- The networking between the failover members must be redundant, reliable, and highly available.
- The failover members should be hosted directly on physical machines, not on a virtualization platform; on virtualized platforms, activity at the host/hypervisor level may cause a member to become temporarily unresponsive to ping while it is still running. See the Hybrid HA Strategy for information on how to safely extend mirroring in its default configuration to provide higher availability in a virtualized environment.

Pros:
- Allows rapid failover following server/host failure

Cons:
- Requires implementation of a custom ZMIRROR routine (InterSystems can provide a sample)
- Requires specialized networking hardware configuration to provide very robust networking
- Recommended that the failover machines are located in the same data center to avoid network isolation
- The risk of split-brain (two servers both acting as primary) cannot be completely eliminated

To implement this strategy:

1. Create a hardware configuration that provides an extremely reliable network between the primary and backup failover members. Please reference Appendix A for an example of a reliable network configuration between two failover members.
2. Customize the sample implementation of IsOtherNodeDown^ZMIRROR() from the routine provided in Appendix C. In any scenario under which the ping mechanism cannot adequately determine that the primary is down, your implementation must assume that it is up so that automatic failover will not occur. Of course, the administrator can take manual steps to determine that the primary is down and that the backup has the necessary journal data, and then initiate failover. See Appendix D for instructions on Manual Failover After Unplanned Outage of Primary.
3. Set Agent Contact Required for Takeover to NO.
4. Adjust Trouble Timeout Limit to allow sufficient time for the IsOtherNodeDown^ZMIRROR mechanism to operate. For example, while testing failover you may notice a message similar to the following in the cconsole.log file on the backup failover member: Mirror recovery time of 7.101 seconds exceeded trouble timeout of 6 seconds. Restarting. In this example, you might consider increasing the Trouble Timeout Limit to 8 seconds.
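The "assume up when in doubt" rule from step 2 can be sketched as follows. This is a Python illustration of the decision logic only, not the ZMIRROR sample from Appendix C; the function name, the injected `ping` callable, and the example addresses are all hypothetical.

```python
from typing import Callable, Iterable


def is_other_node_down(addresses: Iterable[str],
                       ping: Callable[[str], bool]) -> bool:
    """Report the primary as down only when that is adequately determined.

    `addresses` holds the primary's public and private network addresses.
    `ping` is a caller-supplied probe returning True when the address
    answers. Any doubt (too few networks, a probe error, a single reply)
    yields False so that automatic failover does not occur.
    """
    addrs = list(addresses)
    if len(addrs) < 2:
        # Without both networks the check is inconclusive: assume up.
        return False
    for addr in addrs:
        try:
            if ping(addr):
                return False   # primary answered on this network
        except Exception:
            return False       # the probe itself failed: assume up
    return True                # no answer on any network


if __name__ == "__main__":
    # Hypothetical addresses for the public and private networks.
    nets = ["192.0.2.10", "10.0.0.10"]
    print(is_other_node_down(nets, lambda addr: False))
    print(is_other_node_down(nets, lambda addr: addr == "10.0.0.10"))
```

Passing the probe in as a parameter keeps the decision rule testable without real network access. Even with this conservative rule, a silent primary is not proof of a dead primary, which is why the split-brain risk described above remains.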

HYBRID HA STRATEGY

Database Mirroring can be used in conjunction with Virtualization HA to provide extremely robust high availability strategies for planned and unplanned outages.

Database Mirroring provides the first line of defense with rapid automatic failover for planned and unplanned outages. Virtualization HA automatically restarts the virtual machine hosting a mirror member following unplanned machine or OS outages, making the failed member available again to act as
