Migrating workloads to Oracle Cloud Infrastructure (OCI) requires a thoughtful approach to ensure a smooth and successful transition. First, organizations should evaluate their current infrastructure and identify which workloads are suitable for migration. It is crucial to understand the dependencies and relationships between different applications and services. Additionally, it is essential to evaluate performance requirements and scalability needs to select the appropriate OCI services and configurations. Security is of utmost concern, and organizations must implement robust identity and access management, encryption, and network security measures to protect sensitive data. Cost considerations play a significant role in the decision-making process, so optimizing resource usage and selecting the right pricing model are vital for financial efficiency. Adequate planning for data migration and testing is necessary to minimize downtime and mitigate potential risks. Finally, having a well-defined rollback strategy in case of unforeseen issues is essential to ensure business continuity. Overall, a comprehensive approach that addresses technical, security, financial, and operational aspects is key to a successful migration to OCI.
General
• Customer technical expertise
• Timing and downtime expectations
• Business constraints
Setting expectations for a migration project is critical.
Depending on the characteristics of the source environment, the downtime can be considerable, so it’s important to make plans for the preparation, the actual migration, and the validation of the new environment before the cutover.
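A rough estimate of the copy time helps set downtime expectations early. The sketch below is only back-of-the-envelope arithmetic: the function names, the 70% link efficiency default, and the preparation and validation hours are illustrative assumptions, not measured values or Oracle guidance.

```python
def transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate hours needed to copy data_gb (decimal gigabytes) over a link.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually usable (protocol overhead, shared links, throttling).
    """
    if data_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("sizes and rates must be positive, efficiency in (0, 1]")
    total_bits = data_gb * 8 * 1000**3            # GB -> bits
    usable_bits_per_second = link_mbps * 1000**2 * efficiency
    return total_bits / usable_bits_per_second / 3600


def cutover_window_hours(data_gb: float, link_mbps: float,
                         prep_hours: float, validation_hours: float,
                         efficiency: float = 0.7) -> float:
    """Preparation + data copy + validation, the three phases named above."""
    return prep_hours + transfer_hours(data_gb, link_mbps, efficiency) + validation_hours


if __name__ == "__main__":
    # Hypothetical example: 2 TB over a 1 Gbps link, 4 h prep, 6 h validation.
    print(f"{cutover_window_hours(2000, 1000, 4, 6):.1f} hours")
```

If the estimate exceeds the agreed downtime window, that is usually a signal to look at an incremental or replication-based migration method rather than a single bulk copy.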
Environment Information
• Development
• Test
• Production
• Others, such as Training
Knowing the purpose of the environment can help in any required architecture redesign and can help determine downtime requirements.
Data Region Location
• Availability of Oracle Cloud Infrastructure data center
• Data sovereignty and regulatory compliance
Services used
• IaaS only
• IaaS and PaaS
• Lift and shift applications
• EBS, Hyperion, Primavera
• Fusion Middleware
Identifying the main applications and services running within an environment helps to determine the most appropriate migration strategy for each workload.
Network Considerations
General network requirements
• Shared network usage
• Number of IP networks
• Number of external IPs
• Bandwidth requirements
• Load Balancer information
• Outbound proxy for external Internet access
• Communication between data centers
• DNS usage
Network security
• Security policies that exist in this environment
• Security rules, security lists (Shared network)
• ACLs, Virtual NIC Sets, and IP Prefix Sets (IP Network)
• Additional security features needed beyond layer-3/4 filtering
• Additional filtering needed (for example, layer-7)
The configuration of security rules is especially important and can introduce a layer of complexity to a migration project. It’s important to understand that there is not necessarily a 1-to-1 mapping of these features from the Oracle Cloud Infrastructure Classic network to the VCN on Oracle Cloud Infrastructure.
On-premises to Oracle Cloud Infrastructure target environment connectivity
• FastConnect
• IPsec VPN
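When the on-premises network is connected to a VCN over FastConnect or an IPsec VPN, overlapping address ranges cause routing problems, so it is worth checking the planned CIDR blocks up front. The following is a minimal sketch using Python's standard `ipaddress` module; the network names and CIDR values are made up for illustration.

```python
from ipaddress import ip_network
from itertools import combinations


def find_overlaps(networks: dict[str, str]) -> list[tuple[str, str]]:
    """Return every pair of named networks whose CIDR ranges overlap."""
    parsed = {name: ip_network(cidr) for name, cidr in networks.items()}
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(parsed.items(), 2)
        if net_a.overlaps(net_b)
    ]


if __name__ == "__main__":
    # Hypothetical address plan (IPv4 only).
    plan = {
        "on_prem": "10.0.0.0/16",
        "vcn_prod": "10.0.128.0/20",   # falls inside on_prem, so it is reported
        "vcn_dr": "172.16.0.0/16",
    }
    for a, b in find_overlaps(plan):
        print(f"overlap: {a} <-> {b}")
```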
Database Considerations
General
• Number of databases to migrate
• Purpose of each database
• Dependencies (what applications depend on each database)
• Average size of each database
Oracle Databases
• Type of database deployment: DBCS, ExaCS, Dedicated Infrastructure, ExaC&C, Autonomous DB, MySQL, or on-premises software installed on a VM?
• Version and Edition of each database
• License Type
• Migration Method: Zero Downtime Migration, Backup-Restore, DataPump, GoldenGate…
Third-party Databases (if any):
• Brand, version, and edition of each third-party database
• Migration Method
• If the database is compatible, are there any restrictions that would prevent the use of GoldenGate or Data Guard as the primary assisting tool for migrating the data?
• What is the backup method and schedule for each database?
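Once the databases and the applications that depend on them are listed, the dependency information can be turned into an ordered set of migration waves. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (3.9+); the workload names are hypothetical, and "migrate dependencies first" is one possible ordering policy, not a rule from this checklist.

```python
from graphlib import TopologicalSorter


def migration_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group workloads into waves so each one moves after what it depends on.

    depends_on maps a workload to the set of workloads it requires.
    A circular dependency raises graphlib.CycleError.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    # Hypothetical inventory: applications and the databases they use.
    inventory = {
        "erp_db": set(),
        "dw_db": set(),
        "erp_app": {"erp_db"},
        "reporting": {"erp_db", "dw_db"},
    }
    for number, wave in enumerate(migration_waves(inventory), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

A cycle error here is useful information in itself: it points at workloads that probably have to move together in the same cutover.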
Virtual Machine Considerations
Questions
• How is access to the instance secured? For example: SSH for Linux instances.
• Is there a bastion host? For accessing this instance, a best practice is to configure a bastion (jump) host.
• Are there any candidates to migrate to Autonomous Linux (which handles patching automatically)?
• How is the system patched? Are systems patched after initial provisioning?
• Is there a way to audit the fleet of VMs for patches?
• How do you tell which VMs need additional patches, especially CVE patches?
• Is malware / anti-virus installed? Which anti-virus vendor?
• How are system level logs captured? Syslogs for Unix. Event logs for Windows.
• Ideally connect to a log analytics system (Splunk, ELK, Graylog, …)
• Is the image hardened? Review the CIS Benchmarks (https://www.cisecurity.org/cisbenchmarks/) for hardening systems.
• What monitoring of the system is in place? At a minimum CPU / memory / disk should be monitored. A better solution would be to alert based on these metrics. The best solution would be to provide a mechanism for auto scaling.
• Is there a firewall running on this instance? Local firewall setting may affect remote access independent of any network security rule.
• Does the system sync time using NTP? Verify the NTP servers are accessible from Oracle Cloud Infrastructure or consider using the Oracle Cloud Infrastructure NTP service.
• How are the attached disks backed up? Verify there is a plan for backup / restore.
• Are fault domains being leveraged? Verify that fault domains are being considered as compute instances are provisioned.
Block Storage Considerations
Questions
• Verify performance (IOPS, latency, throughput) is reasonable for your workload.
• You can use fio or the CloudHarmony benchmarks to gather benchmark numbers.
• Verify block volume backup plan. Ideally this should be automated or use policy-based backups.
• When using iSCSI, always enable CHAP authentication for security.
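To compare benchmark runs against workload requirements, the fio results can be summarized programmatically. The sketch below assumes fio was run with `--output-format=json` and that the report uses the `jobs[].read` / `jobs[].write` layout with `iops`, `bw` (KiB/s), and `lat_ns.mean` fields, as in recent fio 3.x releases; field names can differ between versions, so check them against your own output before relying on this.

```python
import json


def summarize_fio(report_json: str) -> list[dict]:
    """Extract IOPS, throughput, and mean latency per job and I/O direction."""
    report = json.loads(report_json)
    rows = []
    for job in report["jobs"]:
        for direction in ("read", "write"):
            stats = job[direction]
            if stats["iops"] == 0:          # direction not exercised by this job
                continue
            rows.append({
                "job": job["jobname"],
                "direction": direction,
                "iops": round(stats["iops"]),
                "throughput_mib_s": round(stats["bw"] / 1024, 1),      # KiB/s -> MiB/s
                "mean_latency_ms": round(stats["lat_ns"]["mean"] / 1e6, 2),
            })
    return rows


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python summarize_fio.py result.json
    with open(sys.argv[1]) as handle:
        for row in summarize_fio(handle.read()):
            print(row)
```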
Custom Image Considerations
Questions
• Be aware of the limitations (size, reserved IP addresses, Windows export…) of custom images.
• Since images can be shared across regions, upload images only as needed for start-up time.
• Trade off management of images versus start-up time for a new instance.
Application-Level Disaster Recovery Considerations
Considerations for OCI migration.
• Is the application accessed via a DNS FQDN or by IP address directly?
• Will failover between prod and DR be accomplished by making DNS changes?
• Are there any other IP requirements between DR, prod and any other environments or are these largely undefined/non-existent (such as using the same IP addressing for both prod and DR, etc.)?
• Please provide a list of all applications that will be running in this Oracle Cloud Infrastructure environment.
• Specify where each application currently resides (on-prem, other cloud, etc.).
High Availability Considerations for Compute
To achieve high availability for compute instances in Oracle Cloud Infrastructure (OCI), you can follow these step-by-step recommendations:
1. Availability Domains (ADs):
• Recommendation: Deploy your compute instances across multiple availability domains (ADs) within a region. ADs provide fault isolation and ensure redundancy in case of failures.
2. Fault Domains:
• Recommendation: Within each AD, distribute your compute instances across multiple fault domains. Fault domains provide additional fault isolation by placing instances on different hardware within the AD.
3. Load Balancers:
• Recommendation: Utilize OCI Load Balancers to distribute incoming traffic across multiple compute instances, enhancing availability and scalability. Configure health checks and session persistence as needed.
4. Auto Scaling:
• Recommendation: Implement OCI Auto Scaling to adjust the number of compute instances automatically based on predefined policies and metrics. This ensures that your application can handle varying workloads and maintains performance during traffic spikes.
5. Backup and Recovery:
• Recommendation: Set up regular backups of the boot and block volumes attached to your compute instances using OCI Block Volume backups, which are stored in Object Storage. This allows you to recover in case of instance failures or data loss. Consider policy-based or otherwise automated backups to streamline the process.
6. Virtual Cloud Network (VCN) and Subnets:
• Recommendation: Design and configure your VCN and subnets to provide redundancy and fault tolerance. Utilize regional subnets to distribute instances across ADs and ensure connectivity during AD-level failures.
7. Instance Configuration:
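• Recommendation: Choose OCI instance shapes that match your availability and performance requirements, such as shapes with multiple physical NICs for network redundancy or shapes with local NVMe SSD storage for improved performance. Data on local NVMe devices is not replicated by the platform, so protect it with application-level redundancy or backups.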
8. Monitoring and Notifications:
• Recommendation: Enable monitoring for your compute instances using OCI Monitoring service. Set up alerts and notifications for critical metrics, such as CPU utilization, network traffic, and instance health status. Configure email or notification channels to receive real-time alerts.
9. Block Volume Replication:
• Recommendation: Use OCI Block Volume replication for volumes that store critical data or application files. Replication asynchronously copies block volumes, boot volumes, and volume groups to another availability domain or region, so the data remains available if the primary location fails.
10. Multi-Region Disaster Recovery (DR):
• Recommendation: If high availability across regions is required, consider implementing a multi-region disaster recovery strategy. Replicate your compute instances and data to a secondary region using replication and synchronization tools, such as Oracle GoldenGate or Oracle Data Guard for databases.
It’s important to note that achieving high availability also involves considering application-level design and implementing proper fault tolerance mechanisms within your applications.
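The placement recommendations in steps 1 and 2 can be sketched as a simple round-robin plan that spreads instances across availability domains first and fault domains second. This is planning logic only and makes no OCI API calls; the instance names and the short AD labels are hypothetical (real AD names carry a tenancy-specific prefix), and the three fault domain names follow OCI's `FAULT-DOMAIN-n` convention.

```python
from itertools import cycle

FAULT_DOMAINS = ("FAULT-DOMAIN-1", "FAULT-DOMAIN-2", "FAULT-DOMAIN-3")


def spread_instances(names: list[str],
                     availability_domains: list[str]) -> dict[str, tuple[str, str]]:
    """Assign each instance an (availability domain, fault domain) pair.

    Consecutive instances land in different ADs where possible, then in
    different fault domains once every AD has been used.
    """
    if not availability_domains:
        raise ValueError("at least one availability domain is required")
    slots = cycle([(ad, fd) for fd in FAULT_DOMAINS for ad in availability_domains])
    return dict(zip(names, slots))


if __name__ == "__main__":
    placement = spread_instances(["web1", "web2", "web3", "web4"], ["AD-1", "AD-2"])
    for name, (ad, fd) in placement.items():
        print(f"{name}: {ad} / {fd}")
```

In a region with a single availability domain, the same function still spreads instances across the three fault domains.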
Business Continuity and DR Considerations using FSDR
Designing and implementing a full-stack disaster recovery (FSDR) solution in Oracle Cloud Infrastructure (OCI) involves several prerequisites, planning, and execution steps. Here is a step-by-step guide for setting up a full-stack DR solution in OCI:
Prerequisites:
1. Identify Recovery Point Objective (RPO) and Recovery Time Objective (RTO):
2. Select a Secondary Region:
• Choose a secondary OCI region that will serve as your DR site. Consider geographic separation and the availability of the required OCI services in the secondary region.
3. Connectivity and Networking:
• Establish a secure and robust network connection between your primary and secondary regions. Set up connections to ensure seamless communication between sites.
4. Identify Critical Applications and Services:
• Identify the critical applications, databases, and services that require DR protection. Determine the dependencies between them and prioritize their replication and recovery.
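Once RPO and RTO are agreed, a candidate design can be checked against them with simple arithmetic: worst-case data loss is roughly the replication lag (or backup interval), and recovery time is roughly the sum of the recovery steps. The sketch below illustrates that check; the class name, field names, and sample numbers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrTargets:
    rpo_minutes: float   # maximum tolerable data loss
    rto_minutes: float   # maximum tolerable time to restore service


def check_targets(targets: DrTargets,
                  replication_lag_minutes: float,
                  recovery_step_minutes: list[float]) -> dict:
    """Compare an estimated design against the RPO/RTO targets."""
    worst_case_loss = replication_lag_minutes
    recovery_time = sum(recovery_step_minutes)   # steps assumed to run sequentially
    return {
        "worst_case_loss_minutes": worst_case_loss,
        "recovery_time_minutes": recovery_time,
        "rpo_met": worst_case_loss <= targets.rpo_minutes,
        "rto_met": recovery_time <= targets.rto_minutes,
    }


if __name__ == "__main__":
    # Hypothetical targets: lose at most 15 minutes of data, recover within 1 hour.
    targets = DrTargets(rpo_minutes=15, rto_minutes=60)
    # Hypothetical steps: database switchover, compute start-up, DNS change.
    print(check_targets(targets, replication_lag_minutes=5,
                        recovery_step_minutes=[10, 25, 20]))
```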
Plan Design:
1. Replication Strategy:
• Determine the appropriate replication strategy for your data. OCI provides options like Oracle Data Guard, Oracle GoldenGate, or object storage replication for different workloads. Choose the method that meets your RPO and RTO requirements.
2. Compute Instance Replication:
• Determine the replication approach for your compute instances. Options include OCI Block Volume Group Replication, automation tools, third-party replication tools, or container-based solutions like Docker for application-level replication.
3. Database Replication:
• Set up database replication using Oracle Data Guard or Oracle GoldenGate. Configure real-time data replication and failover capabilities to ensure data consistency and availability in the DR site.
4. Stand Alone Storage Replication:
• If your applications rely on shared file systems or block storage, consider replication mechanisms such as OCI Block Volume replication and File Storage replication. For on-premises to cloud replication, consider ZFS Appliance, OCVS-certified tools (such as Veeam or CommVault), or storage vendor-specific replication technologies that run in compatibility mode or are certified by Oracle.
5. Load Balancer and Traffic Management:
• Configure OCI Load Balancers and Traffic Management policies to distribute traffic between the primary and secondary regions. Set up health checks and failover rules to ensure seamless failover in case of a disaster.
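A failover rule is often expressed as "fail over only after N consecutive failed health checks" so that a single transient error does not trigger a region switch. The sketch below shows that decision logic in isolation; it is not how OCI Load Balancers or Traffic Management evaluate health checks internally, and the default threshold of three failures is an arbitrary assumption.

```python
def should_fail_over(health_checks: list[bool], failures_required: int = 3) -> bool:
    """Return True when the most recent `failures_required` checks all failed.

    health_checks is ordered oldest to newest; True means the primary
    region answered the check successfully.
    """
    if failures_required <= 0:
        raise ValueError("failures_required must be positive")
    recent = health_checks[-failures_required:]
    return len(recent) == failures_required and not any(recent)


if __name__ == "__main__":
    history = [True, True, False, False, False]
    print("fail over" if should_fail_over(history) else "stay on primary")
```

Raising the threshold reduces false failovers but adds to the recovery time, so the value should be chosen with the RTO in mind.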
Execution:
1. Provision Resources in the Secondary Region:
Set up the necessary compute instance replication mechanisms, databases, storage volume replication, and networking components in the secondary region. Ensure that configurations match those in the primary region.
2. Configure Replication:
Set up and configure the chosen replication technology (e.g., Oracle Data Guard, Oracle GoldenGate) to replicate data from the primary to the secondary region. Test replication and ensure synchronization.
3. Create the FSDR Plan
Generate the Scripts
4. Test and Validate:
Perform regular testing of your DR setup to validate its effectiveness. Conduct planned failover and failback exercises to ensure the integrity and reliability of your DR solution.
5. Monitoring and Maintenance:
Implement monitoring and alerting mechanisms to track the health and performance of your DR environment. Regularly review and update your DR plan based on changes in your infrastructure or applications.
6. Documentation and Communication:
Document the FSDR plan, including recovery procedures, contact information, and escalation paths. Communicate the DR plan to relevant stakeholders and ensure they are aware of their roles and responsibilities during a disaster event.
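A DR plan is essentially an ordered list of steps that must stop as soon as one step fails, which also makes it easy to rehearse. The sketch below is a generic runbook runner written for illustration; it does not use the FSDR service or its API, and the step names and placeholder actions are hypothetical.

```python
from typing import Callable

Step = tuple[str, Callable[[], bool]]


def run_plan(steps: list[Step]) -> list[str]:
    """Run steps in order, stopping at the first failure, and return a log."""
    log = []
    for name, action in steps:
        succeeded = action()
        log.append(f"{name}: {'ok' if succeeded else 'FAILED'}")
        if not succeeded:
            break        # later steps depend on earlier ones, so do not continue
    return log


if __name__ == "__main__":
    # Placeholder actions; a real plan would call scripts or service APIs here.
    plan: list[Step] = [
        ("switch over database", lambda: True),
        ("start compute in secondary region", lambda: True),
        ("update DNS records", lambda: True),
    ]
    for line in run_plan(plan):
        print(line)
```

Keeping the log from each test run alongside the documented plan gives stakeholders evidence of when the procedure was last exercised and how long it took.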
Regularly review and update your DR solution to account for changes in your infrastructure, applications, or business requirements. Conduct periodic audits and tests to validate the effectiveness of your DR strategy and ensure compliance with industry regulations.