Site Reliability Engineer (SQL Server DBA)
Location: Remote - WFH, Canada
Job Summary
We are seeking an experienced Site Reliability Engineer (SRE) to join our Data Center Engineering team at Level 3. This role requires a technically strong and operationally mature engineer who will help design, scale, and maintain the reliability of our physical and virtual data center infrastructure. As a Level 3 SRE, you will be a technical leader responsible for ensuring system uptime, optimizing capacity and performance, and contributing to long-term infrastructure resiliency.
Key Responsibilities
• Architect, deploy, and support SQL Server Always On Availability Groups across on-prem and Azure environments.
• Install, configure, and maintain SQL Server Failover Cluster Instances (FCI).
• Administer Windows Server Failover Clustering (WSFC), quorum, and node health.
• Plan and execute failover testing, DR drills, and recovery validation.
• Perform SQL Server installation, upgrades, patching, and migrations.
• Manage backup, restore, and disaster recovery strategies.
• Monitor and tune database performance (indexing, query optimization, waits).
• Troubleshoot replication, latency, blocking, and cluster-related issues.
• Configure and manage AG listeners, networking, DNS, and storage.
• Ensure database security, auditing, and compliance.
• Automate DBA tasks using T-SQL, PowerShell, and SQL Agent.
• Support application deployments and schema changes.
• Provide 24×7 production support for mission-critical systems.
• Maintain runbooks, standards, and operational documentation.
Education and Experience
• Bachelor's degree in Computer Engineering, Electrical Engineering, Information Technology, or a related technical field.
• 4-7 years of experience in database administration and operations.
• Experience participating in or leading incident response and postmortem analysis processes.
• Previous exposure to hybrid environments integrating on-premises data centers with public or private cloud platforms is desirable.
• 4+ years of experience with SQL Server Database.
• 4+ years of experience with at least one other RDBMS (Oracle, PostgreSQL).
Expertise
• Hands-on experience with backup, recovery, and HA solutions.
• Strong proficiency in Linux and Debian environments.
• Proficiency in scripting for database automation.
• Excellent analytical, problem-solving, and troubleshooting skills.
• Strong communication skills for cross-team collaboration.
• Understanding of Oracle and MySQL databases is a plus, but not mandatory.