[Remote] Network Engineer - Network Resiliency and High Availability

Remote Full-time
Note: This is a remote position open to candidates in the USA.

Dice is seeking a Senior Network Engineer specializing in Network Resiliency and High Availability to ensure its global network infrastructure remains fault-tolerant and capable of seamless disaster recovery. The role involves designing, validating, and optimizing redundant paths and high-availability clusters while ensuring zero packet loss for business-critical applications during unforeseen failures.

Responsibilities
- Design, implement, and maintain high-availability network topologies using physical and logical redundancy patterns (e.g., Multi-Chassis EtherChannel/MCLAG, VPC, and VSS)
- Architect redundant Wide Area Network (WAN) transport paths utilizing dual-homed ISP connections, SD-WAN dynamic path selection, and automated failover technologies
- Conduct controlled Network Chaos Engineering exercises (e.g., simulating fiber cuts, device power failures, and split-brain scenarios) to validate failover timers and resilience assumptions
- Optimize enterprise routing protocols (BGP, OSPF, EIGRP) for ultra-fast convergence, tuning features like Bidirectional Forwarding Detection (BFD), Fast Reroute (FRR), and Graceful Restart
- Implement First Hop Redundancy Protocols (HSRP, VRRP, GLBP) to guarantee default gateway redundancy for end-user and server segments
- Manage complex traffic engineering strategies (e.g., BGP local preference, AS-path prepending) to ensure predictable asymmetric/symmetric routing during failure states
- Lead the network engineering track for Corporate Disaster Recovery planning, including active-active and active-passive data center strategies
- Design, configure, and maintain automated DNS-based failover (GSLB) and Anycast routing strategies to reroute user traffic away from degraded data centers or cloud regions
- Keep comprehensive, up-to-date documentation on failover runbooks and infrastructure dependency maps
- Deploy advanced monitoring tools to track metrics like Mean Time to Detect (MTTD) and Mean Time to Repair (MTTR)
- Set up telemetry-based alerting (SNMP, gRPC/Streaming Telemetry) to identify gray failures (e.g., high interface error rates causing intermittent drops) before they cause total outages

Skills
- 5+ years in a dedicated network engineering or operations role, with a proven track record of designing 99.99% or 99.999% (Four-to-Five Nines) uptime environments
- Bachelor's degree in Computer Science, Computer Engineering, or equivalent practical experience
- Cisco Certified Internetwork Expert (CCIE - Enterprise Infrastructure or Data Center) or strong CCNP with equivalent experience
- Juniper Networks Certified Internetworking Specialist/Expert (JNCIS/JNCIE)
- Certified Business Continuity Professional (CBCP) or equivalent familiarity with DR frameworks is a plus

Company Overview
Dice is the go-to career marketplace for tech professionals. It was founded in 2010, and is headquartered in Drachten, Friesland, NLD, with a workforce of 201-500 employees. Its website is https://www.or-quest.nl/.

Company H1B Sponsorship
Dice has a track record of offering H1B sponsorships, with 2 in 2022, 4 in 2021, 5 in 2020. Please note that this does not guarantee sponsorship for this specific role.

