Principal Site Reliability Engineer, Data Protection Products

Remote Full-time
ConnectWise is a global, industry-leading software company with over 3,000 colleagues in North America, EMEA, and APAC. As a community-driven software company dedicated to the success of technology solution providers, our suite helps over 45,000 of our partners manage their businesses better, sell more efficiently, automate service delivery, and remotely control technology so they can consistently deliver amazing customer experiences.
Our company is powered by our connections, our colleagues, and our community. And, we accept all kinds.
Game-changers, innovators, culture-lovers—and humankind.
We invite discovery and debate. We recognize key moments as milestones.
We see you and value you for your unique contributions. Our inclusive, positive culture lays the foundation to ensure every colleague is valued for their perspectives and skills, giving you the choice of how YOU make a difference.
Curious? Read on to learn how YOU can make a difference at ConnectWise!

General Summary:
As a Site Reliability Engineer, you will work as an integral member of product teams, helping to build, deploy, and monitor cloud services reliably. You will contribute to complex software development projects to maintain essential, revenue-critical services. Additionally, you will actively develop code and build frameworks to monitor services deployed in production, driving reliability and performance at large scale. You will be responsible for ensuring the reliability, availability, and performance of our Elasticsearch infrastructure. We're seeking a talented Site Reliability Engineer who can work with minimal supervision, define test procedures, and collaborate effectively with Developers, Designers, Customer Support, and Engineering Leadership.
Essential Duties and Responsibilities:

· Build systems and infrastructure to monitor complex, large-scale distributed systems.
· Identify stability/performance issues and collaborate with developers to triage critical issues in production systems.
· Represent the SRE organization in design reviews and operational readiness exercises for new and existing services.
· Devise ways to actively monitor system throughput, capacity, and reliability.
· Debug complex systems and evolve a running environment without causing downtime.
· Engage in service capacity planning and demand forecasting, as well as software performance analysis and system tuning.
· Drive standardization efforts across multiple disciplines and services in conjunction with embedded SREs throughout the organization.
· Monitor and troubleshoot Elasticsearch performance issues and outages.

Who You Are
· Bachelor’s degree in Computer Science or equivalent work experience as a System Administrator with programming skills.
· Fundamental knowledge of technologies across a broad range of disciplines, including virtualization, storage, networking, servers, and security.
· Understanding of systems and application design, including the operational trade-offs of various designs.
· Experience with monitoring and logging solutions such as Prometheus, Grafana, and ELK stack.
· Proficiency in scripting languages such as Python.
· Experience with infrastructure-as-code tools such as Terraform or CloudFormation.
· Strong understanding of Linux system administration and networking concepts.
· Excellent troubleshooting and problem-solving skills.
· Ability to work independently and collaboratively in a fast-paced environment.
· Strong communication and interpersonal skills.
· Demonstrable knowledge of Unix, TCP/IP, HTTP, web application security, and experience supporting multi-tier web application architectures.
· Experience in analyzing logs and troubleshooting large-scale distributed systems.
· Excellent organizational and time management skills.
Nice to Have
· Experience with instrumenting and monitoring production systems using tools such as ELK stack, Zabbix, Nagios, StatsD/Graphite, and APM.
· Experience with AWS infrastructure (including EC2, S3, VPC, Security Groups, RDS) and related services.
· A working understanding of Docker, Vagrant, and configuration management tools like Ansible, Chef, or Puppet.
· Experience with one or more general-purpose programming/scripting languages, including but not limited to Python, Bash, Perl, or Go.
Benefits include:
· Medical Insurance
· Flexible PTO
· Flex Friday
· Hybrid Work Option Available
· Tuition Reimbursement
· And more!

ConnectWise is an Equal Opportunity Employer, dedicated to building a diverse and inclusive workforce and providing a workplace free from discrimination and harassment. ConnectWise provides equal employment opportunities to all employees and applicants without regard to race, ethnicity, color, religion, age, sex (including pregnancy), sexual orientation, gender, gender identity or expression, ancestry, national origin, citizenship status, physical or mental disability, genetic information, military/veteran status, marital status, familial or parental status, or any other characteristic or status protected by applicable federal, state and local laws.
The statements above are intended to describe the general nature and level of work being performed by individuals assigned to this job. Other duties may be assigned as needed. Reasonable accommodations may be made to enable qualified individuals with disabilities to perform the essential functions of the job and/or to receive other benefits and privileges of employment. If you need a reasonable accommodation for any part of the application and hiring process, please contact us at [email protected] or 1-800-671-6898.

