IT Senior Cloud Operations Engineer

Date: Jun 3, 2025

Location: Erie, PA, US, 16530

Company: Erie Insurance

Division or Field Office:

Office of the CIO

Department of Position: Enterprise Tech Office Dept 

Work from:

Corporate Office in Erie, PA

Salary Range:

$97,388.00 - $155,567.00 *

Salary range is for this level and may vary based on the actual level of the role for which the candidate is hired.

*This range represents a national range and the actual salary will depend on several factors including the scope and complexity of the role and the skills, education, training, credentials, location, and experience of an applicant, as well as level of role for which the successful candidate is hired. Position may be eligible for an annual bonus payment.

 

At Erie Insurance, you’re not just part of a Fortune 500 company; you’re also a valued member of a diverse and inclusive team that includes more than 6,000 employees and over 13,000 independent agencies. Our employees work in the Home Office complex located in Erie, PA, and in our Field Offices that span 12 states and the District of Columbia.

Benefits That Go Beyond The Basics

We strive to be Above all in Service® to our customers—and to our employees. That’s why Erie Insurance offers you an exceptional benefits package, including:

  • Premier health, prescription, dental, and vision benefits for you and your dependents. Coverage begins your first day of work.
  • Low contributions to medical and prescription premiums. We currently pay up to 97% of employees’ monthly premium costs.
  • Pension. We are one of only 13 Fortune 500 companies to offer a traditional pension plan. Full-time employees are vested after five years of service.
  • 401(k) with up to 4% contribution match. The 401(k) is offered in addition to the pension.
  • Paid time off. Paid vacation, personal days, sick days, bereavement days and parental leave.
  • Career development. Including a tuition reimbursement program for higher education and industry designations.
     

Additional benefits include company-paid basic life insurance; short- and long-term disability insurance; orthodontic coverage for children and adults; adoption assistance; fertility and infertility coverage; well-being programs; paid volunteer hours for service to your community; and dollar-for-dollar matching of your charitable gifts each year.

 

Position Summary

Responsible for leading cloud operations tasks, including incident response, automation, and operational improvements. 


This role involves overseeing complex issues, mentoring junior engineers, and driving operational efficiency. 


Collaborates with cross-functional teams and plays a key role in ensuring cloud infrastructure, environments, and workloads are reliable, secure, and optimized for performance.

 

What You'll Do:
We are seeking a proactive and solutions-driven IT Cloud Operations Engineer or IT Sr Cloud Operations Engineer to support our AI Center of Excellence (CoE) within the IT organization. In this role, you will be responsible for maintaining and optimizing the platforms that enable enterprise-scale AI initiatives. You'll play a key part in ensuring the reliability, availability, and performance of cloud-based AI/ML environments.

  • Cloud Environment Management: Operate, monitor, and maintain cloud-based infrastructure supporting AI workloads across AWS, Azure, or GCP.
  • System Reliability & Uptime: Ensure high availability of AI platforms and services, proactively responding to performance or reliability issues.
  • Incident Response & Troubleshooting: Lead incident investigations and root cause analyses for infrastructure and platform-level issues.
  • Automation & Optimization: Automate cloud operations tasks using scripts or tools (e.g., Python, Bash, Terraform) to streamline deployment, monitoring, and scaling.
  • Monitoring & Alerts: Implement and manage monitoring solutions (e.g., Prometheus, Grafana, CloudWatch, Datadog) to support proactive alerting and system health visibility.
  • Security & Compliance: Support cloud security configurations, manage access controls, and assist with compliance processes and documentation.
  • Collaboration: Work closely with cloud engineers, data scientists, and ML engineers to ensure smooth and efficient AI/ML workflows from development to production.

 

What Makes You Stand Out:

  • Experience with model performance monitoring.
  • Knowledge of AI concepts and tools.
  • Experience with ITIL/incident management frameworks.


 

Duties and Responsibilities

  • Lead incident response and provide on-call support for escalated cloud incidents and deployments, conducting root cause analysis, ensuring timely resolution, and implementing improvements to prevent future incidents.
  • Develop and maintain automated operational procedures, such as system maintenance, monitoring, compliance, patching, and recovery, to enhance cloud service reliability and reduce manual intervention.
  • Collaborate with cross-functional teams and leadership, providing operational insights to improve cloud infrastructure reliability, availability, and performance.
  • Ensure cloud environments adhere to operational standards, security, and compliance requirements, managing operational readiness, disaster recovery, and performance processes and procedures.
  • Monitor and optimize cloud service performance, identifying issues proactively and leading continuous improvement efforts to enhance system reliability and operational efficiency.
  • Mentor and guide junior engineers, providing technical leadership and promoting best practices in cloud operations, automation, and incident response.
  • Lead operational improvement initiatives, including chaos engineering practices, operational drills, and testing activities to improve response, resiliency, and detection capabilities.


The first five duties listed are the functions identified as essential to the job. Essential functions are those job duties that must be performed in order for the job to be accomplished.


This position description in no way states or implies that these are the only duties to be performed by the incumbent. Employees are required to follow any other job-related instruction and to perform any other duties as requested by their supervisor, or as become clear.

 

Capabilities

  • Collaborates
  • Cultivates Innovation
  • Customer Focus
  • Decision Quality
  • Ensures Accountability
  • Instills Trust
  • Nimble Learning
  • Optimizes Work Processes (IC)
  • Self-Development
  • Values Diversity

Qualifications

Minimum Educational and Experience Requirements

  • Bachelor’s degree in computer science, engineering, or equivalent industry experience in a related technical field; and five years of professional experience in a related technical field; or
  • Associate’s degree in computer science, engineering, or equivalent industry experience in a related technical field; and seven years of professional experience in a related technical field; or
  • High school diploma and nine years of professional experience in a related technical field, required.

 

Additional Requirements

  • Cloud Platforms: Advanced knowledge of one or more cloud platforms (e.g., AWS, Azure, GCP), with deep expertise in managing cloud infrastructure.
  • Infrastructure as Code (IaC): Proficiency in IaC tools (e.g., Terraform, CloudFormation, CDK, ARM templates, Pulumi), with a focus on the configuration and management of cloud resources.
  • Automation & Scripting: Expertise in CI/CD pipelines and automation tools (e.g., Jenkins, GitLab, Ansible), and scripting languages (e.g., Python, Bash, PowerShell) to streamline operational tasks.
  • Monitoring & Incident Response: Proficiency with monitoring and observability tools (e.g., CloudWatch, Prometheus), with experience in cloud incident response and troubleshooting.
  • Operational Procedures: Ability to execute and improve operational procedures, including system maintenance, patching, recovery, and monitoring.
  • Compliance & Standards: Understanding of operational standards, controls, and compliance requirements, ensuring that cloud environments meet necessary regulations.
  • Operational Strategy: Ability to identify operational inefficiencies and propose strategic solutions to optimize system performance and availability.


Designations and/or Licenses

  • Associate-level cloud certification (such as AWS Certified Solutions Architect - Associate) preferred, or willingness to obtain one within 6 months of hire.

Physical Requirements

  • Ability to move over 50 lbs using lifting aid equipment; Rarely
  • Climbing/accessing heights; Rarely
  • Driving; Occasional (<20%)
  • Lifting/Moving 0-20 lbs; Occasional (<20%)
  • Lifting/Moving 20-50 lbs; Rarely
  • Manual Keying/Data Entry/inputting information/computer use; Frequent (50-80%)
  • Pushing/Pulling/moving objects, equipment with wheels; Rarely


Nearest Major Market: Erie