The Site Reliability Engineer III will ensure application reliability and stability, manage incidents, and develop automation while promoting team collaboration and operational excellence.
Job Description
As a Site Reliability Engineer III at JPMorgan Chase within the Chief Technology Office, you will collaborate with engineering, support, and operations teams to maintain and improve the reliability of mission-critical applications. You'll participate in incident management, troubleshooting, and continuous improvement, and help implement automation and monitoring solutions. On-call rotation is part of the role, requiring effective action during production incidents and a commitment to operational excellence. You'll share knowledge, follow best practices, and contribute to a culture of learning and innovation. We value team players who communicate clearly, solve problems proactively, and focus on customer needs.
Job responsibilities
- Design, develop, and operate solutions for application reliability, monitoring, and automation.
- Execute incident response, troubleshooting, and root cause analysis to resolve production issues and improve system stability.
- Build and maintain CI/CD pipelines using Jenkins (including global libraries), and implement infrastructure as code with Terraform.
- Develop and support containerized applications using Docker and Kubernetes, ensuring robust deployments and scalability.
- Implement and maintain observability solutions using tools such as Grafana, Prometheus, Splunk, and OpenTelemetry.
- Collaborate with engineering and support teams to drive continuous improvement and operational excellence.
- Participate in on-call rotation, responding to production incidents and ensuring timely resolution.
Required qualifications, capabilities, and skills
- Formal training or certification in Site Reliability Engineering concepts and 3+ years of applied experience.
- Experience in SRE, DevOps, or application support roles, with knowledge of SLIs/SLOs, incident response, and troubleshooting.
- Familiarity with monitoring and observability tools (e.g., Grafana, Prometheus, Splunk, OpenTelemetry).
- Hands-on experience with CI/CD pipelines (Jenkins, including global libraries), infrastructure as code (Terraform), version control (Git), containerization (Docker), and orchestration (Kubernetes).
- Exposure to cloud platforms (AWS, GCP, or Azure) and automating infrastructure and deployments.
- Willingness to participate in on-call rotation and respond to production incidents.
- Ability to break down issues, document solutions, and communicate effectively with team members and customers.
Preferred qualifications, capabilities, and skills
- Familiarity with banking, fintech, or other regulated environments.
- Participation in game days or chaos engineering.
- Interest in sharing knowledge and best practices with peers.
Top Skills
AWS
Azure
Docker
GCP
Git
Grafana
Jenkins
Kubernetes
OpenTelemetry
Prometheus
Splunk
Terraform