Zafin

Cloud Site Reliability Engineer II

Posted 3 Days Ago
Trivandrum, Thiruvananthapuram, Kerala
Senior level
As a Cloud Site Reliability Engineer II, you will lead initiatives to enhance the reliability and performance of Zafin's cloud infrastructure. Responsibilities include resolving complex technical issues, implementing operational improvements, mentoring junior engineers, and overseeing project implementations, all while driving strategic cloud reliability efforts.
The summary above was generated by AI

Who we are

Founded in 2002, Zafin offers a SaaS product and pricing platform that simplifies core modernization for top banks worldwide. Our platform enables business users to work collaboratively to design and manage pricing, products, and packages, while technologists streamline core banking systems. 

With Zafin, banks accelerate time to market for new products and offers while lowering the cost of change and achieving tangible business and risk outcomes. The Zafin platform increases business agility while enabling personalized pricing and dynamic responses to evolving customer and market needs. 

Zafin is headquartered in Vancouver, Canada, with offices and customers around the globe, including ING, CIBC, HSBC, Wells Fargo, PNC, and ANZ. Zafin is proud to be recognized as a top employer and certified Great Place to Work® in Canada, India and the UK.


Job Summary

Zafin is seeking a Cloud Site Reliability Engineer II (CSRE II) to lead strategic initiatives that ensure the reliability, scalability, and performance of our cloud infrastructure and applications. This advanced role requires mastery of cloud technologies, strategic planning, and incident management to drive innovative solutions and operational excellence.

As a CSRE II, you will influence the direction of cloud reliability strategies, mentor junior engineers, and lead significant projects that have a broad organizational impact. This position reports directly to the VP of Cloud Services and requires a proactive, collaborative mindset to achieve operational and strategic objectives.

Key Responsibilities

  • Lead and manage the resolution of complex technical issues involving Zafin’s products and Azure cloud environment.
  • Design and implement strategic operational enhancements to improve resiliency and system reliability.
  • Conduct in-depth Root Cause Analysis (RCA) for high-severity incidents and drive initiatives to reduce error recurrence.
  • Represent the organization in external client escalation calls, providing expert guidance and solutions.
  • Architect and optimize cloud infrastructure for high performance, scalability, and cost-effectiveness.
  • Provide thought leadership in managing and scaling container orchestration platforms such as AKS and OpenShift.
  • Oversee the implementation of advanced monitoring solutions and integrate predictive analytics for proactive issue resolution.
  • Develop and execute automation strategies to streamline operational workflows and incident responses.
  • Create and maintain comprehensive documentation of cloud architectures, processes, and incident management strategies.
  • Mentor and coach junior engineers, fostering a culture of continuous learning and innovation.
  • Drive strategic initiatives, collaborating with cross-functional teams to achieve organizational objectives.

Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or a related field (Master’s degree preferred).
  • 12+ years of experience in cloud support, operations, or a related role.
  • Advanced expertise in Microsoft Azure (preferred) or equivalent cloud platforms.
  • Demonstrated experience in designing and scaling container orchestration systems like AKS or OpenShift.
  • Proven leadership in managing automated deployment pipelines, including Azure DevOps.
  • Mastery of enterprise monitoring platforms (e.g., Azure Insights, Grafana) and predictive analytics tools.
  • Advanced scripting skills with PowerShell, Python, or similar languages.
  • Extensive experience in incident management and defining SLAs for global production environments.
  • In-depth knowledge of database management, particularly Postgres.

Preferred Qualifications

  • Advanced certifications in cloud platforms (e.g., Azure Solutions Architect Expert).
  • Experience with ITSM tools and processes (e.g., ServiceNow).
  • Comprehensive understanding of security and compliance in cloud environments.

Soft Skills

  • Exceptional analytical and problem-solving abilities.
  • Strong leadership and mentoring skills.
  • Advanced communication and collaboration capabilities.
  • Visionary approach to operational innovation and strategic planning.

What’s in it for you

Joining our team means being part of a culture that values diversity, teamwork, and high-quality work. We offer competitive salaries, annual bonus potential, generous paid time off, paid volunteering days, wellness benefits, and robust opportunities for professional growth and career advancement. Want to learn more about what you can look forward to during your career with us? Visit our careers site to see our openings: zafin.com/careers

Zafin welcomes and encourages applications from people with disabilities. Accommodations are available on request for candidates taking part in all aspects of the selection process. 

Zafin is committed to protecting the privacy and security of the personal information collected from all applicants throughout the recruitment process. The methods by which Zafin collects, uses, stores, handles, retains, or discloses applicant information can be accessed by reviewing Zafin’s privacy policy at https://zafin.com/privacy-notice/. By submitting a job application, you confirm that you agree to the processing of your personal data by Zafin as described in the candidate privacy notice.

Top Skills

AKS
Azure
OpenShift
Postgres
PowerShell
Python
