Citi

Senior PySpark Data Engineer

Posted Yesterday
In-Office
Pune, Mahārāshtra
Senior level
The Senior PySpark Data Engineer will design and maintain data pipelines, optimize Spark jobs, mentor junior engineers, and ensure data integrity.
The summary above was generated by AI

About the Role

We are seeking a highly skilled and experienced Senior PySpark Data Engineer to join our dynamic data engineering team. The ideal candidate will have a strong background in building and managing large-scale data processing systems and a proven track record of working with cutting-edge Big Data technologies. You will be responsible for designing, developing, and maintaining our data pipelines, ensuring they are efficient, reliable, and scalable to meet our growing business needs.

Key Responsibilities

  • Design, develop, and maintain robust, scalable, and high-performance data pipelines using PySpark.
  • Develop, schedule, and monitor complex data workflows using orchestration tools like Apache Airflow.
  • Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver high-quality data solutions.
  • Optimize and tune Spark jobs for performance and efficiency.
  • Implement data quality checks and ensure data integrity across all data pipelines.
  • Design and implement data models for optimal storage and retrieval.
  • Mentor junior data engineers and promote best practices in data engineering.
  • Ensure compliance with data governance and security policies.
  • Troubleshoot and resolve data-related issues in a timely manner.

Required Qualifications

  • 6+ years of relevant professional experience in a data engineering role.
  • Extensive hands-on experience with PySpark and advanced Python programming skills.
  • Proven experience with Big Data ecosystems, including Cloudera and/or Databricks.
  • Hands-on experience with distributed query engines like Starburst (Trino/Presto).
  • Proficient in designing and managing complex workflows using scheduling tools, particularly Apache Airflow.
  • Strong expertise in SQL and experience with relational and non-relational databases.
  • Solid understanding of data warehousing concepts, ETL/ELT processes, and data modeling techniques.
  • Experience working in a Linux/Unix environment.
  • Experience with GitHub and CI/CD pipelines.

Education:

  • Bachelor’s degree/University degree or equivalent experience

This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.

------------------------------------------------------

Job Family Group: Technology

------------------------------------------------------

Job Family: Applications Development

------------------------------------------------------

Time Type: Full time

------------------------------------------------------

Most Relevant Skills: Please see the requirements listed above.

------------------------------------------------------

Other Relevant Skills: For complementary skills, please see above and/or contact the recruiter.

------------------------------------------------------

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.

Citi Kolkata, West Bengal, IND Office

41, Chowringhee Rd, Kanak Building, Kolkata, West Bengal, India, 700071
