As a Data Engineer, you will work closely with our development and data engineering teams to design, develop, and maintain software applications and data pipelines. You will have the chance to enhance your skills in programming and data management while contributing to innovative projects that drive our business forward.
Experience Level: 5+ Years
Location: Indore/Raipur/Gurgaon/Bangalore
Employment Type: Full-time
Role Description:
● Design & Develop: Build and maintain scalable data platform frameworks leveraging Big Data technologies (Spark, Hadoop, Kafka, Hive, etc.) and GCP services (BigQuery, Dataflow, Pub/Sub, etc.).
● Data Pipeline Development: Develop, optimize, and manage batch and real-time data pipelines to support business intelligence, analytics, and AI/ML workloads.
● Java Development: Utilize Java to build efficient, high-performance data processing applications and frameworks.
● Cloud Architecture: Design and implement cloud-native data solutions on GCP, ensuring reliability, security, and cost efficiency.
● ETL & Data Integration: Work with structured and unstructured data sources, integrating data from multiple systems into a unified platform.
● Performance Tuning: Optimize data processing performance by fine-tuning Spark jobs, SQL queries, and distributed computing environments.
● Collaboration: Work closely with data scientists, analysts, and software engineers to deliver high-quality data solutions.
● Automation & Monitoring: Implement CI/CD pipelines for data workflows and set up monitoring solutions to track system health and performance.
Required Skills & Qualifications:
● Strong proficiency in Java for data engineering and backend development (Spring Boot, microservices).
● Hands-on experience with Big Data technologies (Hadoop, Spark, Kafka, Hive, HBase, etc.).
● Expertise in GCP services: BigQuery, Dataflow, Pub/Sub, Cloud Storage, Composer (Airflow), Dataproc, etc.
● Experience in developing data platform frameworks to support scalable and reusable data solutions.
● SQL & NoSQL database experience (e.g., BigQuery, PostgreSQL, Cassandra, MongoDB).
● Knowledge of ETL/ELT processes and data modeling concepts.
● Experience with CI/CD tools (Git, Jenkins) and infrastructure as code (IaC) using Terraform.
● Understanding of distributed computing principles and high-performance data processing.
● Strong problem-solving skills and ability to work in a fast-paced, agile environment.