Cargill

Data Engineer

Posted 3 Days Ago
In-Office
Bengaluru, Bengaluru Urban, Karnataka
Mid level
Design, build, and maintain scalable batch and streaming data pipelines and data systems. Optimize data infrastructure, enforce governance and security, implement automated deployments, and collaborate with analytics stakeholders to deliver robust data products for analysis and decision making.
The summary above was generated by AI
Job Purpose and Impact
  • The Data Engineering job designs, builds and maintains moderately complex data systems that enable data analysis and reporting. With limited supervision, this job collaborates to ensure that large sets of data are efficiently processed and made accessible for decision making.

Key Accountabilities
  • DATA & ANALYTICAL SOLUTIONS: Develops moderately complex data products and solutions using advanced data engineering and cloud based technologies, ensuring they are designed and built to be scalable, sustainable and robust.
  • DATA PIPELINES: Maintains and supports the development of streaming and batch data pipelines that ingest data from various sources, transform it into information, and move it to data stores such as data lakes and data warehouses.
  • DATA SYSTEMS: Reviews existing data systems and architectures and implements the identified improvements and optimizations.
  • DATA INFRASTRUCTURE: Helps prepare data infrastructure to support the efficient storage and retrieval of data.
  • DATA FORMATS: Implements appropriate data formats to improve data usability and accessibility across the organization.
  • STAKEHOLDER MANAGEMENT: Partners with multi-functional data and advanced analytic teams to collect requirements and ensure that data solutions meet the functional and non-functional needs of various partners.
  • DATA FRAMEWORKS: Builds moderately complex prototypes to test new concepts and implements data engineering frameworks and architectures to support the improvement of data processing capabilities and advanced analytics initiatives.
  • AUTOMATED DEPLOYMENT PIPELINES: Implements automated deployment pipelines to improve the efficiency of code deployments with fit-for-purpose governance.
  • DATA MODELING: Performs moderately complex data modeling aligned with the datastore technology to ensure sustainable performance and accessibility.

Qualifications
  • Minimum requirement of 4 years of relevant work experience with a Bachelor's degree.
  • Big Data Technologies: Hands-on experience with the Hadoop ecosystem (HDFS, Hive, MapReduce) and distributed processing frameworks like Apache Spark (including PySpark and Spark SQL) for large-scale batch and streaming workloads.
  • Programming Expertise: Strong proficiency in Python (data manipulation, orchestration, and automation), Scala (Spark-based development), and advanced SQL (window functions, CTEs, query optimization) for high-volume analytical queries.
  • Data Pipeline Development: Proven ability to design, build, and optimize ETL/ELT pipelines for batch and real-time ingestion using tools/frameworks such as Spark Structured Streaming, Kafka Connect, Airflow/Azure Data Factory, or Glue, with robust error handling, observability, and SLAs.
  • Cloud & Data Warehousing: Hands-on experience with modern data warehouses such as Snowflake and with lakehouse architecture.
  • Transactional Data Systems: Experience with transaction management (isolation levels, locking, concurrency), backup/restore, replication (logical/physical), and high availability (Patroni, PgBouncer, read replicas).
  • Data Governance & Security: Understanding and implementation of data quality frameworks (DQ checks, Great Expectations/Deequ), metadata management (Glue/Azure Purview), role-based access control and row/column-level security, encryption, and compliance-aligned data handling (PII masking, auditability).

Preferred Skills
  • Experience with Apache Kafka or similar platforms for real-time data streaming.
  • Exposure to CI/CD pipelines, containerization (Docker), and orchestration tools (Kubernetes) for data workflows.
  • Understanding of supply chain analytics, commodity trading data flows, and risk management metrics (ideal for the agri-commodities industry).
  • Ability to collaborate with data scientists on predictive modeling and machine learning pipelines.
