Join a global healthcare biopharma company in Hyderabad, India as a Specialist, GSF DnA Data Engineer. Drive innovation and execution excellence by designing, building, and operating production-grade data platforms and pipelines. Partner with analytics, data science, and business stakeholders to translate requirements into robust datasets. Deliver reliable, governed, secure, and analytics-ready data by implementing modern data warehousing and lakehouse patterns on AWS and Databricks.

Requirements

Design, build, and operate batch and streaming data pipelines to ingest data from multiple sources into an AWS data lake / lakehouse and data warehouse.
Develop and maintain ETL/ELT transformations using Python, PySpark, and SQL; optimize jobs for performance, cost, and reliability.
Partner with Data Analysts, Data Scientists, and business stakeholders to understand use cases and deliver curated, analytics-ready datasets and features.
Implement data quality controls (validation rules, reconciliation, anomaly checks), define SLAs/SLOs, and contribute to metadata, lineage, and data catalog practices.
Use orchestration and observability to run pipelines reliably (e.g., Databricks Workflows, AWS Step Functions, scheduling, logging, monitoring, alerting).
Apply engineering best practices: unit/integration testing, automated data tests, code reviews, and quality gates within CI/CD.
Model and publish data for BI/analytics using dimensional modeling (star/snowflake), facts & dimensions, and slowly changing dimensions (SCD).
Write and tune advanced SQL for profiling, transformations, and performance troubleshooting across large datasets.
Build on AWS using services such as S3, Glue, Lambda, Step Functions, EMR, and CloudWatch; follow security best practices (IAM, encryption, least privilege).
Provision and manage cloud resources using Infrastructure as Code (e.g., Terraform) across dev/test/prod environments.
Package and deploy workloads using Docker (and where applicable ECS/Fargate); manage dependencies and runtime configurations.
Use GitHub for version control (branching strategies, pull requests, code reviews) and set up CI/CD for automated build, test, and deployment.
Develop scalable processing on Databricks / Apache Spark using PySpark and lakehouse concepts (e.g., Delta Lake, ACID, schema evolution).
Use notebooks (e.g., Jupyter/Databricks) for exploration and PoCs, then productionize solutions with reusable modules, tests, and deployment pipelines.
Work in an Agile delivery model (planning, daily sync, reviews, retros), providing accurate estimates and proactively managing risks/dependencies.
Create and maintain technical documentation (data contracts, pipeline specs, runbooks) and support operational handoffs.

Benefits

Generous Paid Time Off
401k Matching
Retirement Plan
Visa Sponsorship

To apply for this job, please visit msd.wd5.myworkdayjobs.com.
