We are looking for a Data Platform Reliability Engineer to ensure the reliability, scalability, and performance of our enterprise data platform. You will play a key role in building a resilient, governable, observable, and cost-efficient data platform, and in enabling engineering teams to operate at scale through automation and best practices.
Requirements
- Automate infrastructure provisioning and operations using Infrastructure as Code (IaC)
- Implement and manage CI/CD pipelines for data and platform deployments
- Implement and manage data governance tools such as Collibra
- Improve system resilience through capacity planning, performance tuning, and fault tolerance design
- Optimize cloud usage and costs through FinOps best practices
- Collaborate with engineering and analytics teams to improve platform reliability and developer experience
- Drive security, compliance, and access control best practices
- Own platform reliability and availability for enterprise data systems (SLAs, SLOs, error budgets)
- Monitor and manage data cloud infrastructure, ingestion frameworks, and transformation workflows
- Lead incident management, root cause analysis (RCA), and postmortems
To apply for this job, please visit hccz.fa.em3.oraclecloud.com.