Design, develop, and maintain end-to-end ETL/ELT pipelines using Python and PySpark. Build large-scale data processing frameworks to handle structured and unstructured data, ensuring high performance and reliability. Architect and manage data solutions within the GCP ecosystem, focusing on cost-efficiency and security.
Requirements
- Strong proficiency in Python, including experience with libraries such as Pandas and NumPy and with logging frameworks.
- 3+ years of hands-on experience with Apache Spark (PySpark) for distributed data processing.
- Practical experience with Google Cloud services, specifically BigQuery, Dataproc or Dataflow, Cloud Storage, Cloud Functions, and Cloud Composer.
- Solid understanding of relational databases and SQL (PostgreSQL, MySQL) as well as NoSQL environments.
- Experience with Git, Docker, and CI/CD pipelines. Familiarity with Terraform or other IaC tools is a significant plus.
To apply for this job, please visit fa-etvl-saasfaprod1.fa.ocs.oraclecloud.com.
