Machine Learning Data Engineer

Remote | Full Time | United States (Remote) | Kalibri Labs

At Kalibri, we’re looking for a Machine Learning Data Engineer to design, build, and maintain the production pipelines that power Kalibri’s core algorithmic products. The role is ideal for a mid-level engineer who thrives on turning complex models into reliable, scalable production systems.

Requirements

  • Design, build, and maintain production data pipelines for multi-phase algorithmic workflows using Python and an orchestration framework such as Prefect, Airflow, or Jenkins.
  • Build and optimize advanced SQL transformations in Snowflake, including window functions, CTEs, stored procedures, UDFs, and semi-structured data processing.
  • Build and maintain dbt models for data transformation, identity resolution, and slowly changing dimension (SCD Type 2) tracking across 80+ models and multiple pipeline stages.
  • Build and maintain feature engineering pipelines that feed ML models including CatBoost gradient boosting, Prophet time-series decomposition, LightGBM regression, and PuLP linear programming solvers.
  • Operationalize ML model outputs by integrating predicted ADRs, occupancy forecasts, and optimization results into downstream production tables and Parquet file outputs.
  • Integrate and reconcile data from multiple heterogeneous sources including hotel property management systems, rate shop providers, mapping APIs, and market forecast data.
  • Work with PySpark for large-scale daily distribution processing, managing partitioning strategies, memory tuning, and efficient Parquet I/O across millions of records.
  • Implement and monitor data quality frameworks such as dbt and Monte Carlo.
  • Manage CI/CD pipelines using Bitbucket Pipelines for automated testing, linting (SQLFluff), and deployment of dbt projects and Python applications.
  • Containerize pipeline components with Docker for consistent execution across development and production environments.
  • Implement robust retry logic, error handling, and fallback strategies across pipeline phases to ensure reliable daily and monthly production runs.

Benefits

  • Fully remote work
  • Robust medical, dental, and vision plans through Blue Cross Blue Shield
  • 401k plan with employer match
  • Flexible Paid Time Off

