Senior Data Engineer
The Senior Data Engineer owns the path from raw transactional and event data to trustworthy, well-modelled datasets that power BetMGM's analytics, ML, and operational systems. You will build on an AWS and Snowflake stack: Prefect on ECS Fargate for orchestration, dbt for transformation, Terraform for all infrastructure, and CI/CD pipelines with quality gates that block bad code. You are comfortable directing AI coding agents (Claude Code, Cursor, Copilot, dbt Copilot, Snowflake Cortex Code) as a force multiplier across the engineering workflow, including PR review, model authoring, test generation, and incident triage. You hold strong opinions about what belongs in the warehouse vs. the orchestrator vs. the platform, and you have the seniority to push back when a request shouldn't be built the way it was asked.
Responsibilities
- Design, build, and operate batch, micro-batch, and streaming pipelines feeding Snowflake — Prefect-orchestrated flows on ECS Fargate, dbt for transformation, Snowpipe Streaming and Kafka for event ingestion.
- Own the full dbt lifecycle (sources → staging → intermediate → marts) with model contracts, freshness SLAs, automated tests, and version-controlled documentation.
- Stand up Snowflake objects (warehouses, RBAC, resource monitors, Dynamic Tables, Iceberg tables) through Terraform — no ClickOps in production.
- Build AWS-native infrastructure for data workloads — S3, ECS Fargate, Lambda, EMR Serverless, Glue Catalog, IAM, Secrets Manager, VPC endpoints — entirely in Terraform.
- Maintain CI/CD pipelines (GitLab CI or GitHub Actions) that gate every change with linting, dbt build, unit tests, contract checks, and AI-assisted code review.
- Tune warehouse sizing, clustering, and query patterns for cost and latency; instrument credit usage via ACCOUNT_USAGE; right-size before scaling up.
- Design RBAC, masking policies, and row-access policies that satisfy a regulated operator without becoming an access bottleneck.
- Bring newer Snowflake capabilities to bear — Dynamic Tables, Snowpipe Streaming, Iceberg, Cortex AISQL — when they are the right answer, not because they are new.
- Own freshness SLAs and data contracts for the gold layer; configure Monte Carlo coverage for volume, freshness, schema, and distribution; triage incidents end-to-end.
- Treat the warehouse as a product: every consumer-facing model has tests, documentation, an owner, and a defined SLO.
- Direct AI coding agents (Claude Code, Cursor, GitHub Copilot, dbt Copilot, Snowflake Cortex Code) as a force multiplier — writing specs, decomposing work, reviewing AI-generated PRs, and owning the architectural decisions agents cannot make.
- Help the team raise its ceiling on what is possible with AI in the loop, not just its baseline productivity.
- Partner with analytics engineers, data scientists, and ML platform engineers on shared standards (naming, testing, observability, lineage, cost attribution).
- Work alongside Entain India and contractor engineering partners; level them up on the standard playbook so the same code review, IaC, and CI/CD norms apply everywhere.
- Translate stakeholder requests into the right shape — push back when a request should not be built the way it was asked.
Qualifications
- BS or MS in Computer Science, Statistics, Math, or another STEM field, or equivalent practical experience. Practical experience wins ties. (required)
- 5+ years building production data pipelines on a modern stack (Python + SQL + dbt + cloud). (required)
- Deep Snowflake — beyond SQL into administration: warehouse sizing, RBAC, resource monitors, Streams/Tasks, Dynamic Tables, secure data sharing, cost tuning via ACCOUNT_USAGE. (required)
- Strong AWS — S3, ECS/Fargate, Lambda, IAM, Secrets Manager, VPC — plus production experience with at least one of EMR Serverless, Glue, or MWAA. (required)
- Terraform for both cloud and Snowflake — you have owned IaC, not just touched it. (required)
- Orchestration fluency — Prefect, Airflow, or Dagster — and an opinion about when each is the right tool. (required)
- CI/CD ownership — you have built quality gates that block bad code, not just YAML pipelines that pass. (required)
- Bias toward outcomes — you describe past work in terms of SLAs, incidents, and customers served, not tool checklists. (required)
- Snowflake-native ML (Snowpark, Cortex AISQL, Snowflake Notebooks) for in-warehouse scoring or unstructured workloads. (nice-to-have)
- Iceberg / open-table-format experience for cross-engine interoperability. (nice-to-have)
- Streaming experience — Kafka, Snowpipe Streaming, or Kinesis — with stated latency budgets. (nice-to-have)
- Reverse-ETL exposure (Hightouch, Census, or custom) into operational marketing or product systems. (nice-to-have)
- A demonstrable track record of shipping more with AI in the loop than without — not "I have used Cursor," but "this is how I design work for an agent to do." (nice-to-have)
- Regulated-industry experience (gaming, fintech, healthcare) — comfort with audit, lineage, and PII tiering. (nice-to-have)
Benefits
- Medical, Dental, Vision, Life, and Disability Insurance
- 401(k) with company match
- Pre-tax spending accounts including health care FSA and commuter savings
- Flexible paid time off
- Professional development reimbursement and ongoing skills training opportunities
- Employee resource groups
- Swag, ticket giveaways, and more!
