Manager, ML Platform and Infrastructure
DarkVision
- North Vancouver, BC
- $150,000 to $200,000 per year
- Permanent
- Full-time
- Team Leadership & Strategy: Manage, mentor, and grow a diverse team of engineers and scientists. You will set the technical roadmap for our ML infrastructure, balancing immediate production needs with long-term scalability goals.
- Platform Architecture: Architect the cloud-based (AWS) and on-premise systems required for massive-scale batch processing. You will make high-level decisions regarding compute orchestration (Kubernetes), storage, and cost optimization.
- Operational Excellence: Oversee the development of CI/CD pipelines and MLOps practices. You will ensure our training and inference workflows are reproducible, monitored, and secure.
- Data Integrity & Validation: Supervise the Data Science function to ensure the statistical validity of our models. You will champion data integrity audits and experimental design frameworks that prove the reliability of our technology to clients.
- Cross-Functional Collaboration: Serve as the bridge between Software Engineering, Applied ML, and Data Analysis teams, ensuring that infrastructure decisions support the company's product delivery goals.
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in software, infrastructure, or data engineering.
- 2+ years of experience leading or managing technical teams.
- Deep technical understanding of cloud infrastructure (AWS) and container orchestration (Kubernetes).
- Proficiency in Python and familiarity with the machine learning lifecycle.
- Experience managing multidisciplinary teams (combining DevOps/Infra with Data Science).
- Hands-on experience with workflow orchestration tools (e.g., Prefect, Dagster, Airflow).
- Experience processing petabyte-scale datasets.
- Familiarity with distributed computing frameworks (e.g., Ray, Dask).
- Experience with data governance, security, and compliance in an industrial context.
- Pragmatic leadership style, with the judgment to decide when to build custom solutions and when to use managed services.
- Exceptional communication skills, including the ability to explain complex infrastructure constraints to non-technical stakeholders.