Specialist – Tools & Infrastructure Reliability (March of Giants)
Ubisoft
- Montréal, QC
- Permanent
- Full-time
- Advising development teams on technology and tooling choices to improve visibility, control, and robustness of internal and external services.
- Training, supporting, and guiding development teams in improving continuous integration and continuous deployment systems.
- Researching, integrating, and developing technologies that enhance reliability, performance, and productivity.
- Designing, operating, and owning build, configuration, versioning, and publishing pipelines (including packaging, signing, SBOM, artifacts).
- Implementing and supporting CI/CD tooling (automated tests, quality, security), IaC, and secure, reproducible, controlled deployments.
- Maintaining tooling products so that they deliver exemplary service quality to the project, measured against internal SLOs.
- Implementing and maintaining game deployment guidelines and documenting infrastructure implementation and technical specifications for network and server systems.
- Collaborating with development teams to diagnose and resolve issues related to online services.
- Establishing and maintaining incident-management processes.
- Managing cloud environments using appropriate tools.
- Developing tools and processes that allow developers to deploy services safely and efficiently.
- Defining and tracking SLAs, SLOs, and SLIs, deploying observability (logs, metrics, traces), managing capacity, and contributing to FinOps initiatives.
- University degree in Computer Science, Computer Engineering, or any relevant field.
- 5 to 8 years of experience in software development and system administration.
- Experience with cloud infrastructure automation.
- Experience managing high-throughput systems.
- Experience designing resilient, scalable, and redundant architectures.
- Experience in software development and optimization.
- Strong ability to analyze and synthesize information.
- Ability to solve complex problems.
- Ability to adapt quickly to change.
- Ability to work under pressure.
- Strong knowledge of distributed systems.
- Excellent knowledge of Linux and Windows system administration.
- Programming languages: Python, Go, C#, or C++.
- CI/CD (GitLab, GitHub, Azure DevOps), IaC (Terraform, CloudFormation), containers & orchestration (Docker, Kubernetes).
- Observability: Prometheus/Grafana, ELK/EFK, OpenTelemetry (or equivalent).
- Cloud: AWS, Azure, GCP; databases; networking (DNS, CDN, load balancing, TLS).
- Nice to have: Unreal Engine 5 (or a similar engine), DevOps methodology, infrastructure automation experience.