Senior Software Engineer, Platform (SRE)

hireVouch

  • Kitchener, ON / Waterloo, ON
  • Permanent
  • Full-time
  • 23 days ago
As a Platform/Site Reliability Engineer (SRE), you will play a key role in establishing and enhancing our engineering platform. You will help ensure the reliability, scalability, and efficiency of our systems while developing tools that improve engineering productivity.
You will also help define and shape our platform strategy, set best practices, and drive initiatives that enhance developer experience, system performance, and operational efficiency.
What You Will Be Doing
  • DevOps & Infrastructure: Design, implement, and maintain scalable infrastructure to support engineering needs.
  • CI/CD Optimization: Improve our continuous integration and continuous deployment pipelines using AWS CDK, including requirements for a deployment tool and a database migration tool to enable fast and reliable releases.
  • Release Tracking & Deployment: Establish visibility into release cycles, implement automation to streamline deployments, and ensure smooth rollouts.
  • Site Reliability & Observability: Implement monitoring, logging, and alerting systems to ensure high availability and performance of services.
  • Internal Tooling: Build and maintain tools that improve developer efficiency, automate repetitive tasks, and enhance productivity.
  • Security & Compliance: Ensure security best practices are followed in infrastructure, deployments, and internal systems, with a focus on SOC, ISO, and GDPR compliance.
What We’re Looking For
  • 7+ years of technical experience, including 5+ years as an SRE or in a similar role. Prior startup experience is preferred but not required.
  • Deep expertise in AWS, including Fargate and Kubernetes for container orchestration.
  • Strong experience with CI/CD pipelines, specifically leveraging AWS CDK, including deployment and database migration tools.
  • Proficiency in observability tools (Datadog, Prometheus, Grafana) and performance monitoring.
  • Deep understanding of scaling strategies and highly available architectures.
  • Experience with scripting and automation using Python, Bash, or TypeScript.
  • Knowledge of security best practices; familiarity with SOC, ISO, and GDPR compliance is a bonus.
  • Ability to collaborate cross-functionally with engineering teams to drive platform improvements.
Our Tech Stack
  • Infrastructure: AWS, Fargate, Redis, PostgreSQL, SQS, CDK, GitHub, Retool
  • Backend: Django REST framework, Celery
  • Frontend: Next.js, Tailwind CSS
  • LLM: OpenAI, Claude, AWS Bedrock
