
Sr. Site Reliability Administrator
- Ottawa, ON
- Permanent
- Full-time
OpenText is a global leader in information management, where innovation, creativity, and collaboration are the key components of our corporate culture. As a member of our team, you will have the opportunity to partner with the most highly regarded companies in the world, tackle complex issues, and contribute to projects that shape the future of digital transformation.

The OpenText Discovery legal technology solution provides the world’s most advanced technology to help legal and compliance teams discover what matters across massive volumes of enterprise data. Using unstructured data analytics, machine learning, and interactive visualizations, our platforms provide fast access to key documents, contract terms, personnel with expertise, and critical early insights for litigation, investigations, due diligence, compliance, and more.

Your Impact:
You will play a critical role in designing, implementing, and maintaining cloud infrastructure solutions for our organization while focusing on enhancing system reliability, scalability, and performance. Your expertise in cloud technologies will be instrumental in driving the success of our cloud initiatives and ensuring our systems operate flawlessly.

As a Sr. Site Reliability Administrator you will:
- Provide direct operations support for our cybersecurity products and technologies, including AWS infrastructure and product configuration/function responsibilities.
- Participate in strategic planning and execution for cloud and application security.
- Maintain uptime and security, and apply product and infrastructure patches across all cloud environments.
- Lead and mentor junior team members, fostering a culture of continuous learning, collaboration, and system reliability.
- Develop and maintain cloud architecture documentation, including diagrams, technical specifications, and best practices, to facilitate effective communication and knowledge sharing among teams.
- Configure and manage cloud environments using infrastructure-as-code (IaC) automation tools to provision and scale resources efficiently while maintaining system reliability.
- Implement and enforce robust cloud security measures, including access controls, data encryption, and vulnerability assessments, to safeguard sensitive information and ensure compliance with relevant regulations.
- Perform regular performance analysis, capacity planning, and proactive monitoring to identify optimization opportunities, mitigate risks, and address scalability challenges.
- Apply Site Reliability Engineering (SRE) principles and practices to drive reliability improvements, including incident response, post-incident analysis, and system resilience initiatives.
- Stay up to date with emerging cloud technologies, industry trends, and SRE methodologies, and provide recommendations for their adoption to enhance our cloud infrastructure and reliability practices.
- Bachelor’s degree in Computer Science or Information Technology, or cloud-focused certifications combined with four years of relevant industry experience.
- 1-5 years of experience in:
  - Supporting applications deployed via Docker or Kubernetes-based containers.
  - Working with Amazon Web Services (AWS) or similar cloud platforms.
  - Linux or Windows system administration.
  - Using DevOps tools and environments like Jenkins, Git, and Terraform.
  - Managing Aurora/Postgres and other modern databases.
- Strong understanding of cybersecurity concepts/technologies.
- Proficient with scripting languages like Python, PowerShell, and shell scripting.
- Experience/knowledge in Terraform, GitLab, and Packer (big plus).
- Background in monitoring tools such as Zabbix, Grafana, CloudWatch, and others.
- Experience with compliance programs and frameworks such as ISO 27001, GDPR, FedRAMP/FISMA, SOC 2, or PCI DSS.