Senior Python Developer - Operate
Deloitte
- Toronto, ON
- $72,000-$138,000 per year
- Temporary
- Full-time
Work Model: Hybrid
Reference code: 132996
Primary Location: Toronto, ON
All Available Locations: Toronto, ON

Our Purpose

At Deloitte, our Purpose is to make an impact that matters. We exist to inspire and help our people, organizations, communities, and countries to thrive by building a better future. Our work underpins a prosperous society where people can find meaning and opportunity. It builds consumer and business confidence, empowers organizations to find imaginative ways of deploying capital, enables fair, trusted, and functioning social and economic institutions, and allows our friends, families, and communities to enjoy the quality of life that comes with a sustainable future. And as the largest 100% Canadian-owned and operated professional services firm in our country, we are proud to work alongside our clients to make a positive impact for all Canadians. By living our Purpose, we will make an impact that matters.
- Have many careers in one Firm.
- Enjoy flexible, proactive, and practical benefits that foster a culture of well-being and connectedness.
- Learn from deep subject-matter experts through mentoring and on-the-job coaching.
- Design, develop, and maintain Python services, libraries, and APIs for data processing and application backends.
- Build and integrate LLM-driven features (prompting, context construction, chaining/orchestration, evaluation).
- Implement retrieval-augmented generation (RAG) pipelines, including document ingestion, chunking, embeddings, and vector search.
- Develop robust unit/integration tests; contribute to CI/CD pipelines and observability (logging, metrics, tracing).
- Optimize performance and cost of LLM workloads (token usage, latency, caching, batching, model selection).
- Ensure data privacy and security best practices, including PII handling, guardrails, and governance.
- Collaborate cross-functionally with product managers, data engineers, designers, and security to ship production-quality features.
- Document designs, decisions, and operational runbooks; mentor peers and review code.
- Monitor and troubleshoot production issues; drive continuous improvement and post-incident learnings.
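The RAG responsibilities above (document ingestion, chunking, embeddings, vector search) can be sketched in miniature. This is an illustrative toy, not the team's actual pipeline: a bag-of-words vectorizer stands in for a real embedding model, and a sorted cosine scan stands in for a vector index such as FAISS or pgvector.

```python
import math
from collections import Counter


def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (a common chunking strategy)."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector. A real pipeline
    would call an embedding model here instead."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by similarity to the query (vector-search stand-in)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

In production the retrieved chunks would be assembled into the LLM prompt as context; the chunk size and overlap are tuning knobs traded off against token cost.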
- Has 5+ years of professional experience building production-grade Python services and APIs
- Is fluent with Python fundamentals: asynchronous programming (asyncio), typing, packaging, dependency management, and performance profiling
- Builds reliable backends with FastAPI/Flask, REST/gRPC, and clean architecture patterns; comfortable with SQLAlchemy/ORMs and data modeling
- Writes high-quality, tested code: pytest, fixtures/mocking, linting/formatting, type checking, and code reviews; integrates with CI/CD
- Works confidently with Docker/containers, Git, and observability (structured logging, metrics, tracing) to support production operations
- Understands databases and messaging: Postgres/MySQL, Redis, and event/queue systems (e.g., Kafka/SQS)
- Applies security-first practices: secrets management, least-privilege access, input validation, secure coding, encryption at rest/in transit, and PII handling
- Communicates clearly, documents decisions, and collaborates effectively with product, security, risk, and compliance stakeholders
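The asyncio and typing fluency listed above typically shows up in patterns like the following minimal sketch, where `fetch_record` is a hypothetical stand-in for an async database or HTTP call, and `asyncio.gather` fans requests out concurrently instead of awaiting them one at a time:

```python
import asyncio


async def fetch_record(record_id: int) -> dict[str, int]:
    """Hypothetical stand-in for an async I/O call (DB query, HTTP request)."""
    await asyncio.sleep(0)  # yield control to the event loop, as real I/O would
    return {"id": record_id, "value": record_id * 2}


async def fetch_all(ids: list[int]) -> list[dict[str, int]]:
    """Run all fetches concurrently and collect the results in order."""
    return list(await asyncio.gather(*(fetch_record(i) for i in ids)))


results = asyncio.run(fetch_all([1, 2, 3]))
```

Because each coroutine is small and typed, functions like these are straightforward to cover with pytest (e.g., via `pytest-asyncio`) and to mock at the I/O boundary.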
- Practical GenAI/LLM experience: integrating hosted LLM APIs (e.g., Azure OpenAI, OpenAI, Anthropic), prompt design, and token/cost management
- Retrieval-augmented generation (RAG): embeddings, chunking strategies, vector search (FAISS, pgvector, Pinecone), and orchestration frameworks (LangChain, LlamaIndex) or custom pipelines
- LLM application quality and safety: prompt/version management, evaluation frameworks (automatic and human-in-the-loop), guardrails, red-teaming, and safe output handling
- Optimization and reliability: latency and throughput tuning, caching/semantic caching, batching, function/tool calling, and fallback/model routing patterns
- Model customization: fine-tuning or parameter-efficient tuning (LoRA/QLoRA), adapter-based approaches, and model selection trade-offs (quality, latency, cost, compliance)
- Multimodal familiarity: using models that handle text with images or structured data where relevant to use cases
- Observability for GenAI: tracing, token/latency metrics, cost dashboards, and incident response for LLM-enabled services
- Cloud and platform experience (Azure/AWS/GCP), Kubernetes or serverless, infrastructure as code, and governance considerations for regulated environments (banking/financial services)
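The caching and fallback/model-routing patterns named in the preferred skills can be illustrated with a small stub. `CachedLLMClient` and the model callables below are hypothetical placeholders, not any provider's SDK: an exact-match prompt cache avoids repeat token spend, and a failed primary call routes to a fallback model.

```python
import hashlib
from typing import Callable


class CachedLLMClient:
    """Exact-match prompt cache plus primary/fallback routing.

    The 'models' are plain callables here; a real client would wrap
    provider SDK calls (with timeouts, retries, and cost tracking).
    """

    def __init__(self, primary: Callable[[str], str],
                 fallback: Callable[[str], str]) -> None:
        self.primary = primary
        self.fallback = fallback
        self.cache: dict[str, str] = {}

    def complete(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode()).hexdigest()
        if key in self.cache:            # cache hit: no model call, no tokens spent
            return self.cache[key]
        try:
            answer = self.primary(prompt)
        except Exception:                # primary failed: route to fallback model
            answer = self.fallback(prompt)
        self.cache[key] = answer
        return answer
```

Production systems often extend this with semantic caching (matching on embedding similarity rather than exact text) and tiered routing that picks a cheaper model when the prompt permits.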