Job Description
Location:
On-site, Stockholm
Experience Level:
Senior (8+ years)
About the Role
We're a small consultancy seeking a Senior Site Reliability Engineer to expand our infrastructure capabilities and support our diverse client base.
You'll work directly with various customers across different industries, helping them design, implement, and optimize their cloud-native infrastructure while ensuring reliability, scalability, and performance of their mission-critical systems.
Key Responsibilities
Client Infrastructure & Platform Management
- Design, implement, and maintain highly available, scalable infrastructure on Google Cloud Platform (GCP) for various client environments
- Manage and optimize Kubernetes clusters across multiple client projects and environments
- Architect and maintain Apache Kafka streaming data pipelines and event-driven architectures tailored to client needs
- Implement infrastructure as code using Terraform, Ansible, or similar tools across diverse client requirements
Platform Engineering & Internal Developer Platforms (IDP)
- Design, build, and maintain internal developer platforms to enable self-service, on-demand infrastructure for application teams
- Implement control-plane provisioning using Crossplane, building reusable abstractions and composite resources
- Develop, package, and maintain Helm charts and Kubernetes operators to standardize deployments and reduce operational overhead
- Collaborate with developer teams to define platform APIs, templates, and CI/CD workflows
Client Reliability & Performance
- Develop and maintain SLIs, SLOs, and error budgets to ensure system reliability for client applications
- Design and implement comprehensive monitoring, alerting, and observability solutions customized for each client
- Conduct capacity planning and performance optimization across client services and workloads
- Lead incident response for client systems, conduct post-mortem analysis, and implement preventive measures
Client Security & Compliance
- Implement security best practices across client infrastructure and applications
- Ensure compliance with industry standards and regulatory requirements specific to each client's sector
- Manage secrets, certificates, and access controls across all client environments
Required Qualifications
Technical Skills
- 8+ years of experience in Site Reliability Engineering, DevOps, Platform Engineering, or similar roles
- Expert-level knowledge of Google Cloud Platform services (GKE, Cloud Run, BigQuery, Pub/Sub, etc.)
- Extensive experience with Kubernetes orchestration, including cluster management and troubleshooting
- Strong experience with Apache Kafka, including cluster management, topic design, and stream processing
- Proficient with infrastructure as code tools (Terraform, Pulumi, CloudFormation)
- Proven track record building and operating internal developer platforms (IDP), with hands-on experience using Crossplane
- Deep understanding of Helm chart development, templating best practices, and lifecycle management
Operational Excellence
- Deep understanding of monitoring and observability tools (Prometheus, Grafana, ELK stack, or similar)
- Experience with service mesh technologies (Istio, Linkerd) and API gateways
- Strong knowledge of networking, load balancing, and distributed systems concepts
- Experience with database technologies (PostgreSQL, Redis, MongoDB) and their operational aspects
Client-Facing & Consulting Skills
- Excellent client communication and presentation skills
- Ability to understand and translate diverse client requirements into technical solutions
- Experience working with multiple clients simultaneously and managing competing priorities
- Ability to provide strategic infrastructure guidance
What We Offer
- Competitive salary
- Flexible PTO and work-from-home options
- Professional development budget for conferences, training, and certifications
- State-of-the-art equipment and technology stipend
- Opportunity to work with diverse clients across various industries
- Professional development opportunities through varied client engagements
- Potential for travel to client sites (as needed)
About Our Consultancy
As a growing consultancy, we work with clients ranging from startups to enterprise organizations across various industries.
Our infrastructure expertise helps clients modernize their systems, improve reliability, and scale their operations.
You'll have the opportunity to work on diverse projects, from greenfield implementations to complex migrations, while building lasting relationships with our clients.
How to Apply
Please submit your resume, highlighting your experience with our core technologies (GCP, Kubernetes, Apache Kafka, and Crossplane).