```html
Key Responsibilities
- Develop and maintain the VictoriaMetrics + Grafana + Alertmanager stack;
- Configure metrics, dashboards, and alerts for microservices and databases;
- Maintain centralized logging (Fluent Bit, Fluentd, Elasticsearch, Kibana);
- Define SLIs/SLOs and participate in incident analysis (postmortems);
- Help operate and support EKS clusters (dev/prod);
- Work with autoscaling (Karpenter or Cluster Autoscaler);
- Configure storage (EBS/EFS), load balancers, and network policies;
- Troubleshoot pod, node, and networking issues;
- Maintain Argo CD and Helm charts for services;
- Ensure correct deployments and environment health;
- Automate routine operations via GitLab CI/CD;
- Monitor PostgreSQL (CloudNativePG) and Redis;
- Set up DB monitoring and alerts (replication, lag, failovers);
- Participate in testing backup and restore procedures;
- Work with Vault (secrets, tokens, access management);
- Configure cert-manager and automated certificate renewals;
- Help set up OAuth2 Proxy / Zitadel for services;
- Maintain Terraform/OpenTofu modules for infrastructure (EKS, S3, IAM);
- Work with multi-environment configurations and remote state;
- Write and review simple infrastructure changes via Git.
Requirements
- Hands-on experience with Kubernetes and understanding of core objects (Pods, Deployments, Services, Ingress);
- Ability to read/write Helm values and work with Argo CD;
- Understanding of monitoring/metrics concepts (Prometheus/VictoriaMetrics, Grafana, Alertmanager);
- Ability to work with logs and perform incident root cause analysis.
Desirable
- Experience with PostgreSQL and Redis (backups, replication, monitoring);
- Knowledge of Terraform or OpenTofu;
- Experience with Kafka, ClickHouse, or Vault;
- Understanding of SLIs/SLOs and SRE practices.
What We Offer
- High salary, plus performance bonuses and regular salary reviews;
- Work schedule: Monday to Friday, 9-hour day with a 1-hour lunch break, flexible start between 8:00 and 10:00;
- 24 days of holiday leave;
- Exciting work challenges that allow you to grow to your full potential;
- A strong team of like-minded professionals who will work alongside you on ambitious projects, support your professional development, and share their experience.
Who You Are
We are looking for a candidate who is passionate about site reliability engineering, has a strong technical background, and is eager to tackle complex challenges in a collaborative environment.
Tech Stack
- Kubernetes
- VictoriaMetrics
- Grafana
- Alertmanager
- Fluent Bit
- Fluentd
- Elasticsearch
- Kibana
- GitLab CI/CD
- PostgreSQL
- Redis
- Vault
- Terraform/OpenTofu
Team Description
You will be part of a strong team of like-minded professionals dedicated to delivering ambitious projects and fostering professional development.
```