Senior Site Reliability Engineer

3 days ago


Buenos Aires, Argentina Marvik Full-time

**What’s the opportunity?**

We’re looking for a **Site Reliability Engineer (SRE)** to join our team.
As an SRE, you're expected to ask key questions like:
What data do we need to understand how our systems are performing?
How do we collect that data?
What patterns are we looking for, and what do they mean?
Who needs to be alerted when something isn’t working?
Are there any systems where we need more or better data?

An SRE designs systems and processes to answer these questions and automate support and response wherever possible.

**Responsibilities**:
**Own OpenTelemetry Pipelines**: Design, implement, and maintain observability pipelines across logs, metrics, and traces, ensuring standardized, scalable, and efficient data ingestion. Optimize ingestion strategies for cost, performance, and usability.

**Empower Engineering Teams**: Build self-service automation and tooling that lets development teams implement observability without needing manual SRE support. Drive best practices and ensure teams take ownership of their telemetry.

**Support Incident Management**: Act as the engineering arm of the Incident Management Team—designing playbooks, processes, checklists, and automations to support teams during incidents.

**Collaborate Across Teams**: Work with teams across the business to understand their monitoring, alerting, and SLO/SLA needs. Design solutions that meet or exceed these requirements and influence architectural decisions from the start to ensure scalability and resilience.

**Automate Observability Infrastructure**: Use Infrastructure-as-Code (IaC) to manage monitoring tools, alert rules, and observability configurations across OTEL pipelines.

**Define Baseline Observability Standards**: Create base-level requirements to ensure all infrastructure and code are monitored consistently and accurately.

**Own Technical and Security Health**: Take full ownership of infrastructure reliability and ensure alignment with key availability and security KPIs.

**Optimize Alerting Systems**: Continuously fine-tune alerting to reduce noise, ensure alerts are actionable, and improve response efficiency.

**If you have**

4+ years of experience as an SRE or in a similar observability-focused role.

Strong Kubernetes expertise, including components, deployment practices, and monitoring.

Familiarity with OpenTelemetry—setting up collectors, instrumentation, and pipeline optimization.

Experience with tools like Grafana, Prometheus, Loki, New Relic, or Datadog.

Hands-on experience with Infrastructure-as-Code (Terraform) and GitOps CI/CD (e.g., ArgoCD, GitHub Actions).

Experience integrating incident platforms (PagerDuty, Jira) into alerting workflows.

Strong scripting skills (Python, Go, etc.) to automate observability tasks.

A problem-solving mindset and ability to collaborate across teams to improve reliability.

**It’s a plus**:
Cloud experience, especially with AWS and ECS workloads.

Experience managing observability pipelines at scale in high-throughput environments.

Familiarity with Configuration-as-Code tools (Ansible, Chef, or SaltStack).

Experience with database performance monitoring in large-scale distributed systems.


  • Senior DevOps

    2 weeks ago


    Buenos Aires, Argentina Sofka Full-time

    Senior DevOps & Site Reliability Engineer - Remote - Latam Sofka Senior Site Reliability Engineer (Sre) / DevOps Emi Labs At Emi Labs we are on a mission to increase Frontline Workers’ access to professional opportunities. This is a 2.7 billion population that accounts for 80% of the world’s workforce. They are digitally invisible, as there’s little to...


  • Buenos Aires, Argentina Launchpad Technologies Full-time

    A technology company is seeking a Senior Site Reliability Engineer to ensure the reliability and performance of its infrastructure. This pivotal role involves both hands-on technical work and leadership in reliability initiatives. Candidates should have over 7 years of experience with cloud infrastructure, particularly Azure and AWS. This position offers a...


  • Capital Federal, Buenos Aires, Argentina Emprego AR Full-time

    **Job details**: Overview: We are more than a software company. We want to be known as a company that does the right thing, no matter the challenge or controversy. We are committed to creating a culture that values every person and every experience. Individual life experiences shape the way we interact with the world, which is why we encourage...


  • Buenos Aires, Argentina GlobalLogic Full-time

    Job: IRC166637. Location: Argentina, Buenos Aires. Designation: Associate Specialist Engineer. Experience: 3-5 years. Function: Engineering. Skills: Cloud Infrastructure, DevOps, Hadoop, MySQL, Spark. **Description**: Our client delivers personalized experiences, cross-channel messaging and loyalty programs that add value to the customer...


  • Buenos Aires, Argentina Chainlink Full-time

    Senior Site Reliability Engineer, Observability Join to apply for the Senior Site Reliability Engineer, Observability role at Chainlink Labs. Company Overview Chainlink is the industry‑standard oracle platform that brings the capital markets on‑chain and powers the majority of decentralized finance (DeFi). Our stack provides essential data,...

  • Senior DevOps

    4 days ago


    Buenos Aires, Argentina Itps Full-time

    Senior DevOps / Site Reliability Engineer (Azure) (Ref-Lch) We are looking for a highly skilled Senior DevOps / Site Reliability Engineer with deep experience in Azure cloud, CI/CD automation, and secure workload identity. This role is ideal for someone who masters modern DevOps practices, understands cloud architecture at scale, and can lead the design and...


  • Buenos Aires, Argentina Exxon Mobil Full-time

    A global energy company is seeking a Site Reliability Engineer to manage and automate operations in Buenos Aires. Ideal candidates should have a Bachelor's degree and over 2 years of experience in site reliability or infrastructure engineering, specifically within a DevOps framework. The role involves developing infrastructure as code, performance...


  • Buenos Aires, Buenos Aires C.F., Argentina Blockscout Limited Full-time

    Blockscout is a leading provider of indexing and UI services for EVM chains. Our team hosts explorers for many of the largest chains in the industry. Reliability is vital to our company's success. We are looking for a Site Reliability Engineer to strengthen our DevOps and Support teams. Key responsibilities: Monitor systems: Proactively watch production systems...


  • Buenos Aires, Argentina VirginPulse Full-time

    Overview: **Now is the time to join us!** We’re Personify Health. We’re the first and only personalized health platform company to bring health, wellbeing, and navigation solutions together. Helping businesses optimize investments in their members while empowering people to meaningfully engage with their health. At Personify Health, we believe in...


  • Buenos Aires, Buenos Aires C.F., Argentina Dev Full-time

    Are you in Brazil or Argentina? Join us as we actively recruit in these locations, offering a comfortable remote environment. Submit your CV in English, and we'll get back to you. We invite a Senior Site Reliability Engineer to join our dynamic team. In this hands-on role, you'll focus on improving the stability, observability, and efficiency of our services....