Cloud Infrastructure Reliability Engineer

1 month ago


Córdoba, Córdoba, Argentina Avature Full-time

About the Role:

Avature's Coverage team is focused on enhancing and sustaining the quality of our monitoring systems and methodologies, particularly during on-call duties and related incident detection efforts. The team is responsible for managing and continuously improving our server and service monitoring, and for maintaining a comprehensive view of system reliability.

Your Responsibilities:

  • Gain a thorough understanding of Avature's infrastructure and operational processes.
  • Assist in establishing standards with a DevOps/SRE perspective and promote adherence to these standards.
  • Identify and rectify vulnerabilities within our infrastructure to guarantee service reliability.
  • Formulate strategies to avert and mitigate disruptions in essential services.
  • Occasionally engage in troubleshooting during active incidents.
  • Collaborate with development and engineering teams to integrate SRE methodologies from the initial phases of the software development lifecycle.

Your Daily Activities:

  • Contribute to the formulation and execution of SRE policies and practices.
  • Work alongside other infrastructure and development teams to continuously enhance their services' monitoring and observability.
  • Participate in incident management, conducting post-incident reviews and suggesting preventive actions to mitigate future issues.
  • Occasionally troubleshoot ongoing incidents.
  • Investigate methodologies and assess metrics to optimize how teams access information regarding their systems.

About You:

  • Proficient in observability tools: logs (ELK stack), metrics (e.g., Prometheus, Grafana), and tracing (e.g., Jaeger, OpenTelemetry).
  • Experience in designing and maintaining resilient and distributed systems.
  • Strong background in Linux system administration.
  • Excellent analytical and troubleshooting capabilities.
  • Infrastructure-as-code mindset.
  • Proficient in software development (Python, Golang) and configuration management (Puppet, Ansible).
  • Familiarity with incident management tools, such as Splunk On-Call.

About Avature:

Avature is a leading enterprise SaaS solution provider specializing in global talent acquisition and management. We are committed to high-quality engineering and exceptional customer service, recognized as innovators in the large enterprise market. Our clientele includes over 650 companies worldwide, including numerous Fortune 500 firms, major consulting agencies, and significant financial institutions.

What We Offer:

  • A dynamic, fast-paced, and engaging work environment.
  • Flexible working hours.
  • Options for remote work or office attendance.
  • Quarterly salary reviews.
  • Opportunity to receive part of your salary in US dollars.
  • Three weeks of vacation starting from the first year.
  • Four weeks of paternity leave.
  • Comprehensive health coverage (family plan).
  • Four days annually for professional development events.
  • A week off at the end of the year.
  • Reimbursement for internet service expenses.
  • Days off for birthdays.

At Avature, we celebrate diversity and inclusion, recognizing that each unique individual contributes to our team's success. We are dedicated to promoting equal opportunities and considering all qualified applicants fairly.



  • Córdoba, Córdoba, Argentina Avature Full-time

    About the Role: Avature's Coverage team is dedicated to maintaining and improving the quality of our monitoring tools and practices as applied during on-call shifts or other related incident-spotting endeavors. As a Cloud Reliability Engineer, you'll strive to implement tools and processes that improve observability, monitoring, and incident management,...


  • Córdoba, Córdoba, Argentina Avature Full-time

    Overview: At Avature, we are focused on enhancing the reliability and quality of our monitoring systems and practices during on-call duties and related incident management activities. Our Coverage team plays a crucial role in overseeing and continuously refining our server management, service monitoring, and alerting protocols to ensure a comprehensive view...


  • Córdoba, Córdoba, Argentina Techunting Full-time

    Job Title: Principal Site Reliability Engineer. We are seeking a highly skilled Principal Site Reliability Engineer to lead our SRE team and optimize infrastructure for high-performance applications. The ideal candidate will have a strong background in software development, automation, and cloud management. Key Responsibilities: Lead SRE team to ensure high...


  • Córdoba, Córdoba, Argentina Intuition Machines, Inc. Full-time

    Intuition Machines, Inc. leverages advanced AI/ML technologies to develop enterprise-grade security solutions. Our innovative research impacts systems that cater to hundreds of millions globally, supported by a diverse team operating from various locations. You may recognize our flagship product, the hCaptcha security suite. Our methodology is...


  • Córdoba, Córdoba, Argentina Intuition Machines, Inc. Full-time

    Intuition Machines, Inc. leverages AI and machine learning to develop cutting-edge security solutions for enterprises. Our innovative research is applied to systems that cater to hundreds of millions of users globally, supported by a distributed team. One of our flagship products is the hCaptcha security suite, and our operational philosophy emphasizes...


  • Córdoba, Córdoba, Argentina Intuition Machines, Inc. Full-time

    Intuition Machines, Inc. specializes in utilizing AI and machine learning to develop enterprise-level security solutions. Our innovations are applied to systems that cater to hundreds of millions of users globally, supported by a distributed team. You may recognize our flagship product, the hCaptcha security suite. Our methodology is straightforward: minimal...


  • Córdoba, Córdoba, Argentina Techunting Full-time

    Principal Site Reliability Engineer. We are seeking a highly skilled Principal Site Reliability Engineer with extensive experience in leading SRE teams and optimizing infrastructure for high-performance applications. The ideal candidate will have a strong background in software development, automation, and cloud management. Key Responsibilities: Lead SRE teams...


  • Córdoba, Córdoba, Argentina Avature Full-time

    Overview: Avature's Coverage team is focused on enhancing the quality of our monitoring tools and practices, particularly during on-call shifts and incident detection efforts. The team's responsibilities encompass the management and ongoing enhancement of our server and service monitoring, as well as a comprehensive view of system reliability. Your...

  • Site Reliability Engineer

    2 weeks ago


    Córdoba, Córdoba, Argentina Internetwork Expert Full-time

    About the Role: At Internetwork Expert, we're pushing the boundaries of AI/ML-powered security solutions. As a Site Reliability Engineer, you'll be at the forefront of building high-performance, secure, and cost-effective systems that serve millions of users worldwide. Our flat organization and customer-focused approach mean you'll work...

  • Site Reliability Engineer

    4 weeks ago


    Córdoba, Córdoba, Argentina Internetwork Expert Full-time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Internetwork Expert. As a Site Reliability Engineer, you will play a critical role in ensuring the high availability, performance, and security of our internet-scale systems. Key Responsibilities: Design and implement scalable and highly available systems to handle...

  • Cloud Architect H/F

    3 weeks ago


    Córdoba, Córdoba, Argentina Infotel Full-time

    We are seeking a highly skilled Cloud Architect to support our IT transition to Azure. As a key member of our team, you will play a crucial role in designing and implementing a cloud adoption framework tailored to our company's needs. Key Responsibilities: Strategic Planning and Design: Assess our current cloud IT infrastructure, applications, and workflows to...

  • Site Reliability Engineer

    3 weeks ago


    Córdoba, Córdoba, Argentina Intuition Machines, Inc. Full-time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Intuition Machines, Inc. As a Site Reliability Engineer, you will play a critical role in ensuring the performance, availability, and security of our internet-scale systems. Key Responsibilities: Design and implement solutions to enhance system performance, availability,...

  • Cloud Architect

    3 weeks ago


    Córdoba, Córdoba, Argentina Techunting Full-time

    About Techunting: We are a leading technology company seeking a highly skilled and experienced Senior Cloud Architect to drive the design and implementation of scalable, secure, and resilient cloud architectures. Key Responsibilities: Cloud Architecture Design: Design and implement scalable, cloud-native SaaS applications and services that meet the needs of our...

  • DevOps Engineer

    2 weeks ago


    Córdoba, Córdoba, Argentina Dlocal Corp Full-time

    About dLocal: dLocal is a global payments processor that enables top brands to collect payments in 40 emerging markets. Our mission is to simplify payment expansion and increase conversion rates for our merchants. Our Team: We're a diverse and dynamic team of 900+ professionals from 25+ nationalities. We're passionate about building a global career that impacts...

  • DevOps Engineer

    3 weeks ago


    Córdoba, Córdoba, Argentina Proofpoint Full-time

    About the Role: We are seeking a highly skilled DevOps Engineer to join our core detection team at Proofpoint. As a DevOps Engineer, you will play a critical role in designing, implementing, and supporting software systems that power our industry-leading products focused on email security. Our team is responsible for protecting the corporate email platforms of...

  • DevOps Engineer

    3 weeks ago


    Córdoba, Córdoba, Argentina Dlocal Corp Full-time

    About dLocal Corp: dLocal Corp enables global companies to collect payments in 40 emerging markets, simplifying payment expansion and increasing conversion rates. As a payments processor and merchant of record, we facilitate our merchants' entry into the world's fastest-growing markets. Our dynamic culture offers flexibility, remote work options, travel...

  • Senior Software Engineer

    3 weeks ago


    Córdoba, Córdoba, Argentina Proofpoint Full-time

    About the Role: At Proofpoint, we're committed to bringing passion and customer focus to the business. As a Senior Software Engineer on our TAP Dashboard team, you'll be building massive-scale systems used by Fortune 100 customers to protect their business-critical communications. Your Key Responsibilities: Design and develop complex systems that interface with...

  • Senior Cloud Engineer

    2 weeks ago


    Córdoba, Córdoba, Argentina Proofpoint Full-time

    About the Role: We are seeking a highly motivated Senior Software Engineer to join our Proofpoint team. As a key member of our product teams, you will be responsible for building and maintaining large-scale cloud-based services that form the backbone of our product suite. Key Responsibilities: Design, implement, and operate large-scale cloud-based services that...