Senior Site Reliability Engineer
22 hours ago
🟢 Are you in Brazil or Argentina? Join us as we actively recruit in these locations, offering a comfortable remote environment. Submit your CV in English, and we'll get back to you.

We invite a Senior Site Reliability Engineer to join our dynamic team. In this hands-on role, you'll focus on improving the stability, observability, and efficiency of our services. You'll lead initiatives to enhance monitoring, automation, and reliability practices while collaborating with engineering teams to ensure our systems run smoothly and remain resilient.

🟩 What's in it for you:
- Join a top S&P 500 company shaping the future of global payments and financial technology
- Lead initiatives to improve stability, observability, and efficiency of critical services
- Collaborate with engineering teams to solve complex problems and drive operational excellence

✅ Is that you?
- 5+ years in site reliability, observability, or platform engineering
- Experience building SRE or observability practices from scratch
- Hands-on OpenTelemetry experience (SDKs and Collector)
- Strong experience with PromQL/SPL and at least one APM platform (Datadog, Splunk APM, Google Cloud APM)
- Experience designing SLOs and alerting strategies (burn rate, multi-window)
- Familiarity with MuleSoft or API gateway observability
- Awareness of security best practices (PII redaction, access control)
- Experience building automation scripts for CI/CD tasks
- Experience with logging frameworks (Logback, Serilog) and structured JSON logging
- Collaboration, communication, and independent problem-solving skills
- Upper-Intermediate+ English level

🧩 Key responsibilities and your contribution
In this role, you'll own and lead efforts to ensure the reliability, observability, and operational efficiency of our services.
- Define and enforce logging, tracing, and metrics standards across services
- Implement and maintain centralized telemetry pipelines and APM integrations
- Build reusable instrumentation libraries for core languages (Java, .NET, Node.js, Python)
- Establish dashboards and SLO/error budget alerts
- Ensure log/trace correlation and schema consistency
- Implement PII/secret redaction, retention, and cost optimization
- Collaborate with development teams to onboard services and ensure observability readiness
- Develop runbook templates, documentation, and training materials for engineering teams
- Audit alerts, reduce noise, and maintain alert quality standards
- Support incident response through tooling improvement and post-incident telemetry analysis

🎾 What's working at Dev.Pro like?
Dev.Pro is a global company that's been building great software since 2011. Our team values fairness, high standards, openness, and inclusivity for everyone, no matter your background.
🌐 We are 99.9% remote: you can work from anywhere in the world
🌴 Get 30 paid days off per year to use however you like: vacations, holidays, or personal time
✔️ 5 paid sick days, up to 60 days of medical leave, and up to 6 paid days off per year for major family events like weddings, funerals, or the birth of a child
⚡️ Partially covered health insurance after the probation, plus a wellness bonus for gym memberships, sports nutrition, and similar needs after 6 months
💵 We pay in U.S. dollars and cover all approved overtime
📓 Join English lessons and Dev.Pro University programs, and take part in fun online activities and team-building events

Our next steps:
✅ Submit a CV in English
✅ Intro call with a Recruiter
✅ Internal interview
✅ Client interview
✅ Offer
-
Lead Site Reliability Engineer
2 weeks ago
Buenos Aires, Argentina | Ecolab | Full-time
JOB DESCRIPTION: Elevate your engineering prowess to unprecedented levels by joining a team of exceptionally gifted professionals and position yourself among the top echelon in site reliability. The Infrastructure Engineering team is responsible for the design and engineering of solutions and technologies working with other engineering teams to support the...
-
Site Reliability Engineer Observability Lead
3 weeks ago
Buenos Aires, Argentina | Unilever | Full-time
Responsibilities: Create a robust observability framework, including APM, alerting, dashboarding, and event correlation, integrated with an existing observability platform. Perform analytics on previous incidents and usage patterns to better predict issues and take proactive actions. Troubleshoot priority incidents,...
-
Site Reliability Engineer
1 week ago
Capital Federal, Buenos Aires, Argentina | Rp consultoria | Full-time
We are looking for a **Semi-Senior Site Reliability Engineer** to join our team in Buenos Aires, Argentina. What are we looking for in a **Semi-Senior Site Reliability Engineer**? An active contributor to automating tasks that require manual intervention in the software development life cycle. Eager to learn,...
-
Senior Site Reliability Engineer
5 days ago
Buenos Aires, Argentina | Neara | Full-time
Neara is a high-growth, venture-backed Series B tech company headquartered in Sydney, Australia. We work with 75% of the utilities in Australia and New Zealand and are growing rapidly across the US and Europe. Our mission is to revolutionise the utilities industry by helping them future-proof their infrastructure and navigate the challenges of the clean...
-
Site Reliability Engineer
1 week ago
Buenos Aires, Argentina | Launchpad Technologies | Full-time
Launchpad, a people-first technology company, is a leader in North America's rapidly growing tech sector. Through two solutions, Launchpad supports its clients with digital transformation:
- Paasport™, our iPaaS solution, streamlines software integration and automates workflows.
- Nearshore Staff Augmentation, our managed IT staffing service, connects top...
-
Senior Site Reliability Engineer, Observability
2 days ago
Buenos Aires, Argentina | Chainlink | Full-time
Senior Site Reliability Engineer, Observability at Chainlink Labs. Company Overview: Chainlink is the industry-standard oracle platform that brings the capital markets on-chain and powers the majority of decentralized finance (DeFi). Our stack provides essential data,...
-
Senior AI Site Reliability Engineer
2 weeks ago
Buenos Aires, Argentina | SQUIRE | Full-time
**WHO WE ARE** SQUIRE is the leading business management system designed for the needs of barbers, shop owners, and their communities. We believe the pursuit of artistry and autonomy should not be restricted by the complexities of running a business. With SQUIRE, we provide custom-branded tools, resources, and guidance to help barbers of all stages and...
-
Site Reliability Engineer
3 weeks ago
Buenos Aires, Argentina | Exxon Mobil | Full-time
A global energy company is seeking a Site Reliability Engineer to manage and automate operations in Buenos Aires. Ideal candidates should have a Bachelor's degree and over 2 years of experience in site reliability or infrastructure engineering, specifically within a DevOps framework. The role involves developing infrastructure as code, performance...
-
Senior Site Reliability Engineer
3 days ago
Capital Federal, Buenos Aires, Argentina | Business Commercial Management | Full-time
BCM Uruguay is Hiring! Senior Site Reliability Engineer - Remote
Remote - LATAM
**English Level**: B2+ / C1 - Advanced
Contractor - USD
⏱ Full-Time
For a multinational digital engineering services company specializing in next-generation software and digital product development. When an idea appears, motivation and desire are born...
-
Site Reliability Engineer
2 days ago
Buenos Aires, Argentina | VS-Staffing | Full-time
Job Description - Site Reliability Engineer - Remote Costa Rica
**Title**: Site Reliability Engineer
**Location**: Remote, LATAM
**Job Overview**:
**Key responsibilities include**:
- Incident Management: Lead the response to security incidents through identification, containment, analysis, and mitigation strategies to minimize impact.
- Procedure...