Senior AI Site Reliability Engineer
1 week ago
**WHO WE ARE**
SQUIRE is the leading business management system designed for the needs of barbers, shop owners, and their communities. We believe the pursuit of artistry and autonomy should not be restricted by the complexities of running a business. With SQUIRE, we provide custom-branded tools, resources, and guidance to help barbers of all stages and experience levels attract and retain more customers, efficiently manage their shop operations, and increase their revenue.
Founded in 2015, SQUIRE is trusted by barbers in 4,000+ shops in more than a thousand cities around the globe. From streamlined booking and opening new shops to real-time earning dashboards and building lasting customer relationships, SQUIRE supports shop owners in seamlessly bridging the gap between their personal craft and business goals. SQUIRE enables barbers everywhere to unlock their full potential both as artists and as entrepreneurs.
**SUMMARY**
As a Senior AI Site Reliability Engineer, you will bring an AI-first mindset to solving classic reliability challenges. You’ll design, prototype, and deploy intelligent automation that improves observability, incident response, performance tuning, and operational efficiency across SQUIRE’s platform. This role is highly cross-functional: you’ll collaborate with engineering, infrastructure, and product teams to identify where AI can create leverage, then build and scale those solutions into production.
**REPORTS TO**:
- Senior Director, Platform Engineering
**JOB DUTIES AND RESPONSIBILITIES**:
- Develop and deploy AI/ML-driven solutions for monitoring, anomaly detection, and predictive alerting to improve system reliability and reduce MTTR.
- Use AI techniques to optimize capacity planning, autoscaling, and resource utilization across distributed systems.
- Automate repetitive operational tasks with intelligent agents and large-scale data analysis.
- Integrate LLMs and generative AI into incident response, post-mortem analysis, and business continuity.
- Partner with platform and product engineering teams to embed AI-based observability into services from the ground up.
- Continuously evaluate new AI/ML methods and tools to expand SQUIRE’s AI-driven SRE capabilities.
- Drive a culture of experimentation: build prototypes, run pilots, measure results, and productionize what works.
- The duties and responsibilities outlined above are not a comprehensive list, and additional tasks may be assigned from time to time based on business needs.
**REQUIREMENTS AND QUALIFICATIONS**:
- 5+ years of experience in Site Reliability Engineering, DevOps, or related roles.
- Proven experience using AI/ML (supervised learning, anomaly detection, LLMs, etc.) to solve operational or reliability problems.
- Strong background in distributed systems, cloud infrastructure (AWS preferred), and containers and container orchestration (Docker, ECS, Elastic Beanstalk).
- Proficiency with observability stacks (Datadog, Sentry, Prometheus, etc.).
- Solid programming/scripting skills in Python, Go, or similar, with experience integrating ML/AI libraries and APIs.
- Hands-on experience with automation frameworks and infrastructure as code (Terraform, CloudFormation, etc.).
- Excellent analytical and problem-solving skills, with the ability to innovate in operational domains.
- Strong communication and collaboration skills across technical and non-technical stakeholders.
- English proficiency is a must. You will be interacting with English-speaking coworkers, so you need to communicate your ideas clearly.
- Must be based in Buenos Aires.
- Availability to work on-site in our office in CABA two days a week (Tuesdays and Thursdays).
**NICE TO HAVE**:
- Familiarity with generative AI/LLM deployment (e.g., for operational assistants, automated runbooks).
- Experience with predictive scaling, proactive fault detection, or automated incident management systems.
- Contributions to AIOps/MLOps tooling or open-source reliability projects.
**Interview Accommodations**
**Equal Employment Opportunity**
SQUIRE provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
This applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.
**Pay Transparency Nondiscrimination Provision**
SQUIRE will not discharge or in any other manner discriminate against employees or applicants because they have inquired about, discussed, or disclosed their own pay or the pay of another employee or applicant. However, employees who have access to the compensation information of other employees or applicants