Senior Engineering Manager, Site Reliability
7 days ago
As the Senior Manager of Site Reliability Engineering, you will be responsible for ensuring the reliability, scalability, and efficiency of a wide range of client systems, including those of organizations like NASCAR, USOPC, and TGL. This role is pivotal in leading strategic initiatives across the SRE domain to ensure optimal infrastructure and system performance, directly aligning with our clients’ business objectives. Your duties will combine individual contributor responsibilities with management and leadership functions: high-level planning, governance, and continuous enhancement of reliability practices, as well as directing the SRE team to achieve and maintain superior service standards.

Note: For this role we are only looking to hire in Canada and LATAM in the EST time zone.

Leadership and Management Responsibilities
Lead and mentor a team of 5 site reliability engineers as a "player-coach," actively collaborating with them to achieve reliable and scalable systems for our client partners.
Guide, mentor, and foster the professional growth of a five-person SRE development team, establishing well-defined objectives and aligning career progression with overall organizational strategy.
Champion innovation in automation, advocating for technologies that enhance system efficiencies and team productivity.
Implement advanced monitoring to proactively forecast and mitigate system risks, ensuring business continuity.
Align SRE goals with senior leadership's business objectives and client needs.
Drive a culture of continuous improvement, incorporating cutting-edge technologies and best practices into the SRE workflow.
Oversee the development and implementation of training programs that elevate the technical acumen of the SRE team.
Oversee and negotiate with technology vendors to procure tools necessary for advancing our SRE capabilities.
Work with clients to define SLAs and procedures for escalation to third-party vendors.

Individual Contributor Responsibilities
Engage in SRE planning and execution, which includes participating in a rotational on-call schedule for LiveOps support.
Develop and execute a comprehensive site reliability strategy that supports the organization's overarching objectives.
Partner with Solution Architecture to design, implement, and test production systems for high availability, scalability, and performance, ensuring business continuity during high-visibility sports events.
Evolve incident management to include risk assessment and develop organization-wide, long-term mitigation strategies.
Direct and oversee root cause analyses (RCAs) for all major incidents, driving subsequent process improvements and follow-up actions.
Maintain service availability and performance, set and monitor SLAs, and reduce downtime and reliability risks.
Drive adoption of best practices in CI/CD, cloud architecture, and system resilience.
Hands-on execution, with an expectation of being 70%+ billable on client work.

Qualifications
Minimum of 5 years of experience as a Site Reliability Engineer (SRE).
At least 2 years of experience managing an SRE team.
Proven success in Site Reliability Engineering (SRE), DevOps, or a related discipline, with deep expertise in large-scale system architecture, including cloud services and enterprise deployments.
Experience with AWS, CloudWatch, and Datadog is required.
Proven experience in managing technology platforms, particularly during periods of high traffic.
Proven experience in people management, including scheduling, on-call rotations, and fostering team members' professional development through learning and training initiatives.
Advanced hands-on knowledge of automation scripting, infrastructure as code, and contemporary cloud orchestration tools.
Demonstrated ability to contribute to strategic planning and initiatives in a technology-focused environment.
Exceptional problem-solving, organizational, and leadership skills.

Supervisory Responsibilities
Direct leadership and development responsibilities for 5 SRE team members.
Strategic oversight of the department's staff, including hiring, training, and performance evaluation.

Location / Work Hours
This role is 100% remote. Flexibility is required to align with global team schedules, critical project timelines, and LiveOps availability.

Travel Requirements
Up to 5% travel may be required to foster team alignment, participate in key meetings, and support business needs.

Work Environment
The characteristics described here are representative of those an employee encounters while performing essential functions. Reasonable accommodations may be made to enable individuals with disabilities to perform essential functions. The noise level in the work environment is usually moderate.

Next League is the leading digital growth consultant and technology solutions partner helping lead the sports industry to know what’s next. Founded by a team of technology veterans with decades of success in sports, Next League is redefining the digital agency and technology services model to unlock new business growth, digital innovation and technology solutions with a commitment to lasting social impact. The people-first, culture-driven approach puts a focus on building inclusive, curious and collaborative relationships that deliver next level digital experiences.

Salary will be commensurate with a variety of factors, including qualifications, experience and geographic location.
We strive to provide the best working environment for our team members by offering the following benefits:
Retirement Plan Programs (with a company match)
Unlimited Vacation & Sick Time
Excellent Health Benefits Packages
Flexible Working Opportunities (we are a 100% remote business)

Diversity, Equity, and Inclusion are the core of our culture at Next League. Providing a safe and inclusive space for all team members to ensure their voice is heard is critical to our success. Next League provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.

Pay Range: $55,000 - $105,000 USD
Seniority Level: Mid-Senior level
Employment Type: Full-time
Job Function: Engineering and Information Technology
Industries: Software Development
-
Site Reliability Engineering
4 weeks ago
Chubut, Argentina | Capgemini Engineering | Full-time
Site Reliability Engineering (SRE) - Observability. ARS - Site Reliability Engineer (SRE) - Observability. Capgemini Engineering is the world leader in engineering services. We bring together a global team of engineers, scientists, and architects to help the world’s most innovative companies unleash potential. Key Responsibilities: Implement telemetry (logs,...
-
Remote Senior SRE Manager: Reliability
5 days ago
Argentina | Next League | Full-time
A digital growth consultant is seeking a Senior Engineering Manager for Site Reliability to lead a team of engineers and ensure the reliability and efficiency of client systems. The position combines leadership duties with hands-on technical tasks, focusing on maintaining high service standards for major clients. Candidates...
-
Site Reliability Engineer: Scale, Reliability
4 weeks ago
Argentina | Capchase | Full-time
Join a forward-thinking company as a Site Reliability Engineer, where you'll play a crucial role in building scalable, high-performing systems. This position offers the opportunity to shape the future of reliability engineering while ensuring the availability, latency, and performance of our systems. You'll collaborate with a diverse team to define the...
-
Site Reliability Engineer
3 weeks ago
Argentina | Capchase | Full-time
Capchase provides flexible payment solutions to B2B software, cloud, and AI companies. Our core product, Capchase Pay, offers a buy-now-pay-later payment option for B2B SaaS, hardware, and cloud purchases, helping companies sell more and collect cash faster. Founded in 2020 and headquartered...
-
Senior Site Reliability Engineer
2 weeks ago
Argentina | Mas Global Consulting Llc | Full-time
Hi, this is Monica Hernandez, Founder and CEO of MAS Global. I started MAS with the idea that we could be more than a business, which is why we like to say that MAS is More. I was born and raised in Medellin, Colombia, and thanks to a scholarship I became a Software Engineer and built a career in the US, where I now live. Starting MAS was my way to give back...
-
Site Reliability Engineer
1 week ago
Argentina | Epsilon Solutions Ltd. SA de CV. | Full-time
Job profile: Sr. Site Reliability Engineer
Location: Argentina (REMOTE)
Job Type: Full Time Contract
Overview
We are looking for a highly skilled SRE with an engineering background to support our top clients with strong technical exposure and...
-
Senior SRE: Cloud Reliability
2 weeks ago
Argentina | Mas Global Consulting Llc | Full-time
A leading technology firm in Argentina is seeking a Senior Site Reliability Engineer to ensure system reliability, scalability, and security. The ideal candidate will have over 5 years of experience in Site Reliability Engineering or DevOps, and strong skills in AWS, Docker, Kubernetes, and automation. This role involves driving automation initiatives,...
-
Senior DevOps
4 weeks ago
Chubut, Argentina | Ingenierojob | Full-time
Senior DevOps / Site Reliability Engineer (Azure) (Ref-Lch)
ARS 1.200.000 - 1.500.000
We are looking for a highly skilled Senior DevOps / Site Reliability Engineer with deep experience in Azure cloud, CI/CD automation, and secure workload identity. This role is ideal for someone who masters modern...
-
Senior Site Reliability Engineer
1 day ago
Argentina | Laravel | Full-time
At Laravel, we don’t just build tools; we build the foundation that empowers millions of developers to ship their dreams. We are looking for a Senior Site Reliability Engineer to help us scale that mission by ensuring our global infrastructure remains as elegant and reliable as the code we write. If you are energized by the challenge of managing...
-
Site Reliability Engineering
4 days ago
Capital Federal, Buenos Aires, Argentina | Modo | Full-time
We are MODO, the fintech of Argentina's banks that is revolutionizing the way people pay and save with promotions in Argentina. We sit at the center of the payments ecosystem, developing innovative payment experiences via QR, NFC, and online with every payment method, and creating the best place to run and enjoy promotions. In addition, we create the...