Lead AI Inference Engineer

5 days ago


Buenos Aires, Argentina | CFO TECH LATAM | Full-time

Lead AI Inference Engineer | Tether Operations Limited

Join Tether and Shape the Future of Digital Finance

At Tether, we’re not just building products, we’re pioneering a global financial revolution. Our cutting-edge solutions empower businesses, from exchanges and wallets to payment processors and ATMs, to seamlessly integrate digital assets across blockchains.

This role focuses on edge AI: porting and enhancing inference engines to run efficiently on edge devices.

Overview

This is a lead role responsible for delivering cohesive, production-ready local AI systems by addressing both the low-level edge challenges and the product-facing needs of app developers.

Responsibilities

  • Deploy machine learning models to edge devices using the frameworks llama.cpp, ggml and ONNX
  • Collaborate closely with researchers to assist in coding, training and transitioning models from research to production environments
  • Integrate AI features into existing products, enriching them with the latest advancements in machine learning
  • Manage a cross-functional team (pod) of middleware (JS), foundation (C++), QA and documentation engineers to produce high-quality deliverables
  • Regularly assess our market position relative to similar products or platforms
  • Leverage the expertise of technical architects to ensure robust architectural choices and code quality
  • Ensure stable releases by following precise internal release processes

Requirements

  • Excellent programming skills in C++
  • Strong experience with the llama.cpp and ggml inference engines for deployment to specific GPU architectures
  • Good understanding of deep learning concepts and model architectures
  • Experience with transformers and LLMs
  • Demonstrated ability to rapidly assimilate new technologies and techniques
  • Experience managing a small, specialized, cross-functional team (pod) of 3-5 people
  • A genuine passion for building good products that improve people’s lives
  • A degree in Computer Science, AI, Machine Learning, or a related field, complemented by a solid track record in AI R&D
  • You have experience with AWS, containerization platforms, orchestration, and automated testing suites (Maestro, Appium)
  • You understand the difficulties, nuances and importance of p2p technology
  • You have worked with MLC, TVM or similar frameworks
  • You have experience with Vulkan and CUDA
  • You have productionized models

Important information for candidates

Recruitment scams have become increasingly common. To protect yourself, please keep the following in mind when applying for roles:

  • Apply only through our official channels. We do not use third-party platforms or agencies for recruitment unless clearly stated. All open roles are listed on our official careers page.
  • Verify the recruiter’s identity. All our recruiters have verified LinkedIn profiles. If you’re unsure, you can confirm their identity by checking their profile or contacting us through our website.
  • Be cautious of unusual communication methods. We do not conduct interviews over WhatsApp, Telegram, or SMS. All communication is done through official company emails and platforms.
  • Double-check email addresses. All communication from us will come from emails ending with @tether.to or @tether.io.
  • We will never request payment or financial details. If someone asks for personal financial information or payment at any point during the hiring process, it is a scam. Please report it immediately.

When in doubt, feel free to reach out through our official website. We look forward to hearing from you.



  • Buenos Aires, Argentina | CFO TECH LATAM | Full-time

    A leading tech company is seeking a Lead AI Inference Engineer to manage the deployment of machine learning models to edge devices. You will integrate advanced AI features into products while leading a cross-functional team. Ideal candidates hold a degree in a related field and have extensive experience with C++ and AI technologies. This position offers...


  • Buenos Aires, Argentina | Pavago | Full-time

    A leading digital finance company based in Buenos Aires is seeking a Lead AI Inference Engineer to drive the delivery of cohesive, production-ready local AI systems. The ideal candidate will have excellent programming skills in C++, strong experience with AI model deployment, and the ability to manage a small, specialized team. Passion for building products...


  • Buenos Aires, Argentina | Pavago | Full-time

    Join Tether and Shape the Future of Digital Finance At Tether, we’re not just building products, we’re pioneering a global financial revolution. Our cutting‑edge solutions empower businesses—from exchanges and wallets to payment processors and ATMs—to seamlessly integrate reserve‑backed tokens across blockchains, enabling instant, secure, and...


  • Buenos Aires, Argentina | TD SYNNEX Argentina S.R.L. | Full-time

    A leading fintech company is looking for a Senior AI Inference Engineer specializing in llama.cpp to work 100% remote. The engineer will optimize the C++ layer for local AI inference engines on edge devices. Responsibilities include deploying models, collaborating with researchers, and integrating the latest AI features into products. Candidates should have...


  • Buenos Aires, Argentina | We Clone You | Full-time

    A tech-focused operations firm is seeking a Senior AI Inference Engineer to enhance and port AI inference engines like llama.cpp for edge devices. In this role, you will work close to the metal, ensuring models load quickly and run efficiently. Responsibilities include deploying models, collaborating with researchers, and integrating AI features into...


  • Buenos Aires, Argentina | Tether Operations Limited | Full-time

    Join Tether and Shape the Future of Digital Finance At Tether, we’re not just building products, we’re pioneering a global financial revolution. Our cutting-edge solutions empower businesses—from exchanges and wallets to payment processors and ATMs—to seamlessly integrate reserve-backed tokens across blockchains. By harnessing the power of blockchain...


  • Buenos Aires, Argentina | We Clone You | Full-time

    Senior AI Inference Engineer (llama.cpp specialist) 100% Remote Tether Operations Limited About the Job You'll work on the C++ layer that powers local AI, porting and enhancing inference engines like llama.cpp, ONNX and similar, to run efficiently on edge devices. Your focus is on the runtime: making models load faster, run leaner, and perform well across...


  • Buenos Aires, Argentina | Tether Operations Limited | Full-time

    Join Tether and Shape the Future of Digital Finance. At Tether, we’re not just building products, we’re pioneering a global financial revolution. Our cutting-edge solutions empower businesses, from exchanges and wallets to payment processors and ATMs, to seamlessly integrate reserve-backed tokens across blockchains. By harnessing the power of...


  • Buenos Aires, Argentina | TD SYNNEX Argentina S.R.L. | Full-time

    Senior AI Inference Engineer (llama.cpp specialist) 100% Remote Tether Operations Limited Join Tether and Shape the Future of Digital Finance At Tether, we’re not just building products, we’re pioneering a global financial revolution. Our cutting-edge solutions empower businesses—from exchanges and wallets to payment processors and ATMs—to seamlessly...

  • AI Engineer

    19 hours ago


    Buenos Aires, Argentina | Elevation AI, Inc. | Full-time

    Elevation AI is seeking a hands-on, inventive AI Engineer with deep expertise in agent development to design, build, and scale intelligent, autonomous systems. You’ll be at the forefront of shaping how generative AI is applied in production—building agents that combine reasoning, planning, tool integration, and multi-step orchestration to transform...