AI Agent Evaluation Analyst
3 weeks ago
About Mindrift
At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

What We Do
The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real‑world expertise from across the globe.

Who We’re Looking For
We’re looking for curious and intellectually proactive contributors, the kind of person who double‑checks assumptions and plays devil’s advocate. Are you comfortable with ambiguity and complexity? Does an async, remote, flexible opportunity sound exciting? Would you like to learn how modern AI systems are tested and evaluated?

This Opportunity
A flexible, project‑based opportunity well‑suited for:
• Analysts, researchers, or consultants with strong critical thinking skills
• Students (senior undergrads / grad students) looking for an intellectually interesting gig
• People open to a part‑time and non‑permanent opportunity

About the Project
We’re on the hunt for QAs for autonomous AI agents for a new project focused on validating and improving complex task structures, policy logic, and agent evaluation frameworks. Throughout the project, you’ll have to balance quality assurance, research, and logical problem‑solving. This project opportunity is ideal for people who enjoy looking at systems holistically and thinking through scenarios, implications, and edge cases.

What You’ll Be Doing
• Reviewing evaluation tasks and scenarios for logic, completeness, and realism
• Identifying inconsistencies, missing assumptions, or unclear decision points
• Helping define clear expected behaviors (gold standards) for AI agents
• Annotating cause‑effect relationships, reasoning paths, and plausible alternatives
• Thinking through complex systems and policies as a human would to ensure agents are tested properly
• Working closely with QA, writers, or developers to suggest refinements or edge‑case coverage

Requirements
• Excellent analytical thinking: Can reason about complex systems, scenarios, and logical implications
• Strong attention to detail: Can spot contradictions, ambiguities, and vague requirements
• Familiarity with structured data formats: Can read, not necessarily write, JSON/YAML
• Ability to assess scenarios holistically: What’s missing, what’s unrealistic, what might break?
• Good communication and clear writing (in English) to document your findings

Preferred Experience
• Experience with policy evaluation, logic puzzles, case studies, or structured scenario design
• Background in consulting, academia, olympiads (e.g. logic/math/informatics), or research
• Exposure to LLMs, prompt engineering, or AI‑generated content
• Familiarity with QA or test‑case thinking (edge cases, failure modes, "what could go wrong")
• Some understanding of how scoring or evaluation works in agent testing (precision, coverage, etc.)

Benefits
• Get paid for your expertise, with rates that can go up to $17/hour depending on your skills, experience, and project needs
• Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments
• Participate in an advanced AI project and gain valuable experience to enhance your portfolio
• Influence how future AI models understand and communicate in your field of expertise
-
Freelance Agent Evaluation Analyst
5 days ago
Municipio de Rincón de los Sauces, Argentina; Mindrift; Full-time
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency. At Mindrift, innovation meets...
-
Remote AI Agent Evaluation Analyst
3 weeks ago
Municipio de Rincón de los Sauces, Argentina; Mindrift; Full-time
A technology firm specializing in AI is seeking Quality Analysts for a flexible, project-based opportunity. Ideal candidates are analytical thinkers who can review evaluation tasks for logic and detail. Responsibilities include identifying inconsistencies and helping define expected behaviors for AI agents. This part-time role offers up to $17/hour, suitable...
-
AI Engineer
3 weeks ago
Municipio de Rincón de los Sauces, Argentina; GM2; Full-time
AI Software Engineer (Generative AI) – Remote
Integrate Generative AI models, including LLMs, with external APIs, tools, and databases using secure and efficient orchestration patterns. Design, develop, and deploy AI workflows and Agentic AI solutions, enabling seamless orchestration of intelligent agents to plan and perform tasks while leveraging...
-
AI Agent Engineer
3 weeks ago
Municipio de Rincón de los Sauces, Argentina; Rocket Lab | The App Growth Hub; Full-time
About Rocket Lab
Rocket Lab is an App Growth Hub that drives sustainable growth for mobile applications through strategies based on data, creativity, and technology. We help the world's most innovative brands scale their apps through user acquisition, engagement, and retention. Being part of Rocket Lab means joining...
-
Freelance Economic Analyst
2 weeks ago
Municipio de Rincón de los Sauces, Argentina; Mindrift; Full-time
Opportunity Details
At Mindrift, innovation meets opportunity. We believe in using the power of collective intelligence to ethically shape the future of AI.
What We Do
The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our...
-
Product Owner – Brain AI Builder
1 week ago
Municipio de Rincón de los Sauces, Argentina; Jelou AI; Full-time
Overview
The Product Owner role is 100% focused on the AI Builder, leading the execution and prioritization of the roadmap and taking an active part in how the product evolves. This is not a passive coordination role: initiative, prototyping, and decision-making are expected.
Responsibilities
Lead the execution and prioritization of the roadmap for the AI...
-
Senior AI Agent Engineer — Build Multi-Agent AI Systems
3 weeks ago
Municipio de Rincón de los Sauces, Argentina; Rocket Lab | The App Growth Hub; Full-time
An innovative app growth company is seeking a Senior AI Agent Engineer with more than 5 years of software development experience. The candidate will design and develop robust artificial intelligence systems, starting with chatbots and moving toward multi-agent architectures. Valued experience includes TypeScript, tools...
-
Full-Stack Software Engineer
3 weeks ago
Municipio de Rincón de los Sauces, Argentina; Workana; Full-time
Full-Stack Software Engineer - Testing & Evaluation
Remote | Part-Time
Workana Premium is partnering with AfterQuery, a YC‑backed AI research lab. This is a part‑time, task‑based opportunity where you’ll contribute to the training of cutting‑edge AI models by designing and documenting real‑world engineering tasks.
About AfterQuery: AfterQuery is...
-
Generative AI Engineer — Remote, Multi-Agent Orchestration
3 weeks ago
Municipio de Rincón de los Sauces, Argentina; GM2; Full-time
A technology solutions provider is seeking an AI Software Engineer specializing in Generative AI for a remote position. The role involves integrating Generative AI models with various APIs and tools, designing AI workflows, and optimizing multi-agent systems. Candidates should have a strong background in software engineering, especially in API integration...
-
Marketing Analyst
2 weeks ago
Municipio de Rincón de los Sauces, Argentina; Darwin AI; Full-time
Overview
Darwin A.I. is building the future of artificial intelligence, helping businesses make smarter, faster, and more impactful decisions. We're looking for a Marketing Analyst to join our team and turn data into insights that drive growth. In this role, you'll analyze marketing performance, uncover trends, and provide recommendations that shape our...