Senior Data Pipeline Engineer
4 days ago
Apiphani is a technology-enabled managed services company dedicated to redefining what it means to support mission-critical enterprise workloads. We're a small but rapidly growing company, which means there is plenty of room for growth, and learning opportunities abound.

Apiphani is dedicated to creating a diverse and inclusive work environment for all as a fundamental component of our business. Diversity and inclusion are the bedrock of creativity and innovation. Without diversity of experience and thought, we would fail to progress as a company and as a team. Apiphani strives to foster an environment of belonging, where every employee feels respected, valued, and empowered. We embrace the unique experiences, perspectives, and cultural backgrounds that only you can bring to the table.

Job Description
We are seeking an experienced data pipeline engineer who uses modern data engineering practices to transform raw data into reliable, consumable data products on AWS and other cloud platforms. The role is responsible for designing, developing, testing, and deploying scalable data pipelines, data warehouses, data lakes, and data products that support business and analytics needs. As a senior member of the analytics team, you will own critical production data pipelines and shape the evolution of our customer-facing data products and metrics. You will work closely with data analysts, data scientists, and other stakeholders to ensure data quality, reliability, and availability across batch and streaming workloads.

Typical activities include developing and configuring jobs for data ingestion, transformation, and enrichment; designing efficient databases and tables; and exposing curated data to downstream consumers. The role focuses on efficiency and resilience by aligning data platforms and pipelines with business goals and cloud architecture best practices. You will also influence data and platform roadmaps by providing technical leadership, setting best practices, and mentoring other engineers.

Job Duties
Design, develop, and maintain scalable batch and streaming data pipelines using Apache Spark and cloud-native services (e.g., AWS Glue, EMR, Kinesis, and Lambda).
Utilize and optimize Apache Spark (RDDs, DataFrames, Spark SQL) for distributed processing of large datasets, including both batch and near real-time use cases.
Implement robust ETL/ELT processes to ingest and transform data from databases, APIs, files, and event streams into curated datasets stored in S3 data lakes, data warehouses (such as Amazon Redshift), and data marts.
Implement data quality checks, validation rules, and governance controls (including schema enforcement, profiling, and reconciliation) to ensure accuracy, completeness, and consistency.
Develop and maintain logical and physical data models, schemas, and metadata in catalogs to support analytics, BI, and ML consumption.
Create and manage data warehouses, data lakes, and data marts on AWS and other cloud platforms (such as Azure or GCP) following modern architectural patterns.
Collaborate with data analysts, data scientists, and business stakeholders to understand data requirements and translate them into scalable pipeline and modeling solutions.
Collaborate with DevOps, platform, security, and compliance teams to ensure secure, reliable cloud implementations and adherence to organizational standards.
Develop cloud and data architecture documentation, including diagrams, guidelines, and best practices, to enable knowledge sharing and reuse.
Troubleshoot and resolve data pipeline and job issues across development and production environments, ensuring minimal downtime and preserving data integrity.
Continuously optimize data pipelines for performance, cost, reliability, and data quality using best practices in distributed data engineering and cloud resource tuning.
Build algorithms and prototypes that combine and reconcile raw information from multiple sources, including resolving data conflicts and inconsistencies.
Provide technical leadership for the analytics data stack, including reviewing designs, establishing standards for observability and reliability, and guiding junior engineers in delivering high-quality solutions.
Define and manage data and cloud infrastructure using infrastructure-as-code tools such as Terraform (and/or AWS CDK/CloudFormation) to ensure consistent, repeatable environments across development, test, and production.
Participate actively in agile ceremonies (backlog refinement, sprint planning, daily stand-ups, reviews), including estimating and updating user stories, tracking progress, and collaborating closely with data product and analytics stakeholders.

Required Skills
Bachelor's degree in Computer Science, Engineering, Mathematics, or related field, or equivalent work experience.
6+ years of experience in data engineering or closely related roles, working with large, complex datasets.
Demonstrated experience owning production-grade data pipelines end-to-end, from design and implementation through monitoring, incident response, and continuous improvement.
Extensive hands-on experience with Apache Spark for large-scale data processing, including RDDs, DataFrames, and Spark SQL.
Familiarity with big data ecosystem components such as HDFS, Hive, and HBase, and their cloud-native equivalents on AWS and other clouds.
Experience with SQL and NoSQL databases such as MySQL, PostgreSQL, DynamoDB, or similar technologies.
Strong proficiency in SQL and at least one programming language such as Python (preferred) for data processing, automation, and orchestration glue code.
Experience with data pipeline orchestration and scheduling tools such as AWS Step Functions, Amazon Managed Workflows for Apache Airflow (MWAA), or Apache Airflow.
Experience with cloud-based data platforms and services, ideally AWS (S3, Glue, EMR, Redshift, Kinesis, Lambda), with exposure to Azure or GCP as a plus.
Experience designing and implementing data warehouses and data lakes, including partitioning, file formats, and performance optimization.
Experience with data quality, automated data testing, and data governance methodologies and tools; familiarity with lineage, cataloging, and access controls.
Strong analytical and problem-solving skills, high attention to detail, and clear written and verbal communication.
Ability to work independently and collaboratively in a fast-paced, agile, and cross-functional environment.
Experience working with a modern data catalog such as Alation, Collibra, or similar tools is a plus.
Ability to prepare and curate data for prescriptive and predictive modeling (e.g., features for ML models) is a plus.
Hands-on experience with infrastructure as code, preferably Terraform (and/or AWS CDK/CloudFormation), to provision and manage data and cloud resources.
Practical experience working in an agile delivery model, including breaking down work into user stories, sizing and updating them during the sprint, and delivering incrementally.

Base Salary
$45,000–$90,000 USD

Company Benefits
Medical/dental/vision – 100% paid for employees, 50% paid for dependents
Life and disability – 100% paid for employees
401(k) – 3% contribution, no employee contribution required
Education and tuition reimbursement – up to $50,000 annually
Employee Stock Options Plan
Accident, critical illness, hospital indemnity benefits offered through our providers
Employee Assistance Program
Legal assistance
Paid Time Off – up to 6 weeks per year
Sick Leave – up to 2 weeks per year
Parental Leave – up to 12 weeks

Seniority Level: Mid-Senior level
Employment Type: Full-time
Job Function: Engineering and Information Technology
Industries: IT Services and IT Consulting
-
Senior Data Engineer: Scale Data Pipelines
3 weeks ago
Argentina, Sezzle, Full-time
A leading fintech company in Argentina is searching for a Senior Data Engineer to design and optimize large-scale data pipelines. With a compensation range of $5000 to $9000 per month, the ideal candidate will have 9+ years of experience in data engineering, expertise with AWS Redshift and ETL frameworks, and strong hands-on skills in SQL and programming....
-
Senior Data DevOps Engineer
4 weeks ago
Argentina, EPAM Systems, Full-time
A global technology consulting firm is seeking a Senior Data DevOps Engineer in Argentina. You will optimize data workflows and deployment processes, design data pipelines, and work with data scientists. The ideal candidate has at least 3 years of experience in Data DevOps, excellent communication skills, and advanced English proficiency. Join us for...
-
Senior Data Engineer – AI-Powered Pipelines
6 days ago
Argentina, Calyptus, Full-time
A technology service provider is seeking a Mid-Senior level Data Engineer based in Argentina. The ideal candidate will have over 3 years of experience in data engineering, a solid foundation in Python and SQL, and proficiency with cloud platforms. This full-time role involves designing and building data pipelines, developing data connectors, and implementing...
-
Remote Senior Data Engineer: ETL
7 days ago
Argentina, Lumenalta, Full-time
A digital solutions company is seeking a Senior-Level Data Engineer to design and maintain ETL pipelines. The role requires 7+ years of experience and proficiency in Python or Java, along with strong SQL skills. This is a fully remote position for candidates in Latin America, with flexible working hours aligned with U.S. time zones. Applicants should be...
-
Senior Software Engineer
4 weeks ago
Argentina, Amperity, Full-time
A leading AI-first company is seeking a Senior Software Development Engineer to join their remote team in Argentina. This role involves developing advanced data pipelines and integrating AI capabilities to help industry leaders unlock the full potential of their customer data. Ideal candidates will have 8+ years of software development experience and a...
-
Argentina, Applaudo, Full-time
A leading IT firm in Argentina seeks a Senior Data Engineer to build reliable and scalable data platforms for analytics and AI products. The ideal candidate has over 5 years of experience with data pipelines, strong SQL skills, and proficiency in Python and cloud platforms. You will collaborate with diverse teams, focusing on data quality and best practices...
-
Senior Data Engineer
7 days ago
Argentina, SECRETO, Full-time
Senior Data Engineer – FinTech Platform (Remote)
Remote | USD | Contractor | Fluent English (C1 or higher) | LATAM candidates only
Contractor role for a client in New York, US.
Mandatory Skills: 8+ years as a Data Engineer with strong architectural thinking. 5+ years hands-on BigQuery in production (hard requirement). Expert with GCP data stack: BigQuery,...
-
Senior Data Engineer
2 weeks ago
Argentina, Applaudo, Full-time
You are a Senior Data Engineer who enjoys building reliable, scalable data platforms and enabling data-driven and AI-powered products. You feel comfortable owning data pipelines end to end, working closely with analytics and AI/Data Science teams, and translating complex data problems into robust technical solutions. You value clean...
-
Senior Data Engineer: Scalable Pipelines
2 weeks ago
Argentina, Baufest, Full-time
A technology company in Argentina is looking for a Sr Data Engineer to design and implement data solutions. Responsibilities include developing pipelines and collaborating with teams to ensure data quality. Candidates should have at least 5 years of experience with end-to-end solutions and skills in SQL and...
-
Remote ETL Engineer — Scalable Data Pipelines
7 days ago
Argentina, JBK Argentina, Full-time
A tech solutions company based in Argentina is seeking a Mid-Senior level Data Engineer specializing in ETL. The role involves designing and developing robust data pipelines and requires a minimum of 3 years of experience, strong T-SQL and Python skills, and intermediate English proficiency. The company offers a competitive salary in USD, remote work...