MLOps Engineer
3 months ago
Description
**Responsibilities**:
Design and implement infrastructure for deploying and managing ML models, mainly focused on AWS services. This involves choosing orchestration tools for automating the ML workflow.
Containerize models to ensure consistent, portable deployment across environments.
Set up monitoring and tracking systems to observe the health of ML models in production.
Automate the process of deploying ML models from dev to prod.
Maintain version control for models.
Maintain version control for datasets.
Collaborate with data scientists, AI engineers and data engineers to understand the models and their requirements.
Document the ML workflows, including deployment procedures, monitoring practices, and retraining strategies.
Implement security measures to protect sensitive information used in ML models and during deployment.
Ensure data privacy regulations are adhered to throughout the ML lifecycle.
Develop monitoring dashboards to visualize model performance and identify potential issues proactively.
**Requirements**:
Bachelor's or Master's degree in Computer Science.
At least 5 years of experience in DevOps and MLOps.
Strong understanding of machine learning concepts, algorithms, and techniques.
Proficiency in machine learning libraries/frameworks such as TensorFlow, PyTorch, or scikit-learn.
Experience in model development, training, evaluation, and optimization.
Ability to translate machine learning models into production-ready code.
Deep knowledge of AWS services relevant to machine learning, such as Amazon SageMaker, AWS Lambda, AWS Glue, AWS Step Functions, AWS Batch, and Amazon EMR.
Familiarity with AWS storage and database services such as Amazon S3 and Amazon RDS.
Expertise in containerization technologies such as Docker and container orchestration with Kubernetes.
Proficiency in managing infrastructure as code using tools like AWS CloudFormation or Terraform.
Experience in continuous integration and continuous deployment (CI/CD) pipelines for machine learning models.
Ability to monitor and troubleshoot production machine learning systems, ensuring high availability, scalability, and performance.
Understanding of DevOps principles and practices, including automation, version control, and collaboration.
Excellent communication, collaboration, and problem-solving skills.
AWS certifications relevant to machine learning and operations, such as AWS Certified Machine Learning - Specialty, AWS Certified DevOps Engineer - Professional, or AWS Certified Solutions Architect - Professional, would be highly beneficial.