Software Engineer Reinforcement Learning

Location

Zürich, Zürich, Switzerland

Salary

50,000 - 90,000 a year (Swiss francs)

Description

Employment Type: 6-Month Contract

We are looking for a Software Engineer with a focus on data preparation and AI model training. You will work on assembling, annotating, and cleaning training data, while contributing to reward modeling and supervised fine-tuning tasks.


You might thrive in this role if you:

  • Have a deep understanding of machine learning and its applications.
  • Have working knowledge of, and experience with, tuning large language models (including multimodal models) and building evaluations.
  • Are willing to dive into large codebases to debug.
  • Thrive in a dynamic and technically complex environment.
  • Have a track record of delivering novel, outside-the-box solutions under real-world constraints.

 

Responsibilities

  • Data Assembly & Annotation: Gather and annotate training data for AI models, ensuring it meets the quality requirements for reward modeling and supervised fine-tuning.
  • Data Cleaning & Processing: Conduct data cleaning and preprocessing to ensure models receive high-quality input.
  • Model Training: Participate in the training and fine-tuning of models, ensuring that they meet performance and accuracy standards.
  • Collaboration: Work with AI engineers, data scientists, and other team members to ensure efficient workflows and data handling.
  • Continuous Improvement: Support iterative improvements to models based on performance monitoring and feedback.

 

Requirements

  • Experience: At least 3 years of experience working in a software engineering role focused on AI/ML tasks.
  • Data Expertise: Hands-on experience assembling, annotating, and cleaning training data for machine learning models.
  • Technical Skills: Proficiency in Python and experience with AI frameworks like TensorFlow or PyTorch.
  • Model Training: Familiarity with model training, reward modeling, and supervised fine-tuning techniques.
  • Attention to Detail: Strong focus on data quality when handling large datasets.

 

Bonus Points

  • Experience working with reward modeling for AI systems.
  • Familiarity with data labeling tools and techniques for supervised fine-tuning.
  • Knowledge of cloud platforms for AI/ML workloads.



Job type:

Remote job

Tags

  • software
  • python
  • training
  • support
  • cloud
  • assembly
  • engineer
  • engineering
Sent 25 days ago