Talent.com
No longer accepting applications
Fine Tuning / Post Training Data Scientist - RL (GRPO, PPO, RLHF)

Binance • Work From Home, Canterbury, New Zealand
3 days ago
Job description

Binance is a leading global blockchain ecosystem behind the world’s largest cryptocurrency exchange by trading volume and registered users. We are trusted by over 280 million people in 100+ countries for our industry-leading security, user fund transparency, trading engine speed, deep liquidity, and an unmatched portfolio of digital-asset products. Binance offerings range from trading and finance to education, research, payments, institutional services, Web3 features, and more. We leverage the power of digital assets and blockchain to build an inclusive financial ecosystem to advance the freedom of money and improve financial access for people around the world.

About the Role

You will develop and optimize Reinforcement Learning (RL) models for enterprise-scale applications such as customer service, token reporting, compliance, and Web3 domain reasoning.

You will explore and evaluate advanced algorithms including PPO, GRPO, DPO, RLHF, RLAIF, and Agentic RL to enhance the capabilities of LLMs, VLMs, and Agentic AI at Binance. The role requires a strong theoretical foundation in RL (covering policy optimization, reward modeling, and planning) paired with the engineering skills to build scalable production systems.

You will take full ownership from research through deployment, driving experimentation with systematic evaluation and benchmarking. Collaboration across research, infrastructure, and application teams will be key to delivering impactful AI solutions.

Responsibilities

  • Research and develop state-of-the-art RL algorithms, focusing on large model optimization and alignment techniques.
  • Design and implement RL training pipelines, including environment simulation, data generation, and reward function design.
  • Apply Reinforcement Learning methods to enhance LLM / VLM / Agentic AI capabilities in reasoning, planning, and autonomous decision‑making.
  • Collaborate with engineers and researchers to integrate RL solutions into enterprise AI platforms.
  • Monitor model performance in production and continuously improve through iterative training and fine‑tuning.

Requirements

  • Master’s Degree in Computer Science, Applied Mathematics, Machine Learning, or related fields.
  • 5+ years of hands‑on experience in RL and in the optimization of at least one of the following: LLM, VLM, or Agentic AI.
  • Strong coding skills in Python, with experience in ML frameworks and RL libraries.
  • Experience with large-scale distributed training and optimization.
  • Self‑driven, with an ownership mindset and strong problem‑solving skills. Excellent communication skills for cross‑functional collaboration.

Why Binance

  • Shape the future with the world’s leading blockchain ecosystem
  • Collaborate with world-class talent in a user‑centric global organization with a flat structure
  • Tackle unique, fast‑paced projects with autonomy in an innovative environment
  • Thrive in a results‑driven workplace with opportunities for career growth and continuous learning
  • Competitive salary and company benefits
  • Work‑from‑home arrangement (the arrangement may vary depending on the work nature of the business team)

Binance is committed to being an equal opportunity employer. We believe that having a diverse workforce is fundamental to our success.

By submitting a job application, you confirm that you have read and agree to our Candidate Privacy Notice.
