• Statistical understanding of the data, analyzing data for deeper insights
• ML model evaluation and production-level experimentation pipelines

Core Responsibilities

• Collaborate directly with technical leads to design evaluation experiments for model performance assessment
• Set up controlled testing environments for LLM-as-judge scenarios
• Analyze experimental outputs and translate findings into clear, actionable insights
• Develop feature engineering approaches to understand dataset characteristics and quality


Required Technical Skills

• Python, statistical analysis, hypothesis testing
• Advanced data manipulation, feature selection, dimensionality reduction
• A/B testing, cross-validation, statistical significance testing
• Understanding of language model evaluation challenges and LLM-as-judge methodologies

Preferred

• Experience with PyTorch/TensorFlow for model analysis
• SQL for data extraction and analysis
• Visualization tools (matplotlib, seaborn, plotly) for results presentation
• Background in NLP evaluation or model interpretability

 

Interested parties, please send your full resume, including your current and expected salary, to shirley.cho@manpowergrc.hk





Type: Contract

Category: I.T & T - Engineering

Reference ID: 508-24112025-SC

Date Posted: 24/11/2025
