Data Scientist Job at Mercor, Remote

  • Mercor
  • Remote

Job Description

AI Task Evaluation & Statistical Analysis Specialist

Role Overview

We're seeking a data-driven analyst to conduct failure analysis of AI agent performance on finance-sector tasks. You'll identify patterns, root causes, and systemic issues in our evaluation framework by analyzing task performance across multiple dimensions, such as task types, file types, and evaluation criteria.

Key Responsibilities

  • Statistical Failure Analysis: Identify patterns in AI agent failures across task components (prompts, rubrics, templates, file types, tags)

  • Root Cause Analysis: Determine whether failures stem from task design, rubric clarity, file complexity, or agent limitations

  • Dimension Analysis: Analyze performance variations across finance sub-domains, file types, and task categories

  • Reporting & Visualization: Create dashboards and reports highlighting failure clusters, edge cases, and improvement opportunities

  • Quality Framework: Recommend improvements to task design, rubric structure, and evaluation criteria based on statistical findings

  • Stakeholder Communication: Present insights to data labeling experts and technical teams

Required Qualifications

  • Statistical Expertise: Strong foundation in statistical analysis, hypothesis testing, and pattern recognition

  • Programming: Proficiency in Python (pandas, scipy, matplotlib/seaborn) or R for data analysis

  • Data Analysis: Experience with exploratory data analysis and deriving actionable insights from complex datasets

  • AI/ML Familiarity: Understanding of LLM evaluation methods and quality metrics

  • Tools: Comfortable working with Excel, data visualization tools (Tableau/Looker), and SQL

Preferred Qualifications

  • Experience with AI/ML model evaluation or quality assurance

  • Background in finance or willingness to learn finance domain concepts

  • Experience with multi-dimensional failure analysis

  • Familiarity with benchmark datasets and evaluation frameworks

  • 2-4 years of relevant experience

Job Tags

Remote job, Full time
