Job Description

Data Engineer – New Grad – Scale AI

Scale AI is the data platform powering the AI revolution, providing high-quality training data annotation, model evaluation, and RLHF (Reinforcement Learning from Human Feedback) data for the world's leading AI labs and enterprises. Scale's customers include OpenAI, Anthropic, Meta AI, the US Department of Defense, and 500+ enterprises training AI models across autonomous vehicles, natural language processing, computer vision, and robotics. Scale processes billions of data annotations annually, making it the most critical data infrastructure company for the AI era. With $1.5B raised at a $14B+ valuation, Scale AI is at the center of the most transformative technology moment in decades. We are hiring New Grad Data Engineers to build the data pipelines powering Scale's AI training data platform.

Responsibilities

Build Scale's AI training data ingestion pipelines: processing raw customer datasets (images, text, video, LiDAR point clouds) into Scale's annotation task management platform
Develop Scale's quality assurance data pipelines: implementing statistical sampling, annotator performance scoring, and consensus-based gold label generation for AI training datasets
Implement Scale's RLHF data pipeline: processing human preference feedback from Scale's evaluator network into training data for RLHF model training
Build Scale's model evaluation data infrastructure: generating benchmark datasets, adversarial test cases, and red-teaming prompts for LLM safety and capability evaluation
Develop Scale's enterprise customer data onboarding pipelines: securely ingesting, deduplicating, and anonymizing sensitive customer datasets for proprietary AI model training
Build internal analytics datasets that track annotator quality, task throughput, and data pipeline SLAs across Scale's global annotation operations
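
For candidates less familiar with the quality assurance terms above, the following is a minimal Python sketch of what consensus-based gold label generation and annotator performance scoring can look like. It is an illustration only and does not describe Scale's actual pipelines: the function names, the `(task_id, annotator_id, label)` record shape, and the `min_agreement` threshold are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def consensus_labels(annotations, min_agreement=0.6):
    """Derive a gold label per task by majority vote.

    `annotations` is an iterable of (task_id, annotator_id, label) tuples
    (a hypothetical record shape). A task gets a gold label only when the
    most common label reaches `min_agreement` of its votes.
    """
    votes_by_task = defaultdict(list)
    for task_id, _annotator_id, label in annotations:
        votes_by_task[task_id].append(label)

    gold = {}
    for task_id, labels in votes_by_task.items():
        top_label, top_count = Counter(labels).most_common(1)[0]
        if top_count / len(labels) >= min_agreement:
            gold[task_id] = top_label
    return gold


def annotator_scores(annotations, gold):
    """Score each annotator as the share of their labels matching gold.

    Tasks without a gold label are skipped, so annotators who only worked
    on unresolved tasks do not appear in the result.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for task_id, annotator_id, label in annotations:
        if task_id in gold:
            totals[annotator_id] += 1
            hits[annotator_id] += int(label == gold[task_id])
    return {annotator: hits[annotator] / totals[annotator] for annotator in totals}


if __name__ == "__main__":
    sample = [
        ("t1", "ann_a", "cat"),
        ("t1", "ann_b", "cat"),
        ("t1", "ann_c", "dog"),
        ("t2", "ann_a", "car"),
        ("t2", "ann_b", "bus"),
    ]
    gold = consensus_labels(sample)
    print(gold)
    print(annotator_scores(sample, gold))
```

In this sketch a task with split votes (such as `t2`) receives no gold label and would typically be routed for additional review; production systems often weight votes by annotator reliability instead of counting them equally.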

Requirements

Bachelor's degree in Computer Science, Data Engineering, or Machine Learning
Strong Python and SQL skills for data pipeline development
Understanding of ML data concepts: training data, annotations, data quality, and model evaluation
Familiarity with cloud data platforms (AWS, GCP) and distributed data processing (Spark, Dask)
Passion for AI development and understanding of how data quality impacts model performance

Benefits

Highly competitive salary with Scale AI pre-IPO equity at a $14B+ valuation
Work at the epicenter of AI development, powering models used by 500M+ people
Medical, dental, and vision benefits
401(k) with Scale AI matching
San Francisco headquarters with hybrid flexibility and AI-native engineering culture

Job Details

Salary	$36 – $55 / month
Job Type	Full-time
Work Mode	Hybrid
Location	San Francisco, CA
Apply Before	Jul 19, 2026

Important: We never charge any fee at any stage of the hiring process. If anyone asks for money, report it to [email protected].
