Data Scientist & Analytics Engineer, Factory Readiness

OpenAI

Salary

$405K + Offers Equity

Location

San Francisco, CA

We're seeking a Data Scientist & Analytics Engineer to help us succeed in building high-quality products.

This role focuses on modeling, diagnostics, and insight generation during prototyping and validation builds—when failure modes are uncertain, system variation is high, and confidence must be earned.

You'll develop analytics and ML systems that highlight inefficiencies, explain yield, assist with failure analysis, and uncover any other blind spots. Your work will support critical decisions by enabling high-trust visibility into performance, yield, reliability, and early SPC signals. You'll work side-by-side with hardware, test, reliability, and manufacturing teams to debug the factory as a system—before scale amplifies its faults.

In this role, you will:

  • Build pipelines and models that track yield, Gage R&R, test escapes, and other anomalies across prototype and validation hardware builds
  • Develop tools for interactive root cause analysis, station diagnostics, and statistical correlation across attributes
  • Enable early SPC workflows and process capability analysis (e.g. Cp/Cpk) for new lines, stations, and SKUs
  • Automate commonality, slot, and shift-level comparisons to isolate sources of instability or inefficiency
  • Integrate with test and factory software to ensure that data collected is analysis-ready, interpretable, and traceable
  • Contribute to dashboards and visuals that accelerate business insights, enable R&D decisions, and improve design iterations

You might thrive in this role if you:

  • Have experience in ML, analytics, or data science, ideally in a hardware or manufacturing setting
  • Understand factory data deeply: yield curves, SPC, GR&R, measurement variation, slot/fixture/station noise
  • Are fluent in Python and SQL, with experience building pipelines (ELT, ETL) and diagnostic workflows in production data systems
  • Know how to extract insight from messy time-series and component-level data across test and process domains
  • Are comfortable presenting failure mode hypotheses to engineers and go/no-go summaries to executives