Data Engineer – New Grad – X (Twitter)
X (formerly Twitter) is the global public conversation platform where 250+ million daily active users post, engage with, and search 500+ million tweets per day on news, sports, politics, entertainment, and real-time events worldwide. X's data platform processes one of the world's largest real-time social media data streams, which demands data engineering at a scale few organizations ever encounter. X's engineering team co-created Apache Parquet and has contributed other foundational open-source data technologies used industry-wide. We are hiring New Grad Data Engineers in San Francisco to build the real-time data infrastructure powering one of the world's most influential information platforms.
Responsibilities
- Build and maintain real-time data pipelines processing 500+ million tweets and engagement events daily using Apache Kafka and Apache Flink
- Develop X's data warehouse ETL/ELT pipelines transforming raw tweet events into analytics datasets for ads measurement, user growth, and content quality teams
- Implement X's data lake architecture on Google Cloud Storage — organizing Parquet and ORC format datasets for efficient analytical querying using Presto and Spark
- Build X's ads data measurement pipeline — tracking ad impressions, clicks, video views, and conversion events for X's advertiser reporting platform
- Develop feature engineering pipelines for X's ML models powering tweet recommendations, trending topics, and content safety classifiers
- Monitor and optimize data pipeline SLAs ensuring on-time data delivery for X's business-critical analytics and advertiser reporting products
Requirements
- Bachelor's degree in Computer Science, Software Engineering, or Data Engineering
- Strong Python, Scala, or Java skills for Spark and Flink pipeline development
- Understanding of distributed data processing, stream processing, and data warehouse design
- Familiarity with Apache Kafka, Spark, Flink, or Hadoop ecosystem tools
- SQL proficiency and experience with large-scale analytical datasets
Benefits
- Competitive salary with X equity and performance bonus
- Work on one of the world's highest-scale real-time data platforms
- Medical, dental, and vision benefits
- 401(k) with company matching
- San Francisco headquarters with on-site amenities and world-class data engineering culture