Facebook Data Center Engineering Specialist - Production Hardware Engineer in Forest City, North Carolina


Facebook's mission is to give people the power to build community and bring the world closer together. Through our family of apps and services, we're building a different kind of company that connects billions of people around the world, gives them ways to share what matters most to them, and helps bring people closer together. Whether we're creating new products or helping a small business expand its reach, people at Facebook are builders at heart. Our global teams are constantly iterating, solving problems, and working together to empower people around the world to build community and connect in meaningful ways. Together, we can help people build stronger communities — we're just getting started.


Facebook is seeking a forward-thinking, experienced IT professional with product management experience and technical skills in server hardware, Linux, and/or networking, ideally in a data center environment. Our data centers, and the tens of thousands of servers installed in them, are the foundation upon which our rapidly scaling infrastructure efficiently operates and on which our innovative services are delivered. Facebook is at the leading edge of the global data center industry, both in design and operations. Our Production Hardware Engineers are responsible for driving the health of the server fleet, from production verification test through end-of-life, by identifying systemic hardware, firmware, and tooling issues; engaging in hands-on problem solving; and partnering effectively with engineering and tooling teams to improve the performance of the fleet. Candidates should have deep knowledge and experience in at least one of the following core areas: Hardware, Tools, Automation, and System Administration. The ability to work in a large-scale, distributed environment is key for this individual. The successful candidate should be process-oriented with a hands-on, entrepreneurial work ethic, and enjoy working in a fast-paced environment where adaptability and flexibility are key to success. This position is full-time, based in Forest City, NC.

Required Skills:

  1. Analyze health of the global fleet of servers to proactively identify systemic issues, and take appropriate action to quickly mitigate impact

  2. Work with hardware engineering and release-to-production teams to take new server/storage products into mass production in the data centers

  3. Identify risks early in new hardware programs, and contribute the test methods needed to characterize the product and assure end-customer needs are met

  4. Execute root cause analysis of failures, individually and in partnership with operations and engineering teams, and deliver the right corrective and preventive actions

  5. Issue timely alerts and fixes to operations teams, and assure a robust feedback pipeline to engineering teams

  6. Assist hardware engineers by running experiments, collecting data, and providing feedback on failure symptoms for production servers

  7. Provide cross-functional communication with other technical operations groups to help resolve incidents

  8. Understand, troubleshoot, and fix broken servers and/or Linux-related issues

  9. Provide serviceability feedback on production hardware

  10. Serve as the local point of contact and subject matter expert for data center operations teams for production hardware issues

  11. Travel up to 30% of the time, as required

Minimum Qualifications:

  1. BS or BA in technical field or commensurate experience

  2. 6+ years of experience with Linux and IT hardware systems in an internet operations environment

  3. Experience triaging and debugging hardware

  4. Experience working with Linux or Unix operating systems

  5. Experience training and mentoring technicians and/or engineers

  6. Experience in data center hardware deployments and building scaling infrastructure

  7. Technical drafting experience, including creating documentation for users of all levels

Preferred Qualifications:

  1. Bash, PHP, Python, or Perl scripting experience

  2. Experience in data center system and process automation

Industry: Internet

Equal Opportunity: Facebook is proud to be an Equal Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. If you need assistance or an accommodation due to a disability, you may contact us at accommodations-ext@fb.com or you may call us at +1 650-308-7837.