Sr. Database Reliability Engineer at Slack
San Francisco, CA, US / New York City, NY, US

About the Team

Slack's Database Reliability Engineering team builds and operates the platform services that store data at Slack. We write software to manage thousands of stateful hosts, providing many petabytes of online database capacity. We are also in the midst of transitioning Slack’s core MySQL infrastructure to use Vitess’ flexible sharding and management capabilities. Review our recent presentation slides: Migrating to Vitess at (Slack) Scale.

Slack has a positive, diverse, and supportive culture—we look for people who are curious, inventive, and work to be a little better every single day. In our work together we aim to be smart, humble, hardworking and, above all, collaborative. If this sounds like a good fit for you, why not say hello?

About the Role

What you will be doing

  • Leading larger projects, from start to finish, where scope is mostly understood
  • Designing and developing new highly-available infrastructure to meet the needs of our growing and evolving product
  • Writing software to make the database infrastructure self-managing and self-service
  • Advising feature teams on how we can support the database needs of new features under development
  • Writing code to capture data about service performance, and creating tools and dashboards to provide insight into that data
  • Participating in the Database Reliability Engineering on-call rotation, triaging and addressing production issues as they arise
  • Contributing to internal tools that help us improve our operations processes, manage our infrastructure, and scale our systems

What you should have

  • You have curiosity about how things work
  • You've been developing and operating high-traffic Internet applications and can point to things you’ve worked on
  • You've deployed server software on Linux, and then operated it at scale. You’ve debugged its problems, and analyzed and optimized its performance
  • You are a strong communicator. Explaining complex technical concepts to designers, support, and other engineers is no problem for you
  • You enjoy helping onboard new team members, mentoring, and teaching others

Qualifications:

Minimum:

  • Professional experience operating at least one distributed data storage system, at scale and in a team environment. Some examples include: a relational database like MySQL, a search engine like Solr, or a streaming message bus like Kafka
  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent training, fellowship, or work experience

Preferred:

  • Solid competency in software engineering, using functional or imperative programming languages -- e.g. PHP, Python, Ruby, Go, C, or Java (used without frameworks)
  • Experience using distributed storage systems scaled out across hundreds or thousands of servers

Bonus Points:

  • Experience expressing complex questions in SQL, especially MySQL
  • Experience using deployment automation/configuration management (Chef a plus)
  • Experience with virtualized environments, especially Amazon Web Services
  • Experience in a startup environment