Site Reliability Engineer

Company: Together AI
Location: San Francisco
Posted on: April 2, 2026

Job Description:

About the Role
As a Site Reliability Engineer (SRE) at Together, you are responsible for keeping all user-facing services and production systems running smoothly. You are a blend of pragmatic operator and software engineer who applies sound engineering principles, operational discipline, and mature automation to our operating environments and codebase. You specialize in systems (operating systems, storage subsystems, networking) while implementing best practices for availability, reliability, and scalability, with varied interests in algorithms and distributed systems.

Responsibilities
Participate in on-call rotation (PagerDuty) to respond to production incidents
Build and run our infrastructure with Ansible, Terraform, and Kubernetes to enable scaling to a massive number of concurrent users
Build monitoring systems to ensure the highest quality service for our customers
Design and implement operational processes (such as deployments and upgrades)
Debug production issues across all services and levels of the stack
Identify improvements for the product architecture from the reliability, performance, and availability perspectives
Plan the growth of Together AI’s infrastructure

Requirements
5 years of professional SRE or related experience
Bachelor's degree in Computer Science or a related field, or equivalent work experience
Knowledge of Ansible (roles, playbooks), Terraform, and Kubernetes
Proficiency in programming/scripting languages
Direct experience in monitoring and observability practices
Knowledge of cloud services
Ability to thrive in a collaborative environment involving different stakeholders and subject matter experts

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next-generation AI infrastructure.

Compensation
We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $150,000 - $200,000, plus equity and benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy

Keywords: Together AI, Pleasanton, Site Reliability Engineer, IT / Software / Systems, San Francisco, California

Other IT / Software / Systems Jobs

Product Engineer
Description: About David AI David AI is the first audio data research company. We bring an R&D approach to data, developing datasets with the same rigor AI labs bring to models. Our mission is to bring AI into (more...)
Company: David AI
Location: San Francisco
Posted on: 04/3/2026

Senior Machine Learning Researcher, Large Behavior Models & Diffusion Policy
Description: At Toyota Research Institute (TRI), we're on a mission to improve the quality of human life. We're developing new tools and capabilities to amplify the human experience. To lead this transformative (more...)
Company: Toyota Research Institute
Location: Los Altos
Posted on: 04/3/2026

Implementation Manager - Strategic Accounts
Description: About Abridge Abridge was founded in 2018 with the mission of powering deeper understanding in healthcare. Our AI-powered platform was purpose-built for medical conversations, improving clinical documentation (more...)
Company: Abridge
Location: San Francisco
Posted on: 04/3/2026

Audio AI Research Engineer
Description: About David AI David AI is the first audio data research company. We bring an R&D approach to data, developing datasets with the same rigor AI labs bring to models. Our mission is to bring AI into (more...)
Company: David AI
Location: San Francisco
Posted on: 04/3/2026

Data Analyst
Description: Cardless is looking for a Data Analyst / Senior Data Analyst to join our team and help shape the insights, processes, and decisions behind our co-branded credit card platform. In this role, you'll work (more...)
Company: Cardless
Location: San Francisco
Posted on: 04/3/2026

Wireless Platform Engineer
Description: At Meter, we believe enterprise connectivity should feel as effortless as consumer Wi-Fi. Wireless is the last mile of the network, and it defines the customer's experience: how quickly they get (more...)
Company: Meter
Location: San Francisco
Posted on: 04/3/2026

Senior Software Engineer, Criminal Screenings
Description: About Checkr Checkr is building the data platform to power safe and fair decisions. Established in 2014, Checkr's innovative technology and robust data platform help customers assess risk and ensure (more...)
Company: Checkr
Location: San Francisco
Posted on: 04/3/2026

Senior Startup Investor Manager, VC, AWS, Startup Investor Management
Description: Would you like to shape the future of cloud computing by identifying and working with the most promising early-stage startups and venture capital firms? Do you have the technical depth, business acumen, (more...)
Company: Amazon
Location: San Francisco
Posted on: 04/3/2026

Senior Search Engineer
Description: you.com is an AI-powered search and productivity platform designed to empower users with personalized, efficient, and trustworthy search experiences. As a cutting-edge technology company, we combine advanced (more...)
Company: You.com
Location: San Francisco
Posted on: 04/3/2026

Forward Deployed Engineer II, GenAI, Google Cloud
Description: Applicants in San Francisco: Qualified applications with arrest or conviction records will be considered for employment in accordance with the San Francisco Fair Chance Ordinance for Employers (more...)
Company: Google
Location: San Francisco
Posted on: 04/3/2026
