Inference Software Engineer
Company: Etched
Location: San Jose
Posted on: April 2, 2026
Job Description:
About Etched

Etched is building the world’s first AI inference system purpose-built
for transformers, delivering over 10x higher performance and
dramatically lower cost and latency than a B200. With Etched ASICs, you
can build products that would be impossible with GPUs, like real-time
video generation models and extremely deep and parallel
chain-of-thought reasoning agents. Backed by hundreds of millions from
top-tier investors and staffed by leading engineers, Etched is
redefining the infrastructure layer for the fastest growing industry in
history.

Key responsibilities

- Support porting state-of-the-art models to our architecture.
- Help build programming abstractions and testing capabilities to
  rapidly iterate on model porting.
- Build, enhance, and scale Sohu’s runtime, including multi-node
  inference, intra-node execution, state management, and robust error
  handling.
- Optimize routing and communication layers using Sohu’s collectives.
- Utilize performance profiling and debugging tools to identify
  bottlenecks and correctness issues.

You may be a good fit if you have

- Proficiency in C++ or Rust.
- Understanding of performance-sensitive or complex distributed
  software systems, such as Linux internals, accelerator architectures
  (e.g. GPUs, TPUs), compilers, or high-speed interconnects (e.g.
  NVLink, InfiniBand).
- Familiarity with PyTorch or JAX.
- Experience porting applications to non-standard accelerator hardware
  or hardware platforms.

Strong candidates may also have (nice-to-have qualifications)

- Experience developing low-latency, high-performance applications
  using both kernel-level and user-space networking stacks.
- Deep understanding of distributed systems concepts, algorithms, and
  challenges, including consensus protocols, consistency models, and
  communication patterns.
- Solid grasp of Transformer architectures, particularly
  Mixture-of-Experts (MoE).
- Experience building applications with extensive SIMD (Single
  Instruction, Multiple Data) optimizations for performance-critical
  paths.

Benefits

- Medical, dental, and vision packages with generous premium coverage
- $500 per month credit for waiving medical benefits
- Housing subsidy of $2k per month for those living within walking
  distance of the office
- Relocation support for those moving to San Jose (Santana Row)
- Various wellness benefits covering fitness, mental health, and more
- Daily lunch and dinner in our office

How we’re different

Etched believes in the Bitter Lesson. We think most of the progress in
the AI field has come from using more FLOPs to train and run models,
and the best way to get more FLOPs is to build model-specific hardware.
Larger and larger training runs encourage companies to consolidate
around fewer model architectures, which creates a market for
single-model ASICs.

We are a fully in-person team in San Jose (Santana Row), and greatly
value engineering skills. We do not have boundaries between engineering
and research, and we expect all of our technical staff to contribute to
both as needed.
Keywords: Etched, Pleasanton, Inference Software Engineer, Engineering, San Jose, California