Our client, a leading high-frequency crypto trading firm, is seeking a Site Reliability Engineer (SRE) to design and build production configuration and deployment tools for its high-frequency trading (HFT) platform. This role is critical to ensuring the stability, scalability, and automation of the firm's infrastructure. The ideal candidate will have extensive experience building complex, production-focused tools, with an emphasis on reliability and performance.

Key Responsibilities

Develop and maintain scalable production tools to automate deployment, monitoring, and infrastructure management.
Improve system reliability, performance, and efficiency through automation and tooling.
Work closely with trading and development teams to ensure seamless operation of the firm's live trading systems.
Manage configuration and deployment processes across AWS-based infrastructure.
Implement observability tools to enhance system monitoring and debugging capabilities.
Ensure fault tolerance, redundancy, and high availability for critical trading systems.
Support and enhance infrastructure for both C++ and Rust-based trading systems, ensuring seamless integration.

Required Qualifications

Strong programming skills in Python, with the ability to read and understand C/C++ code.
Deep understanding of Linux systems.
Experience managing deployments and configuration management in AWS and/or on-premise clusters.
Proficiency in monitoring, logging, and alerting solutions to maintain high system uptime.
Strong background in networking fundamentals, including TCP/IP and system performance tuning.
Experience with scripting languages (e.g., Python, Bash) for automation.

Preferred Skills

Familiarity with IaC tools such as Terraform or Ansible for infrastructure automation.
Experience in low-latency or high-performance environments is a plus but not required.
Strong problem-solving skills and the ability to work in a highly collaborative team.

Location

In-office only – offices available in New York City, London, and Singapore.

Location:	New York
Discipline:	Financial Technology
Job type:	Permanent
Contact name:	Lewis Piper
Contact email:	lewis.piper@venturesearch.com
Job ref:	3127
Published:	about 1 month ago

Site Reliability Engineer (SRE)
