Site Reliability Engineer (SRE)

Location: New York
Discipline: Financial Technology
Job type: Permanent
Contact name: Lewis Piper

Contact email: lewis.piper@venturesearch.com
Job ref: 3127
Published: about 1 month ago

Our client, a leading high-frequency crypto trading firm, is seeking a Site Reliability Engineer (SRE) to design and build production configuration and deployment tools for its high-frequency trading (HFT) platform. This role is critical to ensuring the stability, scalability, and automation of the firm's infrastructure. The ideal candidate will have extensive experience creating complex, production-focused tools, with an emphasis on reliability and performance.

Key Responsibilities

  • Develop and maintain scalable production tools to automate deployment, monitoring, and infrastructure management.
  • Improve system reliability, performance, and efficiency through automation and tooling.
  • Work closely with trading and development teams to ensure seamless operation of the firm's live trading systems.
  • Manage configuration and deployment processes across AWS-based infrastructure.
  • Implement observability tools to enhance system monitoring and debugging capabilities.
  • Ensure fault tolerance, redundancy, and high availability for critical trading systems.
  • Support and enhance infrastructure for both C++ and Rust-based trading systems, ensuring seamless integration.

Required Qualifications

  • Strong programming skills in Python, with the ability to read and understand C/C++ code.
  • Deep understanding of Linux systems.
  • Experience managing deployments and configuration management in AWS and/or on-premise clusters.
  • Proficiency in monitoring, logging, and alerting solutions to maintain high system uptime.
  • Strong background in networking fundamentals, including TCP/IP and system performance tuning.
  • Experience with scripting languages (e.g., Python, Bash) for automation.

Preferred Skills

  • Familiarity with IaC tools such as Terraform or Ansible for infrastructure automation.
  • Experience in low-latency or high-performance environments is a plus but not required.
  • Strong problem-solving skills and the ability to work in a highly collaborative team.

Location

  • In-office only – offices available in New York City, London, and Singapore.