Infrastructure Engineer – London
smartTrade Technologies is a software publisher specializing in the trading and finance sector. Its clients primarily include investment banks, stock exchanges, brokers, and pension funds. smartTrade enables real-time computerized management of financial flows among these different stakeholders.
Joining smartTrade means becoming a part of an innovative and international company with offices in Aix-en-Provence, London, Geneva, New York, Toronto, and Tokyo.
Skill development and career progression are top priorities at smartTrade, offering employees numerous opportunities for learning, advancement, and mobility. Sports and their values of teamwork, performance, and dynamism are integral to the company's culture.
Additionally, smartTrade is highly committed to continuously supporting various charitable and environmental initiatives.
We are seeking a hands-on Linux Systems & Datacenter Administrator to join our Europe Platforms Operations team.
Location: Slough area
Role Overview
You'll be the on-the-ground owner for our Slough (Equinix) Platforms environment and a key contributor to our global private cloud.
The role blends Linux systems administration (Ubuntu), containerized compute (LXD/LXC, some Docker), networking, and datacenter operations.
You will partner with engineering, network, and security teams to ensure reliability, performance, and change control in a 24x7, market-facing environment.
This is a production-oriented role: you'll prepare, review, and execute changes, troubleshoot live issues, run maintenance windows, and continuously improve our platform through automation and rigorous documentation.
Our Environment
· Servers: Dell, HPE, Supermicro.
· Storage: LVM, software and hardware RAID (mdadm, MegaRAID, LSI, ...).
· Containers: LXD/LXC (primary), some Docker.
· Networking (day-2 ops): VLANs, LACP, ACLs, routing basics; vendors include Dell, Supermicro, Arista, Juniper, VyOS.
· Applications & Data: MySQL, Elasticsearch, Kafka, Java, Apache HTTPD, ...
· Automation & IaC: Git/GitLab, Ansible, NetBox, Chef, Terraform; scripting with Bash/Python.
· Monitoring/Observability: Centreon, Dynatrace.
What You'll Do
· Operate and improve Linux fleets (Ubuntu) in production.
· Manage HPC bare-metal and LXD/LXC container platforms.
· Provide level-3 incident response for infrastructure issues (systems, containers, network paths, storage), restoring service within SLAs and driving post-mortems.
· Own Platforms datacenter operations in Slough: rack/stack, cabling, optics, power planning, server installation, console/OOB, inventory management in NetBox, RMA logistics, and vendor coordination (Equinix Smart Hands, carriers, OEMs).
· Perform day-2 network operations on switches and firewalls (ACLs, VLANs, LAGs, routing basics), and collaborate closely with network engineering on changes.
· Automate with Ansible/Chef for configuration management and Terraform for IaC on AWS where applicable. Build reliable tooling for repeatable ops (config generation, pre-change checks, deployments, and validation).
· Contribute to change management (runbooks, maintenance windows, rollback plans) and keep documentation current (network diagrams, inventories, SOPs).
· Participate in a follow-the-sun operations model, coordinating with your EMEA/APAC peers.
What You'll Bring
Must-have:
o 2-3+ years operating Linux (Ubuntu, CentOS, Red Hat) in production environments.
o This position requires occasional on-call availability outside of standard business hours to respond to urgent or critical operational issues. Flexibility to be contacted outside regular working hours is required.
o Prior datacenter exposure: rack/stack, structured cabling (fiber/copper), PDUs, console/OOB, vendor/Smart Hands coordination, and accurate inventory. If you have no prior experience, a willingness to learn and work in such environments.
o Containers: production exposure to LXC or Docker and an understanding of their inner workings.
o Server hardware & storage: LVM, software RAID, MegaRAID tooling, firmware/BIOS/BMC (iDRAC/iLO/IPMI), and hands-on diagnostics and replacements.
o Networking fundamentals for day-to-day ops: VLANs, LACP, trunking, ACLs, static routes, BGP, DNS/DHCP, link/MTU issues; ability to execute well-scoped changes on Dell/Arista/Juniper/VyOS under peer review.
o Automation & SCM: Bash/Python, Git/GitLab; experience with Chef or Ansible or Puppet in production.
o Clear runbook-style writing, disciplined change control, and calm, structured troubleshooting under time pressure.
Nice to have:
o Familiarity with Equinix processes (cross-connects, tickets, remote hands) and carrier coordination.
o Ops exposure to NetBox, MySQL, Elasticsearch, Kafka, Java services, Apache; ability to collaborate with app teams on infra-adjacent issues.
o Experience with Centreon and Dynatrace (or equivalent monitoring/observability stacks).
o Config management/IaC depth (Ansible, Puppet, Terraform modules, secrets management), and CI pipelines in GitLab.
o Deeper networking (EVPN/VXLAN, BGP, multicast) and/or traffic engineering.
Work Hours & Travel
Standard business hours aligned to Central European Time with flexibility for maintenance windows.
Rotational weekend work (Friday/Saturday/Sunday) for planned changes and datacenter work; a compensatory day off is granted during the week.
Travel: once a week (occasionally twice) to the Slough Equinix datacenters, once a month to central London, and exceptional travel outside the UK.
Department: Platforms - Operations
Locations: London
Remote status: Hybrid