IT Operations Engineer
oneZero Financial Systems
Come join oneZero Financial Systems! An exciting, fast-growing company headquartered in Somerville, MA, oneZero empowers banks, brokerages, and hedge funds with cutting-edge trade routing and execution technology. Our platform, deployed with 200+ entities globally, features a low-latency trading environment, integrations to the world’s leading execution venues, and reliable IT infrastructure and technical support, all designed to be customized and scaled to serve any business model and any size of market participant. We take pride in our great work atmosphere and highly motivated team of engineers. We are currently looking for a motivated and talented IT Operations Engineer to join our Johannesburg office.
oneZero is proud to have been named one of Business Intelligence Group's Best Places to Work for four consecutive years:
https://www.onezero.com/awards/onezero-earns-recognition-as-a-2025-best-place-to-work/
The Boston Globe names oneZero a Top Place to Work in 2022, 2023, and 2024: https://www.onezero.com/homepage/the-boston-globe-names-onezero-a-top-place-to-work-for-third-year-in-a-row/
oneZero earns 2024 Great Place To Work Australia Certification:
https://www.onezero.com/awards/onezero-2024-great-place-to-work-australia-certification/
Please see oneZero featured in e-Forex Magazine to learn more about the company and our dynamic team (https://goo.gl/vbXw8i)
Job Description:
The IT Operations Engineer is responsible for delivering reliable infrastructure and exceptional user support through 24x7 monitoring, proactive management, and white-glove service. This role is essential to maintaining our mission-critical financial services environment, where uptime is paramount.
The engineer manages the complete lifecycle of both end-user systems and production infrastructure, from initial setup and onboarding through daily operations, maintenance, and eventual offboarding.
This position demands strong technical expertise, independent decision-making capabilities, and the ability to exercise sound judgment when responding to critical incidents in a fast-paced, high-availability environment where every minute of downtime has significant business impact.
Responsibilities:
Production Infrastructure Monitoring and Incident Response
- Monitor critical production systems 24x7 to ensure optimal performance and availability
- Exercise independent judgment in assessing incident severity and determining appropriate response strategies
- Respond to infrastructure alerts and incidents with urgency and precision, making real-time decisions on escalation paths and resolution approaches
- Perform root cause analysis and implement corrective actions to prevent recurring issues
- Evaluate system behavior patterns and make independent determinations on necessary interventions
- Coordinate with development teams during critical incidents and outages, serving as the technical authority for infrastructure decisions
- Document incident response procedures and maintain incident management records
- Participate in an on-call rotation to provide round-the-clock infrastructure support
Infrastructure Maintenance and Security Patching
- Design and execute planned maintenance windows for servers, network equipment, and applications
- Evaluate security patch criticality and make independent decisions on deployment timing and prioritization
- Apply security patches and updates to maintain system currency and compliance
- Perform routine system health checks and preventive maintenance tasks
- Manage backup systems and validate backup integrity regularly
- Coordinate with the security team to implement security controls and remediation efforts
- Maintain accurate configuration management and documentation
- Assess infrastructure requirements and recommend improvements to system architecture
Service Request Fulfillment
- Process and fulfill IT service requests through the ITSM platform with attention to SLA compliance
- Coordinate software installations, license management, and application access requests
- Support the complete lifecycle of hardware and software assets including procurement, deployment, configuration, and decommissioning
- Manage vendor relationships for equipment repairs and service delivery
- Evaluate and determine appropriate solutions for complex user requests requiring technical analysis
- Create and maintain knowledge base articles and user documentation
- Provide training and guidance to end users on IT tools and best practices
- Design and implement onboarding and offboarding processes for both end-user systems and production infrastructure
Requirements:
Core Requirements
- 3+ years of experience in IT operations or a similar technical support role
- Demonstrated ability to exercise independent judgment in high-pressure situations and make critical decisions affecting system availability
- Experience with Windows and Linux server environments and virtualization technologies
- Strong understanding of network protocols (TCP/IP, DNS, DHCP) and VPN technologies
- Hands-on experience with monitoring tools (e.g., Nagios, Datadog, PRTG, or similar)
- Proficiency with ITSM platforms (ServiceNow, Jira Service Management, or similar)
- Experience with Active Directory, Office 365, and enterprise security tools
- Knowledge of backup and disaster recovery procedures
- Proven analytical and problem-solving abilities, with the capacity to assess complex technical issues and determine optimal solutions
- Excellent communication skills and customer service orientation
- Ability to work flexible hours, including on-call rotation
Preferred Qualifications
- Experience in financial services or other high-availability environments
- Knowledge of cloud platforms (AWS, Azure) and hybrid infrastructure
- CompTIA A+, Network+, Security+, or equivalent certifications
- Experience with automation and scripting (PowerShell, Python, Bash)
- Familiarity with the ITIL framework and best practices
- Understanding of foreign exchange markets and trading platform requirements