RHINO HEALTH
Software Engineering
Tel Aviv-Yafo, Israel
Rhino Federated Computing solves one of the biggest challenges in AI: seamlessly connecting siloed data through federated computing. The Rhino Federated Computing Platform (Rhino FCP) serves as the ‘data collaboration tech stack’, extending from providing computing resources to data preparation & discoverability, to model development & monitoring — all in a secure, privacy-preserving environment.
To do this, Rhino FCP offers a flexible architecture (multi-cloud and on-prem hardware), end-to-end data management workflows (multimodal data, schema definition, harmonization, and visualization), privacy-enhancing technologies (e.g., differential privacy), and secure deployment of custom code and third-party applications via persistent data pipelines.
Rhino is trusted by more than 60 leading organizations worldwide — including 14 of Newsweek’s “Best Smart Hospitals” and top 20 global biopharma companies — and is leveraging this foundation for financial services, ecommerce, and beyond.
The company is headquartered in Boston, with its main R&D center in Tel Aviv.
The Production Engineer will play a key role in ensuring the reliability, performance, and operational excellence of Rhino’s Federated Computing Platform (Rhino FCP). This distributed infrastructure supports cutting-edge AI/ML research and development across highly regulated industries, including healthcare, finance, and life sciences, by enabling secure, privacy-preserving data collaboration worldwide.
You will be responsible for maintaining and improving production environments deployed both in the cloud and behind customer firewalls. You will work closely with Platform, Backend, and Product teams to ensure systems remain stable, observable, and highly available.
This role focuses heavily on operational ownership, production monitoring, troubleshooting complex environments, incident response, and improving deployment reliability and operational tooling. It is ideal for someone who enjoys solving production challenges, improving system reliability, and building operational excellence in fast-moving environments.
Production Operations and Reliability:
Maintain and support production environments across customer deployments and centralized cloud services, ensuring high availability and operational stability.
Monitoring and Observability:
Develop, improve, and maintain monitoring, alerting, and logging systems to proactively identify issues and improve visibility across distributed systems.
Incident Response and Troubleshooting:
Investigate, troubleshoot, and resolve complex infrastructure and application issues across cloud and on-premises environments, participating in incident management and root cause analysis.
Deployment Management:
Manage and support production deployments, upgrades, and maintenance activities across geographically distributed customer environments.
Operational Excellence:
Identify operational bottlenecks and continuously improve reliability, scalability, automation, and support processes.
Collaboration Across Teams:
Work closely with Backend, DevOps, and Product Engineering teams to support new features, improve operational readiness, and ensure smooth production adoption.
Automation and Tooling:
Contribute to internal tooling and automation efforts that reduce manual operational work and improve deployment and support efficiency.
Candidates should have 3–5 years of professional experience covering a mix of the areas described below:
The role is open to candidates who are based in Israel (hybrid work environment).