DevOps / Systems Resilience Engineer
Standard Chartered Bank
Hong Kong, Hong Kong, Asia

About Standard Chartered

We are a leading international bank focused on helping people and companies prosper across Asia, Africa and the Middle East.

To us, good performance is about much more than turning a profit. It's about showing how you embody our valued behaviours - do the right thing, better together and never settle - as well as our brand promise, Here for good.

We're committed to promoting equality in the workplace and creating an inclusive and flexible culture - one where everyone can realise their full potential and make a positive contribution to our organisation.

This in turn helps us to provide better support to our broad client base.

The Role Responsibilities


  • Work as a team member to enhance application service and infrastructure resilience through self-healing and automated failovers, targeting 99.99% up-time for customers.

  • Assist in the running of planned random disruption of production infrastructure to ensure accountability for building resilient, always-on systems.
  • Build resilience into the application so underlying system failures are handled gracefully and do not impact end users.
  • Influence design / development teams to always be thinking of the rainy-day scenarios.


  • Optimize monitoring to reduce false positive alerts.
  • Creatively deepen monitoring capabilities leveraging the 3 tenets of observability: logs, metrics and traces.
  • Ensure all critical user service journeys are traceable end to end.
  • Ensure Production Solutions are fit for purpose. Where gaps are identified, put a plan in place to uplift the toolset.


    Availability / Reliability / Performance

  • Design, code and implement break fixes to improve service availability based on outcomes of thematic reviews.
  • Participate in post mortem reviews, helping to ensure each exercise is a blameless learning opportunity.
  • Monitor SLIs / SLOs in partnership with Product Teams to achieve the optimal development velocity.

    Capacity Planning

  • Enhance application and infrastructure scalability via iterative capacity management, with the goal of reducing the effort required for capacity reviews through deep monitoring and auto-scale properties.
  • Continuously monitor capacity for any discrepancies or spikes.


  • Identify opportunities to eliminate all manual and repeatable activities (toil) via tooling and automation.
  • Reduce the number of repeat incidents by permanently fixing the underlying root cause of issues.

    Our Ideal Candidate

  • Postgraduate degree with knowledge of information technology.
  • Solid IT experience; banking domain experience is desirable.
  • Agile trained.
  • Excellent oral and written communication skills, with the ability to interact with business representatives.
  • Good with stakeholder communication and able to liaise with senior management.