About Standard Chartered
We are a leading international bank focused on helping people and companies prosper across Asia, Africa and the Middle East.
To us, good performance is about much more than turning a profit. It's about showing how you embody our valued behaviours - do the right thing, better together and never settle - as well as our brand promise, Here for good.
We're committed to promoting equality in the workplace and creating an inclusive and flexible culture - one where everyone can realise their full potential and make a positive contribution to our organisation.
This in turn helps us to provide better support to our broad client base.
The Role Responsibilities
The SRE Delivery team delivers resilience solutions for Hong Kong. The SRE Engineer is responsible for building resilience solutions that strengthen the reliability, observability, operability and scalability of the bank's business flows and applications.
The System Reliability Engineer will ensure the overall system reliability, uptime, health, and performance of the bank's service.
The candidate will work closely with various stakeholders to understand the architecture and design of the bank's applications, in order to help resolve service-impacting issues quickly, detect and self-heal problems before they become service-impacting, and feed valuable information and data back to application developers to improve the long-term reliability of the platform.
The candidate should have strong knowledge of SRE and Agile delivery practices, as well as a disciplined approach to planning, executing and reporting.
The role requires ongoing engagement with Application Development / Support, Product, and Operations teams to manage SRE efforts, create a client-first mindset and effectively implement client-journey-based Service Level Deliveries, with the support of the SRE Delivery Head.
Effectively manage multiple stakeholder demands and expectations while maintaining quality and delivery
Progressively adopt proactive SRE strategies like Chaos Engineering, Game Days and Synthetic Monitoring
Partner with application developers and architects to ensure our services are built for scale and performance
Develop monitoring solutions on top of existing observability platforms
Maintain open communication with Engineering and Product teams around system performance and reliability
Write, review, and execute test plans / strategies for validating product / system performance, scalability, and reliability
Drive product reliability improvements through monitoring, alerting, and application of software development best practices
Identify creative ways to break the products, uncover and report defects, and validate that systems / solutions are operating as intended
Engage in the refinement of the development, build and deployment processes on top of our main infrastructure
Work with the engineering teams to architect and build our platform services to simplify real-time troubleshooting and operational response to incidents and outages
Be the expert on how to best use Cloud technologies to build our next-generation platform
Bridge the divide between our core application engineers and our main infrastructure teams
Provide capacity management expertise to ensure our deployments are managed for robustness and cost
Bring best practices and own environment management, ensuring all dev / test / prod environments are reproducible with high availability
Serve as a quality and reliability ambassador as part of an Agile software development team
Maintain and communicate testing timelines, schedules and status reports
Awareness and understanding of the Group’s business strategy and model appropriate to the role.
Awareness and understanding of the wider business, economic and market environment in which the Group operates.
Responsible for the system architecture, development, build and deployment processes of the resilience solutions to be delivered.
People and Talent
Lead by example and build the appropriate culture and values. Set the appropriate tone and expectations for the team and work in collaboration with risk and control partners.
The ability to interpret the Group’s financial information, identify key issues based on this information and put in place appropriate controls and measures.
Awareness and understanding of the regulatory framework in which the Group operates, and of the regulatory requirements and expectations relevant to the role.
Regulatory & Business Conduct
Display exemplary conduct and live by the Group’s Values and Code of Conduct.
Take personal responsibility for embedding the highest standards of ethics, including regulatory and business conduct, across Standard Chartered Bank.
This includes understanding and ensuring compliance with, in letter and spirit, all applicable laws, regulations, guidelines and the Group Code of Conduct.
Lead the HK SRE engineering team to achieve the outcomes set out in the Bank’s Conduct Principles: Fair Outcomes for Clients; Effective Financial Markets; Financial Crime Compliance; The Right Environment.
Effectively and collaboratively identify, escalate, mitigate and resolve risk, conduct and compliance matters.
Our Ideal Candidate
Bachelor’s or Master’s degree in computer science or equivalent practical experience.
Advanced knowledge of application, data, and infrastructure architecture disciplines
Experience with Agile / Scrum delivery methodology and related tools
Advanced knowledge of object-oriented programming languages and concepts (Python, Java, Golang, etc.)
Experience with microservices, API-first, event-driven, agent-based architecture and design
Knowledge of DevOps CI / CD, containerization (Docker / Kubernetes), orchestration (Ansible / Salt)
Knowledge of different aspects of service design, including messaging protocols and behaviour, caching strategies and software design practices
Knowledge of infrastructure (networking, hypervisors, storage, security) - experience working with a private cloud is a plus
Experience with test automation using common test frameworks, and with performance / load testing techniques at scale
Experience with metrics collection, time series queries, middleware such as Telegraf, and backends such as OpenTSDB or Prometheus
Experience with data visualization tools such as Kibana and Grafana