
Senior Manager / Director of AIOps and IT Operations (Perm)
- Hong Kong
- Permanent
- Full-time
- Strategic Leadership:
- Define and execute the strategy for AIOps implementation within the infrastructure domain, ensuring alignment with organizational goals.
- Collaborate with cross-functional teams (engineering, IT operations, and data science) to identify areas for automation, optimization, and enhanced system management.
- AIOps Development and Execution:
- Lead the design and development of AI-driven solutions, integrating advanced analytics, machine learning (ML), and automation to streamline IT operations such as monitoring, incident management, and capacity planning.
- Oversee the implementation of AIOps tools and frameworks to manage complex IT systems, ensuring minimal downtime and proactive problem resolution.
- Ensure integration of AIOps solutions with existing IT Service Management (ITSM) tools and monitoring platforms.
- Analytics and Optimization:
- Implement predictive and prescriptive analytics to forecast system performance, automate root-cause analyses, and optimize resource utilization.
- Use AI/ML models to improve system uptime, resilience, and scalability while reducing manual workload and operational overhead.
- Architecture and Ecosystem Development:
- Develop a scalable AIOps architecture that integrates with current and future IT systems.
- Create frameworks for continuous monitoring, performance tuning, and fault tolerance across multi-cloud environments, on-premises infrastructure, and hybrid models.
- Team Leadership and Mentorship:
- Build and lead a high-performing team of data scientists, AI engineers, and infrastructure specialists by fostering a culture of innovation and collaboration.
- Provide mentorship and training on AIOps solutions and best practices across teams.
- Stakeholder Collaboration:
- Partner with business and technical stakeholders to understand operational challenges and translate them into data-driven solutions.
- Communicate and advocate for the value of AIOps development to C-level executives and other key decision-makers.
- Compliance and Security:
- Ensure AIOps frameworks align with organizational cybersecurity and data privacy policies.
- Stay current with global trends in AI ethics, compliance, and regulation to ensure the responsible use of AI tools.
- Education:
- Bachelor's degree in Computer Science, Information Technology, Data Science, or a related field is required. Master's degree or MBA is preferred.
- Experience:
- 10+ years of experience in IT operations, infrastructure management, or AI/ML development.
- Proven track record of implementing AIOps solutions, from concept to production, at a senior technology leadership level.
- Experience working with cloud platforms (AWS, Azure, or GCP) and hybrid IT environments is a strong advantage.
- Familiarity with AIOps tools and platforms such as Dynatrace, Datadog, Splunk, Moogsoft, BigPanda, or similar.
- Technical Skills:
- Strong expertise in machine learning algorithms, natural language processing (NLP), predictive analytics, and automation frameworks.
- Proficiency in programming languages such as Python, R, or Java, with knowledge of AI/ML frameworks like TensorFlow, PyTorch, or Scikit-learn.
- Understanding of IT Operations Management (ITOM), ITSM, DevOps practices, and observability tools.
- Soft Skills:
- Strong leadership, communication, and project management skills.
- Exceptional problem-solving skills with the ability to anticipate operational challenges and proactively create solutions.
- Certifications (Desirable):
- ITIL® certifications, cloud certifications (AWS Solutions Architect, Microsoft Azure Administrator, etc.), or AI/ML certifications.
- A visionary thinker with a passion for operational excellence, skilled at managing complex IT systems at scale.
- Global mindset with experience working with teams in diverse geographies.
- Comfort with ambiguity and a proven ability to deliver measurable results in a fast-changing technology landscape.