IT Infrastructure Architect (Enterprise Cloud & HPC)

Premier Aegis Recruitment

  • Kowloon, Hong Kong
  • Permanent
  • Full-time
  • 3 days ago
Our client is one of the leading players in the regional telecom market. To support its expansion in Hong Kong, our client is looking for a high-caliber candidate to join the team.

Responsibilities:

  • Oversee the full lifecycle of architecture design and delivery for large-scale, complex IT infrastructure solutions tailored to enterprise clients, covering compute, storage, networking, security, virtualization, and the integration of hybrid/multi-cloud environments.
  • Design and implement high-performance GPU/HPC platforms (including NVIDIA and Huawei systems) and high-speed fabrics (such as IB and RoCE), while ensuring seamless integration with enterprise-grade platforms (VMware, Windows Server, Linux) and modern orchestration tools like Kubernetes.
  • Enhance reliability, performance, security, and cost efficiency across data centers and cloud ecosystems by developing reference architectures, establishing technical standards, implementing automation, and upholding operational excellence practices.
  • Architecture: Define compute (NVIDIA/Huawei GPUs, Kubernetes), storage, network, security (zero trust), and cloud architecture; deliver HLD/LLD and NFRs.
  • HPC/GPU: Optimize NVIDIA/Huawei clusters (1k-GPU, IB/RoCE), tune CUDA, and automate via Python/Ansible.
  • Delivery/Ops: Deliver solutions end to end; build observability, manage incidents, and handle migration.
  • Security/Cost: Ensure compliance (ISO 27001, GDPR), embed zero trust, and implement FinOps.
  • Stakeholders: Lead reviews and present to technical and non-technical teams.

Requirements:

  • Minimum 5–8 years of experience in infrastructure architecture/engineering, including at least 3 years leading data center or HPC construction/operations.
  • Strong foundation in:
    • Virtualisation and OS: VMware vSphere/NSX/Aria, Windows Server, Linux.
    • Compute and GPU: x86 platforms, NVIDIA HGX/H200 and Huawei Atlas hardware architecture.
    • Storage: SAN/NAS/Object, backup/DR, snapshot/replication, parallel FS (Lustre/BeeGFS).
    • Networking: Data centre spine–leaf, IB/RoCE, QoS/ECN/PFC, SDN, load balancers, DNS/DHCP/IPAM.
    • Security: AD/Azure AD, bastion/PAM, EDR, SIEM/SOAR, vulnerability management, Zero Trust, segmentation.
    • Cloud and Orchestration: Kubernetes, Helm, Operators, CSI/CNI, hybrid/multi-cloud networking, Terraform/Ansible, Python.
    • Scheduling and Platforms: Kubernetes/Volcano or Slurm for AI/HPC workloads; NVIDIA cluster management platforms.
  • Prior leadership in large/mega-scale intelligent computing centres, data centres, or computing power scheduling platforms.
  • End-to-end delivery experience with enterprise clients, including multi-vendor coordination and complex program governance.

Interested parties, please send a full resume with expected salary to [email redacted, apply via Company website].

All applications and data collected will be treated in strict confidence and used exclusively for recruitment purposes.

All applications submitted through our system will be delivered directly to the advertiser, and the privacy of applicants' personal data will be protected.

CTgoodjobs
