Production Support/Management
IO TECH SOLUTIONS LIMITED
- Hong Kong Island, Hong Kong
- Permanent
- Full-time
- Diagnose and resolve production issues quickly, minimizing downtime and impact on end-users
- Provide on-call support for production incidents and manage issue escalation as necessary
- Collaborate with development teams to investigate root causes of production issues and propose solutions
- Perform system health checks and regular system maintenance tasks to ensure optimal performance
- Implement monitoring tools and alerting systems to proactively identify potential issues before they impact users
- Deploy bug fixes, patches, and system upgrades in production environments
- Document issues, resolution steps, and operational procedures for knowledge sharing
- Assist in post-incident reviews and implement improvements based on lessons learned
- Help implement change management processes to ensure smooth and controlled deployments
- Ensure adherence to SLAs (Service Level Agreements) for incident response and resolution times
- Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field
- 2+ years of experience in production support or operations management in a tech environment
- Familiarity with Linux/Unix or Windows server administration
- Strong experience with monitoring and alerting tools (e.g., Prometheus, Grafana, Nagios, New Relic)
- Ability to work with log aggregation and analysis tools (e.g., ELK Stack, Splunk)
- Proficiency in troubleshooting application, infrastructure, and network issues
- Experience with databases (e.g., MySQL, PostgreSQL, MongoDB)
- Knowledge of incident management tools (e.g., JIRA, ServiceNow)
- Strong understanding of cloud platforms (e.g., AWS, Azure, GCP) and cloud infrastructure
- Familiarity with CI/CD pipelines and deployment automation tools
- Experience in automation and scripting (e.g., Bash, Python)
- Familiarity with containerization technologies like Docker and orchestration tools like Kubernetes
- Experience in load balancing, scaling, and disaster recovery practices
- Knowledge of ITIL or other IT operations frameworks
- Experience in release management and deployment strategies