Moniepoint Incorporated is a global business payments and banking platform and recently became QED Investors’ first investment in Africa. We are the partner of choice for over 600,000 businesses of all sizes, powering the dreams of SMBs and providing them with equal access to the tools they need to grow and scale.
We are recruiting to fill the position below:
Job Title: Senior Cloud Engineer
Location: Nigeria (Remote)
Position Overview
- We are seeking an experienced Cloud Engineer to design, implement, and manage our multi-cloud infrastructure.
- The ideal candidate will have deep expertise in cloud platforms, container orchestration, infrastructure automation, CI/CD pipelines, and observability solutions, ensuring scalable, reliable, and cost-effective cloud operations across multiple cloud providers.
Principal Duties and Responsibilities
Cloud Infrastructure Management:
- Manage cloud resources including compute instances and networking components
- Design, deploy, and manage multi-cloud infrastructure across Google Cloud Platform (GCP), Amazon Web Services (AWS), Azure, and Oracle Cloud Infrastructure (OCI)
- Maintain comprehensive documentation of cloud architectures, configurations, and runbooks.
- Optimize cloud resource utilization and implement auto-scaling policies
- Architect and implement highly available, fault-tolerant, and scalable cloud solutions
- Migrate on-premises applications and services to cloud environments with minimal disruption
- Design and implement disaster recovery and business continuity plans for cloud workloads
Kubernetes & Container Orchestration:
- Implement multi-cluster and multi-region Kubernetes deployments for high availability.
- Configure and manage Kubernetes resource objects
- Implement and maintain container orchestration strategies for microservices architectures
- Troubleshoot complex container and orchestration issues in production environments
- Manage Kubernetes cluster upgrades, scaling, and performance optimization
- Design, deploy, and manage production-grade Kubernetes clusters across multiple cloud providers
Service Mesh & Advanced Networking:
- Configure Istio traffic management, including virtual services, destination rules, and gateways
- Deploy and manage Istio observability components (telemetry, distributed tracing, service graphs)
- Monitor and optimize service mesh performance and resource utilization
- Configure Istio ingress and egress gateways for external traffic management
- Implement multi-cluster service mesh architectures across different cloud providers
- Design, deploy, and manage Istio service mesh for microservices communication and observability
- Implement advanced traffic routing (canary deployments, A/B testing, traffic splitting) using Istio
- Implement circuit breaking, retries, timeouts, and fault injection for resilience testing
Reverse Proxy & Load Balancing:
- Optimize HAProxy and Nginx performance for high-throughput environments
- Design and implement Nginx as a reverse proxy for web applications and API gateways
- Implement HAProxy ACLs, backend routing, health checks, and session persistence
- Deploy, configure, and manage HAProxy for high-performance load balancing and reverse proxying
- Manage Nginx Plus features for advanced traffic management and monitoring
- Configure Nginx for rate limiting and request filtering
- Implement Nginx load balancing algorithms and upstream health monitoring
Infrastructure as Code & Configuration Management:
- Implement policy-as-code using tools like OPA (Open Policy Agent) or Sentinel.
- Integrate Terraform and Ansible workflows for end-to-end infrastructure automation
- Develop Ansible playbooks and roles for automated server provisioning and configuration
- Manage infrastructure drift detection and remediation
- Develop and maintain infrastructure as code using Terraform
- Create reusable, modular Terraform configurations for various cloud resources
- Implement Terraform state management and remote backends
- Create and maintain infrastructure documentation and architecture diagrams
- Implement infrastructure version control, code review processes, and GitOps practices
- Design and implement configuration management solutions using Ansible
CI/CD Pipeline Management:
- Design, implement, and maintain continuous integration pipelines using Jenkins and Harness
- Implement continuous deployment workflows using ArgoCD for Kubernetes-based applications
- Implement Harness deployment pipelines for cloud-native applications
- Integrate security scanning (SAST, DAST, container scanning) into CI/CD pipelines
- Configure Jenkins jobs, pipelines, and shared libraries for automated builds
- Configure build agents, runners, and execution environments
- Implement progressive delivery strategies (blue-green deployments, canary releases) using ArgoCD.
- Manage ArgoCD application definitions, sync policies, and multi-cluster deployments
- Design and implement GitOps workflows with ArgoCD for declarative application delivery
- Optimize build times and pipeline efficiency
- Integrate CI/CD pipelines with version control systems (Git, GitHub, GitLab)
Message Streaming & Event-Driven Architecture:
- Deploy and manage Apache Kafka clusters for real-time data streaming and event-driven architectures
- Monitor Kafka cluster health, performance metrics, and consumer lag
- Configure Kafka topics, partitions, replication factors, and retention policies
- Troubleshoot Kafka producer and consumer issues.
- Optimize Kafka performance for high-throughput and low-latency use cases
- Implement Kafka Connect for data integration with various sources and sinks
Database & Proxy Management:
- Integrate ProxySQL with database clusters and replication topologies
- Deploy, configure, and manage ProxySQL for MySQL load balancing and high availability
- Implement database failover and disaster recovery using ProxySQL
- Implement query routing, caching, and connection pooling strategies using ProxySQL
- Implement database access security and audit logging through ProxySQL.
- Optimize database performance through ProxySQL query analysis and optimization
- Monitor ProxySQL metrics and troubleshoot connection and performance issues
Cloud Networking:
- Implement network monitoring and traffic analysis
- Troubleshoot complex networking issues across multi-cloud environments
- Implement VPN connections, Direct Connect/Interconnect, and hybrid cloud networking solutions
- Configure and manage cloud load balancers (Application Load Balancers, Network Load Balancers, Cloud Load Balancing)
- Implement network security controls, including security groups, network ACLs, and firewall rules
- Design and implement private connectivity between cloud providers.
- Design and implement cloud networking architectures, including VPCs, subnets, and network segmentation
Secrets Management & Security:
- Implement Vault agent and sidecar injectors for Kubernetes workloads
- Manage Vault encryption as a service for application-level encryption
- Configure and manage HashiCorp Vault for centralized secrets management across multi-cloud environments
- Migrate secrets from legacy systems to Vault.
- Implement dynamic database credentials and secret rotation strategies
- Configure Vault secret engines (KV, database, PKI, AWS, GCP, Azure dynamic secrets)
- Manage Vault high availability clusters and disaster recovery procedures
Qualifications, Competency & Skills Required
Education & Experience:
- Bachelor's Degree or Diploma in Computer Science, Information Technology, Engineering, or related fields
- Minimum of 5 years of proven experience in cloud engineering, DevOps, or platform engineering roles
- Hands-on experience managing production workloads across multiple cloud platforms
- Relevant cloud and technology certifications are highly desirable.
Technical Skills
Cloud Platforms (Required):
- Amazon Web Services (AWS): Proficiency in EC2, EKS, S3, RDS, VPC, Lambda, ECS, CloudFormation, IAM
- Multi-cloud architecture design and implementation experience
- Microsoft Azure: Experience with Virtual Machines, AKS, Blob Storage, Azure SQL, Virtual Networks, Azure Functions, ARM templates
- Cloud migration strategies and execution (lift-and-shift, re-platforming, re-architecting).
- Oracle Cloud Infrastructure (OCI): Familiarity with Compute, OKE, Object Storage, networking, and OCI-specific services
- Google Cloud Platform (GCP): Deep expertise in Compute Engine, GKE, Cloud Storage, Cloud SQL, VPC, Cloud Functions, Cloud Run, IAM
Container & Orchestration (Required):
- Experience with Helm charts for application packaging and deployment
- Hands-on experience with managed Kubernetes services (GKE, EKS, AKS)
- Proficiency in Docker containerization, image optimization, and registry management
- Knowledge of container runtime environments (containerd, CRI-O).
- Expert-level Kubernetes knowledge, including cluster architecture, networking, storage, and security
Service Mesh & Microservices (Required):
- Istio security features (mTLS, authorization policies, peer authentication, request authentication)
- Istio traffic management (virtual services, destination rules, gateways, service entries)
- Understanding of sidecar proxy patterns and Envoy proxy
- Multi-cluster and multi-mesh deployments
- Experience with other service mesh solutions (Linkerd, Consul Connect) is a plus.
- Istio observability and telemetry configuration
- Istio: Deep expertise in Istio architecture, deployment, and operations
- Service mesh troubleshooting and performance optimization
Reverse Proxy & Load Balancing (Required):
- HAProxy: Advanced configuration and management for load balancing and high availability
- Integration with Kubernetes ingress controllers (Nginx Ingress, Istio Ingress).
- Nginx rate limiting and performance tuning
- Experience with Nginx modules and custom configurations
- Nginx: Expert-level configuration as a reverse proxy and API gateway
- High availability configurations using keepalived, VRRP, or similar
- Nginx load balancing algorithms and upstream configurations
Infrastructure as Code (Required):
- Experience with version control systems (Git) and GitOps workflows
- Ansible playbook development, roles, and inventory management
- Advanced Terraform skills for multi-cloud infrastructure provisioning
- Proficiency in Ansible for configuration management and automation
- Infrastructure testing frameworks (Terratest, Kitchen-Terraform)
- Terraform module development, state management, and workspace strategies
CI/CD Tools (Required):
- ArgoCD: GitOps workflows, application synchronization, multi-cluster management
- Integration of CI/CD tools with Kubernetes and cloud platforms
- Jenkins: Pipeline development (declarative and scripted), shared libraries, plugin management
- Harness: Deployment pipeline configuration, workflow creation, approval gates
- Artifact repository management (Nexus, Artifactory, cloud-native registries).
- Automated testing and deployment strategies
Messaging & Streaming (Required):
- Understanding of event-driven architectures and patterns.
- Kafka topic design, partitioning strategies, and performance tuning
- Kafka Connect experience
- Experience with Kafka management tools (Kafka Manager, Cruise Control)
- Apache Kafka architecture, cluster management, and operations
Database & Proxy Technologies (Required):
- Understanding of database replication and clustering.
- MySQL database administration basics
- ProxySQL configuration, management, and optimization
Observability & Monitoring (Required):
- Log aggregation and analysis.
- Prometheus metrics collection, PromQL, and alerting rules
- Grafana dashboard design and visualization techniques
Networking (Required):
- Software-defined networking (SDN) concepts.
- Deep understanding of TCP/IP, DNS, HTTP/HTTPS, and network protocols
- Network security and firewall configuration
- Load balancing strategies and implementations
- Service discovery and DNS-based routing
- Cloud networking concepts (VPC, subnets, routing tables, NAT, VPN)
Scripting & Programming:
- Go or Python programming basics for tooling development
- YAML and JSON for configuration management
- Proficiency in scripting languages: Python, Bash
- Understanding of software development best practices.
Secrets Management (Required):
- HashiCorp Vault: Advanced knowledge of Vault architecture, deployment, and operations
- Vault secret engines (KV v1/v2, database, transit, cloud dynamic..
- Vault authentication methods and integration with cloud providers and Kubernetes