Your grant just got funded and you need compute yesterday. We set up and manage AWS, GCP, and on-premises HPC infrastructure so your team can focus on research, not sysadmin tickets.
Cloud Infrastructure (AWS/GCP)
- AWS and GCP architecture, deployment, and cost optimization
- Container orchestration (Docker, Kubernetes)
- AWS Organizations and multi-account setup
- Cross-account role access and billing administration
- Data security policy development and implementation
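To make cross-account role access concrete, here is a minimal sketch in Python of the trust policy document that lets one AWS account assume a role in another. It is an illustration under stated assumptions, not a description of any particular client setup: the function name `cross_account_trust_policy` is hypothetical, the account ID is a placeholder, and the optional MFA condition is one possible hardening choice among several.

```python
import json


def cross_account_trust_policy(trusted_account_id: str, require_mfa: bool = False) -> dict:
    """Build an IAM trust policy that lets another account assume a role.

    Hypothetical helper for illustration; it only builds the JSON document
    and does not call AWS.
    """
    if not (trusted_account_id.isascii()
            and trusted_account_id.isdigit()
            and len(trusted_account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")

    statement = {
        "Effect": "Allow",
        # The account root principal delegates the decision to IAM policies
        # inside the trusted account.
        "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
        "Action": "sts:AssumeRole",
    }
    if require_mfa:
        # Assumption: callers are expected to authenticate with MFA.
        statement["Condition"] = {"Bool": {"aws:MultiFactorAuthPresent": "true"}}

    return {"Version": "2012-10-17", "Statement": [statement]}


if __name__ == "__main__":
    # 111122223333 is a placeholder account ID, not a real account.
    print(json.dumps(cross_account_trust_policy("111122223333", require_mfa=True), indent=2))
```

Generating the policy from a small function rather than hand-editing JSON keeps the role definitions consistent across accounts and easy to review.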
On-Prem HPC Infrastructure
We design, build, and administer dedicated research clusters. Our founder built gCluster at UC Irvine: 10 compute nodes (480 cores), 180 TB of networked storage over InfiniBand, and SLURM scheduling across CPU, RAM, and GPU resources.
- HPC cluster design, deployment, and administration
- SLURM job scheduler configuration and tuning
- InfiniBand and high-speed networking
- NFS and parallel filesystem storage at scale (100 TB+)
- Identity and access management (FreeIPA, Kerberos, ACLs)
- Container runtimes for HPC (Singularity/Apptainer)
- GPU cluster management and multi-node training
- VPN and secure remote access configuration
- Server administration and monitoring
Deliverables
- Architecture diagram and deployment documentation
- Configured infrastructure with IAM, networking, and storage policies
- Monitoring and alerting setup (CloudWatch, Prometheus, or Grafana)
- Cost analysis and optimization recommendations
- Runbook for day-to-day operations and common maintenance tasks
- For HPC: SLURM job script templates, user onboarding guide, and storage management procedures
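To give a sense of what a SLURM job script template can look like, here is a minimal sketch in Python that renders a batch script from a few parameters. The `JobSpec` class, the `render_sbatch` function, and all default values are hypothetical and chosen for illustration; the `#SBATCH` directives themselves are standard SLURM options, but a real template would be tuned to the partitions and resource limits of the specific cluster.

```python
from dataclasses import dataclass


@dataclass
class JobSpec:
    """Hypothetical description of one batch job; field names are illustrative."""
    job_name: str
    command: str
    nodes: int = 1
    ntasks_per_node: int = 1
    cpus_per_task: int = 4
    mem_gb: int = 16              # memory per node, in gigabytes
    time_limit: str = "04:00:00"  # wall-clock limit, HH:MM:SS
    gpus_per_node: int = 0        # 0 means no GPU request
    partition: str = ""           # empty means use the cluster default


def render_sbatch(spec: JobSpec) -> str:
    """Render a SLURM batch script as text using standard #SBATCH directives."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={spec.job_name}",
        f"#SBATCH --nodes={spec.nodes}",
        f"#SBATCH --ntasks-per-node={spec.ntasks_per_node}",
        f"#SBATCH --cpus-per-task={spec.cpus_per_task}",
        f"#SBATCH --mem={spec.mem_gb}G",
        f"#SBATCH --time={spec.time_limit}",
        # %j expands to the job ID, so each run gets its own log file.
        f"#SBATCH --output={spec.job_name}-%j.out",
    ]
    if spec.gpus_per_node > 0:
        lines.append(f"#SBATCH --gres=gpu:{spec.gpus_per_node}")
    if spec.partition:
        lines.append(f"#SBATCH --partition={spec.partition}")
    # Fail fast on errors, then launch the command under srun.
    lines += ["", "set -euo pipefail", f"srun {spec.command}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative job: one node, one GPU, default partition.
    print(render_sbatch(JobSpec(job_name="train", command="python train.py", gpus_per_node=1)))
```

The rendered text can be saved to a file and submitted with `sbatch`. Keeping the resource requests in one small structure makes it easier for new users to change only what they need.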
Need cloud or HPC infrastructure set up right? Let’s talk.