Today, Aulendur Labs is proud to release Cortex, an open-source platform that makes enterprise-grade model serving with vLLM dramatically easier. Licensed under Apache-2.0, Cortex represents our commitment to giving back to the AI community while making production ML infrastructure accessible to everyone.
Why Is vLLM Powerful But Complex to Deploy?
vLLM is an exceptional model serving engine, but deploying it in enterprise environments requires authentication, API management, GPU monitoring, and production configurations that take weeks to build from scratch. Cortex by Aulendur Labs wraps vLLM with all of this infrastructure out of the box.
vLLM is fast, efficient, and highly optimized for production workloads. However, deploying it in an enterprise context requires significant additional infrastructure:
- Authentication and API key management
- OpenAI-compatible API endpoints for easy integration
- Administrative interfaces for model and user management
- GPU monitoring and resource tracking
- Production-ready deployment configurations
- Proper CORS handling for web applications
Building all of this from scratch takes weeks or months. We built Cortex so you don't have to.
What Is Cortex by Aulendur Labs?
Cortex is an open-source, enterprise-grade model serving platform built by Aulendur Labs that wraps vLLM with OpenAI-compatible APIs, role-based authentication, GPU monitoring, and a modern admin UI. Cortex is the model serving infrastructure that will power DeepLoom at scale.
Cortex is a production-ready platform that wraps vLLM with everything you need for enterprise deployment. It provides:
Core Features
- OpenAI-Compatible API: Drop-in replacement for OpenAI's API, making integration seamless
- Admin UI: Modern web interface for managing models, users, and API keys
- Authentication & Authorization: Role-based access control with API key management
- Multi-Model Support: Serve multiple models simultaneously with automatic routing
- GPU Monitoring: Real-time visibility into GPU utilization, memory, and temperature
- System Metrics: Host CPU, memory, disk, and network monitoring via Prometheus
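Because the API is a drop-in replacement for OpenAI's, existing client code only needs a new base URL and API key. Here is a minimal standard-library sketch of how a chat completion request is formed; the base URL, key, and model name below are placeholders for illustration, not Cortex defaults:

```python
import json
import urllib.request

def build_chat_request(base_url: str, api_key: str, model: str,
                       messages: list) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-compatible chat completion request."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # Cortex-issued API key
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Placeholder values; substitute your gateway URL, key, and model name.
req = build_chat_request(
    "http://localhost:8000",
    "sk-example-key",
    "llama-3-8b",
    [{"role": "user", "content": "Hello"}],
)
```

Sending the request with `urllib.request.urlopen(req)`, or pointing the official `openai` client's `base_url` at the gateway, is all integration requires.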
Developer Experience
We've obsessed over making deployment as simple as possible. The entire stack can be launched with a single command:
```shell
make quick-start
# That's it! Cortex will:
# - Detect your host IP automatically
# - Configure CORS properly
# - Start all services
# - Display your access URLs
#
# Example output:
# ✓ Cortex is ready!
# Login at: http://192.168.1.181:3001/login (admin/admin)
```

How Is Cortex Architected?
Cortex by Aulendur Labs uses a modern, containerized architecture: a Python/FastAPI gateway handles authentication and routing, a Next.js admin UI manages users and models, PostgreSQL stores metadata, and vLLM containers serve models — all monitored via Prometheus, node-exporter, and dcgm-exporter.
Cortex is built on a modern, containerized architecture:
- Gateway (Python/FastAPI): Authentication, API routing, and business logic
- Admin UI (Next.js/React): User-friendly interface for administration
- PostgreSQL: Metadata storage for users, keys, and configurations
- vLLM Containers: Dynamic model serving engines
- Monitoring Stack: Prometheus, node-exporter, dcgm-exporter for observability
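The multi-model routing above can be pictured as a lookup in the gateway: each registered model maps to its own vLLM container. A sketch of that idea, with model names and container URLs invented for illustration:

```python
# Hypothetical model-to-container table; in Cortex this metadata lives in
# PostgreSQL and is managed through the Admin UI rather than hardcoded.
UPSTREAMS = {
    "llama-3-8b": "http://vllm-llama:8000",
    "mistral-7b": "http://vllm-mistral:8000",
}

def route(model: str) -> str:
    """Return the upstream vLLM container URL for a requested model."""
    try:
        return UPSTREAMS[model]
    except KeyError:
        # Surface unknown models as a client error rather than a lookup crash.
        raise ValueError(f"unknown model: {model!r}") from None
```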
What Are Cortex's Real-World Use Cases?
Cortex by Aulendur Labs serves three primary audiences: research labs that need easy access to language models without infrastructure expertise, enterprise AI teams deploying on-premises with full data control, and development teams needing local environments that mirror production.
1. Research Labs
Provide researchers with easy access to powerful language models without requiring deep infrastructure knowledge. The OpenAI-compatible API means existing code just works.
2. Enterprise AI Teams
Deploy models on-premises with full control over data and costs. Built-in monitoring ensures you can track resource utilization and optimize GPU allocation.
3. Development Teams
Local development environment that mirrors production. Test your AI applications against the same API interface you'll use in production.
How Does Cortex Balance Smart Defaults with Full Control?
Cortex by Aulendur Labs automatically detects host IP, available NVIDIA GPUs, and operating system to apply optimal settings — but everything can be customized through environment variables and Docker Compose profiles, giving teams both instant productivity and full configurability.
One of our design principles is "smart defaults, full control." Cortex automatically detects:
- Your host machine's IP address for proper network configuration
- Available NVIDIA GPUs, enabling the appropriate monitoring
- Your operating system (Linux or Windows), applying optimal settings
But everything can be customized through environment variables and Docker Compose profiles.
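"Smart defaults, full control" can be sketched as a settings lookup in which every value has a sensible default but yields to the environment. The variable names below are illustrative, not Cortex's actual configuration keys:

```python
import os

# Illustrative defaults; "auto" stands in for runtime detection.
DEFAULTS = {
    "CORTEX_HOST_IP": "auto",         # auto-detect the host IP unless pinned
    "CORTEX_GPU_MONITORING": "auto",  # enable only when NVIDIA GPUs are found
    "CORTEX_UI_PORT": "3001",
}

def setting(name: str) -> str:
    """An environment variable wins; otherwise fall back to the default."""
    return os.environ.get(name, DEFAULTS[name])
```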
Is Cortex Production-Ready?
Yes — Cortex by Aulendur Labs ships production-ready from day one with built-in health checks, database backup commands, comprehensive logging, proper authentication with CORS and network isolation, and horizontal scalability for adding model containers as demand grows.
Cortex includes features critical for production deployments:
- Health Checks: Built-in endpoints for monitoring service health
- Database Backups: Simple commands for backup and restore
- Logging: Comprehensive logging for debugging and audit trails
- Security: Proper authentication, CORS, and network isolation
- Scalability: Add more model containers as demand grows
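As an illustration of how a health-check endpoint might be consumed, here is a sketch that interprets a JSON health response; the schema shown is an assumption for illustration, not the Cortex API's actual shape:

```python
import json

def is_healthy(raw: bytes) -> bool:
    """Report readiness from a hypothetical /health response body.

    Assumes a body like {"status": "ok", "services": {"db": "up", ...}}.
    """
    body = json.loads(raw)
    return body.get("status") == "ok" and all(
        state == "up" for state in body.get("services", {}).values()
    )
```

A load balancer or uptime monitor can poll such an endpoint and route traffic only to healthy instances.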
Why Is Cortex Open Source?
Aulendur Labs believes in contributing back to the communities that enable its work. Cortex is released under the Apache-2.0 license through AulendurForge, Aulendur Labs' open-source initiative, because production AI infrastructure should be accessible to everyone — not just well-funded enterprises.
vLLM has been instrumental in our defense AI projects, and Cortex is our way of making it easier for others to leverage this powerful technology.
"The best software infrastructure feels invisible. It should just work, so you can focus on what matters—your AI applications."
— Jorden Gershenson, CTO, Aulendur Labs
Getting Started
Ready to try Cortex? Here's how to get started:
- Visit the GitHub Repository: github.com/AulendurForge/Cortex
- Clone and Install Prerequisites: Docker and Make are all you need
- Run Quick Start: make quick-start
- Access the Admin UI: Login at the displayed URL
Documentation & Support
The repository includes comprehensive documentation covering:
- Quick start and installation guides
- Architecture and design decisions
- Model download and configuration
- Deployment best practices
- API reference and examples
- Security considerations
What's Next for Cortex?
Aulendur Labs is actively developing Cortex features including automatic model scaling, request queuing and load balancing, fine-tuning pipeline integration, enhanced analytics, and multi-tenant isolation — building toward the infrastructure that will serve DeepLoom in production.
This is just the beginning. We're actively developing features like:
- Automatic model scaling based on demand
- Request queuing and load balancing
- Fine-tuning pipeline integration
- Enhanced analytics and usage tracking
- Multi-tenant isolation improvements
Join the Community
We'd love your feedback, bug reports, and contributions! Whether you're:
- Using Cortex in production
- Finding bugs or suggesting features
- Contributing code or documentation
- Sharing your use case
Please engage with us on GitHub. Star the repo, open issues, and submit pull requests—we're building this together.
About AulendurForge
AulendurForge is Aulendur Labs' open-source initiative, dedicated to building and sharing AI infrastructure tools that benefit the entire community. While our defense work remains proprietary, we're committed to open-sourcing components that have broad applicability.
Questions or feedback? info@aulendur.com
Frequently Asked Questions
What is Cortex?
Cortex is an open-source, enterprise-grade model serving platform built by Aulendur Labs. It simplifies deploying large language models with vLLM, providing production-ready monitoring, APIs, and authentication out of the box. Cortex is the infrastructure that will power DeepLoom at scale.
Is Cortex open source?
Yes. Cortex is fully open source under the Apache-2.0 license and is available on GitHub at github.com/AulendurForge/Cortex. Aulendur Labs is committed to open-sourcing AI infrastructure tools that benefit the entire community through its AulendurForge initiative.
How does Cortex relate to DeepLoom?
Cortex is the model serving infrastructure that will power DeepLoom in production. By building and proving Cortex as open-source software, Aulendur Labs validates its ability to deploy AI systems at scale with the monitoring, APIs, and authentication that enterprise and defense environments require.