Deploy the vLLM Inference Engine to Run Large Language Models (LLMs) on Koyeb
Learn how to set up a vLLM Instance to run inference workloads and host your own OpenAI-compatible API on Koyeb.
In this tutorial, we showcase how to deploy the vLLM inference engine on Koyeb to run large language models in production. vLLM exposes an OpenAI-compatible API server, so once your Service is live you can point any OpenAI client at your own endpoint and run inference workloads against it.
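As a preview of where we are headed, here is a minimal sketch of what querying the finished deployment could look like from the standard OpenAI Python client. The base URL, API key, and model name below are placeholders, not values from this tutorial; substitute your own Koyeb app URL and the model you choose to serve:

```python
# pip install openai
from openai import OpenAI

# vLLM serves an OpenAI-compatible API, so the official client works as-is.
# "https://your-app.koyeb.app/v1" and "your-model-name" are hypothetical
# placeholders for your actual Koyeb deployment URL and served model ID.
client = OpenAI(
    base_url="https://your-app.koyeb.app/v1",
    api_key="EMPTY",  # only needed if you configured an API key on the server
)

response = client.chat.completions.create(
    model="your-model-name",  # e.g. the Hugging Face model ID you deployed
    messages=[{"role": "user", "content": "Hello! What can you do?"}],
)
print(response.choices[0].message.content)
```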