Architecture & Security Deep Dive

Built for Scale. Hosted on Your Terms.

Flora runs an entirely local, data-sovereign pipeline. Ingestion, embeddings, retrieval, and generation stay inside the perimeter you control, so sensitive knowledge never leaves your network.

Deployment model
Fully local
Data posture
Data sovereign
Operations
Audit-friendly
Deployment contract

Local RAG from source to answer

No cloud egress
Ingestion

Documents are parsed within your network boundary.

Storage

Vectors and metadata remain in your Qdrant deployment.

Generation

Only authorized context reaches the model.

The Pipeline Breakdown

Four local stages from source document to answer

Each stage is optimized for throughput, latency, and strict control of the data path.

  1. Step 1/Doctor

    Ingestion with Doctor

    Doctor connects securely to enterprise knowledge bases and parses complex documents inside the local network, so source content never leaves the perimeter during ingestion.

    Security posture

    All parsing happens on-premise, with no document egress to external services.

  2. Step 2/TEI

    Embedding with TEI

    Hugging Face Text Embeddings Inference (TEI) turns cleaned content into highly accurate vector embeddings with near-zero latency, keeping the embedding stage fast enough for high-volume pipelines.

    Inference mode

    Local embeddings are generated immediately after parsing, without a network hop.

  3. Step 3/Qdrant

    Storage with Qdrant

    Qdrant stores the vectors at massive scale and supports sub-millisecond search, while retrieval-time RBAC filters narrow the candidate set to only the documents a user is allowed to see.

    Retrieval control

    Access filters are applied before results are ranked or returned.

  4. Step 4/vLLM

    Generation with vLLM

    vLLM produces the final answer on local hardware, using PagedAttention to sustain high-throughput, low-latency inference even as request volume climbs.

    Serving layer

    PagedAttention keeps GPU memory usage efficient under load.

Bottom line

Ready to deploy Flora on your infrastructure?

Get a deployment architecture that your security and platform teams can review with confidence.

Contact Sales