Prismatic Labs
Cloud & Research Roadmap
Prismatic Labs is building open-source tooling to detect, attribute, and control inference waste in production AI systems. This page describes where we’re headed technically, the infrastructure that would accelerate it, and the principles we apply to compute resources.
Technical direction
Four near-term priorities for Vetch and the Prismatic Labs research programme.
Priority 1
Multi-cloud inference coverage
Vetch currently instruments OpenAI, Anthropic, Vertex AI, and Azure OpenAI. The next step is parity across managed inference APIs, including Amazon Bedrock, with consistent waste detection, attribution, and OTLP export across providers; a sketch of the kind of provider-agnostic call record this implies follows below.
Needs: API access to Bedrock, Vertex AI Gemini endpoints, and Azure OpenAI at scale for cross-provider benchmarking.
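A minimal sketch of what a normalised call record could look like, so that waste detection and OTLP export operate on one shape regardless of backend. The field names and the adapter below are illustrative assumptions, not Vetch's published schema; the OpenAI usage keys (`prompt_tokens`, `completion_tokens`) follow the chat completions API.

```python
# Illustrative only: not Vetch's real schema. One way to normalise per-call
# usage across providers so downstream detection logic is provider-agnostic.
from dataclasses import dataclass


@dataclass
class InferenceCallRecord:
    provider: str            # e.g. "openai", "anthropic", "bedrock"
    model: str               # provider-specific model identifier
    input_tokens: int
    output_tokens: int
    latency_ms: float
    cost_usd: float          # derived from provider pricing tables
    retry_count: int = 0     # raw material for retry-storm detection


def from_openai_usage(model: str, usage: dict, latency_ms: float,
                      cost_usd: float) -> InferenceCallRecord:
    """Map an OpenAI-style `usage` payload into the common record.

    Each provider gets its own small adapter like this one; only the
    adapters know provider-specific key names.
    """
    return InferenceCallRecord(
        provider="openai",
        model=model,
        input_tokens=usage["prompt_tokens"],
        output_tokens=usage["completion_tokens"],
        latency_ms=latency_ms,
        cost_usd=cost_usd,
    )
```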
Priority 2
GPU energy model calibration
Current per-call energy estimates rely on token-count heuristics and provider power profiles. To improve accuracy, particularly for local models running on Ollama, vLLM, and llama.cpp, Vetch needs calibration against real GPU power measurements under controlled inference workloads; a minimal measurement sketch follows below.
Needs: GPU compute (NVIDIA A100/H100 class) for controlled inference benchmarking runs.
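A minimal power-sampling sketch using NVML via the pynvml bindings (the nvidia-ml-py package). It integrates instantaneous board power over a fixed-rate sampling loop to estimate energy per run; real calibration would also control for idle draw, warm-up, and sampling error. The function and parameter names are illustrative.

```python
# Sample GPU board power while a workload runs; return an energy estimate.
import threading
import time

import pynvml


def measure_energy_joules(run_inference, device_index: int = 0,
                          interval_s: float = 0.1) -> float:
    """Sample board power while `run_inference` executes; return joules."""
    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
        samples: list[float] = []
        stop = threading.Event()

        def sampler() -> None:
            while not stop.is_set():
                # nvmlDeviceGetPowerUsage reports milliwatts
                samples.append(pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0)
                time.sleep(interval_s)

        thread = threading.Thread(target=sampler)
        thread.start()
        start = time.monotonic()
        run_inference()  # the controlled inference workload under test
        elapsed = time.monotonic() - start
        stop.set()
        thread.join()
        # Mean power (W) * wall-clock duration (s) = energy (J)
        mean_watts = sum(samples) / max(len(samples), 1)
        return mean_watts * elapsed
    finally:
        pynvml.nvmlShutdown()
```

Comparing estimates like this against the token-count heuristic across model sizes and batch shapes is what turns the heuristic into a calibrated model.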
Priority 3
Public benchmarks and hosted dashboards
Inference waste benchmark results, regional carbon comparisons, and the Vetch inference calculator should be publicly accessible with reproducible methodology. This means hosting lightweight static dashboards and benchmark artefacts at stable URLs, with documented data provenance (an example manifest is sketched below).
Needs: Storage and hosting for static assets, benchmark data, and public demo endpoints.
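A hypothetical provenance manifest: the field names, model identifiers, and URL below are placeholders, not a published Vetch format. The intent is simply that every hosted artefact ships with enough metadata to be reproduced independently.

```python
# Write an illustrative provenance manifest alongside a benchmark artefact.
import json

manifest = {
    "benchmark": "inference-waste-baseline",        # illustrative name
    "run_date": "2025-01-15",
    "models": ["gpt-4o-mini", "claude-3-5-haiku"],  # versions pinned per run
    "token_budget": 500_000,                        # budget ceiling for the run
    "grid_region": "eu-west-1",                     # for regional carbon figures
    "methodology_url": "https://example.org/methodology",  # placeholder URL
    "license": "Apache-2.0",
}

with open("manifest.json", "w") as f:
    json.dump(manifest, f, indent=2)
```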
Priority 4
Observability pipeline automation
Vetch emits OTLP-compatible events today. The roadmap includes pre-built Grafana dashboard templates, CI/CD benchmark automation for regression testing of waste detection accuracy, and structured compliance exports aligned with EU AI Act and CSRD reporting requirements; a minimal OTLP emission sketch follows below.
Needs: Grafana Cloud or self-hosted Grafana, OTLP pipeline infrastructure, CI compute for benchmark runs.
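A minimal sketch of OTLP emission using the OpenTelemetry Python SDK (opentelemetry-sdk plus the gRPC OTLP exporter). The `gen_ai.*` attribute names follow OpenTelemetry's generative-AI semantic conventions as a stand-in, and the `vetch.*` attributes are hypothetical; Vetch's actual event schema is not reproduced here.

```python
# Emit one LLM-call span to a local OTLP collector (e.g. for Grafana).
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317"))
)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("vetch-sketch")

with tracer.start_as_current_span("llm.call") as span:
    span.set_attribute("gen_ai.system", "openai")
    span.set_attribute("gen_ai.usage.input_tokens", 812)
    span.set_attribute("gen_ai.usage.output_tokens", 164)
    # Hypothetical waste attributes for dashboarding; not a published schema.
    span.set_attribute("vetch.cost_usd", 0.0021)
    span.set_attribute("vetch.energy_wh", 0.4)
```

Spans in this shape are what pre-built Grafana templates and CI regression checks would query against.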
Responsible compute
We use cloud resources to measure and reduce waste — not create more of it.
Inference efficiency tooling should be held to the same standard it promotes. These are our operating principles for cloud and compute resources.
Budget-bounded tests
All benchmark runs are scoped to defined cost and token budgets. No open-ended experiments. Every run has a budget ceiling and a clear stopping condition.
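An illustrative budget guard, not Vetch code: the class and exception names are assumptions, but the principle is the one above, where crossing either ceiling is a hard stop.

```python
# Wrap every benchmark call in a budget ceiling with a hard stop.
class BudgetExceeded(Exception):
    pass


class RunBudget:
    def __init__(self, max_cost_usd: float, max_tokens: int):
        self.max_cost_usd = max_cost_usd
        self.max_tokens = max_tokens
        self.cost_usd = 0.0
        self.tokens = 0

    def charge(self, cost_usd: float, tokens: int) -> None:
        """Record one call's spend; raise once either ceiling is crossed."""
        self.cost_usd += cost_usd
        self.tokens += tokens
        if self.cost_usd > self.max_cost_usd or self.tokens > self.max_tokens:
            raise BudgetExceeded(
                f"stopping: ${self.cost_usd:.2f} / {self.tokens} tokens used"
            )
```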
Open-source outputs
All benchmark methodology, scripts, and results are published under Apache 2.0. The point is public evidence of inference efficiency, not proprietary advantage.
Reproducible artefacts
Results are published with the environment, model versions, and data sources needed to reproduce them. Uncertainty is documented, not hidden.
No unnecessary training runs
Vetch is an inference observability tool. The roadmap does not include model training. Compute is used for calibration, benchmarking, and tooling, not training runs.
Measurement-first
Compute decisions follow measurement. We benchmark before scaling, track energy and cost per run, and publish what we find — including the cases where estimates were wrong.
Developer tooling focus
The output of cloud compute is tooling that helps other AI teams use cloud resources more efficiently. Infrastructure investment here has a multiplier effect on the wider ecosystem.
Ecosystem value
Why inference efficiency tooling matters for cloud AI infrastructure.
Better AI resource utilisation
Stalled agent loops, RAG bloat, and retry storms waste compute on every iteration. Tools that detect and stop these patterns reduce unnecessary load on cloud inference infrastructure — freeing capacity and lowering the cost per useful output.
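An illustrative detection heuristic, not Vetch's actual algorithm: one signal that retry storms and stuck loops share is the same (tool, arguments) signature repeating within a short window. The function names and thresholds below are assumptions for the sketch.

```python
# Flag an agent loop as stalled when one call signature dominates a window.
from collections import deque


def make_stall_detector(window: int = 6, max_repeats: int = 3):
    """Return a predicate that flags repeated identical calls in a window."""
    recent: deque[tuple[str, str]] = deque(maxlen=window)

    def is_stalled(tool: str, args: str) -> bool:
        recent.append((tool, args))
        # True once the same signature has recurred max_repeats times recently.
        return recent.count((tool, args)) >= max_repeats

    return is_stalled
```

A guard like this sits inside the agent loop and aborts early, rather than letting the model retry the same failing call indefinitely.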
Observability around AI workloads
AI teams lack per-call visibility into cost, energy, and carbon. Vetch fills a gap that provider dashboards leave open: attributing spend and resource use to the features, customers, and workflows that drive it. This makes AI workloads legible in the same way APM tools made web services legible.
Public education and open tooling
Public benchmark dashboards, reproducible methodology, and open-source tooling create educational artefacts that raise the quality of inference efficiency work across the ecosystem — not just at Prismatic Labs.
Get in touch
For cloud and infrastructure partnership enquiries, startup programme reviews, or research collaborations, contact Prismatic Labs directly.
