Highlights: VerTQ is an accelerator chip that implements Google’s TurboQuant algorithm, which reduces KV cache memory usage of Large Language Models by a factor of…
Tag: Inference
Red Hat AI Inference and Red Hat OpenShift Virtualization Service are Coming to IBM Cloud
May 18, 2026 IBM, in partnership with Red Hat, is offering two new managed services, Red Hat AI Inference on IBM Cloud and Red Hat…
Startup Gimlet Labs is solving the AI inference bottleneck in a surprisingly elegant way
Stanford adjunct professor and successfully exited founder Zain Asgar just raised an $80 million Series A for a startup that solves the AI inference bottleneck…
Co-founders behind Reface and Prisma join hands to improve on-device model inference with Mirai
Much of the conversation around AI today is focused on building cloud capacity and massive data centers to run models. Companies like Apple and Qualcomm…
Exclusive: AI inference startup Modal Labs in talks to raise at $2.5B valuation, sources say
2:48 PM PST · February 11, 2026 Modal Labs, a startup specializing in AI inference infrastructure, is talking to VCs about a new round at…
Inference startup Inferact lands $150M to commercialize vLLM
In Brief Posted: 2:42 PM PST · January 22, 2026 The creators of the open source project vLLM have announced that they transitioned the popular…
Red Hat and AWS Deliver Enhanced AI Inference
Dec 8, 2025 Red Hat, a leading provider of open source solutions, announced an expanded collaboration with Amazon Web Services (AWS) to power enterprise-grade generative…
MarketsandMarkets’ 360Quadrants Recognizes Top Startups and SMEs in the AI Inference Quadrant Report 2025
/PRNewswire/ -- 360Quadrants has released its latest AI inference Startups/SMEs Companies Assessment, 2025, recognizing key players, including both global giants and emerging innovators, for their…