<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Genai on SREKubeCraft | Nikos Nikolakakis</title><link>https://srekubecraft.io/tags/genai/</link><description>Recent content in Genai on SREKubeCraft | Nikos Nikolakakis</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sun, 03 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://srekubecraft.io/tags/genai/index.xml" rel="self" type="application/rss+xml"/><item><title>KServe - Production ML Serving on Kubernetes, from sklearn to LLMs</title><link>https://srekubecraft.io/posts/kserve/</link><pubDate>Sun, 03 May 2026 00:00:00 +0000</pubDate><guid>https://srekubecraft.io/posts/kserve/</guid><description>&lt;p&gt;In November 2025, &lt;a href="https://www.cncf.io/blog/2025/11/11/kserve-becomes-a-cncf-incubating-project/"&gt;KServe joined the CNCF as an incubating project&lt;/a&gt;. For a project that started life as KFServing inside Kubeflow back in 2019, this was formal recognition that ML model serving on Kubernetes had grown up. KServe is now the closest thing the cloud-native ecosystem has to a standard for putting trained models behind an API: scikit-learn, XGBoost, PyTorch, TensorFlow, ONNX, Triton, and increasingly large language models are all served through the same &lt;code&gt;InferenceService&lt;/code&gt; CRD, with the same scale-to-zero, autoscaling, traffic-splitting, and observability primitives.&lt;/p&gt;</description></item></channel></rss>