
Data streaming company Confluent just hosted the first Kafka Summit in Asia in Bengaluru, India. The event saw a massive turnout from the Kafka community (over 30% of the global community comes from the region) and featured several customer and partner sessions.
In the keynote, Jay Kreps, CEO and co-founder of the company, shared his vision of building universal data products with Confluent to power both the operational and analytical sides of data. To that end, he and his teammates showed off several innovations coming to the Confluent ecosystem, including a new capability that makes it easier to run real-time AI workloads.
The offering, Kreps said, will save developers from the complexity of handling a variety of tools and languages when trying to train and infer AI models with real-time data. In a conversation with VentureBeat, Shaun Clowes, the company's CPO, dug deeper into these offerings and the company's approach to the age of modern AI.

Confluent’s Kafka story
Over a decade ago, organizations relied heavily on batch data for analytical workloads. The approach worked, but it meant understanding and driving value only from information up to a certain point, not from the freshest data.
To bridge this gap, a series of open-source technologies powering the real-time movement, management and processing of data were developed, including Apache Kafka.
Fast forward to today: Apache Kafka serves as the leading choice for streaming data feeds across thousands of enterprises.
Confluent, led by Kreps, one of the original creators of the open platform, has built commercial products and services (both self-managed and fully managed) around it.
However, that is just one piece of the puzzle. Last year, the data streaming player also acquired Immerok, a leading contributor to the Apache Flink project, to process (filter, join and enrich) data streams in-flight for downstream applications.
Now, at the Kafka Summit, the company has launched AI model inference in its cloud-native offering for Apache Flink, simplifying one of the most targeted applications of streaming data: real-time AI and machine learning.
“Kafka was created to enable all these different systems to work together in real time and to power really amazing experiences,” Clowes explained. “AI has just added fuel to that fire. For example, when you use an LLM, it will make up an answer if it has to. So, effectively, it will just keep talking about it whether or not it’s true. At that point, you call the AI, and the quality of its answer is almost always driven by the accuracy and the timeliness of the data. That’s always been true in traditional machine learning and it’s very true in modern ML.”
Previously, to call AI with streaming data, teams using Flink had to write code and use several tools to do the plumbing across models and data processing pipelines. With AI model inference, Confluent is making that “very pluggable and composable,” allowing them to use simple SQL statements from within the platform to make calls to AI engines, including those from OpenAI, AWS SageMaker, GCP Vertex and Microsoft Azure.
“You could already be using Flink to build the RAG stack, but you would have to do it using code. You would have to write SQL statements, but then you’d have to use a user-defined function to call out to some model and get the embeddings or the inference back. This, on the other hand, just makes it super pluggable. So, without changing any of the code, you can just call out to any embedding or generation model,” the CPO said.
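The company did not spell out the exact statements in this announcement, so the following Flink SQL is only a rough sketch of what such a pluggable call could look like; the CREATE MODEL options, the ML_PREDICT table function and every identifier here are illustrative assumptions rather than confirmed syntax.

```sql
-- Hypothetical sketch: register a remote model once, then call it from plain SQL.
-- All identifiers and option keys are illustrative, not confirmed Confluent syntax.
CREATE MODEL review_sentiment
  INPUT (review STRING)
  OUTPUT (sentiment STRING)
  WITH (
    'provider' = 'openai',  -- could equally point at SageMaker, Vertex or Azure
    'task' = 'text_classification',
    'openai.connection' = 'my-openai-connection'
  );

-- Score a stream of reviews without writing a user-defined function.
SELECT review, sentiment
FROM product_reviews,
     LATERAL TABLE(ML_PREDICT('review_sentiment', review));
```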
Flexibility and power
The company opted for this plug-and-play approach because it wants to give users the flexibility to go with whatever option suits their use case. Not to mention, the performance of these models keeps evolving over time, with no one model being the “winner or loser.” This means a user can go with model A to begin with and then switch to model B if it improves, without changing the underlying data pipeline.
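Under the same assumed syntax as the sketch above, switching providers would then touch only the model registration, never the streaming query:

```sql
-- Hypothetical: re-register the model against a different provider (model B).
DROP MODEL review_sentiment;

CREATE MODEL review_sentiment
  INPUT (review STRING)
  OUTPUT (sentiment STRING)
  WITH (
    'provider' = 'vertexai',  -- was 'openai'; option keys remain assumptions
    'task' = 'text_classification',
    'vertexai.connection' = 'my-vertex-connection'
  );

-- The underlying pipeline is unchanged and keeps running:
-- SELECT review, sentiment
-- FROM product_reviews, LATERAL TABLE(ML_PREDICT('review_sentiment', review));
```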
“In this case, really, you basically have two Flink jobs. One Flink job is listening to data about customer data, and that model generates an embedding from the document fragment and stores it into a vector database. Now, you have a vector database that has the latest contextual information. Then, on the other side, you have a request for inference, like a customer asking a question. So, you take the question from the Flink job and attach it to the documents retrieved using the embeddings. And that’s it. You call the chosen LLM and push the data in response,” Clowes noted.
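Clowes’s two-job description maps onto roughly the following shape. Again, this is a non-authoritative sketch under the same assumptions as above; in particular, the vector-store sink table and the VECTOR_SEARCH retrieval function are invented names for illustration.

```sql
-- Job 1 (hypothetical): embed document fragments and keep the vector store fresh.
INSERT INTO doc_embeddings  -- sink table assumed to be backed by a vector database
SELECT doc_id, fragment, embedding
FROM customer_docs,
     LATERAL TABLE(ML_PREDICT('embedding_model', fragment));

-- Job 2 (hypothetical): retrieve context for each question, then call the LLM.
INSERT INTO answers
SELECT q.question, g.response
FROM customer_questions AS q,
     LATERAL TABLE(VECTOR_SEARCH('doc_embeddings', q.question)) AS ctx(fragment),  -- assumed function
     LATERAL TABLE(ML_PREDICT('generation_model',
                              CONCAT(ctx.fragment, ' ', q.question))) AS g(response);
```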
Currently, the company offers access to AI model inference to select customers building real-time AI apps with Flink. It plans to expand access over the coming months and launch more features to make it easier, cheaper and faster to run AI apps with streaming data. Clowes said part of this effort will also include improvements to the cloud-native offering, which will get a gen AI assistant to help users with coding and other tasks in their respective workflows.
“With the AI assistant, you can be like ‘tell me where this topic is coming from, tell me where it’s going or tell me what the infrastructure looks like’ and it will give all the answers and execute tasks. This will help our customers build really good infrastructure,” he said.
A new way to save money
In addition to its approaches to simplifying AI efforts with real-time data, Confluent also talked about Freight Clusters, a new serverless cluster type for its customers.
Clowes explained that these auto-scaling Freight Clusters take advantage of cheaper but slower replication across data centers. This results in some latency, but provides up to a 90% reduction in cost. He said this approach works in many use cases, such as processing logging or telemetry data feeding into indexing or batch aggregation engines.
“With Kafka standard, you can go as low as electrons. Some customers go extremely low latency, 10-20 milliseconds. However, when we talk about Freight Clusters, we’re looking at one to two seconds of latency. It’s still pretty fast and can be an inexpensive way to ingest data,” the CPO noted.
As the next step in this work, both Clowes and Kreps indicated that Confluent looks to “make itself known” to grow its presence in the APAC region. In India alone, which already hosts the company’s second-biggest workforce outside of the U.S., it plans to increase headcount by 25%.
On the product side, Clowes emphasized that the company is exploring and investing in capabilities to improve data governance, essentially shifting governance left, as well as to catalog data to drive self-service. These elements, he said, are very immature in the streaming world compared to the data lake world.
“Over time, we’d hope that the whole ecosystem will also invest more in governance and data products in the streaming space. I’m very confident that’s going to happen. We as an industry have made more progress in connectivity and streaming, and even stream processing, than we have on the governance side,” he said.