Wednesday, November 12, 2025

Moonshot AI Releases Kosong: The LLM Abstraction Layer that Powers Kimi CLI

Modern agentic applications rarely talk to a single model or a single tool, so how do you keep that stack maintainable when providers, models and tools change every few weeks? Moonshot AI’s Kosong targets this problem as an LLM abstraction layer for agent applications. Kosong unifies message structures, asynchronous tool orchestration and pluggable chat providers so teams can build agents without hard-wiring business logic to a single API. It is also the layer that powers Moonshot’s Kimi CLI.

What does Kosong provide?

Kosong is a Python library that sits between your agent logic and LLM providers. The project describes it as an LLM abstraction layer for modern agent applications and shows example code that uses a Kimi chat provider together with the high-level helper functions generate and step.

The public API surface is intentionally kept small. At the top level you import kosong.generate, kosong.step and the result types GenerateResult and StepResult. Supporting modules define chat_provider, message, tooling, and tooling.simple. These modules wrap provider specific streaming formats, token accounting and tool calls behind one consistent interface.

ChatProvider and message model

The core integration point is the ChatProvider abstraction. The Moonshot team shows a provider implementation for Kimi in kosong.chat_provider.kimi. A Kimi object is initialized with base_url, api_key and the model name, for example kimi-k2-turbo-preview. This provider is then passed into kosong.generate or kosong.step together with a system prompt, tools and a message history.
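To make the idea of a pluggable provider concrete, here is a minimal standard-library sketch. It does not import Kosong: ChatProviderSketch, EchoProvider, UpperProvider and run_agent are hypothetical names invented for illustration, and Kosong's real ChatProvider interface may look quite different. The only point is that agent code written against an interface keeps working when the backend is swapped.

```python
import asyncio
from typing import Protocol


class ChatProviderSketch(Protocol):
    """Hypothetical stand-in for a provider interface (not Kosong's real one)."""

    model: str

    async def complete(self, system_prompt: str, history: list[str]) -> str: ...


class EchoProvider:
    """Fake backend that repeats the last message in the history."""

    model = "echo-1"

    async def complete(self, system_prompt: str, history: list[str]) -> str:
        return f"[{self.model}] {history[-1]}"


class UpperProvider:
    """A second fake backend that upper-cases the last message."""

    model = "upper-1"

    async def complete(self, system_prompt: str, history: list[str]) -> str:
        return f"[{self.model}] {history[-1].upper()}"


async def run_agent(provider: ChatProviderSketch, question: str) -> str:
    # The agent logic depends only on the interface, never on a concrete backend.
    return await provider.complete("You are a helpful assistant.", [question])


echo_reply = asyncio.run(run_agent(EchoProvider(), "hello"))
upper_reply = asyncio.run(run_agent(UpperProvider(), "hello"))
print(echo_reply)
print(upper_reply)
```

Swapping EchoProvider for UpperProvider changes the reply but leaves run_agent untouched, which is the property a provider abstraction is meant to give.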

Messages are represented by the Message class from kosong.message. In the examples, a message is constructed with a role, such as "user", and a content argument. The type of content is documented as either a string or a list of content parts, which lets the library support richer multimodal payloads while keeping the basic chat example simple for new users.
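The string-or-parts shape of content can be sketched with plain dataclasses. This is an illustration under assumptions, not Kosong's actual Message class: TextPartSketch, ImagePartSketch, MessageSketch and plain_text are hypothetical names, and the library's real content part types may differ.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class TextPartSketch:
    text: str


@dataclass
class ImagePartSketch:
    url: str


ContentPart = Union[TextPartSketch, ImagePartSketch]


@dataclass
class MessageSketch:
    role: str
    # Either a plain string or a list of typed parts (hypothetical shape).
    content: Union[str, list[ContentPart]]


def plain_text(message: MessageSketch) -> str:
    """Return only the textual content, whichever form it was given in."""
    if isinstance(message.content, str):
        return message.content
    return "".join(
        part.text for part in message.content if isinstance(part, TextPartSketch)
    )


simple = MessageSketch(role="user", content="Who are you?")
rich = MessageSketch(
    role="user",
    content=[
        TextPartSketch("Describe "),
        ImagePartSketch("https://example.com/cat.png"),
        TextPartSketch("this image."),
    ],
)
print(plain_text(simple))
print(plain_text(rich))
```

Accepting a bare string keeps the simple chat case short, while the list form leaves room for non-text parts.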

Kosong also exposes a streaming unit StreamedMessagePart via kosong.chat_provider. Provider implementations emit these parts during generation, and the library merges them into the final Message. The optional TokenUsage structure tracks token counts in a provider independent way, which is then attached to the result objects for logging and monitoring.

Tooling, Toolset and SimpleToolset

Most agent stacks need tools such as search, code execution or database calls. Kosong models this through the tooling module. The example in the GitHub repo defines a tool by subclassing CallableTool2 with a Pydantic parameter model. The example AddTool sets name, description and params, and implements __call__ to return a ToolOk value, which is a valid ToolReturnType.

Tools are registered in a SimpleToolset from kosong.tooling.simple. In the example, a SimpleToolset is instantiated and then augmented with the AddTool instance using the += operator. This toolset is passed into kosong.step, not into generate. The toolset is responsible for resolving tool calls from the model and routing them to the correct async function, while step manages the orchestration around a single conversational turn.
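The registration and routing behavior described here can be imitated in a few lines of standard-library Python. The sketch below is not Kosong's SimpleToolset: ToolSketch, ToolsetSketch and its handle method are hypothetical, and they only show how a += operator and name-based dispatch to async functions could fit together.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class ToolSketch:
    name: str
    description: str
    func: Callable[..., Awaitable[str]]


class ToolsetSketch:
    """Hypothetical registry that routes a tool call to an async function."""

    def __init__(self) -> None:
        self._tools: dict[str, ToolSketch] = {}

    def __iadd__(self, tool: ToolSketch) -> "ToolsetSketch":
        # Supports `toolset += tool`; returning self keeps the same object bound.
        self._tools[tool.name] = tool
        return self

    async def handle(self, name: str, arguments: dict) -> str:
        tool = self._tools.get(name)
        if tool is None:
            return f"error: unknown tool {name!r}"
        return await tool.func(**arguments)


async def add(a: int, b: int) -> str:
    return str(a + b)


toolset = ToolsetSketch()
toolset += ToolSketch(name="add", description="Add two integers.", func=add)

sum_result = asyncio.run(toolset.handle("add", {"a": 2, "b": 3}))
missing_result = asyncio.run(toolset.handle("multiply", {"a": 2, "b": 3}))
print(sum_result)
print(missing_result)
```

Returning an error string for an unknown tool, rather than raising, is one design choice among several; it lets the caller pass the failure back to the model as an ordinary tool result.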

generate for single shot completion

The generate function is the entry point for plain chat completion. You provide the chat_provider, a system_prompt, an explicit list of tools, which can be empty, and a history of Message objects. The Kimi example shows a minimal usage pattern where a single user message is passed as history and tools=[].

generate supports streaming through an on_message_part callback. In the GitHub repo, the research team illustrates this by defining a simple output function that prints each StreamedMessagePart. After streaming is complete, generate returns a GenerateResult that contains the merged assistant message and an optional usage structure with token counts. This pattern lets applications both display incremental output and still work with a clean final message object.
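The callback-plus-final-result pattern can be sketched without any network access. Everything below is hypothetical and stands in for the real library: fake_stream replaces a provider, generate_sketch only mimics the shape of the behavior described above, and the token counts are made-up placeholder numbers.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Callable, Optional


@dataclass
class UsageSketch:
    input_tokens: int
    output_tokens: int


@dataclass
class ResultSketch:
    message: str
    usage: Optional[UsageSketch]


async def fake_stream() -> AsyncIterator[str]:
    """Stand-in for a provider that yields message parts one at a time."""
    for part in ["Hello", ", ", "world", "!"]:
        await asyncio.sleep(0)  # yield control, as a real network read would
        yield part


async def generate_sketch(
    on_message_part: Optional[Callable[[str], None]] = None,
) -> ResultSketch:
    parts: list[str] = []
    async for part in fake_stream():
        if on_message_part is not None:
            on_message_part(part)  # incremental output for the UI
        parts.append(part)
    # Placeholder usage numbers; a real provider would report actual counts.
    usage = UsageSketch(input_tokens=3, output_tokens=len(parts))
    return ResultSketch(message="".join(parts), usage=usage)


seen: list[str] = []
result = asyncio.run(generate_sketch(on_message_part=seen.append))
print(seen)
print(result.message)
```

The caller sees each part as it arrives through the callback and still receives one merged message at the end, so display code and downstream logic do not have to share state.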

step for tool using agents

For tool-using agents, Kosong exposes the step function. The example in the GitHub repo shows kosong.step being called with a Kimi provider, a SimpleToolset that contains AddTool, a system prompt and a user history that instructs the model to call the add tool.

step returns a StepResult. The example prints result.message and then awaits result.tool_results(). This method collects all tool outputs produced during the step and returns them to the caller. The orchestration of tool calls, including argument parsing into the Pydantic parameter model and conversion into ToolReturnType results, is handled inside Kosong so agent authors do not have to implement their own dispatch loop for each provider.
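One plausible way to structure such a step, sketched with the standard library only, is to parse each raw tool call into a typed parameter object, start the tool as an asyncio task, and let the caller await the collected results later. This is an assumption about the general pattern, not a description of Kosong's internals: AddParams (a dataclass standing in for a Pydantic model), StepResultSketch and step_sketch are hypothetical names.

```python
import asyncio
import json
from dataclasses import dataclass


@dataclass
class AddParams:
    a: int
    b: int

    def __post_init__(self) -> None:
        # Minimal validation standing in for what a Pydantic model would do.
        if not isinstance(self.a, int) or not isinstance(self.b, int):
            raise TypeError("a and b must be integers")


async def add_tool(params: AddParams) -> str:
    return str(params.a + params.b)


class StepResultSketch:
    def __init__(self, message: str, tasks: list) -> None:
        self.message = message
        self._tasks = tasks

    async def tool_results(self) -> list[str]:
        # gather returns results in the order the tasks were passed in.
        return list(await asyncio.gather(*self._tasks))


async def step_sketch(raw_tool_calls: list[str]) -> StepResultSketch:
    tasks = []
    for raw in raw_tool_calls:
        params = AddParams(**json.loads(raw))  # JSON arguments -> typed params
        tasks.append(asyncio.create_task(add_tool(params)))
    return StepResultSketch(message="Calling add.", tasks=tasks)


async def main() -> tuple[str, list[str]]:
    result = await step_sketch(['{"a": 2, "b": 3}', '{"a": 10, "b": -4}'])
    return result.message, await result.tool_results()


message, outputs = asyncio.run(main())
print(message)
print(outputs)
```

Separating the returned message from the awaited tool results lets a caller show the model's reply immediately while the tools are still running.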

Built in demo and relationship with Kimi CLI

Kosong ships with a built-in demo agent that can be run locally. The README documents the environment variables KIMI_BASE_URL and KIMI_API_KEY, and shows a launch command using uv run python -m kosong kimi --with-bash. This demo uses Kimi as the chat provider and exposes a terminal agent that can call tools, including shell commands when the --with-bash option is enabled.

Key Takeaways

  1. Kosong is an LLM abstraction layer from Moonshot AI that unifies message structures, asynchronous tool orchestration and pluggable chat providers for agent applications.
  2. The library exposes a small core API, generate for plain chat and step for tool using agents, backed by abstractions such as ChatProvider, Message, Tool, Toolset and SimpleToolset.
  3. Kosong currently ships a Kimi chat provider targeting the Moonshot AI API, and defines the ChatProvider interface so teams can plug in additional backends without changing agent logic.
  4. Tool definitions use Pydantic parameter models and ToolReturnType results, which lets Kosong handle argument parsing, validation and orchestration of tool calls inside step.
  5. Kosong powers Moonshot’s Kimi CLI, providing the underlying LLM abstraction layer while Kimi CLI focuses on the command line agent experience that can target Kimi and other backends.

Kosong looks like a pragmatic move from Moonshot AI. It cleanly separates agent logic from LLM and tool backends while keeping the surface area small for early developers. By centering everything on ChatProvider, Message and Toolset, it gives Kimi CLI and other stacks a consistent way to evolve models and tooling without rewriting orchestration. For teams building long-term agent systems, Kosong could be the right kind of minimal infrastructure.



Michal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova. With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex datasets into actionable insights.
