Beijing University of Posts and Telecommunications Reconstructs Multi-Agent Orchestration Paradigm

Introduction

The MASFactory framework, developed by a team from Beijing University of Posts and Telecommunications, rethinks multi-agent system orchestration around Vibe Graphing. This approach converts natural language commands into structured workflows, reportedly cutting API costs to about one-tenth of those of traditional methods while sharply reducing code volume and improving development efficiency. The graph-centric system supports hybrid orchestration and full-stack visualization, and it is reported to outperform similar solutions across multiple benchmark tests.

Vibe Graphing: A Game Changer

Recently, a technical analysis video titled “Vibe Graphing: 10x More Affordable than Vibe Coding (MAS-Factory)” gained significant traction on platforms like YouTube and resonated strongly within the developer community. Its headline claim: for tasks of equivalent complexity, traditional Vibe Coding costs several dollars in API fees, while Vibe Graphing cuts that cost to roughly one-tenth and substantially raises the success rate.

The star of this video is the MASFactory framework, which is now open-source on GitHub, with accompanying academic papers available for reference.

Challenges with Vibe Coding

As the capabilities of foundation models evolve, the industry has largely converged on the view that multi-agent systems (MAS) can use role specialization, cross-validation, and iterative collaboration to tackle complex long-horizon tasks that a single agent struggles to manage. However, building a robust and scalable MAS remains a daunting engineering challenge.

Existing orchestration frameworks generally fall into two categories, and even the widely recognized Vibe Coding paradigm fails to address the shortcomings of these development approaches. Many large language models have seen little training data for niche domain-specific languages (DSLs), so Vibe Coding tools struggle when they meet hard-coded frameworks. Developers must spend extra tokens for the model to learn the DSL syntax, and then considerable effort to make it follow the DSL’s topological rules and communication logic, which severely degrades the development experience.

Vibe Graphing: The New Paradigm

Vibe Graphing serves as a compiler that translates natural language intent into structured intermediate representations (IR) and executable workflows. Unlike Vibe Coding, which directly outputs low-level code, Vibe Graphing abstracts MAS construction into a well-ordered three-stage design process encapsulated within a loop component driven by three agents:

Role Assignment: The system first strips away all code implementation details, focusing solely on the ‘human’ aspect. When a developer inputs a natural language command like “Help me construct a literature review workflow,” the engine maps the task intent to a set of candidate agents (e.g., retriever, reader, writer, reviewer) and delineates strict responsibility boundaries.
Topology Design: After defining roles, the system constructs a directed graph topology skeleton, which specifies message dependencies and execution order (e.g., serial, parallel, loop) without involving any specific prompts or tool calls.
Semantic Completion: Finally, the system parameterizes the topology skeleton, precisely configuring instructions and input-output constraints for each node.
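The three stages above can be pictured as successive passes that enrich one intermediate representation. The following is only a minimal sketch under stated assumptions: `Role`, `WorkflowIR`, and the three stage functions are hypothetical names invented for illustration, not MASFactory's API, and each stage is hard-coded where the real system would call an LLM-driven agent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical types for illustration only; not MASFactory's actual API.


@dataclass
class Role:
    name: str
    responsibility: str


@dataclass
class WorkflowIR:
    roles: List[Role] = field(default_factory=list)
    edges: List[Tuple[str, str]] = field(default_factory=list)
    instructions: Dict[str, str] = field(default_factory=dict)


def assign_roles(intent: str) -> WorkflowIR:
    # Stage 1: map the intent to roles with clear responsibility boundaries.
    # Hard-coded here; an agent would derive these from `intent`.
    roles = [
        Role("retriever", "find candidate papers"),
        Role("reader", "summarize each paper"),
        Role("writer", "draft the review"),
        Role("reviewer", "critique the draft"),
    ]
    return WorkflowIR(roles=roles)


def design_topology(ir: WorkflowIR) -> WorkflowIR:
    # Stage 2: wire roles into a directed graph, with no prompts or tools yet.
    names = [role.name for role in ir.roles]
    ir.edges = list(zip(names, names[1:]))   # serial chain
    ir.edges.append(("reviewer", "writer"))  # revision loop
    return ir


def complete_semantics(ir: WorkflowIR) -> WorkflowIR:
    # Stage 3: attach an instruction to every node of the skeleton.
    ir.instructions = {
        role.name: f"You are the {role.name}. Your only job: {role.responsibility}."
        for role in ir.roles
    }
    return ir


intent = "Help me construct a literature review workflow"
ir = complete_semantics(design_topology(assign_roles(intent)))
```

Keeping the stages separate means the topology can be inspected or edited before any prompt text exists, which mirrors the ordering described above.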

By requiring the AI to generate or modify only very short JSON topology configurations instead of thousands of lines of Python logic, Vibe Graphing sharply cuts token consumption, achieving a roughly tenfold reduction compared to Vibe Coding.
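To give a feel for how compact such a topology configuration can be, here is a hedged sketch. The JSON schema (`nodes`, `edges`) and the `load_topology` helper are assumptions made for this example; the actual MASFactory format is not shown in this article and may differ.

```python
import json
from typing import Dict, List

# Hypothetical topology format: a list of node names and directed edges.
CONFIG = """
{
  "nodes": ["retriever", "reader", "writer", "reviewer"],
  "edges": [
    ["retriever", "reader"],
    ["reader", "writer"],
    ["writer", "reviewer"],
    ["reviewer", "writer"]
  ]
}
"""


def load_topology(text: str) -> Dict[str, List[str]]:
    """Parse the JSON and return an adjacency list, rejecting unknown nodes."""
    cfg = json.loads(text)
    adjacency: Dict[str, List[str]] = {name: [] for name in cfg["nodes"]}
    for src, dst in cfg["edges"]:
        if src not in adjacency or dst not in adjacency:
            raise ValueError(f"edge references unknown node: {src} -> {dst}")
        adjacency[src].append(dst)
    return adjacency


adjacency = load_topology(CONFIG)
```

In this sketch a model only has to emit or patch the few lines of JSON, while validation and execution live in ordinary, fixed code.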

MASFactory: A Graph-Centric Architecture

Vibe Graphing is supported by MASFactory’s underlying graph engine, which models multi-agent workflows as directed computation graphs. The system is divided into four layers:

Graph Skeleton: At the lowest level, the system consists entirely of nodes and edges. MASFactory rigorously isolates collaborative signals, splitting them into three streams.
Component and Reusable Layers: Basic nodes are further abstracted into a rich library of components. In addition to the foundational agents following the Perception-Reasoning-Action paradigm, the system also provides:
Protocol and Context Adaptation Layer: Modern MAS applications inevitably require integration of heterogeneous external components (Memory, RAG, MCP, etc.). MASFactory introduces two types of adapters.
Hybrid Orchestration and Visual Interaction Layer: At the operational entry point, MASFactory accommodates code development (declarative, imperative), visual drag-and-drop, and Vibe Graphing simultaneously. Notably, these three methods are not only non-conflicting but can also be mixed and nested within the same large project. Developers can hard-code a rigorously defined tool subgraph, generate a static structure diagram with Vibe Graphing, and finally combine them into a complete workflow using visual drag-and-drop.
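The idea of mixing and nesting graphs can be sketched in a few lines of Python. This is an illustrative toy, not MASFactory's engine: the `Graph` class and its `add_node` method are hypothetical, nodes are plain functions on strings, and only acyclic graphs are handled (the standard-library `graphlib.TopologicalSorter` rejects cycles, so loops would need extra machinery).

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Sequence


class Graph:
    """Toy directed computation graph; a Graph is callable, so it can be a node."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Callable[[str], str]] = {}
        self.preds: Dict[str, List[str]] = {}

    def add_node(
        self, name: str, fn: Callable[[str], str], after: Sequence[str] = ()
    ) -> "Graph":
        self.nodes[name] = fn
        self.preds[name] = list(after)
        return self

    def __call__(self, message: str) -> str:
        outputs: Dict[str, str] = {}
        last = message
        # Run each node after all of its predecessors have produced output.
        for name in TopologicalSorter(self.preds).static_order():
            # Entry nodes receive the graph input; others receive predecessor output.
            inputs = [outputs[p] for p in self.preds[name]] or [message]
            outputs[name] = self.nodes[name](" | ".join(inputs))
            last = outputs[name]
        return last


# A hand-written tool subgraph...
tools = (
    Graph()
    .add_node("search", lambda m: m + " -> search")
    .add_node("rank", lambda m: m + " -> rank", after=["search"])
)

# ...nested as a single node inside a larger workflow.
workflow = (
    Graph()
    .add_node("plan", lambda m: m + " -> plan")
    .add_node("tools", tools, after=["plan"])
    .add_node("write", lambda m: m + " -> write", after=["tools"])
)

result = workflow("task")
print(result)  # task -> plan -> search -> rank -> write
```

Because a subgraph exposes the same call interface as a single node, it does not matter to the outer workflow whether that subgraph was written by hand, generated, or assembled visually.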

Additionally, MASFactory offers a deeply integrated VS Code extension, MASFactory Visualizer. This tool provides full-stack visual support across code orchestration, drag-and-drop orchestration, Vibe Graphing, and debugging runs.

Performance Evaluation

MASFactory reports strong results across seven major benchmarks covering code generation and complex reasoning, including HumanEval, MBPP, BigCodeBench, SRDD, GAIA, and MMLU-Pro. Experiments indicate that MASFactory can reliably and efficiently reproduce the most representative systems in the industry (including ChatDev, MetaGPT, AgentVerse, CAMEL, and HuggingGPT), outperforming the original hard-coded implementations on multiple metrics.