About Us
UCM is built around the KV Cache, with the goal of reducing inference costs and building commercially viable inference solutions. It improves throughput through methods such as Prefix Cache, sparsification, and prefill-decode (PD) Disaggregation.
The UCM team is a group of “lazy” people who love simple things and enjoy “borrowing” the excellent experience of others. In the spirit of full openness, we hope everyone will share their insights generously. We also welcome everyone to learn from these experiences together, join the discussion, and help us make progress.