Introduction


This is a course project for the "Scientific Research Project Guidance and Training" course in my major. We chose to build a Rubik's Cube solving robot that can identify the state of a 3×3×3 cube and solve it.

The robot has six degrees of freedom, one for rotating each of the six faces. Four cameras, placed at four corners, identify the colors on the cube. A Raspberry Pi serves as the upper controller for overall control, and an Arduino board serves as the lower controller that drives the stepper motors.

The project received excellent feedback at the final presentation. There were nine projects in the course in total.

Videos

Video: YouTube (with English captions), Youku

Video: YouTube (with English captions), Youku

Contributions to the Project


Apart from handling finances and purchasing, I was in charge of the upper controller: choosing the robot architecture and the programming language, building the overall control program, integrating the solving algorithm, and implementing a display interface. I also took part in debugging and wrote a user manual (in Chinese, as required by the course).

The work required a lot of teamwork. I worked with the teammate responsible for the lower controller to define the serial communication protocol. I worked with the teammates responsible for the vision module to integrate their program into my part, and helped debug it. I also implemented the display interface from a design draft made by another teammate.

The Architecture of the Upper Controller and the Display Interface

Composition of the software on the upper controller

As the video shows, the display interface is web-based and optimized for full-screen presentation on a PC. It relies on a Python server running on the upper controller, which uses WebSocket to push the robot's status to the client. As a result, the 3D cube model on the webpage shows the state the cube should be in while the robot runs. (The displayed state follows the rotation results that the lower controller reports in real time.)
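The push model described above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual server: the `StatusHub` class, the JSON message fields, and the move notation are all assumptions made for the example, and the WebSocket transport itself is left out. In a real server, each subscriber queue would be drained by one WebSocket connection handler.

```python
import asyncio
import json


class StatusHub:
    """Fan out robot status messages to every connected client.

    Hypothetical helper: each client gets its own asyncio.Queue, which a
    WebSocket handler would drain and forward to the browser.
    """

    def __init__(self):
        self._clients = set()

    def subscribe(self):
        queue = asyncio.Queue()
        self._clients.add(queue)
        return queue

    def unsubscribe(self, queue):
        self._clients.discard(queue)

    def publish_move(self, move):
        # Serialize once, then hand the same text message to every client.
        message = json.dumps({"type": "move", "move": move})
        for queue in self._clients:
            queue.put_nowait(message)
        return message


async def demo():
    hub = StatusHub()
    debug_page = hub.subscribe()
    display_page = hub.subscribe()

    # Pretend the lower controller has just confirmed these rotations.
    for move in ["R", "U'", "F2"]:
        hub.publish_move(move)

    # The display page replays the moves on its 3D model, in order.
    received = []
    while not display_page.empty():
        received.append(json.loads(await display_page.get())["move"])
    return received, debug_page.qsize()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because every client has its own queue, a slow page does not hold back the others, and a page that connects later simply starts receiving from the next move onward.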

The advantages of a web-based solution:

  • Low server resource usage: While accepting control, the server only needs to keep one program listening on a port. Communication is normally plain text, and the server does not render the graphics shown on the client.
  • Low client requirements: A client needs only a wireless network connection and a modern browser to operate the robot, with no extra software to install, so mobile devices can control the robot without a dedicated app. With WebGL and similar web technologies, the client can render the 3D cube in real time, with display quality that rivals other approaches.
  • Good extensibility: Clients and the server communicate through one unified interface, so different web "clients" can be built on the same server API for different purposes, such as debugging, demonstration, or beginner use, and they can control the robot at the same time. With minor changes to optimize the webpage for phones, a phone could control the robot directly without a computer.


We also integrated two "bullet screen" (danmu) interactive programs, covering both the intranet and the Internet. Because the final presentation did not need audience interaction, we used the bullet screen to show some predefined text instead.

Bullet screen interactive webpage shown after scanning the QR code on the presentation screen

Server Program on the Upper Controller

The server program on the upper controller is written in Python and builds on libraries including OpenCV (image processing), bluezero (Bluetooth control), and pyserial (serial communication). The robot's run control uses multiprocessing, so the web server can still respond to user requests while the robot is solving the cube. During image recognition, the images from different cameras are processed in separate processes at the same time, which makes better use of the CPU and speeds up recognition.
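The idea of processing each camera's image in its own process can be illustrated as follows. This is only a sketch under stated assumptions: `dominant_color` is a stand-in for the real OpenCV recognition step, it takes a list of already-classified color labels instead of an image, and the function names and the worker count are invented for the example.

```python
from collections import Counter
from multiprocessing import Pool


def dominant_color(pixels):
    """Return the most common color label in one camera's sample.

    Stand-in for the real OpenCV pipeline: `pixels` is a list of
    already-classified color labels rather than an image.
    """
    return Counter(pixels).most_common(1)[0][0]


def recognize_all(camera_samples, workers=4):
    """Process each camera's sample in a separate worker process."""
    with Pool(processes=workers) as pool:
        # map() keeps the results in the same order as the cameras.
        return pool.map(dominant_color, camera_samples)


if __name__ == "__main__":
    # Made-up samples, one per camera.
    samples = [
        ["red", "red", "white"],
        ["blue", "green", "blue"],
        ["yellow", "yellow", "yellow"],
        ["orange", "white", "orange"],
    ]
    print(recognize_all(samples))
```

The `if __name__ == "__main__":` guard matters here: on platforms that start worker processes by re-importing the main module, it keeps the workers from creating pools of their own.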

Part of the solving sequence diagram

Integration of the Solving Algorithm

After researching, testing, and comparing the existing solving algorithms available online, I chose muodov/kociemba and tuned its run-time parameters. The final program takes 0.196 s on average on a PC.

The solver is written in C and compiled with gcc into an executable, which Python invokes from the command line.
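Calling a compiled solver from Python can be sketched with `subprocess`. The details below are assumptions for illustration, not the project's actual code: the executable path `./kociemba`, the convention of passing a 54-character facelet string (nine each of U, R, F, D, L, B) as the only argument, and the assumption that the solver prints the moves separated by whitespace.

```python
import subprocess
from collections import Counter


def validate_cube_string(cube):
    """Check the 54-character facelet string before calling the solver."""
    if len(cube) != 54:
        return False
    counts = Counter(cube)
    # Six face letters, nine stickers each.
    return set(counts) == set("URFDLB") and all(n == 9 for n in counts.values())


def parse_solution(stdout_text):
    """Split the solver's output into a list of moves such as ["R", "U'", "F2"]."""
    return stdout_text.split()


def solve(cube, solver_path="./kociemba", timeout=10):
    """Run the compiled solver (hypothetical path) and return its moves."""
    if not validate_cube_string(cube):
        raise ValueError("cube string must contain 9 each of U, R, F, D, L, B")
    result = subprocess.run(
        [solver_path, cube],   # argument list, so no shell quoting is needed
        capture_output=True,
        text=True,
        timeout=timeout,       # give up if the solver hangs
        check=True,            # raise CalledProcessError on a non-zero exit
    )
    return parse_solution(result.stdout)
```

Validating the string first keeps obviously wrong recognition results from reaching the solver, and the timeout keeps a stuck solver process from blocking the control program.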

Benchmarks of the Kociemba algorithm with different parameters

Solver      Avg. time (s)   Avg. time, Hard * (s)   Avg. moves (HTM)   Avg. moves, Hard * (HTM)   Failed cases
solver-21   0.054           0.089                   10.54              20.19                      0
Final       0.196           0.405                   10.34              19.77                      0
solver-20   0.181           0.373                   10.33              19.74                      #6
solver-19   0.704           1.486                   10.25              19.57                      #33

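* "Hard" refers to test cases whose solutions take more than 5 moves in HTM. The cases that meet this condition are the same for all solvers (47 in total).
# The data for solvers marked with # are not accurate. For cases these solvers failed to solve, the time and move count were taken directly from the last solver that solved the case successfully (solvers were run in the order of the table, top to bottom), without adding the time spent by the failing solver.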
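The averaging and the fallback rule from the two notes above can be made concrete with a short sketch. The function name, the data layout, and the numbers in the usage example are all invented for illustration; they are not the benchmark script or the benchmark data.

```python
def summarize(results, previous=None, hard_threshold=5):
    """Average time and move count over all cases and over the "Hard" cases.

    `results` maps a case id to (seconds, moves), or to None when the solver
    failed. For a failed case, the entry from `previous` (the last solver
    that succeeded) is reused unchanged, which is why rows marked with #
    understate the real time.
    """
    rows = []
    failed = 0
    for case_id, outcome in results.items():
        if outcome is None:
            failed += 1
            outcome = previous[case_id]
        rows.append(outcome)

    # "Hard" cases: more than `hard_threshold` moves in HTM.
    hard = [row for row in rows if row[1] > hard_threshold]

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return {
        "avg_time": mean([t for t, _ in rows]),
        "avg_moves": mean([m for _, m in rows]),
        "avg_time_hard": mean([t for t, _ in hard]),
        "avg_moves_hard": mean([m for _, m in hard]),
        "failed": failed,
    }


if __name__ == "__main__":
    # Made-up numbers: the second solver fails on case "b".
    first = {"a": (0.1, 4), "b": (0.2, 20)}
    second = {"a": (0.3, 4), "b": None}
    print(summarize(second, previous=first))
```

In the usage example, case "b" contributes the first solver's 0.2 s to the second solver's average even though the second solver spent time on it and failed, which is exactly the inaccuracy the # note warns about.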
