
ML-Optimized Walking Robot

This project was a hands-on exploration of machine learning in robotics, developed in close collaboration with a partner through an end-to-end design and build process. We set out to create a bipedal, Raspberry Pi-controlled walking robot that could learn to move through reinforcement learning. The original design was inspired by duck locomotion, but repeated balance and hardware issues forced us to adapt; the robot ultimately discovered a lateral "shuffling" motion as its optimal strategy. Through this iterative and often unpredictable process, we gained firsthand experience with the challenges of translating simulated ML solutions into physical movement, and with the broader reality of working across digital and mechanical domains.

Goals

  • Build a fully integrated robotic system, from design and fabrication to embedded control and learning algorithms.

  • Apply reinforcement learning techniques to discover gait strategies in a real-world setting.

  • Implement robust software infrastructure for startup, shutdown, servo control, and system health monitoring.

  • Collaborate across roles, with a division of responsibility between mechanical design and software/ML implementation.

  • Learn to adapt design and expectations in response to real-world physical constraints.


Specs

  • Hardware: 6 servo motors, Raspberry Pi 4, 12V battery system, 3D-printed chassis (originally duck-inspired)

  • Software: Python 3, sinusoidal control loops, servo homing and reset functions, health monitoring scripts

  • Machine Learning: Reinforcement learning logic for gait tuning, using lateral movement as the reward signal

  • Simulation: Online ML modeling using CAD geometry for pre-testing gait strategies

  • Embedded Systems: Real-time coordination of servo motion and condition-aware startup/shutdown sequences
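The sinusoidal control loops listed above can be sketched as a pure function mapping time to per-joint angle commands. This is a minimal illustration, not the project's actual code: the servo names, amplitudes, frequency, and phase offsets are assumed values.

```python
import math

NEUTRAL_DEG = 90.0  # assumed homing position for every servo

def servo_angle(t, amplitude_deg, frequency_hz, phase_rad):
    """Angle command for one joint at time t (seconds)."""
    return NEUTRAL_DEG + amplitude_deg * math.sin(
        2 * math.pi * frequency_hz * t + phase_rad
    )

# Hypothetical gait parameters for a 6-servo biped:
# (amplitude_deg, frequency_hz, phase_rad)
GAIT = {
    "left_hip":    (20.0, 1.0, 0.0),
    "right_hip":   (20.0, 1.0, math.pi),        # antiphase with left
    "left_knee":   (15.0, 1.0, math.pi / 2),
    "right_knee":  (15.0, 1.0, 3 * math.pi / 2),
    "left_ankle":  (10.0, 1.0, 0.0),
    "right_ankle": (10.0, 1.0, math.pi),
}

def gait_frame(t):
    """All six servo commands for one control tick."""
    return {name: servo_angle(t, a, f, p) for name, (a, f, p) in GAIT.items()}
```

On hardware, the control loop would call `gait_frame` at a fixed tick rate and write each angle to its PWM channel; phase offsets between joints are what turn six independent sinusoids into a coordinated gait.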


Design process

  • Collaborated with a partner who led CAD design, while we shared responsibility for fabrication and assembly.

  • Started with a duck-inspired design, but iterative testing revealed major balance issues, leading us to remove the upper "body" for stability.

  • Prototyped leg motion using sinusoidal functions, gradually layering in ML tuning for amplitude and phase shifts.

  • Implemented safety-focused systems like health checks, servo reset/homing methods, and controlled boot/shutdown processes.

  • Continuously revised both mechanical and software systems as we encountered failures in balance, joint tolerance, and servo calibration.
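The "layering in ML tuning" step above amounted to searching over the sinusoid's amplitude and phase parameters against a displacement reward. A minimal hill-climbing sketch of that idea follows; the stand-in reward function is purely illustrative (on the robot, the score came from observed lateral displacement per trial).

```python
import math
import random

def stand_in_reward(amplitude, phase):
    """Hypothetical stand-in for measured displacement per trial."""
    return -((amplitude - 15.0) ** 2) - (phase - math.pi / 2) ** 2

def tune_gait(reward_fn, steps=200, seed=0):
    """Hill-climb over [amplitude_deg, phase_rad], keeping improvements."""
    rng = random.Random(seed)
    params = [10.0, 0.0]              # starting guess
    best = reward_fn(*params)
    for _ in range(steps):
        candidate = [p + rng.gauss(0, 0.5) for p in params]
        score = reward_fn(*candidate)
        if score > best:              # keep a perturbation only if it helps
            params, best = candidate, score
    return params, best
```

Because each "step" of this loop is one physical trial on the robot, sample efficiency matters far more than algorithmic sophistication, which is why simple perturb-and-keep search is a plausible fit here.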


Challenges

  • The physical robot was top-heavy and difficult to balance, preventing successful forward walking.

  • Reinforcement learning in the real world is noisy: to avoid falling, the robot consistently "learned" to shuffle sideways rather than walk forward.

  • Simulation results often failed to transfer to hardware due to small differences in build accuracy and material properties.

  • Servo drift and uneven torque across joints required manual tuning and constraint enforcement.

  • Time-intensive testing cycle due to lack of sensing or motion capture — robot behavior had to be evaluated visually and iteratively.


Outcomes

  • Built a fully functioning robotic prototype that could autonomously execute a learned lateral gait pattern.

  • Successfully implemented a modular control system on Raspberry Pi with core features for motion, monitoring, and resets.

  • Demonstrated the practical difficulty of translating ML strategies from simulation to physical hardware.

  • Gained firsthand experience in how mechanical design decisions impact software complexity, and vice versa.

  • Developed a deeper appreciation for iteration, debugging, and flexibility in hardware/software integrated systems.
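The modular control system's startup flow described above (homing, health check, then motion) can be sketched as follows. `Servo` here is a stand-in class so the sketch runs without hardware; the real system drove PWM channels on the Raspberry Pi.

```python
class Servo:
    """Stand-in for a hardware servo channel."""
    def __init__(self, name):
        self.name = name
        self.angle = None        # no command issued yet

    def move(self, deg):
        self.angle = deg

def home_all(servos, neutral_deg=90.0):
    """Drive every servo to its neutral position before walking."""
    for s in servos:
        s.move(neutral_deg)

def health_check(servos):
    """Every servo must hold a commanded angle after homing."""
    return all(s.angle is not None for s in servos)

def startup(servos):
    """Condition-aware boot: refuse to start the gait loop on failure."""
    home_all(servos)
    if not health_check(servos):
        raise RuntimeError("health check failed; aborting startup")
    return True
```

Shutdown would mirror this sequence: return to neutral, then release the PWM outputs so the joints relax rather than fight gravity when power drops.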


Potential next steps

  • Revisit mechanical layout to prioritize balance and lower center of gravity from the initial design phase.

  • Add sensors (e.g., IMU or distance sensors) to enable better feedback and penalize falling within the learning loop.

  • Refine fabrication methods to reduce inconsistency between CAD and physical build.

  • Explore central pattern generators (CPGs) or hybrid control methods for more stable forward walking.

  • Simulate gaits in a more advanced physics engine (e.g., PyBullet) with hardware constraints mirrored more accurately.
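The IMU proposal above implies reshaping the reward so that falling is penalized directly rather than only avoided implicitly. One possible shaping, with a hypothetical tilt threshold and weights chosen purely for illustration:

```python
FALL_TILT_DEG = 45.0   # assumed tilt beyond which the robot has fallen

def shaped_reward(lateral_displacement_m, max_tilt_deg):
    """Reward displacement, but let a detected fall dominate the signal."""
    if max_tilt_deg >= FALL_TILT_DEG:
        return -10.0               # flat fall penalty ends the episode's value
    stability_bonus = 1.0 - max_tilt_deg / FALL_TILT_DEG
    return lateral_displacement_m + 0.5 * stability_bonus
```

With `max_tilt_deg` read from an IMU each trial, the learning loop would no longer need a human to judge whether a run "counted," which directly addresses the visual-evaluation bottleneck noted under Challenges.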
