Spinal motor control: from physiology to modelling

(1)

Rudolf Szadkowski

Artificial Intelligence Center

Faculty of Electrical Engineering Czech Technical University in Prague

October 24, 2019

(2)

Motivation: Long-term hexapod deployment

Long-term deployment of a multi-legged walking robot in a dynamic unknown environment.

Real-time adaptation to terrain dynamics.

→ asphalt, ice, dirt, swamp, ...

Robust to body changes during deployment.

→ leg damage, faulty servo, weight increase, ...

Life-long learning of locomotion control: real-time, adaptable, and robust.

Motion-planning approach: the high number of controllable degrees of freedom makes it slow.

Control theory approach: no incremental plasticity.

The state-of-the-art can be observed in nature!

(3)

Animal locomotion

Muscles move body.

Thoracic ganglia control the muscles.

Proprioception provides feedback.

Brain controls the thoracic ganglia.

Exteroception provides long range observations.

[Figure: brain, thoracic ganglia, and muscle, with the sensors (tactile seta, stretch receptor, antenna, eye) grouped into exteroception and proprioception.]

(4)

Gait

Gait: a repetitive motion pattern.

P. Holmes et al., SIAM, 1994

(5)

Gait

Repetitive but also adaptive:

Robust to terrain irregularities.

Can adapt to body changes.

Can learn new gaits.

Two phases of a leg/muscle:

Stance: Propelling the body forward.

Swing: Propelling the leg forward.

A. Büschges et al., e-Neuroforum, 2015

(6)

Source of Gait Control

Where does the gait control come from?

Spinal cat on a treadmill.

Changes gait from walking to running as the speed increases.

Able to walk on treadmills with different speeds.

G. Barriere, JN, 2008; F. V. Severin et al., Biofizika, 1967

(7)

Neural architecture

Neural pathways between proprioception and muscles.

Afferents are excited by receptors; their signals are then relayed by interneurons to efferents controlling the muscle.

Efferent activation can depend on the activation of multiple afferents.

(8)

Neural Architecture

Neural pathways are not fully mapped, but there are behavioral observations.

Reflexes:

Stopping reflex (B)

Searching reflex (C)

Local motion control:

Task dependent: swimming/crawling, reverse walking

Phase dependent: can't lift leg during early stance

Load dependent: climbing a hill

(9)

Central Pattern Generator

Even without proprioception and descending signals, the spine generates rhythmic control signals.

T.G. Brown, Proc R Soc Lond, 1911

Centrally generated rhythmic signals:

Central Pattern Generator (CPG)

Half-center oscillator: reciprocally coupled neurons

A neuron is not oscillatory by itself.

At any time, just one neuron (or group of neurons) fires.

Active with positive tonic input.

[Figure: half-center oscillator with a flexor neuron and an extensor neuron, both driven by tonic input and coupled by mutual inhibition.]

The gait is controlled by reflexive pathways and CPGs.

(10)

Modelling The Gait Control

Maintaining the cyclic trajectory.

$x(t) \in \mathbb{R}^M$: proprioception

$y(t) \in \mathbb{R}^N$: control signal

In an unperturbed, regular environment:

$x(t+T) = x(t), \quad y(t+T) = y(t)$

Control y acts on the environment, which is observed by proprioception x.

Proprioception x is processed by the controller into control y.

[Figure: gait cycle with left stance and left swing, marked by left/right leg contact and left/right leg maximum speed. Control loop: the controller drives the effectors (control), which act on the changing, unknown environment and body; this is observed by proprioception (sensing) and fed back to the controller.]

Coupling between neural and motion dynamics.
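The periodicity condition x(t+T) = x(t) on this slide can be checked numerically on sampled signals. The sketch below is only an illustration: the function name `periodicity_error`, the two-channel test signal, and the sampling rate of 50 steps per cycle are assumptions, not part of the presented model.

```python
import numpy as np

def periodicity_error(signal, period_steps):
    """Largest deviation from signal[k + period_steps] == signal[k]."""
    signal = np.asarray(signal, dtype=float)
    return float(np.max(np.abs(signal[period_steps:] - signal[:-period_steps])))

# Hypothetical two-channel proprioception, sampled at 50 steps per gait cycle.
steps = np.arange(500)
x = np.stack([np.sin(2 * np.pi * steps / 50),
              np.cos(2 * np.pi * steps / 50)], axis=1)

err_true = periodicity_error(x, 50)    # shift by the true period T
err_wrong = periodicity_error(x, 37)   # shift by a wrong period
```

A shift by the true period leaves only rounding error, while any other shift gives a clearly non-zero deviation.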

(11)

Modelling The Gait Control

Gait control is possible with just reflexive pathways (without CPGs).

What is the advantage of using a CPG?

Reflexive pathways are dependent on proprioception.

A CPG makes control possible without feedback.

A CPG adds phase dependencies to gait control.

[Figure: same gait-cycle diagram and control loop as on the previous slide.]

(12)

Models of CPG

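Van der Pol Oscillator

$\dot{v} = u$

$\dot{u} = \beta (1 - v^2) u - v$

Non-Linear Oscillator (A.J. Ijspeert et al., Neuroinf., 2005)

$\tau \dot{v} = u$

$\tau \dot{u} = -\beta \frac{v^2 + u^2 - E}{E} u - v$

Matsuoka Neural Oscillator

$\tau \dot{v}^f = u^f - v^f$

$\tau \dot{v}^e = u^e - v^e$

$\gamma \dot{u}^f = -u^f - \beta v^f - \alpha u^e + c^f(t)$

$\gamma \dot{u}^e = -u^e - \beta v^e - \alpha u^f + c^e(t)$

$x = \max(0, x)$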
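The Matsuoka equations above can be integrated directly to see the half-center rhythm emerge from a constant tonic input. This is a minimal sketch, not the presented implementation: the parameter values, the forward-Euler scheme, and the placement of the rectification max(0, ·) on the neuron outputs (used in the inhibition and adaptation terms, as in the common form of the model) are all assumptions.

```python
import numpy as np

def simulate_matsuoka(tau=0.5, gamma=0.25, beta=2.5, alpha=2.5, c=1.0,
                      dt=0.005, steps=6000):
    """Forward-Euler simulation of one flexor/extensor half-center pair."""
    u = np.array([0.1, 0.0])   # membrane states (flexor, extensor), asymmetric start
    v = np.zeros(2)            # adaptation (fatigue) states
    trace = np.empty((steps, 2))
    for k in range(steps):
        y = np.maximum(0.0, u)             # assumed rectified neuron output
        y_other = y[::-1]                  # output of the opposing neuron
        du = (-u - beta * v - alpha * y_other + c) / gamma
        dv = (y - v) / tau
        u = u + dt * du
        v = v + dt * dv
        trace[k] = u
    return trace

trace = simulate_matsuoka()
diff = trace[:, 1] - trace[:, 0]           # u^e - u^f
late = diff[len(diff) // 2:]               # skip the initial transient
sign_changes = int(np.sum(np.diff(np.sign(late)) != 0))
```

Repeated sign changes of u^e − u^f after the transient indicate that the two neurons keep taking turns, even though the tonic input c is constant.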

(13)

Self-sustained oscillator

CPGs are modeled as self-sustained oscillators (SSOs).

Non-linear dynamic system.

Self-damping.

Excited by an external non-oscillating force.

Has a limit-cycle attractor.

The amplitude is stable, but the phase is free.

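Matsuoka Neural Oscillator

$\tau \dot{v}^f = u^f - v^f$

$\tau \dot{v}^e = u^e - v^e$

$\gamma \dot{u}^f = -u^f - \beta v^f - \alpha u^e + c^f(t)$

$\gamma \dot{u}^e = -u^e - \beta v^e - \alpha u^f + c^e(t)$

$x = \max(0, x)$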
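The limit-cycle property can be illustrated with the Van der Pol oscillator from the previous slide: trajectories started inside and outside the cycle settle to the same amplitude. The sketch below assumes β = 1, a forward-Euler step, and arbitrarily chosen initial states; it is an illustration rather than part of the presented work.

```python
def van_der_pol_amplitude(v0, u0, beta=1.0, dt=0.001, steps=60000):
    """Integrate v' = u, u' = beta*(1 - v^2)*u - v and return the late peak |v|."""
    v, u = v0, u0
    peak = 0.0
    for k in range(steps):
        dv = u
        du = beta * (1.0 - v * v) * u - v
        v, u = v + dt * dv, u + dt * du
        if k >= steps - 10000:             # last 10 time units, after the transient
            peak = max(peak, abs(v))
    return peak

amp_small = van_der_pol_amplitude(0.1, 0.0)   # starts inside the limit cycle
amp_large = van_der_pol_amplitude(4.0, 0.0)   # starts outside the limit cycle
```

Both runs end with an amplitude close to 2, while the phase at which each run arrives on the cycle depends on its initial state.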

(14)

Properties of Self-Sustained Oscillator

$\dot{x} = f(x)$: general SSO

Dynamics on the limit cycle:

$\dot{A}(x) = 0$: amplitude

$\dot{\Phi}(x) = \omega_0$: natural angular velocity

$\dot{x} = f(x) + Q(x, t)$: perturbed SSO

Let Q(x, t) be a small, periodic perturbation.

The amplitude is stable → we neglect perturbations in amplitude.

Perturbed phase:

$\dot{\Phi}(x) = \omega_0 + \varepsilon \sin(\Phi_q(t)), \quad \Phi_q = t\omega$

ε and ω are the perturbation force and angular velocity, respectively.

(15)

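Synchronization: ω = ω₀

$\dot{\Phi}(x) = \omega_0 + \varepsilon \sin(\Phi_q(t)), \quad \Phi_q = t\omega$

The phase difference between the SSO and the perturbation is stable: $\Phi(t) - \Phi_q(t) = \text{const}$.

[Figure panels: no perturbation, synchronization, noisy perturbation.]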

(16)

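Synchronization: ω ≠ ω₀

$\dot{\Phi}(x) = \omega_0 + \varepsilon \sin(\Phi_q(t)), \quad \Phi_q = t\omega$

The phase difference between the SSO and the perturbation is stable: $\Phi(t) - \Phi_q(t) = \text{const}$.

[Figure panels: multiple ω; Arnold tongue showing the synchronization region.]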
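The synchronization region can be explored with a small phase-only simulation. The sketch assumes that the perturbation acts through the phase difference, i.e. Φ' = ω₀ + ε sin(Φ_q − Φ) (an Adler-type coupling, which is more specific than the formula on the slide); the function name and all parameter values are illustrative. Under this assumption the phase locks when |ω − ω₀| ≤ ε and drifts otherwise.

```python
import numpy as np

def phase_drift(omega0, omega, eps, dt=0.01, steps=20000):
    """Change of the phase difference Phi - Phi_q over the second half of the run."""
    phi = 0.0
    psi = np.empty(steps)
    for k in range(steps):
        phi_q = omega * k * dt                               # perturbation phase
        phi += dt * (omega0 + eps * np.sin(phi_q - phi))     # assumed coupling
        psi[k] = phi - omega * (k + 1) * dt                  # phase difference
    return abs(psi[-1] - psi[steps // 2])

locked = phase_drift(omega0=1.0, omega=1.2, eps=0.5)     # |omega - omega0| < eps
drifting = phase_drift(omega0=1.0, omega=2.0, eps=0.5)   # |omega - omega0| > eps
```

Sweeping `omega` and `eps` with this function and marking where the drift vanishes traces out a tongue-shaped locking region around ω₀.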

(17)

CPG-based controller

Control decomposed into

Phase control: CPG, joint synchronization

Amplitude control: Reflexes, local adaptation

Different architectures:

Biological plausibility: Focused on robotic control or biologically plausible.

Feedback: Proprioception is fed to both phase control and amplitude control.

CPG distribution: One CPG per joint/leg, exploiting body symmetry.

Phase control post-processing: Direct mapping to control or assisting the reflexes.

S.N. Markin et al., Ann. N. Y. Acad. Sci., 2010

(18)

CPG-based controller learning

Learning the CPG

Hard: CPG is a non-linear dynamic system.

Learning the waveform, frequency, phase dependencies.

Supervised or self-supervised.

Connectionist methods of learning:

Back-propagation, Hebb-like learning

Hebb-like frequency learning rule:

$\dot{x} = f(x, y, \omega_0) + \varepsilon Q(t)$

$\dot{y} = f(x, y, \omega_0)$

$\dot{\omega}_0 = -\varepsilon Q(t) \frac{y}{\sqrt{x^2 + y^2}}$

L. Righetti et al., Physica D, 2006
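The frequency learning rule can be tried out on a concrete oscillator. The sketch below assumes a Hopf oscillator for f (the slide does not specify f), a cosine teaching signal Q(t), forward-Euler integration, and illustrative parameter values; it shows the intrinsic frequency moving from 12 rad/s toward the 10 rad/s of the input.

```python
import math
import numpy as np

def learn_frequency(omega_init=12.0, omega_input=10.0, eps=1.0, mu=1.0,
                    dt=0.001, steps=100000):
    """Hopf oscillator (assumed form of f) with the Hebb-like frequency rule."""
    x, y, omega = 1.0, 0.0, omega_init
    history = np.empty(steps)
    for k in range(steps):
        q = math.cos(omega_input * k * dt)        # periodic teaching signal Q(t)
        r2 = x * x + y * y
        dx = (mu - r2) * x - omega * y + eps * q  # perturbed state equation
        dy = (mu - r2) * y + omega * x
        domega = -eps * q * y / math.sqrt(r2)     # frequency learning rule
        x, y, omega = x + dt * dx, y + dt * dy, omega + dt * domega
        history[k] = omega
    return history

omega_history = learn_frequency()
omega_final = float(np.mean(omega_history[-10000:]))   # average over the last 10 s
```

The learned frequency keeps a small ripple caused by the teaching signal, which is why the final value is averaged over the last ten seconds.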

(19)

Learning CPG with Back-Propagation Algorithm

R. Szadkowski, P. Čížek, J. Faigl, ITAT, 2018

$T_a \dot{v}^f_i = u^f_i - v^f_i$

$T_r \dot{u}^f_i = -u^f_i - \beta v^f_i - w_{fe} u^e_i - \sum_{j=1}^{N} w_{ij} u^f_j + c^f_i(t)$

$x = \max(0, x)$

Parameters to learn: T_a, T_r, β, w_fe, w_ij.

The model is (almost) differentiable.

Optimization method: Back-propagation through time

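[Figure: one CPG unit with an extensor neuron (states $v^e_i$, $u^e_i$, input $c^e_i$) and a flexor neuron (states $v^f_i$, $u^f_i$, input $c^f_i$), each with adaptation weight β and time constants $T_a$ and $T_r$, mutual inhibition $w_{fe}$, and connections $w_{ij}$ to and from other CPGs.]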

Problem: Unbalanced inhibition leads to a stationary solution.

(20)

Constraints preventing stationary solutions

$w_{fe} < \frac{c_{\min}}{c_{\max}} (1 + \beta) - \max_{i \in N} \left( \sum_{j}^{N} w_{ij} \right)$

$w_{fe} > 1 + T_r / T_a$

Constraints integrated into CPG network equations

Below: the first two segments comply with the constraints; the last one does not.

[Figure: $u^e - u^f$ plotted over iterations.]
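The two inequalities on this slide are easy to evaluate for a given parameter set. The helper below is a hypothetical sketch: the function name, the argument order, and the example values are assumptions, and it only checks the constraints rather than integrating them into the network equations as the presented method does.

```python
import numpy as np

def satisfies_constraints(w_fe, beta, t_r, t_a, w, c_min, c_max):
    """Check both bounds on the flexor-extensor inhibition weight w_fe."""
    w = np.asarray(w, dtype=float)
    strongest_coupling = np.max(np.sum(w, axis=1))          # max_i sum_j w_ij
    upper = (c_min / c_max) * (1.0 + beta) - strongest_coupling
    lower = 1.0 + t_r / t_a
    return bool(lower < w_fe < upper)

# Hypothetical coupling weights between two CPG units.
w = np.array([[0.0, 0.2],
              [0.2, 0.0]])
ok = satisfies_constraints(2.5, 2.5, 0.25, 0.5, w, c_min=1.0, c_max=1.0)
too_weak = satisfies_constraints(1.2, 2.5, 0.25, 0.5, w, c_min=1.0, c_max=1.0)
too_strong = satisfies_constraints(3.4, 2.5, 0.25, 0.5, w, c_min=1.0, c_max=1.0)
```

With these example values the admissible interval is 1.5 < w_fe < 3.3, so only the first call reports compliance.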

(21)

Hexapod

[Figure: hexapod leg with coxa, femur, and tibia joints and joint angles θC, θF, θT.]

Learning results

Control imitation

[Figure: coxa, femur, and tibia angles [rad] over t [sec].]

Stability test

[Figure: response over iterations for perturbations +0.3, +0.5, +0.7, +0.9.]

(22)

Phase coupling learning - Experiments

Learning the tripod gait

Input

Proprioception: Ground contact, servo angle and angular velocity

Target signal: Repeated tripod gait control signal

Controller learns coupling between joints and proprioception.
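A repeated tripod-gait target signal can be sketched as two groups of three legs moving half a period apart. Everything concrete below is an assumption made for illustration: the leg names, the sinusoidal waveform, the period, and the amplitude. The experiments themselves use a tripod gait control signal whose exact shape is not given on the slide.

```python
import numpy as np

# Hypothetical leg names: L/R = left/right, 1..3 = front to rear.
TRIPOD_A = ("L1", "R2", "L3")
TRIPOD_B = ("R1", "L2", "R3")

def tripod_targets(t, period=1.0, amplitude=0.25):
    """Sinusoidal coxa targets [rad]; the two tripods are half a period apart."""
    phase = 2.0 * np.pi * np.asarray(t, dtype=float) / period
    targets = {leg: amplitude * np.sin(phase) for leg in TRIPOD_A}
    targets.update({leg: amplitude * np.sin(phase + np.pi) for leg in TRIPOD_B})
    return targets

t = np.linspace(0.0, 2.0, 201)
targets = tripod_targets(t)
```

Legs within one tripod share the same target, and the two tripods move in anti-phase, which is the phase relation the coupled CPGs have to reproduce.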

Robustness and adaptability

Coxas are controlled by the CPG controller; femurs are controlled externally.

Coxas must adapt to the phase of the femurs.

The proprioception generated by the legs on the left side is turned off.

The legs on the right side can sync to proprioception, while the legs on the left side must sync to other CPGs.