Sunday, December 27, 2020

One spider leg, dancing

I am claiming the points for having completed my experiment with machine-learning AI doing the dull parts of programming the movements of a single spider's leg.

I ran a little over 3000 leg-movement actions (only a proof of concept, not enough for any real skill to be learned), directed by a TensorFlow-based SARSA reinforcement learning agent.

Most of the code came from Gelana Tostaeva, in an article she published on Medium named "Learning with Deep SARSA & OpenAI Gym".

After those 3000 steps, I learned something important. The agent was learning from its experience, using my 4-servo spider leg and the Python code I wrote to represent the spider leg as an OpenAI Gym environment. My reward code was allowing it to learn to make big dramatic gestures without actually overheating the servos. Mwahahahaaa.

(There were a bunch of non-zero values in the Q matrix when I stopped, which I believe indicates some bits of learning.)
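To show why non-zero Q values hint at learning, here is a minimal sketch of the SARSA update rule. It is a plain tabular version, not the TensorFlow Deep SARSA code from the article, and the state numbers, learning rate, and discount below are made-up values for illustration.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, n_actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the best-known one."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = [q[(state, a)] for a in range(n_actions)]
    return values.index(max(values))


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: move Q(s, a) toward r + gamma * Q(s', a')."""
    td_target = r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (td_target - q[(s, a)])
    return q[(s, a)]


# Every entry starts at zero, so an entry only becomes non-zero
# after an update that involved some reward (directly or via Q(s', a')).
q = defaultdict(float)
sarsa_update(q, s=0, a=1, r=1.0, s_next=1, a_next=0)  # hypothetical transition
nonzero = {key: value for key, value in q.items() if value != 0.0}
```

After that single update only the entry for state 0, action 1 is non-zero, which is the same kind of evidence as the scattered non-zero values in the Q matrix.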

However, eventually the jerky, fast, full-arc drama of the gestures did overstress the servos and break the gears in one. I expect it might eventually learn that it has to ease up to full speed and ease back to a stop. However, to save on costs, I will be futzing with the reward design a bit. Thanks, Gelana!
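One direction the reward futzing could take, sketched under heavy assumptions: keep rewarding big gestures, but subtract something for abrupt speed changes and a lot for running hot. The function name, the inputs, and every number here are hypothetical, not what my reward code actually does.

```python
def leg_reward(angle_change_deg, speed_change_deg_s, servo_temp_c,
               temp_limit_c=60.0, jerk_weight=0.05, overheat_penalty=100.0):
    """Hypothetical shaped reward for one leg action.

    angle_change_deg:   how far the servo moved during this action
    speed_change_deg_s: how much its speed changed from the previous action
    servo_temp_c:       reported servo temperature
    """
    reward = abs(angle_change_deg)                     # big gestures still pay
    reward -= jerk_weight * abs(speed_change_deg_s)    # abrupt starts and stops cost
    if servo_temp_c > temp_limit_c:
        reward -= overheat_penalty                     # overheating costs a lot
    return reward


# Same 90 degree sweep, eased versus slammed to full speed.
smooth = leg_reward(90, speed_change_deg_s=20, servo_temp_c=40)
jerky = leg_reward(90, speed_change_deg_s=400, servo_temp_c=40)
```

With these made-up weights the eased sweep scores higher than the slammed one, which is the behavior I would want the agent to stumble into before it eats another gear.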

Sunday, December 13, 2020

Leg Agency

I have been rewriting some Python code to be multiprocess so that it can be writing to and reading from the serial port "at the same time".

That is in good shape now, though I worry a bit about how many processes I have broken the work up into. There is a monitor process and then two for each serial port that has servos attached (one reads and the other writes).
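The reader and writer pair for one port can be sketched roughly like this. It is not the SpiderWhisperer code: the function names are invented, the monitor process is left out, and the port is assumed to be any object with write() and readline() methods. How a real serial port gets shared between two processes is glossed over here.

```python
import multiprocessing as mp

STOP = None  # sentinel telling the writer loop to exit


def writer_loop(port, outbox):
    """Send queued command bytes to the port until the STOP sentinel arrives."""
    while True:
        command = outbox.get()
        if command is STOP:
            break
        port.write(command)


def reader_loop(port, inbox, stop_event):
    """Forward each non-empty line read from the port until asked to stop."""
    while not stop_event.is_set():
        line = port.readline()
        if line:
            inbox.put(line)


def make_port_workers(port):
    """Build (but do not start) one writer and one reader process for a port."""
    outbox, inbox, stop_event = mp.Queue(), mp.Queue(), mp.Event()
    writer = mp.Process(target=writer_loop, args=(port, outbox), daemon=True)
    reader = mp.Process(target=reader_loop,
                        args=(port, inbox, stop_event), daemon=True)
    return writer, reader, outbox, inbox, stop_event
```

The caller would start both processes, put command bytes on the outbox, and pull servo replies off the inbox, so neither direction has to wait for the other.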

I took sample code from "Create custom gym environments from scratch — A stock market example" by Adam King on Medium. That gave me the methods required for a wrapper. That code makes my multiprocess SpiderWhisperer comply with the gym environment API from OpenAI.
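For anyone who has not seen it, the classic Gym environment shape boils down to reset() and step(action). The stand-in below is a sketch of that shape only: it does not import gym so that it runs on its own, and the class name, step size, angle limits, and placeholder reward are all invented. A real environment would subclass gym.Env and also declare action_space and observation_space.

```python
class FakeLegEnv:
    """Stand-in with the classic Gym method shape: reset() and step(action)."""

    N_SERVOS = 4
    STEP_DEG = 10              # hypothetical: each action nudges one servo 10 degrees
    MIN_DEG, MAX_DEG = 0, 180  # hypothetical servo travel limits

    def __init__(self, max_steps=50):
        self.max_steps = max_steps
        self.angles = [90] * self.N_SERVOS
        self.steps = 0

    def reset(self):
        """Center every servo and return the first observation."""
        self.angles = [90] * self.N_SERVOS
        self.steps = 0
        return tuple(self.angles)

    def step(self, action):
        """Action 2*i moves servo i down, action 2*i + 1 moves it up."""
        servo, direction = divmod(action, 2)
        delta = self.STEP_DEG if direction else -self.STEP_DEG
        old = self.angles[servo]
        self.angles[servo] = max(self.MIN_DEG, min(self.MAX_DEG, old + delta))
        self.steps += 1
        reward = abs(self.angles[servo] - old)  # placeholder: degrees actually moved
        done = self.steps >= self.max_steps
        return tuple(self.angles), reward, done, {}


env = FakeLegEnv(max_steps=2)
first_obs = env.reset()
obs, reward, done, info = env.step(1)  # servo 0, up
```

In the real wrapper, step() is where the command would go out to the servos and the observation would come back from them.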

I wanted an RL agent to learn how to move my spider leg while I developed only the sort of familiarity with actual machine-learning math that one gets from sharing a friendly nod across a noisy party. So I stripped out pieces of sample code from the Gym tutorial here. That gave me the rest of the code I needed for some conceptual experimental runs.

I hesitate to mention that there was some debugging. Since I am intent on learning from the noble creatures in the phylum Arthropoda, I will say only that trying to run the RL scenario with the code I shuffled together revealed some adjustments the implementation needed.

Now I am ready to try a long run and see if an RL agent that could learn to maneuver a cart up a mountain from a valley can also learn to make interesting movements with a 4-servo spider leg without burning out a lot of servos.