Sunday, December 27, 2020

One spider leg, dancing

I am claiming the points for having completed my experiment with machine-learning AI doing the dull parts of programming the movements of a single spider's leg.

I ran something over 3000 leg movement actions (only a proof of concept, not enough to get any real skill learned in), directed by a TensorFlow-based SARSA reinforcement learning agent.

Most of the code came from Gelana Tostaeva, in an article she published through Medium named "Learning with Deep SARSA & OpenAI Gym".

After those 3000 steps, I learned something important. The agent was learning from its experience, using my 4 servo spider leg and the Python code I wrote to represent the spider leg as an OpenAI Gym environment. My reward code was allowing it to learn to make big dramatic gestures without actually overheating the servos. Mwahahahaaa.

(There were a bunch of non-zero values in the Q matrix when I stopped, indicating some bits of learning, I believe)
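For anyone wondering how those non-zero values get there, here is a minimal sketch of the SARSA update rule in its tabular form. It is an illustration only: a deep SARSA agent approximates these values with a network rather than a literal table, and the table size, learning rate (alpha), and discount (gamma) below are made-up numbers, not the ones from my run.

```python
def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """Apply one on-policy SARSA update to the table q, in place."""
    # Target: the reward just received plus the discounted value of the
    # action the agent actually chose next (that is what makes it SARSA).
    td_target = reward + gamma * q[next_state][next_action]
    # Nudge the old estimate a fraction alpha of the way toward the target.
    q[state][action] += alpha * (td_target - q[state][action])


# Hypothetical sizes: 5 discretized leg poses, 3 possible actions.
q_table = [[0.0] * 3 for _ in range(5)]

# One rewarded step: pose 0, action 1, reward 1.0, then pose 2, action 0.
sarsa_update(q_table, 0, 1, 1.0, 2, 0)

nonzero = sum(1 for row in q_table for value in row if value != 0.0)
print(nonzero, q_table[0][1])
```

After a single rewarded step exactly one cell is non-zero, so a scattering of non-zero cells is at least consistent with the agent having been rewarded in a number of different situations.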

Eventually, however, the jerky, fast, full-arc drama of the gestures did overstress the servos and break the gears in one. I expect it might eventually learn that it has to ease up to full speed and ease back to a stop. However, to save on costs, I will be futzing with the reward design a bit. Thanks Gelana!
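One direction the futzing could take, as a rough sketch and not the reward code I actually ran: keep rewarding big gestures, but subtract a penalty for abrupt changes in speed from one step to the next. The function name, the 240 degree travel figure, and the penalty weight are all assumptions for illustration.

```python
def shaped_reward(prev_angles, new_angles, prev_speeds,
                  max_step=240.0, accel_weight=2.0):
    """Reward large gestures but penalize abrupt changes in speed.

    max_step is an assumed full travel in degrees, and accel_weight is a
    made-up tuning knob; neither comes from the real leg.
    """
    # Per-servo speed this step, in degrees per step.
    speeds = [new - old for old, new in zip(prev_angles, new_angles)]
    scale = len(speeds) * max_step
    # Gesture size: average travel per servo, scaled to the range 0..1.
    travel = sum(abs(s) for s in speeds) / scale
    # Abruptness: average change in speed since the previous step.
    jolt = sum(abs(s - p) for s, p in zip(speeds, prev_speeds)) / scale
    return travel - accel_weight * jolt, speeds


still = [0.0, 0.0, 0.0, 0.0]
swung = [120.0, 120.0, 120.0, 120.0]

# Same size of gesture, once from a dead stop and once already at speed.
sudden_reward, speeds = shaped_reward(still, swung, prev_speeds=still)
steady_reward, _ = shaped_reward(still, swung, prev_speeds=speeds)
print(sudden_reward, steady_reward)
```

With these numbers the swing from a dead stop scores -0.5 and the same swing at steady speed scores 0.5, so the agent would have a reason to ramp up instead of lunging.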

Sunday, December 13, 2020

Leg Agency

I have been rewriting some Python code to be multiprocess so that it can be writing to and reading from the serial port "at the same time".

That is in good shape now, though I worry a bit about how many processes I have broken the work up into. There is a monitor process and then two for each serial port that has servos attached (one reads and the other writes).
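The shape of one reader/writer pair looks roughly like the sketch below. It is simplified so it runs anywhere: FakePort is a stand-in for a real serial port, the command text is invented, and the two loops are run one after the other using the standard library's queue.Queue and threading.Event. In a multiprocess version each loop would be the target of its own process, with multiprocessing.Queue and multiprocessing.Event in their place (they offer the same put, get, set, and is_set methods).

```python
import queue
import threading


class FakePort:
    """Stand-in for a serial port: echoes back whatever was written."""

    def __init__(self):
        self._buffer = []

    def write(self, data):
        self._buffer.append(data)

    def readline(self):
        # Return an empty bytes object when there is nothing to read,
        # the way a serial read with a timeout would.
        return self._buffer.pop(0) if self._buffer else b""


def writer_loop(port, commands):
    """Send queued commands to the port until a None sentinel arrives."""
    while True:
        packet = commands.get()
        if packet is None:
            break
        port.write(packet)


def reader_loop(port, replies, stop):
    """Forward lines from the port to the replies queue.

    Exits once the port has nothing left to read and stop has been set.
    """
    while True:
        line = port.readline()
        if line:
            replies.put(line)
        elif stop.is_set():
            break


port = FakePort()
commands, replies, stop = queue.Queue(), queue.Queue(), threading.Event()

commands.put(b"move servo 1\n")  # invented command text, not a real packet
commands.put(None)               # sentinel: tells the writer to finish
writer_loop(port, commands)

stop.set()                       # let the reader exit after draining
reader_loop(port, replies, stop)

first_reply = replies.get(timeout=1)
print(first_reply)
```

Keeping each loop ignorant of everything except its port and its queue is what lets the monitor process stay in charge of starting and stopping them.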

I took sample code from "Create custom gym environments from scratch — A stock market example" by Adam King on Medium. That gave me the methods required for a wrapper. That code makes my multiprocess SpiderWhisperer comply with the gym environment API from OpenAI.
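For a sense of what "comply with the gym environment API" means, here is a bare sketch of the methods the classic Gym interface expects: reset, step, render, and close, with step returning an observation, a reward, a done flag, and an info dict. This is not my SpiderWhisperer wrapper. The class name, the action encoding, the angle limits, and the reward are all invented, and it does not import gym so that it runs without the package; a real environment would subclass gym.Env and declare its action_space and observation_space.

```python
class SpiderLegEnv:
    """Gym-style environment for a 4-servo leg (hypothetical sketch)."""

    N_SERVOS = 4
    STEP_DEGREES = 10.0
    MIN_ANGLE, MAX_ANGLE = 0.0, 240.0  # assumed servo travel in degrees
    START_ANGLE = 120.0

    def __init__(self, max_steps=50):
        self.max_steps = max_steps
        self.angles = [self.START_ANGLE] * self.N_SERVOS
        self.steps = 0

    def reset(self):
        """Return the leg to its start pose and give the first observation."""
        self.angles = [self.START_ANGLE] * self.N_SERVOS
        self.steps = 0
        return list(self.angles)

    def step(self, action):
        """Apply one action, an integer from 0 to 7.

        action // 2 picks the servo; an even action nudges it up and an
        odd action nudges it down.
        """
        servo, direction = divmod(action, 2)
        delta = self.STEP_DEGREES if direction == 0 else -self.STEP_DEGREES
        old = self.angles[servo]
        new = min(self.MAX_ANGLE, max(self.MIN_ANGLE, old + delta))
        self.angles[servo] = new
        self.steps += 1
        # Placeholder reward: 1.0 for a full nudge, less if clamped.
        reward = abs(new - old) / self.STEP_DEGREES
        done = self.steps >= self.max_steps
        return list(self.angles), reward, done, {}

    def render(self, mode="human"):
        print(self.angles)

    def close(self):
        pass


env = SpiderLegEnv(max_steps=2)
first_obs = env.reset()
obs, reward, done, info = env.step(0)  # nudge servo 0 upward
print(obs, reward, done)
```

In the real thing, step would hand the move to the serial writer and build the observation from whatever the reader reports back, instead of doing arithmetic on a list.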

In order to have an RL agent learn how to move my spider leg while only developing the sort of familiarity with actual machine-learning math that one gets from sharing a friendly nod across a noisy party, I stripped out pieces of sample code from the Gym tutorial here. That gave me the rest of the code I needed for some conceptual experimental runs.

I hesitate to mention that there was some debugging. Since I am intent on learning from the noble creatures in the phylum Arthropoda, I will say that some adjustments to the implementation were required, which I discovered by trying to run the RL scenario using the code I shuffled together.

Now I am ready to try a long run, and see if an RL Agent that could learn to maneuver a cart up a mountain from a valley, can also learn to make interesting movements with a 4 servo spider leg, without burning out a lot of servos.

Sunday, November 1, 2020

Spider Leg exercises in the Gym

After doing some more thorough reading and experimentation with the tutorials, I have decided that the Google Deep Mind control components are more heavyweight than I need. They have a dependency on MuJoCo, a pay-to-license physics simulation engine that I don't need. So instead I am narrowing my list of components some, to deep mind acme (dm_acme), the deep mind environment (dm_env) package, and openai gym. Together they provide a framework for easier implementation and a sample case of reinforcement learning agents.

ps: If you read this post aloud to someone, be thoughtful how you place the pauses when you say "google deep mind control".

Friday, October 23, 2020

Deep Minded Acme Spiders

The Google Deep Mind projects include some demos and examples that sound like an excellent base for what I want to try for the SpiderDroid. Now let us see if I can absorb enough new comp sci concepts to produce any interesting behaviors.

Tuesday, October 13, 2020

SpiderDroid Rewards

The core mechanism of using machine learning to "solve" games is building in an incentive factor that the machine learning algorithm will use to prioritize and identify the best values for the parameters it can control. For my experiments I think I will need some external pure incentive and disincentive stimulus that can be used for any early assisted learning stages that are required. This is because I expect the learning process that will connect actuator parameters with movement results will be full of dead ends, and I want to be able to try to back the ML algorithm out of them and encourage the patterns that look more productive.

However, it seems relevant to mention the unfortunate babbling idiocy that has eventually forced me to erase and restart all of the early (local brains only) voice to text translators I tried.

Caveat creator. Do I need a kill switch?

SpiderTraining

I will know that I have reached a new plateau when my spider droids can learn movement on their own by experimentally moving whatever collection of active joints I have given them. I expect that discovering the most productive movements and combinations of movements in terms of visibly and inertially detectable changes in position and location of a single eye cluster, and detectable changes in sound composition and volume at the single ear cluster will be the first incentivized results. If I get that far, there will be others even more interesting.

Tuesday, May 12, 2020

SpiderDroid 0.02: 3D Models

I have published the 3D Models for the upper (and most interesting) pieces of the leg design of the 0.02 SpiderDroid in TinkerCad under easy free licensing.

https://www.tinkercad.com/embed/e5ogrbWnosU?editbtn=0

Saturday, May 9, 2020

NOT becoming a Supervillain by mistake

I do a lot of cryptoquote games. Way too many, probably. So I have a stream of notable quotes offering themselves for my fleeting consideration on most days. Lately I have been adding "... as a Supervillain" to the end of them, like one might add a phrase to a fortune cookie message.

Example fortune cookies:

"Today it's up to you to create the peacefulness you long for" ... as a Supervillain.

"If you refuse to accept anything but the best, you very often get it" ... as a Supervillain.

Example Cryptoquotes:

"Life isn't about finding yourself. Life is about creating yourself" ... as a Supervillain. (George Bernard Shaw)

"Love is composed of a single soul inhabiting two bodies" ... as a Supervillain. (Aristotle)

The story I am telling myself is that this activity will help me avoid becoming a Supervillain by accident, because there might be trapdoors into evil thoughts under high minded thoughts.

😁

Thursday, May 7, 2020

NOT Being A Supervillain LLC

In the second (depending on how one counts, it might be the third) trimester of the COVID-19 global pandemic, I have decided that I need to do something besides simply making pandemic masks and obeying rules. So I have established the NOT Being A Supervillain LLC as a North Carolina USA company. I plan to offer computer consulting services, including security audits and challenge tests.

Friday, May 1, 2020

SpiderDroid 0.02: Leg twitches

I am happy to announce that I am making progress on one of the wishlist projects that I have imagined for the next few years. At the moment I have two assembled legs (with servos) that I like for this second incomplete design/build and two more 3d printed but not assembled yet. Early walking experiments with only 4 legs will be creepy but useful.

I hesitated, because in the midst of this global pandemic the distributor for my preferred servos went 503.

Fortunately, HiWonder seems to be distributing those LX16A servos now, so I am rolling. :D

Allan Dickson

Saturday, April 4, 2020

You might think that not becoming a super villain is simple common sense.

You might be right, I suppose. I seem to be among the people most uncertain about what common sense is.

The examples I can first recall just now of decisions that I have heard people describe as "just common sense" are "Don't touch things that are hot enough to burn you," "Wear shoes when you go outside in the snow," "Don't break the law," and "Only eat when you are hungry".

Each of these examples has a tree of exceptions and gray areas that leap immediately to my mind.

Having accurate judgement of "...hot enough to burn you..." requires a wide range of levels and kinds of experience in different circumstances. How long does a plate have to be out of the oven before the waiter can carry it safely using the specific towels or tools they have available to them? How long does a piece of iron need to be in the quenching water or air cooled after it has been red hot to be touched safely? These questions are meant to show that using this example of Common Sense requires some uncommon knowledge in order to practice it. So is calling it "common" just casual hyperbole?

That objection, and others like it, seem like crippling weaknesses in all the tenets I have considered as candidates for Common Sense, including "Don't be a supervillain."

Monday, March 30, 2020

"Really?", you might think.

Yeah, it isn't really a moment by moment thing, for me; deciding not to be a super villain. You are right to be thinking that. It is, as you no doubt expect, more complicated than that.

The key question involved in my mind is about where the boundaries of villainy, and super-villainy are, and whether or not I am dealing with pressures and ideas that move me closer to them.

I try to hold that question as the subject of daily meditation (or at least consideration).

So maybe "day by day" is more accurate, and only on the worst days is there a series of moments of decision.

Sunday, March 22, 2020

Moment by moment, deciding not to become a super villain

This is a moment in which the decision not to become a super villain has a special weight in my mind. My partner and I are socially distancing ourselves from everyone else to do our part to protect each other and everyone else from the fast natural spread of the first global wave of SARS-CoV-2.

The chief example of community identified evil in this moment, perhaps worldwide, is the selfish refusal of individuals or groups to take action and change behavior to help the community, local to wherever you are, to protect its weakest members.

Not an inspiring example.