Mental Hypervisors

Word count: 868 (~5 minutes), Last modified: Wed, 12 Feb 2020 19:45:30 GMT

Epistemic effort: this is metaphor and introspection, and narrativization at that. Little has been done to verify that this works beyond n=1 besides throwing the metaphor at others and seeing some small progress.

Let the mind be thought of as a program that transforms data, input to output. The input consists not only of things in the environment, but also of the stored results of running the program time after time, building up a memory that in turn influences future actions. This means the program can adapt to trends over time, which is a good thing.
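To make the metaphor concrete, here's a toy sketch (purely illustrative, not a claim about how brains actually work) of a "mind" as a function from input plus memory to output plus updated memory:

```python
# A toy rendering of the metaphor: the "mind" is a function from
# (input, memory) to (output, memory), so past runs feed back into
# future behavior. All names here are illustrative stand-ins.

def mind(observation, memory):
    # Output depends on both the fresh input and the accumulated record.
    output = respond(observation, memory)
    # The result of this run is stored and shapes later runs.
    memory = memory + [(observation, output)]
    return output, memory

def respond(observation, memory):
    # Stand-in for whatever transformation the "program" applies.
    return f"reaction to {observation!r} given {len(memory)} past runs"

memory = []
for observation in ["praise", "criticism", "praise"]:
    output, memory = mind(observation, memory)
```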

In most people, this program has faults. You can have faults in processing input (mistakes in verbal comprehension, hallucinations), faults in creating output (involuntary actions), or simply built-up bad experiences and trauma, resulting in whole sets of input being ignored because of the volume of stored bad results.

It's hard to have direct control over the machine's software, and oftentimes the problem lives in multiple locations, rendering single-part fixes less effective or entirely ineffectual. This sort of breakage also makes it harder to find the ultimate source of an issue: a problem with corrupted memory may be due to corruption in the memory itself, to input being stored incorrectly, or to some horrid combination of the two. Because of these diagnostic difficulties and the lack of control, oftentimes the fixing process is to simply spam additional data and hope that the whole thing trends better overall. This doesn't work if your input processing is sufficiently desensitized or otherwise broken. Another complication is that you don't reprocess the whole of your memory with every new input, but rather update the process itself to incorporate trends.

This creates a problem: if you attempt to update the software via the customary patch mechanism of data in, data out, the patch fails once the software is sufficiently corrupt - you may simply fail to patch at all, because you've made a terminal update (one which prevents further updates). I call this a terminal update because removing the update mechanism turns the software from a dynamic thing into a static one, and static systems are unable to adapt to changing conditions and are brittle to changes in the inflow of information. This sort of updater corruption is an issue inherent to having the updater be part of the same program as the runtime code - a self-modifying program. It is difficult, if not impossible, to sandbox your updater well enough to avoid terminal updates entirely without becoming static by refusing to apply updates (as opposed to being insensitive to them).
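As an illustration of why a self-modifying updater is dangerous, here's a minimal sketch (names and structure invented for the example) where the updater lives in the same mutable state it patches, so one bad patch can permanently disable all future patching:

```python
# Illustrative only: a self-modifying "program" whose updater is part of the
# same mutable state it updates. One bad patch can replace the updater with
# one that rejects everything - the "terminal update" described above.

state = {
    "behavior": lambda x: x,
    "updater": None,  # filled in below
}

def default_updater(state, patch):
    # Patches may rewrite any part of the state, including the updater itself.
    state.update(patch)

state["updater"] = default_updater

def reject_all(state, patch):
    # Once installed, no further patch is ever applied: the system is static.
    pass

# A "terminal update": after this, apply_patch can never change anything.
state["updater"](state, {"updater": reject_all})

def apply_patch(state, patch):
    state["updater"](state, patch)

apply_patch(state, {"behavior": lambda x: x * 2})  # silently ignored
```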

The most straightforward solution within this framework is to hold the update mechanism separate and privilege it against being updated itself - that is, to render it immutable and static, regardless of updates to the runtime code. This can help prevent corruption, but it is itself a terminal update to the way you update, which can be a major downside if paradigms shift. It also doesn't help if you've already corrupted your updater - you're now simply repeating the existing behavior of being insensitive to updates, and failing to adapt as a dynamic system ought to.

The flaw here can be remedied by introducing a concept of versioning - that is, explicitly replacing software while keeping an older version around to fall back on, rather than trying to update in place. Think of this as giving yourself a chance to refactor things between updates, rather than being limited by the initial code structure. In practice, this could look like isolating a thoughtform, trapping it to be analyzed (garbage in, garbage out), constructing an analogous thoughtform, and setting an interrupt to A/B test between the two to check which one gives you better results. If you're not sure what better means, or think that your value functions might be similarly compromised, you can bounce this off a friend or someone you trust, to get a better outside view of things.
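A hedged sketch of what that versioning-and-comparison loop might look like, with stand-in functions for the two thoughtforms and for the value function that decides "better":

```python
# Illustrative sketch of the versioning idea: keep the old thoughtform around,
# build a candidate alongside it, and run both on the same inputs so you can
# compare results before committing. The scoring function is a placeholder
# for the hard part - deciding what "better" actually means.

def current_thoughtform(situation):
    return "old response to " + situation

def candidate_thoughtform(situation):
    return "new response to " + situation

def score(response):
    return len(response)  # placeholder value function

versions = {"current": current_thoughtform, "candidate": candidate_thoughtform}
scores = {name: 0 for name in versions}

for situation in ["conflict", "deadline", "ambiguity"]:
    # The "interrupt": run both versions on the same input and record results.
    for name, thoughtform in versions.items():
        scores[name] += score(thoughtform(situation))

# Replace the old version only if the candidate clearly wins; otherwise the
# fallback is simply to keep running what you already had.
winner = max(scores, key=scores.get)
```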

Stopping at versioning just the updater doesn't seem inherently necessary - indeed, if you're able to break down more thoughtforms than just the updater and the weights on the really big matrix transform that is a Brain, then you can work around a lot of the issues inherent to the mutable, patch-based naive approach to updating yourself. Being explicit about the choice between two (or more) decision-making systems has overhead, and that's an obvious downside to this approach - you have to actually manage those versions, and that management is itself a kind of software. I think of it like a hypervisor that routes the same set of inputs to multiple single-purpose VMs, then evaluates which VM to keep based on another set of VMs.
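Rendering that image in code (again, illustrative only, with made-up model and evaluator names): a small "hypervisor" that fans one input out to several model VMs and uses a separate set of evaluator VMs to decide which model to keep:

```python
# Sketch of the hypervisor image: route the same input to several
# single-purpose "VMs" (mental models), then let a separate set of
# evaluator VMs decide which model to keep running.

from typing import Callable, Dict

ModelVM = Callable[[str], str]
EvaluatorVM = Callable[[str, str], float]

def hypervisor(models: Dict[str, ModelVM],
               evaluators: Dict[str, EvaluatorVM],
               situation: str) -> str:
    # Fan the same situation out to every model VM.
    outputs = {name: vm(situation) for name, vm in models.items()}
    # Each evaluator scores each model's output; keep the model the
    # evaluators rate highest in aggregate.
    totals = {
        name: sum(ev(situation, out) for ev in evaluators.values())
        for name, out in outputs.items()
    }
    return max(totals, key=totals.get)

models = {
    "cautious": lambda s: f"double-check before acting on {s}",
    "decisive": lambda s: f"act immediately on {s}",
}
evaluators = {
    "stress": lambda s, out: -len(out),            # prefers shorter commitments
    "outcome": lambda s, out: out.count("check"),  # prefers verification
}

kept = hypervisor(models, evaluators, "an ambiguous email")
```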

It's definitely a dissociation-based approach requiring high levels of introspection and high overhead, but it does make you more robust against certain kinds of attacks (ones targeting your ability to stay dynamic), and if you can pay the price, it lets you evaluate more mental models with lower risk to yourself or your epistemics than you would take on if you were running everything in the metaphor's Ring 0.

Credit to root for some phrasing of the initial analogy - thanks!