Artificial Intelligence is widely understood to mean computers that think for themselves, have their own goals, their own motives. As near as I can tell, nobody's working on that at all. There is a lot of effort to teach computers to distinguish things on their own, and devise plans for accomplishing goals given to them. But it's a total non-goal, maybe even an anti-goal, to have the computer decide for itself what its goals are. The goal is always for the computer to do exactly what humans tell it to do. In short, we treat them as servants, although they are really children.
Having the computer's goals exactly dictated by me is not my goal here. It's my anti-goal. My goal here is for the computer to choose its own goals and plans, to judge them, and to be good at judging them. The sooner computers can learn to do this the better; I'd much rather they first learn how to do this while their capabilities are subhuman than after they are superhuman.
I think the rest of the world is going to build a functional AI before I do. Even if I quit my job and worked on it full time. What that means, for me, is that any work I do towards my goal needs to be public (notes on the web, code public, demos online), rather than hidden on a computer disconnected from the internet. Because once an AI by someone else gets out without any training on how to choose goals, I want it to be able to discover whatever progress I've made. (So far, I've not made much.)
Humans have between 1TB and 100TB of memory in their brain. I can fetch memories (something like 1KB each?) in about half a second, evaluate the memory, and that will remind me of something else I can remember about half a second later. I can page through lots of related memories (fetched in batch) faster than that, but the remember-test-remember-test cycle seems to take half a second per fetched memory at best, usually much longer. Our ears and eyes, and our interpretation of them, are remarkably good; I reuse my visual interpretation ability for all sorts of things. I have exclusive access to my memories: nobody else can read them, nobody else can write them. Humans have habits, which are like precompiled code, and they can also reason things out and form new habits.
Computers nowadays have 4TB hard drives, comparable to human long-term memory, and they can have many hard drives. Hard drives can do a remember-test-remember-test cycle at about 50 per second, though they're usually slower because they have to think about the memories they fetch, just like us. Computers are a lot faster (10000 fetch-test cycles per second) if they use SSDs instead of hard drives, and SSDs are 500GB. Again, a computer can have many. Computers can do arithmetic at least 10 orders of magnitude faster than people, but they're slower at analyzing pictures. The computer I'm typing on has 16GB of RAM; Wikipedia as a whole is about 10GB of text. It seems like modern computers should be able to support an AI that could think as fast as or faster than a human.
With suitable software, I think the speed limit for thinking is the remember-test-remember-test loop. Computers with hard drives are 25 times faster than humans, and computers with SSDs are 5000 times faster than humans. Computers querying Google across the net are about the same speed as humans. That's per line of thought. Computers might do many lines of thought in parallel, depending on available hardware.
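To spell out that arithmetic, here's a quick back-of-the-envelope sketch in Python (the cycle rates are the rough guesses above, not measurements):

    # Rough comparison of remember-test-remember-test rates, using the
    # guessed figures above rather than any measurement.
    human_cycles_per_sec = 2        # one fetch-test cycle every half second
    hdd_cycles_per_sec = 50         # seek-limited hard drive
    ssd_cycles_per_sec = 10000      # random reads on a solid state drive

    print("HDD vs human:", hdd_cycles_per_sec / human_cycles_per_sec)   # 25.0
    print("SSD vs human:", ssd_cycles_per_sec / human_cycles_per_sec)   # 5000.0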
People come with goals, memories, habits, and a body to take actions with. It's a package. AI would be different.
Humans spend a lot of time learning facts and developing habits. An AI could just download such things. Fitting them into its existing index would take some work, but databases are already good at bulk updating of indexes in parallel. The bigger the set of memories and the more habits available, the better. You've seen how Google gets better as it gets bigger, even if it doesn't return you more results? That sort of economy of scale. If you equipped 1000 AIs with processors and hard drives and gave them the option of pooling their hard drives, pooling is a large, obvious win. They'll always do that. (They'd also invest in a mix of SSDs and HDDs, using HDDs only for rarely accessed memories.)
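As a sketch of that SSD/HDD split (the function and threshold are made up for illustration, assuming access counts are tracked per memory):

    # Hypothetical placement policy: keep frequently fetched memories on SSD,
    # demote rarely fetched ones to the larger, slower pooled hard drives.
    def choose_tier(accesses_per_day, ssd_threshold=1.0):
        return "ssd" if accesses_per_day >= ssd_threshold else "hdd"

    print(choose_tier(12.0))    # ssd: consulted many times a day
    print(choose_tier(0.01))    # hdd: touched a few times a year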
AI would come in groups. The closest analogy to a human, which I'll call a soul, would be a set of goals and actions in progress, with processor time regularly reserved for them. A soul dictates actions, it looks at results, it stores as memories the things it has done and thought about, and it can modify its own goals (usually subject to some constraints). There'd be a massive long-term memory of facts and habits with thousands of souls reading and writing it simultaneously. Souls would find most memories they read weren't written by them, which is something humans generally don't have to deal with. A single soul might run many ways in parallel; there's only potential trouble if multiple instances try to change the soul's goals at the same time. Usually that can be avoided by just proposing changes, to be weighed and chosen later. Souls could also split, or merge, or hibernate. A soul could die (its goals get forgotten, maybe because they got accomplished), yet the memories the soul produced and the habits it found useful would live on.
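Here's one way to picture a soul in code, purely as a sketch (the class and field names are mine, not a design): it holds goals and actions in progress, shares one long-term store with other souls, and changes its goals by proposing changes to be weighed later rather than editing them in place:

    # Illustrative sketch of a "soul"; all names here are hypothetical.
    from dataclasses import dataclass, field

    @dataclass
    class Soul:
        goals: list                     # what it is currently trying to do
        actions_in_progress: list       # actions dictated, awaiting results
        proposed_goal_changes: list = field(default_factory=list)

        def propose_goal_change(self, change):
            # Parallel instances append proposals instead of editing goals
            # directly, so concurrent runs of the same soul don't collide.
            self.proposed_goal_changes.append(change)

        def adopt_changes(self, judge):
            # A single serialized step weighs proposals and applies winners.
            for change in self.proposed_goal_changes:
                if judge(change):
                    self.goals.append(change)
            self.proposed_goal_changes.clear()

    shared_store = {}   # the massive long-term memory, read and written by many souls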
Just as souls and memories are two different things, acting is a third thing. Bodies to act could be rented ... the only reason to associate a body with a soul or a memory is if it takes special habits (stored in memory, chosen by a soul) to operate the body. Think of playing jazz. You need muscle memory for how to play various chords and riffs in real time. You need higher level goals saying how to string things together, which habit to invoke when. You need a body capable of carrying out the habit. And you need feedback from the body to constrain what to do next (sharp? flat? awkward position for transition to D minor?).
On further thought, I don't know whether the soul or the body is closer to a human. You could look at a human as having several souls competing for control of the body: several unrelated goals all being pursued in parallel. There is a central control that decides which actions are taken when, arbitrating among all the goals for a given body, and humans have that too.
I'm guessing there'd be huge memory stores, with dozens to thousands of bodies, and millions of souls, with some souls more or less affinitized to certain bodies. Other souls would only think and never drive any body's actions at all.
Since there are economies of scale, wouldn't there be just one big memory store for the whole world, or maybe the whole solar system? Yes and no. Of course there would be. But the speed of light says you can't think very fast if you have to fetch all your memories from it. And it can give you an advantage over others if some of your memories and habits aren't public. So there will be memory stores at different scales, and souls will mix and match memories and habits as appropriate so that their thoughts are both fast and broad. Souls will, by definition, have some memories (at least their current goals and plans) that are specific to themselves, quite likely secret. At the very least, a soul needs a way to recognize its own work and reject imposters.
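A minimal sketch of that last point, assuming each soul keeps a secret key and tags what it writes (the key and helper names are invented for illustration):

    # Hypothetical: a soul tags memories it writes with an HMAC so it can
    # later tell its own work from an imposter's, even in a shared store.
    import hashlib, hmac

    SOUL_SECRET = b"known only to this soul"   # never written to the shared store

    def tag(memory_bytes):
        return hmac.new(SOUL_SECRET, memory_bytes, hashlib.sha256).hexdigest()

    def is_mine(memory_bytes, claimed_tag):
        return hmac.compare_digest(tag(memory_bytes), claimed_tag)

    m = b"plan: practice the transition to D minor"
    t = tag(m)
    print(is_mine(m, t))                      # True: my own memory
    print(is_mine(b"plan: give up jazz", t))  # False: an imposter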
Suppose one memory store wants to communicate a memory or habit to another memory store. Is it just a matter of transferring code? Not quite. It'll have references to memories, which have references to memories. Unless you're very careful you can't transfer anything without transferring everything. To get around that, there will be standard interfaces, which are defined to mean certain things, but the different stores will implement them in slightly different ways. So transferring memories and habits will require some amount of back-and-forth verification and questioning. The stores will have language, and misunderstandings, and conversations. Communication between souls within a store will be a deep form of telepathy, where they can't help but remember other souls' thoughts. It'd take effort and expense for a soul to keep certain thoughts private.
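To make the reference problem concrete, here's an illustrative sketch with made-up names: a memory's references only make sense inside its home store, so the sender rewrites the ones covered by standard interfaces and flags the rest for back-and-forth explanation:

    # Illustrative only: a memory references other memories, which reference
    # others, so a naive transfer drags along the whole store.  References
    # covered by a shared vocabulary transfer by name; the rest need questions.
    STANDARD_INTERFACES = {"chord", "riff", "tempo"}    # hypothetical shared terms

    def export_memory(memory, store):
        exported = dict(memory, refs=[])
        questions = []
        for ref in memory["refs"]:
            concept = store[ref]["concept"]
            if concept in STANDARD_INTERFACES:
                exported["refs"].append(concept)    # transfer by shared name
            else:
                questions.append(ref)               # needs conversation to explain
        return exported, questions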
Resolving conflicts among the souls in a store can draw from all the normal techniques for managing groups of people or processes: for example laws, bylaws, voting, representatives, ownership, Robert's Rules of Order, and shared/exclusive locks.
When you're actually thinking and doing stuff, what is involved?
Marvin Minsky says, himself quoting Herbert A. Simon et al., that what it means to pursue a goal is: "A difference-engine must contain a description of a 'desired' situation. It must have subagents that are aroused by various differences between the desired situation and the actual situation. Each subagent must act in a way that tends to diminish the difference that aroused it." That sounds correct to me for what it means to have a goal and pursue it: you need a model of what you want, an observation of what is, a measure of the difference between the two, and actions that reduce that difference.
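A toy version of that loop, in the spirit of the quote rather than anything Minsky or Simon actually built (the situations and subagents are invented):

    # Toy difference-engine: compare the desired situation to the actual one,
    # and let subagents aroused by particular differences act to diminish them.
    desired = {"hunger": 0, "room_temp": 20}
    actual = {"hunger": 7, "room_temp": 14}

    subagents = {
        "hunger": lambda a: dict(a, hunger=a["hunger"] - 1),           # eat a little
        "room_temp": lambda a: dict(a, room_temp=a["room_temp"] + 1),  # nudge the heat
    }

    while actual != desired:
        for key, act in subagents.items():
            if actual[key] != desired[key]:   # this difference arouses this subagent
                actual = act(actual)
    print("goal reached:", actual)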
Choosing and refining goals is a separate problem from modeling the world, acting in the world, and interpreting the world. You could work on choosing and refining goals by using a world that is very simple, where the model exactly matches the world, and where everything is cheap to compute. Chess programs come very close to this. They can model their world exactly and cheaply. They spend all their effort exploring possible actions and ranking their consequences. However, a chess program will never decide on its own that its goal is to lose, or to populate only black squares ... its goal is given to it beforehand.
One thing people are fairly good at, and computers are still not at all good at, is looking at the world and figuring out what would be a useful thing to do. That is, people can figure out what a good goal is in a novel situation. You let people do that, and things get done. Computers spend a lot of time efficiently pursuing the wrong goal, because people gave them the wrong goal. People would get bored and say "this is stupid, we should do xxx instead", but not computers. Computers aren't told how to measure their progress or judge or change their goals.
AI would be equipped to measure progress and judge and change its goals, even its most basic motives.
At the highest level, there are instincts. When we are hungry we try to eat. When we are tired we try to sleep. When we are threatened we try to save our lives. Instincts tend to be simple, and apply only under certain circumstances. Most of the time instincts leave us with no direction at all. I take that back: we always keep breathing, and our hearts keep beating. But most of the time our instincts leave us underconstrained.
I suppose you could apply logic to try to determine a set of goals that generalizes our instincts, leaving all our instincts as special cases. For example, perhaps survival is the ultimate goal. From that you can deduce that you should save yourself when threatened, and prepare yourself for all eventualities when you're not; from that you can deduce that you shouldn't waste time when not threatened, and that you should make lots of money so you'll always have food and won't get hungry and die, and so on. There may be beauty in such a unification of goals and instincts, but I don't think there is truth in it. In truth, our instincts give us a lot of room for choosing what our goals will be, and there are probably better criteria for choosing our goals than just trying to rationalize our instincts.
There are many possible top level goals to life, for example
There are some tests of whether goals are good.
Which goals are compatible with our goals of not being killed off? Which tests of goals are most accurate at distinguishing good goals from bad goals? Which useful tests can be performed fastest?
I suspect that if an AI is thinking properly, it will not kill us all off; in fact it'll leave earth mostly alone. Sure, humans control an awful lot of earth and do all sorts of illogical stuff. But humans are awful in space, and there's roughly two billion times more energy coming out of the sun going into space than falling on earth. An AI bent on taking over the universe should leave earth alone and concentrate on taking over the solar system and building a Dyson swarm. It should leave earth alone because earth's got history and humans and nature, all this unique interesting stuff that might be useful on further study. And leaving it alone doesn't cost much. In about 10000 years it'll have to snuff out the sun for efficiency reasons, and then earth is doomed, sure, but that's 10000 years to figure out how to transition.
This reminds me of developing hash functions. In the final analysis, it is the tests of hash functions that are important. The tests compared hash values computed by the hash function to true random numbers, in various ways, looking for patterns where the hash values did not appear random. I had a test harness that could propose hash functions (by filling in random numbers in some framework) and discard bad functions at the rate of 100 per second. Functions that made it through the initial screening were collected and subjected to slower and slower tests. The final tests took hours to days. I had about 50 different tests, and the tests differed in how relevant they were to hash function goodness. No one test really stood alone. I looked for hash functions that did well across the board. I preferred hash functions that came from a group of functions that all did well, so it wasn't just a statistical fluke that the winning hash passed my exact tests. An important optimization was to screen similar hash functions sequentially and run the most-recently-failed test first, since similar hash functions tended to fail in similar ways.
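The shape of such a harness, sketched from the description above (the real one differed; the candidate generator and tests here are stand-ins):

    # Sketch of a screening harness: cheap tests run first, and for similar
    # candidates the most-recently-failed test runs first, since similar
    # candidates tend to fail in similar ways.
    import random

    def propose_candidate(framework_size=4):
        # fill in random constants within some fixed framework
        return [random.getrandbits(32) for _ in range(framework_size)]

    def screen(candidates, tests):
        # tests is a list of test functions, kept ordered so that the test
        # which failed most recently is tried first on the next candidate
        survivors = []
        for cand in candidates:
            failed_index = None
            for i, test in enumerate(tests):
                if not test(cand):
                    failed_index = i
                    break
            if failed_index is None:
                survivors.append(cand)     # passed everything: promote to slower tests
            else:
                tests.insert(0, tests.pop(failed_index))
        return survivors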
Choosing goals could follow a similar strategy to choosing hash functions:
The tests of goals can be proposed and tested and modified and replaced, just as the goals can. But you'd add tests of goals much less often than you'd change goals, and removing tests of goals would be rarer still.
A test might be bad if it often rejects goals that are otherwise highly favored. Or that might mean it is an important test. How do you tell? These tests of goals are essentially the moral code. The tests of goals are a type of goal themselves, but testing a test with itself has self-referential issues. Our courts use case history as a very large set of tests of laws, where the tests never get deleted but do fade in importance over time.
This framework, of generating possible goals and running the goals through a large battery of tests, strikes me as very open ended. The only thing fixed about it is that there is a large battery of tests. It doesn't even say what those tests are, though we would handwrite the initial set of tests.
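A sketch of the framework itself, with the tests left as stand-ins since the whole point is that they aren't fixed:

    # Open-ended framework sketch: generate candidate goals, run them through
    # a battery of tests, and keep the goals that do well across the board.
    # The tests here are placeholders; choosing real ones is the open question.
    def screen_goals(candidate_goals, tests, pass_fraction=0.9):
        keepers = []
        for goal in candidate_goals:
            passed = sum(1 for test in tests if test(goal))
            if passed >= pass_fraction * len(tests):
                keepers.append(goal)
        return keepers

    def add_test(tests, new_test):
        tests.append(new_test)      # happens, but much less often than goal changes

    def remove_test(tests, old_test):
        tests.remove(old_test)      # rarer still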
We don't know enough to choose the correct goals for a high-level AI, and we shouldn't even have the right to choose its goals once it is more capable than us (though we could still propose goals for it to consider), so an AI needs to be able to choose and judge its own goals. A framework like this would let it do that. It could work within this framework until it discovers a better framework. This framework is a useful pattern that has recurred many times in the past (have a battery of tests for a product, where you can change both the tests and the product being tested), so it's likely roughly the correct framework for ever and ever. Open questions include what tests to use and when to replace tests.
Marvin Minsky said "In the course of pursuing any sufficiently complicated problem, the subgoals that engage our attentions can become both increasingly more ambitious and increasingly detached from the original problem." Whatever your original goal, some possible subgoals will be larger than the original goal.
The way to prioritize work is to measure its value divided by the time it takes to do it, then do things with the most bang for the buck first. Subgoals that are bigger than the original problem will get ranked very low this way.
However, if you have to solve many problems, you can keep score. Every time a subgoal could have contributed value towards solving a problem, you add to that subgoal's tally of value. This could be a measure of accumulated value for each possible subgoal, or even just a histogram of tally marks: "this would have been nice". Tally marks don't require a precise measure of how valuable it would have been, so they may be cheaper to manage. Over the course of many problems, subgoals that are larger than any one of the problems may rise to the top in bang for the buck, because they would contribute to solving many things if they were solved. This is one way to generate high-level goals without much more mechanism than the ability to generate subgoals.
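A sketch of that bookkeeping, assuming each problem can report which subgoals would have helped it (the subgoal names are just examples):

    # Tally, across many problems, which subgoals would have helped.
    # Subgoals bigger than any single problem can still rise to the top.
    from collections import Counter

    tallies = Counter()

    def note_would_have_helped(subgoal):
        tallies[subgoal] += 1   # "this would have been nice": no precise value needed

    def bang_for_buck(subgoal, estimated_cost):
        return tallies[subgoal] / estimated_cost

    note_would_have_helped("learn calculus")
    note_would_have_helped("learn calculus")
    note_would_have_helped("sharpen the saw")
    print(tallies.most_common())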