Summary of My Current Theories for an AGI Program.
April 2013
There have been a great many achievements in Artificial Intelligence
(AI) during the past decades. However, there is a question whether these
advancements are really forms of Artificial General Intelligence (AGI) or
whether they are just specialized forms of Narrow AI (programs which are not
capable of exhibiting the human skill of genuine learning and the subsequent
use of knowledge to solve a variety of problems; Narrow AI only solves a
special subclass of problems). An AGI program would be able to learn about a
wide variety of things, requiring, at most, only a few months of modifications
when a new kind of sensory or robotic device is first used with it.
I feel that
complexity is a major problem facing contemporary AGI. It is true that for
most human reasoning we do not need to figure out complicated problems precisely
in order to take the first steps toward competency, but so far AGI has not been
able to get very near the kind of intelligence that we see in human beings.
I am going to start with a text-based AGI program. I agree that more
kinds of Input-Output (IO) modalities would make an effective AGI program
better. However, I am not aware of any evidence that sensory-based AGI,
multi-modal sensory-based AGI, or robot-based AGI has achieved anything
greater than what has been achieved by other means. The core of AGI is
not going to be found by adding more peripherals. And it is clear that starting
with complicated IO accessories will make AGI programming more difficult. It
seems obvious that some form of IO is necessary for AI/AGI, and the recognition
of that simple abstraction is probably a more appropriate prerequisite for AGI
than any particular device. If I were able to create an effective AGI program
then it could be adapted for other IO modalities as needed.
My AGI program is going to be based
on discrete references. I feel that the argument that only neural networks are
able to learn or are able to incorporate different kinds of data objects into an
associative field is not accurate. I do, however, feel that more attention needs
to be paid to concept integration. And I think that many of us recognize that a
good AGI model is going to create an internal reference model that is a kind of
network. The discrete reference model more easily allows the program to retain
the components of an agglomeration in a way in which the traditional neural
network does not. This means that it is more likely that the parts of an
associative agglomeration can be detected when necessary. On the other hand,
since the program will develop its own internal data objects, these might be
formed in such a way so that the original parts might be difficult to detect.
But with a more conscious effort to better emulate how the human mind is able to
work with ideas and concepts, I think that the discrete conceptual network model
will prove itself fairly easily.
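To make the discrete reference model a little more concrete, here is a minimal
sketch in Python (the class and function names are my own, purely for
illustration) of a conceptual network in which an agglomerated concept keeps
explicit references to its parts so that those parts can still be detected
later:

    # A concept is a discrete node with labeled links to other concepts.
    class Concept:
        def __init__(self, name):
            self.name = name
            self.links = {}                       # relation label -> set of Concepts

        def link(self, relation, other):
            self.links.setdefault(relation, set()).add(other)

    def agglomerate(name, parts):
        """Build a compound concept that retains references to its components."""
        whole = Concept(name)
        for part in parts:
            whole.link("has-part", part)
            part.link("part-of", whole)
        return whole

    def contains_part(whole, target, seen=None):
        """Detect whether target is still reachable as a component of whole."""
        if seen is None:
            seen = set()
        if whole in seen:
            return False
        seen.add(whole)
        parts = whole.links.get("has-part", set())
        return target in parts or any(contains_part(p, target, seen) for p in parts)

    # The parts of an agglomeration remain individually addressable.
    car, road, congestion = Concept("car"), Concept("road"), Concept("congestion")
    jam = agglomerate("traffic-jam", [car, road, congestion])
    print(contains_part(jam, car))                # True

The point of the sketch is only that, unlike a distributed weight matrix, the
discrete network never loses the identity of the pieces that went into a
compound.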
I am going to use weighted reasoning
and probability but only to a limited extent.
I believe that it
takes a great deal of knowledge to 'understand' one thing. A statement has to
be integrated into a greater collection of knowledge in order for the relations
of understanding to be formed. And the knowledge of a single statement has to
be integrated into a greater field of knowledge concerning the central features
of the subject for the intelligent entity to begin to understand the statement.
In order to integrate new knowledge, a new idea that is being introduced
usually has to be verified using many steps to show that it holds. Since there
is no absolute insight into truth for this kind of thing, knowledge has to be
integrated in a more thorough trial and error manner. The program has to create
new theories about statements or reactions it is considering. This would extend
to interpretations of observations where other kinds of sensory systems were
used.
A single experiment does not 'prove' a new theory in science. A
large number of experiments are required and most of those experiments have to
demonstrate that the application of the theory can lead to better understanding
of other related effects. It takes a knowledge of a great many things to verify
a statement about one thing. In order for the knowledge represented by a
statement to be verified and comprehended it has to be related to and integrated
with a great many other statements concerning the primary subject matter. It is
necessary to see how the primary subject matter may be used in many different
kinds of thoughts to be able to understand it. So I believe that most insights
will occur when conceptual integration is able to explain more than one thing or
one very narrow type-of-thing about a subject.
While an analysis of
conceptual integration and conceptual relations, by some name, has always been
a primary subject in AI/AGI, I think that concepts and ideas were relegated to a
subservient position by those who originally stressed the formal methods of
logic and science, linguistics, psychology, numerical methods, probability, and
neural networks. The details of how ideas work in actual thinking were seen as
either part of some dawn-of-science-philosophy or the turn-the-crank product of
successful formal methods. A focus on the details of how ideas work in actual
problems was seen as naive.
If a problem is complicated then you need to
study it carefully before you get enmeshed in it. Complicated problems which do
not lend themselves to the discovery of solutions through simple trial and error
have to be carefully studied. There is no question in my mind that a prolonged
period of studying the problems of AGI has been useful to me. We need to bring
rational creativity to those kinds of problems. Rational creativity, where
possible solutions are designed according to a better knowledge of the
characteristics of the problems, can enhance the likelihood that incremental
trial and error methods will work. Yet I am also critical of becoming overly
preoccupied with purely abstract generalizations. We should not expect too much
from an elaboration of pure conjecture. The details of the conjectures have to
be developed from implementation plans drawn from the study of an extensive
number of individual cases. Many of us have spent a great deal of time thinking
about the application of our theories to real world problems, but we also need to
more carefully study how human beings, with their higher intelligence, are able
to shape and synthesize conceptual relations using creativity and insight as
they do. I feel that it is obvious that human beings and other animals have
methods to deal with ‘ideas’ and ‘concepts’ of the mind and these need to be
simulated in our AGI programs. So while major AI paradigms have been applied to
real world problems they have not been insightfully applied to these hidden
systems of how we work with ideas. I believe that this issue may be a part of
the best way to differentiate between what has been called “narrow AI” and AGI.
Narrow AI programs are unable to deal with ‘ideas’ and ‘concepts’ in
sophisticated ways that emulate or approximate how human beings do.
During the past hundred years there has been a great deal of bias against
the idea of ‘ideas’ as a subject for a valid psychological theory. As I
am writing this, the problem of creating simulations of how the human mind
works with ideas is still not seen as a completely suitable subject matter for the
science of computer programming. And it is not considered to be a fit subject
matter for neuroscience either because such things cannot currently be explained
from the observation of events at the level of individual neurons. In the
twentieth century, the field of psychology was differentiated from the
philosophical speculations on the operation of the mind through experiment and
observation. And as a result, thinking about things like ‘ideas’ was seen as
insipid or fruitlessly fanciful unless it could be subjected to some kind of
rigorous experiment. In the early days of computer programming it was thought
that logic was a method of scientific thought and therefore it would become a
superior method for artificial thought. Again, ‘ideas’ and ‘concepts’ were
dismissed because they weren’t easily converted into the logical terms of a
computer program, and they were not seen as proper scientific objects. And the
experts simply did not think they were necessary. However, their efforts did
not produce AGI. When new computational methods were developed for AI it was
immediately thought that these would finally explain human-like intelligence.
So again, there was no recognition of the need to try to imbue an AI program
with something as vague as an ‘idea’.
The problem where the smartest
thinkers spend their lives pursuing abstract problems without carefully
examining many real world cases occurs often in science. It is amplified by
ignorance. If no one knows how to create a practical application, then the
experts in the field may become overly preoccupied with the formal methods that
have been presented to them. Formal methods are important - but they are each only
one kind of thing. It takes a great deal of knowledge about many different
things to 'understand' one kind of thing. A reasonable rule of thumb is that
formal methods have to be tried and shaped based on extensive studies of real
world problems. But my thesis is that for AGI, the way the mind works with ideas
and concepts is part of the real world problem. Although the details of how the
mind works with concepts may be elusive, I am confident that good simulations
will be found once a more careful search for them is made.
The
program will make extensive use of generalizations and cross-generalizations.
The program will need to be able to discover abstractions. These abstractions
typically may be used to develop generalizations. A generalization may be formed
from a group in which the members share some characteristics. However,
generalizations may also be formed by various arbitrary processes. And, if the
program works, generalizations may be formed in response to some educational
instruction. The most typical example of cross-generalization may be the
consideration of similarities across individual systems of taxonomies or classes
or subclasses. However, in the broader definition of generalization that I
intend for the AGI program to develop, the collections will not have to be
grouped by any common characteristic. Although this might be a misuse of the
term generalization, the generalizations that my program will create may not
form strict trees because they can branch off in different directions. Indexes
into data for internal searches may be formed in a similar way but I will have
to think about whether the variety of branching makes sense for the indexes as I
am developing the program.
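As a rough sketch of what I mean, and using Python data structures of my own
invention, a generalization store can simply be a graph in which a member may
be attached to any number of groupings from any number of taxonomies, which is
what allows branching in different directions and cross-generalization:

    from collections import defaultdict

    # Generalizations stored as a graph rather than a strict tree: any member
    # may be attached to any number of groupings, in any number of taxonomies.
    class GeneralizationNet:
        def __init__(self):
            self.members = defaultdict(set)       # generalization -> members
            self.groups = defaultdict(set)        # member -> generalizations

        def add(self, generalization, member):
            self.members[generalization].add(member)
            self.groups[member].add(generalization)

        def cross_generalize(self, a, b):
            """Generalizations shared by two members, even when the members
            come from otherwise unrelated taxonomies."""
            return self.groups[a] & self.groups[b]

    net = GeneralizationNet()
    net.add("vehicle", "bicycle")                 # a functional taxonomy
    net.add("vehicle", "car")
    net.add("metal-object", "car")                # a material taxonomy
    net.add("metal-object", "canned-soup")
    print(net.cross_generalize("car", "canned-soup"))   # {'metal-object'}

An index for internal searches might be built over the same two dictionaries,
although, as I said, whether that variety of branching makes sense for the
indexes is something I still have to work out.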
I believe that, because of the variety of
forms of generalization or categorization that the program will need to use,
it will be necessary for the program to keep track of the different kinds of
categorizations and generalizations that it develops. And it will put
transcendent boundaries around portions of the categorizations-generalizations
that it develops as it uses them. These boundaries are transcendent in that
overlapping relations may be developed across them (as in cross-generalization
or cross-categorization).
Perhaps the terms relation and categorization
are more abstract than the term generalization. So the program will be able
to develop abstractions of relations and then build categorizations from these
relations. The categories that I have in mind may be somewhat free-wheeling. A
categorical relation is almost always based on, or is effectively based on, a
concept of categorization. This means that the categorization and
generalization of concepts is usually based either on a conceptual structure of
definition itself or on some principles of categorization which serve as
effective definitions of the categorization. Cross-categorizations will be
important because they will help the program find and consider relations across
the categorical structures. These categorical structures may need to be bounded,
but since bounded categories may still be related across a relatively dominant
categorical relation, the boundaries can be transcended by other associative
relations.
Logic is a kind of bounded system. If you chose
to, you could create multiple logical systems which refer to the same objects or
to facets of the same objects. The value of this is that the logical relations
do not need to be totally integrated into a single logical system. However, if
you choose to, you can begin looking at other logical relations of the
propositional objects to see how they might be better integrated. This is an
example of what I mean by a transcendent bounded system. Other kinds of
relational systems, where the relations have some kind of meaning or valuation
can also be bounded and transcended in this way. One benefit of this system is
that it allows you to build the system gradually by allowing your intuitive
sense of how the system would work to be part of the process without invoking a
premature critical destruction of the concepts being considered. And it makes
it possible to keep using concepts that are relied on even though they do not
fit together all that well. If it becomes necessary, or if your curiosity is
provoked, you can examine the different aspects of the concepts more closely as
you are able.
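Here is a rough sketch, again with invented names, of what a transcendent
bounded system might look like in code: several small relational systems refer
to the same objects, each stays self-contained, and a cross-boundary lookup
relates them only when you choose to look:

    class BoundedSystem:
        """A small, self-contained system of relations over shared objects."""
        def __init__(self, name):
            self.name = name
            self.relations = []                   # (subject, relation, value)

        def assert_relation(self, subject, relation, value):
            self.relations.append((subject, relation, value))

        def about(self, subject):
            return [r for r in self.relations if r[0] == subject]

    def transcend(systems, subject):
        """Cross the boundaries: collect what every bounded system says about
        one object without integrating the systems into a single system."""
        return {s.name: s.about(subject) for s in systems}

    visual = BoundedSystem("visual")
    visual.assert_relation("city-block", "color", "gray")
    economic = BoundedSystem("economic")
    economic.assert_relation("city-block", "assessed-value", "high")

    # Two bounded views of the same object, related only on demand.
    print(transcend([visual, economic], "city-block"))

Nothing forces the two systems to agree, which is exactly the tolerance for
imperfect fit that I described above.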
Artificial imagination is also necessary for AGI.
Imagination can take place simply by creating associations between concepts but
obviously the best forms of imagination are going to be based on rational
meaningfulness. An association between concepts (or concept objects) which
cannot be interpreted as meaningful is not usually very useful. So it seems that
if the relationship is both imaginative and potentially meaningful it would be
advantageous. An association formed by a categorical substitution is more
likely to be meaningful so I consider this a rational form of imagination.
However, you can find many examples where a categorical substitution does not
produce a meaningful association, so perhaps my claim that it is a rational
process is dependent on the likelihood that the process will turn up a greater
proportion of meaningful relations than purely random associations.
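A minimal sketch of what I mean by categorical substitution as a form of
imagination follows; the category data and the relation are toy examples of my
own:

    # 'Imaginative' candidate associations formed by categorical substitution:
    # take a known relation and swap one term for another member of the same
    # category.  The results are only candidates and still have to be tested.
    categories = {
        "stringed-instrument": {"guitar", "violin", "harp"},
    }

    def substitute(relation, categories):
        subject, verb, obj = relation
        candidates = []
        for members in categories.values():
            if subject in members:
                for alternative in members - {subject}:
                    candidates.append((alternative, verb, obj))
        return candidates

    known = ("guitar", "is-played-with", "a pick")
    for candidate in substitute(known, categories):
        print(candidate)          # e.g. ('violin', 'is-played-with', 'a pick')

The violin candidate is exactly the kind of substitution that turns out not to
be meaningful, which is why I only claim the process is rational in the sense
that it should produce a greater proportion of meaningful relations than purely
random association would.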
Some
imaginative relations may exist just as entertainment, but I believe that the
application of the imagination is one of the more important steps toward
understanding. In fact, I believe that all understanding is essentially a form
of imaginative projection, where you project previously formed ideas onto an
ongoing situation which is recognized or thought to share some characteristics
with the projected ideas. So from this point of view, the reliance on
previously learned knowledge is really an application of the imagination.
Perhaps it is a special form of imagination, but it is a form of imagination
nonetheless.
Anyway, once an imaginative association or relation is created
it has to be tested. I feel that relations of understanding cannot be
appreciated out of context. The basic rule of thumb is that it takes knowledge
of many things to understand one thing. This creates a problem when trying to
test or validate an insight which was partially produced by the imagination or
which had to be fitted using imaginative projection. The only way an AGI
program is going to be able to validate a new idea is by seeing how well it fits
and how well it works in a variety of related contexts. This is what I call a
structural integration. It not only represents a single concept but it also
carries a lot of other information with it that can seemingly explain a lot of
other small facts as well. A new idea seems to make sense if it fits in with a
number of insights that were previously acquired.
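A crude sketch of structural integration as a scoring procedure, where the
selection and testing functions are placeholders for mechanisms the real
program would have to supply:

    def integration_score(knowledge_base, related, explains):
        """Score a new idea by how well it fits a variety of related contexts.
        related(insight) selects previously acquired insights that share some
        context with the new idea; explains(insight) reports whether the new
        idea helps account for that insight."""
        contexts = [k for k in knowledge_base if related(k)]
        if not contexts:
            return 0.0                            # nothing to integrate with yet
        fits = sum(1 for k in contexts if explains(k))
        return fits / len(contexts)               # fraction of related contexts it fits

    # Toy usage for the 'new idea' that metal expands when heated.
    kb = ["rails buckle in summer", "jar lids loosen under hot water", "ice floats"]
    score = integration_score(
        kb,
        related=lambda k: "ice" not in k,         # crude stand-in for a context filter
        explains=lambda k: True,                  # assume the idea accounts for both
    )
    print(score)                                  # 1.0 across the two related contexts

A single high-scoring test proves nothing by itself; the score only becomes
interesting when the set of related contexts is large and varied.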
Gradual methods
seem to be called for. However, by utilizing structural verification and
integration, the gradual method can be augmented by structural advancements
where key pieces of knowledge seem to be able to better explain a variety of
related fragments of knowledge. Of course even these methods are not absolute
so there will always be the problem of inaccurate knowledge being mixed in with
the good. One of the key problems with contemporary AGI is that ineffective
knowledge (in some form) will interfere with the effort to build even the
foundations for an AGI program. Since I do not believe that there is any method
that will work often enough to allow for a solid foundation to be easily formed,
a way to work with and around inaccurate and inadequate knowledge has to be
found. Even structural integration can sometimes enhance a cohesive bunch of
inaccurate fragments of knowledge. But I believe that there are a few things
that can be done to deal with this problem. First of all, the method of
(partial) verification through structural knowledge should usually work better
with effective fragments of knowledge than it would with inaccurate fragments.
Secondly, a few kinds of flaws can often be found in inaccurate theories. One
is that they are often 'circular' or what I call 'loopy'. Although good
paradigms (mini-paradigms) are often strongly interdependent, nonsensical
paradigms do not fit well into systems external to the central features of the
paradigm. This fitting can be explored by using the cross-categorization
networks and it is an important part of the process of understanding how good
theories work. The idea of the transcendent boundary is a way of dealing with the fact
that we don't really form our understanding of the world based on perfect logic.
So by being able to examine cross-categorical relations we should be able to
deal with small logical or other relational systems that can be related to other
small systems even though they may not be perfectly integrated.
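One of those flaws can be checked fairly mechanically. Assuming the program
keeps some kind of support graph over its claims, a 'loopy' paradigm shows up
as a cluster whose support comes almost entirely from inside itself; the sketch
below is only an illustration of that idea:

    def external_support_ratio(support, cluster):
        """support: dict mapping a claim to the set of claims that support it.
        cluster: the set of claims making up the candidate paradigm.
        Returns the fraction of supporting links that come from outside the
        cluster; a value near zero suggests a circular ('loopy') paradigm."""
        total, external = 0, 0
        for claim in cluster:
            for supporter in support.get(claim, set()):
                total += 1
                if supporter not in cluster:
                    external += 1
        return external / total if total else 0.0

    support = {
        "A": {"B"}, "B": {"C"}, "C": {"A"},       # a closed loop of mutual support
        "D": {"observation-1", "E"},
        "E": {"observation-2"},
    }
    print(external_support_ratio(support, {"A", "B", "C"}))   # 0.0  -> loopy
    print(external_support_ratio(support, {"D", "E"}))        # ~0.67 -> externally grounded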
But
there is another problem that my theory of the transcendent boundary system
would tend to create. It would be pretty easy to build small systems that
overlay an 'insightful' bounded system and these could even be integrated with
other transcendent systems that were built to overlay other insightful bounded
systems. So a well developed fantasy system could be created on top of the
kinds of insightful systems that I have in mind. This problem does have a
solution. These systems which overlay the insightful systems can be carefully
examined to see whether they can be tied into some IO observations that
are directly related to the insightful systems. If a
transcendent system is truly insightful, it should typically be useful in
explaining and predicting some basic observations. Of course systems like this
are not perfect and during the initial stages of learning the program might
create some elaborate systems of nonsense. And an exhaustive search for
inaccurate theories can interfere with learning since inaccuracies that do not
play key roles in paradigms can act to support the weight of the paradigm while
the 'student' is first learning. For instance, the good student will be aware
that the fact that even though he does not fully understand the supporting
structures (and transcendent relations) of a paradigm that does not mean that he
can use his ignorance to knock the theory down. Similarly, the fantasy that a
system (like an axiomatic system) is sufficient to support an application of the
system would not ruin that student's work with the system unless he tried to
apply it to a field where the naive application was not effective (like trying
to use traditional logic to produce AGI).
An AGI program has to be
relativistic. To give you one obvious example: You need to use concepts in
order to analyze a concept. Since concepts can affect other concepts this means
that the kind of concepts that you use for the analysis will affect the result
of the analysis. For instance if you analyze a city scene thinking of the
colors and shapes of the view you will get a completely different kind of result
than if you were analyzing the scene using economics and real estate values.
This means that a definition of a concept, or a definition of its usage, will
depend on the concepts that you use for the definition. A concept may
take on a stable meaning in the program (at least I hope it will), but the
basis for the meaning and the application of a concept will be dependent on how
it relates to other concepts. While concepts will be defined in the terms of
other concepts there does not have to be some set of concepts which serve as the
fundamentals from which all the other concepts are formed. There are dependent
concepts, but there is no fixed set of independent concepts, so to speak. (Some
might exist for a while but they could subsequently be defined relative to some
other concept.) It has to be possible for an AGI program to learn more about a
subject matter so any concept might be further defined or redefined at some
later time. Now, using this as a simple example, if concept A is defined
relative to some other concept B, and concept B is later redefined so that it
becomes dependent on some new insights (call these Concept C), does this mean
that all of the new insights should be passed on to the definition of Concept
A? No, not necessarily. In some cases the new insights about Concept B would
be relevant to the definition of Concept A, but in other cases they would not
be. So then how would the AGI program be able to decide which aspects of
Concept C are relevant to the definition of Concept A? There would be a number
of ways. As I said, to
understand one thing (one small idea) the program needs to have knowledge about
how it relates to many things. So to know whether some fact in Concept C
concerning Concept B is relevant to Concept A the program has to have some other
kind of knowledge about the relation. This might be found through a specific
idea that directly relates Concept B and Concept A and Concept C. Or it could
be inferred from generalizations that the Concepts belong to. Or it might be
inferred through other kinds of relations concerning the concepts. Since the
program will have an artificial imagination the inferences could get quite
imaginative.
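A sketch of the Concept A / Concept B / Concept C case, with the relevance test
left as a deliberate placeholder since, as I said, there would be a number of
ways to supply it:

    class Concept:
        def __init__(self, name, defined_by=()):
            self.name = name
            self.defined_by = set(defined_by)     # names of concepts this one depends on

    def relevant_facets(a, b, new_facets, relates):
        """When Concept B is redefined in terms of new facets (Concept C),
        pass a facet on to Concept A only if some other knowledge relates it
        to A.  relates(facet, concept) stands in for a direct idea, a shared
        generalization, or an imaginative inference."""
        if b.name not in a.defined_by:
            return set()                          # A never depended on B anyway
        return {facet for facet in new_facets if relates(facet, a)}

    a = Concept("harbor", defined_by={"ship"})
    b = Concept("ship")
    new_facets = {"sail", "cargo-manifest"}       # Concept C: B's new insights
    passed_on = relevant_facets(a, b, new_facets,
                                relates=lambda facet, concept: facet == "cargo-manifest")
    print(passed_on)                              # {'cargo-manifest'}

Only the facet that something else ties to Concept A is passed on; the rest of
the redefinition stays local to Concept B.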
So this case, where a concept that another
definition depends on is redefined, is similar to any other case where the program tries to
fit a new idea into the background of previously acquired knowledge. The new
idea has to be examined through a trial and error process to see if the
inference can explain something in a more powerful way, or improve on some
behavior. And if that explanation or behavior can be tied effectively into
something that has been or will be observed in the Input then that confirming
observation may act as a positive reinforcement for the integration of the new
insight.
I believe that key structural insights are the secret to
learning. They are acquired incrementally but because they are key to multiple
insights their power is multiplied. I believe that even animals learn when
some key insight falls into place and can be used to ‘explain’ a number of
related variants of a situation or to develop an active adaptation to a number
of immediate variations that may arise. To make myself clear, my point of view
is in partial agreement and partial disagreement with Wolfgang Kohler’s
conclusions that chimpanzees use insight to learn. I believe that animals do
use insight in solving problems but that this insight comes about through an
incremental trial and error method based on the discovery of key structural
insights which seem to offer an explanation for a number of different issues at
one moment. But I go further because my postulate is that all learning is based
on a trial and error procedure where key structural insights occasionally lock
into a conceptual network and can then be used by higher intellectual functions
or to direct reactive adaptation in a kind of activity. A simple concept is
understood by fitting and integrating it with many other concepts. That then is
an action of keying the concept into a structure of previously acquired
knowledge.
But how could a computer program achieve this kind of
learning using a trial and error method to find simple key structural insights?
It could take thousands of years (or more) for a computer to hit on all the key
structural insights that it would need to achieve even the simplest level of
competency using only random processes. My theory is that animals do learn
through incremental trial and error, but because they are able to hit on
key structural insights the process is amplified when the new information is
well aligned with the natural instincts the animal has for the knowledge. But
how to get a computer program to do this?
If the program hit on a key
structural insight concerning a system of generalizations that were well
established along with a number of specializations it would be more likely to
recognize the vitality of the insight because it could immediately examine the
implementation of the insight. The value of a simple effective insight would be
multiplied because it would have so many uses. If my type of AGI program is
able to find a key structural insight at a point where a number of
cross-relational generalizations intersect, and if enough specializations for
the generalizations are known, it would be more likely to recognize the
potential value of the key. This theory can be extended to other cross-categorization
systems since a categorization is similar to a generalization in many ways.
Although I do not have all the details worked out, I realized that this simple
relation between specializations and generalizations would make it more likely
that important key structural insights could be found because the program could
be designed that way.
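Although the details are not worked out, the basic accounting can be sketched:
a candidate key is easier to notice when the generalizations it sits across
already carry many known specializations, because each specialization is an
immediate test of the key. The names below are illustrative only:

    def key_insight_score(touched_generalizations, specializations):
        """touched_generalizations: the generalizations a candidate insight
        intersects; specializations: dict mapping a generalization to its known
        special cases.  The score is simply the number of immediate test cases
        available for trying the candidate out."""
        return sum(len(specializations.get(g, ())) for g in touched_generalizations)

    specializations = {
        "container": ["cup", "box", "tank"],
        "thing-that-leaks": ["cracked cup", "rusty tank"],
    }
    # A candidate insight relating containment and leakage touches both groups.
    print(key_insight_score({"container", "thing-that-leaks"}, specializations))   # 5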
When I had the first sense of the potential of the
relation between the cross-generalization networks and the search for key
structural insights, I realized that I could use special input in order to direct
the program here and there. Because of this, and because I have a better idea
of how the key structural insight might work as part of the dynamics of the
program, I now feel that I will have much greater control during the initial development
and testing of the program. This means I can test some ideas even before I
figure out how to get the program to produce them. The ability to conduct this
kind of controlled testing program is a major step forward. And because I have
a better sense of the relation between specializations, generalizations and the
development of key structural insights I should be able to develop some novel
automation as well.
Although a more varied conceptual network is a
little more complicated than a simple generalization network, I am sure that a
similar method can be used to look for structural concept keys.
Jim Bromer