With Apache Jackrabbit, the content of the documents can be retrieved via Java API calls in many different ways from a variety of content repositories. Multiple data sources may be integrated into one presentation.

Add in Apache Tomcat, and you have JSP presentation capability. This adds another possible dynamic layer. A static version may be presented in the first phase.

The API for Jackrabbit is straightforward: create a repository, connect, get the repository, and then get the data. The data may be a science fiction story in PDF format, a piece of sheet music, an mp3 file, or even information about all documents containing the word (string) “qbit” or “qubit”.

Further, I noticed the potential for the implementation of certain Enterprise Integration Patterns that would be cheap to implement and maintain.

One could read from one Content Repository (CR) and write to another using nothing but Jackrabbit, provided the target CR supports writes.

In a nutshell, with Jackrabbit, there are two levels of compliance: the first level allows reads from the CR, and the second allows inserts and updates of documents in the CR. Add in JSPs and you have remote read-write access utilizing a simple standard, and running in a relatively safe environment.

Add in an open source, fully compliant Jackrabbit CR at each hub (here a “hub” is any network-capable workstation or server that all other local workstations and PCs can access), and the framework for a primitive, yet potentially very useful, EAI messaging system is in place.

Next, one may swap in and out various open source components to meet the needs of the enterprise.  This ability further expands the capabilities and usefulness.  For example:

  • Replace JSPs with internal standards for remote SOA.
  • Replace Tomcat with the enterprise’s internal standard J2EE Application Server.
  • Replace JSPs altogether with an API interface to the enterprise’s existing, already-paid-for third-party legacy EAI message bus or EAI tool (e.g. Tibco, iWay, .NET, …).

A CR is an excellent place to store useful data, usually in the form of very important, and often proprietary, documentation. It is nice to have standard methods of documentation synchronization and exchange in place within an enterprise. A small business may take this concept a long way, and a large enterprise may integrate its CR systems so that all employees may have access to all documents, based on need rather than availability.


Building a Network of Intelligent Agents

A Top Down And Bottom Up Approach


1. This posting is subject to multiple revisions, as are all posts, stories and music on this blog site. A stream-of-consciousness writing technique is incorporated, so some paragraphs may tend to veer off onto tangents. It is the nature of this blog site. Eventually this posting will be linked into the science fiction story: Upgrade 01A, but the content itself is not necessarily fiction: many of the concepts are very possible within the next decade, starting right now before your eyes.

2. There is already a great deal of work being done on creating a framework for what is being discussed here. The W3C Organization, for example, has done extensive work in such areas as:

and various other Semantic Web Activities…

Agents will take advantage of progress made in these areas as the technologies mature. Some progress is already being made. See links throughout the text and at the bottom for a few more ideas to consider. A later posting (Part II) will likely dive into some of this as details are further revealed.

Comments are welcome!

Author: David Saxton Ullery

Step One: Build An Internet Spider Program with Weighted Results

It will start out as a simple Spider Program that can quickly crawl through web sites and follow hyperlinks. In addition, it will have its results weighted, based on a simple interface with its owner. For example: “Good Job” may translate to +1, “Great Job” to +2, “Not what I wanted” to -1, and so on. The weights would be tunable.
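The feedback weighting described above can be sketched in a few lines of Java. This is only a minimal illustration, not a finished design: the class name FeedbackScorer and its methods are hypothetical, and the phrase weights are simply the examples from the paragraph above.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Tracks owner feedback per result URL; the phrase-to-weight table is tunable. */
final class FeedbackScorer {
    private final Map<String, Integer> phraseWeights = new HashMap<>();
    private final Map<String, Integer> scores = new LinkedHashMap<>();

    FeedbackScorer() {
        // Default weights taken from the examples in the text.
        phraseWeights.put("Good Job", 1);
        phraseWeights.put("Great Job", 2);
        phraseWeights.put("Not what I wanted", -1);
    }

    /** Lets the owner retune (or add) the weight attached to a phrase. */
    void setWeight(String phrase, int weight) {
        phraseWeights.put(phrase, weight);
    }

    /** Applies one piece of feedback to a result; unknown phrases count as 0. */
    void rate(String resultUrl, String phrase) {
        int delta = phraseWeights.getOrDefault(phrase, 0);
        scores.merge(resultUrl, delta, Integer::sum);
    }

    /** Running score of a result; 0 if it was never rated. */
    int scoreOf(String resultUrl) {
        return scores.getOrDefault(resultUrl, 0);
    }
}
```

A spider would call rate() each time the owner reacts to a result, and later sort its candidate pages by scoreOf().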

The back end would be augmented to include technologies such as a simple neural network, heuristics, fuzzy logic, genetic programs and algorithms that adapt to fitness functions, swarm intelligence, and so on. Open source databases would generally be included. The purpose of the back end is to add knowledge to each agent, add personality to the human-agent interface, add new capabilities to the agent-agent interface, gradually improve semantic network pathways, and gradually add seamless context-switching capabilities when the network grows to sufficient size. The human ought to be able to change topics, and the agent ought to know where to look to find intelligent-sounding responses, so that it may interact in interesting ways that are both useful and entertaining to its owner.

Programming A Spider in Java

Open Source Crawlers in Java

Human interfaces may be safely individualized to the owner’s taste as long as the agent-to-agent interface remains intact.

Like ordinary existing spider programs, these new intelligent Spider Agent hybrids would have HTML access scanners as well, and many or most versions of these programs would start off by scanning web sites for information and tags.
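To make the scanning step concrete, here is a rough sketch of a link-following crawl in Java. It is an illustration under stated assumptions only: TinySpider is a hypothetical name, the pages come from an in-memory map rather than real HTTP requests, and the regular expression is a naive stand-in for a real HTML parser.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TinySpider {
    // Naive href matcher (double-quoted values only); real HTML needs a proper parser.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    /** Returns every double-quoted href value found in the HTML, in document order. */
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    /** Breadth-first crawl over 'pages' (URL -> HTML), a stand-in for HTTP fetches. */
    static Set<String> crawl(String start, Map<String, String> pages, int maxPages) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty() && visited.size() < maxPages) {
            String url = queue.poll();
            String html = pages.get(url);
            if (html == null || !visited.add(url)) {
                continue; // page unavailable, or already visited
            }
            for (String link : extractLinks(html)) {
                if (!visited.contains(link)) {
                    queue.add(link);
                }
            }
        }
        return visited;
    }
}
```

The maxPages limit keeps the crawl bounded; a weighted spider could replace the plain queue with one ordered by the owner's feedback scores.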

Step Two: Add Semantic Interfaces to Web Sites Using XML

XML tags will be one of the features that are the most highly dynamic within the network of cooperating Agents. WordPress and other blog sites already do this. This posting includes tags and categories.

A nice feature is that each agent ought to be attached to a home-base web site, or a network of the owner’s blogs, websites and so forth. For example, this author owns this site plus access to other sites. Each site is linked on the home page. One site has the owner’s music that may be listened to by anyone.

This site features original science fiction stories that include links to other sites that contain information of interest and relevance to the context of the stories. Such related links could be tagged using an agreed-upon standard, making it very easy for the back-end agent heuristics to gather meaningful semantic information. Thus, another network is created in addition to the spider-agent network, adding a kind of synergy on top of the system.
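As a sketch of how an agent might read such tagged links, the Java below parses a small XML fragment with the standard DOM parser and groups link URLs by topic. No tagging standard is defined in this post, so the link element and its topic and href attributes are purely hypothetical, as is the class name TaggedLinkReader.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class TaggedLinkReader {
    /** Groups link URLs by topic. The link/topic/href vocabulary is hypothetical. */
    static Map<String, List<String>> linksByTopic(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("link");
            Map<String, List<String>> byTopic = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element link = (Element) nodes.item(i);
                byTopic.computeIfAbsent(link.getAttribute("topic"), k -> new ArrayList<>())
                       .add(link.getAttribute("href"));
            }
            return byTopic;
        } catch (Exception e) {
            // Malformed XML or parser setup problems end up here.
            throw new IllegalArgumentException("Could not read tagged links", e);
        }
    }
}
```

For example, a story page might publish a links element containing entries such as a link with topic "nanotechnology" and an href pointing at a related site; the agent's heuristics could then look up all links for a topic with one map access.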

Semantic Web


See more links on the semantic web at the bottom of this post. Thanks!

Step Three: Build a Network of Spider-Agent Programs that Learn From Each other

Make the Agents highly cooperative, with dynamic service-oriented architecture hybrids. Their interfaces will need to evolve, adapt, or be designed to be highly cooperative. At first, the initial agents may interact with a handful of trusted sites, but later there may be thousands, millions, or a billion. Some agents will evolve specialties, others may be more generalist, and others may evolve into a kind of middleman. Still others may act as a kind of immune system or private security guards. There may even be a “queen” agent that produces offspring.

A single human may own several cooperating agents and use self-selecting or breeding techniques to mate the programs. Some of these agents may simply be clones to allow faster searching; others may be specialists of one type or another. This process may lead to venturing out into the community of networks of trusted sites. As with biological systems, too much inbreeding may lead down a bad path, or at the very least to limited imagination, and progress is roughly linear. Cooperation, on the other hand, leads to something beyond simple exponential growth. Ideas bounce back and forth, and agents “breed” with a larger pool of ideas. Here “breeding” may be taken literally, with genetic programming techniques; metaphorically, with human interaction quite naturally adding new ideas into the mix; or, more likely, as a combination of the two.

Hopefully, researchers with advanced knowledge in the various fields I am mentioning here will cooperate with this effort and contribute by adding interfaces to their own projects. Presumably, they would be able to retain an isolated version of their research projects. If done carefully, the research community and the public at large would stand to benefit. Imagine using such a system for research on virtually any topic – be it scientific, music, philosophy, or pure entertainment!

The Facade pattern may be used as the interface, providing services. The services may include providing entire classes or class sets (Java perhaps) that will have a function another agent requests. In addition, each web site will provide XML tags. Interfaces will need to be standardized in such a way that the methods may be discovered and queried.
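Here is one way the Facade idea might look in Java, as a minimal sketch rather than a proposed standard. The class AgentFacade and its methods are hypothetical; the point is only that other agents see a discovery call and an invocation call, never the internals behind them.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Function;

/** Facade: other agents see only listServices() and call(), never the internals. */
final class AgentFacade {
    private final Map<String, Function<String, String>> services = new TreeMap<>();

    /** The owning agent plugs a capability in under a public name. */
    void register(String name, Function<String, String> service) {
        services.put(name, service);
    }

    /** Discovery: which services does this agent offer (sorted by name)? */
    Set<String> listServices() {
        return Collections.unmodifiableSet(services.keySet());
    }

    /** Invocation: returns an empty Optional when the service is not offered. */
    Optional<String> call(String name, String request) {
        Function<String, String> service = services.get(name);
        return service == null ? Optional.empty() : Optional.of(service.apply(request));
    }
}
```

A visiting agent would first ask listServices(), then call() only the names it understands, which keeps the agent-to-agent interface stable while the back end behind each service changes freely.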

They (the network of hybrid cooperating spider agents) may start off sticking to single topics and spread to new topics over time. For example, a new type of worldwide dictionary may result from an initial effort. PC owners may dedicate their off time to such a venture in a manner similar to the SETI effort. The network of agents involved may actually start understanding words, or at least act as if they do. Such a project may end up branching off into many useful areas, including better automated phone systems, better GPS and music units in the car, better machine-patient interfaces to help in the care of the elderly, and all kinds of fun toys, games, and gadgets. An efficient, mobile multilingual translator may result. Imagine your smart phone with instant access to words and phrases translated to any language for you (spoken or text or both). Suppose each agent were connected to a semantic “dictionary” of only 10,000 words and phrases, but there were 1,000,000 such agents all cooperating on a high-speed network. Each agent’s home base would include its own competing heuristics, neural network, and genetic algorithms. Eventually, the best of these systems would survive, while poor ones would die out, be replaced, or evolve into even better systems.

Currently, a single neural network in a single application is often limited to a hundred or so artificial neurons. Imagine one million agents cooperating with one another, having several connections (artificial synapses) to each other.

Some agents may evolve entirely through artificial selection using genetic algorithm techniques (with fitness functions). Others may have aspects of Intelligent Design (ID), where the owner adds in top-down heuristics. Natural selection contains no ID, but it took billions of years to get to where we are today, and we do not want to wait that long. Still others may be hybrids of the two, with additional features such as fuzzy logic.
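For readers who have not seen one, a genetic algorithm with a pluggable fitness function can be sketched very compactly. The Java below is a toy under stated assumptions: genomes are plain bit strings, the class name TinyGeneticAlgorithm and all parameter choices (single-point crossover, one flipped bit per child, parents from the fitter half, one elite survivor) are illustrative and not taken from any particular agent design.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;
import java.util.function.ToIntFunction;

final class TinyGeneticAlgorithm {
    /** Evolves bit-string genomes; returns the fittest genome of the final generation. */
    static boolean[] evolve(ToIntFunction<boolean[]> fitness, int genomeLength,
                            int populationSize, int generations, long seed) {
        Random rng = new Random(seed);
        boolean[][] population = new boolean[populationSize][genomeLength];
        for (boolean[] genome : population) {
            for (int i = 0; i < genomeLength; i++) {
                genome[i] = rng.nextBoolean();
            }
        }
        Comparator<boolean[]> fittestFirst = Comparator.comparingInt(fitness).reversed();
        int parentPool = Math.max(1, populationSize / 2);

        for (int g = 0; g < generations; g++) {
            Arrays.sort(population, fittestFirst);
            boolean[][] next = new boolean[populationSize][];
            next[0] = population[0].clone();            // elitism: the best always survives
            for (int i = 1; i < populationSize; i++) {
                // Parents are drawn at random from the fitter half of the population.
                boolean[] mother = population[rng.nextInt(parentPool)];
                boolean[] father = population[rng.nextInt(parentPool)];
                boolean[] child = new boolean[genomeLength];
                int cut = rng.nextInt(genomeLength);    // single-point crossover
                for (int j = 0; j < genomeLength; j++) {
                    child[j] = j < cut ? mother[j] : father[j];
                }
                int flip = rng.nextInt(genomeLength);   // mutation: flip one random bit
                child[flip] = !child[flip];
                next[i] = child;
            }
            population = next;
        }
        Arrays.sort(population, fittestFirst);
        return population[0];
    }
}
```

Because the fitness function is passed in as a parameter, an owner could swap in a different one (or receive one from another agent) without touching the evolution loop. Keeping one elite survivor means the best score never gets worse from one generation to the next.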

In each step, positive results are reported back to your spider program and the program adds a plus 1 to its database. Negative results are given a negative 1 response.

Once step three is achieved, then your agent could search other agents for similar searches or questions, check their given weights and use these weights as a means of finding the most likely results.
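Combining the weights that several agents report for the same results could work roughly as follows. This is a sketch with assumed details: PeerWeightRanker is a hypothetical name, each peer is represented simply as a map from result URL to weight, and weights are combined by plain summation, which is only one of many possible rules.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PeerWeightRanker {
    /** Sums each URL's weight across all peers and returns the URLs, best first. */
    static List<String> rank(List<Map<String, Integer>> peerWeights) {
        Map<String, Integer> totals = new HashMap<>();
        for (Map<String, Integer> weights : peerWeights) {
            weights.forEach((url, w) -> totals.merge(url, w, Integer::sum));
        }
        List<String> ranked = new ArrayList<>(totals.keySet());
        // Highest total first; ties are broken alphabetically so the order is stable.
        ranked.sort((a, b) -> {
            int byWeight = Integer.compare(totals.get(b), totals.get(a));
            return byWeight != 0 ? byWeight : a.compareTo(b);
        });
        return ranked;
    }
}
```

An agent would gather one such map from each cooperating agent that has handled a similar search and present the top of the ranked list to its owner first.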

Agents with the best responses to the types of questions a person may ask could be linked to the owner’s agent and replicated for use on that person’s computer or computer network. Owners may choose to retrieve personalities from cooperative agents. Perhaps a Ramona-like Avatar would be added into the mix, or one that plays chess or other games, or both!

Agent programs could be standardized in such a way that modules could readily be replicated. A person’s agent would automatically select the fittest algorithms. The algorithms will evolve “naturally” using various fitness functions. The fitness functions themselves can be shared and distributed. Human augmentation would be encouraged.

Example Scenario

Suppose an individual had a tendency to ask questions about Friday evening flights from LAX to Portland, enjoyed playing chess, had an arm-chair interest in nanotechnology, wanted the latest iPhone apps, enjoyed Aaron Copland, Mozart, and Bob Dylan, and enjoyed images of fractals and exotically clad young women.

First, let’s examine the flight information in more detail. After several runs, this person’s agent would come to find out the following:

The owner prefers American Airlines (AA) over Alaska, unless AA costs $20 or more above Alaska (suppose the owner has frequent flier mileage). If the flights are booked at least 2 weeks in advance, then the owner gets the seats he wants at a reasonable price. The agent learned these preferences on its own.
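Written out as code, the learned flight preferences might look like the Java sketch below. It reads the price rule as "choose Alaska once American costs $20 or more above it", which is an assumption about where exactly the threshold falls; the class FlightPreference, its constants and its method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

final class FlightPreference {
    // Values from the scenario; in practice the agent would tune them from feedback.
    static final int MAX_AA_PREMIUM_DOLLARS = 20;
    static final int MIN_DAYS_AHEAD = 14;

    /** Prefers American unless it costs $20 or more above Alaska (assumed threshold). */
    static String chooseAirline(int americanPrice, int alaskaPrice) {
        return americanPrice - alaskaPrice >= MAX_AA_PREMIUM_DOLLARS ? "Alaska" : "American";
    }

    /** True when booking today is still at least two weeks before the flight. */
    static boolean isEarlyEnough(LocalDate today, LocalDate flightDate) {
        return ChronoUnit.DAYS.between(today, flightDate) >= MIN_DAYS_AHEAD;
    }
}
```

Another owner's agent could copy these rules and change only the constants or the preferred airline, which is the kind of small adjustment described in the paragraphs that follow.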

Another owner’s agent discovers this agent while searching for the same information. It requests the data, plus any modifications to the algorithm search order, from the first agent. It turns out that the results are pretty good for the other owner, except that this person prefers Alaska Airlines. The adjustments are quickly made by the second agent. Its modified rules are made public.

A third agent with similar requirements finds both of these interfaces along with Googled information from Delta Air Lines. Its owner likes the cheapest prices regardless of airline, so the agent grabs the information from the first agent regarding strategy for better prices and applies it to a more general algorithm for its master’s purposes.

Suppose information was shared among all of the agents as to how often positive results were obtained directly from each agent. Weights could be added to each agent regarding this information.

As new information is added, new standard XML tags could be added into the mix and distributed. At first, there may be very few tags, such as the standard information given by all airlines, including names, flight number, pricing information, and so on. Pricing information may be the most dynamic, with flight times coming in second, and so on. The information does not have to be a direct, static value, but could serve as a pointer (a URL, for example).

Over time, some agents, or pools of agents, will become robust, generalized intelligent bots able to respond to more and more questions, play chess, research nanotechnology, and more simply by linking into areas of interest by tags. Tags themselves will evolve to have both general and specific qualities about them.

Context-switching agents may evolve, so that topics may change seamlessly and efficiently. Context-switching capability is one of the goals of the back end of the spider-agent hybrid system. It needs to be highly modular, with a well-designed object-oriented architecture, so that new versions can easily be swapped in.

Efficient data trails may be created in a manner similar to an ant colony’s chemical trails to the best food sources (swarm intelligence is but one example). Trails with the highest weights are followed, but with a twist: the weights need to have semantic context. The agents are usually looking for specific types of information for their owners, although sometimes they may be hunting down better algorithms or better heuristics for their own purposes. If an agent does better and improves itself, it will have a tendency to survive to the next generation; therefore this incentive ought to be built into its program(s). It should “want” to survive and have its algorithm cloned or its genes replicated, but only if it improves the colony (only if the owner’s agents will improve as a result). The “food” is whatever data or information the owner likes. To survive, the agent must improve the overall quality of this food for its owner.
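The trail idea can be illustrated with a small Java sketch in the spirit of ant algorithms. Everything specific here is an assumption made for illustration: the class name TrailMap, the use of a topic string as the "semantic context", and a single evaporation fraction applied to every trail.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class TrailMap {
    // Key is "context|source", so a trail's weight only counts within its own topic.
    private final Map<String, Double> trails = new HashMap<>();
    private final double evaporation; // fraction of weight lost per step, between 0 and 1

    TrailMap(double evaporation) {
        this.evaporation = evaporation;
    }

    /** Strengthens the trail to a source after it produced good "food" for the owner. */
    void reinforce(String context, String source, double amount) {
        trails.merge(context + "|" + source, amount, Double::sum);
    }

    /** All trails fade a little, so stale sources lose out over time. */
    void evaporate() {
        trails.replaceAll((key, weight) -> weight * (1.0 - evaporation));
    }

    /** Strongest source for the given context, if any trail exists for it. */
    Optional<String> best(String context) {
        String prefix = context + "|";
        return trails.entrySet().stream()
                .filter(e -> e.getKey().startsWith(prefix))
                .max(Map.Entry.comparingByValue())
                .map(e -> e.getKey().substring(prefix.length()));
    }
}
```

Because the context is part of the key, a source that is excellent for chess does not automatically attract agents that are looking for flight prices.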

The entire Internet may someday evolve into an artificial super-organism, from the bottom up and from the top down. Humans are still permitted to create new ideas from scratch and add new ideas into the mix to speed up the process, and at the same time the agents themselves will share and swap data and algorithms. In the long term, this approach may be more successful than any of the relatively isolated AI projects to date.

Negative intruder spiders will require a defense system of agents (an immune system). What does not kill the system will make it stronger! Competing systems will surely evolve.

Powerset Semantic Web Searching

Artificial Intelligence

The Emotion Machine – Minsky

Ant Algorithms


Fuzzy Tutorial

Fuzzy Logic


Free Will and AI forum posting (if you came here from there and want to return ….


© All rights reserved, with the exceptions given on the home page. In short, feel free to use this material in any public URL with “.com”, or “.edu” domains for non-profit purposes. Please link back to whatever you reference.

Consider cooperation for a greater gain over theft for short-term smaller gain. If you have good ideas share them using links, comments, original ideas. Make us all wealthy …. thanks!