Vitenka: 1496
I made this / Cat Games - Less lag, more 'thats bad'
[vitenkas.morat.net]
/ MORAT / freespace / Catnews / Editorial
There's nothing to see here except for shadows of the past - and these ones won't be returning.

I'd point you to my next project here - but I'm not that organised. My style is to act and then sort out the consequences, rather than the other way around. Oh, and lying. I do that a lot too. (i.e. if you look closely, you may have seen some links appearing roughly once a week)

Vitenka.com is registered to me for the forseeable future, so you might find something there.

Edited by Vitenka at 2003-04-09 08:22:54

 
Vitenka : Tue 2 04:18:29 2002  
Using self-adaptive algorithms to bypass restrictive IP.

And the Algorithm said - "I can do that"

This is all speculative, it's been about two years since I seriously looked at adaptive algorithms. (a class of AI software that includes neural networks, genetic algorithms, classifier networks and so forth) With the current fuss about celluar automata, I decided to dust off yet another old idea. Ok, so I was running out of ideas for columns today.

Black Box

A black box is a system which is totally hidden from view. All you have is an input slot, and an ouput slot. Very little these days is actually as cut off as that. Usually you will have other avenues of attack - hardware can be examined under a microscope, software can be decompiled...

However, the current trend is to make all of this illegal. Take the cover off of your computer and go to jail kind of stuff.

But there's nothing you can do about the inputs and outputs. They have to be there, or the damn thing won't work. Furthermore, most law protect the processes - not the results. If you can honestly say that you have no idea how someones software works, and are, in fact, pretty damn sure that yours does things diferently, then it doesn't matter if they end up doing the same thing.

The Basic System

At its simplest level a classifier is a system that can tell you whether something is the same as something else, or not. Think back eight years or so to tomorrows world lectures on 'fuzzy logic' - it's easy to say whether something is identical, but hard to say if something is similar. Classifiers work using an internal economy to promote rules that usually promote good results - without totally killing off the rules that usually produce bad results, but are sometimes very much more correct. The internal systems used to do this arne't important for this discussion - you just need to know that they are a LOT more efficient than just choosing random rules and then checking to see if they are good ones.

So, assume that we can build a classifier that can tell us whether one piece of software is functionally identical to another. It takes quite a while to build an accurate classifier for even a simple piece of software - but importantly, it builds itself. It will happily sit there just putting inputs into a system, watching the outputs, and learning how to tell if it has been given something that is "a lot like software A" or "not a lot like software A"

Let me phrase that in terms that software engineers will be happier with. We just built an oracle. For the laymen - an oracle is something that can tell you whether or not you got the right answer.

Now that you have an oracle, you need to generate software that generates a good positive from it. In other words, turn the problem around again, and find a "software B" that the oracle says is "very like software A". There are several obvious ways of doing this - I'll mention an obvious one and then two of the best.

  1. Obvious one first. Just randomly give it every piece of code and pick the one it gives the best score. Clearly this will work, but it will take a long time.
  2. A better one. You can inspect the internal rules that the classifier system has created. To put it simply, the system must know what output to expect in order to be able to say whether or not the output it got (from a given input) was accurate. Just throw away the stuff that generates the score, and you'll have your system. This isn't quite perfect since not all rules will be amenable to this kind of disentanglement (many rules may work backwards from the given output instead) and the system is likely to be hard to read.
  3. And another better one. Another learning system. Genetic algorithms (for example, CA's or neural nets would work too) are very good at adapting to rules, as has already been stated. Since we have a system that gives them a score - we can use a second system to build the software to test, and adapt it until it gets a high score.

Results

Well, building a system this way should give you a functional copy of the origional. Its internals will be radically different (unless the thing you are copying was also 'grown' in this way) but it will do the same thing most of the time.

Problems

One problem here is that the resulting program will not be a perfect copy. It is likely to have flaws. However, conventional software has bugs too - and using this method you can get actual knowledge of how 'tight' the fit is. (For example, given ten thousand random inputs, our system reacts identically for ninety-nine thousand and six hundred of them...)

Also, there are disadvantages to copying software too closely - you may end up copying undesired behaviour (the bugs) and ignoring the 'normal' behaviour in some cases instead. It might be better to allow some slack.

Next probem is more major. The resulting software will not be as efficient as a tightly coded human-written program. There are ways around this (such as giving 'compactness of code' as a learning parameter) but it may not be a major problem anyway. Large software projects are not written efficiently - most of the code exists to make it eaier for humans to read and debug. Since we don't care about the internal structure we can lose this overhead.

Simple tests have shown that, at least for small programs, a doubling of the size of the input program is reflected by only a five percent increase in the length of the smallest copying program. While this may not stand up to real life (the programs used were randomly generated in a simplified language) and extrapolation is dangerous - this certainly shows hope.

Speed is another problem. The support structure needed to interpret the rules the learning algorithm produces incurs a heavy overhead. It can also take a long time for an algorithm to find a good match to a complex system. (Although compared to the man-hours it took to produce the system in the first place, this might not be so bad)

Partitioning

Splitting a complex problem into several smaller problems provides an exponential improvement in speed, size and 'time to match' Although you cannot always partition a black box correctly, this doesn't matter. As long as your partitioning is fairly reasonable, some improvements will be made.

Also, there are some obvious improvements. Standard components, such as icons and other widgets, may be provided by the user (although they seem simple to us, bitmap images take up a large number of rules to create) - these things may be seperately protected by laws anyway.

Final Note

Obviously, vastly more work than the couple of 2k-rule examples I tried out needs to be done. But I think this approach shows promise. One of the best things that could come out of this, potentially, is freeing up the vast number of people currently employed in cloning competitors softwware for actually producing new ideas. Oh, and the traditional promise of all learning algorithms - it would let people spend more time deciding what they want a system to do, rather than how to do it.

There is something of a change happenning in such deign processes. I'm not providing the links to all the 'learning algorithm' sites on the web. I just thought an opinion piece here might make people think about a potential use of such things now.

Ok, ok, back to your regularly scheduled gaming. Maybe I should have tried harder to press for 'genetic bots in quake' as my third year project instead.

Older News  
Main News>>>
[M]
[Haiku]

[Rifle]
[MORAT]
[Fa-Teen]

 
[Froody Comics]



[Nifty Links]
[Editorials]
[Guilt Box]
You owe:
00:02
 
['Tenkas Tips]

This HTML design by Vitenka
I'm aware it sucks, but am also too lazy.
Note that now you've changed all the colours, it could be about anything!
BTW - this site looks fine in IE5 and netscape4. If you have a problem, if no one else can help, and if you can find it, maybe you can use - Netscape 2.