The source code of SGPO might well be lost... However we still have an old version of it that is not compatible but is working the same way.
I found the necessary tool to handel PNG in C(++). So I might well also find the time to make a similar program for PNG file.
I found a similar program in the name of Gifshuffle. Gifshuffle also offer build in encryption and compression. I guess if it can be disabled then maybe it will be "compatible" with SGPO. If it is not then maybe the new version of SGPO will be.
Because there is absolutely no secret within the method we use, anybody can retrieve the message if he have the program/knowledge.
It is possible to retrieve something from a graphic file where nothing have been hidden. Of course, this thing won't be anything intelligent. It's up to the person extracting the message to figure out if this is a message or just some trash.
If you have something to hide and want to protect your message, then you better have to send an encrypted hidden message and make sure your encryption program put as few structure as possible into the encrypted message so that it will be difficult to recognize that this is an encrypted message.
If you want to compress your message, make sure this is the first operation you do, then only encrypt and finally hide the result into a graphic file. Because encryption introduce confusion within your data, it will be very hard to compress after.
As a matter of fact, a steganographic technique should also provide a build in encryption algorithm so that the knowledge of the technique is not the only secret (as for security by obscurity).
The method we have implemented can provide this build in encryption with almost no extra cost (in space not in cpu time). The encryption is to my knowledge very strong if the length of the key is as long as the length of the message. It should be equivalent to use an XOR between your message and the key before trying to hide the message.
The encryption is not yet implemented...
You can safely skip a few paragraph if you only want to just use the program.
For graphic file that use a palette, this palette is a table of color. Then, every pixel is represented as an index to this table. The trick is to change the order of the color into the palette and also change every reference to the new index after permutation.
So we will encode our hidden message into a permutation. With a palette of 256 color, we have factorial of 256 possible permutation. As we will translate a message into a permutation, we will be able to hide as much as factorial of 256 possible different message. All of those message will have an associated permutation. We will be able to also make the reverse mapping, from the permutation to the message.
As we need to be able identify the permutation that have been used, we need a reference order for the color into the palette. So before you apply a permutation, you need to "sort" the palette into a reference representation. From there, you can apply the permutation to the new sorted palette. Your message have now been hidden.
On the other side, to get back the message, we will have to sort the palette and figure out the permutation to get from the sorted palette to the palette we received. Some kind of reverse permutation. The using the reverse mapping we can find back the message that was hidden and it will be clear text for us.
A few question still need to be answered to fully understand how it actually work within our implementation. You need to know how we get from a message to a permutation and then back to the message again. We will explain more in detail the process and algorithm to implement this technique.
Most of the time, our message is only made of stream of byte, then we will put them in the Most Significant Bit First order and the byte/character 'A', which have position 65 into ASCII character set and 65 have the binary representation '01000001', will be encoded 101000001 (so 321 which is 65 + 256). As a matter of fact, the empty message will be encoded as 1. Because there is no message associate with the value 0, we will always remove one to the encoding of the message. So now the empty message will be a value of 0 and 'A' will be a value of 320.
So on a practical point of view we start with one into an accumulator and for every byte/character from the message, we add the value of that byte to the accumulator multiply by 256 (except for the last byte). At the end we remove one from the big value we get. To decode the message we first add one to the value. Then we just have to compute the remain from dividing by 256, which is the last character. Then divide again (and keep the remain) for the previous one, and again until we get one or zero. If we get to one then this is a message with a size multiple of 8. Otherwise, the last byte we get is to be decoded into a certain number of bit from one to seven to be further analyze. No big deal, we will work with stream of byte even if this could be a weakness (as we don't use all possible permutation).
A permutation is always a permutation of something, so we need to know how much 'distinct' element we have available. So how much color we have in the palette. We will suppose that the palette contain only unique color. We also need to make sure the number we want to encode is lower than factorial of this number of element. We won't work with permutation of a set of element where you have duplicate because in a palette of color you should not find any duplicate. But it could be interesting to work on this problem...
We need to consider the permutation in an other way in order to simplify the algorithm. If you create a permutation of height element (the number between zero and seven) then for your first index you have height possibility, for the next one, you must choose between the seven left possibilities, then the six left� until your last choice is forced. So we can see this permutation with number within those range (0..7, 0..6, 0..5, 0..4, 0..3, 0..2, 0..1, 0). And this make 8*7*6*5*4*3*2*1 possibilities (witch is what we expect). To get from a number to a representation like that we just have to divide our number by height, then we have left by seven, �
To get from this representation of a permutation to a number, we can just apply reverse rule. You take the last number (always zero) and multiply by the number of element, then add the previous one and multiply by the number of element minus one, and continue until you get to add the first element that we multiply by one.
To terminate the story of our code, it seems that we work on element starting with the last one. So our range of value move that way: (0, 0..1, 0..2, 0..3, 0..4, 0..5, 0..6, 0..7). We will continue to explain using our representation.
If I find the name of this algorithm, I'll let you know. But this is well known and you can find it in almost any excellent algorithm book.
for (int i=0; i<s; i++) for (int j=0; j<i; j++) if (v[j]>=v[i]) v[j]++;To get back from a permutation to our own representation, we use the following piece of code:
for (int i=s-1;i>=0;i--) for (int j=0;j<i;j++) if (v[j]>=v[i]) v[j]--;Well, this is some kind of magic to me� I will let you figure out yourself how this can work and give what we expect.
We will apply multiple permutation on the set of data in order to create some kind of build in cryptography. One permutation is suppose to hide the other one. As a result of test with the effect of multiple permutation, I discovered a small flaw in the way our permutation where generate. For a small number, you get a permutation like this (7,6,5,4,3,0,2,1). Where what was on the top go to the bottom and what was at the bottom get to the top with some permutation. Then if you apply a second permutation of that kind, coming from another small number, you won't get overlapping permutation but the central value stay fix and separate permutation of the top and bottom of the set. For the set (A,B,C,D,E,F,G,H) you may get (C,B,A,D,E,G,H,F) where D and E don't move and you easily distinguish both permutation of ABC and FGH.
In order to make those to permutation overlap, we will reverse our permutation to get from (7,6,5,4,3,0,2,1) to (1,2,0,3,4,5,6,7). The smaller the number, the less we will mix the end of the set. For the number zero, we will have the 'no change' permutation (0,1,2,3,4,5,6,7). This mean an empty message will just sort the palette (or keep it the same way) of your graphic file (nice side effect). If you read carefully the next paragraph, you will notice that if you put an empty encryption message, it's just like making no encryption at all (another useful side effect). More on that, if you want to really "secure" your "hidden" text, you need to provide an encryption message key bigger (in size) than your message.
Whatever, we offer you the option to enter two messages, one to hide and one to serve as a secret key. If you don't use this optional key, then anyone with the program will be able to recover your hidden message. If you use the key, make sure the person on the other side share this secret with you. You can be pretty confident that no one will be able to recover your message without the key (or the key without the clear text message). I hope this option won't stop you to use other cryptographic tools in order to sign or encrypt (your usual way) your message.
More is to be done on this "encryption". Is it useful? What if you use multiple time the same secret? Is it better, equal or worse than just doing an XOR between the 'secret key' and the 'message to hide' before the permutation process? What if you use a too small 'secret key', will it make some part of the message 'easy' to recover and then which part, beginning or the end? I'll let you work on those question, I already have some answer. I just like the fact that we can apply a permutation on a permutation. Also, there is no reason to call one of the permutation the message or the secret. This way you hide two message into something and anybody knowing one of them can find the other one. This is just for fun.
For me a permutation is a vector of index value where every number from one to the maximum number is represented once and only once. The vector (3,1,2,5,4) is thus a permutation because it contain five elements from one to five and those elements are represented once and only one. This permutation tell us that data element in position one should be put in position three, data element in position two should be put in position one, �
Let's apply this permutation to the sorted data vector (A,B,C,D,E):
The sorted data vector : A B C D E The permutation vector : 3 1 2 5 4 The permutation applied on the data : B C A E DNow once we have the permuted data, how can we figure out witch was the permutation applied? Well, it is quiet easy, we just have to mark every element of the permuted data with their position from one to five in the vector. The we can sort the data and this will give us the permutation we are looking for.
The marked data : B1 C2 A3 E4 D5 Sorted marked data : A3 B1 C2 D5 E4 The permutation we where looking for : 3 1 2 5 4Ok, so now can we figure out witch is the reverse permutation to get back from (B,C,A,E,D) to (A,B,C,D,E)? In fact this is really easy, we just have to apply the permutation to the set of data (1,2,3,4,5) with number from one to five.
The sorted data vector : "1" "2" "3" "4" "5" The permutation vector : 3 1 2 5 4 The permutation applied on the data : "2" "3" "1" "5" "4" The reverse permutation we found : 2 3 1 5 4We can now apply this reverse permutation on the permuted data in order to check that we have what we expect.
The permuted data : B C A E D The reverse permutation : 2 3 1 5 4 The applied reverse permutation : A B C D EI hope those example will help you understand how we can work with permutation, apply them and find them back.
To confuse the discussion, there is an other way to consider a permutation vector, witch is just like applying reverse permutation� You will read something like (3,1,2,5,4) as fill first position with element in position number three, fill second position with element in position number one, �
The sorted data vector : A B C D E The permutation vector : 3 1 2 5 4 The permutation applied on the data : C A B E DWhich is exactly the result we get if we apply reverse permutation in the way we did previously.
The sorted data vector : A B C D E The reverse permutation : 2 3 1 5 4 The permutation applied on the data : C A B E DSo now you know everything there is to know about permutation. Chose your way to work with those permutation vector. I don't know if any of those two symmetrical way is standardize, I will have to ask the closest mathematician I know.
Let's have two permutation:
The KEY permutation vector
: 3 1 2 5 4
The MSG permutation vector
: 5 4 2 1 3
and apply those permutations to the sorted data vector (A,B,C,D,E):
The sorted data vector
: A B C D E
The KEY permutation vector
: 3 1 2 5 4
The KEY permutation on the data :
B C A E D
The MSG permutation vector
: 5 4 2 1 3
The MSG permutation on KEY perm. :
E A D C B
So the merged permutation is:
The sorted data vector
: A B C D E
The KEY then MSG permutation vector : 2
5 4 3 1
The KEY then MSG permutation of data : E A
D C B
Well, I don't know how to explain that to you, but to find the combine
permutation, I just have to apply the reverse permutation of the first
one to the second one:
The MSG permutation vector
: "5" "4" "2" "1" "3"
The reverse of KEY permutation
: 2 3 1 5 4
The KEY perm. apply on "MSG" perm. : "2" "5" "4" "3"
"1"
This is not all, but I still have to work (again) on the encryption stuff so I will sort it out and try to make it work once and for all.
There are multiple reference to the palette index into a GIF file, not only the pixel. We also have a Background color and a potential transparent color into GIF89. So we will have to also modify those reference once the palette have been permuted.
Also into a GIF file there might be more than one picture or piece of picture and for all those picture, we can have different palette, global palette and local palette. Those option are use mostly for 'animated' GIF file but might be encounter at some other place. A GIF file can also hold all kind of chunk of data, like a text comment, including proprietary chunk. A GIF file can be interlaced picture to do progressive display while downloading the file. So there are many thing to implement in order to be able to fully support GIF. We don't implement all of these.
Because we will decode GIF file generate by our own code and we control the way we generate those GIF file, we will restrict ourselves to a subset of the standard. Most GIF feature won't be available to you with our program. Of course you can manually use the generated palette and apply it to your GIF file with fancy feature.
In the future, we might remove the GIF decompression algorithm and use the one available into AWT. That way the input file won't have to be a GIF file and any format supported by AWT in witch we can find a number of color smaller or equal to 256 is ok for us. Of course the output will always be a GIF, except if we implement also PNG.
I would like to have support for PNG file format as this format use an open compression algorithm witch is patent free. Also this seems to be the format of the future of the World Wide Web, at least as soon as your favorite browser will support this and/or Web server will be able to serve in a different way clients with different capabilities.
Java is portable, but it would be nice to have a complete C/C++ version of this code, including support for big integer and compression/decompression of the mostly used graphic file that support palette. This way we could have a highly optimized version of the code because this is a bit CPU intensive and java does not help. I would like to write a Palm Pilot version of this code because this is more likely to be my next development platform.
I need to clean-up the code a little bit. I need to clean up this documentation and the "distribution" (installation, pointer to JDK, ...) of the program. A wintel.exe could be nice. Encryption is definitively needed as a must. I would like to also make available a java applet (my text file palette version would do the trick).
Also I am looking for other place where 'steganography by permutation' can be apply. Any file format sufficiently standardize, open and used on the internet or for electronic exchange that support chunk of data that can be sorted and reordered without lost of information will do the trick.
Of course because this is free time programming it is more likely that we won't have time to provide support and that we can not provide any warranty for the usefulness and workability of this program.
// ImageEncoder - abstract class for writing out an image
//
// Copyright (C) 1996 by Jef Poskanzer <[email protected]>.
All rights reserved.
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions
// are met:
// 1. Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following
disclaimer.
// 2. Redistributions in binary form must reproduce the above copyright
// notice, this list of conditions and the following
disclaimer in the
// documentation and/or other materials provided
with the distribution.
//
// THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS
IS'' AND
// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE
// ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS
BE LIABLE
// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL
// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS
// OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
// HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
STRICT
// LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY
// OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
OF
// SUCH DAMAGE.
//
// Visit the ACME Labs Java page for up-to-date versions of this
and other
// fine Java utilities: http://www.acme.com/java/
As for the ``AS IS´´ concept, this also apply for the rest
of the code.
If anybody else think he need credit, please get in touch with us.
Given the current problem with GIF pattend, it is more likely that this program is illegal...
This program could be adapted for PNG input and output and use PNGLib, but this was just a proof of concept.