Interfacing the law research Zalan Szakacs
Research ideas
Research questions:
- How would it be possible to set up a pirate library through steganography?
- Which technologies would be used?
- Where would the books be hidden? In JPG, PDF, MP3 or WAV files, or in EXIF metadata?
- Which books would be included?
- How do you link them?
- Which interface to use?
- Which cataloging system to use?
- How could layering (text, images, metadata) become the navigation element?
- How could pirated & steganographed books be brought back to the "official" library?
- How could the reader find those books in the library?
- How would it be possible to translate the texts into audio so that, while the audio plays, the images appear in its frequency levels?
For my research I want to use several steganography tools in Python to explore abusing file formats: hiding books in other books, texts in images, images in audio, etc. These experiments would create the foundation for programming the steganographed pirate library.
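As a first idea of what "hiding something in audio" could look like, here is a minimal sketch that uses only Python's standard `wave` module. It writes each payload bit into the least significant bit of a 16-bit sample. The function names (`make_silent_wav`, `hide_in_wav`, `extract_from_wav`) and the 4-byte length header are assumptions made for this illustration, not part of any existing tool, and the silent cover file is only a stand-in for a real recording.

```python
import io
import wave


def make_silent_wav(n_frames=2000, framerate=8000):
    """Build a small mono 16-bit WAV in memory to act as the cover file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(framerate)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def bits_to_bytes(bits):
    """Pack a list of 0/1 integers (most significant bit first) into bytes."""
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def hide_in_wav(wav_bytes, payload):
    """Store payload in the lowest bit of each 16-bit sample."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        params = w.getparams()
        frames = bytearray(w.readframes(w.getnframes()))
    if params.sampwidth != 2:
        raise ValueError("this sketch only handles 16-bit samples")
    # hypothetical container: a 4-byte length header, then the payload itself
    data = len(payload).to_bytes(4, "big") + payload
    bits = [(byte >> i) & 1 for byte in data for i in range(7, -1, -1)]
    if len(bits) > len(frames) // 2:
        raise ValueError("cover audio is too short for this payload")
    for i, bit in enumerate(bits):
        # WAV samples are little-endian, so the low byte sits at the even index
        frames[2 * i] = (frames[2 * i] & 0xFE) | bit
    out = io.BytesIO()
    with wave.open(out, "wb") as w:
        w.setparams(params)
        w.writeframes(bytes(frames))
    return out.getvalue()


def extract_from_wav(wav_bytes):
    """Read the lowest bit of each sample and rebuild the payload."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        frames = w.readframes(w.getnframes())
    bits = [frames[i] & 1 for i in range(0, len(frames), 2)]
    length = int.from_bytes(bits_to_bytes(bits[:32]), "big")
    return bits_to_bytes(bits[32:32 + 8 * length])


cover = make_silent_wav()
stego = hide_in_wav(cover, b"pirate library")
print(extract_from_wav(stego))
```

Changing only the lowest bit of a 16-bit sample should be inaudible in most recordings, but the hidden data would not survive conversion to a lossy format such as MP3.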
Research thought:
interest in hiding files in other files (imagine a PDF with an audio file inside it?). A sound file could have a whole library inside it. A JPEG has free space inside it (metadata, EXIF data, ...). Steganography
hiding books in other books
censorship
books on the blacklist
read&seek
pirate library is pirating its own files
hiding pirated books in “official” library
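One simple way to try out "hiding files in other files" is to append a payload after the end of a JPEG. Many image viewers appear to stop reading at the JPEG end-of-image marker (bytes FF D9), so trailing data usually goes unnoticed. The sketch below works on raw bytes only. The `MARKER` separator and the function names are hypothetical choices for this example, and the "cover" used in the demo is a fake stand-in, not a decodable image.

```python
JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker
MARKER = b"--PIRATE-LIBRARY--"  # hypothetical separator, chosen for this sketch


def append_payload(cover, payload):
    """Return the cover bytes with the payload appended after the image data."""
    if not cover.startswith(JPEG_SOI):
        raise ValueError("cover does not look like a JPEG")
    return cover + MARKER + payload


def extract_payload(stego):
    """Return the bytes after the separator, or None if there is no separator."""
    pos = stego.find(MARKER)
    if pos == -1:
        return None
    return stego[pos + len(MARKER):]


# stand-in for real JPEG bytes (in practice: open('cover.jpg', 'rb').read())
cover = JPEG_SOI + b"fake image data" + JPEG_EOI
stego = append_payload(cover, b"a whole book could go here")
print(extract_payload(stego))
```

This is hiding by placement rather than by encoding: the payload sits in the file in plain form, so anyone who inspects the bytes or compares file sizes can find it.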
Research references
→ Funky File Formats
Binary tricks to evade identification, detection, to exploit encryption and hash collisions.
→ Steganography
Digital steganography, a set of algorithmic techniques for hiding data in files, is often used to hide text messages (or other digital content) within the bits of an image. In contrast to cryptography, steganography makes it possible to hide the very fact that you are trying to hide something, an aspect that makes it really desirable for hidden communications or classified information leakage.
→ Javier Lloret - On opacity (2016)
→ Hiding in Plain Sight. Amy Suo Wu's The Kandinsky Collective
→ British POW uses Morse code to stitch hidden message during WWII
→ Introduction to Steganography
→ Using PIL
→ Hack This: Extract Image Metadata Using Python
→ ExifRead 2.1.2 Exif
Bibliography
Articles are saved in this Zotero library.
Python experiments
Experiment #1, based on the script from "Steganography: the art and science of hiding things in other things", part 1
# let's get our message set up
message = list('Steganography')
# convert to binary representation
message = ['{:07b}'.format(ord(x)) for x in message]
print("Message as binary:")
print(message)
# split the binary into bits
message = [[bit for bit in x] for x in message]
# flatten it and convert to integers
message = [int(bit) for sublist in message for bit in sublist]
print("Message as list of bits:")
print(message)
Experiment #1, based on the script from "Steganography: the art and science of hiding things in other things", part 2
from PIL import Image
import numpy as np

message = 'Digital steganography, a set of algorithmic techniques for hiding data in files, is often used to hide text messages (or other digital content) within the bits of an image. In contrast to cryptography, steganography allows to hide the very fact that you are trying to hide something, an aspect that makes it really desirable for hidden communications or classified information leakage.'
# convert the message to a flat list of bits (7 bits per character, as in part 1)
messageBits = [int(bit) for ch in message for bit in '{:07b}'.format(ord(ch))]

# first, open the original image (converted to RGB so every pixel has exactly 3 values)
imgpath = 'steganography_test_1.bmp'
img = Image.open(imgpath).convert('RGB')
# we'll use simple repetition as a very rudimentary error correcting code to try to maintain integrity
# each bit of the message will be repeated 9 times - the three least significant bits of the R, G, and B values of one pixel
imgArray = list(np.asarray(img))

def set_bit(val, bitNo, bit):
    """ given a value, which bit in the value to set, and the actual bit (0 or 1)
    to set, return the new value with that bit set accordingly """
    mask = 1 << bitNo
    val &= ~mask
    if bit:
        val |= mask
    return val

msgIndex = 0
newImg = []
# this part of the code sets the least significant 3 bits of the
# R, G, and B values in each pixel to be one bit from our message
# this means that each bit from our message is repeated 9
# times - 3 each in R, G, and B. This is a waste, technically
# speaking, but it's needed in case we lose some data in transit
# using the last 3 bits instead of the last 2 means the image looks
# a little worse, visually, but each message bit gets more copies - a tradeoff
# the more significant the bits get, as well, the less likely they are to be
# changed by compression - we could theoretically hide data in the
# most significant bits of each value, and they would probably never
# be changed by compression or etc., but it would look terrible, which
# defeats the whole purpose
for row in imgArray:
    newRow = []
    for pixel in row:
        newPixel = []
        if msgIndex >= len(messageBits):
            # if we've run out of message to put in the image, just add zeros
            setTo = 0
        else:
            # get another bit from the message
            setTo = messageBits[msgIndex]
        for val in pixel:
            # iterate through RGB values, one at a time
            # use a plain Python int so the bit masks don't overflow numpy's uint8
            val = int(val)
            # set the last 3 bits of this R, G, or B value to be whatever we decided
            val = set_bit(val, 0, setTo)
            val = set_bit(val, 1, setTo)
            val = set_bit(val, 2, setTo)
            # continue to build up our new image (now with 100% more hidden message!)
            newPixel.append(val)  # this adds an R, G, or B value to the pixel
        # the whole pixel now carries this bit: start looking at the next bit in the message
        msgIndex += 1
        newRow.append(newPixel)  # this adds a pixel to the row
    newImg.append(newRow)  # this adds a row to our image array

arr = np.array(newImg, np.uint8)  # convert our new image to a numpy array
im = Image.fromarray(arr)
im.save("image_steg.bmp")
# open the image we just saved and extract our least significant bits to see if the message made it through
img = Image.open("image_steg.bmp")
imgArray = list(np.asarray(img))
# note that message must still be set from the code block above
# (or you can recreate it here)
# rebuild the first 20 bits of the original message for comparison
# we don't use the entire message here since we just want to make sure it made it through
origBits = [int(bit) for ch in message for bit in '{:07b}'.format(ord(ch))][:20]
print("Original message bits:")
print(origBits)

extracted = []
for row in imgArray:
    for pixel in row:
        # we'll take a count of how many 0 or 1 values we see and then go with
        # the highest-voted result (hopefully we have enough repetition!)
        count = {0: 0, 1: 0}
        for val in pixel:
            # iterate through RGB values of the pixel, one at a time
            val = int(val)  # plain Python int, independent of numpy's uint8 rules
            # then, for each of the least significant 3 bits in each value...
            for i in range(3):
                # count up the bits we've seen
                bit = (val >> i) & 1
                count[bit] += 1
        # and once we've seen them all, decide which we should go with
        # hopefully if compression (or anything) flipped some of these bits,
        # it will flip few enough that the majority are still accurate
        if count[1] > count[0]:
            extracted.append(1)
        else:
            extracted.append(0)

# even though we extracted one bit for every pixel, we still only display the
# first 20 bits just to make sure they match what we expect
print("Extracted message bits:")
print(extracted[:20])
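The experiment stops at comparing bits. To read the hidden text again, the extracted list of bits has to be cut into groups of 7 (the width used in part 1) and each group turned back into a character. The following is a minimal sketch of that step. The helper names `text_to_bits` and `bits_to_text` are my own, and stopping at the first all-zero group is an assumption that matches the zero padding the encoder writes once the message has run out.

```python
def text_to_bits(text, width=7):
    """Turn text into a flat list of 0/1 integers, `width` bits per character."""
    return [int(b) for ch in text for b in format(ord(ch), "0{}b".format(width))]


def bits_to_text(bits, width=7):
    """Rebuild text from a flat list of bits, stopping at the zero padding."""
    chars = []
    for i in range(0, len(bits) - width + 1, width):
        group = bits[i:i + width]
        value = int("".join(str(b) for b in group), 2)
        if value == 0:
            # an all-zero group means we reached the padding after the message
            break
        chars.append(chr(value))
    return "".join(chars)


# simulate what comes out of the image: message bits followed by zero padding
bits = text_to_bits("Steganography") + [0] * 30
print(bits_to_text(bits))
```

With 7 bits per character this only works for plain ASCII text; books with accented or non-Latin characters would need a wider encoding.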
Encoding a text message based on the script of ASCII
for ch in "Digital steganography!":
    d = ord(ch)
    b = bin(d)
    print(ch, d, b)
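One detail worth noting when comparing this with experiment #1: `bin()` adds a `0b` prefix and drops leading zeros, so characters come out with different lengths (a space has only 6 binary digits), while `'{:07b}'` always gives 7. A small sketch of why the fixed width matters when the bits are joined into one stream; the function names are made up for this example.

```python
def variable_width(text):
    """Join the bin() digits of each character, without the '0b' prefix."""
    return "".join(bin(ord(ch))[2:] for ch in text)


def fixed_width(text, width=7):
    """Join zero-padded binary digits, `width` per character."""
    return "".join(format(ord(ch), "0{}b".format(width)) for ch in text)


text = "A "
print(variable_width(text), len(variable_width(text)))
print(fixed_width(text), len(fixed_width(text)))
```

In the variable-width stream there is no way to tell where one character ends and the next begins, which is why the hiding script pads every character to the same number of bits.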