Wednesday 11 April 2012

Base96 Encoding

To convert a number to base 96, we first split it into two parts:
  • The low digit - holds values 0 to 95
  • The high digit - holds 0 to 95 multiples of 96
When added together, these produce a number from 0 to 9,215 (95 x 96 + 95).

We have already done something like this when we laid out sprites in our panel.  We converted a value into two parts: one for the column (the low digit) and one for the row (the high digit).

The same principle is used here.

The low digit is obtained by using modulo division on the number with the value 96.  The remainder will be in the range 0 to 95.

The high digit is obtained by dividing the number by 96 and discarding any remainder.  As long as the number is no higher than 9,215, the result will again be in the range 0 to 95.

We then add 32 to each of these results to get a number in the range 32 to 127.
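
As a worked example of the steps above, here is a small sketch that could be dropped into the do/loop next to the other print statements.  The value 1000 is just an arbitrary sample chosen for illustration, not something from the project.

```
rem Worked example: split the sample value 1000 into two base 96 digits
thisValue = 1000
lowValue = thisValue mod 96
highValue = thisValue / 96
rem lowValue is 40 and highValue is 10, because 10 * 96 + 40 = 1000
print( str( highValue + 32 ) + " and " + str( lowValue + 32 ) )
rem prints 42 and 72, the ASCII codes for * and H
```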

Since string comparisons work from left to right and stop when a difference is found, it is important that the characters are in the right order - the high digit must be first in the string.

Value 1, which would be encoded as " !" (space exclamation), must be found to be lower than value 96, which would be encoded as "! " (exclamation space).

In the string comparison the left character is checked first.  Space is lower than the exclamation mark, so the result is decided there and no further characters are checked.

In our project we will have two functions: Base96Encode() to convert a value to a 2-digit string, and Base96Decode() to convert a 2-digit string back to a value.  You should know by now where these go.

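function Base96Encode( thisValue )
   rem split the value into a low digit and a high digit, each 0 to 95
   lowValue = thisValue mod 96
   highValue = ( thisValue / 96 ) mod 96
   rem add 32 to each digit and put the high digit first in the string
   thisString$ = chr( highValue + 32 ) + chr( lowValue + 32 )
endfunction thisString$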

This does exactly as we described: it converts the parameter passed in thisValue into two parts and stores them in the variables lowValue and highValue.

Since we don't check the number first, the additional mod 96 at the end of the highValue line makes sure the result is never higher than 95.  Without it, a value above 9,215 would produce a character code higher than 127 and we would have problems later.
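
To see the effect of that extra mod 96, a line like the following could be added temporarily alongside the print statements in the do/loop.  The value 9216 is used here only because it is the first one that is too large for two digits.

```
rem 9216 / 96 is 96, which the extra mod 96 wraps back round to 0
if Base96Encode( 9216 ) = Base96Encode( 0 ) then print( "9216 wraps round to 0" )
```

The message is printed because both calls return two spaces.  An oversized value wraps round rather than producing an unexpected character.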

We then add 32 to each of the values and use the chr() command to convert the result into a character.  The two characters are joined using +, which joins strings together rather than adding them.  The result is stored in thisString$, which is returned at the end of the function.
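
Going the other way simply reverses these steps.  The following is only a sketch of what Base96Decode() might look like, not necessarily the version used in the project.  It assumes the string always holds exactly two characters produced by Base96Encode(), and it relies on the AGK commands mid() to pick out a single character and asc() to turn that character back into its ASCII code.

```
function Base96Decode( thisString$ )
   rem the first character is the high digit, the second is the low digit
   highValue = asc( mid( thisString$, 1, 1 ) ) - 32
   lowValue = asc( mid( thisString$, 2, 1 ) ) - 32
   rem rebuild the value from 0 to 95 multiples of 96 plus the low digit
   thisValue = ( highValue * 96 ) + lowValue
endfunction thisValue
```

With this sketch, decoding "! " (exclamation space) would give 96: the exclamation mark is 33 - 32 = 1 multiple of 96, and the space adds 32 - 32 = 0.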

To test this function, we add the following lines after the existing print statements in the do/loop portion - just before the sync() command.

for i=0 to 9
   s$ = chr(34) + Base96Encode( i ) + chr(34)
   print( str(i) + " = " + s$ )
next i
for i=94 to 103
   s$ = chr(34) + Base96Encode( i ) + chr(34)
   print( str(i) + " = " + s$ )
next i

This uses two for/next loops, one which counts from 0 to 9 (to test the first few values) and one which counts from 94 to 103 (to test values where the high digit first changes - value 96).

We create a temporary string s$, which is made up of three parts, separated by plus symbols to show they are joined together.

At the start and end is chr(), the same command we used in the function, this time with 34 (the ASCII code for a quote symbol).

Since strings are identified by quotes surrounding them, we have to use this command to put a quote into a string.  If we just used the quote character,  AGK would think we were describing a string with the value +Base96Encode(i)+ inside it.

In the middle of all this is the function call with i as the parameter.  When the code is run, it is the return value from the function that is put at this place in the string.  So the result is the encoded string surrounded by quotes, all of which goes into s$.

The next line prints i, converted to a string using the str() command, joined to an equals symbol and the contents of s$.

Notice the difference here: str() converts a value to a string version of that value, while chr() converts a value to the character which has that value as its ASCII code.

For example:

chr(65) would produce the string "A", because 65 is the ASCII code for the character "A".
str(65) would produce the string "65", which is that value converted to a string of two characters.

Neither would put quotes around the string; they are used here simply to show where the string starts and ends.
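
If you want to see the difference on screen, a couple of temporary lines like these could be added next to the other print statements in the do/loop.  This is only a quick check, not part of the project.

```
rem chr() gives the character with that ASCII code, str() gives the digits
print( chr(65) )
print( str(65) )
rem the first line shows A, the second shows 65
```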

This is why chr(34) is used in the test code above: it puts quotes around each encoded string so you can see where it starts and ends.
You may notice from the results that the quote character also appears as an encoded character in values 2 and 98.

The results also show value 1 is indeed encoded as " !" and value 96 as "! " as expected.
