Decode a GIF counter and extract the number
[counter GIF image] → 035352
I could try to make a long story short – but why should I?
The initial goal was this: when you run any kind of website, you most probably want to know whether there are any visitors – and how many. If there aren’t any, there is no need to maintain it. As this is a very common wish, tools to provide this service were developed in the early days of the internet. Most of these tools track the IP addresses of your visitors to avoid counting a visitor twice (or more often) when he or she comes back – unless he or she has changed IP address in the meantime.
Some internet service providers even give you access to log files, but not all of them do.
Web counters mostly come as GIF (Graphics Interchange Format) images showing a number of six (or more) digits that tells you graphically how many people have accessed your domain.
If you want long-term statistics, you have to write down the numbers shown in the GIF picture and key them into some spreadsheet software.
After a while you will get tired of doing that and think about a way to get this job done automatically. But the problem remains: there is only a GIF representation of the numbers.
If your counter is displayed like one of these:
[counter images omitted]
don't keep on reading - this article is not for you. You either manage to configure your counter in a different way, or you simply give up.
So the procedure consists of several jobs:
- download the GIF (my counter comes as GIF images containing 450 bytes each)
- transform the GIF data to a bitmap which is easier to understand and interpret
- do some optical character recognition, at least for the figures ‘0’ to ‘9’
O.k., let’s start:
- To download the GIF file you only have to check the HTML source of a web page that contains the counter and extract the URL of the counter, which will probably be a link to a CGI script. Just write a web client that does nothing but download that GIF image and store it locally.
- The GIF87a and GIF89a formats are well documented, even though the LZW compression they use was protected by patents for many years. And there are libraries available, but not for the Arduino.
- And as the figures are software-generated without any jitter or noise, the last job is the easiest to do.
- Arduino sketches that operate as a web client can be found in the Arduino examples folder and in thousands of tutorials, so I won’t talk about them.
- The matter of decoding (and displaying) GIF images with an Arduino has come up many times, and the result always was: it can’t be done. The reason is not the size of the flash memory but the limited RAM space needed to store
  - the source file (GIF),
  - the color table,
  - the dictionary tree,
  - the LZW decoding, and at last
  - the destination bitmap to be displayed.
This article is about how to solve it in the very special case of monochrome GIF files of fixed length and structure.
While OCR in general is a very complicated task, in this case it is extremely easy: there are only ten partial bitmaps to be detected. If the digits are shown in a 4 by 7 bitmap, you can store each of them in a 28-bit data structure (the data type “long” would be nice) and perform a “switch – case” to compare it to the ten possible constants – ready.
So let’s talk about decoding GIF files.
Why the hell did they decide to wrap six digits of data in a graphic file of 450 (or even more) bytes? What other ways could they have chosen?
You definitely would not want to rewrite and upload all your web pages every time the counter is incremented.
Well, there is the SSI (Server Side Includes) option, by which a server can insert variable information before the HTML file is sent to the user. But SSI is not available on all web servers.
And there is the iframe tag (inline frame) introduced in later HTML versions. But it was not supported by all web browsers.
So including a GIF file which is generated “on the fly” was the only possible way to install a counter.
Well, GIF files come in many flavours: animated ones, interlaced ones, pictures with a background color and many more. In our case we can ignore them all.
So you start checking all the documents about GIF formats and GIF decoders you find on the net.
When you try to analyse a simple GIF file containing the value of your counter, you will discover that the number of bits used to store one pixel varies (and is not a multiple of eight), and that without performing the complete decoding algorithm there is no way to get any further. So you either give up – or try to find another way.
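To illustrate why the data does not line up with byte boundaries: GIF packs its LZW codes least significant bit first, and the code width grows while decoding. The following is only a minimal sketch of such a bit reader, not a decoder; the class name `BitReader` and its interface are made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: reads codes of varying width from a byte buffer,
// least significant bit first, the way GIF packs its LZW codes.
class BitReader {
public:
    BitReader(const uint8_t* data, size_t size)
        : data_(data), size_(size), bitPos_(0) {}

    // Returns the next code of `width` bits (1..12), or -1 when the
    // buffer does not hold that many bits any more.
    int read(uint8_t width) {
        if (bitPos_ + width > size_ * 8) return -1;
        uint16_t code = 0;
        for (uint8_t i = 0; i < width; ++i) {
            size_t p = bitPos_ + i;
            uint16_t bit = (data_[p >> 3] >> (p & 7)) & 1u;
            code |= (uint16_t)(bit << i);  // first bit read is the lowest bit
        }
        bitPos_ += width;
        return code;
    }

private:
    const uint8_t* data_;
    size_t size_;
    size_t bitPos_;  // position of the next unread bit
};
```

With the two bytes 0x8C 0x2D, four 3-bit reads followed by one 4-bit read cross the byte boundary in the middle of the third code, which is exactly what makes a hex dump of such a file so hard to read.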
For me, the other way was to download as many counter images of the same type as I could get, compare them bit by bit, and find out what they had in common and where they differed. As I was only interested in monochrome images, the color table always was “0,0,0,255,255,255”, meaning black and white.
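The comparison itself can be automated. Here is a small sketch, assuming all downloaded files have the same length (as my 450-byte images do); the function name `markDifferences` is an invention for this example. It collects every bit that ever differs between a reference file and the others in a mask, so the bits that stay zero in the mask are the ones all files have in common.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: ORs the differences between files a and b into mask
// (which the caller clears once before the first comparison) and returns
// the number of bits that differ between this particular pair.
size_t markDifferences(const uint8_t* a, const uint8_t* b, size_t size,
                       uint8_t* mask) {
    size_t differing = 0;
    for (size_t i = 0; i < size; ++i) {
        uint8_t diff = a[i] ^ b[i];  // 1 bits mark the differences
        mask[i] |= diff;
        for (; diff != 0; diff >>= 1) differing += diff & 1u;
    }
    return differing;
}
```

Calling it once per downloaded file against the same reference file leaves a mask in which only the bits carrying picture information are set.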
At one point I realized that the bit position of a specific pixel within the file was always the same, no matter what the rest of the image looked like. In my case, the size of the image was 36 by 9, giving 324 pixels. As there is a frame one pixel thick, there were “only” 7 x 34 = 238 inner pixels of interest. The time-consuming job was to find out and enter the bit positions and pixel coordinates of these pixels as a basis for converting the downloaded GIF file to a bitmap.
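How such a table can be used, together with the 28-bit digit comparison mentioned earlier, is sketched below. Take it as an illustration under several assumptions, not as my actual sketch: the struct `PixelSource`, all function names, the bit numbering (bit 0 is the lowest bit of the first byte), the digit spacing of 6 columns, and above all the ten glyph constants are placeholders. The real font of your counter has to be read off your own images.

```cpp
#include <cstddef>
#include <cstdint>

constexpr uint8_t kInnerWidth = 34;   // 36 minus the frame on both sides
constexpr uint8_t kInnerHeight = 7;   // 9 minus the frame

// One table entry per inner pixel (hypothetical layout).
struct PixelSource {
    uint16_t bitPos;  // bit offset of the pixel, counted from the file start
    uint8_t x;        // column, 0..33
    uint8_t y;        // row, 0..6
};

// Assumption: bit 0 is the least significant bit of byte 0.
inline bool fileBit(const uint8_t* gif, uint16_t bitPos) {
    return ((gif[bitPos >> 3] >> (bitPos & 7)) & 1u) != 0;
}

// Fills rows[0..6]; bit x of rows[y] is set when the file bit of pixel (x, y) is 1.
void buildBitmap(const uint8_t* gif, size_t gifSize,
                 const PixelSource* table, size_t entries,
                 uint64_t rows[kInnerHeight]) {
    for (uint8_t y = 0; y < kInnerHeight; ++y) rows[y] = 0;
    for (size_t i = 0; i < entries; ++i) {
        const PixelSource& e = table[i];
        if ((size_t)(e.bitPos >> 3) >= gifSize) continue;  // outside the file
        if (e.x >= kInnerWidth || e.y >= kInnerHeight) continue;
        if (fileBit(gif, e.bitPos)) rows[e.y] |= (uint64_t)1 << e.x;
    }
}

// Packs the 4 by 7 cell whose left column is x0 into 28 bits: top row first,
// leftmost pixel in the highest bit of each 4-bit row.
uint32_t glyphAt(const uint64_t rows[kInnerHeight], uint8_t x0) {
    uint32_t g = 0;
    for (uint8_t y = 0; y < kInnerHeight; ++y)
        for (uint8_t x = 0; x < 4; ++x)
            g = (g << 1) | (uint32_t)((rows[y] >> (x0 + x)) & 1u);
    return g;
}

// Placeholder glyphs, NOT the real counter font. Each hex digit is one row.
int digitOf(uint32_t glyph) {
    switch (glyph) {
        case 0x6999996: return 0;
        case 0x2622227: return 1;
        case 0x691248F: return 2;
        case 0x6916196: return 3;
        case 0x1359F11: return 4;
        case 0xF8E1196: return 5;
        case 0x688E996: return 6;
        case 0xF112444: return 7;
        case 0x6996996: return 8;
        case 0x6997116: return 9;
        default: return -1;  // not a known digit
    }
}

// Assumption: six digits whose left columns are 6 pixels apart.
long readCounter(const uint64_t rows[kInnerHeight]) {
    long value = 0;
    for (uint8_t d = 0; d < 6; ++d) {
        int digit = digitOf(glyphAt(rows, (uint8_t)(d * 6)));
        if (digit < 0) return -1;
        value = value * 10 + digit;
    }
    return value;
}
```

On an Arduino with 2 KB of RAM the table of 238 entries (roughly 1 KB in this layout) would probably have to live in flash, for example via PROGMEM and pgm_read_word(); that part is left out here to keep the sketch compilable on a PC as well.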
As a tool to inspect the GIF file you can use the old MS-DOS program DEBUG.EXE, which is still included in the 32-bit version of Windows 7; you will find it in the \windows\system32 directory.
There are some pitfalls or tripwires to observe:
In my case there was a long header containing ASCII characters with copyright information about who wrote the GIF encoder:
“Count.cgi 2.5,(Apr-08-2001-1) By Muhammad A Muquit http://www.muquit.com/muquit/software/Count/Count.html”
Of course I could skip those data.
- The start of an image is marked by 0x2C.
- The end of an image is marked by 0x3B.
- At the start of the image data, the number of data bytes is given. Once that many bytes have been read (and there is more data to come), another length byte follows, giving the size of the next portion of data. Take care not to mistake these length bytes for picture information.
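These pitfalls can be handled by walking through the file block by block instead of searching for marker bytes. Note that a comma in a text header such as the one quoted above is itself the byte 0x2C, so a plain search for 0x2C may stop too early. The following sketch follows the block layout of the GIF87a/GIF89a documents as I understand it; the function name `extractImageData` and its return convention are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: copies the compressed data of the first image into
// out, leaving out the length bytes. Returns the number of bytes copied,
// or -1 if the file is malformed, truncated, or out is too small.
long extractImageData(const uint8_t* gif, size_t size,
                      uint8_t* out, size_t outCap) {
    if (size < 13) return -1;
    size_t pos = 6;                          // skip "GIF87a" / "GIF89a"
    uint8_t packed = gif[pos + 4];           // flags of the screen descriptor
    pos += 7;                                // width, height, flags, 2 more bytes
    if (packed & 0x80) pos += 3u << ((packed & 0x07) + 1);  // global color table

    while (pos < size) {
        uint8_t block = gif[pos++];
        if (block == 0x21) {                 // extension, e.g. a comment
            ++pos;                           // skip the extension label
            while (pos < size && gif[pos] != 0) pos += 1 + gif[pos];
            ++pos;                           // skip the zero terminator
            continue;
        }
        if (block != 0x2C) return -1;        // 0x3B (end) or garbage: no image
        if (pos + 9 > size) return -1;
        uint8_t imgPacked = gif[pos + 8];    // flags of the image descriptor
        pos += 9;                            // position, size, flags
        if (imgPacked & 0x80) pos += 3u << ((imgPacked & 0x07) + 1);  // local table
        ++pos;                               // skip the LZW minimum code size

        size_t n = 0;
        while (pos < size && gif[pos] != 0) {
            uint8_t len = gif[pos++];        // length byte, not picture data
            if (pos + len > size || n + len > outCap) return -1;
            for (uint8_t i = 0; i < len; ++i) out[n++] = gif[pos++];
        }
        return (pos < size) ? (long)n : -1;  // zero terminator must be present
    }
    return -1;
}
```

For files of fixed length and structure like mine, the same result can of course be had with fixed offsets; the walk above is only needed when the header length may change.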
To make life easier, I first coded the algorithm in Borland Delphi (version 6.0), which gave me immediate and visible results of success (or failure) as I progressed in understanding the real problem. And at the end of the day, I had Delphi produce a text file containing some of the code to be used in the Arduino sketch, so I did not have to enter it twice.
That’s it. If your counter is the same kind as mine, you will save a lot of time. Otherwise you have to modify it yourself.
anything to access an SD card (most probably a shield)
the table to convert GIF to bitmap