Sample=PINC; Sample=PINC; Sample=PINC; Sample=PINC; Sample=PINC; Sample=PINC; Sample=PINC; Sample=PINC; Sample=PINC; Sample=PINC; Sample=PINC; Sample=PINC; Sample=PINC; ...
(See also 3 Million samples per second by Bob Davis.)
You can key in this letter by letter or simply use a spreadsheet to generate the source.
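Instead of a spreadsheet, a few lines of host-side code can generate the unrolled source as well. This is only a sketch: the function name `unrolled_source` is made up, and the indexed form `Sample[i]=PINC;` is an assumption for illustration (the listing above shows the statements without an index).

```cpp
#include <string>

// Build the unrolled source: one store statement per sample.
// Assumption: each statement writes to its own array element, Sample[i].
std::string unrolled_source(int count) {
    std::string src;
    for (int i = 0; i < count; ++i) {
        src += "Sample[" + std::to_string(i) + "]=PINC; ";
    }
    return src;
}
```

Printing the result of `unrolled_source(160)` and pasting it into the sketch would replace the spreadsheet step.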
But Mr. Davis did not check the compiled code. It actually looks like this:
Up to index 63, the stores alternate between these two forms:
    z = analogRead8();
    4bd8: 0e 94 98 11   call 0x2330 ; 0x2330 <_Z11analogRead8v>
    4bdc: d8 01         movw r26, r16
    4bde: de 96         adiw r26, 0x3e ; 62
    4be0: 8c 93         st X, r24
    z = analogRead8();
    4be2: 0e 94 98 11   call 0x2330 ; 0x2330 <_Z11analogRead8v>
    4be6: f8 01         movw r30, r16
    4be8: 87 af         std Z+63, r24 ; 0x3f
From index 64 onward, they look like this:
    z = analogRead8();
    4bea: 0e 94 98 11   call 0x2330 ; 0x2330 <_Z11analogRead8v>
    4bee: f8 01         movw r30, r16
    4bf0: e0 5c         subi r30, 0xC0 ; 192
    4bf2: ff 4f         sbci r31, 0xFF ; 255
    4bf4: 80 83         st Z, r24
But reading the Atmel instruction set manual carefully, you will find the instruction
ST (STD) – Store Indirect From Register to Data Space using Index Z
The Z-pointer Register can either be left unchanged by the operation, or it can be post-incremented or predecremented.
Using this instruction with the post-increment, all samples are read at equal time intervals, provided that no timer interrupts are permitted.
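In C terms, `st Z+, r16` behaves like `*p++ = value`: the byte is stored at the address held in the pointer, and the pointer then advances by one. A minimal host-side sketch (the names `store_post_increment` and `fill_three` are hypothetical) shows that the same statement, repeated unchanged, fills consecutive array elements:

```cpp
#include <cstdint>

// Mimics "st Z+, r16": store one byte where z points, then advance z by one.
inline void store_post_increment(std::uint8_t*& z, std::uint8_t value) {
    *z++ = value;
}

// Repeats the identical store three times; no index appears in the code.
inline int fill_three(std::uint8_t* buffer) {
    std::uint8_t* z = buffer;              // like loading Z with the address of x
    store_post_increment(z, 10);
    store_post_increment(z, 20);
    store_post_increment(z, 30);
    return static_cast<int>(z - buffer);   // number of bytes written
}
```

Since no address has to be computed per element, every copy consists of the same instructions, which is what keeps the sampling intervals equal.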
So, the repeated "loop" line will look like this:
"sts %2, r17 \n" // ADCSRA = conv "lds r16, %2 \n" // read ADCSRA "sbrc r16,6 \n" // test bit ADSC: ADC Start Conversion "rjmp .-8 \n" // jump to read again "lds r16, %3 \n" // read ADCH "st Z+, r16 \n" // store in x
which can be written in only one line:
"sts %2, r17 \n lds r16, %2 \n sbrc r16,6 \n rjmp .-8 \n lds r16, %3 \n st Z+, r16 \n"
You can copy this line as often as you want without modifying a single letter or number.
The editor of the Arduino IDE will help you keep track of the number of copies.
The number of copies must not exceed the size of the array.
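As an alternative to pasting the line by hand, the C preprocessor can do the copying, because adjacent string literals are concatenated into one. The sketch below is an illustration under assumptions, not the method used here: the macro names `SAMPLE_LINE`, `REP4` and `REP16` and the constants `COPIES` and `ARRAY_SIZE` are hypothetical, and the array size of 160 is only an example. It also lets the compiler check that the number of copies does not exceed the size of the array.

```cpp
#include <cstddef>

// The repeated line, exactly as given above.
#define SAMPLE_LINE "sts %2, r17 \n lds r16, %2 \n sbrc r16,6 \n rjmp .-8 \n lds r16, %3 \n st Z+, r16 \n"

// Adjacent string literals are merged by the compiler, so repeating the
// literal four times yields one string holding four copies.
#define REP4(s)  s s s s
#define REP16(s) REP4(REP4(s))

constexpr std::size_t ARRAY_SIZE = 160;  // assumed size of the sample array
constexpr std::size_t COPIES     = 16;   // REP16 produces 16 copies

// Fails at compile time if more samples would be stored than the array holds.
static_assert(COPIES <= ARRAY_SIZE, "more copies than array elements");

// Sixteen copies of the line as one string (sizeof includes the trailing '\0').
constexpr char SIXTEEN_LINES[] = REP16(SAMPLE_LINE);
constexpr std::size_t LINE_LENGTH = sizeof(SAMPLE_LINE) - 1;
static_assert(sizeof(SIXTEEN_LINES) - 1 == COPIES * LINE_LENGTH,
              "unexpected number of copies");
```

In a sketch, `REP16(SAMPLE_LINE)` would take the place of sixteen pasted copies inside the asm statement.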
As we are using some registers, their values have to be saved first:
    "push r16 \n"
    "push r17 \n"
    "push r30 \n"
    "push r31 \n"
and initial values have to be provided:
"lds r17, (%0) \n" // conv "ldi r30, lo8(x) \n" // x "ldi r31, hi8(x) \n" // Z
At the end the old values have to be restored:
"pop r31 \n" "pop r30 \n" "pop r17 \n" "pop r16 \n"
and the mapping of the operands has to be added:
    ::                                  // list of input operands:
      "o" (conv),                       // %0
      "m" (x),                          // %1
      "M" (_SFR_MEM_ADDR(ADCSRA)),      // %2
      "M" (_SFR_MEM_ADDR(ADCH))         // %3
    :                                   // list of used registers:
      "r16", "r17", "r30", "r31"
Having done all this, you surely want to know how much all this work sped up the performance. A look at the TFT display shows:
With a for-loop, taking 160 samples needs 800 microseconds.
Without the loop, you need only 650 microseconds.
So you gain just under 20 percent of the time. And what is the price for this? Well, each asm line takes 18 bytes. As I used four times the width of the TFT display for better triggering, the size of the binary exploded from 10 kByte to 22 kByte.
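These figures can be checked with a little arithmetic. The sketch below assumes that the 160 samples correspond to one display width, so that four widths mean 640 copies. The 18 bytes are the sum of the instruction sizes (`sts` and `lds` take 4 bytes each on ATmega devices, the other instructions 2 bytes each). All constant names are made up for illustration.

```cpp
// Instruction sizes in bytes for one repeated line:
// sts (4) + lds (4) + sbrc (2) + rjmp (2) + lds (4) + st Z+ (2)
constexpr int BYTES_PER_LINE = 4 + 4 + 2 + 2 + 4 + 2;

constexpr int SAMPLES_PER_WIDTH = 160;                    // assumption: one display width
constexpr int UNROLLED_COPIES   = 4 * SAMPLES_PER_WIDTH;  // four display widths
constexpr int UNROLLED_BYTES    = UNROLLED_COPIES * BYTES_PER_LINE;

constexpr int LOOP_US     = 800;  // measured with the for-loop
constexpr int UNROLLED_US = 650;  // measured without the loop

// Time saved in tenths of a percent, rounded down (integer arithmetic).
constexpr int SAVED_PERMILLE = (LOOP_US - UNROLLED_US) * 1000 / LOOP_US;

static_assert(BYTES_PER_LINE == 18, "one line is 18 bytes");
static_assert(UNROLLED_BYTES == 11520, "640 copies need 11520 bytes");
static_assert(SAVED_PERMILLE == 187, "about 18.7 percent saved");
```

Under these assumptions the copies alone account for 11520 bytes, which roughly matches the growth of the binary from 10 kByte to 22 kByte.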
Download the source
Download the auxiliary file