Performance: 20M2 versus Arduino 328

albatros

New Member
Hello,
I'm currently comparing the performance of
- a PICAXE 20M2, running at 32MHz, and
- an Arduino UNO board, fitted with an ATmega328 and an external 16MHz oscillator

I wrote a simple program:

Loop
- Pin 2, high
- Pin 2, low

I measured the following values:

PICAXE 160us high, 410us period
Arduino 115us high, 230us period

Did I do something wrong in the configuration of the PICAXE?
I expected the PICAXE to be much faster than the Arduino.

Albatros
 

eggdweather

Senior Member
Using this on an Arduino:
Code:
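#define pin 4                 // output pin under test

void setup() {
  pinMode(pin, OUTPUT);
}

void loop() {
  // Toggle inside a tight inner loop so the overhead of returning
  // from loop() does not stretch the low time
  while (true) {
    digitalWrite(pin, 1);
    digitalWrite(pin, 0);
  }
}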
This should give you times of 4uS high and 4uS low, a nice clean symmetrical square wave on Pin 4, which is significantly quicker than the PICAXE interpreter. The Arduino uses compiled code, and the source code is converted into fairly minimal instructions, whereas the PICAXE chip has to run the interpreter and then the BASIC code, so its overheads are much greater.

I have just run this programme on a 16MHz Arduino Uno and monitored the result with an oscilloscope.
 

eggdweather

Senior Member
OK, the code above is the only way I can get a symmetrical result. I think the jump from the second digitalWrite back to the start of the loop must be very quick, say 4 cycles (0.25uS at 16MHz) or faster on the Arduino. I think the Arduino compiler is using a read-modify-write sequence, because the output is symmetrical high and low as far as I can see on my scope, hence the relatively slow speed for setting a pin high or low. It could be done with op-codes/machine code for even greater speed.
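
As a rough cross-check of the timings discussed above, the helpers below convert between time and CPU clock cycles. This is only a sketch: the function names are made up for illustration, and it assumes the Uno's 16MHz clock is passed in as f_cpu_hz.

```c
#include <stdint.h>

/* Number of whole CPU clock cycles that fit in a time given in nanoseconds. */
uint32_t ns_to_cycles(uint32_t ns, uint32_t f_cpu_hz)
{
    /* cycles = time * frequency; the 64-bit intermediate avoids overflow */
    return (uint32_t)(((uint64_t)ns * f_cpu_hz) / 1000000000ULL);
}

/* Duration in nanoseconds of a given number of CPU clock cycles. */
uint32_t cycles_to_ns(uint32_t cycles, uint32_t f_cpu_hz)
{
    return (uint32_t)(((uint64_t)cycles * 1000000000ULL) / f_cpu_hz);
}
```

With these, ns_to_cycles(4000, 16000000) gives 64, so each 4uS half of the square wave corresponds to about 64 clock cycles per digitalWrite call, and cycles_to_ns(4, 16000000) gives 250, so a 4-cycle jump would take 0.25uS.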
 

Jeremy Harris

Senior Member
The Picaxe interpreter is pretty fast, but is always many times slower than the same code running in compiled form. The Arduino uses a far from efficient or properly optimised C compiler, but will still produce code that runs a lot faster than any Picaxe.

Arduino code is usually pretty messy and bloated in terms of the size of the source code and libraries, but the object code from the compiler will always be relatively fast as it is executing directly on the chip, rather than being interpreted and then executed on the chip as is the case with the Picaxe.

I'm amazed at how fast the Picaxe is for an approach that carries such an inherently high overhead in the chip firmware, it really is a masterpiece of how to wring out the best performance from an approach that doesn't require a development environment and compiler to do all the grunt work on the machine that is being used to write the code. Being interpreted on the chip also helps a lot when it comes to debugging, too, as it's pretty easy to see how the code is actually performing, in near real time.

Having said that, I'd love to see the hinted-at compiled version of Picaxe Basic. 99% of the time I can find a workaround to get fast things to happen with the Picaxe, but there are some projects where the additional speed that compiled object code can give would be great.
 

eggdweather

Senior Member
Out of interest:

Sketch uses 856 bytes (2%) of program storage space. Maximum is 32,256 bytes.
Global variables use 9 bytes (0%) of dynamic memory, leaving 2,039 bytes for local variables. Maximum is 2,048 bytes

Which is a lot of bytes for such a simple programme.
 

Jeremy Harris

Senior Member
eggdweather said:
"Out of interest:

Sketch uses 856 bytes (2%) of program storage space. Maximum is 32,256 bytes.
Global variables use 9 bytes (0%) of dynamic memory, leaving 2,039 bytes for local variables. Maximum is 2,048 bytes

Which is a lot of bytes for such a simple programme."

That's one of the issues I have with the Arduino, the source code ends up being massive just to do really simple things, and the compiled object code is far from being optimised in terms of either speed or storage space.

I remember when we first started to use compiled Pascal on microcontrollers and were shocked at how large the object code was compared with similar programmes we'd written for the same microcontrollers using assembly language.

I suspect a good programmer can still write code that is smaller and takes up less memory using assembly language than any compiled high level language.
 

eggdweather

Senior Member
I agree, but today time is money and memory is cheap and plentiful. I wrote a compiler at University and it is not easy to generate the object code. Mine produced assembler code that was then assembled into the final product using the Microchip assembler. It worked, but development was slow. The hardest task was the parser to interpret the source code, which has clearly already been done for the PICAXE, so converting the resultant pseudo code (surely they must be doing that) to object code should be straightforward. Maybe that could be Rev-Ed's next step. Their 'compiler' does not bring much efficiency gain in storage or speed terms; I had a quick look at it and thought there was no advantage over using the standard development environment. In the case of a compiler, the Harvard architecture helped a lot with simplicity. So I converted the line:

A=2+1

in op-code to:
PUSH 2
PUSH 1
PULL a
PULL b
ADD a,b
STORE b, address

About 12-bytes (16-bit) just to do something really simple, but could be as simple as:
MOV 2, a
MOV 1, b
ADD a,b

Or 7-bytes, happy days...
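
To make the stack-style listing above a little more concrete, here is a minimal sketch of a stack machine that evaluates A=2+1. The opcode names, the struct layout and the run function are all hypothetical and only loosely modelled on the listing; this is not the compiler described in the post.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes, loosely modelled on the listing above. */
enum op { OP_PUSH, OP_ADD, OP_STORE };

struct instr {
    enum op op;
    int16_t arg;   /* literal for OP_PUSH, variable index for OP_STORE */
};

#define STACK_SIZE 16

/* Run a program on a small stack machine; results are written to vars[].
   Returns 0 on success, -1 on stack overflow or underflow. */
int run(const struct instr *prog, size_t len, int16_t *vars)
{
    int16_t stack[STACK_SIZE];
    size_t sp = 0;

    for (size_t i = 0; i < len; i++) {
        switch (prog[i].op) {
        case OP_PUSH:
            if (sp >= STACK_SIZE) return -1;
            stack[sp++] = prog[i].arg;
            break;
        case OP_ADD:
            if (sp < 2) return -1;
            /* pull both operands, push their sum */
            sp--;
            stack[sp - 1] = (int16_t)(stack[sp - 1] + stack[sp]);
            break;
        case OP_STORE:
            if (sp < 1) return -1;
            vars[prog[i].arg] = stack[--sp];
            break;
        }
    }
    return 0;
}
```

Running the four instructions PUSH 2, PUSH 1, ADD, STORE 0 through run leaves 3 in vars[0]. The register-style MOV/ADD version is shorter because it skips the pushes and pulls entirely.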
 

inglewoodpete

Senior Member
If you only look at part of the picture (how fast will it go?), then you miss the overall advantages and costs of a microcontroller.

I use microcontrollers to earn my living these days. This has brought one of the advantages of the PICAXE to the fore. Development time costs - every hour of development time costs me money. Put another way, development time eats away my time that I can be putting towards taking on more projects. With experience, the PICAXE is particularly good for reducing development time. As a hobby, I might not factor this into the overall equation.

I have been using PICAXEs for 11 years and have learned a lot about how to get the best out of them. I don't use PICAXEs exclusively - I also use 'raw' PICs, programming them in C. The advantage with Microchip C is that I can use interrupts and have control of any recurring or revisited task (ADC, SPI, Serial transmission and reception, timers etc). PICAXE and Arduino only allow limited user control of interrupts and therefore limit their own flexibility and speed. I still use PICAXEs in around 95% of my projects because they do the job at the best cost. If I use a PIC, the development costs escalate and I have to justify and account for this. I have chosen not to go down the Arduino path because I am familiar with the PIC hardware already (I only have a limited number of brain cells :)). As an aside, I see that Microchip has recently announced that it is buying Atmel, too: http://www.microchip.com/investor/Pressrelease/Microchip%20Technology%20to%20Acquire%20Atmel.011916.pdf
 

bpowell

Senior Member
eggdweather said:
"Out of interest:

Sketch uses 856 bytes (2%) of program storage space. Maximum is 32,256 bytes.
Global variables use 9 bytes (0%) of dynamic memory, leaving 2,039 bytes for local variables. Maximum is 2,048 bytes

Which is a lot of bytes for such a simple programme."

I actually did a little write-up for my local maker group, breaking down the Arduino "Blink" program, against a straight-to-AVR blink program.

I tried to cut-and-paste it here, but it's too long. Take a look at our local message forum and you'll see the 3 levels of the same program: in C, assembly, and hex. It's amazing how much bloat Arduino adds in. But it does make the Arduino easier to program and learn on.
 

bpowell

Senior Member
albatros said:
"Hello,
I'm currently comparing the performance of
- a PICAXE 20M2, running at 32MHz, and
- an Arduino UNO board, fitted with an ATmega328 and an external 16MHz oscillator"

Aside from the comments already posted regarding the difference between running interpreted code (PICAXE) and straight compiled code (Arduino), you should also note: the 8-bit Microchip chips that PICAXE is built on run an instruction every 4 clock cycles, so the effective millions of instructions per second (MIPS) for a 20M2 running at 32MHz is 8 MIPS. As noted, those are PICAXE FIRMWARE instructions, not your code's instructions.

The ATmega runs an instruction every clock cycle, so the effective MIPS for the "standard" Arduino (ATmega328P) is 16 MIPS.

To compare fairly, you should be running a 20X2 at 64MHz to "match up" MIPS-wise with an Arduino running at 16MHz.
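
The arithmetic above can be sketched as a one-line helper. The function name is made up for illustration, and the figures are peak rates only: they assume every instruction takes the stated number of clocks, which is a simplification on both chips.

```c
#include <stdint.h>

/* Peak instruction rate in MIPS for a given oscillator frequency and
   number of oscillator clocks per instruction (integer division). */
uint32_t peak_mips(uint32_t f_osc_hz, uint32_t clocks_per_instruction)
{
    return f_osc_hz / clocks_per_instruction / 1000000UL;
}
```

peak_mips(32000000, 4) gives 8 for the 20M2, peak_mips(16000000, 1) gives 16 for the ATmega328P, and peak_mips(64000000, 4) gives 16 for a 20X2 at 64MHz.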
 

nick12ab

Senior Member
As already mentioned, Arduino should be faster because it uses compiled code. PICAXE has many strengths, but speed isn't one of them.

If speed is required, then on any microcontroller system it's usually better to write to whole ports at once instead of setting individual pins. PIC and AVR are 8-bit microcontrollers, so writing 8 bits at a time is no more difficult than writing just one. In fact it's easier, because when writing just one you've got to read what all the other bits are and write those to the port as well.

Take for instance the Arduino function digitalWrite: (PICAXE does something similar)
Code:
void digitalWrite(uint8_t pin, uint8_t val)
{
	uint8_t timer = digitalPinToTimer(pin);
	uint8_t bit = digitalPinToBitMask(pin);
	uint8_t port = digitalPinToPort(pin);
	volatile uint8_t *out;

	if (port == NOT_A_PIN) return;

	// If the pin that support PWM output, we need to turn it off
	// before doing a digital write.
	if (timer != NOT_ON_TIMER) turnOffPWM(timer);

	out = portOutputRegister(port);

	uint8_t oldSREG = SREG;
	cli();

	if (val == LOW) {
		*out &= ~bit;
	} else {
		*out |= bit;
	}

	SREG = oldSREG;
}

/*

ABOVE CODE TAKEN FROM THE FOLLOWING LIBRARY
  wiring_digital.c - digital input and output functions
  Part of Arduino - http://www.arduino.cc/

  Copyright (c) 2005-2006 David A. Mellis

  This library is free software; you can redistribute it and/or
  modify it under the terms of the GNU Lesser General Public
  License as published by the Free Software Foundation; either
  version 2.1 of the License, or (at your option) any later version.

  This library is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
  Lesser General Public License for more details.

  You should have received a copy of the GNU Lesser General
  Public License along with this library; if not, write to the
  Free Software Foundation, Inc., 59 Temple Place, Suite 330,
  Boston, MA  02111-1307  USA

  Modified 28 September 2010 by Mark Sproul

  $Id: wiring.c 248 2007-02-03 15:36:30Z mellis $
*/
That's very complicated just to do something very simple. And it even contains other functions that are defined elsewhere!

If you're writing a whole port at once, it's simply:
PORTB = B10101010;
on both systems (remove the ; and replace B with % for PICAXE), which translates into very simple assembly when compiled.
On PICAXE, the ports are already called that in the pinout diagram; on the ATmega328, PORTB is digital pins 8-13, PORTC is A0-A5 and PORTD is digital pins 0-7.
Do that comparison and see what timings you get.
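
For anyone trying that comparison, the difference between the two styles of write can be sketched in plain C as below. The helper names are made up for illustration, and each one takes a pointer to the register so the same code can be tried against an ordinary variable on a PC. On a real ATmega328 you would pass the address of a port register such as PORTD; the exact instructions the compiler emits are not shown here.

```c
#include <stdint.h>

/* Set a single bit: the register is read, one bit is ORed in,
   and the whole byte is written back. */
void set_bit(volatile uint8_t *reg, uint8_t bit)
{
    *reg |= (uint8_t)(1u << bit);
}

/* Clear a single bit: read, mask the bit off, write back. */
void clear_bit(volatile uint8_t *reg, uint8_t bit)
{
    *reg &= (uint8_t)~(1u << bit);
}

/* Write all 8 bits at once: a plain store, no read needed. */
void write_port(volatile uint8_t *reg, uint8_t value)
{
    *reg = value;
}
```

Starting from a register value of 0xAA, set_bit on bit 0 gives 0xAB and a following clear_bit on bit 1 gives 0xA9, with all other bits preserved, whereas write_port simply replaces the whole byte.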

bpowell said:
"Aside from the comments already posted regarding the difference between running interpreted code (PICAXE) and straight compiled code (Arduino), you should also note: the 8-bit Microchip chips that PICAXE is built on run an instruction every 4 clock cycles, so the effective millions of instructions per second (MIPS) for a 20M2 running at 32MHz is 8 MIPS. As noted, those are PICAXE FIRMWARE instructions, not your code's instructions.

The ATmega runs an instruction every clock cycle, so the effective MIPS for the "standard" Arduino (ATmega328P) is 16 MIPS.

To compare fairly, you should be running a 20X2 at 64MHz to "match up" MIPS-wise with an Arduino running at 16MHz."

Absolutely correct, a PIC with a 16MHz oscillator and 4xPLL to make it one instruction per clock cycle shouldn't be called 64MHz when comparing it with an AVR which does one instruction per clock cycle anyway. (AHEM! - Page 24 -Microcontroller Comparison)

eggdweather said:
"Out of interest:

Sketch uses 856 bytes (2%) of program storage space. Maximum is 32,256 bytes.
Global variables use 9 bytes (0%) of dynamic memory, leaving 2,039 bytes for local variables. Maximum is 2,048 bytes

Which is a lot of bytes for such a simple programme."

This is because PICAXE already has all its functions defined in the program memory occupied by the interpreter, so all PICAXE functions only take up a small amount of memory.

Arduino has nothing apart from a bootloader to enable serial programming. Every new function you add has to be defined in program memory, which means simple programs take up lots of space though expanding on it shouldn't make it much bigger.
 