
Saturday, May 27, 2017

STM32 Graphic capabilities

I had a great opportunity yesterday, the kind I usually try not to waste: I was offered the chance to learn something!
Never pass up an opportunity like that!

But that was just the beginning of my luck.
What if I told you that the topic was incredibly cool and that the instructor definitely knew his stuff, the same stuff I had been struggling with for a while before being invited to this free one-day training?

Does it get any better? Actually it does: the whole thing came with free pizza (and an STM32F746 Discovery board too!).

(Sorry, don't have an image of the pizza)

Jokes aside, thanks to STMicroelectronics for this opportunity.

As usual, one of my preferred ways of taking notes about something I have learnt or am learning is to write a blog post. It definitely helps me digest the information and, from some feedback I received, it sometimes turns out to be helpful to others too.

Ok, enough blah blah, down to business.
Graphics, oh yeah!

Let's start from the basics.

We live in a world where even toilets might have some GUI (not joking, see for yourself...), so adding graphics capabilities to MCUs seems only natural.
You can find several projects on the web that use even an Arduino Uno and some SPI screen, and some are pretty damn cool too.
However, if you want something a bit more performant, you need horsepower.
Cortex-M cores (from the M3 upwards) start to get you in the ballpark, but for complex applications you may need to scale up to an M4 or an M7.
A 32-bit RISC processor running at 216MHz, such as the STM32F746, has the computational power needed to support a modern interactive GUI.

Turns out horsepower is not everything; there is more, because we need more.

Ideally you want your GUI to
1) Run on a screen with decent definition (the bigger the screen, the higher the definition needed to maintain an acceptable DPI)
2) Have a high refresh rate (60Hz would do)
3) Have decent color depth, such as 16 or 24 bits
4) Avoid draining your CPU, as it usually needs to do other stuff in the meantime, like reacting to your inputs.

If you put all that together, you discover that you need a fast processor with quite a bit of RAM and maybe some easy way to expand it with fast external SDRAM, you maybe need some fast storage for your images, and ideally you want something that deals with the complexity of the screen interface for you.
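
To put rough numbers on those requirements, here is a minimal sketch in C. The function names are mine and purely illustrative, and I am assuming pixels that occupy a whole number of bytes (16 or 24 bits).

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to hold one full frame at the given resolution and color depth. */
uint32_t framebuffer_bytes(uint32_t width, uint32_t height, uint32_t bits_per_pixel)
{
    assert(bits_per_pixel % 8u == 0u);   /* this sketch handles whole-byte pixels only */
    return width * height * (bits_per_pixel / 8u);
}

/* Bytes per second that must be read just to keep the panel refreshed. */
uint64_t refresh_bandwidth(uint32_t frame_bytes, uint32_t refresh_hz)
{
    return (uint64_t)frame_bytes * refresh_hz;
}
```

For a 480x272 panel at 16 bits this gives 261120 bytes per frame and about 15.7 MB/s at 60Hz. Move to 800x480 at 24 bits and a single frame is already 1152000 bytes, which is where external SDRAM stops being optional.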

The thing is, it is not enough to solve ONE problem, you need to solve them all... and that's the kind of thing where MCUs excel: they provide you with a set of integrated peripherals designed specifically to tackle all the needs of an application.
I believe the STM32F7 is one remarkable example of that (Note: I am not sponsored by ST, although, technically... they gave me a free pizza so... I might be biased :) ).

To explain why I believe the F4 and F7 families have very good features to support a GUI, let's analyze what is needed.

If you are interested in getting the details, ST has great documents such as the Application Note AN4861, which I strongly encourage you to check.

When you interface a screen you can choose among different devices, but there are mainly two kinds of screens (more, actually; read AN4861 for that):
- Those that have memory and timing controller onboard
- Those that don't

The TFT screens you see around connected to Arduinos or even Cortex-M3s in projects are usually of the first kind.
You can find them on eBay for a few bucks so, why would you even consider the second kind?
Typically, if you need a resolution higher than 320x240 with a decent color depth and refresh rate, you need to drive the display yourself, and that adds quite a bit of complexity.
You need to manage the timing of the sync signals (see my posts here and here if you understand a bit of Verilog and FPGAs) and a 24-bit parallel interface for the colors, but the worst part is that the two must be precisely synchronized.
Say that you are refreshing a VGA screen at 60Hz: that means that at any specific instant you have to send the exact RGB values to correctly draw the current pixel, over and over.
If your CPU is dealing with that, it is probably not going to be able to do much more.
A Cortex-M7 @ 216MHz might be able to do that, but it should not have to, because you want to save your CPU clocks for your application, besides the GUI.
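
To get a feeling for how tight that is, here is a small back-of-the-envelope sketch. The function names are hypothetical, and I am assuming the usual 640x480 VGA timing, which is sent as an 800x525 frame once the blanking intervals are included.

```c
#include <assert.h>
#include <stdint.h>

/* Pixel clock in Hz: every pixel slot of the total frame (visible area plus
 * blanking) is clocked out once per refresh. */
uint32_t pixel_clock_hz(uint32_t h_total, uint32_t v_total, uint32_t refresh_hz)
{
    return h_total * v_total * refresh_hz;
}

/* Whole CPU cycles available per pixel if the core had to feed the display itself. */
uint32_t cpu_cycles_per_pixel(uint32_t cpu_hz, uint32_t pixel_hz)
{
    assert(pixel_hz != 0u);
    return cpu_hz / pixel_hz;
}
```

With 800x525 at 60Hz the pixel clock comes out at 25.2MHz, so a core running at 216MHz would have about 8 cycles per pixel to fetch the color, push it to the pins and keep the sync signals aligned. That leaves nothing for the application.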

ST added to some F4, F7 and L4 devices a peripheral called the LCD-TFT Display Controller (LTDC).
It can interface with several kinds of displays (note: depending on the device itself, some recent interface technologies might or might not be supported), it is integrated in the MCU and it fetches the pixel data from memory by itself, DMA-style.
Basically you define a buffer in memory (internal or external memory, it works just the same) and your code populates this buffer with your graphics; the LTDC reads it autonomously and provides the signals the display needs, no CPU required once you have started it.
And that's your RAM you are dealing with, the fastest thing you can read from and write to on your MCU, so an SPI interface is no match for it.
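
"Populating the buffer" really is just writing to an array. Below is a minimal sketch assuming a 480x272 screen in RGB565 format; the names `framebuffer`, `rgb565` and `fill_rect` are mine, not from any ST library, and on real hardware the array would have to sit at the address configured in the LTDC layer.

```c
#include <assert.h>
#include <stdint.h>

#define FB_WIDTH  480u
#define FB_HEIGHT 272u

/* Assumption: on the target this buffer is the one the LTDC layer points to. */
uint16_t framebuffer[FB_WIDTH * FB_HEIGHT];

/* Pack 8-bit R, G, B into one RGB565 pixel (5 bits red, 6 green, 5 blue). */
uint16_t rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r & 0xF8u) << 8) | ((g & 0xFCu) << 3) | (b >> 3));
}

/* Fill a rectangle with one color; the caller keeps it inside the screen. */
void fill_rect(uint32_t x, uint32_t y, uint32_t w, uint32_t h, uint16_t color)
{
    assert(x + w <= FB_WIDTH && y + h <= FB_HEIGHT);
    for (uint32_t row = y; row < y + h; row++) {
        for (uint32_t col = x; col < x + w; col++) {
            framebuffer[row * FB_WIDTH + col] = color;
        }
    }
}
```

Calling `fill_rect(10, 20, 100, 40, rgb565(255, 0, 0))` would paint a red rectangle; the display picks it up at the next refresh without the code doing anything else.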

During a hands-on lab in the training I was debugging the code and the execution was halted at a breakpoint, so the CPU was not doing anything. Still, the internal LTDC kept refreshing the screen completely autonomously, just as if the screen had its own controller and RAM buffer onboard (and it does not: the screen is a RK043FN48H-CT672B, which has an RGB parallel interface and no controller).

So, the LTDC plays an important role in enabling graphics capabilities, and it does so while allowing quite a bit of flexibility.

A second key component of the solution is RAM: high resolution GUIs are memory hungry, and this requires two things:
1) A decent amount of internal memory and / or an easy way to integrate inexpensive and powerful external memories
2) An efficient high bandwidth DMA

The F7 shines there. I suggest you check the schematics of the F7 Discovery board, they are surprisingly (to me) quite readable and extremely informative.
The beauty is that the LTDC needs no help to get at that memory: it is itself a master device on the AHB.

(Diagram from AN4861, by STMicroelectronics)

If you want to learn more about the F7 architecture, ST has a nice MOOC training about it; you can search for it on their website.

The potential drawback of this kind of solution is that you need a high pin count to connect the display (24 for the RGB, 3 for the sync signals, something to control the backlight, possibly an I2C for the touch panel...), which calls for some "interesting" PCB layout. You should also expect to deal with high pin count packages, typically BGAs... maybe not all of you (nor me, anyway) will be able to solder those in your kitchen.
Some new technologies such as MIPI-DSI, supported by some STM32 devices, solve that problem; I will not go into details here.

So, now we can update the screen and we have fast access to the frame buffer, be it internal or external. That's a lot already, but it's not all.

Chrom-ART, aka DMA2D.
This is another important graphics component of the STM32 architecture: it helps in populating your frame buffer.
Imagine your GUI has a background image and some buttons with icons.
What you do is fetch the background image, maybe from a QSPI flash, and copy it to the frame buffer; then, to draw the buttons, you go back to the QSPI to fetch the icons and finally copy them to the frame buffer as well.
It works, but it requires quite some effort from the CPU to handle all those memory transfers... unless you have a DMA2D, which gladly takes care of that for you.
It's pretty cool, read more about it in the AN4943 application note :)
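
To make the icon copy concrete, this is roughly what the CPU has to do when there is no DMA2D. It is a plain C sketch of a rectangular copy, not the HAL API; `blit_rgb565` and its parameters are names I made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy a w x h block of 16-bit pixels. The strides are the full line lengths
 * (in pixels) of the source image and of the destination frame buffer, so
 * every row starts at the right place in both. */
void blit_rgb565(uint16_t *dst, uint32_t dst_stride,
                 const uint16_t *src, uint32_t src_stride,
                 uint32_t w, uint32_t h)
{
    assert(w <= dst_stride && w <= src_stride);
    for (uint32_t row = 0; row < h; row++) {
        memcpy(dst + row * dst_stride, src + row * src_stride, w * sizeof(uint16_t));
    }
}
```

To draw a 32x32 icon at position (x, y) of a 480 pixel wide frame buffer you would call something like `blit_rgb565(fb + y * 480 + x, 480, icon, 32, 32, 32)`. As far as I understand it, the DMA2D is programmed with the same kind of information (addresses, block size, line offsets) and then does the row-by-row work in hardware while the CPU moves on.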

Finally, all this is indeed amazing, but how does it all come together when you are writing an application?
The keyword here is: software libraries.
By all means, you can fire up your STM32CubeMX, activate LTDC, DMA2D, QSPI and whatever you need, then use the HAL drivers to do your magic.
I tried and failed; maybe I would have a better chance at it now, as back then I had very little understanding of what I just wrote in this post.
Is there another way?
Yes, as I said, software libraries.
We played with 3 of them during the training, and all 3 looked pretty good to me, even if you may want to use them for slightly different purposes.
We tried Embedded Wizard, TouchGFX and STemWin (Segger).
Those libraries integrate smoothly and seamlessly with the STM32 hardware: you don't even need to care about LTDC, DMA2D etc., they take care of them for you, and they do way more.
They come with a PC-based design tool where you build your GUI graphically, including the interaction with the touch screen, and then C code is generated for your device.
They work in slightly different ways: EW and tGFX have a well integrated environment and deal with most of the tasks for you, while STemWin requires a bit more coding.
I personally prefer the STemWin approach because I feel it gives me better control over the code, but you pay for that with more effort in most cases.
Its PC tools are also a bit less polished, but again, that's not a main concern for me.
One good thing about STemWin is that it comes for free with STM32 devices, since ST made an agreement with Segger, customized/optimized the code for its devices and provided free licenses to its customers.
If you have a medium to big sized project, you are probably not going to decide based on library license costs anyway.
My impression (but I still need to play more with those libraries) is that EW and tGFX may provide a faster time to market and ensure good performance.
With STemWin I think you can achieve good performance, but it is up to you to optimize the process.

To wrap it up, ST seems to be quite committed to supporting graphics capabilities by:
- Providing fast MCUs (the STM32H7 will run at 400MHz!)
- Providing peripherals that take load off the CPU and ease the integration with memory and displays
- Working with partners to boost the software ecosystem
- Supporting customers with good documentation, examples and training

I may write more on this topic once I have played a bit more with the libraries, maybe with some code examples.

Friday, February 10, 2017

STM32 Programming Ecosystem

Not long ago I started playing with STM32CubeMX and Eclipse to do some experiments with the STM32 ARM Cortex M3 processors.

Setting up the toolchain, the IDE etc. was a bit complex, so I decided to create a YouTube video about it, thinking it might be useful for others going through the same thing.

The reason I did things that way was that I was planning to use my Eclipse/ARM setup with other (non-ST) devices too, so it made sense not to use the ST-specific version of the tools, which was also bound to the Mars version of Eclipse while I normally use Neon now.

I was wrong.

I mean yes, the intent made sense, but honestly all the additional hassle to avoid installing a new Eclipse instance was not worth it.

A couple of days ago I was lucky enough to participate in an extremely interesting workshop at the ST headquarters in Geneva (Switzerland).
They explained how to set up the tools, gave a few tips on how to best use them and provided extremely valuable information.
The workshop was engaging, well paced and indeed informative, kudos to ST for it and thanks again for the invitation!

The workshop will be held in various cities in the next days (at the time of writing); I strongly encourage you to participate if you are interested (it is free).
This is the link for Europe; if you are interested in other regions you might need to search around their website, there might be something available, I am not sure.

Now I need to capture the “standard/correct way” of doing things in a new video. I do it mainly because it is a sort of collection of notes for myself, but then again, others might benefit from it.

ST uses a proprietary, very low cost interface to let you program and debug its chips. This interface is called ST-Link and it is basically an alternative to a standard JTAG probe (I normally use a J-Link from Segger).
All the official boards include this interface, which allows you to plug in a USB cable and do all the programming/debugging thanks to a Windows driver.
No need for additional hardware.
However, should you have a non-official board (one with no USB debugging) with one of the STM32 chips on it, chances are that it exposes the pins needed for the ST-Link interface, which you can grab for a few bucks (less than 3$ shipped on eBay).

While a full blown JTAG device such as the J-Link might provide some more functionality/speed, I have to admit that the cheap ST-Link will probably get the job done for everybody.

There is an (optional) utility you can use to upload the binary file to the STM32 flash through ST-Link; it is called ST-Link Utility.
I am saying “optional” because usually your programming IDE will be able to do that too, interfacing directly with the ST-Link driver.

When it comes to the IDE, while there are many different valid options, ST proposes a free solution based on Eclipse (Mars 2 at the moment).

If you followed my previous video, you saw that you need to install three main components with your IDE:
1) The IDE / Code Editor itself
2) The ARM toolchain (compiler, builder, linker, make…)
3) The Debugger interface

The good news is that if you choose to go with the ST standard IDE (System Workbench – SW4) this is all taken care of, since ST packaged an Eclipse environment that contains all the needed components.
I strongly recommend this approach, it makes things WAY easier.
System Workbench comes with the Ac6 STM32 MCU GCC toolchain.

I like Eclipse, I admit it might be a bit “scary” at the beginning, but it is well worth spending a little bit of time to learn it since it can be used in so many different solutions (Coding any language / platform, ETL, Data Mining…).

With System Workbench (SW4) you can create your projects, but creating a Cortex-M project requires a few steps, which include adding the relevant libraries / header files for your specific device (CMSIS and additional stuff).
Like most IDEs, SW4 takes care of that: it will simply ask you which device or board you are targeting.
But it does more than that: it also lets you choose which additional libraries to include, or even which middleware (such as FreeRTOS).

… but you will probably not even use those features.
Why? Don’t get me wrong, they provide tremendous help, but the reason why you may not want to use them is that you can do the same in an even better and easier way!

Imagine you were able to add all the libraries and middleware, set up the correct stack and heap map etc.: what would be your next step in the project?
These MCUs have an incredible number of peripherals and you will probably use a few of them in your project, so the first step is usually to set up the clock(s) and then to enable and configure the multiplexer for the different pins and the peripherals you need to use.
While it obviously depends on the complexity of your project, this usually requires quite a bit of work. What if you could skip most of that?
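
Just to show the kind of arithmetic hiding behind "set up the clock(s)", here is a tiny sketch of the main PLL calculation. I am assuming a family with an M/N/P style PLL (such as the F4 and F7; other families use a different scheme), and `pll_sysclk_hz` is a name I made up, not an ST function.

```c
#include <assert.h>
#include <stdint.h>

/* System clock produced by the main PLL: the input clock is divided by M,
 * multiplied by N inside the VCO, then divided by P. */
uint32_t pll_sysclk_hz(uint32_t input_hz, uint32_t pll_m, uint32_t pll_n, uint32_t pll_p)
{
    assert(pll_m != 0u && pll_p != 0u);
    uint64_t vco_hz = (uint64_t)input_hz / pll_m * pll_n;
    return (uint32_t)(vco_hz / pll_p);
}
```

With a 25MHz crystal and the illustrative values M=25, N=432, P=2 this gives 216MHz. Now add the bus prescalers, the peripheral clocks and the constraints on each intermediate frequency, and you see why a tool that checks all of that for you is welcome.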

Actually you can, let me introduce you to STM32CubeMX.
You are not obliged to use it, but I cannot imagine a reason why you would not.
For starters: it’s free and it nicely exchanges information with your IDE (SW4 is obviously supported, but so are Keil and IAR).
What the Cube does is help you set up your project by doing pretty much the same things I said you would skip in the project creation in the IDE, PLUS it guides you in setting up your peripherals and the clocks.
Once you are finished, it generates a project for your IDE with all the configuration set correctly for you and with a code skeleton that configures and initializes the peripherals you selected.
Usually you decide upfront which peripherals you need and how they should be used, and you add your code later, but CubeMX allows you to change your mind.
In fact, when it regenerates the skeleton code, it uses some specific comment tags to preserve the code that you may have added.
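You need to be careful, then, about where you write your code: in the skeleton files Cube will add pairs of comment lines like these (the label after BEGIN and END changes from section to section):

/* USER CODE BEGIN 2 */

/* USER CODE END 2 */
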
That means that if you need to add your code, it should be placed between those two lines. There are many such sections, in different places, where you can put your code.
As long as you respect this rule, you can go back to CubeMX, change whatever you need to change there and regenerate the code: the code you added will be kept in the new skeleton.
That DID NOT work in my previous setup (plain Eclipse with the ARM GNU toolchain set up manually, instead of SW4), but maybe it was just me messing things up.
It definitely works smoothly using SW4.

The final component is STMStudio.
You can use it to debug your application. This is not the only option, obviously, since the IDE already includes a pretty good debugger, but STMStudio gives you a nice and simple way of monitoring variables (with graphical output if you want, sometimes it is useful).

Indeed there are many ways of customizing this ecosystem, but I found that, particularly if you are not an expert, it really helps to stick to the standards: SW4, STM32CubeMX, ST-Link and STMStudio seem to work very well together.

Here are the links to download them:

Happy coding!

The new video is here