Building a Dynamic and Self-organizing Network of Devices

At a worldwide consulting team meeting, we decided to try an Internet of Things (IoT) project. We saw this as an opportunity to learn something new on an important topic and to have some fun. At the same venue, there is a big demo showcase, and we decided to find out which demo was the most popular one by measuring the sound level at the different tables.


The purpose of this project is to measure sound levels at different locations in a room. To achieve this, we have to

  • Learn how to measure sound and find a way to classify it
  • Find hardware to use for measurements
  • Implement this in a way optimized for our hardware
  • Stream the measured data to the cloud for visualization and analysis

This tutorial will be split into three parts:

  • Part 1: Creating and optimizing an algorithm for measuring and comparing sound levels
  • Part 2: Creating device driver blocks for interacting with custom I/O on the hardware
  • Part 3: Analysis and Visualization on ThingSpeak


We wanted a setup where lots of devices measure sound levels and then send the results to the cloud. For our sensor nodes, we chose Arduino Nano devices because they’re small and low-cost, and Simulink has an Arduino support package. The Nano is also energy-efficient, which is great, but it doesn’t have Wi-Fi, which was a problem for our project. Instead of using a Wi-Fi extension, we chose an RF chip (the nRF24L01+, which I’ll just call RF24 for short), which can be used in a network and can communicate with several peers. We chose the RF24 over a Wi-Fi solution because it’s both very cheap and energy-efficient, which is valuable if you’re running on battery power.

To communicate with the cloud, we needed one node in the network with an internet connection.  For this, we chose a Raspberry Pi for its CPU power and ease of use. Simulink Coder has a Raspberry Pi Support Package which simplifies our workflow.

The setup, then, would be the Arduinos connected in a network, with the Raspberry Pi as the master node, relaying the measurement data to the cloud.

RF24 chips can only talk to 5 other nodes, so in order to connect more of them, they have to be organized into a network. Our Arduino and Raspberry Pi boards needed driver routines, which we downloaded from this GitHub account. It also provides a mesh library, which allows the nodes to connect in a tree-like structure, with the master node at the root and all sensors as leaves of the master node or of other slave nodes.

Part 1: Creating and optimizing an algorithm for measuring sound levels

To measure a sound level with high fidelity, like a decibel measurement, a fair bit of calculation and therefore memory and computing power is required.

It’s fairly easy to create an algorithm, even with limited knowledge of a specific topic. In my case, I have no particular background in audio engineering, so I just looked up how sound level is measured: basically, by taking the frequency spectrum and doing some additional processing. This is fairly easy to implement in Simulink, and my first attempt looks like this.
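To make that concrete, here is a minimal MATLAB sketch of the kind of pipeline I mean (this is my own illustrative stand-in, not the actual Simulink model; the sample rate and the test tone are assumptions):

```matlab
% Naive loudness sketch: power spectrum of one frame, reduced to one dB number.
fs = 8000;                          % assumed sample rate
t  = (0:1/fs:0.1-1/fs)';            % one 100 ms frame
x  = sin(2*pi*440*t);               % stand-in for microphone samples
X  = abs(fft(x)).^2 / numel(x);     % power spectrum of the frame
level_dB = 10*log10(sum(X) + eps);  % total power, expressed in dB
```

The FFT is what makes this expensive: it needs buffers for the whole frame plus its spectrum, which is exactly the kind of memory cost that bites on a small microcontroller.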

A simple pipeline of operations should provide me with a number describing the loudness for a certain signal. If I try to compile this for the Arduino, however, I will see the following results:

Program: 12636 bytes (38.6% Full)
Data: 30842 bytes (1506.0% Full)

The data memory usage, in particular, far exceeds what the Arduino offers, and we will have to optimize this code if we want to run it there. With some more research, it becomes clear that this algorithm can be simplified if, for example, we’re not interested in the exact frequencies at which the sound is loudest. Through a successive set of optimizations…

Program: 7140 bytes (21.8% Full)
Data: 29022 bytes (1417.1% Full)

…we arrive at code that is more and more optimized…

Program: 6284 bytes (19.2% Full)
Data: 6494 bytes (317.1% Full)

…and smaller…

Program: 7728 bytes (23.6% Full)
Data: 625 bytes (30.5% Full)

…and faster.

Program: 6392 bytes (19.5% Full)
Data: 387 bytes (18.9% Full)

This is a lot of optimization, and we seem to be in a good position to use this algorithm on our resource-limited Arduino. However, there is a question that is probably raising its ugly head in our vicinity by now. Just in case it isn’t, we’ll make one final optimization.

Program: 3290 bytes (10.0% Full)
Data: 238 bytes (11.6% Full)

It is not very likely that this particular implementation produces the same results as our first one, so the question should be clear by now, namely:

Does the algorithm still produce the same, or comparable results?

The answer to this may be a clear No! for our last attempt, but what about the previous ones? Bereft of my crystal ball, I’m forced to resort to validation, and this is where the beauty of simulation becomes apparent. When I simulate these algorithms on my computer, not only can I compare them to each other; I can also compare the ones that do fit on the Arduino with the more complex (and more correct) ones, which I could never test on the Arduino.

My goal is not to get an absolute number for the sound level at any point, but rather a comparison of the relative loudness. This means that the algorithms don’t have to produce the same results, but if sound sequence x is louder than sound sequence y for algorithm A, then algorithm B should come to this conclusion too.
I’m trying to find a winner of the showcase, i.e., the loudest booth, and I consider integrating the sound level over time to be a good measure of popularity. I create 5 sources of sound that I can compare with my algorithms, integrate the sound level outputs, and compare the results to deduce a winner.
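As an illustrative sketch of this idea (my own harness, not the actual comparison model; the sources and the two algorithms are stand-ins), comparing rankings rather than absolute values looks like this:

```matlab
% Rank five synthetic sources by integrated loudness under two algorithms
% and check that both agree on the ordering (the names here are my own).
rng(0);                                 % reproducible noise
fs = 8000;  t = (0:1/fs:1-1/fs)';
amps = [0.2 0.5 0.9 0.4 0.7];           % five sources, different loudness
srcs = randn(numel(t), 5) .* amps;      % stand-ins for recorded sources
ranks = zeros(2, 5);
for a = 1:2
    if a == 1
        score = sum(abs(fft(srcs)).^2) / size(srcs, 1);  % spectral reference
    else
        score = sum(srcs.^2);           % simplified time-domain version
    end
    [~, ranks(a, :)] = sort(score, 'descend');
end
agree = isequal(ranks(1, :), ranks(2, :));  % same winner, same ordering?
```

By Parseval’s theorem the two scores are equal here, so the rankings agree exactly; the point of the harness is that it would also flag any simplification that changes the ordering.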
I do this for all the algorithms (1-6) and see if they are acceptable. The model to compare the outputs looks like this:

The model reference blocks all point to the same algorithm and can be swapped for different algorithms.
The output for the first, most viable algorithm looks like this:

This means I now have a system simulating different algorithms, and also using them to actually compare the sound levels for different sources of noise. Had I only been able to build this on my Arduino, the conditions would have been quite different:

  • I would have had to start with an already optimized algorithm (to fit on the hardware), without knowing if it’s a good reference model for the results
  • Using synthetic test data would have been much more difficult
  • Logging results of the simulations would also have been more difficult
  • Changing and testing an alternative algorithm would also be more time consuming

The difference between simulating your algorithms on your host computer and just testing and adapting one algorithm on the Arduino, already strongly optimized to fit in memory, is huge. With this scheme, I can progressively optimize my way towards a working algorithm that will both fit on my hardware and produce the same, or acceptably close, results compared to my reference algorithm.

I should note that I wasn’t quite happy with the results my algorithm achieved, but due to time limits, I had to make do with an adequate one. The important point, though, is that I had the tools to do this development, and through simulation, I was actually aware of the shortcomings of my algorithm. This is part of what makes Model-Based Design so powerful.

In the next section, we’ll be looking at how to create device driver blocks for the RF24 chips.

Part 2: Creating device driver blocks for the hardware

So now we have a working algorithm, but we still have to make it read the actual signals from the microphone, and send the results somewhere to be processed. The part with the microphone is simple, as the Arduino support package already has blocks for reading analog signals, so that leaves us with the communication blocks.

The blocks from this part are available at MATLAB Central File Exchange.

Network architecture

For communication, we chose the nRF24L01+ (RF24) chips. These are fairly cheap chips that can communicate with up to 5 other nodes. Our setup may have up to 100 different nodes, though, so we have to connect the nodes in some kind of network.

We use an open source library from this account for communicating with the devices. These libraries not only contain functions for communicating between two nodes but also APIs for connecting several nodes into a network. There is also a mesh library that allows you to connect nodes in a self-organizing network. This makes it possible for the nodes to automatically retrieve an address and a position in the network. For more details, see this page in the RF24 library documentation.

The Arduino nodes will only be able to communicate via the RF24 chips, so they don’t have any internet connectivity. To communicate with ThingSpeak, we chose to do so via a Raspberry Pi. The Raspberry Pi acts as a server, which collects all information from the Arduino nodes and relays it to ThingSpeak.

Blocks for the RF24 chips

In order to communicate via the RF24 chips, we had to make calls to the RF24 API in the C code. To connect this with our Simulink algorithm, we created device driver blocks for the RF24 chips. This way, the user of this library can fully concentrate on developing their algorithm and doesn’t have to worry about writing low-level I/O functionality. We then generated and deployed our C code by pushing Simulink’s build button.
The nature of the hardware is such that potentially many parts of our model may use one hardware resource to send or receive data. We therefore chose to implement separate blocks for initialization, send, and receive. Thanks to the commonalities of the underlying libraries for Raspberry Pi and Arduino, we were able to use the same send and receive blocks for both platforms.

We implemented the blocks using S-Functions, and put masks on the blocks to make user interaction easier.

What data to send

With these blocks, we can now send the data we collect to our gateway, the Raspberry Pi. We have already created an algorithm for analyzing the sound level and optimized it for our hardware, but if we’re sending the data to the cloud for further analysis anyway, why don’t we do all the analysis there? Our cloud service will probably have more processing power than our Arduinos, so why this choice?

There are a few things to notice here:

  • Bandwidth – Through preprocessing, we send less data over the network. This saves bandwidth, which could otherwise be an issue in our network.
  • Regulations – We are interested in the loudness, but if we were to send the unfiltered data over the network, we’d basically be sending a sound recording to an internet server. There are regulations against things like this in many countries, so we’re better off not doing it. And if not for the regulations, then for the respect of privacy.
  • Load – A mass of raw data may slow down our cloud service too, so if we can preprocess the data at the source, that’s a wise choice.

That said, we’ll stick with our algorithm, and we’ll just send the sound level to our gateway.

The Gateway

The Raspberry Pi will act as our gateway, receiving data from the different nodes, and passing it to the cloud using ThingSpeak.

To read the data, we use the RF24 Mesh Read block and pass the data into a subsystem for further processing.

In our system, we differentiate between different channels of data and send each through the correct ThingSpeak channel. For each channel used, we have one ThingSpeak block, and we use Simulink logic to switch the received value to the correct block.
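In plain MATLAB terms, the routing idea looks like this (a sketch only; the node IDs and their channel assignments are made up by me, and in the real model this is done with Simulink switch blocks):

```matlab
% Map each node ID to its ThingSpeak channel and route one received value.
channelForNode = containers.Map({2, 3, 4, 5, 6}, ...
                                {227322, 227323, 227324, 227325, 227326});
rxNode  = 4;  rxLevel = 0.62;            % one received (node, sound level) pair
targetChannel = channelForNode(rxNode);  % the channel this value belongs to
% thingSpeakWrite(targetChannel, rxLevel, 'WriteKey', writeKey);  % actual upload
```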

In the next section, we’ll be looking at how to analyze our data with MATLAB and ThingSpeak.

Part 3: Analysis and Visualization on ThingSpeak

Analysis in MATLAB using ThingSpeak Toolbox

We can now start looking at how we can analyze the data on ThingSpeak. I’ll first try out some data manipulation in my local MATLAB installation.

Get the data

Our Raspberry Pi application has uploaded the data to our channels at ThingSpeak. If we have downloaded the ThingSpeak toolbox, we have functions for retrieving data directly from ThingSpeak.

chIDs = 227322:227326;
N = 5;
T = cell(1,N);
Y = cell(1,N);
I = cell(1,N);
for k=1:N
   [Y{k}, T{k}, I{k}] = thingSpeakRead(chIDs(k), 'NumPoints', 200);
end
legStrings = arrayfun(@(x) sprintf('Node %d', x), 1:N, 'UniformOutput', false);

Pack all Y data into one matrix, and use just one time vector

This is a simplification that is possible in our use case, since all signals share the same time vector. This isn’t necessarily the case in a real application.

TV = T{1};
YV = [Y{:}];

Do a raw plot

These are the sound levels, as we’ve calculated them and sent them to the Raspberry Pi from our different nodes.

plot(TV, YV), title('Raw data'); legend(legStrings{:}), shg

Integrate and plot the results

To do a simple integration, we take a cumulative sum of the data. If our sample times weren’t equidistant, we would have to take this into account in the calculations, but in this case, we can refrain from that.

IYV = cumsum(YV);
plot(TV, IYV), title('Integrated data'); legend(legStrings{:}), shg
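For reference, here is a sketch of how non-equidistant timestamps could be handled, using trapezoidal integration instead of a plain cumulative sum (we didn’t need this here; the numbers are made up):

```matlab
% Time-weighted running integral for unevenly spaced samples.
tSec = [0 1 3 6 10]';        % uneven sample times, in seconds
y    = [2 2 2 2 2]';         % a constant level of 2 "units"
IY   = cumtrapz(tSec, y);    % accounts for the varying gaps between samples
IY(end)                      % total: level 2 held for 10 seconds gives 20
```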

Normalize the values by dividing by the maximum

Normalizing the data will make it easier to see who’s in the lead, as no value will ever go above 1, and the current leader will always have the value 1.

maxY = max(IYV, [], 2);
NIYV = IYV./maxY;
plot(TV, NIYV), title('Normalized, integrated data'), legend(legStrings{:}, 'Location', 'bestoutside'), shg
ax=gca; ax.YLim(2) = 1.02;

Sort the winners

The normalized curves aren’t really the most legible, so we sort the data by the integrated sum. The resulting indices show us who holds each place.

[SNIYV, places]  = sort(IYV, 2, 'descend');
legPlaces = arrayfun(@(x) sprintf('Place %d', x), 1:5, 'UniformOutput', false);
plot(TV, places, 'x'), title('Leaders'), legend(legPlaces{:}, 'Location', 'bestoutside'), shg
ax=gca; ax.YLim = [0,6];
ax.YTick = 1:5; ax.YTickLabel = legStrings;

In this particular picture, we see that the first place, the blue line, is first taken by Node 2, then Node 4, and finally the full-day winner is Node 5. Node 3 spends almost the whole time in last place.
So, now we have a simple analysis in MATLAB, but how do we get this into the cloud? It turns out to be rather simple, as ThingSpeak supports a subset of MATLAB.

Analyzing data on ThingSpeak

To do this, I simply copy my code into a MATLAB Visualization in my ThingSpeak account. I then comment out all plots but the last one, since one visualization can only contain one plot. Done! As an example, I add the result to one of my channels, as seen in this screenshot.

This may not be the final choice for how to deal with our data. One drawback here is that we always have to read in all the data from our experiments. With more data accumulated, this may be costly to the point where it is no longer practical. We do have other possibilities, though.

ThingSpeak will let us create applications for data analysis too. We take the first part of this application, where we calculate the winner, and then we save the first three places to a separate channel.

We can read and analyze the data, just as we did in the first version. The only difference now is that we write the current leaders to a separate channel, created specifically for this task.

Write the current leader to our results channels

We only want the last value, the current top positions, and we only save the three best.

dataNow = places(end,1:3);
tNow = TV(end);
R = thingSpeakWrite(228275, dataNow, 'WriteKey', '3405691582', 'Timestamp', tNow);

We then save this code into a ThingSpeak analysis app.

We now have an app that we can run to get updated results. Running it manually is a tedious task, though, and one I’m not keen to engage in.

There are, however, ways of letting ThingSpeak do the work for us. We will create a timer application that triggers our analysis application regularly, thereby writing results to our results channel.
A few clicks, saying that our timer should call our app every 5 minutes, and we’re done.

If we let this application run alongside our measurements, we will have a result board showing the top 3 positions during the showcase.


We used Simulink to design algorithms for resource-constrained hardware. It let us easily implement our algorithm at the highest fidelity achievable and then modify it, compare results to our original values, and determine which optimizations were viable. Probably the biggest advantage was being able to compare, on the PC, algorithms that would never have fit on our hardware.

We got it working, but there was still room for improvement, for example in how to restructure the node mesh when connectivity changes.