This document describes a house automation project that uses a CAN-bus infrastructure with AT90CAN128 nodes. Its principal goals are to complement an existing, traditionally wired installation by
This project had a very simple precursor in the form of a bit-banger application, implemented more than 15 years back to respond to the urgent need of remotely extending electric circuits into my garden. The new system kept the old wiring but introduced CAN-bus technology and replaced the assembly-programmed Motorola MC68HC11 nodes with AVR AT90CAN128 nodes programmed in C. The first version of this new system became operational in 2008; it has run to full satisfaction ever since and has never had a failure worth mentioning.
The new system, implemented along the lines explained further down,
presently supports
the following applications:
For a long time, the project had only been internally documented, without time being invested into making public documentation available. I now realize that the approach pursued is somewhat complementary to other ongoing projects; some aspects of the developed solution may be of common interest, for instance
This text focuses on the principal concepts formulated and deployed. Rather than describing in detail how these concepts have been put to work, it picks out and illustrates particular issues; it is meant as a kind of inventory of useful concepts and their successful implementation. Some notions used in the discussion of concurrency and resource handling may sound somewhat academic; the text tries to demonstrate that clearly recognizing and addressing these issues is essential for obtaining an efficient and reliably working system.
Source code, schematics, and the documentation on printed-circuit layout are presently not available on a public server. Although the project is pursued in a spirit of open software, investing time in dealing with copyright issues and in keeping public documentation up to date is presently not justified. If access to my internal documentation is needed, please contact me at juergen.harms@unige.ch.
The basic concept for implementing the system is that of a distributed system
The system supports four types of nodes:
"Target nodes"
The "Master Node" (only one master node may exist on the system).
"Button Nodes"
provide push-button control over a set of devices (controlled by the target-nodes) and - optionally - operate LEDs that indicate the state of the devices. As an example, Figure 2.1.1-1 shows a button board with 10 LED-buttons for mounting in a double-68 mm wall fitting.
"Display Nodes"
Additional types can be added any time - for instance, the development of a wall-clock with alarm functions has just been completed. For the time being, it is handled as an additional type, but possibly it should become a variant of a display-node.
Distinguishing between types of nodes is an issue of software
engineering: it facilitates the implementation of node software. All
nodes of a given type use the same software, differences between
different kinds of - say - target-nodes are handled under control of
Unix-style "environment variables" that conditionally
include code that takes care of the particular features to be
implemented.
Complemented by simple dialogues that are run when the master-node or a target-node comes up, this is all that is needed to keep a multi-node system in a consistent state; this approach even allows any single node to be rebooted without loss of configuration data. This example also illustrates that the careful definition of the protocol is a key item in the design of the system.
There is an additional important element that influences overall system design: the system is designed to be independent of interactions with a running PC.
The bus uses category-5 cables that had been installed as spares when my local-area Ethernet was put in. The central and most important part of the network uses standard galvanically coupled differential CAN technology (MCP2551-like interface ICs, ISO 11898-2). The branch of the network that goes through the garden is implemented as an independent CAN bus: this "secondary bus" also uses ISO 11898-2 interfaces, but is connected to the central part by a bus-to-bus coupler node that uses optocouplers for galvanic separation. This concept has proven its worth: it survived a direct hit in a thunderstorm that killed my TV and hi-fi equipment.
Although the linear topology imposed by CAN suits large parts of the application topology, in some segments it has been necessary to resort to a token-ring-like bus-over-star wiring.
Presently, the bus has an overall length of nearly 100 meters and supports a steadily growing number of nodes (more than a dozen at present). The bus is operated at a bit rate of 100 kbit/s: this is fast enough to provide good real-time response and communication capacity, and slow enough to be safe with respect to problems of impedance matching and stub-wiring lengths.
Target-nodes realize the interface to the physical devices. As an alternative to integrating the node and the device into a single piece of equipment, nodes can be conceived as sub-centers - hubs at the origin of traditional wiring to the physical devices. In this case, one node controls several physical devices.
Figure 2.2.2-1 shows some devices that are controlled by the "garden" target-node, and Figure 2.2.2-2 the interior of the corresponding wiring closet (placed in a small garden shed nearby) that accommodates the node and auxiliary hardware: the Canstation pcb (see node-hardware) is hidden behind the bundle of yellow wires (top-right); the torus at the left-hand side is the transformer powering the solenoid of a valve for filling up the pond; the device in the bottom-left corner is the filter for the lamp shown in Figure 2.2.2-1 - removed from the lamp to keep its iron core from rusting.
This example happens to illustrate the - by far - most complex of the nodes presently implemented. The average node consists of a small printed-circuit board (the "Canstation" board, see below) plus a couple of relays. Figure 2.2.2-3 shows another interesting example, a node that drives a set of 8 sprinkler circuits, realized as a pcb. Its main components are (1) a solid-state relay for each circuit, (2) a piggy-backed Canstation pcb - described in the next Section - and (3) a switched 5V DC power supply. The annular transformer that provides 24V AC for the sprinkler solenoids is not mounted on the pcb.
All nodes use AT90CAN128 AVR processors (the smaller processors of that family are not easy to obtain, and given the low number of nodes the price is not an important issue; moreover, standardizing on a single type of processor has other advantages as well). Initially, prototype nodes using the Microchip MCP2515 controller had been successfully tested; preference was given to the fully integrated AVR solution because it allows smaller printed-circuit boards and avoids the overhead of data transfers between the processor and an external controller. An MCP2515 controller is nevertheless used to obtain a second CAN interface in the bus-coupling target-node that attaches the galvanically isolated secondary bus.
A small (45 by 55 mm) 2-layer printed-circuit board has been developed as a basic building block for all nodes; it is used
Figure 2.3.1-1 shows this Canstation printed-circuit board.
The board uses a combination of SMD and through-hole devices - partly to facilitate stock keeping, but also because it is easier to correct design errors on a board that had been conceived as a prototype - an argument that is by now somewhat obsolete. The price paid for the last batch of 30 unequipped boards was 228 Euros.
The board has connectors for
In the majority of nodes, only the main parallel I/O connector is used. If all B-register pins are used for output, the corresponding pcb leads can be cut under the "U3" socket (see figure 2.3.1-1) in favor of adding a ULN2803 (8 Darlington amplifiers that can drive loads of up to 500 mA).
At present, a slightly bigger (61 by 69 mm) pcb is being developed as an alternative that adds external RAM (a 32 kByte DS1244 non-volatile IC). Although, surprisingly, with careful design the small size (4 kByte) of internal RAM has never been a severe restriction, experiments with storing transient data (for instance for logging temperature charts) and with more sophisticated user programming of event sequences are easier to accomplish with "unlimited" live memory. This extended Canstation is intended to replace the current master-node.
Most nodes require the Canstation board to be complemented by additional circuitry that controls the physical device. The small number of nodes in general does not justify the development of node-specific printed circuits. Most nodes are therefore realized by manually wiring-up laboratory cards. So far there are two exceptions where the wiring effort, or the complexity of the wiring made me opt for implementing application-specific printed circuit boards:
The EEPROM is configured not to be modified when new code is downloaded (the EEPROM-protection fuse bit is set). In all nodes, the lowest EEPROM positions contain:
In the master-node, the EEPROM is also used for storing the current list of events and the event parameters. This data is potentially subject to frequent modifications (interventions at the display-node): to prevent premature deterioration of the EEPROM, a mechanism is used that "ages" the most recently modified value in RAM before it is copied to EEPROM.
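The aging idea can be sketched as follows. This is a minimal illustration, not the project's actual code: all names (AgedCell, AGE_LIMIT, EepromCommit) and the concrete limit are assumptions; the point is that a burst of edits at the display-node causes one EEPROM write instead of many.

```c
#include <stdint.h>

#define AGE_LIMIT 100           /* ticks of stability before committing */

typedef struct {
    uint8_t  value;             /* current (RAM) value                  */
    uint8_t  committed;         /* value last written to EEPROM         */
    uint16_t age;               /* ticks since the last modification    */
} AgedCell;

static unsigned eeprom_writes;  /* stands in for a real EEPROM driver   */

static void EepromCommit(AgedCell *c)
{
    c->committed = c->value;    /* real code would call the EEPROM API  */
    eeprom_writes++;
}

/* Called from application code whenever the value changes. */
void AgedCellSet(AgedCell *c, uint8_t v)
{
    c->value = v;
    c->age = 0;                 /* restart the aging period             */
}

/* Called once per clock tick (e.g. every 10 ms). */
void AgedCellTick(AgedCell *c)
{
    if (c->value == c->committed)
        return;                 /* nothing pending                      */
    if (++c->age >= AGE_LIMIT)
        EepromCommit(c);
}
```

Each modification resets the age counter, so only a value that has stayed unchanged for the whole aging period reaches the EEPROM.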
Conceptually, the activity of a node corresponds to the execution of several concurrent processes - such as the reception of messages over the network, the transmission of messages, the output of text to a display panel, etc. In fulfilling their tasks, these processes must access critical resources - resources that may be used by only one process at a time, for instance the output channel to the CAN bus or access to the display panel. Processes requiring access to critical resources must be executed in mutual exclusion.
A small real-time operating system (described in the next Section) is used to support the implementation of processes. This operating system handles the synchronization of processes, but the decision was made to refrain from handling mutual exclusion with mutex-like constructs, the normally used approach: this would require implementing process queuing, which is problematic where live memory is very scarce.
For the critical resource "UART", a different solution is implemented. The UART is only used for debugging; creating a separate process that owns the UART and handles I/O for other processes would be overkill. The sharing of the UART is therefore achieved by a simple locking mechanism that uses active waiting: the library procedure for access to the UART uses the UDRIE bit of the UCSRB register as a lock bit. When output is launched, this bit is checked and set if it was free - otherwise the calling process loops on this check until the UART interrupt handler clears the bit at the end of the current output action. Active waiting is banned everywhere else since it deteriorates real-time response.
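The locking logic can be sketched like this. On the AT90CAN128 the lock is the UDRIE bit of UCSRB (set = interrupt-driven transmission in progress); here it is modeled by a plain flag so the logic can be shown without hardware, and the names (uart_busy, UartPuts, UartTxIsr) are illustrative, not the project's actual identifiers.

```c
#include <stdbool.h>

static volatile bool uart_busy;      /* stands in for UDRIE in UCSRB     */
static const char  *uart_ptr;        /* next character to transmit       */

/* Launch an output action: spin until the UART is free, then claim it. */
void UartPuts(const char *s)
{
    while (uart_busy)                /* active wait - tolerated ONLY for */
        ;                            /* the debug UART, banned elsewhere */
    uart_busy = true;                /* corresponds to setting UDRIE     */
    uart_ptr  = s;                   /* the ISR will now drain the text  */
}

/* Data-register-empty interrupt handler (ISR(USART0_UDRE_vect) on AVR):
   sends the next character, clears the lock when the string is done. */
void UartTxIsr(void)
{
    if (*uart_ptr) {
        /* UDR0 = *uart_ptr; on the real hardware */
        uart_ptr++;
    } else {
        uart_busy = false;           /* corresponds to clearing UDRIE    */
    }
}
```

Because the lock bit doubles as the interrupt-enable bit on the real UART, claiming the lock and starting transmission are a single action.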
The display-node is a good example to illustrate all this: if (1) reading messages from the bus and, for instance, updating the display of the clock, and (2) updating the display of button states in reaction to user-generated events were done without careful coordination, the display would become messed up (note that the output to the display panel is interrupt-driven since the panel is a slow device).
The display-node is therefore organized to execute 3 processes:
The "PanelProcess" handles
I/O to
the display panel; when the process wakes up, a parameter specifies
what
action needs to be done - for instance:
The "SignalProcess" handles software-generated events:
Therefore, if the CanRead process wants to write to the display, it must wake up the PanelProcess. It does this by calling "ProcessWait" (see below) with the calling argument specifying the text to be written.
Process definition for other types of nodes is very similar - for instance, all nodes have a CanRead and a SignalProcess.
Several small operating systems for this kind of application already exist as open software. However, given the extremely simple but application-specific nature of the requirements described in the preceding paragraphs, a substantially simpler OS has been developed to support process concurrency in the nodes of this system.
The development of this OS only required a minor effort; two rather small modules have been implemented:
Processes are implemented as ordinary C procedures that execute do-forever loops. The code of these procedures is activated whenever the corresponding process has something to do; the process relinquishes control once that work is finished. Activating and relinquishing are performed by the "kernel" - the code implemented by Kernel.c and Processes.c.
Kernel.c contains the procedure "main" that is launched after a node has been bootstrapped. During an initial phase, main calls initialization procedures - for peripheral libraries (i.e. CanInit, UartInit etc.), and also for initializing the software that implements the node: each node must have
Having the kernel call procedures in the application software is not very elegant, but there is no easy alternative. After this initialization phase, the approach is conceptually clean: the kernel executes an idle loop - de facto this is an "idle process" that simply waits for the interrupt handling to signal actions to be done:
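A minimal sketch of such an idle loop, under the assumption that interrupt handlers record pending work in a bitmask (the names pending, KernelSignal and process_table are illustrative, not the actual Kernel.c code; on the AVR, the test of the bitmask and any sleep would be done with interrupts disabled):

```c
#include <stdint.h>

#define MAX_PROCESSES 4

static volatile uint8_t pending;            /* one bit per process       */
typedef void (*ProcessBody)(void);
static ProcessBody process_table[MAX_PROCESSES];

/* Called from interrupt handlers to signal work for process 'id'. */
void KernelSignal(uint8_t id)
{
    pending |= (uint8_t)(1u << id);
}

/* One pass of the idle loop; the real kernel repeats this forever and
   sleeps while 'pending' is empty. */
void KernelIdleOnce(void)
{
    for (uint8_t id = 0; id < MAX_PROCESSES; id++) {
        if (pending & (1u << id)) {
            pending &= (uint8_t)~(1u << id);
            process_table[id]();            /* i.e. resume that process  */
        }
    }
}
```

The important property is that the idle loop itself holds no state: all scheduling information lives in the one-byte bitmask, which keeps the memory cost of the kernel predictable.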
To minimize the requirements for live memory and to make them predictable
Since programming uses plain C, the constraints implied by the concepts described here (for instance the restriction that resources be accessed as private to a given process) cannot be enforced - they must be observed as "programming discipline"; experience has shown that this is easy to observe and does not present a problem.
As already indicated, the procedures that implement the processes do not directly call any kernel functions except the synchronization procedures implemented in the module Processes.c. A short enumeration of these procedures provides a good illustration of the co-operation between the OS (the "kernel") and the code of the processes that implement the nodes:
"ProcessResume" is called to re-activate a target-process after another process has relinquished the processing unit. It is only called by the kernel:
"ProcessWait" is called by a process when it decides to suspend running and wait, typically for some action - like sending a message over the network or writing text to a display panel - to be completed (a parameter to the call indicates the condition for the wait):
"ProcessSync" is used for synchronising processes: it makes the calling process wait until a target process (specified as the argument of the call) becomes idle. The number of processes allowed to wait for synchronization by calling this procedure is limited to a single process - an easy way to prevent wait-deadlocks (an error is signaled if this condition is violated).
"StackSampling" is an auxiliary procedure for software development: when it is called, it stores the current size of the stack of a process. This makes it possible to verify that the stack allocated to a process (determined when the process was launched by the procedure "ProcessInit") contains enough spare memory. Although a rough estimation of the required size is possible (number and size of local variables, depth of procedure calls), much guessing remains, and it is good policy to verify the size effectively used. I have been surprised that I still have more than 1 kByte of spare memory in the display- and master-nodes (much more in the target- and button-nodes).
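One common way to implement such stack verification, sketched here under the assumption of the classic "stack painting" technique (the names StackPaint, StackSpare and the 0xAA pattern are illustrative; the project's StackSampling may work differently): the stack area is filled with a known pattern when the process is launched, and the spare room is later found by scanning for the first overwritten byte.

```c
#include <stdint.h>
#include <stddef.h>

#define STACK_PATTERN 0xAA

/* Paint the whole stack area once, at process launch. */
void StackPaint(uint8_t *stack, size_t size)
{
    for (size_t i = 0; i < size; i++)
        stack[i] = STACK_PATTERN;
}

/* Return the number of bytes never touched - the spare room. Assumes the
   stack grows downward, from stack[size-1] toward stack[0], as on AVR. */
size_t StackSpare(const uint8_t *stack, size_t size)
{
    size_t spare = 0;
    while (spare < size && stack[spare] == STACK_PATTERN)
        spare++;
    return spare;
}
```

Sampling the high-water mark this way costs nothing at run time except the occasional scan, which suits a system where every byte of live memory counts.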
Practically all software available at the time the project was started did not support interrupts and therefore needed re-writing (bear in mind that this dates back many years; new software will have been published in the meantime).
The CAN library for the AT90CAN processor provides the following procedures:
This library is published at http://www.mikrocontroller.net/articles/CAN_Bibiliothek_f%C3%BCr_AT90CAN_Prozessoren.
At the beginning of the re-write, the logic to be implemented was studied by reading procedures written for the MCP2515 Microchip controller - some parts of my code reflect the logic implemented in http://www.kreatives-chaos.com/artikel/ansteuerung-eines-mcp2515 (Fabian Greif). The target-node that implements the coupling of the garden segment uses an MCP2515 for the additional CAN interface, its software is a re-write of these procedures, but with support for interrupt handling added.
Service procedures are
These procedures have support for the locking mechanism described in the preceding Section. Compared with the corresponding procedures in the avr-libc library, they have the drawback that they cannot be used like standard Unix I/O - printf-style "format" strings are not supported. A couple of utility procedures for printing decimal and hexadecimal values have therefore also been created.
The original code is at http://www.mikrocontroller.net/topic/14792 (P. Dannegger). Here, a total re-write has not been necessary, but a lot of lavish extras have been eliminated and the software has been focused on accessing a few individual sensors.
The principal issue that caused some headache was the 1-wire software. The Dannegger code implements the timing issues by programming active waits; some of these periods are too short for efficiently handling them with an interrupt driven approach. But, as already mentioned, active waits are banned in this system in order to assure good real-time response.
The implementation of DS18B20 support is therefore based on a small cheat: the software does active waits, but it runs on a separate processor. This processor is not directly connected to the bus and has no other time-critical activities; it transfers the measured data to the master-node over SPI. For the master-node, this amounts to having a variable with the current temperatures that is "automagically" updated.
The master-node uses a DCF antenna and receiver for maintaining its time-of-the-day record. An algorithm has been implemented that analyses the pulse-width modulated output of this receiver and maintains this record as a result. The documentation available on the representation of the time broadcast by the DCF transmitter made it straightforward to implement that algorithm (see http://en.wikipedia.org/wiki/DCF77).
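The core of such a DCF77 decoder can be sketched as follows; this is a hedged illustration, not the project's actual algorithm (names and tolerance windows are assumptions). Each second the transmitter reduces its carrier for about 100 ms (bit value 0) or 200 ms (bit value 1); second 59 carries no reduction, marking the start of a new minute. The minute value itself is sent in seconds 21 to 27 as BCD, least significant bit first, followed by an even-parity bit.

```c
#include <stdint.h>

typedef enum { DCF_BIT0, DCF_BIT1, DCF_INVALID } DcfSymbol;

/* Classify one measured carrier reduction, duration in milliseconds. */
DcfSymbol DcfClassify(uint16_t pulse_ms)
{
    if (pulse_ms >= 40 && pulse_ms <= 140)
        return DCF_BIT0;                 /* nominal 100 ms reduction  */
    if (pulse_ms > 140 && pulse_ms <= 250)
        return DCF_BIT1;                 /* nominal 200 ms reduction  */
    return DCF_INVALID;                  /* noise: discard the minute */
}

/* Decode the minute field: bits[0..6] are the BCD minute bits of
   seconds 21..27, bits[7] is the even-parity bit of second 28.
   Returns 0xFF on a parity error. */
uint8_t DcfBcdMinutes(const uint8_t bits[8])
{
    static const uint8_t weight[7] = { 1, 2, 4, 8, 10, 20, 40 };
    uint8_t minutes = 0, ones = 0;
    for (uint8_t i = 0; i < 7; i++) {
        if (bits[i]) { minutes += weight[i]; ones++; }
    }
    ones += bits[7];                     /* include the parity bit    */
    return (ones & 1u) ? 0xFF : minutes;
}
```

The hours, day and date fields follow the same BCD-plus-parity pattern in later seconds of the minute.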
At present, this algorithm needs an entire 1-minute period to be received without any error before it can compose a time value. If reception is bad (for instance during thunderstorm activity in the area), there may be periods of up to several hours without any perfect reception: the algorithm should be improved by adding a feature that combines fragments of correctly received data from disjoint 1-minute periods.

The configuration of the system is declared and stored in the form of data structures in C procedures (i.e. Config.c and Events.c). Two kinds of configuration data are handled in slightly different ways:
Static configuration data is only mapped into the code of the master-node and of the display-node, the template of dynamic configuration data only into the master node. Therefore, only these nodes need to be re-built if corresponding configuration data is changed.
In terms of C programming this is a very straightforward approach. In practice - since various items of the configuration data structures refer to each other - it is somewhat complicated and requires careful handling. I would have liked to generate procedures like Config.c and Events.c with the help of some high-level language tool (probably using html or gtk), but - so far - there have always been more important issues requiring immediate attention.
At present, this house-bus system is used at two different sites with different configurations - moreover, one in the French-speaking part of Switzerland and one in Austria, requiring the GUI to use either French or German. The software architecture selected for implementing the system is perfectly adequate for this situation.
The original bit-banger application was based on assembly-programmed Motorola 68HC11 processors. Deciding on the technology to deploy for the new implementation was not evident. Googling very rapidly made me aware of the attractiveness of AVR processors, but by accident, and naively, I did some exploratory work using the Conrad C-Control product - which turned out to be a complete failure. On the positive side, this adventure was a lesson that good library support and a lively exchange in community discussions are essential when choosing a platform.
Against that background - and after much more exploration - I re-discovered the processors of the AVR family and decided to use CAN technology (and, implicitly, a bus structure) and to base programming on the C language. The decision to go for the avr-libc approach rather than Arduino was, to a large degree, a question of personal taste; maybe the Conrad adventure had also made me overly suspicious when I judged Arduino to be excessively "demo-oriented" in comparison to the "professional" aspects of avr-libc.
Software is written in C (except the in-line assembler procedures for context-switching between processes).
Development uses a Linux environment: a gcc AVR cross-compiler, Unix "make" and the avr-libc library - see http://www.nongnu.org/avr-libc. Host communication for downloading and debugging uses an AVR JTAG ICE device and the avrdude utility program - http://savannah.nongnu.org/projects/avrdude.
Avrdude is a command-line tool: to make it easier and more efficient to use, a frontend GUI has been developed - a major asset when debugging implies concurrently running and controlling several nodes. The following figures provide a short illustration of the properties of this tool (unpublished, but available on demand).
Presently the system does not have a feature for downloading and
bootstrapping new software over the bus: for most nodes an update is
very rarely necessary (as can be seen in Figure 4.2-4) - introducing a
bootstrap loader has been left as a low-priority issue for future
extensions.
The set of nodes that co-operate on the bus constitute a distributed system: although each node executes a perfectly autonomous program, these programs must be built and organized in a coherent fashion.
Compilation and linking are driven by the Unix "make" tool, with all instructions for "make" arranged in a Makefile. "Make" supports the Unix facility of environment variables - variables defined outside of "make" but accessible to "make" as parameters and for controlling conditional instructions. The Makefiles for nodes use such variables to differentiate between flavors of a given node (for instance to select between different procedures for doing device I/O), and to determine the CAN message identifier to which the node will respond; this allows target-nodes and button-nodes to be handled as families of nodes with different properties.
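On the C side, this typically amounts to conditional compilation driven by -D flags that the Makefile derives from the environment variables. The symbol names below (NODE_GARDEN, NODE_SPRINKLER, CAN_NODE_ID) are illustrative, not the project's actual ones; the Makefile might translate, say, NODETYPE=garden into -DNODE_GARDEN -DCAN_NODE_ID=12.

```c
#ifndef CAN_NODE_ID
#define CAN_NODE_ID 12        /* normally supplied by the Makefile      */
#endif

/* Select the device-I/O flavor of this node at compile time. */
#if defined(NODE_GARDEN)
static void DeviceOutput(unsigned char bits) { (void)bits; /* relays + valve */ }
#elif defined(NODE_SPRINKLER)
static void DeviceOutput(unsigned char bits) { (void)bits; /* 8 SSR circuits */ }
#else
static void DeviceOutput(unsigned char bits) { (void)bits; /* default stub   */ }
#endif

/* The node reports the identifier it was built with. */
unsigned char NodeId(void) { return CAN_NODE_ID; }
```

All flavors share one source tree; rebuilding with a different set of flags produces the image for a different member of the node family.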
These environment variables and the links to configuration-specific procedures can be defined manually by command-line operations. If the Avrgui frontend is used, the GUI takes care of the necessary actions.
This way of arranging the source code for the nodes on the bus has turned out to be an extremely efficient way of managing the code of such a system; it can be strongly recommended for this kind of project.
Directory containing                    Size (kBytes)   Size (lines)

Shared procedures                       250             4990
Configuration-specific procedures       150              750
Node-specific procedures (master)       330             4470
Node-specific procedures (display)      440             7780
Node-specific procedures (target)       100             2180
Node-specific procedures (buttons)       60             1280
All printed-circuit layouts and most schematics developed for this project are available as documents developed with the gEDA family of tools - http://www.gpleda.org; some drawings are powerpoint-style documents produced with LibreOffice.
Each node uses the hardware clock (a 16 MHz crystal) to generate clock interrupts at intervals of 10 msec. For target- and button-nodes, deviation from the nominal value and drift with aging and temperature are of no concern. This is not so for the master-node and for display-nodes, which need to maintain a time-of-the-day variable and must keep this value correct and up-to-date even during extended periods without access to an outside reference (no DCF signal, temporary non-availability of the master-node - see below). For these nodes, a hardware-clock correction algorithm has been implemented. It applies a smooth (no jumps) correction of the clock rate by "occasionally" adding an extra clock tick, or by suppressing one.
This correction is done by defining a correction polynomial that is specific to each Canstation pcb. The polynomial defines a configuration of bits of the hardware clock: when the hardware clock arrives at the corresponding value, a tick is added or suppressed (depending on the most significant bit used in the polynomial itself). These nodes base all clock-driven activities on the corrected clock rather than the hardware clock counter.
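One plausible reading of this mechanism can be sketched as follows; this is an assumption-laden illustration, not the project's actual code (names, field widths and the use of the top bit as the add/drop flag are all guesses). Whenever the masked bits of the raw tick counter all reach 1, one tick is inserted or swallowed, so the corrected count drifts smoothly toward the true rate.

```c
#include <stdint.h>

#define CORR_ADD_FLAG 0x8000u        /* MSB: 1 = add a tick, 0 = drop one */

typedef struct {
    uint16_t polynomial;             /* board-specific correction value   */
    uint16_t hw_ticks;               /* raw 10 ms tick counter            */
    uint32_t corrected;              /* corrected tick count              */
} ClockCorr;

/* Called from the 10 ms timer interrupt. */
void ClockTick(ClockCorr *c)
{
    uint16_t mask = c->polynomial & (uint16_t)~CORR_ADD_FLAG;

    c->hw_ticks++;
    if (mask != 0 && (c->hw_ticks & mask) == mask) {
        if (c->polynomial & CORR_ADD_FLAG)
            c->corrected += 2;       /* insert one extra tick             */
        /* else: swallow this tick - 'corrected' does not advance         */
    } else {
        c->corrected += 1;
    }
}
```

A sparser mask corrects more rarely; choosing the mask per board compensates the individual crystal's deviation without ever producing a jump in the time value.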
The master-node receives time-of-the-day information from the Frankfurt DCF transmitter (see the Section on Software for Peripherals). During the intervals between the reception of new and correct DCF information (there may be intervals of more than an hour), the corrected hardware clock is used for maintaining the time-of-the-day record. Every minute, the master-node broadcasts the value of this record over the bus - this serves as a kind of heartbeat mechanism.
The display-node, in turn, uses the time-of-the-day information received over the network to keep its clock synchronized with the master-node (and with the "official" day time).

Events are basically independent from each other, and the implementation of the scheduler handles them as such. This turned out to be an overly simplified approach:
So far, only the irrigation issue has been addressed - easy to implement by representing 8 or 16 irrigation circuits in a byte or a word and having an algorithm that sequentially activates the circuits piloted by the bits.
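The sequential activation can be sketched in a few lines (a minimal illustration with hypothetical names; the actual algorithm may differ). The circuits live in one byte, and each step turns the previous circuit off and the next one on, so only one sprinkler solenoid draws current at a time.

```c
#include <stdint.h>

static uint8_t active_mask;       /* bit i set = circuit i is open      */

/* Advance to circuit 'step' (0..7); call with step == 8 to close all.
   Returns the new output mask, which real code would copy to the port
   driving the solid-state relays. */
uint8_t IrrigationStep(uint8_t step)
{
    active_mask = (step < 8) ? (uint8_t)(1u << step) : 0;
    return active_mask;
}
```

A scheduler event then simply calls this once per programmed interval, stepping through the byte.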
Event sequences of the anti-burglar type are presently represented as individual events with adequately defined timing - neither easy to handle nor flexible.
Presently, solutions are being explored for a more general implementation - some kind of programs that define sequences of events and can be scheduled as single "super-events". These "programs" should be editable at the display-node, should accept call parameters and should allow some random properties. Hopefully this can be implemented within a couple of months - a first step is to have the master-node (the scheduler) use extended RAM.
The screenshots in this Section provide a short overview of the
features implemented in the display-node. They are taken from the
French version of the system, which, presently, is the most developed
example of the system.
The first image - Figure 4.1-1 - shows the "idle panel" - the root of the tree of available panels. The top-row contains the two principal action buttons:
The lower three buttons serve to display selected information - the
list of events that are valid for the current day, the list of all
nodes on the network plus the version of the node software, and a
display of "various information".
Figure 4.1-2: Idle panel
Figure 4.2-2 shows the list of devices that is displayed for switching devices on or off manually. For each of the listed devices, a square at the left border indicates that the device is already "on"; a time in the right column indicates when the device will be automatically switched off (devices can be of type "auto-off"). The triangle in the top-right corner indicates that the list has further pages and can be scrolled (the touch panel is programmed to recognize gestures). A device must first be selected by a hit on its row, and then controlled by a hit on the "thumbs up/down" button.
Figure 4.2-3 shows the panel that serves to edit the context of an event:
Figure 4.2-4 shows the result of a button hit in the idle panel for
obtaining the versions of the programs in the nodes; please note that
this is not static (compiled) information, but the result of a query to
each node: the identifier and name of the node, the version of
the program
and the date and time when the software of the node had been built.
These screenshots have been produced with an eDIP320-8 display panel
(320 pixels in a row): the 240 pixels per row of the smaller and less
expensive eDIP240-7 are just too restrictive for a satisfactory display
of all necessary information.
This appendix provides some very succinct information on the protocol - enough to give a general idea of how the protocol supports the dialogue between the nodes.
The definition of the CAN protocol supports two different types of messages: (1) "extended frames", which have a 29-bit identifier field, and (2) "standard frames", whose identifier field has 11 bits. Extended frames are exclusively used for dialogues between two specific nodes; all other communication uses standard frames as broadcast messages addressed - generally - to all nodes.
The identifier of an extended frame represents the message header as a packed record:

    00 ssssssss tttttttt cccccccc fff

    <ssssssss> ... frame sequence number
    <tttttttt> ... message type
    <cccccccc> ... node ID of the target CPU
    <fff>      ... flag bits
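The packing and unpacking of this 29-bit header is mechanical; a sketch in C (the macro names are illustrative, not the project's actual identifiers):

```c
#include <stdint.h>

/* Layout: 2 zero bits, 8-bit sequence number, 8-bit message type,
   8-bit target node ID, 3 flag bits - 29 bits in total. */

#define HDR_PACK(seq, type, cpu, flags)                       \
    ( ((uint32_t)((seq)   & 0xFFu) << 19) |                   \
      ((uint32_t)((type)  & 0xFFu) << 11) |                   \
      ((uint32_t)((cpu)   & 0xFFu) <<  3) |                   \
      ((uint32_t)((flags) & 0x7u)) )

#define HDR_SEQ(id)   ((uint8_t)(((id) >> 19) & 0xFFu))
#define HDR_TYPE(id)  ((uint8_t)(((id) >> 11) & 0xFFu))
#define HDR_CPU(id)   ((uint8_t)(((id) >>  3) & 0xFFu))
#define HDR_FLAGS(id) ((uint8_t)((id) & 0x7u))
```

With this layout, the CAN acceptance filter of a node can be set to mask exactly the <cccccccc> field (bits 3..10), so that the hardware only passes frames addressed to the node's own ID.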
Note that the filtering mechanism of CAN allows the target-node to recognize records that contain its node identifier.
Presently, this format is only used for the transmission of multi-frame configuration data between the master-node and the display-node. Initially, it had been planned to support connection-oriented communication, but this has not yet been implemented.

Standard frames are used for representing various kinds of single-frame broadcast messages; again, the identifier represents the message header as a packed record:
    p bb ssssssss

    <p>        ... priority bit (0 = high priority)
    <bb>       ... message category:
                   11 ... general broadcast
                   01 ... to-master message
                   10 ... state-change request
                   00 ... reserved for future use
    <ssssssss> ... ID of the transmitting node
The identifier is too short to also incorporate the message-type code - therefore, the first byte of standard messages is reserved for the message type. The following list enumerates all currently defined message types:
Category / Type                          Description

To-master:
  Request device state change            Request master to modify a device state
  Event-list solicitation                Ask master for the event list
  Device-state info solicitation         Ask master for the state of a device
  Request device configuration           Ask master to describe a configuration

General broadcast:
  Time & date                            Broadcast time & date
  Status query                           Ask for an echo of the node state
  Status reply                           Node state response
  Action-launch                          Master requests target to change state
  Master initialisation notification     Master notifies that it has just restarted
  Data frame                             Broadcast data (e.g. temperature)
  Transmit device parameter              Define a device parameter value
  Transmit dynamic event flags           Define dynamic flags for an event
  Solicit CPU information                Ask a CPU to send program info
  CPU information reply                  Response with program info