Troubleshoot the Z-Wave Network

The Z-WAVE EXPERT USER INTERFACE is perfectly suited to troubleshoot networks and find and fix problems. Troubleshooting a Z-Wave network works along the lines of the communication stack.

Problems can occur on the radio layer, the networking layer, and the application layer. To identify and fix them, it makes sense to work through the stack from the bottom up, starting with the radio layer.

Most of the troubleshooting functions are accessible under the menu item Analytics. However, this menu item is only displayed if the firmware on the Z-Wave chip supports the special functions needed for troubleshooting.

Radio Layer

Figure 8.1: Background Noise
Image c2backgroundnoise

Problems on the radio layer come from interference and noise generated by defective or nonconforming electrical equipment that causes electromagnetic emissions (baby monitors, old cordless phones, wireless speakers, motors, etc.). Other Z-Wave networks with unusually high traffic can also be a root cause of problems. It is also possible that certain other wireless networking services (first and foremost 4G cellular routers or base stations, also called LTE) cause interference if they are too close to the Z-Wave network.

The menu item Analytics > Background Noise offers a chart displaying the background noise on the two communication channels used by Z-Wave. Channel 1 refers to the 9.6 kbit/s and 40 kbit/s communication modes, channel 2 to the 100 kbit/s data rate. Figure 8.1 shows this chart. There is an obvious noise floor with some additional “needles.” This noise floor (in Figure 8.1 at about -85 dBm for channel 1 and -90 dBm for channel 2) is the minimum level a Z-Wave transceiver's signal must surpass in order to be decoded by the Z-Wave receiver.

The lower the noise level, the better the wireless situation. Noise levels below -95 dBm are very good; levels above -70 dBm are very bad.
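The two thresholds can be turned into a small helper for rating a reading taken from the chart. This is only a minimal sketch: the function names are hypothetical, and the label "acceptable" for the band between -95 dBm and -70 dBm is an assumption, since the text only names the two extremes.

```python
def classify_noise_floor(noise_dbm: float) -> str:
    """Rate a measured noise floor using the two thresholds from the text."""
    if noise_dbm < -95:
        return "very good"
    if noise_dbm > -70:
        return "very bad"
    # Assumption: the text does not name the band in between.
    return "acceptable"


def margin_above_noise_db(signal_dbm: float, noise_dbm: float) -> float:
    """Return how far a received signal sits above the noise floor, in dB.

    A positive value means the signal surpasses the noise floor.
    """
    return signal_dbm - noise_dbm


# Noise floors read from Figure 8.1 for the two channels.
for channel, noise in (("channel 1", -85.0), ("channel 2", -90.0)):
    print(channel, classify_noise_floor(noise))
```

Both values from Figure 8.1 fall into the middle band, which matches the description of a visible but not alarming noise floor.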

Please note that this noise level is measured at the controller's location, or wherever the hardware running Z-Way is positioned. It may make sense to move the measuring device around to see the noise level at different locations. Since the Analytics > Background Noise chart is only updated once per minute, you may want to use the tool Analytics > Noise Gauge instead, as shown in Figure 8.2. Its display is updated every two seconds.

Figure 8.2: Realtime Measurement of Background-Noise
Image c5noisegauge

Figure 8.3: Powerbank to power the Z-Way controller for mobile use
Image powerbank

If the noise floor is too high, you need to find the source of the noise. The device running Z-Way can also be used as a mobile device, thanks to its built-in Wi-Fi. In this case, it needs to be powered by a power bank as shown in Figure 8.3.

Walking around with the Noise Gauge enabled may help to track down the jamming device. The closer the controller hardware gets to the source of the noise, the higher the background noise level will be.

The “needles” above the noise floor show communication from other Z-Wave networks nearby. This is not a real problem unless the other networks generate heavy traffic. As a rule of thumb, traffic from other Z-Wave networks should not occupy more than 30 % of the time. If there is more traffic, the other Z-Wave network needs to be troubleshot first. The chart Analytics > Network Statistics, as shown in Figure 8.4, shows the ratio of own traffic to traffic seen from other networks.
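The 30 % rule of thumb can be checked with a few lines of arithmetic. The sketch below assumes that the on-air duration of each foreign frame is known (for example from a sniffer export); the function names and the input format are hypothetical and not part of Z-Way.

```python
from typing import Iterable


def foreign_airtime_fraction(frame_durations_ms: Iterable[float], window_s: float) -> float:
    """Fraction of the observation window occupied by foreign Z-Wave frames.

    frame_durations_ms: on-air duration of each foreign frame (assumed input).
    window_s: length of the observation window in seconds.
    """
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    return sum(frame_durations_ms) / (window_s * 1000.0)


def foreign_traffic_is_heavy(frame_durations_ms: Iterable[float], window_s: float,
                             limit: float = 0.30) -> bool:
    """Apply the 30 % rule of thumb from the text."""
    return foreign_airtime_fraction(frame_durations_ms, window_s) > limit


# Example: 100 foreign frames of 10 ms each, observed over 10 seconds.
durations = [10.0] * 100
print(foreign_airtime_fraction(durations, 10.0), foreign_traffic_is_heavy(durations, 10.0))
```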

Figure 8.4: Network Statistics Display
Image c5networkstatistics

Network Layer - Devices

Devices can have two faulty states:

Figure 8.5: Status Page Z-Way
Image c5networkstatus

Figure 8.6: Packet Sniffer
Image c5sniffer

Another option to detect faulty devices is the Network > Timing Info View.

Figure 8.7: Packet timing of a fresh Z-Wave network
Image c5timinginfo1

Figure 8.7 shows this view. The timing information lists one entry for every communication between the controller and the device. The number refers to the time (in x * 10 ms) the message took before being confirmed; the color gives a rough indication of what happened:

Figure 8.7 shows the situation in a network that has just been installed. It can be seen that there is only communication with a few devices, e.g. no polling of sensors, etc. This is not a problem. The chart shows that devices 4, 6, and 31 are in direct range and that all communication with them works perfectly well (green, low number). Device 14, however, seems to be a real problem child: the controller keeps trying to reach this node but always fails. At some point, the controller will accept that node 14 is dead and put it on the “failed node list.”
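The way such a timing list is read can be sketched in code. The data layout below (a list of entries per node, each holding the displayed number and a delivered flag) is an assumption made for illustration; it is not the format Z-Way uses internally. The only fact taken from the text is that the displayed number is in units of 10 ms.

```python
from typing import List, Optional, Tuple, TypedDict

# Hypothetical entry: (displayed number in units of 10 ms, delivered?)
Packet = Tuple[int, bool]


class NodeSummary(TypedDict):
    packets: int
    failures: int
    avg_ms: Optional[float]
    suspect: bool


def ticks_to_ms(ticks: int) -> int:
    """Convert the displayed number (x * 10 ms) into milliseconds."""
    return ticks * 10


def summarize_node(packets: List[Packet]) -> NodeSummary:
    """Summarize the timing entries of one node."""
    delivered_ms = [ticks_to_ms(ticks) for ticks, ok in packets if ok]
    failures = sum(1 for _, ok in packets if not ok)
    return {
        "packets": len(packets),
        "failures": failures,
        "avg_ms": sum(delivered_ms) / len(delivered_ms) if delivered_ms else None,
        # A node for which every single attempt failed is a candidate for replacement.
        "suspect": bool(packets) and failures == len(packets),
    }


# Made-up entries: one healthy node and one node that never answers.
print(summarize_node([(2, True), (4, True)]))
print(summarize_node([(150, False), (150, False)]))
```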

Figure 8.8: Packet timing of an aged Z-Wave network
Image c5timinginfo2

Figure 8.8 shows a network that is somewhat more complex, carries more communication, and has aged. Again, node 20 is a defective device that just needs to be replaced. The following interesting patterns can be seen:

Network Layer - Weak or Wrong Routes

It is best to already know which devices are causing trouble. In this case, the status of the device can be checked quickly and it is possible to dig deeper into the routing layer. Figure 8.9 shows the routing table of a controller. Technically, this is not a routing table but a matrix indicating which devices are wireless neighbors of each other. Nevertheless, it is a good starting point for a deeper investigation. Having many neighbors is a good thing, since the routing algorithm then has many options in case something goes wrong. On the other hand, having just one route to the rest of the network may cause trouble if this route is faulty or the neighboring device is moved.
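The same check can be expressed programmatically. The following sketch assumes the neighbor matrix is available as a mapping from node ID to the set of neighboring node IDs; this representation, the function name, and the example data are hypothetical.

```python
from typing import Dict, List, Set


def weakly_connected(neighbors: Dict[int, Set[int]], min_neighbors: int = 2) -> List[int]:
    """Return the nodes that have fewer than min_neighbors neighbors.

    With the default of 2, this lists nodes that depend on a single link
    (or have no neighbor at all) to reach the rest of the network.
    """
    return sorted(node for node, seen in neighbors.items() if len(seen) < min_neighbors)


# Made-up neighbor table: node 4 can only be reached via node 1.
table = {
    1: {2, 3, 4},
    2: {1, 3},
    3: {1, 2},
    4: {1},
}
print(weakly_connected(table))
```

Nodes reported here are the first candidates for the link test described next, or for adding a routing node nearby.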

Figure 8.9: Neighbor-Table of a controller
Image c3neighbortable

The next step is to check individual routes. The configuration page of every device offers a link health check that allows testing the links from this very device to its neighbors.

While the neighborhood table shows whether two devices are neighbors, the link test checks how good the wireless link between them is. Not all devices on the market support this link test yet, but an increasing number do. Figure 8.10 shows this dialog within Z-WAVE EXPERT USER INTERFACE. Every link has a color indicator (green = ok, red = bad, grey = unknown) and a time stamp that shows when the test was last performed.

Figure 8.10: Link test of a node
Image c5linktest

Please note that the link check is a momentary analysis only and does not give any information about the history of the link quality.

Application Layer Settings

On the application layer, the problem is usually not a malfunctioning device but a wrong configuration. Z-WAVE EXPERT USER INTERFACE allows monitoring and changing the configuration values.

Polling

Heavy polling of devices causes network traffic that leads to delays. A quick look at the sniffer, as shown in Figure 8.6, reveals whether there is too much polling.
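If the sniffer output is copied into a list, the busiest destinations can be ranked with a short script. The record format below (timestamp in seconds, destination node ID) is a hypothetical simplification of what the sniffer displays, and the text gives no fixed limit for "too much" polling, so the sketch only ranks nodes by frame rate.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

# Hypothetical record: (timestamp in seconds, destination node ID)
SnifferRecord = Tuple[float, int]


def frames_per_minute(records: Sequence[SnifferRecord], window_s: float) -> Dict[int, float]:
    """Average number of frames per minute sent to each node."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    counts = Counter(node_id for _, node_id in records)
    return {node_id: count * 60.0 / window_s for node_id, count in counts.items()}


def busiest_nodes(records: Sequence[SnifferRecord], window_s: float,
                  top: int = 3) -> List[Tuple[int, float]]:
    """Return the nodes with the highest frame rate, busiest first."""
    rates = frames_per_minute(records, window_s)
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)[:top]


# Made-up capture of two minutes: node 5 is polled every 20 seconds.
capture = [(float(t), 5) for t in range(0, 120, 20)] + [(10.0, 7)]
print(busiest_nodes(capture, window_s=120.0))
```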


Dead Associations

Figure 8.11: Association Dialog in Z-WAVE EXPERT USER INTERFACE
Image c4association

Associations enable direct communication between devices. If there is more than one device in an association group, the devices receive a command one after another. A very common problem is that associations are set while the network is being built and certain devices are later removed or simply fail. If such a vanished node is still in an association group, the sending device will always try to communicate with this node first before communicating with the other nodes. The result is a delay. The device-specific configuration overview, as shown in Figure 8.11, displays all associations that are set. It is possible to read the current associations back from the device and to remove or set associations.
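Finding such dead entries amounts to comparing each association group with the list of nodes that are still alive. The sketch below assumes both are available as plain Python data; the structures and the function name are hypothetical and do not reflect the Z-Way API.

```python
from typing import Dict, List, Set


def dead_association_targets(associations: Dict[int, List[int]],
                             live_nodes: Set[int]) -> Dict[int, List[int]]:
    """Return, per association group, the target nodes that no longer exist.

    associations: group number -> node IDs in that group (assumed input).
    live_nodes: node IDs currently included and working in the network.
    """
    dead: Dict[int, List[int]] = {}
    for group, targets in associations.items():
        missing = [node_id for node_id in targets if node_id not in live_nodes]
        if missing:
            dead[group] = missing
    return dead


# Made-up example: node 14 was removed but is still listed in group 2.
groups = {1: [1], 2: [4, 14, 6]}
print(dead_association_targets(groups, live_nodes={1, 4, 6}))
```

Every node reported here should be removed from the respective group in the association dialog.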

Wrong Wakeup Settings

Wrong wakeup settings may result either in too much traffic, which drains the battery, or in slow responses to sensor update requests or configuration changes. The status overview page, as shown in Figure 8.5, gives a simple overview of the wakeup settings of the different battery-operated sleeping devices. The device-specific configuration settings allow changing these settings. Besides the wakeup interval, they also allow setting or changing the Node ID of the controller that holds the mailbox of this device. This setting must reflect the actual situation in the network.
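Both checks, the interval and the mailbox Node ID, can be sketched as a simple audit. The data structure is hypothetical, and the interval bounds used as defaults (5 minutes and 24 hours) are illustrative assumptions only; sensible values depend on the device and the use case.

```python
from typing import Dict, List, NamedTuple


class WakeupSetting(NamedTuple):
    interval_s: int       # wakeup interval in seconds
    target_node_id: int   # Node ID of the controller holding the mailbox


def check_wakeup(settings: Dict[int, WakeupSetting], controller_node_id: int,
                 min_interval_s: int = 300, max_interval_s: int = 86400) -> List[str]:
    """Return human-readable findings for questionable wakeup settings.

    min_interval_s and max_interval_s are assumed bounds, not Z-Wave limits.
    """
    findings: List[str] = []
    for node_id, setting in sorted(settings.items()):
        if setting.target_node_id != controller_node_id:
            findings.append(
                f"node {node_id}: wakeup target is node {setting.target_node_id}, "
                f"expected {controller_node_id}"
            )
        if setting.interval_s < min_interval_s:
            findings.append(f"node {node_id}: interval {setting.interval_s} s may drain the battery")
        elif setting.interval_s > max_interval_s:
            findings.append(f"node {node_id}: interval {setting.interval_s} s may delay updates")
    return findings


# Made-up settings for three battery-operated devices; the controller is node 1.
devices = {
    7: WakeupSetting(interval_s=3600, target_node_id=1),
    9: WakeupSetting(interval_s=60, target_node_id=1),
    12: WakeupSetting(interval_s=3600, target_node_id=5),
}
for finding in check_wakeup(devices, controller_node_id=1):
    print(finding)
```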

Summary

Table 8.1 summarizes the “10 root causes of Z-Wave network problems” and suggests how to find and fix them.


Table 8.1: Troubleshooting on Z-Wave networks
No. | Cause | How to find? | How to fix?
1 | Noise by other transmitters | Background Noise Chart | Find them and turn them off
2 | Noise by other Z-Wave networks | Background Noise Chart, Network Statistics | Talk to the neighbor ;-)
3 | Faulty devices | Status Page, Failed Node | Remove them or replace them
4 | Crazy Devices (always sending) | Sniffer | Remove them or replace them
5 | Weak Link | Neighbor-Table, Link Health in Configuration Page | Add more routing nodes, move devices
6 | Heavy Fading | Timing Infos | Network Reorganization, more devices
7 | Wrong Routing | Timing Infos | Network Reorganization
8 | Wrong Polling | Sniffer | Change and Save
9 | Wrong Wakeup Intervals | Status Page | Change and Save
10 | Dead Nodes in Assoc. Groups | Association display in Configuration Page | Change and Save