Many organisms use multimodal signals to communicate (Johnstone, 1996; Partan and Marler, 1999; Bradbury and Vehrencamp, 2011; Higham and Hebets, 2013; Uy and Safran, 2013). Although the use of multiple signal modalities may improve communication efficacy through better detection, localization, and discrimination (Guilford and Dawkins, 1991; Narins et al., 2005; Partan, 2017), the costs associated with attracting predators and parasites may also increase (Candolin, 2003; Halfwerk et al., 2014). In the case of acoustic communication, noise and transmission properties of the environment may shape acoustic signalling behaviour and signal design (Morton, 1975; Ryan and Brenowitz, 1985; Gerhardt and Huber, 2002; Slabbekoorn and Smith, 2002; Jones and Teeling, 2006; Grafe et al., 2012). Multimodal communication is favoured in noisy environments because it allows signallers to avoid masking interference by switching to quieter modalities (Partan, 2017).
Acoustic signals are the main mode of communication in anurans (Gerhardt and Huber, 2002; Wells, 2007). However, background noise generated by conspecifics, heterospecifics, or the environment can be a major impediment to effective signal transmission and reception (Narins and Zelick, 1988; Brumm, 2013). Noise generated by social aggregations usually fluctuates over time and thus signallers may show flexible behavioural responses (Zelick and Narins, 1983; Schwartz and Wells, 1984; Grafe, 1996; Grafe, 2003; Schwartz and Bee, 2013). Receivers may achieve release from masking by listening in the gaps or dips of fluctuating noise, a solution to the cocktail party effect encountered by human listeners (Hulse, 2002; Velez and Bee, 2011). Less is known about long-term adaptive processes in response to more constant noisy environments such as those experienced by stream-breeding anurans (Brumm and Slabbekoorn, 2005). Continuous background noise has been identified as a selective agent in stream-breeding anurans, having favoured advertisement calls of higher dominant frequencies as a response to masking interference from flowing water (Boeckle et al., 2009; Goutte et al., 2016; Röhr et al., 2016). Another strategy anurans use to reduce auditory masking is to use visual signals (Hödl and Amézquita, 2001). In many anurans, the vocal sac serves as a visual signal and is mostly displayed simultaneously with sound production (Rosenthal et al., 2004; Taylor et al., 2011; Starnberger et al., 2014). Other visual signals, such as foot flagging, however, are either independent of or sequentially linked to acoustic signals (Grafe and Wanger, 2007; Preininger et al., 2013; Caldart et al., 2014). Advantages of such multimodal communication include both content-driven and efficacy-driven selection pressures, such as providing backup, conveying multiple messages, or reducing signal degradation and attenuation through the environment (Guilford and Dawkins, 1991).
These explanations mostly assume that selection pressures are consistent over time. However, it is becoming apparent that multimodal signalling is also likely to have evolved in, and to be maintained by, fluctuating social and ecological environments (Greenfield and Rodriguez, 2004; Bro-Jørgensen, 2010). For example, wolf spiders vary their visual and seismic signals according to the substrate they are on (Gordon and Uetz, 2011), humpback whales switch from vocal signals to surface-generated sounds (Dunlop et al., 2010), and, when viewed from the receiver’s perspective, fringe-lipped bats depend on multimodal cues in locating their frog prey when confronted with increased levels of acoustic complexity (Rhebergen et al., 2015).
In this study, we test the fluctuating environment hypothesis for the maintenance of multimodal signalling (Bro-Jørgensen, 2010). In particular, we hypothesize that acoustic signals have not been lost in stream-breeding frogs because stream noise levels vary seasonally with rainfall. Acoustic signalling will be at an advantage under quieter conditions, whereas visual signals will prevail when the noise of rushing water is high. Such context-dependent dynamic selection regimes have recently gained wider attention (Bro-Jørgensen, 2010; Wilgers and Hebets, 2011) and enhance our understanding of the flexibility seen in the use of multimodal signals in stream-breeding anurans.
In this study we investigate the flexibility in the use of visual and acoustic signals in Bornean foot-flagging frogs, Staurois parvus. They have solved the problem of continuous broadband low-frequency stream noise by modifying their advertisement calls in addition to using numerous visual signals, foot flagging being the most conspicuous (Grafe et al., 2012). Advertisement calls in S. parvus vary in note number, with males increasing the pitch, sound pressure, and duration of notes with note number, consistent with the notion that they can flexibly adjust to enhance contrast and signal range depending on background noise and receiver proximity (Grafe et al., 2012). Furthermore, males produce conspicuous visual foot flags allowing them to potentially completely forgo vocalizing in noisy conditions. We hypothesized that males would reduce the number of advertisement calls and increase note number at high noise levels while also increasing the number of foot flag displays compared to less noisy conditions.
Although background noise in the environment of S. parvus is nearly continuous over a time period of minutes to hours, it varies strongly depending on rainfall. Especially in smaller streams with small catchment areas, which are typical habitats of S. parvus, background noise levels vary considerably between days and between dry and wet seasons. We recorded stream noise during periods of low and high rainfall for a preliminary assessment of variability in stream noise experienced by stream-breeding anurans.
In the present study, our aims are to (1) examine variation in stream noise at one breeding site and (2) use stream noise playback experiments to test the fluctuating environment hypothesis for the maintenance of multimodal signalling.
Study site and species
We studied a population of S. parvus from 28 June to 9 July 2015 and from 20 to 27 May 2016 in the Ulu Temburong National Park, Brunei Darussalam, Borneo. The study site was a section of a small freshwater stream that merges into the Belalong River next to the Kuala Belalong Field Studies Centre (115°09´E, 4°33´N). Stream width varied between 0.4 and 5 m depending on rainfall. Daily temperatures varied between 24 and 27°C. Annual precipitation at the site in the years 2005–2016 ranged from 4,440 to 6,770 mm.
Staurois parvus is a diurnal ranid frog, endemic to Bornean primary forests. It is found in small rocky streams with fast-flowing water (Inger et al., 2017). The snout-urostyle length of males in the study population averaged 21.5 ± 0.5 mm (range 20.7–22.7; N = 13; Grafe et al., 2012). The white webbing between the toes of their hind legs and their white ventral surface contrast strongly with their cryptic dark grey-brown dorsal body and the black rocks of the streams they inhabit. Males conspicuously foot flag during agonistic male-male encounters by raising and rotating their legs while exposing their hind feet webbing (Harding, 1982; Grafe et al., 2012). Males also produce advertisement calls that are functionally linked with foot flagging. Calls usually precede foot flags and the acoustic signal serves to alert receivers to the subsequent visual signal (Grafe et al., 2012).
Ambient noise measurements
Ambient noise was measured along a 50 m section of the Sungai Mata Ikan, a small tributary of the Belalong River inhabited by S. parvus. Noise was measured on two days: a quiet day after a prolonged dry period (31 May 2015) and a noisy day following a period of heavy rains (22 January 2015). Noise measurements and playback experiments were conducted along the same stream section. On each day, single 20-min recordings were made at a distance of 1 m from each of three small waterfalls using an omni-directional microphone (ME 62/K6, Sennheiser electronic GmbH and Co. KG, Germany) and a solid-state recorder (PMD 661, Marantz, Japan; settings 44.1 kHz, 16 bit), totalling 60 min of recordings per day. Identical equipment and gain settings were used for all recordings. An arbitrary 10-s segment of each 20-min recording was analysed for noise spectrum levels (using the new selection spectrogram view option) across the entire frequency range of the recordings in Raven Pro 1.5 (Bioacoustics Research Program 2013, Cornell Lab of Ornithology, USA). Logarithmic averages of the three spectrum measurements for each day were calculated to give mean values for quiet and noisy conditions. Measurements are in relative dB only, but can be compared between the two days because the recording settings were identical.
Acoustic playback experiments
To test the fluctuating environment hypothesis, we conducted acoustic playback experiments using high-intensity stream noise with 10 males in the field during early morning (7:00–8:20) and late afternoon (17:00–18:20) when activity peaks (Grafe et al., 2012). Pseudoreplication was avoided by individually marking each male using unique toe-clips after the playbacks (see Grafe et al., 2011 for methodology and justification of toe-clipping).
Males were located in the field and presented with a 2-min pre-playback control followed by a 2-min, high-intensity stream noise playback and a 2-min silent post-playback period. Playbacks were conducted at a time when the stream had a very low water level and thus stream noise levels were low. The playback stimulus was presented from a portable solid-state recorder (Microtrack 24/96, M-Audio, USA) connected to an external battery-amplified speaker (SME-AFS, Saul Mineroff Electronics, Inc., USA; frequency response: 100 Hz–12 kHz) placed 65–128 cm from the focal male without disturbing it. An arbitrary 2-min section of the 60-min pre-recorded stream noise was chosen before each trial. The speaker could not be placed at a predetermined distance in the rugged, slippery terrain and thus the distance between frog and speaker was measured after the experiment to determine the sound pressure level of each playback. Playback intensities averaged 90.3 ± 2.1 dB (re: 20 μPa; mean ± SD) and ranged from 86 to 94 dB (re: 20 μPa).
Each experiment lasted about 6 min. Sound pressure levels (dB re 20 μPa; flat weighted and fast response setting) were measured at the males’ position using a sound pressure level meter (407703A, Extech Instruments, USA). Each trial was documented through observation and video recording of the visual and acoustic activities of each male using a DSLR camera (600D, Canon, Japan). Analyses were conducted blind with respect to the playback period. The audio recordings were later extracted from the video recordings using Adobe Audition version 1.5 (Adobe Systems, USA). A paired t-test was used to compare pre-playback to playback responses. The statistical program BIAS (version 8.2; epsilon-Verlag GbR, Germany 1989–2006) was used to analyse the data. Average dB values were determined after conversion to a linear scale (Pa).
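The averaging of dB values on a linear scale described above can be illustrated with a minimal Python sketch. It assumes that each level is converted to pressure in Pa (re 20 μPa), the pressures are averaged arithmetically, and the mean is converted back to dB; the function names and the example readings are hypothetical and are not taken from the study.

```python
import math

P_REF = 20e-6  # reference pressure in Pa (20 uPa), as given in the text


def db_to_pa(level_db):
    """Convert a sound pressure level in dB (re 20 uPa) to pressure in Pa."""
    return P_REF * 10 ** (level_db / 20)


def pa_to_db(pressure_pa):
    """Convert a pressure in Pa to a sound pressure level in dB (re 20 uPa)."""
    return 20 * math.log10(pressure_pa / P_REF)


def mean_spl(levels_db):
    """Average dB values on the linear pressure scale, then convert back to dB."""
    pressures = [db_to_pa(level) for level in levels_db]
    return pa_to_db(sum(pressures) / len(pressures))


# Hypothetical readings for illustration only (not the study's measurements)
example_levels = [86.0, 90.0, 94.0]
print(f"arithmetic mean of dB values: {sum(example_levels) / len(example_levels):.2f} dB")
print(f"mean after conversion to Pa:  {mean_spl(example_levels):.2f} dB")
```

Because the dB scale is logarithmic, the mean obtained this way is weighted towards the louder readings and is therefore somewhat higher than the plain arithmetic mean of the dB values.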
Variation in stream noise
Ambient stream noise spectra from quiet and noisy days had very similar shapes but differed in spectrum levels (Figure 1). On both days, noise levels were highest at low frequencies and decreased with increasing frequency. Spectrum levels were consistently higher by approximately 22 dB on the noisy day. A difference of 23 dB was found at 5,580 Hz, the average dominant frequency of the advertisement call of S. parvus (Grafe et al., 2012). The maximum difference was 26 dB at 300 Hz. The minimum difference was 20 dB at 6,300 Hz due to insects calling at this frequency during the quiet day.
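To give a sense of scale for these level differences, the following short Python sketch converts a difference in dB into the corresponding ratios of sound pressure and sound power using the standard decibel definitions. The function names are illustrative. Because the conversion depends only on the difference, it also applies to relative dB values such as those reported here.

```python
def pressure_ratio(delta_db):
    """Ratio of sound pressures corresponding to a level difference in dB."""
    return 10 ** (delta_db / 20)


def power_ratio(delta_db):
    """Ratio of sound powers (or intensities) corresponding to a level difference in dB."""
    return 10 ** (delta_db / 10)


# Level differences reported in the text (minimum, typical, at call frequency, maximum)
for delta in (20, 22, 23, 26):
    print(f"{delta} dB: pressure x{pressure_ratio(delta):.1f}, power x{power_ratio(delta):.0f}")
```

A difference of 20 dB corresponds to a tenfold difference in sound pressure, so the roughly 22 dB difference between the two days amounts to more than a tenfold increase in sound pressure on the noisy day.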
Stream noise playback experiments
Playback of high-level stream noise significantly influenced the amount of visual and acoustic signalling by male S. parvus. All 10 males foot flagged during the playback (6 during the pre-playback) while only 5 males vocalized during the playback (9 during the pre-playback). Significantly more foot flags were given during the 2-min playback period than during the 2-min pre-playback period (Wilcoxon matched pairs, Z = 2.07, p < 0.05, n = 10; Figure 2). In contrast, the number of advertisement calls given in response to the noise playback dropped significantly compared to the pre-playback control (Wilcoxon matched pairs, Z = 2.37, p < 0.05, n = 10; Figure 2). During the post-playback control, there was a non-significant reversal in the use of visual and acoustic signals compared to the playback period. During the noise playback, males produced significantly more foot flags than calls (Wilcoxon matched pairs, Z = 2.03, p < 0.05, n = 10). Furthermore, as predicted, calls contained significantly more pulses during the playback than during the pre-playback control (paired t-test, t = 3.12, p < 0.05, n = 5; Figure 3).
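For readers unfamiliar with the Wilcoxon matched-pairs test used above, the following Python sketch shows one common way to compute it. It is not the procedure implemented in BIAS: it assumes that zero differences are dropped, that tied absolute differences receive average ranks, and that Z is obtained from the normal approximation without tie or continuity correction. The counts are made up for illustration and are not the study's data.

```python
import math


def wilcoxon_matched_pairs(before, after):
    """Wilcoxon matched-pairs signed-rank test (normal approximation).

    Returns (n, w_plus, z, p): the number of non-zero differences, the sum of
    ranks of positive differences, the Z statistic, and the two-sided p-value.
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the standard normal
    return n, w_plus, z, p


# Hypothetical foot-flag counts per male (NOT the study's data)
pre_playback = [0, 2, 0, 1, 3, 0, 0, 4, 2, 1]
playback = [3, 5, 2, 4, 3, 1, 2, 6, 5, 2]
n, w_plus, z, p = wilcoxon_matched_pairs(pre_playback, playback)
print(f"n = {n}, W+ = {w_plus}, Z = {z:.2f}, p = {p:.4f}")
```

With samples as small as those here, statistical packages may use exact p-values or apply corrections, so the output of this sketch need not match the values reported by dedicated software.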
Stream noise can vary considerably, especially in upper catchment areas of small streams in which water levels change soon after the onset of rainfall, but this has rarely been measured. In our study, the average differences in amplitude were 22 dB or more and playbacks of high-intensity stream noise induced a switch from predominantly acoustic to predominantly visual signalling in Staurois parvus. The shape of the noise spectra was similar to that of stream noise recordings from similar-sized streams in Nepal (Dubois and Martens, 1984), Malaysian Borneo (Boeckle et al., 2009), Colombia (Vargas-Salinas and Amézquita, 2013), and Italy (Lugli and Fine, 2003), with more energy at low frequencies, suggesting that similar selection pressures act on the design of communication signals in stream-breeding anurans across continents and even across larger taxonomic groups.
We show that S. parvus can rapidly change the number of visual and acoustic signals it produces depending on fluctuating stream noise. Presented with playbacks of high-level stream noise during less noisy days, males significantly decreased the number of calls produced and at the same time significantly increased the number of foot flags. This rapid response strongly supports the fluctuating environment hypothesis for the maintenance of multimodal signalling (Bro-Jørgensen, 2010).
It seems likely that multimodal signalling not only has current utility in background noise, but is an evolved response to continuous background noise in all six species in the genus Staurois, all of which inhabit rocky streams (Grafe and Wanger, 2007; Preininger et al., 2009; Arifin et al., 2011; Grafe et al., 2012). It appears that foot flagging is a derived character that has evolved independently mainly in anuran species that communicate along fast-flowing streams (Hödl and Amézquita, 2001).
Although all anurans that show visual signals also vocalize (Hödl and Amézquita, 2001), some species in the genus Staurois have conspicuously reduced their calling activity in favour of visual signals. For example, S. guttatus produces only short, one- or two-note calls that serve to alert receivers to the subsequent visual foot-flagging signal (Grafe and Wanger, 2007). Some anurans, although retaining the ability to call, choose to go unimodal and give purely visual signals by inflating their vocal sacs without producing a sound. Most notably, males of the East African stream-breeding ranid, Phrynobatrachus kreffti, use vocal sac inflations more frequently than bimodal visual and acoustic signals during male-male encounters (Hirschmann and Hödl, 2006). It is not known if this is a response to high levels of background noise or the need to use a more private modality to avoid attracting predators or parasites.
Apart from using visual signals, male anurans can also avoid the broadband, low-frequency-dominated masking noise typically found in streams by increasing call dominant frequency (Feng et al., 2006; Arch et al., 2008; Boeckle et al., 2009; Zhang et al., 2015). Most revealing is the case of the Andean poison frog Andinobates bombetes. Males of this species living alongside streams call at higher dominant frequencies than those living away from streams (Vargas-Salinas et al., 2014). This could be the result of habitat filtering, with individuals of larger body size, and thus lower-frequency calls, avoiding noisy streams dominated by low frequencies (Carvajal-Castro and Vargas-Salinas, 2016).
The most dramatic spectral shifts have been documented in Odorrana tormota and Huia cavitympanum, in which males call in the ultrasonic range (Feng et al., 2006; Arch et al., 2008). However, constraints on body size and the high rate of attenuation and degradation of high-frequency sounds will limit widespread use of this solution. All investigated species in the genus Staurois display calls with higher frequencies than expected from their body size (Boeckle et al., 2009). Other frog and bird species can flexibly increase the pitch of their calls or songs when vocalizing in areas where ambient noise is dominated by low frequencies, in response to local and/or temporal conditions (Dubois and Martens, 1984; Slabbekoorn and Peet, 2003; Brumm, 2006; Bee and Swanson, 2007; Parris et al., 2009).
Finally, as predicted, advertisement calls contained significantly more pulses during the playback than during the pre-playback control. Since the number of pulses per call is significantly correlated with call frequency and sound pressure, with calls rising in pitch and amplitude (Grafe et al., 2012), calls given during the stream noise playback increased the signal-to-noise ratio and enhanced signal integrity. Such increases in song duration and/or call or note rate are responses shown by many animals to increases in background noise (Brumm et al., 2004; Parks et al., 2007; Kaiser and Hammers, 2009; Diaz et al., 2011; Francis et al., 2011; Penna and Meier, 2011). Increasing the number of notes or syllables per call increases redundancy and thus increases signal detection in noise (Schwartz and Bee, 2013), and females are known to prefer calls with greater intensity, higher call rate and longer duration (Gerhardt and Huber, 2002; Bradbury and Vehrencamp, 2011). Switching to more visual signalling and increases in note number in S. parvus must be seen as an additional strategy to facilitate signal transmission under conditions of high background noise.
Stream noise in small streams will vary not only with the amount of rainfall but also with stream slope, leading to distinct frog communities within just a few stream meters (Keller et al., 2009). Such variation, which clearly affects individuals and community assemblage, has to our knowledge not been characterized sufficiently (but see Goutte et al., 2013). The long-term deployment of autonomous acoustic recorders could help delineate temporal and spatial variation in stream noise (Farina, 2014). Furthermore, identifying the species composition of vocalizing animals at streams and the degree of flexibility in their responses to temporal and spatial variability in stream noise is an interesting prospect.
In conclusion, we show that stream noise levels, although continuous, are not consistently high. These fluctuating ecological environments can be major drivers of multimodal signalling in stream-breeding frogs. Multimodal signalling will be favoured under fluctuating ecological environments if each modality is favoured under different conditions.