However, our experiments indicate otherwise. When we synthesized sounds using a filter bank with the bandwidths of our canonical model, but with four times as many filters (such that adjacent filters overlapped more than in the original filter bank), identification was not significantly improved [Figure 5D; condition 4 versus 3, t(9) = 1.27, p = 0.24]. Similarly, one might suppose that constraining the
full marginal distribution (as opposed to just matching the four moments in our model) might capture more structure, but we found that this also failed to produce improvements in identification [Figure 5D; condition 5 versus 3, t(9) = 1.84, p = 0.1; Figure S4]. These results suggest that cochlear marginal statistics alone, irrespective of how exhaustively they are measured, cannot account for our perception of texture.

Because the texture model is independent of the signal length, we could measure statistics from signals much shorter or longer than those being synthesized. In
both cases the results generally sounded as compelling as if the synthetic and original signals were the same length. To verify this empirically, in condition 7 we used excerpts of 15 s signals synthesized from 7 s originals. Identification performance was unaffected [Figure 5D; condition 7 versus 6; t(9) = 0.5, p = 0.63], indicating that these longer signals captured the texture qualities as well as signals more comparable to the original signals in length.

We found that each class of statistic was perceptually necessary, in that its omission from the model audibly impaired the quality of some synthetic sounds. To demonstrate this empirically, in Experiment
2a we presented listeners with excerpts of original texture recordings followed by two synthetic versions—one synthesized using the full set of model statistics, and the other synthesized with one class omitted—and asked them to judge which synthetic version sounded more like the original. Figure 6A plots the percentage of trials on which the full set of statistics was preferred. In every condition, this percentage was greater than that expected by chance (t tests, p < 0.01 in all cases, Bonferroni corrected), though the preference was stronger for some statistic classes than others [F(4,36) = 15.39, p < 0.0001]. The effect of omitting a statistic class was not noticeable for every texture. A potential explanation is that the statistics of many textures are close to those of noise for some subset of statistics, such that omitting that subset does not cause the statistics of the synthetic result to deviate much from the correct values (because the synthesis is initialized with noise). To test this idea, we computed the difference between each sound’s statistics and those of pink (1/f) noise, for each of the five statistic classes.
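
The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the procedure used in the experiments: the root-mean-square distance, the function names, the class labels, and the toy values are all assumptions, since the text does not specify how the difference was quantified.

```python
import math


def class_distance(sound_stats, noise_stats):
    """Root-mean-square difference between two equal-length statistic vectors.

    The RMS measure is an assumption chosen for illustration only.
    """
    if len(sound_stats) != len(noise_stats):
        raise ValueError("statistic vectors must have the same length")
    squared = [(s - n) ** 2 for s, n in zip(sound_stats, noise_stats)]
    return math.sqrt(sum(squared) / len(squared))


def distance_from_noise(sound, noise):
    """Map each statistic class name to its distance from the noise reference."""
    return {name: class_distance(values, noise[name])
            for name, values in sound.items()}


# Hypothetical toy values: two placeholder classes, two statistics each.
# Real inputs would be the measured statistics of a texture and of pink noise.
noise = {"marginal": [0.0, 1.0], "correlation": [0.5, 0.5]}
sound = {"marginal": [0.0, 1.0], "correlation": [0.9, 0.1]}

distances = distance_from_noise(sound, noise)
for name, value in distances.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the placeholder "marginal" class matches the noise reference exactly, so under the explanation above its omission would be expected to matter little for this sound, whereas the "correlation" class deviates and its omission would be expected to be more audible.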