Critical Listening Skills for Audio Professionals
Published in 2005
Unpublished
Author:
Translator:
Proofreader:
Collector:
Consultant:
Keywords:
Reader:
Unit 6

Notice that there is a considerable increase in loudness, because the two sine waves have a time relationship described as being in phase. Because the two tones are in phase, they add constructively, and the resulting combination has twice the amplitude of either tone alone. Those knowledgeable in electronics will recognize this as a 6-dB increase in signal level. Now let's see what happens if the same two 500-Hz sine waves are combined out of phase, or in phase opposition; that is, when one waveform goes positive, the other goes negative, and vice versa. That four-second dead spot in the middle was caused by adding the second, equal 500-Hz tone out of phase. When the two signals are in phase opposition, one cancels the other and the resultant output is zero.

The frequency of the beat is determined by the difference between the frequencies of the two tones that are beating together. As the difference between the two tones is increased so that the beat frequency rises to about 20 Hz, the ear becomes unable to discern the individual beats. As the beat frequency is increased beyond 20 Hz, a harsh, rattling sound is heard. Note this roughness well! It is the secret ingredient of what we consider to be unpleasant musical effects.

As with so many other factors of human hearing, the critical band seems to be involved in how we hear two tones sounded together. If the two tones are a critical bandwidth apart, they are heard not as beats or roughness but are resolved harmoniously as two separate tones. To avoid the distraction of the beats and the region of roughness, and for the ear to separate the two tones, they must be at least a critical bandwidth apart.

All this leads us to the conclusion that when several tones are sounded simultaneously, the result may be considered either pleasant or unpleasant. Another way of describing these sensations is with the terms consonant and dissonant. In this psychoacoustical context, when we say consonance, we mean tonal or sensory consonance. This is distinguished from the musician's use of the word, which depends on frequency ratios and musical theory. Here, we are referring to human perception. Of course, in an ultimate sense, the two definitions must come together. The audibility of these roughness effects does not depend on musical training.

This puts the effect of combining two tones in proper perspective. If their frequencies are separated by a critical bandwidth or more, the effect is consonant. If less than a critical band separates the tones, varying degrees of dissonance are heard. The most dissonant (that is, the least consonant) spacing of two tones is about one-fourth of a critical bandwidth.

Musicians define an octave as a musical interval whose two tones are separated by eight scale tones. Tones separated by an octave have an essential similarity recognized by everyone. There is a very good reason for the octave's consonance, which directs our attention once more to the critical band. An octave represents a frequency ratio of two to one. This means that when the two are played together, their harmonics are either well separated or coincident up through the audible spectrum. In fact, the sound of the higher note reinforces that of the lower one. The result is consonance: full, rich, and complete. The perfect fifth is only slightly less pleasant than the octave interval.
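The in-phase, phase-opposition, and beat demonstrations at the start of this unit can be reproduced numerically. Below is a minimal sketch, assuming NumPy is available; the 500-Hz tones match the text, while the sample rate, the one-second duration, and the 4-Hz detuning are illustrative choices.

```python
import numpy as np

fs = 48000                      # sample rate (illustrative choice)
t = np.arange(0, 1.0, 1 / fs)   # one second of samples

a = np.sin(2 * np.pi * 500 * t)              # 500-Hz tone
b_in = np.sin(2 * np.pi * 500 * t)           # same tone, in phase
b_opp = np.sin(2 * np.pi * 500 * t + np.pi)  # same tone, in phase opposition
b_beat = np.sin(2 * np.pi * 504 * t)         # detuned by 4 Hz

def level_change_db(x, ref):
    """RMS level of x relative to ref, in dB."""
    return 20 * np.log10(np.sqrt(np.mean(x ** 2)) / np.sqrt(np.mean(ref ** 2)))

print(level_change_db(a + b_in, a))   # about +6 dB: constructive addition
print(level_change_db(a + b_opp, a))  # a huge negative number: near-total cancellation
beats = a + b_beat                    # envelope swells and dies 4 times per second,
                                      # the beat described above
```

Raising the detuning toward 20 Hz and beyond turns the audible swelling of the beat envelope into the roughness discussed above.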
Comparing the frequencies of the fundamentals and harmonics of the two tones of a perfect fifth, we note that they are either separated by more than a critical bandwidth or are at essentially the same frequency, both factors contributing to consonance. Comparing the fundamentals and harmonics of middle C and B-flat, we fail to find coincident pairs as we did with the perfect fifth interval. For this minor seventh interval, we find numerous harmonics of C and B-flat close enough together to result in some roughness. Evaluating the separation of harmonics, we find many near misses, pairs which are not coincident but are less than a critical bandwidth apart, contributing to the roughness.

We see that the perfect fifth is close to perfect, that is, close to the consonance of the octave interval. The minor seventh has some intervals separated by less than a critical bandwidth and is, hence, somewhat dissonant. We conclude that the critical band approach has value in explaining, or even predicting, the degree of consonance an interval exhibits. Dissonance can be considered another dimension of musical creativity to be explored. Our purpose in this analysis is only to relate consonance and dissonance to the critical bands of the human auditory system.
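A crude version of this harmonic bookkeeping can be automated. In the sketch below, the note frequencies, the number of harmonics examined, and the critical-bandwidth rule of thumb are illustrative assumptions rather than values from the text; the point is only to compare how the harmonics of a perfect fifth and a minor seventh line up.

```python
# Rough harmonic-coincidence check for two intervals above middle C.
def harmonics(f0, n=6):
    return [k * f0 for k in range(1, n + 1)]

def critical_bandwidth(f):
    # Very rough stand-in: about 100 Hz at low frequencies,
    # widening to roughly 20% of the centre frequency higher up.
    return max(100.0, 0.2 * f)

def compare(f_low, f_high):
    coincident, near_misses = [], []
    for h1 in harmonics(f_low):
        for h2 in harmonics(f_high):
            gap = abs(h1 - h2)
            if gap <= 1.0:
                coincident.append(round(h1))
            elif gap < critical_bandwidth((h1 + h2) / 2):
                near_misses.append((round(h1), round(h2)))
    return coincident, near_misses

C4 = 261.6
fifth = compare(C4, 1.5 * C4)        # C and G, a perfect fifth
seventh = compare(C4, 16 / 9 * C4)   # C and B-flat, a minor seventh
print("fifth:  ", fifth)    # coincident pairs, relatively few near misses
print("seventh:", seventh)  # no coincident pairs, more and closer near misses
```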
Unit 5

The echo is no longer distinct because of the amazing integrating effect of our auditory system. This is called the Haas Effect by audio engineers and the precedence effect by psychologists. In a modest-sized space like your living room, a classroom, a studio, or a control room, if someone speaks to you from across the room, you have no difficulty sensing the direction of the voice, even if you are blindfolded. That is because the direct sound, which arrives first, gives the directional cue even though it is followed by an avalanche of reflected sounds. The first sound to arrive tells us from which direction it comes: this is the Law of the First Wavefront. And it all happens in a fraction of a thousandth of a second.

While discrete echoes of speech become discernible with a delay of around 40 milliseconds, echoes of single, short-duration impulse sounds will be audible with delays as short as 4 milliseconds. Sustained or slowly varying sounds, on the other hand, may require a delay of as much as 80 milliseconds before discrete echoes are noticeable.

A famous musician once said, "There is no such thing as good music outdoors." He had in mind the reflections from the walls and other surfaces of the concert hall, which become very much a part of the music. The lack of such reflected energy outdoors, in his opinion, degraded the quality of the music.
Lesson 7

Reverberation may be either friend or enemy; it can improve our program material or degrade it. Because we normally are not conscious of reverberation as a separate entity, it is well that we pause to dissect and define it. If reverberation is almost totally eliminated, the same speech is understandable, but it sounds rather "dry" and uninteresting. Too little reverberation is as unpleasant and unnatural as too much.

Reverberation is a direct result of the relatively slow speed at which sound is propagated. Some sound energy is lost at each reflection, and it might take several seconds for successive bounces to reduce the level of the sound to inaudibility. In other words, in a confined space, it takes a little time for the sound to die away. Reverberation time is, roughly, the time it takes a very loud sound to die away until it can no longer be heard. To be more precise and scientific about it, reverberation time is defined as the time it takes a sound, suddenly cut off, to decay 60 dB.

If an orchestra were to play outdoors, its sound would tend to be thin and dry. The same orchestra playing in a hall having a reverberation time of about one second sounds natural and pleasing. The quality of speech is also greatly affected by the amount of reverberation present. The understandability and naturalness of speech are actually better at shorter reverberation times. The slow trailing off of the sound of the first part of each word interferes with our hearing the consonants at the end of that word, and the identification of each word depends upon identifying this consonant. Anything that interferes with these low-level consonants reduces the intelligibility of the words. Reverberation is one thing, but not the only thing, that can seriously impair the understandability of speech by covering up these low-level consonants.

Reverberation is very much a part of our enjoyment of music and our understanding of speech. It affects the quality of both speech and music, but it is very important to have the right amount.
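The 60-dB definition makes it easy to see why a decay can stretch over seconds. The toy calculation below treats reverberation as a series of lossy bounces; the room's mean free path and the energy absorbed per reflection are invented, illustrative numbers, not figures from the text.

```python
import math

c = 343.0               # speed of sound in air, m/s
mean_free_path = 10.0   # assumed average distance between reflections, m
absorption = 0.15       # assumed fraction of energy lost at each reflection

seconds_per_bounce = mean_free_path / c
db_lost_per_bounce = -10 * math.log10(1 - absorption)   # energy ratio -> dB

bounces_for_60_db = 60 / db_lost_per_bounce
rt60 = bounces_for_60_db * seconds_per_bounce
print(f"about {bounces_for_60_db:.0f} reflections, RT60 ≈ {rt60:.1f} s")
```

With these assumed numbers the level falls only about 0.7 dB per bounce, so it takes nearly a hundred reflections, roughly two and a half seconds, for the sound to decay 60 dB.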
Lesson 9

In recording a voice in a room, there is no escaping the acoustical effect of the surroundings. The sound is contained by the surfaces of the room, and the size, shape, and proportions of the room, along with the absorbing and reflecting characteristics of the surfaces, determine the sound field in the room. With the microphone close to my lips, the direct sound dominates. The sound reflected from the walls, floor, and ceiling is weaker because it travels farther and some sound energy is lost at each reflection. The greater the microphone distance, the more the room effect dominates.

I will hold the microphone at a constant distance from my lips so that the direct sound will be unchanged. As I walk toward the plywood, the sound reflected from it will increase the closer I get. The entire effect can be simulated for easy study by using a delay device. In the following example, a voice signal is combined with the same signal delayed one-half of a thousandth of a second (one-half millisecond) with respect to the direct sound. Voice colorations result any time a sound component is combined with itself delayed a bit. The plywood reflector provided such a delay with a single microphone. A hard table top close to the microphone can do the same thing. If the same sound strikes two microphones separated by a distance, and the outputs of the two are combined, wild frequency response variations will result.

At frequencies at which the two components are in phase, the signals add, giving a 6-dB peak. At frequencies at which they are out of phase, they cancel, resulting in a 30- or 40-dB dip in response. Down through the audible spectrum, these peaks and dips drastically change our normally uniform response, and this is what changes the character of the sound. This is commonly called a comb filter because the frequency response peaks and dips look like a comb when plotted.
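The comb-filter behaviour described here can be computed directly. The sketch below (NumPy assumed; the 0.5-ms delay matches the example in the text, while the sample grid and frequency range are arbitrary) evaluates the magnitude response of a signal summed with a copy of itself delayed by tau.

```python
import numpy as np

tau = 0.0005                      # 0.5-ms delay, as in the example above
f = np.linspace(20, 10000, 5000)  # frequency axis, Hz

# Response of y(t) = x(t) + x(t - tau): H(f) = 1 + exp(-j*2*pi*f*tau)
H = 1 + np.exp(-1j * 2 * np.pi * f * tau)
mag_db = 20 * np.log10(np.abs(H) + 1e-12)   # small term avoids log of zero

print(mag_db.max(), f[mag_db.argmin()])     # peak level and frequency of the deepest notch
```

With a 0.5-ms delay the peaks of about +6 dB fall at 0, 2 kHz, 4 kHz, and so on, and the deep notches at 1 kHz, 3 kHz, 5 kHz, midway between them.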
Lesson 8

The program material we are interested in listening to or recording we shall call the signal. Any sound that interferes with enjoyment or comprehension of the signal is called noise. If the signal dominates, excellent. But if noise dominates, the signal is useless. The relative strength of the signal compared to that of the noise is a very important factor in all communication systems. We even have a name for it: the signal-to-noise ratio. If the desired signal dominates, the signal-to-noise ratio is high, and all is well. At a signal-to-noise ratio of 40 dB, it becomes more difficult to hear the noise. Reverberation is a kind of noise, so we can expect white noise to affect the understandability of speech in a similar way.

The inevitable noise that often interferes with the desired signal is less familiar to us. Electrical current flowing in every transistor and every piece of wire generates a hissing sound like the one we have just heard. Fortunately, it is quite weak, but certain circuit faults can make it a problem. Radio frequency signals, such as those from nearby radio and television broadcasting stations, can easily penetrate audio circuits if there is improper shielding. The noise of heating, ventilating, and air conditioning equipment is often of high enough level to degrade a recording or interfere with listening. In listening to live or reproduced music or speech, signal quality can be affected by environmental noise. The mere presence of people in an audience results in the noises of breathing, movement, coughing, and rustling paper.
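The signal-to-noise ratio used above is simply a level comparison expressed in decibels. A minimal sketch, assuming NumPy; the sine-wave "signal" and random-noise "hiss" are synthetic stand-ins for programme material and circuit noise.

```python
import numpy as np

def snr_db(signal, noise):
    """Signal-to-noise ratio in dB, from RMS levels."""
    rms_s = np.sqrt(np.mean(signal ** 2))
    rms_n = np.sqrt(np.mean(noise ** 2))
    return 20 * np.log10(rms_s / rms_n)

fs = 48000
t = np.arange(0, 1.0, 1 / fs)
signal = np.sin(2 * np.pi * 1000 * t)          # stand-in for the programme material
noise = 0.01 * np.random.randn(len(t))         # stand-in for circuit hiss
print(f"SNR ≈ {snr_db(signal, noise):.1f} dB") # about 37 dB with these amplitudes
```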
Unit 7

The most dependable cues are those obtained by comparing sounds reaching the two ears. Directional information is contained in a comparison of the relative levels of sound falling on the two ears. Another cue used by the ear to localize sound sources is based on the time of arrival of sound at the two ears. If a sound arrives at one ear later than the other, we say that there is a phase difference between the two signals. In the previous unit, we discussed beats and their relationship to the consonance and dissonance of sounds. Those beats are strictly a physical phenomenon, occurring outside our bodies as the two tones pull in and out of phase. There are also so-called binaural beats, which are strictly subjective or psychophysical. These give evidence that our ears do indeed perceive phase differences.
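Binaural beats are straightforward to synthesise: feed each ear a slightly different frequency and the beat exists only in perception, not in either physical channel. A minimal sketch, assuming NumPy and headphone playback; the 440/444-Hz pair and the duration are arbitrary choices.

```python
import numpy as np

fs = 48000
t = np.arange(0, 5.0, 1 / fs)

left = np.sin(2 * np.pi * 440 * t)    # 440 Hz presented to the left ear
right = np.sin(2 * np.pi * 444 * t)   # 444 Hz presented to the right ear
stereo = np.stack([left, right], axis=1)

# Neither channel, measured on its own, fluctuates at 4 Hz; the slow "beat"
# heard over headphones is constructed inside the auditory system.
```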
Unit 4

A non-linear system alters the input waveform and delivers a distorted signal at the output. The distorted output contains frequency components not in the input signal. Non-linearity always means distortion, and distortion always adds to the input signal something new and undesirable that wasn't there before. As with amplifiers and other audio equipment, this is also true of the human auditory system.

Another method for detecting the presence of aural harmonics is playing the fundamental frequency into the left earphone, injecting a probe tone at the frequency of a harmonic into the right ear, and listening for a binaural beat. These aural harmonics are produced by non-linearities of the ear and do not exist as external signals. Thus, they are inaccessible to any physical measuring instrument. However, their presence can be verified through these binaural beats. In fact, knowing that the strongest beats are produced when the two signals are close to the same amplitude, scientists are able not only to detect the presence of aural harmonics but also to estimate their amplitudes.

When two tones are introduced into a non-linear system, a series of so-called combination tones is generated. If the higher tone has a frequency H and the lower tone the frequency L, a difference tone of frequency H minus L and a summation tone of frequency H plus L are produced. These are called the first-order sum and difference tones. The situation becomes much more complex as second-order distortion products are considered. For example, these include frequencies of 2H minus 2L, 2H minus L, 2L minus H, 2L plus H, and so on. In fact, all these distortion products are similar to what an electronics engineer measures in an amplifier by what is called the cross-modulation method.

In addition to the simpler aural harmonics, which we explored with a single tone injected into our auditory system, we have also detected several combination tones resulting from injecting two tones into the system. With music, many more than two tones fall on the ear simultaneously. Just imagine the horde of aural harmonics and combination tones filling up the audible spectrum! The masking of higher frequencies by lower ones makes some of these distortion products inaudible. On the other hand, we must remember that distortion products interact with each other, thus creating even more distortion products. But these will be at progressively lower levels.

In summary, we can say that when modest levels of sound fall on our ears, all of these distortion products generated in our heads are at very low levels. For louder sounds, however, the levels of distortion do become appreciable. In other words, at low levels, the ear is quite linear; at high levels, there is a departure from linearity.
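The bookkeeping of combination tones can be spelled out for a concrete pair of input frequencies. In the sketch below, the 1000-Hz and 1200-Hz tones are arbitrary illustrative choices; the product formulas are the ones listed above.

```python
H, L = 1200.0, 1000.0   # higher and lower input tones, Hz (illustrative)

products = {
    "H - L (difference tone)": H - L,
    "H + L (summation tone)":  H + L,
    "2H - 2L": 2 * H - 2 * L,
    "2H - L":  2 * H - L,
    "2L - H":  2 * L - H,
    "2L + H":  2 * L + H,
    "2H (aural harmonic of H)": 2 * H,
    "2L (aural harmonic of L)": 2 * L,
}
for name, freq in products.items():
    print(f"{name:28s} {freq:6.0f} Hz")
```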
Lesson 6

Non-linear distortion: any distortion results in new frequency components being generated within the equipment, components which do not rightfully belong to the signal. If the input signal to an amplifier is increased, we expect a corresponding increase in output. The operating region over which this is true is rightly called the linear region. Every audio system, however, has its upper limit. Trying to get 100 watts out of a 10-watt amplifier certainly results in penetration of what is called the non-linear region.

Our first exercise explores the distortion resulting from what is called signal clipping. The simplest way to describe the amount of distortion is to filter out the fundamental and measure the harmonics remaining. These harmonics are then expressed as a percentage of the fundamental: THD, or total harmonic distortion. Ten percent harmonic distortion is considered to be very heavy distortion. It is well for us to note at this point that modern professional power amplifiers and the better high-fidelity, consumer-type amplifiers are commonly rated as low as a few hundredths of 1 percent total harmonic distortion.

A modified form of clipping results from applying too high a signal to a magnetic recorder. This results in what is often called a soft type of clipping as the tape becomes saturated magnetically. Another form of distortion has to do with slight variations in the speed of the tape in a magnetic recorder or in the rotational speed of the turntable as a disk recording is being played. Such speed changes result in unnatural shifts in frequency. This illustrates what is commonly (and understandably) called wow. A similar form of distortion resulting from rapid speed fluctuations is called flutter. It can be caused, among other things, by dirty recording heads in magnetic recorders.
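The measurement procedure described here, separating the fundamental from its harmonics and expressing the harmonic energy as a percentage, can be sketched with an FFT. In the example below, the 1-kHz sine, the hard-clipping level, and the number of harmonics summed are all illustrative choices, and hard clipping stands in for driving an amplifier into its non-linear region.

```python
import numpy as np

fs = 48000
t = np.arange(0, 1.0, 1 / fs)
f0 = 1000                                  # fundamental, Hz

clean = np.sin(2 * np.pi * f0 * t)
clipped = np.clip(clean, -0.5, 0.5)        # hard clipping: the non-linear region

spectrum = np.abs(np.fft.rfft(clipped)) / len(clipped)
freqs = np.fft.rfftfreq(len(clipped), 1 / fs)

fundamental = spectrum[np.argmin(np.abs(freqs - f0))]
harmonics = [spectrum[np.argmin(np.abs(freqs - k * f0))] for k in range(2, 10)]

thd = np.sqrt(np.sum(np.square(harmonics))) / fundamental
print(f"THD ≈ {100 * thd:.1f} %")   # far above the few hundredths of a percent of a good amplifier
```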
Unit 3

When the tension and length of the string are just right (or in tune, as we would say), bowing the string sets it vibrating at the standard A, which is defined as 440 vibrations per second, or 440 Hz. This number of vibrations per second is called the fundamental frequency. The 440-Hz vibration (the fundamental frequency) is called the first harmonic, and the 880-Hz vibration is called the second harmonic. The fundamental, or first harmonic, is usually the strongest, and normally the higher the order of the harmonic, the weaker it is. Each instrument in the orchestra has its own particular harmonic signature. The number and relative intensities of these constituent tones determine the quality, or timbre, of that particular instrument.

Research has shown that a prime requisite of the ability to hear out a harmonic in a complex wave is that the separation between adjacent harmonics must be greater than the critical bandwidth. If two adjacent harmonics fall within a common critical band, the ear cannot distinguish one from the other. All this means that the ear is basically like a Fourier analyzer, to use a term familiar to electronics people. We use this analyzing ability of our auditory system all the time without giving it a thought.

In addition to hearing out the harmonics of a complex tone, the ear has remarkable powers of discrimination. With people talking all around us, we can direct our attention to one person, subjectively pushing other conversations into the background. We can direct our attention to one group of instruments in an orchestra or to one singer in a choir. Listening to someone talk in the presence of high background noise, we are able to select out the talk and reject, to a degree, the noise. This is all done subconsciously, but we are constantly using this amazing faculty.
Lesson 5

These harmonics are whole-number multiples of the fundamental frequency. We conclude that the triangular wave certainly has its own distinctive quality. The distinctiveness of its sound is all wrapped up in its harmonic structure. The harmonic content of a signal is the key to its distinctive sound quality. A 1000-Hz square wave has its own distinctive sound. All of its harmonics occur at the same odd multiples of the fundamental as with the triangular wave, but their magnitudes and time relationships are different.

There is a richness to the violin tone which the sine wave certainly does not have. The violin tone is rich in overtones. As we deal with musical tones, it is fitting that we switch over to the musician's terminology. Instead of harmonics, the terms overtones or partials should be used. Overtones dominate the violin sound. Its rich tonal quality depends entirely on the overtone pattern. Each instrument of the orchestra has its own overtone pattern, which gives it its characteristic sound.

To achieve high quality in the recording and reproduction of sound, it is necessary to preserve all the frequency components of that sound in their original form. Limitation of the frequency band or irregularities in frequency response, among other things, affect sound quality. Some musical instruments have overtones that are not whole-number multiples of the fundamental and thus cannot be called harmonics. For such instruments, the general word overtones must be used. Bells produce a wild mixture of overtones, and the fundamental may not even be recognized. The overtones of drums are also not harmonically related to the fundamental, but they are responsible for the unique, rich drum sound.

Summarizing, we have learned that preserving the integrity of the fundamental and overtone pattern of our signal preserves the quality of the signal, and this is what high fidelity is all about.
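The square and triangular waves discussed above share the same set of odd harmonics but with different magnitudes, and that difference can be written down as a Fourier series. A minimal sketch, assuming NumPy; the 1000-Hz fundamental matches the text, while the number of harmonics is an illustrative choice and the overall scale factors are omitted.

```python
import numpy as np

fs = 48000
t = np.arange(0, 0.01, 1 / fs)   # 10 ms of signal
f0 = 1000                        # fundamental, Hz

square = np.zeros_like(t)
triangle = np.zeros_like(t)
for i, n in enumerate((1, 3, 5, 7, 9, 11, 13)):                        # odd harmonics only
    square += np.sin(2 * np.pi * n * f0 * t) / n                        # amplitudes fall off as 1/n
    triangle += (-1) ** i * np.sin(2 * np.pi * n * f0 * t) / n ** 2     # fall off as 1/n^2

# Same odd-harmonic frequencies, different magnitudes and time relationships:
# hence the two waves' distinctly different sound qualities.
```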
Unit 2

Fletcher repeated this experiment at many frequencies. The bandwidth of the noise just beginning to affect the masking of a particular tone he called the critical band effective at that frequency. Fletcher's work encouraged other scientists to explore the shape of these so-called critical bands of the human hearing system. Instead of using two tones, which produced beats, some experimenters used one tone and a band of noise much narrower than the critical band. Simply stated, the closer the probe tone is to the noise band, the easier it is to mask the noise. That's exactly what Fletcher said: only sound energy near a tone of a given frequency is effective in masking it.
Unit 1

Playback at such low levels requires boosting lows and highs to restore a semblance of quality. This is the principle of the so-called loudness equalization practiced by high-fidelity enthusiasts. The equalization required to make low-level music sound right comes close to tracing an equal-loudness contour. By a similar process, other contours are traced, each tied to a specific sound-pressure level at 1000 Hz. These levels at 1000 Hz are arbitrarily called loudness levels, in phons. When you go to an otologist or a clinical audiologist to have your hearing tested, your audiogram is really your own personal minimum-audible equal-loudness contour!
Lesson 4

In Lesson 3, we heard the effect of cutting off the low- and high-frequency portions of the audio spectrum. These we called lo-cut and hi-cut. Sometimes it is desirable to reduce low- and high-frequency contributions less drastically than cutting them off. For this, the phrase roll-off is used. By boosting the important speech frequencies by 5 to 10 dB, the understandability of speech can be improved, especially against a background of music or sound effects. This is called presence equalization.

Clip-on microphones are very popular today, and most of them are capable of reasonably good quality if used properly. One problem with them is that the high-frequency components of the voice are quite directional, tending to miss the microphone. Boosting system response at the higher frequencies can compensate for this loss. By introducing a 10-dB boost at 5000 Hz, the high-frequency components are restored. Microphones clipped to a shirt or necktie are very close to the chest and are prone to pick up chest resonances, which, for a man, tend to overemphasize the voice components in the region of 700 to 800 Hz. To compensate for chest resonance, a dip of 6 dB at 750 Hz is introduced.

Thus, we see that intentional deviations from the idealized flat response may actually yield better recorded sound in the practical sense. Music may be made more pleasing to the ear, and speech may be made more understandable.
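The equalization figures quoted above translate directly into amplitude factors. A small sketch of the decibel-to-gain arithmetic; the dB values are the ones given in the text.

```python
def db_to_gain(db):
    """Convert a level change in dB to an amplitude (voltage-like) gain factor."""
    return 10 ** (db / 20)

print(db_to_gain(10))   # +10-dB boost at 5000 Hz -> gain of about 3.16
print(db_to_gain(-6))   # -6-dB dip at 750 Hz     -> gain of about 0.5
print(db_to_gain(5))    # +5-dB presence boost    -> gain of about 1.78
```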
Lesson 3

We see that an orchestra generates very significant amounts of energy in the low-frequency range and that the quality of the music suffers markedly if the full low-frequency range is restricted. Narrowing the band even further, from 300 Hz on the low-frequency end to 3000 Hz on the high-frequency end, a telephone-like quality results. Even though the voice quality has greatly changed, it is interesting to note that the voices are recognizable and the words are quite understandable. A small radio receiver tuned to an AM station might pass something like a band from 300 Hz to 5000 Hz.
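The telephone-like band limiting described here can be imitated with a simple band-pass filter. A sketch, assuming NumPy and SciPy are available; the filter order, sample rate, and noise test signal are arbitrary choices.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 48000
t = np.arange(0, 1.0, 1 / fs)
wideband = np.random.randn(len(t))    # stand-in for full-range programme material

# Pass roughly 300 Hz to 3000 Hz, the "telephone quality" band mentioned above.
b, a = butter(4, [300, 3000], btype="bandpass", fs=fs)
telephone_like = filtfilt(b, a, wideband)
```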
Lesson 2

Sound level: a physical quantity, measured with instruments.
Loudness: a psychophysical sensation perceived by the human ear/brain mechanism.
Decibel: one-tenth of a bel; a bel is the logarithm of the ratio of any two power-like quantities.
Logarithm: the common logarithm (to base 10) of a number is the exponent of 10 that yields that number.

When these tones are reproduced on a loudspeaker, you will notice that changes in head position change the loudness of the sound due to room effects. For this reason, keep your head in one position during each test. Of course, if you are listening on headphones, room acoustics have no effect. A change of 10 dB is often considered to be a doubling of loudness, or a cutting of loudness in half. A 10-dB change in level is less noticeable at 100 Hz but very prominent at 1000 Hz. The minimum discernible level change depends on both the frequency and the level of the sound.
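The decibel and logarithm definitions above amount to a single line of arithmetic for power-like quantities. A minimal sketch; the example ratios are illustrative.

```python
import math

def level_db(power, reference_power):
    """Level of one power-like quantity relative to another, in decibels."""
    return 10 * math.log10(power / reference_power)

print(level_db(2, 1))     # doubling the power          -> about +3 dB
print(level_db(10, 1))    # ten times the power         -> +10 dB, roughly twice as loud
print(level_db(100, 1))   # one hundred times the power -> +20 dB
```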
Lesson 1

Spectrum: the distribution of the sound energy throughout the audible frequency range.
Frequency: the number of cycles per second, i.e., the number of hertz. k = kilo = 1000, so 1 kHz = 1000 Hz, 10 kHz = 10,000 Hz, and 20 kHz = 20,000 Hz.

The frequency range of audible sound is commonly taken as 20 Hz to 20,000 Hz. To avoid problems commonly associated with the extremes of the audible band, we will keep within a 100-Hz limit at the low end and a 10,000-Hz limit at the high end.

Octave: if one tone has twice or half the number of vibrations per second as another tone, the two tones are one octave apart.
Pure tone: a single frequency.

Noise bands are useful in acoustical measurements in rooms because their constantly shifting nature, strange as it seems, gives steadier readings than pure tones. Pure tones, on the other hand, are commonly used in equipment measurements.