Compression in practice. Mastering in reverse: can the dynamic range of compressed recordings be increased? Compression as a reasonable compromise

This group of methods is based on subjecting the transmitted signals to nonlinear amplitude transformations, with the nonlinearities in the transmitting and receiving parts being mutually inverse. For example, if the transmitter applies the nonlinear function √u, the receiver applies u². Applying the mutually inverse functions one after the other leaves the overall transformation linear.

The idea behind nonlinear data compression methods is that, for the same amplitude of output signals, the transmitter can convey a larger range of variation of the transmitted parameter (that is, a greater dynamic range). The dynamic range is the ratio of the largest admissible signal amplitude to the smallest, expressed in relative units or in decibels:

D = U_{\max} / U_{\min};	(2.17)
D = 20 \lg (U_{\max} / U_{\min}).	(2.18)

The natural desire to increase the dynamic range by lowering U_min is limited by the sensitivity of the equipment and by the growing influence of interference and intrinsic noise.
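The effect of a pair of mutually inverse nonlinearities on the dynamic range can be checked numerically. The sketch below is only an illustration: the function names and the sample amplitudes are assumptions, and the decibel value uses the usual 20·lg convention for amplitudes.

```python
import math

def dynamic_range_db(u_max: float, u_min: float) -> float:
    """Dynamic range in decibels for amplitude (voltage-like) quantities."""
    return 20.0 * math.log10(u_max / u_min)

def compress(u: float) -> float:
    """Transmitter side: square-root nonlinearity."""
    return math.sqrt(u)

def expand(v: float) -> float:
    """Receiver side: the inverse nonlinearity (squaring)."""
    return v * v

# Hypothetical amplitude limits of the source signal.
u_min, u_max = 0.0001, 1.0

range_in = dynamic_range_db(u_max, u_min)                        # about 80 dB
range_line = dynamic_range_db(compress(u_max), compress(u_min))  # about 40 dB
restored = expand(compress(0.3))                                 # back to about 0.3
```

With these values the channel only has to carry about 40 dB instead of about 80 dB, and the receiver restores the original amplitudes.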

Most often, the dynamic range is compressed using a pair of mutually inverse functions: the logarithm and the exponential (antilogarithm). The first amplitude-changing operation is called compression, the second expansion (stretching). These functions are chosen because they offer the greatest compression capability.

At the same time, these methods have drawbacks. The first is that the logarithm of a small number is negative and, in the limit,

\lim_{u \to 0} \log u = -\infty,

that is, the sensitivity is highly nonlinear.

To reduce these drawbacks, both functions are modified by an offset and by approximation. For example, telephone channels use an approximated function of type A (the A-law) with A = 87.6. The gain from compression is 24 dB.
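The exact approximating formula is not reproduced in the text above, so the following sketch assumes the commonly used continuous A-law characteristic for inputs normalized to [-1, 1]. The function names are illustrative.

```python
import math

A = 87.6  # compression parameter quoted in the text

def alaw_compress(x: float, a: float = A) -> float:
    """Compress a sample x in [-1, 1] (assumed standard A-law curve)."""
    ax = abs(x)
    denom = 1.0 + math.log(a)
    if ax < 1.0 / a:
        y = a * ax / denom                    # linear segment for small signals
    else:
        y = (1.0 + math.log(a * ax)) / denom  # logarithmic segment
    return math.copysign(y, x)

def alaw_expand(y: float, a: float = A) -> float:
    """Inverse of alaw_compress."""
    ay = abs(y)
    denom = 1.0 + math.log(a)
    if ay < 1.0 / denom:
        x = ay * denom / a
    else:
        x = math.exp(ay * denom - 1.0) / a
    return math.copysign(x, y)

# Gain applied to small signals, in decibels (roughly the 24 dB quoted above).
small_signal_gain_db = 20.0 * math.log10(alaw_compress(0.001) / 0.001)
```

The linear segment near zero is the "offset and approximation" that keeps the curve finite where a pure logarithm would tend to minus infinity.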

Data compression by nonlinear procedures is implemented by analog means with large errors. Digital means can substantially improve the accuracy or the speed of the transformation. However, the direct use of computing hardware (that is, direct calculation of logarithms and exponentials) gives a poorer result because of low speed and accumulating computational error.

Because of its limited accuracy, data compression by companding is used in non-critical cases, for example to transmit speech over telephone and radio channels.

Efficient coding

Efficient codes were proposed by Shannon, Fano, and Huffman. Their essence is that they are non-uniform, that is, they have a varying number of bits, and the more probable a symbol is, the shorter its code. Another remarkable feature of efficient codes is that they need no separators, that is, special characters that separate neighboring code combinations. This is achieved by observing a simple rule: no shorter code is the beginning of a longer one. In that case a continuous stream of bits is decoded uniquely, since the decoder detects the shorter code combinations first. For a long time efficient codes were of purely academic interest, but more recently they have been used in building databases, in compressing data in modern modems, and in software archivers.

Because the codes are non-uniform, the notion of average code length is introduced. The average length is the mathematical expectation of the code length:

l_{avg} = \sum_{i=1}^{N} p_i l_i,

where l_avg approaches H(x) from above (that is, l_avg ≥ H(x)).

Condition (2.23) is satisfied more closely as N increases.

There are two varieties of efficient codes: Shannon-Fano and Huffman. Let us consider how they are constructed by example. Suppose the probabilities of the symbols in a sequence have the values shown in Table 2.1.

Table 2.1.

Probabilities of symbols

N	1	2	3	4	5	6	7	8	9
p_i	0.1	0.2	0.1	0.3	0.05	0.15	0.03	0.02	0.05

The symbols are ranked, that is, arranged in a row in order of decreasing probability. After that, according to the Shannon-Fano method, the following procedure is repeated: the whole group of events is divided into two subgroups with equal (or approximately equal) total probabilities. The procedure continues until a subgroup contains a single element, at which point that element is set aside and the procedure continues with the remaining ones. This goes on until each of the last two subgroups contains one element. The example is continued in Table 2.2.

Table 2.2.

Shannon-Fano method

N	p_i	Step 1	Step 2	Step 3	Step 4	Step 5
4	0.3	I	I
2	0.2	I	II
6	0.15	II	I	I
3	0.1	II	I	II
1	0.1	II	II	I	I
9	0.05	II	II	I	II
5	0.05	II	II	II	I
7	0.03	II	II	II	II	I
8	0.02	II	II	II	II	II

As Table 2.2 shows, the first symbol, with probability p_4 = 0.3, took part in two partitioning steps and both times fell into group I. Accordingly, it is encoded by the two-bit code 11. The second element belonged to group I at the first partitioning step and to group II at the second, so its code is 10. The codes of the remaining symbols need no further comment.
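The partitioning procedure can be sketched in a few lines. This is an illustrative implementation, not the author's: the function name is hypothetical, group I is mapped to bit 1 and group II to bit 0 as in the text, and when two cut points balance the subgroups equally well the later one is chosen (an assumption made so that the result follows Table 2.2).

```python
def shannon_fano(symbols):
    """symbols: list of (name, probability) pairs. Returns {name: code string}."""
    ranked = sorted(symbols, key=lambda pair: -pair[1])  # stable: ties keep input order
    codes = {name: "" for name, _ in ranked}

    def split(group):
        if len(group) < 2:
            return
        total = sum(p for _, p in group)
        best_k, best_diff, running = 1, float("inf"), 0.0
        for k in range(1, len(group)):
            running += group[k - 1][1]
            diff = abs(total - 2.0 * running)  # |upper total - lower total|
            if diff <= best_diff + 1e-12:      # on a tie, prefer the later cut
                best_k, best_diff = k, diff
        upper, lower = group[:best_k], group[best_k:]
        for name, _ in upper:
            codes[name] += "1"                 # group I
        for name, _ in lower:
            codes[name] += "0"                 # group II
        split(upper)
        split(lower)

    split(ranked)
    return codes

# Probabilities from Table 2.1, listed in the ranked order of Table 2.2.
table = [(4, 0.3), (2, 0.2), (6, 0.15), (3, 0.1), (1, 0.1),
         (9, 0.05), (5, 0.05), (7, 0.03), (8, 0.02)]
sf_codes = shannon_fano(table)
```

For these probabilities symbol 4 receives the code 11 and symbol 2 the code 10, as described above.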

Non-uniform codes are usually depicted as code trees. A code tree is a graph showing the allowed code combinations. The directions of the edges of this graph are fixed in advance, as shown in Fig. 2.11 (the choice of directions is arbitrary).

The graph is read as follows: a route is traced to the chosen symbol; the number of bits for it equals the number of edges in the route, and the value of each bit equals the direction of the corresponding edge. The route starts from the source point (labeled A in the figure). For example, the route to vertex 5 consists of five edges, all of which except the last have direction 0; we obtain the code 00001.

For this example, let us calculate the entropy and the average word length.

H(x) = -(0.3 \log_2 0.3 + 0.2 \log_2 0.2 + 0.15 \log_2 0.15 + 2 \cdot 0.1 \log_2 0.1 + 2 \cdot 0.05 \log_2 0.05 + 0.03 \log_2 0.03 + 0.02 \log_2 0.02) = 2.76 bits

l_{avg} = 0.3 \cdot 2 + 0.2 \cdot 2 + 0.15 \cdot 3 + 0.1 \cdot 3 + 0.1 \cdot 4 + 0.05 \cdot 5 + 0.05 \cdot 4 + 0.03 \cdot 6 + 0.02 \cdot 6 = 2.9

As can be seen, the average word length is close to the entropy.

Huffman codes are built by a different algorithm. The coding procedure consists of two stages. In the first stage, single compressions of the alphabet are performed one after another. A single compression replaces the last two symbols (those with the lowest probabilities) with one symbol whose probability is their sum. Compressions are carried out until two symbols remain. At the same time a coding table is filled in, in which the resulting probabilities are entered and the routes by which the new symbols move to the next step are drawn.

In the second stage the coding proper takes place, starting from the last step: the first of the two symbols is assigned code 1, the second code 0. Then one moves to the previous step. Symbols that did not take part in the compression at that step keep their codes from the following step, while the two last symbols each receive the code of the symbol obtained by merging them, with 1 appended for the upper symbol and 0 for the lower. If a symbol takes no further part in merging, its code remains unchanged. The procedure continues to the end (that is, back to the first step).

Table 2.3 shows coding by the Huffman algorithm. As the table shows, the coding was carried out in 7 steps. The symbol probabilities are on the left, the intermediate codes on the right. The arrows show the movement of the newly formed symbols. At each step the last two symbols differ only in the least significant bit, which corresponds to the coding technique. Let us calculate the average word length:

l_{avg} = 0.3 \cdot 2 + 0.2 \cdot 2 + 0.15 \cdot 3 + 2 \cdot 0.1 \cdot 3 + 0.05 \cdot 4 + 0.05 \cdot 5 + 0.03 \cdot 6 + 0.02 \cdot 6 = 2.8

It is even closer to the entropy: the code is even more efficient. Fig. 2.12 shows the Huffman code tree.

Table 2.3.

Coding by the Huffman algorithm

N	p_i	Code	I	II	III	IV	V	VI	VII
4	0.3	11	0.3 11	0.3 11	0.3 11	0.3 11	0.3 11	0.4 0	0.6 1
2	0.2	01	0.2 01	0.2 01	0.2 01	0.2 01	0.3 10	0.3 11	0.4 0
6	0.15	101	0.15 101	0.15 101	0.15 101	0.2 00	0.2 01	0.3 10
3	0.1	001	0.1 001	0.1 001	0.15 100	0.15 101	0.2 00
1	0.1	000	0.1 000	0.1 000	0.1 001	0.15 100
9	0.05	1000	0.05 1000	0.1 1001	0.1 000
5	0.05	10011	0.05 10011	0.05 1000
7	0.03	100101	0.05 10010
8	0.02	100100

Both codes satisfy the requirement of unique decodability: as the tables show, no shorter combination is the beginning of a longer one.
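The merging procedure can also be expressed with a priority queue. The sketch below is one common way to build a Huffman code, not the table-based procedure of Table 2.3; the function name is hypothetical, and the individual code words may differ from the table because ties between equal probabilities can be broken differently.

```python
import heapq
import math

def huffman_codes(probs):
    """probs: {symbol: probability}. Returns {symbol: code string}."""
    # Heap items: (probability, tie-breaker, symbols in this subtree).
    heap = [(p, i, [sym]) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    codes = {sym: "" for sym in probs}
    next_id = len(heap)
    while len(heap) > 1:
        p_low, _, syms_low = heapq.heappop(heap)    # least probable group
        p_high, _, syms_high = heapq.heappop(heap)  # second least probable group
        for s in syms_high:
            codes[s] = "1" + codes[s]               # upper symbol gets 1
        for s in syms_low:
            codes[s] = "0" + codes[s]               # lower symbol gets 0
        heapq.heappush(heap, (p_low + p_high, next_id, syms_low + syms_high))
        next_id += 1
    return codes

# Probabilities from Table 2.1.
probs = {1: 0.1, 2: 0.2, 3: 0.1, 4: 0.3, 5: 0.05, 6: 0.15, 7: 0.03, 8: 0.02, 9: 0.05}
codes = huffman_codes(probs)
avg_len = sum(p * len(codes[s]) for s, p in probs.items())  # about 2.8 bits
entropy = -sum(p * math.log2(p) for p in probs.values())    # about 2.76 bits
```

Because bits are prepended as groups are merged, each symbol's code is read from the root of the tree down to its leaf, and no code word is a prefix of another.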

As the number of symbols grows, the efficiency of the codes increases, so in some cases larger blocks are encoded (for texts, for example, one can encode the most common syllables, words, and even phrases).

The benefit of introducing such codes is assessed by comparison with a uniform code:

(2.24)

where n is the number of bits of the uniform code that is replaced by the efficient one.

Modifications of Huffman codes

The classic Huffman algorithm is a two-pass one, i.e., it first requires collecting statistics on the symbols and messages and only then applies the procedures described above. This is inconvenient in practice, since it increases the time needed to process messages and accumulate the dictionary. More often, single-pass methods are used, in which the accumulation and coding procedures are combined. Such methods are also called adaptive Huffman compression [46].

The essence of adaptive Huffman compression is to build an initial code tree and to modify it step by step after each new symbol arrives. As before, the trees here are binary, i.e., at most two arcs leave each vertex of the tree graph. It is customary to call the originating vertex the parent and the two vertices connected to it below the children. Let us introduce the notion of the weight of a vertex: the number of symbols (words) corresponding to that vertex, obtained when the source sequence is fed in. Obviously, the sum of the children's weights equals the weight of the parent.

After the next symbol of the input sequence is entered, the code tree is revised: the vertex weights are recalculated and, if necessary, the vertices are rearranged. The rearrangement rule is as follows: the lower vertices have the smallest weights, and within a level the vertices on the left have the smallest weights.

At the same time, the vertices are numbered. Numbering starts from the lower (leaf, i.e., childless) vertices from left to right, then moves to the next level up, and so on until the last, root vertex is numbered. This achieves the following: the smaller a vertex's weight, the smaller its number.

Rearrangement is carried out mainly for leaf vertices. When rearranging, the rule stated above must be observed: vertices with greater weight have greater numbers.

After the sequence (also called the control or test sequence) has been passed through, code combinations are assigned to all leaf vertices. The assignment rule is similar to the one above: the number of bits in a code equals the number of vertices through which the route from the root to the given leaf passes, and the value of each bit corresponds to the direction from parent to child (say, a move to the left from the parent corresponds to 1, to the right to 0).

The resulting code combinations are stored in the memory of the compression device together with the fragments they stand for and form a dictionary. The algorithm is used as follows. The sequence of symbols to be compressed is split into fragments according to the existing dictionary, after which each fragment is replaced by its code from the dictionary. Fragments not found in the dictionary form new leaf vertices, acquire weight, and are also entered into the dictionary. This yields an adaptive algorithm for replenishing the dictionary.

To increase the efficiency of the method, it is desirable to increase the size of the dictionary; the compression ratio then rises. In practice the dictionary occupies 4-16 KB of memory.

Let us illustrate the algorithm with an example. Fig. 2.13 shows the initial diagram (also called a Huffman tree). Each vertex of the tree is shown as a rectangle containing two numbers separated by a slash: the first is the number of the vertex, the second its weight. As can be verified, the correspondence between the vertex weights and their numbers holds.

Suppose now that the symbol corresponding to vertex 1 occurs a second time in the test sequence. The vertex weights change as shown in Fig. 2.14, and as a result the numbering rule is violated. At the next step we change the arrangement of the leaf vertices, swapping vertices 1 and 4, and renumber all the vertices of the tree. The resulting graph is shown in Fig. 2.15. The procedure then continues in the same way.

It should be remembered that each leaf vertex of the Huffman tree corresponds to a particular symbol or group of symbols. A parent differs from its children in that the group of symbols corresponding to it is one symbol shorter than those of its children, and the children differ from each other in their last symbol. For example, if the parent corresponds to the symbols "car", the children may correspond to the sequences "card" and "carp".

The algorithm described is not merely academic and is actively used in archiver programs, including for compressing graphic data (discussed below).

Lempel-Ziv algorithms

These are the most widely used compression algorithms today. They are used in most archiver programs (for example, PKZIP, ARJ, LHA). The essence of the algorithms is that, during archiving, a certain set of characters is replaced by its number in a specially built dictionary. For example, the phrase "in reply to your letter, reference number ...", common in business correspondence, might occupy position 121 in the dictionary; then, instead of transmitting or storing the phrase itself (30 bytes), one can store its number (1.5 bytes in binary-coded decimal form, or 1 byte in binary).

The algorithms are named after the authors who first proposed them in 1977. The first of them is LZ77. For archiving, a so-called sliding window consisting of two parts is created. The first, larger part serves to form the dictionary and is several kilobytes in size. The second, smaller part (usually up to 100 bytes) receives the current characters of the text being scanned. The algorithm tries to find in the dictionary a set of characters that matches the contents of the scanning window. If it succeeds, a code consisting of three parts is generated: the offset of the matching substring in the dictionary, the length of that substring, and the character that follows it. For example, suppose the matched substring is 6 characters long and the character following it is "e". Then, if the substring has address (position in the dictionary) 45, the record has the form "45, 6, e". After that the contents of the window are shifted and the search continues. In this way the dictionary is formed.

The advantage of the algorithm is that the procedure for building the dictionary is easily formalized. In addition, decompression is possible without the original dictionary (although a test sequence is desirable): the dictionary is built up during decompression.

The drawbacks of the algorithm show up as the dictionary grows: the search time increases. In addition, if a string of characters is absent from the current window, each character is written as a three-element code, i.e., the result is expansion rather than compression.
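A minimal LZ77-style sketch makes the three-part code concrete. It is an illustration under stated assumptions: the function names and window sizes are hypothetical, the offset is stored as a distance back from the current position rather than an absolute dictionary address, and the match search is a plain linear scan.

```python
def lz77_encode(data: str, window: int = 4096, lookahead: int = 32):
    """Encode data as a list of (offset, length, next_char) triples."""
    i, out = 0, []
    while i < len(data):
        start = max(0, i - window)
        best_off, best_len = 0, 0
        # Keep at least one character in reserve to serve as next_char.
        max_len = min(lookahead, len(data) - i - 1)
        for j in range(start, i):
            length = 0
            while length < max_len and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_off, best_len = i - j, length
        out.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return out

def lz77_decode(triples):
    """Rebuild the text; the dictionary is recreated while decoding."""
    buf = []
    for off, length, ch in triples:
        for _ in range(length):
            buf.append(buf[-off])  # copy from 'off' characters back
        buf.append(ch)
    return "".join(buf)

triples = lz77_encode("aaaa")  # [(0, 0, 'a'), (1, 2, 'a')]
```

A character with no match is emitted as a full triple with zero offset and length, which is exactly the expansion effect mentioned above; the decoder needs no dictionary of its own because it rebuilds the window as it goes.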

The LZSS algorithm, proposed in 1982, has better characteristics. It differs in how the sliding window is maintained and in the compressor's output codes. In addition to the window, the algorithm builds a binary tree, similar to a Huffman tree, to speed up the search for matches: each substring leaving the current window is added to the tree as one of the children. Such an algorithm allows the size of the current window to be increased further (preferably to a power of two: 128, 256, etc. bytes). The sequence codes are also formed differently: an additional 1-bit prefix is introduced to distinguish unencoded characters from "offset, length" pairs.

Even greater compression is obtained with LZW-type algorithms. The algorithms described above have a fixed window size, which makes it impossible to enter phrases longer than the window into the dictionary. In the LZW algorithms (and their predecessor LZ78) the scanning window is unbounded, and the dictionary accumulates phrases (rather than sets of characters, as before). The dictionary has unlimited length, and the encoder (decoder) works in a phrase-waiting mode. When a phrase matching a dictionary entry has been formed, the match code (i.e., the code of that phrase in the dictionary) and the code of the character following it are output. If a new phrase is formed as characters accumulate, it too is entered into the dictionary, like the shorter one. The result is a recursive procedure that provides fast encoding and decoding.
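The phrase-accumulating dictionary can be sketched as follows. The output described above (phrase code plus the following character) corresponds to LZ78, so that is what this illustration implements; the function names are hypothetical and the dictionary is left unbounded for simplicity.

```python
def lz78_encode(data: str):
    """Encode data as (phrase_index, next_char) pairs; index 0 is the empty phrase."""
    dictionary = {"": 0}
    out = []
    phrase = ""
    for ch in data:
        if phrase + ch in dictionary:
            phrase += ch                       # keep waiting for a longer phrase
        else:
            out.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)  # enter the new phrase
            phrase = ""
    if phrase:
        out.append((dictionary[phrase], ""))   # flush a pending known phrase
    return out

def lz78_decode(pairs):
    """Rebuild the text, recreating the same dictionary on the fly."""
    phrases = [""]
    out = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        out.append(phrase)
        phrases.append(phrase)
    return "".join(out)

pairs = lz78_encode("abababab")
# [(0, 'a'), (0, 'b'), (1, 'b'), (3, 'a'), (2, '')]
```

Each new dictionary entry is an existing phrase extended by one character, which is what makes the procedure recursive and lets the decoder stay in step with the encoder.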

Run-length encoding of repeated characters offers an additional opportunity for compression. If some characters occur several times in a row (in text these may be spaces, in a numerical sequence consecutive zeros, and so on), it makes sense to replace them with a "character, length" or "flag, length" pair. In the first case the code contains a flag indicating that a run is being encoded (usually 1 bit), then the code of the repeated character and the length of the run. In the second case (used for the most frequently repeated characters) the prefix simply contains a repetition flag.
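The "character, length" variant can be sketched in a few lines. This is a simplified illustration with hypothetical function names: every run, even of length 1, is stored as a pair, and the 1-bit flag described above is omitted.

```python
from itertools import groupby

def rle_encode(data: str):
    """Replace each run of identical characters with a (character, length) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(data)]

def rle_decode(pairs):
    """Expand (character, length) pairs back into the original text."""
    return "".join(ch * count for ch, count in pairs)

runs = rle_encode("aaab  c")  # [('a', 3), ('b', 1), (' ', 2), ('c', 1)]
```

A practical encoder would add the flag so that single characters are not expanded into pairs.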

The sound level is the same throughout the composition, there are several pauses.

Narrowing dynamic range

Narrowing the dynamic range, or, put simply, compression, is needed for various purposes, the most common of which are:

1) Achieving a uniform volume level throughout a composition (or an instrument part).

2) Achieving a uniform volume level across the compositions of an album or radio broadcast.

3) Increasing intelligibility, mainly when compressing a particular part (vocals, kick drum).

How does the narrowing of the dynamic range take place?

The compressor analyzes the sound level at the input, comparing it with the user-defined Threshold value.

If the signal level is below the Threshold value, the compressor continues to analyze the sound without changing it. If the sound level exceeds the Threshold value, the compressor goes into action. Since the compressor's role is to narrow the dynamic range, it is logical to assume that it limits the largest and the smallest amplitude values (signal level). In the first stage the largest values are limited: they are reduced with a certain strength, called Ratio. Let us look at an example:

The green curves show the sound level: the greater the amplitude of their swing away from the X axis, the higher the signal level.

The yellow line is the compressor's threshold (Threshold). Raising the threshold moves it away from the X axis; lowering the threshold brings it closer to the X axis. Clearly, the lower the threshold, the more often the compressor is triggered, and vice versa. If the Ratio value is very large, then once the signal reaches the threshold level, all of the subsequent signal is suppressed by the compressor down to silence. If the Ratio value is very small, nothing happens. The choice of Threshold and Ratio values is discussed later. For now we should ask the following question: what is the point of suppressing all of the subsequent sound? Indeed, there is none; we only need to get rid of the amplitude values (peaks) that exceed the Threshold value (marked in red on the graph). The Release parameter exists to solve exactly this problem: it sets the duration of the compression.

The example shows that the first and second threshold crossings last less time than the third. So if the Release parameter is tuned to the first two peaks, part of the third may remain unprocessed (since that crossing lasts longer). If the Release parameter is tuned to the third peak, then an unwanted drop in signal level forms after the first and second peaks are processed.

The same goes for the Ratio parameter. If Ratio is tuned to the first two peaks, the third will not be suppressed enough. If Ratio is tuned to the third peak, the first two peaks will be processed too heavily.

These problems can be solved in two ways:

1) Setting the Attack parameter: a partial solution.

2) Dynamic compression: a complete solution.

The Attack parameter sets the time after which the compressor starts working once the Threshold has been exceeded. If the parameter is close to zero (or equal to zero in the case of parallel compression, see the corresponding article), the compressor starts suppressing the signal immediately and works for the length of time set by the Release parameter. If the attack time is long, the compressor goes into action only after a certain interval has passed (this is needed to give the sound definition). In our case, one can set the threshold (Threshold), release (Release), and compression level (Ratio) to process the first two peaks and set the Attack value close to zero. Then the compressor will suppress the first two peaks, and when processing the third it will suppress it until the signal stops exceeding the Threshold. However, this does not guarantee high-quality sound processing and is close to limiting (a rough cut of all amplitude values above the threshold; in that case the compressor is called a limiter).

Let us look at the result of processing the sound with the compressor:

The peaks have disappeared. Note that the processing settings were fairly gentle and we suppressed only the most prominent amplitude values. In practice the dynamic range is narrowed much more, and this trend is only growing. Many composers believe they are making their music louder; in practice they completely strip it of dynamics for those listeners who may be listening to it at home rather than on the radio.

It remains to consider the last compression parameter, Gain. Gain raises the amplitude of the whole composition and is, in essence, equivalent to another sound-editor tool, normalization. Let us look at the final result:

In our case the compression was justified and improved the sound quality, since the peak that stood out was more of an accident than an intentional result. In addition, the music is clearly rhythmic, so a narrow dynamic range is characteristic of it. In cases where high amplitude values were produced deliberately, compression may be a mistake.
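How the five parameters interact can be sketched in code. The following is a simplified feed-forward design chosen for illustration, not the algorithm of any particular plugin: the function name, the default values, and the one-pole smoothing used for Attack and Release are all assumptions, and Release is treated here in the conventional way, as the time the gain takes to recover.

```python
import math

def compress_signal(samples, sample_rate, threshold_db=-20.0, ratio=4.0,
                    attack_ms=10.0, release_ms=100.0, gain_db=0.0):
    """Feed-forward compressor sketch; samples are floats in [-1, 1]."""
    def smoothing(time_ms):
        # One-pole smoothing coefficient; 0 means "react instantly".
        if time_ms <= 0:
            return 0.0
        return math.exp(-1.0 / (sample_rate * time_ms / 1000.0))

    attack, release = smoothing(attack_ms), smoothing(release_ms)
    makeup = 10.0 ** (gain_db / 20.0)
    reduction_db = 0.0  # current gain reduction in dB (0 = no compression)
    out = []
    for x in samples:
        level_db = 20.0 * math.log10(max(abs(x), 1e-9))
        over_db = level_db - threshold_db
        # Above the threshold, the excess is shrunk to over_db / ratio.
        target = over_db * (1.0 - 1.0 / ratio) if over_db > 0 else 0.0
        coeff = attack if target > reduction_db else release
        reduction_db = coeff * reduction_db + (1.0 - coeff) * target
        out.append(x * 10.0 ** (-reduction_db / 20.0) * makeup)
    return out

# A full-scale sample, 20 dB over a -20 dB threshold at 4:1, with instant attack:
loud = compress_signal([1.0], 44100, attack_ms=0, release_ms=0)
```

With instant attack the full-scale sample comes out 15 dB lower, i.e. 5 dB above the threshold instead of 20 dB, while samples below the threshold pass through unchanged. A very large ratio turns the same code into a limiter.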

Dynamic compression

Dynamic compression differs from non-dynamic compression in that the degree of signal suppression (Ratio) depends on the level of the incoming signal. Dynamic compressors are found in all modern programs; the Ratio and Threshold parameters are controlled through a graph window (each parameter has its own axis):

There is no single standard for displaying the graph: in some programs the Y axis shows the level of the incoming signal, in others, on the contrary, the signal level after compression. In some the point (0,0) is in the upper right corner, in others in the lower left. In any case, moving the mouse cursor over this field changes the numerical values corresponding to the Ratio and Threshold parameters. That is, you set the compression level for each Threshold value, which allows the compression to be configured very flexibly.

Side Chain

A side-chain compressor analyzes the signal of one channel and, when its sound level exceeds the threshold (Threshold), applies compression to another channel. Side-chaining is most useful with instruments that occupy the same frequency range (the kick drum and bass pairing is widely used), but sometimes instruments in different frequency ranges are used, which produces an interesting side-chain effect.

Part Two - Compression Stages

There are three compression stages:

1) The first stage is the compression of individual sounds (single shots).

The sound of any instrument has the following characteristics: attack (Attack), hold (Hold), decay (Decay), sustain level (Sustain), and release (Release).

The stage of compressing individual sounds is divided into two parts:

1.1) Compression of individual sounds of rhythm instruments

The components of a beat often require separate compression to make them crisp. Many people process the kick drum separately from the other rhythm instruments, both at the stage of compressing individual sounds and at the stage of compressing individual parts. This is because it sits in the low-frequency range, where usually only the bass is present besides it. The crispness of a kick drum means the presence of a characteristic click (the kick has very short attack and hold times). If there is no click, the kick needs to be processed with a compressor, with the threshold set to zero and the attack time set between 10 and 50 ms. The compressor's release must finish before the next kick drum hit. The latter can be worked out with the formula 60,000 / BPM, where BPM is the tempo of the composition. For example, 60,000 / 137 = 437.96 (the time in milliseconds until the next strong beat of a composition in 4/4 time).

All of the above applies to other rhythm instruments with a short attack time: they must have an accented click, which must not be suppressed by the compressor at any of the compression stages.
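The tempo formula is easy to wrap in a helper for checking release times. The function names below are hypothetical.

```python
def beat_interval_ms(bpm: float) -> float:
    """Time between consecutive beats in milliseconds: 60,000 / BPM."""
    return 60_000.0 / bpm

def release_fits(release_ms: float, bpm: float) -> bool:
    """True if the compressor's release ends before the next beat."""
    return release_ms < beat_interval_ms(bpm)

interval = round(beat_interval_ms(137), 2)  # 437.96
```

At 137 BPM, for instance, a release of 100 ms ends well before the next hit, while a release of 500 ms would still be active when it arrives.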

1.2) Compression of individual sounds of harmonic instruments

Unlike rhythm instruments, the parts of harmonic instruments are rather rarely assembled from individual sounds. It does not follow, however, that they should not be processed at the level of individual-sound compression. If you use a sample with a recorded part, that belongs to the second level of compression. The present level covers only synthesized harmonic instruments. These may be samplers or synthesizers using various sound-synthesis methods (physical modeling, FM, additive, subtractive, and so on). As you have probably guessed, we are talking about programming the synthesizer's settings. Yes! That is compression too! Almost all synthesizers have a programmable Envelope parameter (ADSR). The envelope sets the attack time (Attack), decay time (Decay), sustain level (Sustain), and release time (Release). And if you tell me that this is not compression of each individual sound, you are my enemy for life!

2) The second stage is the compression of individual parts.

By the compression of individual parts I mean narrowing the dynamic range of a number of individual sounds joined together. This stage also covers recorded parts, including vocals, which need compression to give them definition and intelligibility. When compressing parts, bear in mind that unwanted peaks may appear when individual sounds are added together; they must be dealt with at this stage, because otherwise the picture may get worse at the stage of mixing the whole composition. At the stage of compressing individual parts, the compression applied at the stage of processing individual sounds must be taken into account. If you have achieved a crisp kick drum, careless reprocessing at the second stage can ruin everything. Not every part needs to be processed with a compressor, just as not every individual sound does. I advise inserting an amplitude analyzer just in case, to detect any undesirable side effects of combining individual sounds. Besides compression, at this stage you need to make sure that the parts occupy different frequency ranges as far as possible so that quantization is performed. It is also useful to remember that sound has a property called masking (psychoacoustics):

1) A quiet sound is masked by a loud sound that precedes it.

2) A quiet sound at a low frequency is masked by a loud sound at a high frequency.

For example, in a synthesizer part, notes often begin to play before the previous notes have finished sounding. Sometimes this is necessary (to create harmony, playing style, polyphony), but sometimes not at all: you can trim their tails (Delay, Release) if they are audible in solo mode but inaudible when all the parts are playing. The same applies to effects such as reverb: it should not last until the sound source sounds again. By cutting and removing unneeded signal you make the sound cleaner, and this too can be regarded as compression, because you are removing unnecessary waves.

3) The third stage is the compression of the composition.

When compressing the whole composition, bear in mind that every part is a combination of many individual sounds. Consequently, when combining them and then compressing, you need to make sure that the final compression does not spoil what was achieved in the first two stages. You also need to distinguish between compositions for which a wide dynamic range is important and those for which a narrow one is. For compositions with a wide dynamic range it is enough to insert a compressor that suppresses the short-term peaks formed when the parts are added together. For compositions in which a narrow dynamic range is important, everything is much more complicated. Here the compressors have lately come to be called maximizers. A maximizer is a plugin that combines a compressor, limiter, graphic equalizer, enhancer, and other sound-processing tools. It must also include sound-analysis tools. Mastering, the final processing with a compressor, is needed largely to combat errors made in the previous stages. These are errors not so much of compression (although doing at the last stage what you could have done at the first is itself an error) as of the initial choice of good samples and instruments that do not interfere with each other (in terms of frequency ranges). That is what frequency-response correction is for. It often happens that with heavy compression on the master you have to change the compression and mixing parameters at earlier stages, because with a strong narrowing of the dynamic range, quiet sounds that were previously masked come forward and the sound of individual components of the composition changes.

In these parts I deliberately did not go into specific compression parameters. I thought it necessary to write that, when compressing, you need to pay attention to all sounds and all parts at every stage of creating a composition. Only then will you end up with a result that is harmonious not only in terms of music theory but also in terms of sound engineering.

The table below gives practical advice on processing individual parts. In compression, however, numbers and presets can only suggest the region in which to search. The ideal compression settings depend on each particular case. The gain (Gain) and threshold (Threshold) values assume a normal sound level (sensible use of the whole range).

Part three - compression parameters

Brief reference:

Threshold - determines the level of the incoming signal at which the compressor starts working.

Attack - determines the time after which the compressor starts working.

Ratio - determines the degree to which the amplitude is reduced (relative to the original amplitude).

Release - determines the time after which the compressor stops working.

Gain - determines how much the signal is boosted after it has been processed by the compressor.

Compression Table:

Instrument	Threshold	Attack	Ratio	Release	Gain	Description
Vocals	0 dB	1-2 ms; 2-5 ms; 10 ms; 0.1 ms; 0.1 ms	less than 4:1; 2.5:1; 4:1 – 12:1; 2:1 – 8:1	150 ms; 50-100 ms; 150 ms; 150 ms; 0.5 s		Compression while recording should be minimal; processing at the mixing stage is mandatory to add definition and intelligibility.
Wind instruments		1-5 ms	6:1 – 15:1	0.3 s
Kick drum		10-50 ms; 10-100 ms	4:1 and above; 10:1	50-100 ms; 1 ms		The lower the Threshold, the higher the Ratio and the longer the attack, the stronger the click at the start of the kick.
Synthesizers						Depends on the type of wave (ADSR envelopes).
Snare drum		10-40 ms; 1-5 ms	5:1; 5:1 – 10:1	50 ms; 0.2 s
Hi-hat		20 ms	10:1	1 ms
Tepar microphones		2-5 ms	5:1	1-50 ms
Drums		5 ms	5:1 – 8:1	10 ms
Bass guitar		100-200 ms; 4-10 ms	5:1	1 ms; 10 ms
Strings		0-40 ms	3:1	500 ms
Synth bass		4-10 ms	4:1	10 ms		Depends on envelopes.

Percussion		0-20 ms	10:1	50 ms
Acoustic guitar, piano		10-30 ms; 5-10 ms	4:1; 5:1 – 10:1	50-100 ms; 0.5 s
Electric guitar		2-5 ms	8:1	0.5 s

Final compression		0.1 ms; 0.1 ms	2:1; from 2:1 to 3:1	50 ms; 0.1 ms	0 dB at the output	The attack time depends on the goal: removing peaks or making the track smoother.
Limiter after final compression		0 ms	10:1	10-50 ms	0 dB at the output	Use if you need a narrow dynamic range and a rough "cut" of the waveform.

The information was taken from various sources cited by popular resources on the Internet. The differences in the compression parameters are explained by differences in sonic preferences and in the material being processed.

Dynamic range compression (DRC) is the narrowing (or, in the case of an expander, the widening) of the dynamic range of a recording. Dynamic range is the difference between the quietest and the loudest sound. Sometimes the quietest sound in a recording is only slightly louder than the noise floor, and sometimes it is only slightly quieter than the loudest sound. Hardware devices and programs that perform dynamic compression are called compressors and fall into four main groups: compressors, limiters, expanders and gates.

Tube analog compressor dbx 566

Downward and upward compression

Downward compression reduces the volume of a sound when it starts to exceed a certain threshold, leaving quieter sounds unchanged. The extreme form of downward compression is the limiter. Upward compression, on the contrary, increases the volume of a sound if it is below the threshold, without affecting louder sounds. Both types of compression narrow the dynamic range of the audio signal.

Downward compression

Upward compression
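The two behaviours described above can be sketched in a few lines of Python. This is only an illustration: the function names, the thresholds and the ratios are assumptions chosen for the example, and levels are expressed in decibels.

```python
def downward_compress(level_db, threshold_db, ratio):
    """Reduce levels above the threshold; leave quieter levels unchanged."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


def upward_compress(level_db, threshold_db, ratio):
    """Raise levels below the threshold; leave louder levels unchanged."""
    if level_db >= threshold_db:
        return level_db
    return threshold_db - (threshold_db - level_db) / ratio


# Hypothetical input levels spanning a 60 dB range.
for level in (-60.0, -40.0, -20.0, 0.0):
    print(level,
          downward_compress(level, threshold_db=-20.0, ratio=4.0),
          upward_compress(level, threshold_db=-40.0, ratio=2.0))
```

In both cases the distance between the quietest and the loudest level shrinks: downward compression pulls the top down, upward compression pushes the bottom up.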

Expander and Gate

If a compressor reduces the dynamic range, an expander increases it. When the signal level rises above the threshold, the expander raises it even further, increasing the difference between loud and quiet sounds. Such devices are often used when recording a drum kit, to separate the sounds of some drums from others.

A type of expander that is used not to boost loud sounds but to attenuate quiet sounds below the threshold (for example, background noise) is called a noise gate. In such a device, as soon as the sound level falls below the threshold, the signal stops passing. A gate is typically used to suppress noise in pauses. On some models the sound can be made to fade out gradually rather than cut off abruptly when it crosses the threshold. In that case the rate of attenuation is set by the Decay control.

A gate, like other types of compressors, can be frequency-dependent (that is, process certain frequency bands differently) and can operate in side-chain mode (see below).
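A noise gate with a decay control can be sketched as follows. This is a simplified illustration, not a model of any particular device: the per-sample decay factor and the instantaneous level detection are assumptions made to keep the example short.

```python
def noise_gate(samples, threshold, decay_per_sample=0.5):
    """Pass samples at or above the threshold; fade out quieter ones.

    decay_per_sample = 0.0 gives an abrupt gate, values closer to 1.0
    give a slower fade (a stand-in for the Decay control).
    """
    gain = 1.0
    out = []
    for x in samples:
        if abs(x) >= threshold:
            gain = 1.0                 # gate opens immediately
        else:
            gain *= decay_per_sample   # gate closes gradually
        out.append(x * gain)
    return out


gated = noise_gate([0.5, 0.01, 0.01, 0.6], threshold=0.1)
```

Here the two quiet samples between the loud ones are attenuated progressively, while the loud samples pass unchanged.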

The principle of operation of the compressor

The signal entering the compressor is split into two copies. One copy is sent to an amplifier whose gain is controlled by an external signal; the second copy forms that control signal. It goes to a circuit called the side chain, where the signal level is measured and an envelope describing the change in its volume is built from that data.
Most modern compressors are built this way; this is the so-called feed-forward type. In older devices (feedback type), the signal level is measured after the amplifier.

There are several analog technologies for variable-gain amplification, each with its own advantages and disadvantages: tube, optical (using photoresistors) and transistor. When working with digital audio (in a sound editor or DAW), the software can use its own mathematical algorithms or emulate the behaviour of analog technology.
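The feed-forward structure can be sketched roughly as below. The one-pole envelope follower, the smoothing coefficients and the default threshold and ratio are all assumptions made for illustration; real compressors differ in how they measure level and smooth the gain.

```python
import math


def feed_forward_compressor(samples, threshold_db=-20.0, ratio=4.0,
                            attack_coeff=0.5, release_coeff=0.99):
    """Side chain measures the input; the main path is scaled by the resulting gain."""
    envelope = 0.0
    out = []
    for x in samples:
        level = abs(x)  # side chain: rectified copy of the input
        coeff = attack_coeff if level > envelope else release_coeff
        envelope = coeff * envelope + (1.0 - coeff) * level  # smoothed envelope
        level_db = 20.0 * math.log10(max(envelope, 1e-9))
        over_db = max(0.0, level_db - threshold_db)
        gain_db = -over_db * (1.0 - 1.0 / ratio)      # gain reduction in dB
        out.append(x * 10.0 ** (gain_db / 20.0))      # gain-controlled amplifier
    return out
```

A feedback design would differ only in where the level is measured: it would take `abs()` of the previous output sample instead of the input sample.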

The main parameters of compressors

Threshold.

The compressor reduces the level of an audio signal if its amplitude exceeds a certain threshold. The threshold is usually specified in decibels; a lower threshold (for example, -60 dB) means that more of the sound will be processed than with a higher threshold (for example, -5 dB).

Ratio.

The amount of level reduction is set by the Ratio parameter: a ratio of 4:1 means that if the input level exceeds the threshold by 4 dB, the output level will exceed the threshold by 1 dB.
For example:
Threshold = -10 dB
Input signal = -6 dB (4 dB above the threshold)
Output signal = -9 dB (1 dB above the threshold)
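The arithmetic of this example can be checked with a short Python sketch. The function name is an assumption made for illustration; the numbers are the ones from the example.

```python
def compressed_level_db(input_db, threshold_db, ratio):
    """Output level for a given input level, threshold and ratio (all in dB)."""
    if input_db <= threshold_db:
        return input_db  # below the threshold nothing changes
    return threshold_db + (input_db - threshold_db) / ratio


print(compressed_level_db(-6.0, -10.0, 4.0))           # -9.0, as in the example
print(compressed_level_db(-6.0, -10.0, float("inf")))  # -10.0, an infinite ratio holds the output at the threshold
```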

Keep in mind that the level reduction continues for some time after the signal falls below the threshold; this time is set by the Release parameter.

Compression with the maximum ratio of ∞:1 is called limiting. This means that any signal above the threshold is reduced to the threshold level (except for a short period after a sharp increase in input volume). For details, see "Limiter" below.

Examples of different Ratio values.

Attack and Release

The compressor offers some control over how quickly it responds to changes in signal dynamics. The Attack parameter sets the time it takes the compressor to reduce the gain to the level determined by the Ratio parameter. Release sets the time it takes the compressor to raise the gain again, that is, to return it to normal once the input level drops below the threshold.

Attack and Release phases

These parameters specify the time (usually in milliseconds) needed to change the gain by a certain number of decibels, usually 10 dB. In that case, if Attack is set to 1 ms, reducing the gain by 10 dB takes 1 ms, and reducing it by 20 dB takes 2 ms.

In many compressors the Attack and Release parameters are adjustable, but in some they are preset and cannot be changed. Sometimes they are labelled "Automatic" or "Program dependent", meaning they vary with the input signal.
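Under the convention described above (a time constant quoted per 10 dB of gain change, with the gain moving at a constant rate in dB), the actual transition time scales linearly. A minimal sketch, with a hypothetical function name:

```python
def gain_change_time_ms(time_per_reference_ms, gain_change_db, reference_db=10.0):
    """Time needed to change the gain by gain_change_db, given a time quoted per reference_db."""
    return time_per_reference_ms * gain_change_db / reference_db


print(gain_change_time_ms(1.0, 10.0))  # 1.0 ms for a 10 dB change
print(gain_change_time_ms(1.0, 20.0))  # 2.0 ms for a 20 dB change
```

Not every compressor quotes its time constants this way, so treat the 10 dB reference as an assumption to be checked against the device's documentation.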

Knee.

Another compressor parameter is hard/soft knee. It determines whether compression sets in abruptly (hard) or gradually (soft). A soft knee makes the transition from the unprocessed signal to the compressed signal less noticeable, especially at high Ratio values and sharp increases in volume.

Hard Knee and Soft Knee Compression
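One common way to implement a soft knee is to replace the corner of the curve with a quadratic segment centred on the threshold. The sketch below uses that formulation; it is an assumption for illustration rather than the method of any specific compressor, and `knee_width_db` is a hypothetical parameter name.

```python
def knee_level_db(input_db, threshold_db, ratio, knee_width_db=0.0):
    """Static compression curve; knee_width_db = 0 gives a hard knee."""
    over_db = input_db - threshold_db
    if knee_width_db > 0 and abs(over_db) <= knee_width_db / 2:
        # Inside the knee: blend smoothly from no compression into the full ratio.
        return input_db + (1 / ratio - 1) * (over_db + knee_width_db / 2) ** 2 / (2 * knee_width_db)
    if over_db > 0:
        return threshold_db + over_db / ratio  # fully compressed region
    return input_db                            # untouched region


hard = knee_level_db(-20.0, -20.0, 4.0)                      # exactly at the threshold
soft = knee_level_db(-20.0, -20.0, 4.0, knee_width_db=10.0)  # already slightly reduced
```

With a 10 dB knee the curve starts bending 5 dB below the threshold and reaches the full 4:1 slope 5 dB above it, so there is no abrupt corner.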

PEAK and RMS.

The compressor can respond to peak (short-term maximum) values or to the averaged input level. Using peak values can cause sharp fluctuations in the amount of compression, and even distortion. Compressors therefore apply an averaging function (usually RMS) to the input signal before comparing it with the threshold. This gives a more comfortable compression, closer to the human perception of loudness.

RMS is a parameter that reflects the average loudness of a recording. Mathematically, RMS (root mean square) is the root-mean-square value of the amplitudes of a given number of samples:

$x_{\mathrm{RMS}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}$

where $x_i$ are the sample amplitudes and $n$ is the number of samples.
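The difference between peak and RMS detection is easy to see on a short block of samples. The following is a minimal sketch with hypothetical function names; the sample values are made up for the example.

```python
import math


def peak_level(samples):
    """Largest absolute sample value in the block."""
    return max(abs(x) for x in samples)


def rms_level(samples):
    """Root mean square of the block: sqrt of the mean of the squared samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


block = [0.0, 0.0, 0.0, 1.0]   # a single short spike
print(peak_level(block))        # 1.0
print(rms_level(block))         # 0.5
```

A single spike drives the peak detector to full scale, while the RMS value stays much lower, which is why RMS detection reacts less nervously to short transients.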

Stereo Linking.

A compressor in stereo linking mode applies the same gain to both stereo channels. This avoids a shift of the stereo image, which can result from processing the left and right channels independently. Such a shift occurs if, for example, a loud element is panned off-center.
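The effect of linking can be illustrated with levels in decibels. In this sketch the shared gain is derived from the louder of the two channels; that choice, like the function names and the numbers, is an assumption for illustration (some designs use the sum or the average instead).

```python
def gain_reduction_db(level_db, threshold_db, ratio):
    """Gain (a negative number of dB, or 0) for one measured level."""
    over_db = max(0.0, level_db - threshold_db)
    return -over_db * (1.0 - 1.0 / ratio)


def linked_gain_db(left_db, right_db, threshold_db, ratio):
    """One shared gain for both channels, driven by the louder channel."""
    return gain_reduction_db(max(left_db, right_db), threshold_db, ratio)


left_db, right_db = -4.0, -30.0   # loud element panned to the left
shared = linked_gain_db(left_db, right_db, threshold_db=-20.0, ratio=4.0)
print(left_db + shared, right_db + shared)   # linked: both channels drop by the same amount

left_only = gain_reduction_db(left_db, -20.0, 4.0)
right_only = gain_reduction_db(right_db, -20.0, 4.0)
print(left_db + left_only, right_db + right_only)   # unlinked: only the left channel drops
```

In the linked case the level difference between the channels stays at 26 dB; in the unlinked case it shrinks, which is heard as the image drifting toward the quieter side.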

Makeup Gain.

Since the compressor reduces the overall signal level, a fixed gain stage is usually added at the output so that the optimal level can be restored.

LOOK-AHEAD.

The look-ahead function is designed to solve the problems associated with both too long and too short attack and release times. Too long an attack time does not catch transients effectively, while too short a one may be uncomfortable for the listener. With look-ahead, the main signal is delayed relative to the control signal, which lets compression start in advance, before the signal reaches the threshold.
The only drawback of this method is the delay it adds to the signal, which is undesirable in some cases.
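The idea can be sketched with a very simple limiter: the audio path is delayed by a few samples while the control path looks at the undelayed input. The instantaneous gain computation and the function name are assumptions made to keep the example short; a real design would also smooth the gain.

```python
def look_ahead_limiter(samples, threshold, look_ahead):
    """Delay the audio by look_ahead samples; compute the gain from undelayed input."""
    out = []
    for n in range(len(samples) + look_ahead):
        delayed_index = n - look_ahead
        delayed = samples[delayed_index] if 0 <= delayed_index < len(samples) else 0.0
        # Control path: everything from the delayed sample up to the newest input sample.
        window = samples[max(0, delayed_index):n + 1]
        peak = max((abs(x) for x in window), default=0.0)
        gain = 1.0 if peak <= threshold else threshold / peak
        out.append(delayed * gain)
    return out


limited = look_ahead_limiter([0.1, 0.1, 1.0, 0.1], threshold=0.5, look_ahead=2)
```

The output is two samples longer than the input (the added delay), and the quiet samples just before the spike are already turned down, because the control path has seen the spike coming.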

Use of dynamic compression

Compression is used everywhere: not only in music recordings, but wherever the overall volume must be raised without raising peak levels, where inexpensive playback equipment is used, or where the transmission channel is limited (public address systems, amateur radio, and so on).

Compression is applied when playing background music (in stores, restaurants, etc.), where any noticeable volume changes are undesirable.

But the most important application of dynamic compression is music production and broadcasting. Compression is used to give the sound "density" and "drive", to help the instruments blend with each other, and especially when processing vocals.

Vocal parts in rock and pop music are usually compressed to make them stand out against the accompaniment and to add clarity. A special type of compressor tuned only to certain frequencies, the de-esser, is used to suppress sibilants.

In instrumental parts, compression is also used for effects not directly related to volume; for example, quickly decaying drum sounds can be made to last longer.

In electronic dance music (EDM), side-chaining (see below) is often used: for example, the bass line can be controlled by the kick drum or something similar, to keep the bass and drums from clashing and to create a dynamic pulsation.

Compression is widely used in broadcasting (radio, television, Internet streaming) to increase perceived loudness while reducing the dynamic range of the source audio (usually a CD). Most countries have legal limits on the instantaneous maximum volume that can be broadcast. These limits are typically enforced by permanent hardware compressors in the broadcast chain. In addition, increased perceived loudness improves the "quality" of the sound in the opinion of most listeners.

Side-chaining

Another switch frequently found on compressors is "Side Chain". In this mode the audio is compressed not according to its own level but according to the level of the signal arriving at a separate input, which is usually labelled Side Chain.

This has several applications. For example, a vocalist lisps, and every "s" sticks out of the overall picture. You pass the voice through the compressor and feed the side-chain input with the same sound passed through an equalizer. On the equalizer you cut all frequencies except those the vocalist produces when pronouncing "s", usually around 5 kHz, though it can be anywhere from 3 kHz to 8 kHz. If you then put the compressor into side-chain mode, the voice will be compressed at the moments when an "s" is pronounced. The result is a device known as a de-esser. This way of working is called frequency dependent.

Another use of this feature is called a "ducker". For example, at a radio station the music goes through the compressor and the DJ's voice goes to the side chain. When the DJ starts talking, the music volume is reduced automatically. The effect can also be used in recordings, for example to lower the volume of keyboard parts while the vocalist is singing.
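A ducker reduces to a few lines once the side-chain idea is clear: the gain applied to one signal is decided by the level of another. The sketch below switches the gain instantly and uses made-up values; the function name, threshold and duck gain are assumptions, and a practical ducker would add attack and release smoothing.

```python
def duck(music, voice, threshold=0.1, duck_gain=0.3):
    """Turn the music down whenever the side-chain (voice) level exceeds the threshold."""
    return [m * (duck_gain if abs(v) > threshold else 1.0)
            for m, v in zip(music, voice)]


music = [1.0, 1.0, 1.0]
voice = [0.0, 0.5, 0.0]     # the DJ speaks only during the middle sample
print(duck(music, voice))   # [1.0, 0.3, 1.0]
```

A de-esser follows the same pattern, except that the side-chain signal is a filtered copy of the voice itself rather than a separate source.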

Brick Wall Limiting

A compressor and a limiter work in roughly the same way; one can say that a limiter is a compressor with a high ratio (10:1 and above) and, usually, a short attack time.

There is also the concept of brick-wall limiting: limiting with a very high ratio (20:1 and above) and a very fast attack. Ideally it does not let the signal exceed the threshold at all. The result is unpleasant to the ear, but it prevents damage to playback equipment or exceeding the channel bandwidth. Many manufacturers build limiters into their devices for exactly this purpose.

Clipper VS. Limiter, Soft and Hard Clipping

Let us ask: why do we need to raise the volume at all? To hear quiet sounds that are inaudible in our listening conditions (for example, when you cannot listen loudly, or when there is extraneous noise in the room). Is it possible to boost the quiet sounds without touching the loud ones? It turns out it is. This technique is called dynamic range compression (DRC). To do it, the volume must be adjusted continuously: quiet sounds are boosted, loud ones are not. The simplest law of volume change is linear, i.e. the volume (in decibels) changes according to OUTPUT_LOUDNESS = k * INPUT_LOUDNESS, where k is the dynamic range compression ratio:

Figure 18. Compression of the dynamic range.

When k = 1, no changes are made (the output volume equals the input volume). When k < 1, the volume increases and the dynamic range narrows. Look at the graph (k = 1/2): a quiet sound at -50 dB becomes louder by 25 dB, which is significantly louder, while the volume of dialogue (-27 dB) rises by only 13.5 dB, and the volume of the loudest sounds (0 dB) does not change at all. When k > 1, the volume decreases and the dynamic range widens.

Let's look at the volume graphs (k = 1/2: the dynamic range is compressed by a factor of two):

Figure 19. Volume graphs.

As can be seen, the original contained both very quiet sounds, 30 dB below the level of the dialogue, and very loud ones, 30 dB above the level of the dialogue, so the dynamic range was 60 dB. After compression, the loud sounds are only 15 dB above and the quiet ones 15 dB below the level of the dialogue (the dynamic range is now 30 dB). Thus the loud sounds have become much quieter and the quiet ones much louder, and no overflow occurs!
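The linear law is simple enough to check numerically. The sketch below applies it to levels in decibels relative to full scale (0 dB is the loudest possible sound); the function name is an assumption made for illustration.

```python
def drc_level_db(input_db, k):
    """Linear dynamic range compression law applied to a level in dB."""
    return k * input_db


for level in (-50.0, -27.0, 0.0):
    out = drc_level_db(level, 0.5)
    print(level, out, out - level)   # input level, output level, boost

# A 60 dB span of input levels maps onto a 30 dB span of output levels.
span = drc_level_db(0.0, 0.5) - drc_level_db(-60.0, 0.5)
```

The loop reproduces the boosts quoted for k = 1/2: 25 dB at -50 dB, 13.5 dB at -27 dB and none at 0 dB.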

Now let's turn to histograms:

Figure 20. Example of compression.

As can clearly be seen, with amplification of up to +30 dB the shape of the histogram is well preserved, which means that the loud sounds remain well defined (they do not hit the maximum and are not clipped, as happens with simple amplification). At the same time the quiet sounds are brought out. The histogram shows this poorly, but the difference is very noticeable to the ear. The drawback of the method is the same: jumps in volume. However, the mechanism behind them differs from the volume jumps caused by clipping, and their character is different: they show up mainly when quiet sounds are boosted very strongly (and not when loud sounds are clipped, as with ordinary amplification). An excessive level of compression flattens the sound picture: all sounds tend toward the same volume and become inexpressive.

Strong boosting of quiet sounds can make the noise in the recording audible. The filter therefore uses a slightly modified algorithm so that the noise level rises less:

Figure 21. Increase volume, without increasing noise.

That is, at a volume level of -50 dB the transfer function has a bend, and noise is boosted less (yellow line). Without this bend the noise would be considerably louder (gray line). This simple modification significantly reduces the amount of noise even at very strong compression levels (in the figure, 1:5 compression). The "DRC" level in the filter sets the amount of gain for quiet sounds (at -50 dB), so the 1/5 compression level shown in the figure corresponds to the +40 dB level in the filter settings.
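A transfer function with such a bend might look like the sketch below. The exact shape of the filter's curve below -50 dB is not given in the text, so this example simply assumes that the gain stops growing below the bend; the function name and that assumption are both hypothetical.

```python
def drc_with_bend_db(input_db, k, bend_db=-50.0):
    """Linear DRC law above the bend; constant gain below it (an assumed shape)."""
    if input_db >= bend_db:
        return k * input_db
    gain_at_bend_db = k * bend_db - bend_db   # boost applied exactly at the bend
    return input_db + gain_at_bend_db         # quieter sounds get no extra boost


k = 1 / 5
print(drc_with_bend_db(-50.0, k) - (-50.0))   # boost at the bend, about +40 dB
print(drc_with_bend_db(-80.0, k))             # noise at -80 dB with the bend
print(k * -80.0)                              # the same noise without the bend
```

With k = 1/5 the boost at -50 dB comes out at about 40 dB, matching the correspondence between the 1/5 compression level and the +40 dB setting, while noise well below the bend ends up much quieter than it would under the plain linear law.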


Records, especially old ones recorded and pressed before 1982, are much less likely to have been mixed to make the recording louder. They reproduce natural music with a natural dynamic range that is preserved on the record and lost in most standard digital or high-resolution formats.

Of course there are exceptions: listen to a recent Steven Wilson album, or to releases from MA Recordings or Reference Recordings, and you will hear how good digital sound can be. But that is rare; most modern recordings are loud and compressed.

Music compression has recently come in for serious criticism, but I am ready to argue that almost all of your favorite records are compressed. Some less, some more, but compressed all the same. Dynamic range compression is a kind of scapegoat that gets blamed for bad-sounding music, yet heavily compressed music is not a new trend: listen to albums from the 60s. The same can be said of the classic work of Led Zeppelin or the more recent albums of Wilco and Radiohead. Dynamic range compression reduces the natural ratio between loud and quiet sounds on a recording, so a whisper can be as loud as a scream. It is quite hard to find pop music from the last 50 years that has not been compressed.

I recently had a pleasant conversation with Larry Crane, founder and editor of Tape Op magazine, about the good, the bad and the "evil" aspects of compression. Larry Crane has worked with bands and artists such as Stephen Malkmus, Cat Power, Sleater-Kinney, Jenny Lewis, M. Ward, The Go-Betweens, Jason Lytle, Elliott Smith, Quasi and Richmond Fontaine. He also runs the recording studio Jackpot! in Portland, Oregon, which has been home to The Breeders, The Decemberists, Eddie Vedder, Pavement, R.E.M., She & Him and many, many others.

As an example of songs that sound surprisingly unnatural yet are still excellent, I cited Spoon's album "They Want My Soul", released in 2014. Crane laughs and says that he listens to it in the car, because it sounds great there. Which brings us to another answer to the question of why music is compressed: because compression and the extra "clarity" make it easier to hear in noisy places.

Larry Crane at work. Photo by Jason Quigley

When people say that they like the sound of a recording, I believe they like the music, as if sound and music were inseparable terms. For myself, though, I distinguish between the two. From an audiophile's point of view the sound may be rough and raw, but that will not matter to most listeners.

Many are quick to accuse mastering engineers of abusing compression, but compression is applied during recording, during mixing, and only then during mastering. Unless you were personally present at each of these stages, you cannot say how the instruments and vocal parts sounded at the very start of the process.

Crane was on a roll: "If a musician deliberately wants the sound to be crazy and distorted, like a Guided by Voices record, there is nothing wrong with that: intent always outweighs sound quality." The performer's voice is almost always compressed, and the same goes for bass, drums, guitars and synthesizers. Compression keeps the vocal at the desired level throughout the song or makes it stand out slightly against the other sounds.

Properly applied compression can make drums sound livelier or intentionally strange. For music to sound great, you need to know how to use the tools for the job. That is why it takes years to learn how to use compression without overdoing it. If the mix engineer squeezes a guitar part too hard, the mastering engineer will no longer be able to fully restore the missing frequencies.

If musicians wanted you to hear music that had not been through mixing and mastering, they would release it to store shelves straight from the studio. Crane says that the people who record, edit, mix and master music are not there to get in the musicians' way: they have been helping performers from the very beginning, that is, for more than a hundred years.

These people are part of the creative process that results in amazing works of art. Crane adds: "You do not need a version of The Dark Side of the Moon that has not been through mixing and mastering." Pink Floyd released the album the way they wanted it to be heard.