Synthesis and speech recognition. Modern solutions

Compression is one of the most myth-ridden topics in sound production. They say Beethoven even used it to scare the neighbors' children :(

In fact, applying compression is no harder than using distortion; the main thing is to understand how it works and to have good monitoring. Let's make sure of that together.

What is sound compression

The first thing to understand before we start: compression works with the dynamic range of a sound. Dynamic range, in turn, is nothing more than the difference between the loudest and the quietest level of the signal.

So compression is the compression of dynamic range. Simply put, Dynamic Range Compression means lowering the level of the loud parts of the signal and raising the volume of the quiet ones. Nothing more.

You may reasonably wonder what all the hype is about. Why does everyone talk about recipes for setting up compressors correctly, yet nobody shares them? Why, despite the huge number of excellent plugins, do many studios still use expensive, rare compressor models? Why do some producers run compressors at extreme settings while others do not use them at all? And which of them is right in the end?

Tasks that compression solves

The answers to these questions lie in understanding the role compression plays in working with sound. It allows you to:

  1. Emphasize a sound, making it more pronounced;
  2. "Seat" individual instrument parts in the mix by adding power and "weight";
  3. Make groups of instruments, or the whole mix, more cohesive, like a single monolith;
  4. Resolve conflicts between instruments using sidechain;
  5. Correct the flaws of a vocalist or musicians by evening out their dynamics;
  6. Act as an artistic effect with certain settings.

As you can see, this is no less significant a creative process than, say, inventing melodies or designing interesting timbres. And every one of the tasks above can be solved using four main parameters.

The main parameters of the compressor

Despite the huge number of software and hardware compressor models, all the "magic" of compression happens when the basic parameters are set properly: Threshold, Ratio, Attack and Release. Let's look at them in more detail:

Threshold, dB

This parameter sets the level above which the compressor will work (that is, compress the audio signal). So, if we set the threshold to -12 dB, the compressor will act only on those parts of the dynamic range that exceed this value. If all of our sound is quieter than -12 dB, the compressor will simply pass it through without affecting it.

Ratio, or compression ratio

The Ratio parameter determines how strongly the part of the signal that exceeds the threshold is compressed. A little math to complete the picture: say we set up a compressor with a Threshold of -12 dB and a Ratio of 2:1 and fed it a drum loop in which the kick drum peaks at -4 dB. What will the compressor do in this case?

In our case the kick level exceeds the threshold by 8 dB. In accordance with the Ratio, this difference is compressed to 4 dB (8 dB / 2). Added to the unprocessed part of the signal, this means that after the compressor the kick will peak at -8 dB (Threshold of -12 dB + compressed excess of 4 dB).
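The same arithmetic can be written as a tiny function. This is only a sketch of the steady-state level calculation for a hard-knee compressor; the function name `compressed_level_db` is made up for illustration and does not come from any particular plugin.

```python
def compressed_level_db(level_db, threshold_db, ratio):
    """Steady-state output level (dB) of a hard-knee downward compressor."""
    if level_db <= threshold_db:
        # Below the threshold the signal passes through unchanged.
        return level_db
    # Only the part above the threshold is divided by the ratio.
    excess_db = level_db - threshold_db
    return threshold_db + excess_db / ratio


# The kick drum from the example: Threshold -12 dB, Ratio 2:1, input -4 dB.
print(compressed_level_db(-4.0, -12.0, 2.0))   # -8.0
# A sound quieter than the threshold is left alone.
print(compressed_level_db(-20.0, -12.0, 2.0))  # -20.0
```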

Attack, ms

This is the time after which the compressor responds once the threshold is exceeded. That is, if the attack time is above 0 ms, the compressor starts compressing the signal that exceeds the threshold not instantly, but after the specified time.

Release, ms

The opposite of attack: this parameter specifies how long after the signal level falls back below the Threshold the compressor will stop compressing.
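To get a feel for what Attack and Release do, here is a minimal sketch. It assumes one common digital approach in which the requested gain reduction is smoothed by a one-pole filter with separate attack and release time constants; real compressors differ in detail, and the function name and the 1000 Hz "sample rate" (one sample per millisecond) are illustrative choices only.

```python
import math


def smooth_gain_reduction(target_db, attack_ms, release_ms, sample_rate=1000):
    """Smooth a per-sample target gain reduction (dB, >= 0) over time."""
    def coeff(time_ms):
        # A time of 0 ms means "react instantly".
        if time_ms <= 0:
            return 0.0
        return math.exp(-1.0 / (time_ms * 0.001 * sample_rate))

    a_attack, a_release = coeff(attack_ms), coeff(release_ms)
    current, smoothed = 0.0, []
    for target in target_db:
        # Rising reduction uses the attack time, falling uses the release time.
        a = a_attack if target > current else a_release
        current = a * current + (1.0 - a) * target
        smoothed.append(current)
    return smoothed


# 4 dB of reduction is requested for 50 ms, then none for the next 50 ms.
target = [4.0] * 50 + [0.0] * 50
gr = smooth_gain_reduction(target, attack_ms=10, release_ms=20)
print(round(gr[0], 2), round(gr[49], 2), round(gr[99], 2))
```

With a longer attack the reduction builds up more slowly, so more of the initial transient gets through; with a longer release the compressor keeps holding the level down after the peak has passed.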

Before we move on, I strongly recommend taking a sample you know well, putting any compressor on its channel and experimenting with the parameters above for 5-10 minutes to fix the material in memory.

All the remaining parameters are optional. They may differ between compressor models, which is partly why producers use different models for specific purposes (for example, one compressor for vocals, another for the drum group, a third on the master channel). I will not dwell on these parameters in detail, but will only give general information so you understand what they are:

  • Knee (Hard / Soft Knee). This parameter defines how the compression ratio (Ratio) is applied: abruptly or along a smooth curve. Note that in Soft Knee mode the compressor does not kick in abruptly but starts working smoothly (as far as that word applies when we are talking about milliseconds) already before the Threshold value is reached. Soft Knee is more often used for processing channel groups and the overall mix (since it works unnoticed), and Hard Knee for emphasizing the attack and other features of individual instruments;
  • Response mode: Peak / RMS. Peak mode is justified when you need to strictly limit amplitude bursts, and also on signals with a complex shape whose dynamics and legibility need to be conveyed fully. RMS mode affects the sound very gently, letting you thicken it while preserving the attack;
  • Look-ahead. This is the time by which the compressor knows in advance what is coming. A kind of preliminary analysis of the incoming signal;
  • Makeup or Gain. A parameter that lets you compensate for the drop in volume caused by compression.
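The difference between Hard and Soft Knee is easiest to see on the static level curve. The sketch below uses one widely used quadratic soft-knee formula; the knee width parameter `knee_db` and the function name are assumptions made for illustration, and a specific compressor may shape its knee differently.

```python
def soft_knee_level_db(level_db, threshold_db, ratio, knee_db=0.0):
    """Steady-state output level (dB); knee_db = 0 gives a hard knee."""
    over = level_db - threshold_db
    if knee_db > 0 and abs(2 * over) <= knee_db:
        # Inside the knee: bend gradually from 1:1 towards the full ratio.
        return level_db + (1 / ratio - 1) * (over + knee_db / 2) ** 2 / (2 * knee_db)
    if over <= 0:
        # Well below the threshold: no compression.
        return level_db
    # Well above the threshold: full ratio.
    return threshold_db + over / ratio


# Hard knee: a signal exactly at the threshold is untouched.
print(soft_knee_level_db(-12.0, -12.0, 2.0))
# Soft knee (6 dB wide): compression has already started at the threshold.
print(soft_knee_level_db(-12.0, -12.0, 2.0, knee_db=6.0))
```

Makeup gain would simply be added to the result, for example `soft_knee_level_db(...) + 4.0` to bring the compressed signal back up by 4 dB.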

The first and most important piece of advice, which removes all further questions about compression: if you a) have understood the principle of compression, b) know firmly how each parameter affects the sound, and c) have managed to try several different models in practice, you no longer need any advice.

I am absolutely serious. If you have read this article carefully and experimented with your DAW's stock compressor and one or two plugins, but still have not understood when to set long attack times, which ratio to use and in which mode to process the source signal, then you will keep searching the Internet for ready-made recipes and applying them thoughtlessly wherever you can.

Recipes for exact compressor settings are about as useful as recipes for exact reverb or chorus settings: they are meaningless and have nothing to do with real work. So I insistently repeat the only correct recipe: arm yourself with this article, good monitor headphones and a plugin for visually monitoring the waveform, and spend an evening in the company of a couple of compressors.

Act!

Vinyl records, especially old ones recorded and manufactured before 1982, are far less likely to have been processed to make the record louder. They reproduce natural music with a natural dynamic range, which is preserved on the record and lost in most standard or high-resolution digital formats.

Of course, there are exceptions: listen to a recent Steven Wilson album, or to releases from MA Recordings or Reference Recordings, and you will hear how good digital sound can be. But this is rare; most modern recordings are loud and compressed.

Music compression has come in for serious criticism lately, but I am willing to bet that almost all of your favorite records are compressed. Some less, some more, but compressed all the same. Dynamic range compression is a kind of scapegoat that gets blamed for bad-sounding music, yet heavily compressed music is not a new trend: listen to albums from the 60s. The same can be said of classic works by Led Zeppelin or more recent albums by Wilco and Radiohead. Dynamic range compression reduces the natural ratio between loud and quiet sounds on a record, so a whisper can be as loud as a scream. It is quite hard to find pop music from the last 50 years that has not been compressed.

I recently had a nice talk with Larry Crane, founder and editor of Tape Op magazine, about the good, bad and "evil" aspects of compression. Larry Crane has worked with such bands and artists as Stephen Malkmus, Cat Power, Sleater-Kinney, Jenny Lewis, M. Ward, The Go-Betweens, Jason Lytle, Elliott Smith, Quasi and Richmond Fontaine. He also runs the Jackpot! recording studio in Portland, Oregon, which has hosted The Breeders, The Decemberists, Eddie Vedder, Pavement, R.E.M., She & Him and many others.

As an example of surprisingly unnatural-sounding but still excellent songs, I bring up Spoon's album "They Want My Soul", released in 2014. Crane laughs and says that he listens to it in the car, because it sounds great there. Which brings us to another answer to the question of why music is compressed: because compression and the added "clarity" let you hear it better in noisy places.

Larry Crane at work. Photo by Jason Quigley

When people say they like the sound of a recording, I believe they like the music, as if sound and music were inseparable terms. But I separate these concepts for myself. From an audiophile's point of view the sound can be rough and raw, but that will not matter to most listeners.

Many are quick to accuse mastering engineers of abusing compression, but compression is applied during recording itself, during mixing and only then during mastering. Unless you personally attended each of these stages, you cannot say how the instruments and vocal parts sounded at the very start of the process.

Crane was on a roll: "If a musician deliberately wants the sound to be crazy and distorted, like a Guided by Voices record, there is nothing wrong with that; the intent always outweighs sound quality." The performer's voice is almost always compressed, and the same goes for bass, drums, guitars and synthesizers. Compression keeps the vocal at the desired level throughout the song or makes it stand out slightly against the other sounds.

Properly applied compression can make the drums sound more alive or intentionally strange. For music to sound great, you need to know how to use the tools required for it. That is why it takes years to learn how to use compression without overdoing it. If the mix engineer squashed a guitar part too much, the mastering engineer will no longer be able to fully restore the missing frequencies.

If musicians wanted you to hear music that had not been through mixing and mastering, it would go to store shelves straight from the studio. Crane says that the people who record, edit, mix and master music are not there to get in the musicians' way: they have been helping performers from the very beginning, that is, for more than a hundred years.

These people are part of the creative process that results in amazing works of art. Crane adds: "You don't need a version of The Dark Side of the Moon that has not been through mixing and mastering." Pink Floyd released it the way they wanted it to be heard.

Let's think about the question: why do we need to raise the volume at all? In order to hear quiet sounds that cannot be heard in our conditions (for example, if you cannot listen loudly, if there is extraneous noise in the room, and so on). Is it possible to amplify the quiet sounds while leaving the loud ones alone? It turns out it is. This technique is called dynamic range compression (DRC). To do it, you need to change the current volume continuously: amplify quiet sounds, but not loud ones. The simplest law of volume change is linear, i.e. the volume varies according to the law output_loudness = k * input_loudness, where k is the dynamic range compression ratio:

Figure 18. Compression of the dynamic range.

When k = 1, no changes are made (the output volume equals the input volume). When k < 1, the volume increases and the dynamic range narrows. Look at the graph (k = 1/2): a quiet sound at -50 dB becomes 25 dB louder, which is significantly louder, while the volume of dialogue (-27 dB) rises by only 13.5 dB, and the volume of the loudest sounds (0 dB) does not change at all. When k > 1, the volume decreases and the dynamic range widens.
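The linear law is easy to check numerically. The sketch below applies output = k * input to levels expressed in dB relative to full scale (0 dB is the loudest possible sound); the function name `drc_linear_db` is made up for illustration.

```python
def drc_linear_db(input_db, k):
    """Linear dynamic range compression law applied to a level in dB."""
    return k * input_db


# k = 1/2: quiet sounds gain the most, the loudest ones are untouched.
for level in (-50.0, -27.0, 0.0):
    out = drc_linear_db(level, 0.5)
    print(level, "->", out, "dB, gain", out - level, "dB")
# -50.0 -> -25.0 dB, gain 25.0 dB
# -27.0 -> -13.5 dB, gain 13.5 dB
# 0.0 -> 0.0 dB, gain 0.0 dB
```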

Let's look at the volume graphs (k = 1/2: the dynamic range is compressed by half):

Figure 19. Volume graphs.

As can be seen, the original contained both very quiet sounds, 30 dB below the level of dialogue, and very loud ones, 30 dB above the level of dialogue, so the dynamic range was 60 dB. After compression, loud sounds are only 15 dB above and quiet sounds 15 dB below the level of dialogue (the dynamic range is now 30 dB). Thus, loud sounds have become considerably quieter and quiet sounds considerably louder. And no overflow (clipping) occurs!

Now let's turn to histograms:

Figure 20. Example of compression.

As can clearly be seen, with amplification of up to +30 dB the shape of the histogram is well preserved, which means that loud sounds remain well defined (they do not hit the maximum and are not clipped, as happens with simple amplification). At the same time, quiet sounds are brought out. The histogram shows this poorly, but the difference is very noticeable to the ear. The drawback of the method is the same volume jumps. However, the mechanism behind them differs from that of the volume jumps caused by clipping, and their character is different: they appear mainly when quiet sounds are amplified very strongly (and not when loud ones are clipped, as with ordinary gain). An excessive level of compression flattens the sound picture: all sounds tend toward the same volume and become inexpressive.

Strong amplification of quiet sounds can make the noise in the recording audible. Therefore the filter uses a slightly modified algorithm so that the noise level rises less:

Figure 21. Increasing the volume without increasing the noise.

That is, at a volume level of -50 dB the transfer function has a bend, and noise is amplified less (yellow line). Without this bend, noise would be considerably louder (gray line). This simple modification significantly reduces the amount of noise even at very strong compression levels (in the figure, compression is 1:5). The "DRC" level in the filter sets the amount of amplification for quiet sounds (at -50 dB), so the 1/5 compression level shown in the figure corresponds to a level of +40 dB in the filter settings.
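The link between the compression ratio k and the "DRC" level can be worked out from the linear law: a sound at -50 dB is moved to k * (-50) dB, so its gain is 50 * (1 - k) dB. The sketch below only illustrates this arithmetic; the function names are hypothetical and the bend of the curve below -50 dB is not modelled.

```python
REFERENCE_DB = -50.0  # level at which the filter's "DRC" setting is defined


def drc_level_db(k):
    """Gain (dB) applied to a sound at -50 dB for compression ratio k."""
    return k * REFERENCE_DB - REFERENCE_DB


def k_from_drc_level(level_db):
    """Compression ratio k that gives the requested gain at -50 dB."""
    return 1.0 + level_db / REFERENCE_DB


print(round(drc_level_db(1 / 5), 2))     # 40.0, the 1:5 case from the figure
print(round(drc_level_db(1 / 2), 2))     # 25.0, the k = 1/2 case
print(round(k_from_drc_level(40.0), 2))  # 0.2
```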

The second part of the series is devoted to the functions that optimize the dynamic range of images. In it we explain why such solutions are needed and consider the various ways they are implemented, along with their advantages and disadvantages.

Through an immaterial

Ideally, a camera should capture the image of the surrounding world the way a person perceives it. However, because the mechanisms of "vision" of a camera and the human eye differ significantly, there are a number of limitations that prevent this condition from being met.

One of the problems that users of film cameras used to face, and that owners of digital cameras face now, is the inability to adequately capture scenes with a large difference in brightness without special accessories and/or special shooting techniques. The human visual system perceives the details of high-contrast scenes equally well in brightly lit and in dark areas. Unfortunately, the camera sensor is not always able to capture the image as we see it.

The greater the brightness difference in the scene being photographed, the higher the probability of losing detail in the highlights and/or shadows. As a result, instead of a blue sky with lush clouds the picture shows only a whitish spot, and objects in the shade turn into vague dark silhouettes or merge with their surroundings.

In classical photography, the ability of a camera (or of the medium, in the case of film cameras) to reproduce a certain range of brightness is described by the concept of photographic latitude (for details, see the sidebar). Theoretically, the photographic latitude of digital cameras is determined by the bit depth of the analog-to-digital converter (ADC). For example, with an 8-bit ADC, taking the quantization error into account, the theoretically achievable photographic latitude is 7 EV, with a 12-bit ADC it is 11 EV, and so on. However, in real devices the dynamic range of images turns out to be below the theoretical maximum because of various kinds of noise and other factors.

A large difference in brightness levels is a serious problem when taking pictures. In this case the capabilities of the camera were not enough to adequately reproduce the bright areas of the scene, and as a result a white "patch" appeared instead of a blue section of sky (marked with an outline)

The maximum brightness value that a photosensitive sensor can record is determined by the saturation level of its cells. The minimum value depends on several factors, including the thermal noise of the sensor, charge transfer noise and ADC error.

It should also be noted that the photographic latitude of the same digital camera may vary depending on the sensitivity set in the settings. The maximum dynamic range is achieved at the so-called base sensitivity (the lowest available numerical value). As the sensitivity is increased, the dynamic range shrinks because of the rising noise level.

The photographic latitude of modern digital cameras equipped with large sensors and 14- or 16-bit ADCs is 9 to 11 EV, which is significantly higher than that of 35 mm color negative film (4 to 5 EV on average). Thus, even relatively inexpensive digital cameras have enough photographic latitude to adequately reproduce most typical scenes of amateur photography.

However, there is a problem of another kind. It stems from the limitations imposed by existing standards for recording digital images. With the JPEG format at 8 bits per color channel (which has become the de facto standard for recording digital images in the computer industry and digital devices), it is impossible even in theory to save a picture with a photographic latitude of more than 8 EV.

Suppose the camera's ADC produces an image with a bit depth of 12 or 14 bits that contains distinguishable detail in both the highlights and the shadows. However, if the photographic latitude of this image exceeds 8 EV, then converting it to the standard 8-bit format without any additional steps (that is, simply by discarding the "extra" bits) loses part of the information recorded by the photosensitive sensor.
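A small sketch shows what "simply discarding the extra bits" does to shadow detail. It assumes 12-bit linear sample values (0 to 4095) and drops the four least significant bits; the function name is illustrative, and real cameras apply gamma and other processing before the 8-bit conversion.

```python
def truncate_to_8_bit(value_12_bit):
    """Convert a 12-bit sample to 8 bits by discarding the 4 lowest bits."""
    return value_12_bit >> 4


# Sixteen distinct shadow levels of the 12-bit image collapse into one.
shadows = [0, 3, 7, 12, 15]
print([truncate_to_8_bit(v) for v in shadows])  # [0, 0, 0, 0, 0]
# The brightest 12-bit value maps to the brightest 8-bit value.
print(truncate_to_8_bit(4095))                  # 255
```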

Dynamic range and photographic latitude

Put simply, dynamic range is defined as the ratio of the maximum brightness value of an image to its minimum value. In classical photography the term photographic latitude is traditionally used, which essentially means the same thing.

The width of the dynamic range can be expressed as a ratio (for example, 1000:1, 2500:1, etc.), but a logarithmic scale is used most often. In this case the decimal logarithm of the ratio of the maximum brightness to the minimum is calculated, and the number is followed by the capital letter D (from the English density) or, less often, the abbreviation OD (from the English optical density). For example, if the ratio of the maximum brightness to the minimum for some device is 1000:1, then the dynamic range is 3.0 D, since lg(1000/1) = 3.0.

Photographic latitude is traditionally measured in exposure units, abbreviated EV (from the English exposure values; professionals often call them "stops" or "steps"). It is in these units that exposure compensation is usually set in camera settings. Increasing the photographic latitude by 1 EV is equivalent to doubling the ratio between the maximum and minimum brightness levels. Thus the EV scale is also logarithmic, but the numerical values are calculated using a base-2 logarithm. For example, if a device can record images in which the ratio of the maximum brightness to the minimum reaches 256:1, then its photographic latitude is 8 EV, since log2(256/1) = 8.
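Both scales are just logarithms of the same contrast ratio, which a couple of lines of code make explicit. The function names here are invented for illustration.

```python
import math


def density_d(contrast_ratio):
    """Dynamic range in D units: decimal logarithm of the max/min ratio."""
    return math.log10(contrast_ratio)


def latitude_ev(contrast_ratio):
    """Photographic latitude in EV: base-2 logarithm of the max/min ratio."""
    return math.log2(contrast_ratio)


print(round(density_d(1000), 2))   # 3.0 D for a 1000:1 ratio
print(round(latitude_ev(256), 2))  # 8.0 EV for a 256:1 ratio
# The same 1000:1 ratio expressed in EV is a little under 10 stops.
print(round(latitude_ev(1000), 2))
```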

Compression - reasonable compromise

The most effective way to keep all the image information recorded by the camera's photosensitive sensor is to save pictures in RAW format. However, far from all cameras offer this function, and not every photographer is ready for the painstaking work of choosing individual settings for each shot.

To reduce the likelihood of losing detail in high-contrast pictures converted to 8-bit JPEG inside the camera, many manufacturers have introduced special functions in their devices (not only compacts but also SLRs) that compress the dynamic range of saved images without user intervention. At the cost of lower overall contrast and the loss of a small part of the source image information, such solutions make it possible to keep, in 8-bit JPEG format, the highlight and shadow detail recorded by the sensor, even if the dynamic range of the source image was wider than 8 EV.

One of the pioneers in this area was HP. The HP Photosmart 945 digital camera, released in 2003, was the first to implement HP Adaptive Lighting technology, which automatically compensates for insufficient illumination in the dark areas of pictures and thus preserves shadow detail without the risk of overexposure (which is very relevant when shooting high-contrast scenes). The HP Adaptive Lighting algorithm is based on the principles set out by the American scientist Edwin Land in his Retinex theory of visual perception.

HP Adaptive Lighting Features Menu

How does the Adaptive Lighting function work? After the 12-bit image of the shot is obtained, an auxiliary monochrome image is extracted from it, which is essentially a lightness map. When the shot is processed, this map is used as a mask that controls how strongly a rather complex digital filter acts on the image. Thus, in areas corresponding to the darkest points of the map the effect on the future picture is minimal, and vice versa. This approach reveals shadow detail by selectively lightening these areas and, accordingly, reduces the overall contrast of the resulting image.

It should be noted that when Adaptive Lighting is turned on, the shot is processed in the manner described above before the finished image is written to the file. All the described operations are performed automatically, and the user can only select one of the two Adaptive Lighting modes in the camera menu (low or high level of effect) or disable the function.

Generally speaking, many specific functions of modern digital cameras (including face recognition, discussed in the previous article) are a kind of by-product or spin-off of research and development work originally carried out for military customers. As for functions that optimize the dynamic range of images, one of the best-known providers of such solutions is Apical. The algorithms created by its staff underlie, in particular, the SAT function (Shadow Adjustment Technology) implemented in a number of Olympus digital cameras. In short, the SAT function works as follows: a mask corresponding to the darkest areas is created from the source image, and the exposure is then automatically corrected for these areas.

Sony has also acquired a license to use Apical's developments. Many compact Cyber-shot models and Alpha series SLR cameras implement the so-called Dynamic Range Optimizer (DRO) function.

Photographs taken with an HP Photosmart R927 camera with the Adaptive Lighting function disabled (top) and enabled

With DRO activated, the picture is corrected during primary image processing (that is, before the finished JPEG file is recorded). In the basic version, DRO has a two-level setting (you can select the standard or the advanced mode in the menu). In standard mode, the exposure value is corrected based on image analysis, and then a tone curve is applied to the image to even out the overall balance. In advanced mode a more complex algorithm is used, which allows correction in both the shadows and the highlights.

Sony's developers are constantly improving the DRO algorithm. For example, in the A700 SLR camera, when advanced DRO is activated, one of five correction levels can be selected. In addition, it is possible to save three versions of one shot (a kind of bracketing) with different DRO settings.

Many Nikon digital cameras have a D-Lighting function, which is also based on Apical algorithms. However, unlike the solutions described above, D-Lighting is implemented as a filter for processing previously saved images by means of a tone curve whose shape lightens the shadows while leaving the rest of the image unchanged. But since in this case finished 8-bit images are processed (rather than the original frame, which has a higher bit depth and, accordingly, a wider dynamic range), the capabilities of D-Lighting are very limited. The user can obtain the same result by processing the picture in a graphics editor.

When enlarged fragments are compared, it is clearly noticeable that the dark areas of the original image (left) became lighter when the Adaptive Lighting function was turned on

There are also a number of solutions based on other principles. For example, many cameras of Panasonic's Lumix family (in particular, the DMC-FX35, DMC-TZ4, DMC-TZ5, DMC-FS20, DMC-FZ18 and others) implement the Intelligent Exposure function, which is part of the iA intelligent automatic shooting control system. Intelligent Exposure is based on automatic analysis of the frame and correction of the dark areas of the picture to avoid losing shadow detail, as well as (if necessary) compression of the dynamic range of high-contrast scenes.

In some cases the dynamic range optimization function involves not only processing the source image but also correcting the shooting settings. For example, new Fujifilm digital cameras (in particular, the FinePix S100FS) implement a Wide Dynamic Range (WDR) function which, according to the developers, increases the photographic latitude by one or two stops (200 and 400% in the terminology of the settings).

When WDR is activated, the camera shoots with exposure compensation of -1 or -2 EV (depending on the selected setting). The frame is thus underexposed, which is necessary to retain as much highlight detail as possible. The resulting image is then processed with a tone curve that evens out the overall balance and adjusts the black level. After that, the image is converted to 8-bit format and recorded as a JPEG file.
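The relation between the percentage setting and the exposure shift can be sketched as follows. This is only an illustration of the arithmetic implied above (200% corresponds to -1 EV, 400% to -2 EV); the function names are hypothetical, brightness is assumed to be linear with 1.0 as the sensor's clipping level, and the camera's actual tone curve is not modelled.

```python
import math


def wdr_exposure_shift_ev(setting_percent):
    """Exposure shift in EV implied by a 200% or 400% style setting."""
    return -math.log2(setting_percent / 100.0)


def underexposed(linear_value, shift_ev):
    """Scale a linear brightness value by an exposure shift in EV."""
    return linear_value * 2.0 ** shift_ev


print(wdr_exposure_shift_ev(200))  # -1.0
print(wdr_exposure_shift_ev(400))  # -2.0
# A highlight 1.6 times brighter than the clipping level fits in range
# once the frame is shot at -1 EV.
print(underexposed(1.6, wdr_exposure_shift_ev(200)))  # 0.8
```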

Dynamic range compression preserves more detail in the highlights and shadows, but an inevitable consequence of this processing is a reduction in overall contrast. In the lower image the texture of the clouds is rendered much better, but because of the lower contrast this version of the picture looks less natural

A similar function called Dynamic Range Enlargement is implemented in a number of Pentax compact and SLR cameras (Optio S12, K200D, etc.). According to the manufacturer, the Dynamic Range Enlargement function increases the photographic latitude of images by 1 EV without loss of detail in highlights and shadows.

A function that works in a similar way, called Highlight Tone Priority (HTP), is implemented in a number of Canon SLR models (EOS 40D, EOS 450D, etc.). According to the information in the user guide, activating HTP improves the rendering of detail in the highlights (more precisely, in the range from standard 18% gray to bright highlights).

Conclusion

Let's summarize. The built-in dynamic range compression function converts a source image with a large dynamic range into an 8-bit JPEG file with minimal damage. When the camera cannot save frames in RAW format, the dynamic range compression mode lets the photographer make fuller use of the camera's potential when shooting high-contrast scenes.

Of course, it must be remembered that dynamic range compression is not a miracle cure but rather a compromise. For preserving detail in the highlights and/or shadows you pay with a higher noise level in the dark areas of the picture, lower contrast and some coarsening of smooth tonal transitions.

Like any automatic function, the dynamic range compression algorithm is not a fully universal solution that improves absolutely any picture. It therefore makes sense to activate it only when it is really needed. For example, to shoot a silhouette against a well-rendered background, the dynamic range compression function must be turned off, otherwise the striking scene will be hopelessly spoiled.

Concluding this topic, it should be noted that dynamic range compression functions cannot "pull out" detail that the camera sensor did not record. To obtain a satisfactory result when shooting high-contrast scenes, you need to use additional accessories (for example, graduated filters for landscape photography) or special techniques (such as shooting several frames with exposure bracketing and then combining them into one image using tone mapping).

The next article will be devoted to the continuous shooting function.

To be continued

The sound level is the same throughout the composition, there are several pauses.

Narrowing dynamic range

Narrowing the dynamic range, or simply compression, is needed for various purposes, the most common of which are:

1) Achieving a uniform volume level throughout a composition (or an instrument part).

2) Achieving a uniform volume level across the compositions of an album or radio broadcast.

3) Increasing intelligibility, mainly by compressing a particular part (vocals, kick drum).

How does narrowing of the dynamic range work?

The compressor analyzes the audio level at its input, comparing it with the Threshold value specified by the user.

If the signal level is below the Threshold, the compressor keeps analyzing the sound without changing it. If the sound level exceeds the Threshold, the compressor starts to act. Since the role of the compressor is to narrow the dynamic range, it is logical to assume that it limits the largest and the smallest amplitude values (signal levels). At the first stage the largest values are limited: they are reduced with a certain strength called Ratio. Let's look at an example:

Green curves display the sound level: the greater the amplitude of their oscillations around the X axis, the greater the signal level.

The yellow line is the compressor's Threshold. Raising the threshold moves the line away from the X axis; lowering it brings the line closer to the X axis. Clearly, the lower the threshold, the more often the compressor is triggered, and vice versa. If the Ratio value is very large, everything above the threshold is pushed down hard, almost to the threshold level itself. If the Ratio value is very small (close to 1:1), almost nothing happens. How to choose Threshold and Ratio values is discussed later. For now, ask the next question: what is the point of continuing to suppress all the sound that follows once the threshold has been crossed? There is none. We only need to get rid of the amplitude values (peaks) that exceed the Threshold (marked in red in the graph). The Release parameter exists to solve exactly this problem: it sets how long the compression lasts.
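As a rough illustration of how Threshold and Ratio interact, here is a minimal sketch that works on levels in decibels. The function name `compressed_level_db` and the default values are assumptions made for this example, not settings taken from any particular compressor.

```python
def compressed_level_db(level_db, threshold_db=-12.0, ratio=4.0):
    """Return the output level for a given input level (both in dB).

    Hypothetical helper: below the threshold the level is untouched,
    above it the overshoot is divided by the ratio.
    """
    if level_db <= threshold_db:
        return level_db
    overshoot_db = level_db - threshold_db
    return threshold_db + overshoot_db / ratio


# Three peaks: one below the threshold, two above it.
for peak_db in (-20.0, -6.0, 0.0):
    print(peak_db, "->", compressed_level_db(peak_db))
```

With a ratio of 1 the function returns its input unchanged, and with a very large ratio everything above the threshold lands almost exactly on the threshold, which is the limiting behaviour.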

The example graph shows that the first and second threshold crossings last less time than the third. So if the Release parameter is tuned to the first two peaks, part of the third peak may remain unprocessed (because it stays above the threshold longer). If Release is tuned to the third peak, then processing the first and second peaks produces an unwanted drop in the signal level.

The same goes for the Ratio parameter. If Ratio is set for the first two peaks, the third will not be suppressed enough. If Ratio is set for the third peak, the first two peaks will be processed too heavily.

These problems can be solved in two ways:

1) Setting the Attack parameter is a partial solution.

2) Dynamic compression is a complete solution.

The Attack parameter sets the time after which the compressor starts working once the Threshold has been exceeded. If it is close to zero (equal to zero in the case of parallel compression, see the corresponding article), the compressor starts suppressing the signal immediately and works for the time set by the Release parameter. If the attack time is long, the compressor starts acting only after a certain period has elapsed (this is needed to give the sound definition). In our case, you can set Threshold, Release and Ratio to process the first two peaks and set Attack close to zero. The compressor will then suppress the first two peaks, and when processing the third it will keep suppressing until the signal stops exceeding the Threshold. However, this does not guarantee high-quality processing and comes close to limiting (a rough cut of all amplitude values; in this case the compressor is called a limiter).
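The interaction of Attack and Release can be sketched in code. Note one assumption: instead of a hard delay, this sketch uses the common model in which the gain reduction glides toward its target with an attack or release time constant. The function names, defaults and the one-pole smoothing are illustrative choices, not a description of any specific plugin.

```python
import math


def _smoothing_coeff(time_ms, sample_rate):
    # One-pole smoothing coefficient; 0.0 means "react instantly".
    if time_ms <= 0:
        return 0.0
    return math.exp(-1.0 / (sample_rate * time_ms / 1000.0))


def compress(samples, sample_rate, threshold_db=-12.0, ratio=4.0,
             attack_ms=5.0, release_ms=100.0):
    """Hypothetical feed-forward compressor for samples in the -1..1 range."""
    attack = _smoothing_coeff(attack_ms, sample_rate)
    release = _smoothing_coeff(release_ms, sample_rate)
    reduction_db = 0.0  # current gain reduction, always >= 0
    out = []
    for x in samples:
        level_db = 20.0 * math.log10(max(abs(x), 1e-9))
        overshoot_db = max(level_db - threshold_db, 0.0)
        target_db = overshoot_db * (1.0 - 1.0 / ratio)
        # Attack while the reduction has to grow, release while it falls.
        coeff = attack if target_db > reduction_db else release
        reduction_db = coeff * reduction_db + (1.0 - coeff) * target_db
        out.append(x * 10.0 ** (-reduction_db / 20.0))
    return out


# A sustained full-scale signal: the first samples pass almost untouched,
# then the gain reduction settles.
processed = compress([1.0] * 1000, sample_rate=1000)
print(processed[0], processed[-1])
```

A longer attack lets more of the initial transient through before the reduction builds up, while a longer release keeps the level down after the peak has passed, which is the trade-off described above.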

Let's look at the result of processing the sound with a compressor:

The peaks have disappeared. Note that the processing settings were fairly gentle and we suppressed only the most prominent amplitude values. In practice the dynamic range is narrowed much more strongly, and this trend keeps growing. Many composers believe they are making the music louder this way. In practice, they completely strip it of dynamics for those listeners who will listen to it at home rather than on the radio.

The last compression parameter left to consider is Gain. Gain increases the amplitude of the entire composition and is, in effect, equivalent to another sound editor tool, normalization. Let's look at the end result:

In our case the compression was justified and improved the sound quality, since the stray peak is more of an accident than an intentional result. In addition, the music is clearly rhythmic and is therefore characterized by a narrow dynamic range. Where high amplitudes were made on purpose, compression may be a mistake.
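The Gain step described above behaves much like peak normalization, which can be sketched as follows. The function name `normalize_peak` and the target of 1.0 (full scale) are assumptions made for this example.

```python
def normalize_peak(samples, target_peak=1.0):
    """Scale all samples by one factor so the loudest one hits target_peak."""
    peak = max((abs(x) for x in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence stays silence
    gain = target_peak / peak
    return [x * gain for x in samples]


print(normalize_peak([0.1, -0.5, 0.25]))
```

Because every sample is multiplied by the same factor, this step does not change the dynamic range by itself; it only uses the headroom freed by removing the peaks.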

Dynamic compression

The difference between dynamic and non-dynamic compression is that with the former the amount of signal suppression (Ratio) depends on the level of the incoming signal. Dynamic compressors are found in all modern programs. The Ratio and Threshold parameters are controlled in a graph window (each parameter has its own axis):

There is no single standard for displaying the graph: in some programs the Y axis shows the level of the incoming signal, in others, on the contrary, the level after compression. In some the point (0,0) is in the upper right corner, in others in the lower left. In any case, moving the mouse cursor over this field changes the numbers that correspond to the Ratio and Threshold parameters. That is, you specify the compression level for each Threshold value, which lets you configure compression very flexibly.
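One way to picture such a graph is as a list of (input level, output level) points connected by straight lines. The sketch below assumes that representation; the function name `curve_output_db` and the example points are hypothetical and levels are in decibels.

```python
def curve_output_db(level_db, points):
    """Look up the output level on a piecewise-linear transfer curve.

    points: (input_db, output_db) pairs sorted by input_db.
    """
    first_in, first_out = points[0]
    if level_db <= first_in:
        # Below the first point the signal passes with unchanged dynamics.
        return first_out + (level_db - first_in)
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if level_db <= x1:
            t = (level_db - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    return points[-1][1]  # above the last point the level is held (limiting)


# Unity up to -24 dB, then 2:1 up to -12 dB, then 4:1 up to 0 dB.
curve = [(-60.0, -60.0), (-24.0, -24.0), (-12.0, -18.0), (0.0, -15.0)]
for level in (-30.0, -18.0, -6.0, 6.0):
    print(level, "->", curve_output_db(level, curve))
```

Each segment has its own slope, so the effective ratio changes with the input level, which is the behaviour the text calls dynamic compression.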

Side Chain

A side-chain compressor analyzes the signal of one channel and, when its level exceeds the Threshold, applies compression to another channel. Side chain has its advantages when working with instruments that sit in the same frequency range (the bass drum and bass pairing is widely used), but sometimes instruments in different frequency ranges are used, which gives an interesting side-chain effect.
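The idea can be sketched in a few lines: the level is measured on one channel and the resulting gain is applied to the other. This is a simplified illustration without attack or release smoothing; the function name `sidechain_duck` and its defaults are assumptions for this example.

```python
import math


def sidechain_duck(target, trigger, threshold_db=-12.0, ratio=4.0):
    """Reduce `target` whenever `trigger` exceeds the threshold.

    Both inputs are equal-length lists of samples in the -1..1 range.
    """
    out = []
    for t, k in zip(target, trigger):
        # The level is measured on the trigger channel only.
        level_db = 20.0 * math.log10(max(abs(k), 1e-9))
        overshoot_db = max(level_db - threshold_db, 0.0)
        reduction_db = overshoot_db * (1.0 - 1.0 / ratio)
        # The gain is applied to the other channel.
        out.append(t * 10.0 ** (-reduction_db / 20.0))
    return out


bass = [0.8, 0.8, 0.8, 0.8]
kick = [0.0, 1.0, 0.1, 0.0]
print(sidechain_duck(bass, kick))
```

In this example the bass is turned down only on the sample where the kick is loud, which frees space for the kick in the shared low-frequency range.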

Part Two - Compression Stages

There are three compression stages:

1) The first stage is the compression of individual sounds (single shots).

The timbre of any instrument has the following characteristics: attack (Attack), hold (Hold), decay (Decay), sustain level (Sustain), release (Release).

The stage of compressing individual sounds is divided into two parts:

1.1) Compression of individual sounds of rhythm instruments

The components of the beat often need separate compression to give them clarity. Many people process the bass drum (kick) separately from the other rhythm instruments, both at the stage of compressing individual sounds and at the stage of compressing individual parts. This is because it sits in the low-frequency range, where usually only the bass is present besides it. Clarity of the bass drum means the presence of a characteristic click (the bass drum has a very short attack and hold time). If there is no click, process the sound with a compressor, setting the threshold to zero and the attack time from 10 to 50 ms. The compressor's release must finish before the next bass drum hit. This last problem can be solved with the formula 60 000 / BPM, where BPM is the tempo of the composition. For example, 60 000 / 137 = 437.96 (the time in milliseconds until the next strong beat of a composition in 4/4).
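The formula above is easy to check in code. The helper name `beat_ms` is an assumption made for this example.

```python
def beat_ms(bpm):
    """Length of one beat in milliseconds: 60 000 / BPM."""
    if bpm <= 0:
        raise ValueError("tempo must be positive")
    return 60000.0 / bpm


print(round(beat_ms(137), 2))  # 437.96, the value from the text
```

The result is an upper bound for the release time if the compressor is supposed to recover fully before the next bass drum hit.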

All of the above applies to other rhythm instruments with a short attack time: they must have an accented click, which the compressor must not suppress at any of the compression stages.

1.2) Compression of individual sounds of harmonic instruments

Unlike rhythm instruments, parts of harmonic instruments are rather rarely assembled from individual sounds. However, it does not follow that they should not be processed at the level of individual sounds. If you use a sample with a recorded part, that belongs to the second level of compression. This level covers only synthesized harmonic instruments. These can be samplers and synthesizers using various sound synthesis methods (physical modeling, FM, additive, subtractive, etc.). As you have probably guessed, we are talking about programming the synthesizer settings. Yes! This is compression too! Almost all synthesizers have a programmable Envelope parameter (ADSR). The envelope sets the attack time (Attack), decay (Decay), sustain level (Sustain) and release (Release). And if you tell me this is not compression of each individual sound, you are my enemy for life!
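To make the ADSR envelope concrete, here is a minimal sketch that builds one as a list of gain values between 0 and 1 using straight-line segments. Real synthesizers often use curved segments; the function name `adsr` and its parameters are assumptions for this example.

```python
def adsr(attack, decay, sustain_level, sustain_time, release, sample_rate):
    """Build a linear ADSR envelope; all times are in seconds."""

    def ramp(start, end, seconds):
        n = round(seconds * sample_rate)
        return [start + (end - start) * i / n for i in range(n)]

    envelope = ramp(0.0, 1.0, attack)                 # rise to full level
    envelope += ramp(1.0, sustain_level, decay)       # fall to the sustain level
    envelope += [sustain_level] * round(sustain_time * sample_rate)
    envelope += ramp(sustain_level, 0.0, release)     # fade out
    return envelope


env = adsr(attack=0.01, decay=0.02, sustain_level=0.5,
           sustain_time=0.03, release=0.04, sample_rate=1000)
print(len(env), max(env))
```

Multiplying a raw oscillator signal by this envelope sample by sample shapes its level over time, which is why the text treats envelope programming as a form of per-sound dynamics control.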

2) The second stage is the compression of individual parts.

By compression of individual parts I mean narrowing the dynamic range of a number of individual sounds combined together. This stage also covers recorded parts, including vocals, which need compression to give them clarity and intelligibility. When compressing parts, keep in mind that adding individual sounds together can produce unwanted peaks. Get rid of them at this stage, because otherwise the picture can get worse when the whole composition is mixed. At this stage you must also take into account the compression already applied to individual sounds. If you achieved a clear bass drum, careless reprocessing at the second stage can ruin everything. Not every part needs a compressor, just as not every individual sound does. I advise inserting an amplitude analyzer just in case, to detect undesirable side effects of combining individual sounds. Besides compression, at this stage make sure that the parts sit, as far as possible, in different frequency bands and that quantization has been performed. It is also useful to remember that sound has a property called masking (psychoacoustics):

1) A quiet sound is masked by a loud sound that precedes it.

2) A quiet sound at a low frequency is masked by a loud sound at a high frequency.

For example, in a synthesizer part, notes often begin to play before the previous notes have finished sounding. Sometimes this is needed (to create harmony, playing style, polyphony), but sometimes not at all: you can crop their tail (delay, release) if it is audible in solo mode but not audible when all parts are playing. The same applies to effects such as reverb: it should not last until the sound source sounds again. By cutting and removing unneeded signal you make the sound cleaner, and this can also be considered compression, because you remove unnecessary waves.

3) The third stage is the compression of the composition.

When compressing the whole composition, keep in mind that all the parts are combinations of many individual sounds. So when they are combined and then compressed, make sure the final compression does not spoil what was achieved at the first two stages. You also need to distinguish compositions for which a wide dynamic range is important from those for which a narrow one is. For compositions with a wide dynamic range, it is enough to insert a compressor that tames the short peaks formed when the parts were added together. For compositions where a narrow dynamic range is important, everything is much more complicated. Here compressors have lately been called maximizers. A maximizer is a plugin that combines a compressor, limiter, graphic equalizer, enhancer and other sound processing tools. It must also have sound analysis tools. Mastering, the final processing with a compressor, is largely needed to combat errors made at the previous stages. These are not so much compression errors (although doing at the last stage what you could have done at the first is an error) as errors in the initial choice of good samples and instruments that do not interfere with each other (in terms of frequency bands). This is what frequency response correction is for. It often happens that strong compression on the master forces you to change compression and mixing parameters at earlier stages, because with a strong narrowing of the dynamic range, quiet sounds that were previously masked come forward and the sound of individual components of the composition changes.

In these parts I did not touch on specific compression parameters. I considered it necessary to write that when compressing you must pay attention to all sounds and all parts at every stage of creating the composition. Only then will you end up with a harmonious result, not only from the point of view of music theory but also from the point of view of sound engineering.

Below are practical tips on processing individual parts. However, in compression, numbers and presets can only suggest the area in which to search. Ideal compression settings depend on each individual case. The Gain and Threshold values assume a normal signal level (sensible use of the whole range).

Part Three - Compression Parameters

Brief reference:

Threshold - determines the level of the incoming signal at which the compressor starts working.

Attack - determines the time after which the compressor starts working.

Ratio - determines the degree to which amplitude values are reduced (relative to the original amplitude value).

Release - determines the time after which the compressor stops working.

Gain - determines how much the signal is boosted after processing by the compressor.

Compression Table:

Columns: Instrument, Threshold, Attack, Ratio, Release, Gain, Description. Only the fields given in the sources are listed. Several values in one field, separated by "/", are alternatives.

Vocals. Threshold: 0 dB. Attack: 1-2 ms / 2-5 ms / 10 ms / 0.1 ms / 0.1 ms. Ratio: less than 4:1 / 2.5:1 / 4:1-12:1 / 2:1-8:1. Release: 150 ms / 50-100 ms / 150 ms / 150 ms / 0.5 s. Description: compression during recording must be minimal; processing at the mixing stage is mandatory to give definition and intelligibility.

Wind instruments. Attack: 1-5 ms. Ratio: 6:1-15:1. Release: 0.3 s.

Bass drum (kick). Attack: 10-50 ms / 10-100 ms. Ratio: 4:1 and above / 10:1. Release: 50-100 ms / 1 ms. Description: the lower the Threshold, the greater the Ratio and the longer the attack, the stronger the click at the beginning of the bass drum.

Synthesizers. Description: depends on the type of wave (ADSR envelopes).

Drum drum. Attack: 10-40 ms / 1-5 ms. Ratio: 5:1 / 5:1-10:1. Release: 50 ms / 0.2 s.

Hi-hat. Attack: 20 ms. Ratio: 10:1. Release: 1 ms.

Tepar microphones. Attack: 2-5 ms. Ratio: 5:1. Release: 1-50 ms.

Drums. Attack: 5 ms. Ratio: 5:1-8:1. Release: 10 ms.

Bass guitar. Attack: 100-200 ms / 4-10 ms. Ratio: 5:1. Release: 1 ms / 10 ms.

Strings. Attack: 0-40 ms. Ratio: 3:1. Release: 500 ms.

Synth bass. Attack: 4-10 ms. Ratio: 4:1. Release: 10 ms. Description: depends on envelopes.

Percussion. Attack: 0-20 ms. Ratio: 10:1. Release: 50 ms.

Acoustic guitar, piano. Attack: 10-30 ms / 5-10 ms. Ratio: 4:1 / 5:1-10:1. Release: 50-100 ms / 0.5 s.

Electric guitar. Attack: 2-5 ms. Ratio: 8:1. Release: 0.5 s.

Final compression. Attack: 0.1 ms / 0.1 ms. Ratio: 2:1 / from 2:1 to 3:1. Release: 50 ms / 0.1 ms. Gain: 0 dB at the output. Description: the attack time depends on the goal, whether you need to remove peaks or make the track smoother.

Limiter after final compression. Attack: 0 ms. Ratio: 10:1. Release: 10-50 ms. Gain: 0 dB at the output. Description: for when you need a narrow dynamic range and a rough "cut" of the waves.

The information was taken from various sources that popular Internet resources refer to. The differences in compression parameters are explained by differences in sound preferences and by work with different material.