Digital Audio/es

From Audacity Development Manual
Revision as of 16:51, 21 June 2010 by Huss Spanish Students (talk | contribs) (Indices muestra)
Digital Audio
This is a work-in-progress translation to Spanish of Digital Audio.

Digital Sampling

All sounds we hear are pressure waves in air. Starting with Thomas Edison's demonstration of the first phonograph in 1877, it has been possible to capture these pressure waves on a physical medium and reproduce them later by regenerating the same pressure waves. Audio pressure waves, or waveforms, look something like this:

An analog waveform

Analog recording media such as phonograph records and cassette tapes represent the shape of the waveform directly, using the depth of the groove for a record or the amount of magnetization for a tape. Analog recording can reproduce an impressive array of sounds, but it also suffers from problems of noise. Notably, each time an analog recording is copied, more noise is introduced, decreasing the fidelity. This noise can be minimized but not completely eliminated.

Digital recording works differently: it samples the waveform at evenly-spaced timepoints, representing each sample as a precise number. Digital recordings, whether stored on a compact disc (CD), digital audio tape (DAT), or on a personal computer, do not degrade over time and can be copied perfectly without introducing any additional noise. The following image illustrates a sampled audio waveform:

A digital waveform
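
As a rough illustration of sampling, the sketch below measures a sine wave at evenly spaced timepoints and stores each measurement as a signed integer. The function name `sample_sine` and its parameters are made up for this example and are not part of Audacity.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s, bits=16):
    """Sample a sine wave at evenly spaced timepoints.

    Each sample is stored as a signed integer, as in integer PCM audio.
    """
    max_value = 2 ** (bits - 1) - 1          # 32767 for 16-bit samples
    n_samples = round(sample_rate_hz * duration_s)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz               # time of the n-th sample in seconds
        amplitude = math.sin(2 * math.pi * freq_hz * t)
        samples.append(round(amplitude * max_value))
    return samples

# 10 milliseconds of a 440 Hz tone at the audio CD sample rate
samples = sample_sine(440, 44100, 0.01)
print(len(samples), samples[:5])
```

Each entry in `samples` is just a number, which is why a digital recording can be copied without adding noise.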

Digital audio can be edited and mixed without introducing any additional noise. In addition, many digital effects can be applied to digitized audio recordings to simulate reverberation, enhance certain frequencies, or change the pitch, for example. Audacity is a software program for editing, mixing, and applying effects to digital audio recordings.

Audacity's ability to play or record audio directly from your computer depends on your specific computer hardware. Most desktop computers come with a sound card with 1/8" jacks for you to plug in a microphone or other source for recording, and speakers or headphones for listening. Many laptop computers have speakers and a microphone built-in. The sound card that comes with most computers is not particularly high quality; if you are interested in high-quality recording, see Recording Quality for more details. For information on how to set up Audacity for playback and recording, see Audacity Setup and Configuration.

Digital Audio Quality

The quality of a digital audio recording depends heavily on two factors: the sample rate and the sample format or bit depth. Increasing the sample rate or the number of bits in each sample increases the quality of the recording, but also increases the amount of space used by audio files on a computer or disk.

Sample rates

Sample rates are measured in hertz (Hz), or cycles per second. This value simply represents the number of samples captured per second in order to represent the waveform; the more samples per second, the higher the resolution, and thus the more precise the measurement of the waveform. The human ear is sensitive to sound patterns with frequencies between approximately 20 Hz and 20,000 Hz. Sounds outside that range are essentially inaudible, although Rupert Neve has argued, on the basis of subjective listening, that content above this supposed limit of 20,000 Hz (20 kHz) can still affect perceived fidelity.


Capturing a sound at a particular frequency requires a sample rate of at least twice that frequency. Equivalently, the highest frequency a given sample rate can represent is half that rate, known as the Nyquist frequency. Therefore a sample rate of 40,000 Hz is the absolute minimum necessary to reproduce sounds within the range of human hearing, though higher rates (called oversampling) may increase quality further by avoiding aliasing artifacts around the Nyquist frequency. The sample rate used by audio CDs is 44,100 Hz. Human speech is intelligible even if frequencies above 4,000 Hz are eliminated; in fact, telephones only transmit frequencies between 200 Hz and 4,000 Hz. Therefore a common sample rate for speech recordings is 8,000 Hz, which is sometimes called speech quality. Note that very steep filtering (called an anti-aliasing filter) is required above the Nyquist frequency to prevent signal above this cutoff point from being folded back into the audible range by the digital converter and creating the distorting artifacts of aliasing noise.
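
The fold-back effect can be sketched with a little arithmetic. The helper below is hypothetical (it is not an Audacity function) and assumes an idealised converter with no anti-aliasing filter and a single pure tone as input; it returns the frequency at which that tone would appear after sampling.

```python
def alias_frequency(signal_hz, sample_rate_hz):
    """Apparent frequency of a pure tone sampled without an anti-aliasing filter."""
    half_rate = sample_rate_hz / 2           # highest frequency the rate can represent
    folded = signal_hz % sample_rate_hz      # sampling cannot tell tones one sample rate apart
    if folded > half_rate:
        folded = sample_rate_hz - folded     # mirror back below half the sample rate
    return folded

print(alias_frequency(1000, 44100))    # 1000: below half the sample rate, unchanged
print(alias_frequency(30000, 44100))   # 14100: an inaudible tone folded into the audible range
print(alias_frequency(5000, 8000))     # 3000: a tone too high for "speech quality"
```

In the second case a 30,000 Hz tone that nobody could hear ends up as a clearly audible 14,100 Hz tone, which is why the filtering has to happen before the signal is sampled.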

The most common sample rates, measured in kilohertz (kHz, or 1,000 Hz), are 8 kHz, 16 kHz, 22.05 kHz, 22.25 kHz, 44.1 kHz, 48 kHz, 96 kHz, and 192 kHz. Audacity supports any of these sample rates; however, most computer sound cards are limited to 48 kHz or sometimes 96 kHz. Again, the most common sample rate by far is 44.1 kHz (44,100 Hz).

In the image below, the left half has a low sample rate, and the right half has a high sample rate (i.e., high resolution):

Waveform with low sample rate and high sample rate

Sample formats

The other measure of audio quality is the sample format (or bit depth), which is usually measured by the number of computer bits used to represent each sample. The more bits that are used, the more precise the representation of each sample. Increasing the number of bits also increases the maximum dynamic range of the audio recording, in other words the difference in volume between the loudest and softest possible sounds that can be represented.


Dynamic range is measured in decibels (dB). The human ear can perceive sounds with a dynamic range of at least 90 dB. However, whenever possible it is a good idea to record digital audio with a dynamic range of far more than 90 dB, in part so that sounds that are too soft can be amplified for maximum fidelity. Note that although signals recorded at generally low levels can be raised (i.e., normalised) to take advantage of the available dynamic range, a recording made at a low level does not use all of the available bit depth, and this loss of resolution cannot be recovered simply by normalising the overall level of the digital waveform.
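
The point about low-level recordings can be illustrated with a small sketch. It assumes a deliberately coarse 8-bit format so the effect is easy to see, and the function name `record` is invented for this example. It counts how many distinct sample values a full-scale take and a quiet take actually use, then raises the quiet take by 20 dB after the fact.

```python
import math

def record(peak, bits, n=10000):
    """Quantise one cycle of a sine wave with the given peak (1.0 = full scale)."""
    max_value = 2 ** (bits - 1) - 1
    return [round(peak * math.sin(2 * math.pi * i / n) * max_value) for i in range(n)]

loud = record(1.0, bits=8)             # recorded at full scale
quiet = record(0.1, bits=8)            # recorded 20 dB below full scale
normalised = [s * 10 for s in quiet]   # raised by 20 dB after recording

# Number of distinct sample values used by each version
print(len(set(loud)), len(set(quiet)), len(set(normalised)))
```

The normalised take is as loud as the full-scale one, but it still uses only as many distinct values as the quiet take did, so the resolution lost at recording time stays lost.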

Common sample formats and their respective dynamic ranges include:

  • 8-bit integer: about 48 dB
  • 16-bit integer: about 96 dB
  • 24-bit integer: about 144 dB
  • 32-bit floating point: near-infinite dB
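
For the integer formats, the theoretical dynamic range follows from the ratio between the largest and smallest representable amplitudes, which works out to roughly 6 dB per bit. The sketch below shows that calculation; the function name is chosen for this example, and the results are theoretical, so they may differ by a few dB from rounded figures quoted in lists such as the one above.

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of an integer sample format, in decibels."""
    return 20 * math.log10(2 ** bits)

for bits in (8, 16, 24):
    print(f"{bits}-bit integer: {dynamic_range_db(bits):.1f} dB")
```

Floating-point formats are not covered by this formula, because the exponent lets them represent both very small and very large values.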


Other sample formats such as ADPCM approximate 16-bit audio with compressed 4-bit samples. Audacity can import many of these formats, but they are rarely used because much better, newer compression methods exist.



Audio CDs and most computer audio file formats use 16-bit integers. Audacity uses 32-bit floating-point samples internally and, if required, converts the sample bit depth when the final mix is Exported. Audacity's default sample format during recording can be configured in the Quality Preferences or set individually for each track in the Track Drop-Down Menu. During playback, the audio in any tracks that have a different sample format from the project will be resampled on the fly using the Real-time Conversion settings in the Quality Preferences. The High-quality Conversion settings are used when processing, mixing or Exporting.

In the image below, the left half has a sample format with few bits, and the right half has a sample format with more bits. If you think of the sample rate as the spacing between vertical gridlines, the sample format is the spacing between horizontal gridlines.

Waveform sample formats

Size of audio files

Audio files are very large, much larger than most files you probably work with (unless you work with video files). To determine the size in bits of an uncompressed audio file, multiply the sample rate (e.g., 44,100 Hz) by the bit depth of the sample format (e.g., 16 bits) by the number of channels (2 for stereo) by the number of seconds. A completely full 74-minute stereo audio CD takes up over 6 billion bits. Divide this by 8 to get the number of bytes; an audio CD holds a little less than 800 megabytes (MB). See Compressed Audio, below.
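
The multiplication described above can be written out as a short sketch. The function name is chosen for this example; the figures are the audio CD values quoted in the text.

```python
def uncompressed_size_bits(sample_rate_hz, bit_depth, channels, seconds):
    """Size of uncompressed audio in bits."""
    return sample_rate_hz * bit_depth * channels * seconds

# A full 74-minute stereo audio CD: 44,100 Hz, 16-bit, 2 channels
cd_bits = uncompressed_size_bits(44100, 16, 2, 74 * 60)
cd_bytes = cd_bits // 8

print(cd_bits)    # 6265728000 bits, a little over 6 billion
print(cd_bytes)   # 783216000 bytes, a little less than 800 MB
```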

Clipping

One limitation of digital audio is that it cannot represent sound pressure waves that exceed the maximum level it is designed to handle. When a signal is recorded that exceeds the maximum level, samples outside the range are clipped to the maximum value, like this:

Waveform showing clipping

A sound recorded with clipping will sound distorted and harsh. While there are some techniques that can eliminate a small amount of noise due to clipping, it is always preferable to avoid clipping while recording. Change the volume on your input source (microphone, cassette player, record player) and set Audacity's input volume control (in the Mixer Toolbar) such that the waveform is as large as possible (for maximum fidelity) without clipping.
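
In terms of sample values, clipping simply means that anything outside the representable range is replaced by the nearest limit. The following sketch assumes 16-bit integer samples; the function name `clip` is chosen for this example.

```python
def clip(samples, bits=16):
    """Limit each sample to the range a signed integer of the given size can hold."""
    lowest = -(2 ** (bits - 1))       # -32768 for 16-bit
    highest = 2 ** (bits - 1) - 1     # 32767 for 16-bit
    return [max(lowest, min(highest, s)) for s in samples]

print(clip([-40000, -100, 0, 20000, 50000]))
# [-32768, -100, 0, 20000, 32767]
```

The original values of the clipped samples are not stored anywhere, which is why clipping is best avoided at recording time rather than repaired afterwards.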

Compressed Audio

Because digital audio files are so large, reduced sample rates were typically used whenever possible. In 1991, the MP3 (MPEG-1 Audio Layer III) standard changed everything. MP3 is a lossy compression technique that can dramatically reduce the file size of a digital audio file with surprisingly little effect on the quality. One second of CD-quality audio takes up 1.4 megabits, while a common bitrate for MP3 files is 128 kilobits per second, which is a compression factor of more than 10x! MP3 works by cleverly "throwing away" details about the audio waveform that humans are not very sensitive to, based on a psychoacoustic model of how our ears and brains process sounds. Not all MP3 files are created alike; different psychoacoustic models will lead to different amounts of perceived distortion in the audio file.
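
The compression factor quoted above can be checked with a few lines of arithmetic. This is only a sketch: it treats a kilobit as 1,000 bits, and the four-minute song is an example chosen here, not a figure from the text.

```python
cd_bitrate = 44100 * 16 * 2      # bits per second of CD-quality stereo audio
mp3_bitrate = 128 * 1000         # bits per second of a 128 kbps MP3

ratio = cd_bitrate / mp3_bitrate
print(cd_bitrate)                # 1411200, about 1.4 megabits per second
print(ratio)                     # 11.025, a little more than 10x

# Approximate sizes in bytes of a four-minute song in each form
seconds = 4 * 60
print(cd_bitrate * seconds // 8)     # 42336000
print(mp3_bitrate * seconds // 8)    # 3840000
```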

With good speakers, most people can hear the difference between a 128 kbps MP3 and an uncompressed audio file from a CD. MP3 files at 192 kbps and 256 kbps are more popular among audiophiles who prefer higher quality.

There are many other lossy compressed audio file formats. Audacity fully supports the Ogg Vorbis format, which is similar to MP3 but is a completely open, patent-free standard. Over time the quality of Ogg Vorbis files has come to surpass the quality of MP3, and its format is more extensible, so more improvements are possible. Ogg Vorbis is a great choice for your own audio; however, many more devices, such as iPods and other portable audio players, support MP3 but do not yet support Ogg Vorbis.

Other well-known compression methods include ATRAC (used by Sony MiniDisc recorders), Windows Media Audio (WMA), and AAC.