A spectrogram is a visual representation of sound, such as spoken words. In speech processing, it is used to display how the spectrum of frequencies in a sound varies over time. The vertical axis of a spectrogram represents frequency. The horizontal axis represents time, typically in milliseconds. A third dimension indicates the amplitude, or loudness, of a particular frequency at a particular time. Amplitude is displayed in shades of gray, from light (less energy) to dark (more energy).

Formants are the resonant frequencies of the vocal tract, and they appear on a spectrogram as dark horizontal bands. Up to five formants are usually identified for voiced phonemes, but most sounds can be classified using the first two, F1 and F2. F1 can vary from about 300 Hz to 1000 Hz. The lower the F1 value, the closer the tongue is to the roof of the mouth. The vowel /i/ ("ee"), as in the word "beat", has one of the lowest F1 values. In contrast, the vowel /ɑ/ ("ah"), as in "bra", has the highest F1 value. F2 can vary from about 850 Hz to 2500 Hz, and its value reflects the frontness or backness of the highest part of the tongue: the further forward the tongue, the higher the F2.

Different vocal tract shapes produce different formant patterns. These unique patterns enable us to distinguish vowels from one another and help us identify adjacent consonants. Consonants also leave their own marks on a spectrogram. For example, plosives appear as a sudden burst of energy across all frequencies after a period of relative silence. Darkness spread across a wide frequency range indicates turbulent airflow, which implies a fricative or affricate.
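
To make the three dimensions concrete, here is a minimal sketch of how a spectrogram can be computed: the signal is cut into short overlapping frames, and the magnitude of each frequency is measured in each frame. Everything in it is an assumption made for illustration and not something described above: the function name `spectrogram`, the frame size and hop, the Hann window, and the synthetic 1000 Hz test tone standing in for recorded speech. It uses a naive Fourier transform from the Python standard library, which is fine for a demonstration but far too slow for real audio.

```python
import cmath
import math


def spectrogram(samples, sample_rate, frame_size=64, hop=32):
    """Return (times_ms, freqs_hz, magnitudes) for a list of samples.

    magnitudes[t][k] is the magnitude of frequency bin k in frame t,
    i.e. the value that would be drawn as a shade of gray.
    """
    # A Hann window tapers each frame to reduce leakage at its edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    n_bins = frame_size // 2 + 1  # keep only the non-negative frequencies
    freqs_hz = [k * sample_rate / frame_size for k in range(n_bins)]

    times_ms, magnitudes = [], []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        # Naive discrete Fourier transform of one windowed frame.
        row = [abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                       for n in range(frame_size)))
               for k in range(n_bins)]
        magnitudes.append(row)
        times_ms.append(1000.0 * start / sample_rate)  # horizontal axis
    return times_ms, freqs_hz, magnitudes


# Synthetic input: 50 ms of a pure 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(400)]
times_ms, freqs_hz, mags = spectrogram(tone, sample_rate)

# Find the strongest frequency bin in the first frame.
peak_bin = max(range(len(freqs_hz)), key=lambda k: mags[0][k])
print("strongest frequency in first frame (Hz):", freqs_hz[peak_bin])
```

For the pure tone, the strongest bin in every frame should sit at the tone's own frequency, which on a plotted spectrogram would show up as a single dark horizontal line. Real speech would instead show several dark bands (the formants) that shift as the vocal tract changes shape.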