Voice Training Glossary: Pitch, Formants, Resonance

Updated on: Mon Jun 01 2026

Voice training comes with a lot of unfamiliar language. You might hear people talk about pitch, resonance, formants, vocal weight, F0, F1, F2, or F3, and it can be hard to know which terms describe the same thing and which ones point to different parts of your voice.

This glossary is a practical starting point. It is written for anyone using voice training to build more awareness and control, including singing, acting, public speaking, sounding more masculine or feminine, or general vocal improvement.

Voice training is easier when you can name what you are hearing. These terms can also help you understand feedback from a voice coach, a speech-language pathologist, an app like Strivocal, or trusted people listening to your recordings.


Quick Definitions

  • Pitch / F0: How high or low a voice sounds, usually measured in hertz.
  • Resonance: How the vocal tract shapes sound after it is produced.
  • Formants: Measurable frequency regions related to resonance, vowel shape, and vocal tract configuration.
  • Vocal weight: How heavy, light, thick, or thin a voice sounds.
  • Intonation: How pitch moves across a phrase or sentence.
  • Articulation: How you physically shape speech sounds, and how clearly they come out.



Audio Demo

Reading about terms like pitch and formants is useful, but hearing them side by side makes the ideas much easier to understand.

Listen to the difference: pitch and formant

Move the sliders to change the pitch and formant of the audio clip, and see how those changes affect the sound.

[Interactive demo: one slider for pitch and one slider each for formants F1, F2, and F3]

Audio clip from Mozilla Common Voice. Praat was used to generate the pitch and formant variants. The resynthesis is not perfectly accurate, so you may notice some artifacts in the audio.


What is pitch in voice training?

Pitch is how high or low a voice sounds. In voice analysis, pitch is usually measured in hertz (Hz). A higher number means the vocal folds are vibrating more times per second. A lower number means they are vibrating fewer times per second. In acoustic terms, this measured pitch is usually called fundamental frequency, or F0 [1].
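To make "vibrations per second" concrete, here is a minimal Python sketch that estimates F0 from a synthetic tone using a plain autocorrelation search. It is an illustration under simple assumptions, not how Praat or Strivocal measure pitch: the function name estimate_f0, the 60 to 500 Hz search range, and the test tone are all made up for this example, and real voices need more robust methods.

```python
import math

def estimate_f0(samples, sample_rate, fmin=60.0, fmax=500.0):
    """Estimate fundamental frequency (Hz) with a plain autocorrelation search."""
    min_lag = int(sample_rate / fmax)  # shortest period considered, in samples
    max_lag = int(sample_rate / fmin)  # longest period considered, in samples
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Compare the signal with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # The best-matching shift is one period, so F0 is its inverse.
    return sample_rate / best_lag

# Hypothetical test signal: 0.1 seconds of a 200 Hz sine wave.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
f0 = estimate_f0(tone, sample_rate)
print(f"Estimated F0: {f0:.1f} Hz")
```

The estimate is limited to whole-sample periods, so with other tones it can land a few hertz away from the true value.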

Pitch matters in voice training, but it is not the whole voice. Two people can speak at the same pitch and still sound very different because resonance, formants, volume, articulation, intonation, and speech style all contribute to how a voice is perceived. Men and women often have different average pitch ranges, but there is a lot of overlap (see the histogram below).

Histogram showing pitch values grouped by gender
Women’s pitch is usually higher than men’s, but there is a lot of overlap. Pitch is just one part of the voice, and it interacts with other features to create the overall sound. Data is from Common Voice.

Pitch audio examples

These short clips show why pitch should be interpreted with the rest of the voice. The first speaker is an actress known for a deep voice, while the second speaker is Chef John, whose voice is relatively high pitched for a male speaker.

Shohreh Aghdashloo, actress with a famously deep voice

Line graph of Shohreh Aghdashloo's pitch over time
Pitch over time
Line graph of Shohreh Aghdashloo's formants over time
Formants over time

Chef John, speaker with a high-pitched voice

Line graph of Chef John's pitch over time
Pitch over time
Line graph of Chef John's formants over time
Formants over time

What is resonance in voice training?

Resonance describes how sound is shaped by the spaces in your vocal tract, including the throat, mouth, tongue position, jaw position, and lips. In voice training, people often use resonance as a practical listening term for the quality, brightness, darkness, size, or placement of the sound.

If pitch is related to how quickly the vocal folds vibrate, resonance is related to how the sound is filtered after it is created. The source-filter model describes this in two broad pieces: the vocal folds provide a sound source, and the vocal tract filters that sound through its resonances [2]. This is why changing your tongue, lips, larynx position, or mouth shape can change the character of your voice even when your pitch stays the same [1, 2].
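The source-filter idea can be sketched in a few lines of Python. This is a deliberately crude toy, assuming an impulse train as the "vocal fold" source and a single two-pole resonator as the "vocal tract" filter. The names pulse_train, resonator, and strongest_harmonic, and the 100 Hz bandwidth, are assumptions for illustration only.

```python
import cmath
import math

SAMPLE_RATE = 8000

def pulse_train(f0, n_samples):
    """Crude source: one impulse per pitch period."""
    period = round(SAMPLE_RATE / f0)
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def resonator(signal, center_hz, bandwidth_hz=100.0):
    """Two-pole filter standing in for a single vocal tract resonance."""
    r = math.exp(-math.pi * bandwidth_hz / SAMPLE_RATE)
    a1 = 2 * r * math.cos(2 * math.pi * center_hz / SAMPLE_RATE)
    a2 = -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y1, y2 = y, y1
    return out

def strongest_harmonic(signal, f0, max_hz=3500):
    """Return the multiple of f0 (in Hz) that carries the most energy."""
    def magnitude(freq):
        w = -2j * math.pi * freq / SAMPLE_RATE
        return abs(sum(x * cmath.exp(w * n) for n, x in enumerate(signal)))
    return max(range(f0, max_hz + 1, f0), key=magnitude)

# Same 100 Hz source, two different filter settings.
source = pulse_train(100, 4000)
peaks = {center: strongest_harmonic(resonator(source, center), 100)
         for center in (500, 1500)}
print(peaks)
```

In both cases the output still repeats 100 times per second, so the pitch is unchanged. Only the frequency region that gets emphasized moves with the filter.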

In voice training, resonance is often one of the biggest reasons two voices with similar pitch can be perceived differently.

Resonance and formants are closely related, and in some contexts people use the words almost interchangeably. A helpful distinction is that resonance is the broader physical and perceptual idea: the vocal tract reinforces some frequency regions and reduces others. Formants are the measurable acoustic frequency regions that show up in the speech signal because of those resonances [2, 3, 4]. Some technical sources define formants as vocal-tract resonances, while others reserve formant for the spectral peaks measured in the sound. It is enough to remember that resonance is what you are changing or hearing, and formants are one way to measure part of that change.


What are formants?

Formants are resonant frequencies of the vocal tract, or the resonance-related peaks that can be measured in a speech signal. They are one way to measure part of what voice teachers and coaches often describe as resonance [2, 3, 4].

When you speak, your vocal tract emphasizes certain frequency areas. These emphasized areas are called formants. They help shape vowels, voice quality, and perceived vocal size [1, 2, 4].
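One rough way to see why formants relate to perceived vocal size is the textbook idealization of the vocal tract as a uniform tube, closed at the vocal folds and open at the lips. The Python sketch below uses the quarter-wavelength formula for such a tube. The tract lengths and the 350 m/s speed of sound are illustrative assumptions, and a real vocal tract is never a uniform tube.

```python
def tube_resonances(tract_length_m, count=3, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other."""
    return [(2 * n - 1) * speed_of_sound / (4 * tract_length_m)
            for n in range(1, count + 1)]

# Hypothetical tract lengths in centimeters, chosen only for illustration.
for length_cm in (17.5, 14.5):
    values = tube_resonances(length_cm / 100)
    print(f"{length_cm} cm:", [round(v) for v in values])
```

In this simplified model, a shorter tube pushes every resonance higher. That matches the general direction of the formant differences described in this section.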

The first few formants are especially important in speech analysis:

Term    Meaning
F1      The first formant
F2      The second formant
F3      The third formant

If you want a deeper explanation, read "What are vocal formants?", which explains how formants are measured and how they relate to perceived gender. The charts below show that F1, F2, and F3 values tend to be higher in female speakers than in male speakers, but there is a lot of overlap.

Histograms showing formant values grouped by gender
F1, F2, and F3 formants tend to be higher in female speakers than male speakers, but there is a lot of overlap. Data is from Common Voice.

F1

F1 is the first formant. It is strongly affected by vowel shape, jaw height, tongue height, and the overall shape of the vocal tract. For example, open vowels often have a higher F1 than close vowels. In practical voice training, F1 can help show how your mouth and vocal tract shape are changing while you speak.

F2

F2 is the second formant. It is especially affected by tongue position and lip shape. Front vowels tend to have a higher F2, while back or rounded vowels often have a lower F2. F2 can be useful when looking at how different vowels and articulation habits affect the sound of your voice.
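The F1 and F2 patterns above can be illustrated with a tiny lookup in Python. The numbers are approximate averages for adult male speakers as commonly quoted from Peterson and Barney [4]. Treat them as illustrative rather than as targets, since individual voices vary widely. The nearest_vowel helper and its straight-line distance in hertz are simplifying assumptions for this sketch.

```python
# Approximate average (F1, F2) in Hz for adult male speakers, as commonly
# quoted from Peterson and Barney [4]. Illustrative values only.
VOWELS = {
    "i (as in 'beet')": (270, 2290),    # close, front
    "a (as in 'father')": (730, 1090),  # open, back
    "u (as in 'boot')": (300, 870),     # close, back, rounded
}

def nearest_vowel(f1, f2):
    """Return the reference vowel closest to a measured (F1, F2) pair."""
    def distance(name):
        ref_f1, ref_f2 = VOWELS[name]
        return (ref_f1 - f1) ** 2 + (ref_f2 - f2) ** 2
    return min(VOWELS, key=distance)

# Hypothetical measurements from two sustained vowels.
print(nearest_vowel(700, 1100))
print(nearest_vowel(280, 2200))
```

The open vowel has the highest F1 in the table, and the front vowel has the highest F2.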

F3

F3 is the third formant. It is often more subtle than F1 and F2, but it can still contribute to voice quality and how vowels are perceived. It can be influenced by tongue shape, lip rounding, and other articulatory details. For most beginners, F1 and F2 are easier places to start. F3 becomes more useful when you want a more detailed acoustic picture.


Vocal Weight

Vocal weight describes how heavy, light, thick, or thin a voice sounds. It is a practical voice-training term rather than a single standard acoustic measurement, and it overlaps with broader ideas like voice quality, phonation, and vocal fold behavior [5].

It is not the same thing as volume. A voice can be quiet and still sound heavy, or loud and still sound light. Vocal weight is often related to how the vocal folds are vibrating and how much intensity or thickness is present in the sound.

People often notice vocal weight when comparing a soft, light voice to a fuller, denser voice at a similar pitch.


Volume

Volume describes how loud or soft your voice is. It is usually measured as sound intensity or sound pressure level, typically in decibels (dB), but in practice it comes down to how much sound you are producing and how loud you seem to a listener.
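As a rough illustration of how a level in decibels is computed, the Python sketch below measures the RMS level of a digital signal relative to full scale. This is a relative level, not a calibrated sound pressure level, which would require a calibrated microphone. The helper names level_db and sine are assumptions for this example.

```python
import math

def level_db(samples, reference=1.0):
    """RMS level in decibels relative to `reference` (full scale by default)."""
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    if rms == 0:
        return float("-inf")  # digital silence has no finite level
    return 20 * math.log10(rms / reference)

def sine(amplitude, n_samples=800, freq=200, sample_rate=8000):
    """Hypothetical test tone with the given peak amplitude."""
    return [amplitude * math.sin(2 * math.pi * freq * n / sample_rate)
            for n in range(n_samples)]

loud = level_db(sine(1.0))
soft = level_db(sine(0.5))
print(f"{loud:.1f} dB, {soft:.1f} dB, difference {loud - soft:.1f} dB")
```

Halving the amplitude lowers the level by about 6 dB. A small change in the number can therefore mean a large change in the signal.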

Volume is important because voice training should transfer into real life. A voice that works in quiet solo practice may feel different in a conversation, on a phone call, in a meeting, or in a noisy room.


Intonation

Intonation is the movement of pitch across a phrase or sentence. For example, your pitch might rise at the end of a question, fall at the end of a statement, or move more expressively when you are excited. Two people can have the same average pitch but use very different intonation patterns.
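Here is a small Python sketch of that idea, assuming you already have a pitch contour as a list of F0 values in hertz. The two contours, the one-semitone threshold, and the function names are hypothetical and chosen only to show that the same average pitch can carry opposite intonation.

```python
import math

def semitone_change(contour_hz, edge_frames=2):
    """Pitch change from the start to the end of a contour, in semitones."""
    start = sum(contour_hz[:edge_frames]) / edge_frames
    end = sum(contour_hz[-edge_frames:]) / edge_frames
    return 12 * math.log2(end / start)

def describe(contour_hz, threshold=1.0):
    """Label a contour as rising, falling, or level."""
    change = semitone_change(contour_hz)
    if change > threshold:
        return "rising"
    if change < -threshold:
        return "falling"
    return "level"

# Hypothetical contours (Hz) with the same average pitch of 200 Hz.
question = [160, 170, 200, 230, 240]
statement = [240, 230, 200, 170, 160]

for name, contour in (("question", question), ("statement", statement)):
    average = sum(contour) / len(contour)
    print(name, average, describe(contour))
```

Semitones are used here because equal ratios in hertz feel like equal steps to a listener.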

Intonation affects how natural, expressive, confident, or conversational a voice sounds.


Articulation

Articulation is how you physically shape speech sounds, and how clearly they come out [1]. Your tongue, lips, jaw, teeth, and soft palate all contribute to articulation. Small changes in articulation can affect vowels, consonants, rhythm, and the overall character of your speech.

In voice training, articulation can be useful because it gives you a concrete place to practice. Words, phrases, and sentences help you repeat the same sounds while adjusting pitch, resonance, volume, or intonation.


Perceived Gender

Perceived gender describes how a listener interprets a voice in terms of gender presentation. This is not the same thing as a person’s identity. It is also not determined by a single acoustic feature. Pitch, formants, resonance, vocal weight, articulation, intonation, volume, language, context, and listener expectations can all influence perception [6].

Because perceived gender is complex, it is best to use app feedback as one source of information rather than a final judgment. Real people, voice coaches, and your own comfort all matter too.


Spectrogram

A spectrogram is a visual display of sound over time [1, 7]. It usually shows time from left to right, frequency from bottom to top, and intensity through color or brightness. Spectrograms can help show pitch movement, formants, noise, and other acoustic patterns [1, 7].
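The data behind a spectrogram can be sketched with a short-time Fourier transform. The minimal Python version below uses only the standard library and a slow direct DFT, so it suits short clips and illustration rather than real analysis. The frame size, hop, and test signal are assumptions for this example.

```python
import cmath
import math

def spectrogram(samples, frame_size=128, hop=64):
    """Return one row per time frame; each row holds one magnitude per frequency bin."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]  # Hann window softens frame edges
    rows = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        row = []
        for k in range(frame_size // 2 + 1):
            w = -2j * math.pi * k / frame_size
            row.append(abs(sum(x * cmath.exp(w * n) for n, x in enumerate(frame))))
        rows.append(row)
    return rows

def peak_hz(row, sample_rate, frame_size=128):
    """Frequency (Hz) of the strongest bin in one time frame."""
    return row.index(max(row)) * sample_rate / frame_size

# Hypothetical signal: a 1000 Hz tone followed by a 2000 Hz tone.
sample_rate = 8000
samples = [math.sin(2 * math.pi * (1000 if n < 800 else 2000) * n / sample_rate)
           for n in range(1600)]
rows = spectrogram(samples)
print(peak_hz(rows[0], sample_rate), peak_hz(rows[-1], sample_rate))
```

Each row is one vertical slice of the picture. Plotting the rows side by side, with magnitude mapped to brightness, gives the familiar spectrogram image.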

You do not need to become an audio engineer to benefit from voice analysis, but spectrograms can make invisible voice patterns easier to see.


How Strivocal Uses These Terms

Strivocal gives real-time feedback on pitch, formants, volume, and perceived gender. It also lets you save clips, review your feedback history, practice with cards, and share recordings with people who can support your progress.

The goal is not to reduce your voice to numbers. The goal is to give you clearer feedback while you practice, so you can connect what you feel, what you hear, and what the acoustic data shows.

Apps are most useful when they supplement the rest of your voice training. A professional voice coach or speech-language pathologist can provide guidance that an app cannot, especially around vocal health, technique, and individualized support.


Want to try out Strivocal?

Strivocal is available on the web and iOS. You can try Strivocal’s voice training app for free.


References

  1. Ladefoged, P., & Johnson, K. (2015). A course in phonetics (7th ed.). Cengage.
  2. Fant, G. (1960). Acoustic theory of speech production. Mouton.
  3. Titze, I. R., et al. (2015). Toward a consensus on symbolic notation of harmonics, resonances, and formants in vocalization. The Journal of the Acoustical Society of America.
  4. Peterson, G. E., & Barney, H. L. (1952). Control methods used in a study of the vowels. The Journal of the Acoustical Society of America.
  5. Titze, I. R. (2008). Nonlinear source-filter coupling in phonation: Theory. The Journal of the Acoustical Society of America.
  6. Perceptual weighting of acoustic cues for accommodating gender-related talker differences.
  7. Wikipedia. Spectrogram.