Font system (NFSS / encoding)

When you switch fonts with \textbf or \sffamily, what happens underneath? This page covers the machinery behind the font commands. LaTeX specifies a single font by five independent attributes (NFSS), and typesets by way of an encoding that maps your input characters to actual glyphs. Leaving the command list to its own page, we go deeper here: the low-level selectors committed with \selectfont, the differences between encodings such as OT1 / T1 / TU, and why you write \usepackage[T1]{fontenc} under pdfLaTeX.

NFSS — a font is five attributes

Font selection in LaTeX2e rests on a scheme called NFSS (New Font Selection Scheme). At its core is the idea that any text font is completely determined by five attributes. The fntguide (the LaTeX2e font-selection guide) names them: encoding, family, series, shape, and size.

  • Encoding — the order in which characters appear in the font (the map from input character to glyph slot). For text, OT1 (TeX text) and T1 (TeX extended text, the so-called Cork encoding) are the common ones. Detailed in the section on font encodings.
  • Family — a collection of fonts sharing a letterform. The codes are cmr for Computer Modern Roman, cmss for its Sans, cmtt for its Typewriter; for commercial fonts, ptm (Times), phv (Helvetica), pcr (Courier).
  • Series — one axis combining weight and width: m (medium = the default), b (bold), bx (bold extended), c (condensed), sb (semi-bold), and so on.
  • Shape — the form of the letters: n (upright/roman), it (italic), sl (slanted/oblique), sc (caps and small caps). Less common is ui (an italic artificially turned upright).
  • Size — the design size, a dimension like 10pt; with no unit, pt is assumed.

The key subtlety is that series is a single axis that fuses weight and width. In the standard scheme, weights run from ul to ub (ultra light to ultra bold) and widths run from uc to ux (ultra condensed to ultra expanded). The two codes are concatenated into a single value, with any m dropped; if both weight and width are medium, a single m remains. So “bold, medium width” is b, “bold extended” is bx, and “medium weight, condensed” is c.
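As a rough illustration of the naming rule, the sketch below requests three series values. It assumes pdfLaTeX with the default Computer Modern fonts, where these particular combinations are expected to exist; if a requested combination is missing, LaTeX substitutes another font and records a warning in the log.

```latex
\documentclass{article}
\begin{document}
% weight b + width m: the m is dropped, giving series "b"
{\fontseries{b}\selectfont Bold, medium width}

% weight b + width x: series "bx"
{\fontseries{bx}\selectfont Bold extended}

% weight sb + width c: series "sbc" (tried here with the sans family cmss)
{\fontfamily{cmss}\fontseries{sbc}\selectfont Semi-bold condensed sans}
\end{document}
```

Checking the log for font-substitution warnings is a quick way to see whether a given family really offers the series you asked for.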

These five, written as **encoding/family/series/shape/size**, form a font’s formal name inside LaTeX. For instance OT1/cmr/m/n/10 means “Computer Modern Roman in the OT1 encoding, medium upright, 10 point.” The string \OT1/cmr/m/n/10 you see in an overfull-box warning is exactly these five attributes. The high-level commands you normally use, such as \textbf, are nothing more than operations that rewrite one of these five.
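One way to watch the high-level commands rewrite the attributes is to print the current font name in the text itself. The sketch below is a minimal illustration, not an official interface: \currentfontname is a hypothetical helper defined only for this example, and it reads the kernel's internal macros \f@encoding, \f@family, \f@series, \f@shape, and \f@size.

```latex
\documentclass{article}
\makeatletter
% \currentfontname is a hypothetical helper for this sketch, not a kernel command.
% It expands the kernel's internal records of the five current attributes.
\newcommand{\currentfontname}{%
  \f@encoding/\f@family/\f@series/\f@shape/\f@size}
\makeatother
\begin{document}
Default: \currentfontname

\textbf{After textbf: \currentfontname}

\textit{After textit: \currentfontname}

{\sffamily\large After sffamily and large: \currentfontname}
\end{document}
```

Under pdfLaTeX with the default setup, the first line should read OT1/cmr/m/n/10, and each later line should differ from it only in the attributes that the command in question rewrote.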

The low-level selectors and \selectfont

Beneath the high-level commands sit low-level selectors that set one attribute at a time: \fontencoding{T1}, \fontfamily{ptm}, \fontseries{b}, \fontshape{it}, \fontsize{12}{14}. But there is a crucial rule: **setting them does nothing until you call \selectfont**. The fntguide warns: “There must be a \selectfont command immediately after any settings of the font parameters … before any following text.”

```latex
% Correct: set, then \selectfont, then text
\fontfamily{ptm}\fontseries{b}\selectfont Some text.

% Wrong: no text between a setting and \selectfont
\fontfamily{ptm} Some \fontseries{b}\selectfont text.
```

A shortcut that sets four of the five at once is **\usefont{encoding}{family}{series}{shape}**. It is equivalent to the matching \font... commands followed by \selectfont, with the size carried over from the current value. It is handy for summoning one specific named font. Changing \fontencoding on its own likewise requires a following \selectfont to take effect.
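The equivalence can be seen side by side in a small test document. This is a sketch that assumes pdfLaTeX and an installation providing the ptm (Times) family in T1; both lines are expected to come out in the same font.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
% One call: four attributes, committed immediately.
{\usefont{T1}{ptm}{b}{it}Bold italic Times, short form}

% The same request spelled out attribute by attribute.
{\fontencoding{T1}\fontfamily{ptm}\fontseries{b}\fontshape{it}\selectfont
 Bold italic Times, long form}
\end{document}
```

In both cases the size is inherited from the surrounding text, since neither form sets it.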

You rarely write these attributes directly: \textbf calls \fontseries for you, \sffamily calls \fontfamily, and so on. The command list and the choice between command and declaration forms live on the “Font style commands” page. You reach for the low level only to name a specific font, or to wire a new font into a class or package.

The default macros — \rmdefault and friends

Which family \textrm or \rmfamily selects is not hard-wired; it is held in a macro. There are three: \rmdefault (roman), \sfdefault (sans), \ttdefault (typewriter), with defaults cmr, cmss, cmtt under pdfLaTeX. The kernel sets these values, so they do not depend on the document class. A designer who wants the body set in Times, Helvetica, and Courier therefore swaps these macros rather than rewriting any command.

```latex
% Re-point the three families for the whole document
\renewcommand{\rmdefault}{ptm}  % roman   → Times
\renewcommand{\sfdefault}{phv}  % sans    → Helvetica
\renewcommand{\ttdefault}{pcr}  % mono    → Courier
```

The body font itself is set by **\encodingdefault / \familydefault / \seriesdefault / \shapedefault**, defaulting to OT1, \rmdefault, m, n. Because \familydefault points at \rmdefault, changing \rmdefault changes the body font. The bold weight is held by \bfdefault (default bx); with fonts that lack a bx bold — many PostScript fonts — you may need to lower it to b.
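As a sketch of how these defaults interact, the preamble below makes the sans family the body font and lowers the bold series to b. It assumes pdfLaTeX and an installation that provides the ptm and phv families in T1; the family choices are only illustrative.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\renewcommand{\rmdefault}{ptm}            % roman family: Times
\renewcommand{\sfdefault}{phv}            % sans family: Helvetica
\renewcommand{\familydefault}{\sfdefault} % body text now follows \sfdefault
\renewcommand{\bfdefault}{b}              % these families offer b rather than bx
\begin{document}
Body text in the sans family. \textbf{Bold at series b.}
\textrm{Roman is still available on request.}
\end{document}
```

Because \familydefault is redirected rather than \rmdefault overwritten, \textrm and \textsf keep their usual meanings while the body switches to sans.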

Font encodings — from input to glyph

A font encoding is the order of glyphs within a font — the agreement on which slot (glyph) each input character maps to. It is the first of the five attributes, yet the only one with no author command like \textbf. As the fntguide notes, switching encodings is something a **package such as fontenc provides**.

For text the essentials are these. **OT1 is Knuth’s original 7-bit “TeX text” encoding** (the default). With room for only 128 characters, accented letters such as é or ñ are built by composing a base letter with an accent (\accent). That composition is the catch: accented words do not hyphenate, and they copy poorly out of the PDF. **T1 (“TeX extended text,” from the 1990 TUG conference at Cork, hence the Cork encoding) is an 8-bit, 256-glyph encoding that carries each accented letter as a single glyph. That lets Western — and some Eastern — European languages hyphenate correctly** and copy and paste cleanly.

**TU (TeX Unicode) lets you use system-installed OpenType fonts directly by Unicode code point, and is the default for XeLaTeX and LuaLaTeX**. Because fontspec sets it up automatically, you rarely write TU by hand on a Unicode engine. Beyond these there are special-purpose encodings: **T2A** (Cyrillic), **LGR** (Greek; currently the main encoding in use for the language), and **TS1** (the Text Companion encoding, a set of in-text symbol glyphs such as the copyright and currency signs, handled by textcomp). For math, the encodings OML (math italic), OMS (math symbols), and OMX (large math symbols) are assigned.

| Encoding | What it covers | Size / notes |
|---|---|---|
| OT1 | Knuth’s original “TeX text” (the default) | 7-bit; accents are composed → no hyphenation |
| T1 | TeX extended text (Cork); Western languages | 8-bit, 256 glyphs; accents single → hyphenation works |
| TU | TeX Unicode; OpenType fonts directly | Default for Xe/LuaLaTeX; set up by fontspec |
| T2A | Cyrillic | 8-bit (also T2B / T2C) |
| LGR | Greek | The 256-glyph encoding now standard for Greek |
| TS1 | Text Companion (text symbols) | Copyright, currency, etc.; handled by textcomp |

Input encoding vs font encoding

Two easily-confused things, separated. The input encoding governs **how the bytes of your .tex source are read as characters** (UTF-8 and the like) — historically the job of the inputenc package. The font encoding governs which glyph of the output font each of those characters flows into, and is the job of fontenc. Think of it as the way in (inputenc) versus the way out (fontenc).

On modern pdfLaTeX the source is read as UTF-8 by default, so you almost never need to load inputenc explicitly. The font encoding, however, is still worth setting. fontenc is a package for pdfLaTeX; on XeLaTeX or LuaLaTeX you use fontspec instead (and since TU is the default there, the issue largely does not arise).
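If a document has to compile on more than one engine, the choice between fontenc and fontspec can be made conditionally. The following is a minimal sketch that assumes the iftex package is available; it uses that package's \ifPDFTeX test to pick the right setup.

```latex
\documentclass{article}
\usepackage{iftex}            % provides \ifPDFTeX (assumed to be installed)
\ifPDFTeX
  \usepackage[T1]{fontenc}    % 8-bit engine: pick the font encoding by hand
\else
  \usepackage{fontspec}       % XeLaTeX / LuaLaTeX: TU is set up for you
\fi
\begin{document}
Na\"ive caf\'e, Stra\ss{}e.
\end{document}
```

The body uses accent commands rather than raw characters only to keep the example independent of the file's input encoding.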

Under pdfLaTeX, load T1

If you typeset with pdfLaTeX, the idiom is to put **\usepackage[T1]{fontenc}** in the preamble. It switches the document’s font encoding to T1 and removes the drawbacks of the OT1 default: accented words that will not hyphenate, and text that garbles when copied from the PDF. The fontenc documentation says as much: it gives support for widespread Western languages such as French, German, Italian, and Polish, and “if you have words with accented letters then LaTeX will hyphenate them and your output can be copied and pasted.”

When you list several encodings as options to fontenc, the last one listed becomes the default (\encodingdefault is set to it). For a document that mixes in Greek, for instance, write \usepackage[LGR,T1]{fontenc} so the body default stays T1 while you switch to LGR where needed.
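A small test document can make the "last one wins" rule visible. This sketch assumes the LGR support files (the encoding definition and Greek fonts for the current family) are installed; without them the fontenc line will fail. In LGR, Latin letters serve as a transliteration, so the word in the group should come out in Greek letters.

```latex
\documentclass{article}
\usepackage[LGR,T1]{fontenc}  % T1 is listed last, so it becomes the default
\begin{document}
Body text uses \texttt{\encodingdefault}.

% Switch to LGR inside a group; Latin letters act as a transliteration here.
{\fontencoding{LGR}\selectfont logos}
\end{document}
```

Keeping the switch inside a group returns the text to T1 as soon as the group closes.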

document.tex
```latex
\documentclass{article}
\usepackage[T1]{fontenc}   % font encoding -> T1
% Input is UTF-8 by default on pdfLaTeX (inputenc usually unneeded)
\begin{document}
% With T1, accented words hyphenate correctly
Na\"ive r\'esum\'e, \"uberfl\"ussig, Stra\ss{}e.
\end{document}
```

The low-level selectors in practice

Finally, a single document shows how the low-level selectors and \usefont are used in practice. \usefont gives four of the five attributes at once (encoding, family, series, shape) and commits on the spot; \fontfamily plus \selectfont sets attributes individually and commits. Wrap either in a group { } to confine the effect to that span.

document.tex
```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
% \usefont sets four attributes at once
{\usefont{T1}{ptm}{b}{it} Bold italic Times}

% Set selectors individually, then commit with \selectfont
{\fontfamily{phv}\fontseries{b}\selectfont Bold Helvetica}

% Change only the size (reuse the current baselineskip).
% \f@baselineskip contains @, so it needs \makeatletter in the document body.
\makeatletter
{\fontsize{14}{\f@baselineskip}\selectfont Fourteen point}
\makeatother
\end{document}
```

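Here {\usefont{T1}{ptm}{b}{it} ...} selects “Times in the T1 encoding, bold italic”; leaving the group restores the previous font. Passing the internal macro \f@baselineskip as the second argument to \fontsize is the idiom for changing only the type size while leaving the line spacing alone. Because the macro's name contains @, such code must sit between \makeatletter and \makeatother when it appears in the document body; otherwise TeX reads \f followed by ordinary characters and stops with an undefined control sequence error. You must not rewrite internal macros directly, but reading one out to pass it along is fine.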