When do you use an en dash?

Brief instructions for typesetting

Version dated July 2, 2010

What is all this good for in general? Or: does RT mean room temperature, or gas constant times temperature?

More and more is being written, and there is always more that one ought to read. Typographic rules were invented so that we have a chance of doing so. Flawlessly set text can be taken in more quickly, and its content is easier to remember. For the novice writer, the typographic rules are a burden. If you spare yourself the trouble, you pass this burden on to the reader, that is, to the "customer". You don't need a business diploma to predict how well such texts will fare.

... and in particular?

There are essentially two occasions on which you need to put your results in writing. The first is your bachelor's, master's, state-examination or doctoral thesis, where what counts is the copy on paper. The second is publication: you should write in such a way that your text can be taken over into a paper, that is, so that it can be processed in electronic form by an editorial office and then by a printer. You will see that this dual use requires compromises. For paper output you can "fudge" a poorly structured text into shape without harming the result, but electronic text exchange "exposes" all the patching. The main part of this text deals with rules that ensure dual use. The perfect paper edition is discussed only in a short appendix.

By the way, this text is set in HTML 4 and formatted with a cascading style sheet. You can print it from your browser, or you can load the file into MS Word, which takes style sheets into account on import [at least the tested version from Office XP did; the only exception: the style-sheet declaration “text-decoration: overline” was not implemented]. Regarding the design: examples for the respective rules are set in square brackets and highlighted in color [this is what an example looks like]. Where Unicode values are given for characters, they are hexadecimal.


Dashes (-, –, —, −)

German typesetting

We essentially use three kinds of stroke: the hyphen (-, German "Divis"), the en dash (–, German "Halbgeviertstrich", which also serves as the German dash in running text) and the minus sign (−). In computer programs the en dash is often referred to by its English name even in German documentation, namely "n-dash" (so called because it is as wide as an "n"). The em dash (—), which is the dash of English typography ("m-dash"), is rarely used in German typesetting. (The Windows character table confusingly labels it with the German word for the ordinary dash, while the actual German dash is listed there as a hyphen; best not to look!)

The hyphen (-) is used for word division at the end of a line and as the hyphen in compound words [Kupfer(II)-sulfat]. It is found at the bottom right of every standard keyboard. Note that the hyphen appears much more frequently in German than in English [Kupfer(II)-sulfat-Pentahydrat, but: cupric sulfate pentahydrate]. Note also that when terms are joined into a compound, hyphens appear where there were none before [100-mL-Kolben].

The en dash (–) is used in place of the word “to”, without any space before or after it [It is heated for 20–30 hours]. In your texts it will occur most often in literature citations [J. H. Enemark, R. D. Feltham, Coord. Chem. Rev. 1974, 13, 339–406.]. The no-space rule is relaxed by journals such as Angewandte Chemie, which appear in two languages but want to set the reference list only once. There it looks as in English (see below; oddly, Angewandte also uses the "to" dash according to German custom in the running text of its English edition). Likewise without spaces, the en dash separates the names of organic chemists in the tiresome name reactions, so that you know where one name ends and the next begins [women do not appear so often in name reactions; the Lobry-de-Bruyn–Alberda-van-Ekenstein rearrangement is an early exception]. In general typography, the names of different persons are always separated by an en dash, to distinguish them from the double-barreled name of a single person. I don't know of any example of a one-person-two-name reaction in chemistry. So the Diels–Alder reaction (en dash) would be correct, but the Diels-Alder reaction (hyphen) is certainly just as unambiguous. With a space before and after it, the en dash serves as the dash in running text [crystals – even smaller ones – always remained in solution]. Most text editors have special shortcuts for setting the en dash (MS Word: Ctrl + - {numeric keypad}, or -, if the corresponding AutoCorrect option has not been deactivated); in general it can be reached with Alt+0150 or Unicode 2013.

The minus sign (−; if your browser displays it correctly, it must look like the horizontal stroke of the + sign; compare: − +; annoyingly, this went wrong in our house fonts of the LMUCompatil family) is used without a space as a sign [−5,8 or −5.8] and with a (small) space as an operator [6 − 9 = −3]. In contrast to the en and em dashes, it does not belong to the extended 8-bit character set (the one reached with Alt+0128 to Alt+0255) and is therefore not supported by all text editors without a font change. In the Unicode table the minus sign has the code 2212 and can be entered directly if your program supports Unicode (MS Word: type 2212 and then press Alt + c; in English-language versions Alt + x). By the way: because of line breaking, it is useful to distinguish the minus sign from the very similar en dash. Most text editors and browsers keep −5 (minus sign, 5) together at a line break, but not –5 (en dash, 5). Give it a try!
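Because the four strokes are almost indistinguishable on screen, it can help to inspect them by code point. The following is a minimal Python sketch and not part of the original guide; the names `DASHES`, `describe` and `fix_ranges` are invented for illustration. It prints the hexadecimal Unicode values quoted above and replaces a plain hyphen between two digits with an en dash, as needed for page ranges.

```python
import re
import unicodedata

# The four look-alike strokes from this section (hexadecimal code points).
DASHES = {
    "hyphen": "\u002D",
    "en dash": "\u2013",
    "em dash": "\u2014",
    "minus sign": "\u2212",
}


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def fix_ranges(text: str) -> str:
    """Replace a hyphen standing directly between two digits by an en dash.

    Assumption of this sketch: the input is a reference list or similar,
    where digit-hyphen-digit always means "to" (339-406 becomes 339–406).
    """
    return re.sub(r"(?<=\d)-(?=\d)", "\u2013", text)


if __name__ == "__main__":
    for label, ch in DASHES.items():
        print(f"{label:<10} {ch}  {describe(ch)}")
    print(fix_ranges("Coord. Chem. Rev. 1974, 13, 339-406."))
```

The digit-hyphen-digit rule is deliberately narrow: it leaves word hyphens alone, but it would also alter locants such as 2-3 in a compound name, so it should only be run on text where that cannot occur.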

Dashes in English Type Matter

The use of hyphens (-) parallels German usage. En dashes (–) may replace a “to”, but with leading and trailing small spaces (Alt+0160, Unicode 00A0) [heat 20 – 30 hours; J.H. Enemark, R.D. Feltham, Coord. Chem. Rev. 1974, 13, 339 – 406.]. Usually, en dashes are not used as the English equivalent of the German dash. Instead, use an em dash (—). The em dash is used without leading and trailing spaces [Crystals—even small ones—always remained dissolved].

Periods and commas

German typesetting

In German typesetting, the comma is the decimal separator [area detector today for € 299999,99!]. The period is the usual thousands separator [... for € 299.999,99]. Here too it is Angewandte Chemie that has long mixed German and English typesetting so that, for example, tables do not have to be set twice. There a decimal point is prescribed throughout [... for 299999.99 €], which of course means that the period can no longer serve as the thousands separator. As a replacement, a (small) space is inserted [... for 299 999.99 €]. How you do it is up to you, but do it consistently. If you choose decimal points, you can use commas to separate two numbers in lists [x, y (weighting) 0.459, 4.223]. If you use decimal commas, use a semicolon as the separator [x, y (weighting) 0,459; 4,223]. In any case, don't let commas with different meanings follow one another, so not: x, y (weighting) 0,459, 4,223.

Periods and Commas in English Type Matter

The period is the decimal separator in English texts [area detectors are 299999.99 € today!]. The separator after 1000 is the comma [… are 299,999.99 €]. Two numbers may be separated by a comma in tables [x, y (weighting scheme) 0.459, 4.223].
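The three conventions for the same number (English, German, and the journal compromise with a small space) can be generated from one value. The sketch below is an illustration only; the function name `format_number` and the `style` keywords are invented here, and the thin space (Unicode 2009) is used as the grouping space on the assumption that it is the "small" space meant.

```python
THIN_SPACE = "\u2009"  # small space; note that it is not break-protected


def format_number(value: float, decimals: int = 2, style: str = "en") -> str:
    """Format a number with digit grouping in one of three conventions.

    style="en":    299,999.99   (comma groups, decimal point)
    style="de":    299.999,99   (period groups, decimal comma)
    style="space": 299 999.99   (thin-space groups, decimal point)
    """
    grouped = f"{value:,.{decimals}f}"  # Python's built-in English grouping
    if style == "en":
        return grouped
    if style == "de":
        # Swap comma and period in a single pass.
        swap = str.maketrans({",": ".", ".": ","})
        return grouped.translate(swap)
    if style == "space":
        return grouped.replace(",", THIN_SPACE)
    raise ValueError(f"unknown style: {style!r}")


if __name__ == "__main__":
    for style in ("en", "de", "space"):
        print(style, format_number(299999.99, style=style))
```

Whichever style is chosen, generating every number in a document through one such function is a simple way to meet the demand for consistency.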


Spaces

German typesetting

Numerical value and unit are usually separated by a space [50 °C]. It doesn't look good when these spaces are stretched by justification, and no line break should be allowed between number and unit. You can prevent both by using a no-break space (Alt+0160 or Unicode 00A0; in MS Word: Ctrl + Shift + space). You also use it to separate the individual parts of compound units [A = 12 480 L mol−1 cm−1]. The point is that one can then distinguish, for example, between meter × coulomb (m C) and millicoulomb (mC). An exception is the plain degree sign, which is set against the numerical value without a space [50°]. Angewandte Chemie also seems to set concentrations given with “M” without a space (0.5M HCl). The handling of successive variables is very inconsistent. They are found separated by spaces [ΔG = −R T ln K], but also without a gap [ΔG = −RT ln K]. (In the second version the missing italic correction is noticeable; it is discussed further below.) In any case, the equals sign in an equation is surrounded by spaces [ΔG = −R T ln K]. All the spaces in this section are normally not stretched in justification, though you will find numerous exceptions in books here too. You will agree, however, that number and unit should not be separated, and that in equations a line break should, if possible, come only after the equals sign and nowhere else.

The recommendations given in this section are a compromise. Strictly speaking, it would be correct to use a smaller space. The Unicode table offers all kinds of spaces in the range 2000–200F. The thin space wanted here has the code 2009, but it is not protected against line breaks. With it, units are not pushed so far from the number, and initials in reference lists look better too: 5 %, 4 °C, J. H. Enemark, R. D. Feltham, Coord. Chem. Rev. 1974, 13, 339–406.
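Protecting the space between number and unit is easy to automate. The following Python sketch is an assumption-laden illustration: the function name `protect_units` and the short default unit list are invented here, and the no-break space (Unicode 00A0) recommended above is used throughout.

```python
import re

NBSP = "\u00A0"  # no-break space


def protect_units(text: str, units=("mL", "L", "°C", "Pa", "ppm", "mm")) -> str:
    """Replace the ordinary space between a digit and a known unit by NBSP.

    The unit list is a hypothetical default; extend it for your own text.
    Longer units are tried first so that "mL" is not mistaken for "m".
    """
    ordered = sorted(units, key=len, reverse=True)
    alternatives = "|".join(re.escape(unit) for unit in ordered)
    # (?!\w) makes sure the unit is not merely the start of a longer word.
    pattern = re.compile(rf"(\d) ({alternatives})(?!\w)")
    return pattern.sub(rf"\1{NBSP}\2", text)


if __name__ == "__main__":
    sample = "Heat to 50 °C and transfer to a 100 mL flask."
    print(repr(protect_units(sample)))
```

Substituting `"\u2009"` for `NBSP` would give the narrower spacing, at the price of losing the protection against line breaks.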

Spaces in English Type Matter

In English typesetting, spaces are less widespread. Thus they are missing in: 5%, 4°C, R.D. Feltham…. This usage is the reason for your (future) space-free degree titles B.Sc. or M.Sc.

Opening and closing quotation marks

In German typesetting, “Gänsefüßchen” („ and “) are used as opening and closing quotation marks. If a word or a short expression from another language is quoted in a German text, you stay with the German marks. If the insertion is longer, the quotation marks of the respective language are used. In English these are the characters “ and ”. As you can see, the English opening mark and the German closing mark are the same character. As a rule, you can leave it to your word processor to insert the characters appropriate to the language set, provided you enable the option "typographic quotation marks" and then type the plain " character (Shift + 2). For manual input: „ (Alt+0132 or Unicode 201E), “ (Alt+0147 or Unicode 201C), ” (Alt+0148 or Unicode 201D).

Apostrophes and the like

You need the apostrophe in German as an omission mark [Schoenflies’ Ideen] and in English for forming the genitive [Schoenflies’s ideas]. The apostrophe is reached with Alt+0146 or Unicode 2019. MS Word inserts the apostrophe if you press the character above the "#" on the standard German keyboard and have not deactivated the corresponding AutoCorrect function. Note that different marks, the primes, are used for locants in compound names: Unicode 2032 gives ′ [2,2′-bipyridyl], Unicode 2033 gives ″ [O,O′,O″-Trimeth-2-oxyethyl-amine]. Unicode 2034 is intended for the triple prime, but it has been implemented in very few fonts [O,O′,O″,O‴-Tetrameth-2-oxyethylammonium]. Did it work? It doesn't matter anyway, because the spacing of the single prime (Unicode 2032) is such that you can simply use it several times in a row [O,O′,O′′,O′′′-Tetrameth-2-oxyethylammonium].

Centered dot (·) and multiplication sign (×)

In the formula of an addition compound, use the centered dot (·) to separate the components [CuSO4 · 5 H2O]. Use the multiplication sign (×), with spaces before and after it, in all other cases where you would say “times” when reading a text aloud. For us this usually occurs in two cases: with crystal sizes [0.18 × 0.16 × 0.07 mm] and with numbers given with powers of ten [NA = 6,022 × 10²³]. You will also find the last example set with a multiplication dot, more often in German than in English. Do not use the letter "x" instead of "×". The multiplication sign is reached with Alt+0215 in the extended character set or Unicode 00D7; the centered dot is at Alt+0183 or Unicode 00B7. You can also use the centered dot, ideally with a little space around it, to indicate interactions of a somewhat indefinite nature [O–H ··· O, Cr ··· Cr].
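A letter "x" that stands in for the multiplication sign can be found mechanically when it sits between two numbers. The sketch below is a hypothetical helper (the name `fix_times` is invented here), not a complete solution: it only handles the number-x-number pattern of crystal sizes and powers of ten.

```python
import re

TIMES = "\u00D7"  # multiplication sign


def fix_times(text: str) -> str:
    """Replace x or X between two digits by a spaced multiplication sign.

    Caveat: the pattern would also fire on strings such as the
    hexadecimal literal 0x1F, so check the result before accepting it.
    """
    return re.sub(r"(?<=\d)\s*[xX]\s*(?=\d)", f" {TIMES} ", text)


if __name__ == "__main__":
    print(fix_times("crystal size 0.18 x 0.16 x 0.07 mm"))
    print(fix_times("6.022x10^23"))
```

Existing spacing around the "x" is normalized to exactly one space on each side, in line with the rule given above.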

Character formatting

Basic font

In chemistry, besides the normal running text, the following symbols are set in the basic (upright) font, i.e. not italic, semibold or otherwise marked up:

  • Element symbols [CuSO4 · 5 H2O], except when they are used in a compound name to identify a binding site (nitrito-κN, for example; see the section Text markup italic)
  • Orbital symbols [s, p, d, f, σ, π, δ, a1g, b2u].
  • Symmetry species in group theory (as well as the orbital names derived from them) [eg and t2g denote orbitals that transform in Oh like Eg and T2g]
  • Spectroscopic terms [⁵D₀, ⁷F₂]
  • Subscripts and superscripts on symbols unless they denote variables [Oh, KA, D4h, SN2]; but: there are p orbitals with ml = −1, ml = 0 and ml = 1 (more in the section Text markup italic)
  • Designations for reaction mechanisms [SN2]
  • Acronyms for methods and reaction conditions (if these cannot be set in small caps) [RT, NMR, HETCOR].

Text markup in general

As a rule (with exceptions in the chemical literature), a punctuation mark that follows marked-up text takes the same markup [if you print "bold" in bold, then you also print the following comma in bold]. Does that make any sense? Yes! Well ... sort of. Angewandte Chemie, however, does not do this when, for example, a bold number stands for a compound [crystals of 1a, beautiful as they looked, decomposed quickly; by the rule, the comma after 1a would be bold as well]. The same applies to literature citations [... Z. Anorg. Allg. Chem. 1998, 624, 359–360]: according to the rule, the comma after the year would have to be bold and the one after the volume italic. For heaven's sake, follow the practice of Angewandte and other journals so that your texts can be used without modification.

Text markup italic

The following are set in italics in the chemical literature:

  • Symbols for variables, but not units [c(I2) = 1 mol L−1, p(H2) = 2000 Pa] and not operators [log c = 0, pKA = −lg KA]. If a variable is denoted by a Greek letter (nλ = 2 d sin θ), practice in the various journals is inconsistent. Angewandte Chemie for a while set them upright, on the grounds that being Greek was distinction enough; that has since been dropped. I find it more consistent to italicize them, as there is less confusion [μeff = 5.8 μB; δ = 91.8 ppm]. The best way to get at the Greek letters (without changing fonts) is in Unicode-capable programs. If your text editor cannot handle Unicode, you have to switch to a font such as Symbol or Symbol Prop BT. The Greek letters in TeX's Computer Modern fonts are particularly beautiful (for example cmmi12, if you set your text in a 12-point font), but they occupy unusual positions in the extended character set. If you want to differentiate between upright and italic Greek letters, use upright letters for designating isomers [α-cyclodextrin, β-ketocarboxylic acids, glucono-δ-lactone]
  • Symbols for constants [ΔG = −R T ln K]
  • The letters of the symmetry elements according to Schoenflies, of the groups and of the classes, but not, as already described above, the symmetry species [the C4 framework transforms according to C4; symmetry elements of this group are E, C2 and C4; the xy orbital transforms in Oh like Eg, it has the symmetry eg]
  • The letters in symmetry and site symbols according to Hermann–Mauguin [Pm is a subgroup of P2/m]
  • The letters in stereodescriptors (R, S, E and Z descriptors are additionally put in parentheses) [rac-arabitol, cis-oxolanediol, (1R,2R)-cyclohexanediol, meso-tartrate]
  • Element symbols when they indicate a position [N,N-dimethylethylenediamine, tetrakis(thiocyanato-κS)ferrate(III), O-alkylation]
  • Journal and book titles in bibliographies
  • Foreign-language insertions [an update costs 100 €]. The rule is very soft, as the representation has to take into account how far a word has been naturalized into German. At the latest, a word counts as naturalized and is no longer italicized when it is inflected according to German rules (the German for “Who laid out this text?”, with the borrowed verb inflected the German way, would be an example, but what about “We still have to update the program”?). A case from chemistry: how do we translate "high-spin complex"? It seems to me to have become part of German technical jargon, so: “High-Spin-Komplex”, right?
  • Emphasis [crystals? Crystals!]; letterspacing and underlining are completely out for this purpose (which is why in this text only the occasional "not" is underlined: you are meant to feel something unpleasant there)

A rule with many exceptions in the chemical literature: if a marked-up term is joined by hyphens to terms without markup, it loses its markup [R = 0.076, the R value is 0.076]. Common exceptions to the rule are found with symmetry [C4-symmetric] and in combinations of stereodescriptors with everyday words. Much is inconsistent, e.g. in Gade's Koordinationschemie, which was typeset by Wiley-VCH: “Trans-Effekt” in the text, “Trans-Effect” in the index.

By using italics, you create a problem that common text editors cannot cope with: an italic correction is often needed. For example, if a formula contains a deprotonated glucofuranose residue, you would write: GlcfH−5. As you can see, the italic f “falls” into the H that follows it. A typesetting program would now have to increase the space between the two letters slightly. But neither MS Word nor OpenOffice Writer can do that. For TeX it is no problem. Apart from the furanoses, other cases also look less than great, depending on the font used, for example N′, P21/c, C2221 and the like. For the sake of your text's usability for publication, you should not insert spaces by hand where none are intended, especially since N ′, P 21/c and C 2221 are then pulled too far apart. It is best to proceed as follows: (1) In HTML: do nothing. (2) With MS Word and OpenOffice Writer: avoid hard formatting from the start (for example, clicking the K button in the toolbar when you want italics). Use styles instead. In addition to an “italic” character style, create two character styles based on it in which you expand the spacing by 1 pt and 2 pt, and assign these new styles instead of the simple “italic” where necessary.

Text markup with small caps

In a chemical text, small caps (small capitals) are used in two places:

  • for the stereodescriptors d and l (but not for R and S, which are italicized and put in parentheses!) [d-glucose, l-histidine]
  • for m as the abbreviation for mol L−1 [1 m CuIISO4]

Small caps for oxidation states were abolished in September 2005 by the current IUPAC guideline.

Avoid a typical beginner's mistake: the d before glucose, for example, is typed as "d" and not as "D" before it is marked as small caps! Note that the common text editors, except TeX, do not print “real” small caps but merely reduced capital letters. This has the disadvantage of making them too thin, which doesn't look good (would-be small caps). You should use this markup all the same, as your text will then be correctly formatted for further electronic processing. In addition to these three usual markups, you can use small caps (but then really only proper ones, see below) to fit acronyms better into the running text, since a run of capital letters noticeably disturbs the flow of the text (here, too, the letters are not typed as capitals!). This example for once without highlighting, otherwise the calm of the text flow would not come across [the assignment was made using spectra and other tricks].

Paragraph formatting


Something that always goes wrong! Compare a well-set book or a journal like Angewandte Chemie with any dissertation that was not written with TeX. Your eye glides effortlessly over the former, but in a text set with the default settings of MS Word you constantly lose the line. The reason: the line spacing in the dissertation is too small (instead of the line spacing you can also specify the “leading”, which denotes the extra space between the lines). Single line spacing is printed: the line spacing, that is the distance from the baseline of one line to the baseline of the next, is then exactly the same as the character height, i.e. 12 pt for a 12 pt font, and the text is set "solid". Try the following: if, for example, you have chosen 12 pt Times Roman for A4 output, try a line spacing of 15 or 16 pt [rule of thumb for body text: about 20–30 % more than the character height, depending on whether the lines are long (more) or short (less)]. Get a feel for the connection between line spacing and line length by viewing the text below in a browser window that you drag back and forth between narrow and wide. If you find tight line spacing in a professional document, it will be a multi-column layout in which the individual lines are correspondingly short. This makes it possible to fill the page better and to use less paper. When trying things out, keep in mind that headings are usually set solid.

For comparison, here is this section again with single line spacing (this is what MS Word does if you haven't told it otherwise):

Something that always goes wrong! Compare a well-set book or a journal like Angewandte Chemie with any dissertation. Your eye glides effortlessly over the former, but in a text set with the default settings of MS Word you constantly lose the line. The reason: the line spacing in the dissertation is too small (instead of the line spacing you can also specify the “leading”, which denotes the extra space between the lines). Single line spacing is printed: the line spacing, that is the distance from the baseline of one line to the baseline of the next, is then exactly the same as the character height, i.e. 12 pt for a 12 pt font. Try the following: if, for example, you ...
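The rule of thumb for line spacing (about 20 to 30 % more than the character height) is simple arithmetic. As a rough sketch, with an invented function name and the two percentages taken directly from the rule:

```python
def suggested_line_spacing(font_size_pt: float) -> tuple:
    """Return (minimum, maximum) line spacing in points.

    Applies the 20-30 % rule of thumb: the lower value suits short
    lines, the upper value long lines. Results are rounded to 0.1 pt.
    """
    low = round(font_size_pt * 1.20, 1)
    high = round(font_size_pt * 1.30, 1)
    return low, high


if __name__ == "__main__":
    for size in (10, 11, 12):
        low, high = suggested_line_spacing(size)
        print(f"{size} pt font: {low} to {high} pt line spacing")
```

For a 12 pt font this gives 14.4 to 15.6 pt, so the 15 pt suggested above lies inside the range and 16 pt slightly beyond it, which is reasonable for the long lines of a single-column A4 page.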


Most dissertation authors seem to consider full justification mandatory. You would have a lot less trouble if you set your text flush left with a ragged right margin. Try out different things. There are people who find ragged setting easier to read and livelier, and justified setting conservative. To be on the safe side, manuscripts for journals are always written ragged-right without any word division (this text is written like this). This makes it unambiguously clear that every hyphen in the text is a real hyphen and is to be set, regardless of how the line breaks change.

Is my text typographically perfect if I've done everything like this?

Far from it! The rules compiled here are a compromise, since your text should not “only” be legible on paper but should also lend itself to further electronic processing. The more you do by hand to improve the typography, the harder it becomes to process your text without problems. The following things look great, but if you want to go down in the history of the research group as the Typo Pope, you should do them only after you have first saved an easily exchangeable version of the text:

  • Manual kerning, in which you eliminate the unsightly gaps that arise when certain characters meet and that the usual text editors leave behind even with the "automatic kerning" function switched on. In chemistry this happens, for example, when in Fe2III one tries to maneuver the III to the left above the 2, or when one wants to make [α]D25 look better without using a formula editor.
  • Representing correctly not just ½, ¼ and a few more fractions from the usual character set, but all of them. For us this case occurs most often with symmetry codes, for example: i 3/2 − x, 3/2 + y, 5/3 − z. For this you usually have to install the so-called expert fonts under Windows. For some fonts in which many Unicode characters are implemented (for example Lucida Sans Unicode or Palatino Linotype), this also works with the normal character set; take a look at the character table. As a test, here is an example in Palatino Linotype, in which you should see a properly set “4/9” if you have the font on your computer and your browser plays along [Δt = ⁴⁄₉ Δo]. If your browser can show that, then surely also the symmetry code [i ³⁄₂ − x, ³⁄₂ + y, ⁵⁄₃ − z].
  • Ligatures, something very elegant, which cannot be represented in HTML (in which this text is written). They essentially concern the letter sequences fi, fl, ft, ff, ffi, ffl and fft. Look at an fi in Times, for example. At the top right, the f and the dot of the i collide in an ugly way, as do the crossbar of the f and the top of the i. The corresponding ligature is a glyph of its own, in which the upper right hook of the f serves as the dot of the i: ﬁ (looks really cool, doesn't it?). The ligatures are also included in the expert fonts. That you can see the fi ligature in this text at all is because it and the fl ligature (ﬂ) are contained in the Unicode character set, at FB01 (ﬁ) and FB02 (ﬂ). Now the damper: a simple search and replace-all won't do! A ligature is permitted only if the letters in question do not belong to different parts of a compound word [two of the three Wasserstoffisotope (hydrogen isotopes) are found in nature; the fi at the joint between Wasserstoff and Isotope is set normally!]. By the way, TeX sets ligatures by default; remember to break them up in compound words. If you do obtain the expert font that matches your basic font, you will also find old-style figures in it: these are digits with ascenders and descenders that do not appear as obtrusive as normal digits. These cannot be represented in HTML either.
  • Using real small caps, i.e. switching to a small-caps font.
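Why a blind search and replace fails for ligatures can be shown in a few lines. In the Python sketch below, the function names and the "|" marker for the joint of a compound word are conventions invented for this illustration; only the two code points FB01 and FB02 come from the text above.

```python
import unicodedata

# The two ligatures named above, by their Unicode code points.
LIGATURES = {"fi": "\uFB01", "fl": "\uFB02"}


def ligate(word: str, boundary: str = "|") -> str:
    """Insert fi/fl ligatures, but never across a compound-word joint.

    The author marks the joint by hand, e.g. "Wasserstoff|isotope";
    the marker itself is removed from the result.
    """
    parts = word.split(boundary)
    for pair, glyph in LIGATURES.items():
        parts = [part.replace(pair, glyph) for part in parts]
    return "".join(parts)


def unligate(text: str) -> str:
    """Resolve ligatures again via compatibility normalization (NFKC).

    Note that NFKC also changes other compatibility characters,
    for example no-break spaces and superscript digits.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    print(ligate("Sulfit"))               # ligature inside one word part
    print(ligate("Wasserstoff|isotope"))  # no ligature across the joint
```

The manual marker is the weak point of the sketch: finding the joints automatically would need a dictionary-based hyphenation step, which is exactly why ligatures cannot simply be replaced throughout.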

However, font changes are sometimes a source of annoyance when texts are exchanged. If they are not necessary, at least avoid them in the body of the text. It would therefore be safer if you took symbols of all kinds, including Greek letters, from the Unicode repository. This text is an example. If you load it into MS Word together with the appropriate stylesheet, you only need to change the font for the headings. Of course, this recommendation limits you (in moderation), since you can only use "large" fonts such as Times or Arial, for which many Unicode characters have been worked out.

Which of these things does a journal like Angewandte Chemie actually do? A quick check of a current issue did not turn up any of the more complicated fractions. Real small caps appear to be used, but not for acronyms, which are set in normal capital letters; kerning is done well, ligatures are ignored, and old-style figures are used only in the footer. So things are done halfway.


Here are a few more peculiarities that play a role in our texts:

  • Since the letter "l" is very similar to the digit "1", you should, as some journals do and contrary to the recommendation, write L and mL for liter and milliliter and not l and ml [1.1 L, not 1.1 l].
  • For the same reason you should write C(1) or C-1 instead of C1, since otherwise it can be confused with Cl. The crystallographic tradition is the use of parentheses (try the command once and see what happens); in sugar chemistry, C-1 would be the traditional form. But that train seems to have left for C(1) and C-1. Most publishers don't care how you do it, so the simple form has prevailed in recent years. The advantages are obvious: parentheses take up space, and if things are already tight in the molecule, the difficulty of adding atom labels to a picture increases considerably. Hyphens lead to unsightly constructs in the text, for example with torsion angles: O2–C2–C3–O3 is easier to take in than O-2–C-2–C-3–O-3 (little exercise: why not use O-2-C-2-C-3-O-3?). So go ahead and use C1, C2, etc., even if this leads to sentences with a weighty second meaning ("The C4 position is incorrectly filled.").
  • Just as awkward as the attempts to beautify [α]D25, which in exasperation you probably end up entering in a formula editor, thereby greatly reducing its reusability, are the rotoinversion axes, which matter more to us. P1̄ can be represented well in HTML, but most text editors offer only unsatisfactory solutions (the journals have given up here and accept things like P1bar). If you are using MS Word, it is best to proceed as follows: create a new character style and, under the menu item Font, go to the submenu Character spacing. There select condensed spacing, by 10 pt. Also select, under the item Position, the setting Raised, by 1.5 pt. Now enter the overline as Unicode 00AF and type the digit to be overlined after it. Then assign the new character style to the overline. In Times Roman 12 pt this should now look good. With other fonts you have to adjust the two point values.
  • Use quantity calculus. This is a clear way of arriving at unambiguous table entries, and it works like this: you want to tabulate that the lattice constant a is 10.456(7) Å. The body of the table should contain only the numerical value 10.456(7). So you have to divide the expression 10.456(7) Å by Å to get the pure number. If you write down just a number, then, you are not tabulating a but a/Å, and that is exactly what you write in your table header or row label. To avoid confusion, do not use fraction slashes within the units themselves [c/mol L−1, not c/mol/L]. What is supposed to be less ambiguous about this procedure? Take a list of displacement parameters. The heading says Ueq [10⁴ pm²]. Now what? Are you supposed to multiply all entries by 10⁴, or have they already been multiplied by 10⁴, or what? With quantity calculus everything is clear: under Ueq/10⁴ pm² stand values of Ueq that have been divided by 10⁴ pm².
  • Vanadium(V) or V with a superscript V, but not V(V)! Always combine either the element name with an oxidation number on the baseline or the element symbol with a superscript oxidation number. The following example gives an idea of what all these rules are good for. The title of a publication in Inorg. Chem. reads: "... A Structurally Characterized U(VI)-I anion." Was it immediately clear to you that this is about uranium(VI), i.e. U with a superscript VI, and iodine? If you are ever unsure, there is a sensible overarching rule: element symbols are taboo for anything else! Abbreviate "cyclodextrin" as CD or cd, but never Cd (for the Bavarians among you: … never no Cd!).
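The quantity calculus described in the list above can be sketched in a few lines of Python. All names and the numerical value of Ueq are hypothetical; the point is only that the column header is the quantity divided by its unit, and that each table entry is obtained by the same division.

```python
def column_header(symbol: str, unit: str) -> str:
    """Build a table header as quantity divided by unit, e.g. a/Å."""
    return f"{symbol}/{unit}"


def table_entry(quantity: float, unit: float) -> float:
    """Divide a quantity by its unit; both are given in the same base unit."""
    return quantity / unit


if __name__ == "__main__":
    PM2 = 1.0                  # base unit of this sketch: 1 pm²
    u_eq = 253.0 * PM2         # hypothetical displacement parameter
    scaled_unit = 1e4 * PM2    # the unit used in the header: 10⁴ pm²

    print(column_header("Ueq", "10⁴ pm²"))   # header line of the column
    print(table_entry(u_eq, scaled_unit))    # pure number for the table body
```

Read backwards, a table entry multiplied by the unit in the header returns the original quantity, which is what makes the notation unambiguous.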

If you discover errors or discrepancies, please let us know or send a message to [email protected]
