When you print to the console, as you are doing, a 'non-breaking space' gets no special treatment, and one is not normally needed: the console normally uses a fixed-width font and does nothing that would merge a run of spaces into a single space.

A hard space can be produced with the HTML code &nbsp; instead of the space bar: 19&nbsp;kg yields a non-breaking 19 kg. A literal hard space, such as one of the Unicode non-breaking space characters, should not be used, since some web browsers will not load them properly when editing.
In word processing and digital typesetting, a non-breaking space (' '), also called no-break space, non-breakable space (NBSP), hard space, or fixed space, is a space character that prevents an automatic line break at its position. In some formats, including HTML, it also prevents consecutive whitespace characters from collapsing into a single space. In HTML, the common non-breaking space, which is the same width as the ordinary space character, is encoded as &nbsp; or &#160;. In Unicode, it is encoded as U+00A0. Non-breaking space characters with other widths also exist.
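As a quick check of the Unicode encoding mentioned above, the following sketch prints the UTF-8 bytes of U+00A0. It assumes only a POSIX shell with printf, od and tr; the variable names are illustrative, and the octal escapes \302\240 are simply the bytes C2 A0 written in a form that any POSIX printf accepts.

```shell
# U+00A0 encoded in UTF-8 is the two bytes C2 A0 (octal 302 240).
nbsp=$(printf '\302\240')

# Dump the bytes one at a time; tr strips the blanks and newline od adds.
nbsp_hex=$(printf '%s' "$nbsp" | od -An -tx1 | tr -d ' \n')
echo "$nbsp_hex"   # prints c2a0
```

Dumping with -tx1 keeps the bytes in file order, which is easier to read than the two-byte words that od -x prints.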
Uses and variations

Despite having layout and uses similar to those of whitespace, it differs in contextual behavior.

Non-breaking behavior

Text-processing software typically assumes that an automatic line break may be inserted anywhere a space character occurs; a non-breaking space prevents this from happening (provided the software recognizes the character). For example, if the text '100 km' will not quite fit at the end of a line, the software may insert a line break between '100' and 'km'.
An editor who finds this behaviour undesirable may choose to use a non-breaking space between '100' and 'km'. This guarantees that the text '100 km' will not be broken: if it does not fit at the end of a line, it is moved in its entirety to the next line.
Non-collapsing behavior

A second common application of non-breaking spaces is in file formats such as SGML, HTML, TeX and LaTeX, whose rendering engines are programmed to treat sequences of whitespace characters (space, newline, tab, etc.) as if they were a single character (but this behavior can be overridden). Such 'collapsing' of whitespace allows the author to neatly arrange the source text using line breaks, indentation and other forms of spacing without affecting the final typeset result. In contrast, non-breaking spaces are not merged with neighboring whitespace characters when displayed, and can therefore be used by an author to simply insert additional visible space in the resulting output without using spans styled with peculiar values of the "white-space" property. Conversely, indiscriminate use (see the recommended use in style guides), in addition to a normal space, gives extraneous space in the output.

Width variation

Other non-breaking variants, defined in Unicode:

U+202F NARROW NO-BREAK SPACE (HTML &#8239;, NNBSP). It was introduced in Unicode 3.0 for Mongolian, to separate a suffix from the word stem without indicating a word boundary. It is also required for big punctuation in French, sometimes inaccurately referred to as "double punctuation" (before ;, ?, !, », › and after «, ‹; today often also before :) and Russian (before —), and in German between multi-part abbreviations (e.g. „z. B.“, „d. h.“, „v. l. n. r.“). When used with Mongolian, its width is usually one third of the normal space; in other contexts, its width is about 70% of the normal space but may resemble that of the thin space (U+2009), at least with some fonts. Also, starting from release 34 of the Unicode Common Locale Data Repository (CLDR), the NNBSP is used in numbers as the thousands group separator for the French locale.
U+2007 FIGURE SPACE (HTML &#8199;). Produces a space equal to the figure (0–9) characters.

U+2060 WORD JOINER (HTML &#8288;, WJ): encoded in Unicode since version 3.2. The word joiner does not produce any space, and prohibits a line break at its position.

Encodings

Format: representation of non-breaking space
Unicode: U+00A0 NO-BREAK SPACE
UTF-8: C2 A0
ISO/IEC 8859 (parts 1-16): A0
Windows-1252: A0
KOI8-R: 9A
EBCDIC: 41 (RSP, Required Space)
Code page 437, code page 850: FF
ASCII: not available

Unicode defines several other non-break space characters.

Encoding remarks:

Word joiner, encoded in Unicode 3.2 and above as U+2060, and in HTML as &#x2060; or &#8288;.

Byte order mark (BOM), U+FEFF, which may be interpreted as a 'zero width no-break space', a deprecated alternative to word joiner.

Keyboard entry methods

It is rare for national or international standards on keyboard layouts to define an input method for the non-breaking space. An exception is the Finnish multilingual keyboard, accepted as the national standard SFS 5966 in 2008.
According to the SFS setting, the non-breaking space can be entered with the key combination AltGr+Space. Typically, authors of keyboard drivers and application programs (e.g., word processors) have devised their own keyboard shortcuts for the non-breaking space.

For example:

System/application: entry method
Microsoft Windows: Alt+0160 or Alt+255 (doesn't always work)
macOS: ⌥ Opt+Space
Linux or Unix using X11: Compose, Space, Space or AltGr+Space
AmigaOS: Alt+Space
GNU Emacs: Ctrl+X 8 Space
Vim: Ctrl+K, Space, Space; or Ctrl+K, ⇧ Shift+N, ⇧ Shift+S
Dreamweaver, LibreOffice, Microsoft Word, OpenOffice.org (since 3.0): Ctrl+⇧ Shift+Space
FrameMaker, LyX (non-Mac), OpenOffice.org (before 3.0), WordPerfect: Ctrl+Space
Mac Adobe InDesign: ⌥ Opt+⌘ Cmd+X

Apart from this, applications and environments often have methods of entering characters directly by code point, e.g. via the Alt Numpad input method. (Non-breaking space has code point 255 decimal (FF hex) in code page 437 and code page 850, and code point 160 decimal (A0 hex) in code page 1252.)
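Because the same character is byte FF in some code pages and byte A0 in others, the byte to replace depends on the file's encoding. The sketch below shows one way to handle the single-byte cases with tr. The helper name replace_nbsp_byte and the sample strings are hypothetical, and the code assumes a POSIX shell.

```shell
# Replace a single-byte non-breaking space with an ordinary space.
# $1 is the byte value in octal: 240 (hex A0) or 377 (hex FF).
# replace_nbsp_byte is a hypothetical helper name used for illustration.
replace_nbsp_byte() {
    # LC_ALL=C makes tr treat its input as raw bytes, not characters.
    LC_ALL=C tr "\\$1" ' '
}

# Sample data: "100", one non-breaking space byte, "km".
a0_out=$(printf '100\240km' | replace_nbsp_byte 240)   # A0 encodings
ff_out=$(printf '100\377km' | replace_nbsp_byte 377)   # FF code pages
echo "$a0_out"
echo "$ff_out"
```

This only suits single-byte encodings. In UTF-8 the character is the two-byte sequence C2 A0, which tr cannot replace as a unit in the C locale.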
I am having some problems with config files which have non-breaking space characters in them. How should I specify that character with sed so I can replace it with a space?

    sed -n 's/ / /g'

Examples of the errors:

    service named restart
    Stopping named: OK
    Starting named:
    Error in named configuration:
    named.localhost:2: unknown RR type 'SOA '
    named.localhost:8: unknown RR type '@'
    named.localhost:9: unknown RR type '127.0.0.1'
    named.localhost:10: unknown RR type '::1'

I tried to include a line from the original offending file in this post. It does not seem to be working. Pastebin download seems to be the only tool that keeps all the original binary. You should be able to copy and paste the original line and have it work in your terminal.

Mine is from gnuwin32; its sed comes with a PDF manual. There is a manual online here. Search both for "hex" with Edit, Find and you find the same contents. One heading differs very slightly ('some sample scripts' in my PDF versus 'examples' in the online one), but most of the rest is the same word for word: I could search for phrases from one and find the same content in the other, and the same smaller headings too. So the details are the same. The online manual seems to be almost identical to what I have.
– Dec 12 '12 at 12:24

The answer to this question depends on which representation of the non-breaking space you are encountering. Below are examples of how to replace each of the representations mentioned in the question's title, and additionally the UTF-8 version (C2 A0) that the OP is actually asking about according to the pastebin output. All examples use printf to generate the output as it is more portable than echo; the \x escapes assume GNU sed and a printf that understands them, such as the bash builtin. The space characters are replaced by X's to make the output clearer.

Examples

HTML:

    printf '&nbsp;\n' | sed 's/&nbsp;/X/g'
    printf '&#160;\n' | sed 's/&#160;/X/g'
    printf '&#xA0;\n' | sed 's/&#xA0;/X/g'

octal 240 = decimal 160 = hex A0:

    printf '\xA0\n' | sed 's/\xA0/X/g'

Or with tr:

    printf '\xA0\n' | tr '\240' 'X'

U+00A0:

    printf '\x00\xA0\n' | sed 's/\x00\xA0/X/g'

UTF-8:

    printf '\xC2\xA0\n' | sed 's/\xC2\xA0/X/g'

Result

Output in all of the above cases is:

    X

Answer

Now to your question, you have data that looks like this (in the original line, the blanks after the first @ and after SOA include UTF-8 non-breaking spaces):

    printf '@ IN SOA @ rname.invalid. (' | od -x

Output:

    0000000 c240 c2a0 c2a0 c2a0 c2a0 c2a0 20a0 4e 5320 414f a0c2 4020 7220 616e 656d 692e
    0000040 766e 6c61 6469 202e 0a

od -x groups bytes into 16-bit words in host byte order, so on a little-endian machine the byte pair C2 A0 appears as c2a0 or a0c2 depending on alignment.

In order to replace the C2 A0s with ordinary space, use this:

    printf '@ IN SOA @ rname.invalid. (' | sed 's/\xC2\xA0/ /g' | od -x

Output:

    00 2020 2020 2020 4e49 5320 414f 20 2040 6e72 6d61 2e65 6e69 6176 696c 2e 2820 000a
    0000044

Thanks to all those who helped me get to a working solution. I tried to include a line from the original offending file in this post.
It does not seem to be working. Pastebin download seems to be the only tool that keeps all the original binary. You should be able to copy and paste the original line and have it work in your terminal.

So here is what happens if I remove the octal 0240 or hex \xA0. It adds some other funky characters.

    $ echo '@ IN SOA @ rname.invalid. (' | sed -e 's/\xA0//g'
    @ IN SOA @ rname.invalid. (

There is some extra data not printed in the actual files. I found the tool od quite useful to show me what the actual hex / oct / binary values for the whole line are.

    $ echo '@ IN SOA @ rname.invalid. (' | od -x
    0000000 c240 c2a0 c2a0 c2a0 c2a0 c2a0 20a0 4e 5320 414f a0c2 4020 7220 616e 656d 692e
    0000040 766e 6c61 6469 202e 0a

The other character that kept showing up was \xC2. It is not printed when the non-breaking space \xA0 is there, but shows up if the nbsp is removed. So I had to modify the sed line to remove it as well.
This is what worked for me.

    $ echo '@ IN SOA @ rname.invalid. (' | sed -e 's/\xC2\xA0/ /g'
    @ IN SOA @ rname.invalid.

I think non-breaking spaces can come out funny when you pipe them to some commands, od included. For me, echoing an a followed by non-breaking spaces and piping it to od -tx1 prints 61 ff ff ff ff ff ff ff ff ff ff 0d 0a. So to remove my non-breaking spaces, I have to do

    echo a b | sed 's/\xff/we/g'

Your non-breaking spaces come out funny too, but as a different funny code from mine. (I can copy/paste non-breaking spaces that echo outputs, so echo doesn't mess them up, but they get messed up when piped.) So what we're able to do is use sed on the messed-up codes, which we can see with od.

– Dec 12 '12 at 14:33
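Pulling the thread's working solution together, here is a minimal sketch that finds and replaces UTF-8 non-breaking spaces in a file without relying on \x escapes in sed. It assumes a POSIX shell plus mktemp. The temporary file and its contents are hypothetical stand-ins for the zone file in the question, and the variable names are illustrative.

```shell
# Build the two-byte UTF-8 sequence for U+00A0 once, using octal escapes.
nbsp=$(printf '\302\240')

# Hypothetical sample file standing in for the zone file from the question.
tmpfile=$(mktemp)
printf '@%s%sIN SOA%s@ rname.invalid. (\n' "$nbsp" "$nbsp" "$nbsp" > "$tmpfile"

# Count the lines that contain at least one non-breaking space.
before=$(LC_ALL=C grep -c "$nbsp" "$tmpfile" || true)

# Replace every non-breaking space with an ordinary space. The literal
# bytes are interpolated into the script, so sed needs no \x support.
cleaned=$(LC_ALL=C sed "s/$nbsp/ /g" "$tmpfile")

# Count again on the cleaned text; grep exits non-zero when nothing matches.
after=$(printf '%s\n' "$cleaned" | LC_ALL=C grep -c "$nbsp" || true)

echo "lines with NBSP before: $before, after: $after"
echo "$cleaned"
rm -f "$tmpfile"
```

Running everything under LC_ALL=C treats the data as raw bytes, so the match does not depend on the locale's idea of a character. To change a real file, redirect the sed output to a new file and move it into place once the result has been checked.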