sed
sed is a stream editor. A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline). While in some ways similar to an editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient. But it is sed's ability to filter text in a pipeline which particularly distinguishes it from other types of editors.
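sed may be invoked with the following command-line options:
`-V'
`--version'
    Print the version of sed that is being run and a copyright notice, then exit.
`-h'
`--help'
    Print a usage message briefly summarizing these command-line options and the bug-reporting address, then exit.
`-n'
`--quiet'
`--silent'
    By default, sed will print out the pattern space at the end of each cycle through the script. These options disable this automatic printing, and sed will only produce output when explicitly told to via the p command.
`-l N'
`--line-length=N'
    Specify the desired line-wrap length for the l command. A length of "0" (zero) means to never wrap long lines.
`-u'
`--unbuffered'
    Buffer both input and output as minimally as practical. (This is particularly useful if the input is coming from the likes of tail -f, and you wish to see the transformed output as soon as possible.)
`-e script'
`--expression=script'
    Add the commands in script to the set of commands to be run while processing the input.
`-f script-file'
`--file=script-file'
    Add the commands contained in the file script-file to the set of commands to be run while processing the input.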
If no `-e', `-f', `--expression', or `--file' options are given on the command-line, then the first non-option argument on the command line is taken to be the script to be executed.
If any command-line parameters remain after processing the above, these parameters are interpreted as the names of input files to be processed. A file name of `-' refers to the standard input stream. The standard input will be processed if no file names are specified.
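The invocation rules above can be sketched with a few small pipelines. This is only an illustration: the sample input and the variable names (input, first, both, quiet, stdin_last) are made up for the example and are not part of sed.

```shell
# Hypothetical sample input, used only for illustration.
input='alpha
beta
gamma'

# No -e or -f given: the first non-option argument is taken as the script.
first=$(printf '%s\n' "$input" | sed 's/a/A/')

# Two -e fragments are combined, in order, into one script.
both=$(printf '%s\n' "$input" | sed -e 's/a/A/' -e 's/$/!/')

# -n disables auto-print, so only the explicit p command produces output.
quiet=$(printf '%s\n' "$input" | sed -n '/beta/p')

# A file name of - refers to the standard input stream.
stdin_last=$(printf '%s\n' "$input" | sed -n '$p' -)

printf '%s\n' "$first" "$both" "$quiet" "$stdin_last"
```

Without the g flag, each s command replaces only the first match on a line, which is why only one letter per line is capitalized.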
sed Programs
A sed program consists of one or more sed commands, passed in by one or more of the `-e', `-f', `--expression', and `--file' options, or the first non-option argument if none of these options are used. This document will refer to "the" sed script; this will be understood to mean the in-order catenation of all of the scripts and script-files passed in.
Each sed command consists of an optional address or address range, followed by a one-character command name and any additional command-specific code.
Addresses in a sed script can be in any of the following forms:
number
    Specifying a line number will match only that line in the input. (Note that sed counts lines continuously across all input files.)
first~step
    This GNU extension matches every stepth line starting with line first. Thus, to select the odd-numbered lines, one would use 1~2; to pick every third line starting with the second, 2~3 would be used; to pick every fifth line starting with the tenth, use 10~5; and 50~0 is just an obscure way of saying 50.
$
    This address matches the last line of the last file of input.
/regexp/
    This will select any line which matches the regular expression regexp. If regexp itself includes any / characters, each must be escaped by a backslash (\).
\%regexp%
    (The % may be replaced by any other single character.) This also matches the regular expression regexp, but allows one to use a different delimiter than /. This is particularly useful if the regexp itself contains a lot of /s, since it avoids the tedious escaping of every /. If regexp itself includes any delimiter characters, each must be escaped by a backslash (\).
/regexp/I
\%regexp%I
    The I modifier to regular-expression matching is a GNU extension which causes the regexp to be matched in a case-insensitive manner.
If no addresses are given, then all lines are matched; if one address is given, then only lines matching that address are matched.
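As a rough illustration of the single-address forms, the following sketch selects lines by number, by step, with an alternate regexp delimiter, and case-insensitively. The sample data and variable names are invented for the example; the first~step form and the I modifier are GNU extensions, so this assumes GNU sed.

```shell
# Sample data: the numbers 1 through 10, one per line (illustrative only).
numbers=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10)

# A plain line number selects exactly that line.
third=$(printf '%s\n' "$numbers" | sed -n '3p')

# first~step (GNU extension): every third line starting with the second.
stepped=$(printf '%s\n' "$numbers" | sed -n '2~3p')

# \%regexp% avoids escaping the slashes in a path-like pattern.
paths=$(printf '/usr/bin\n/etc\n' | sed -n '\%/usr/%p')

# The I modifier (GNU extension) matches without regard to case.
nocase=$(printf 'Foo\nbar\n' | sed -n '/foo/Ip')

printf '%s\n' "$third" "$stepped" "$paths" "$nocase"
```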
An address range can be specified by giving two addresses separated by a comma (,). An address range matches lines starting from where the first address matches, and continues until the second address matches (inclusively). If the second address is a regexp, then checking for the ending match will start with the line following the line which matched the first address. If the second address is a number less than (or equal to) the line matching the first address, then only the one line is matched.
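The range rules can be checked with a short sketch like the one below. The input lines and variable names are arbitrary choices for the example.

```shell
# Five sample lines, a through e (illustrative only).
letters=$(printf '%s\n' a b c d e)

# Numeric range, inclusive at both ends.
by_number=$(printf '%s\n' "$letters" | sed -n '2,4p')

# Regexp range: starts at the first match of /b/, ends at the match of /d/.
by_regexp=$(printf '%s\n' "$letters" | sed -n '/b/,/d/p')

# The ending regexp is first tested on the line AFTER the starting line,
# so the range covers the first two lines here, not just the first.
same_regexp=$(printf 'x\nx\ny\n' | sed -n '/x/,/x/p')

# Second address numerically not greater than the first: one line only.
backwards=$(printf '%s\n' "$letters" | sed -n '4,2p')

printf '%s\n' "$by_number" "$by_regexp" "$same_regexp" "$backwards"
```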
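GNU sed also supports some special 2-address forms:
addr1,+N
    Matches addr1 and the N lines following addr1.
addr1,~N
    Matches addr1 and the lines following addr1 until the next line whose input line number is a multiple of N.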
Appending the ! character to the end of an address specification will negate the sense of the match. That is, if the ! character follows an address range, then only lines which do not match the address range will be selected. This also works for singleton addresses, and, perhaps perversely, for the null address.
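A small sketch of negation with a range and with a single address follows. The sample numbers and variable names are illustrative only.

```shell
# Sample data: the numbers 1 through 7, one per line.
numbers=$(printf '%s\n' 1 2 3 4 5 6 7)

# Negated range: print every line that is NOT in lines 3 through 5.
outside=$(printf '%s\n' "$numbers" | sed -n '3,5!p')

# Negated single address: print every line except line 2.
not_second=$(printf '%s\n' "$numbers" | sed -n '2!p')

printf '%s\n' "$outside" "$not_second"
```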
[[I may add a brief overview of regular expressions at a later date; for now see any of the various other documentations for regular expressions, such as the awk info page.]]
sed buffers data
sed maintains two data buffers: the active pattern space, and the auxiliary hold space. In "normal" operation, sed reads in one line from the input stream and places it in the pattern space. This pattern space is where text manipulations occur. The hold space is initially empty, but there are commands for moving data between the pattern and hold spaces.
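To make the two buffers concrete, here is a minimal sketch that reverses the order of its input lines by accumulating them in the hold space. It relies on the standard h command (copy the pattern space to the hold space) and G command (append a newline and the hold space to the pattern space); the sample input and the variable name are assumptions for the example.

```shell
# 1!G  on every line but the first, append the hold space to the pattern space
# h    copy the (growing) pattern space back into the hold space
# $p   on the last line, print the accumulated text
reversed=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')

printf '%s\n' "$reversed"
```

Because the whole input ends up in both buffers, this approach is only sensible for inputs that fit comfortably in memory.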
If you use sed at all, you will quite likely want to know these commands.
#
    The # character begins a comment; the comment continues until the next newline. If you are concerned about portability, be aware that some implementations of sed (which are not POSIX.2 conformant) may only support a single one-line comment, and then only when the very first character of the script is a #.
    Warning: if the first two characters of the sed script are #n, then the `-n' (no-autoprint) option is forced. If you want to put a comment in the first line of your script and that comment begins with the letter `n' and you do not want this behavior, then be sure to either use a capital `N', or place at least one space before the `n'.
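The #n warning can be demonstrated with a throwaway script file. This is a sketch only: the temporary file, its contents, and the variable names are invented for the example, and it assumes the mktemp utility is available.

```shell
# Hypothetical script file, created only for this demonstration.
script=$(mktemp)

# First line is exactly "#n": auto-print is switched off, as if -n were given.
printf '#n\n/b/p\n' > "$script"
forced=$(printf 'a\nb\n' | sed -f "$script")

# A space after the # makes it an ordinary comment; auto-print stays on,
# so the matching line appears twice (auto-print plus the p command).
printf '# note\n/b/p\n' > "$script"
normal=$(printf 'a\nb\n' | sed -f "$script")

rm -f "$script"
printf '%s\n' "$forced" "$normal"
```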
s/regexp/replacement/flags
    (The / characters may be uniformly replaced by any other single character within any given s command.) The / character (or whatever other character is used in its stead) can appear in the regexp or replacement only if it is preceded by a \ character. Also, newlines may appear in the regexp using the two-character sequence \n.
    The s command attempts to match the pattern space against the supplied regexp. If the match is successful, then that portion of the pattern space which was matched is replaced with replacement.
    The replacement can contain \n (n being a number from 1 to 9, inclusive) references, which refer to the portion of the match which is contained between the nth \( and its matching \). Also, the replacement can contain unescaped & characters which will reference the whole matched portion of the pattern space. To include a literal \, &, or newline in the final replacement, be sure to precede the desired \, &, or newline in the replacement with a \.
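    The s command can be followed with zero or more of the following flags:
    g
        Apply the replacement to all matches to the regexp, not just the first.
    p
        If the substitution was made, then print the new pattern space.
    number
        Only replace the numberth match of the regexp. Note: the POSIX.2 standard does not specify what should happen when you mix the g and number modifiers, and currently there is no widely agreed upon meaning across sed implementations. As of recent implementations of GNU sed, the interaction is defined to be: ignore matches before the numberth, and then match and replace all matches from the numberth on.
    w file-name
        If the substitution was made, then write out the result to the named file.
    I
        (This is a GNU extension.) Match regexp in a case-insensitive manner.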
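The s command's replacement syntax and flags can be tried out with a sketch like the following. The inputs and variable names are illustrative; the combined 2g form shows the GNU sed behavior described above and may differ in other implementations.

```shell
# \1 and \2 refer to the first and second \( ... \) groups.
swapped=$(printf 'hello world\n' | sed 's/\(hello\) \(world\)/\2 \1/')

# An unescaped & stands for the whole matched text.
bracketed=$(printf 'cat\n' | sed 's/cat/[&]/')

# A different delimiter avoids escaping slashes in the regexp.
moved=$(printf '/usr/local\n' | sed 's|/usr|/opt|')

# g replaces every match; a number replaces only that occurrence.
all=$(printf 'aaaa\n' | sed 's/a/b/g')
second=$(printf 'aaaa\n' | sed 's/a/b/2')

# GNU sed: number and g together replace from the numberth match onward.
from_second=$(printf 'aaaa\n' | sed 's/a/b/2g')

# The p flag prints only lines where a substitution was made (with -n).
printed=$(printf 'x\ny\n' | sed -n 's/x/X/p')

printf '%s\n' "$swapped" "$bracketed" "$moved" "$all" "$second" "$from_second" "$printed"
```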
q
    Exit sed without processing any more commands or input. Note that the current pattern space is printed if auto-print is not disabled.
p
    Print out the pattern space (to the standard output). Note: some implementations of sed, such as this one, will double-print lines when auto-print is not disabled and the p command is given. Other implementations will only print the line once. Both ways conform with the POSIX.2 standard, and so neither way can be considered to be in error. Portable sed scripts should thus avoid relying on either behavior; either use the `-n' option and explicitly print what you want, or avoid use of the p command (and also the p flag to the s command).
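The difference between p with and without auto-print can be seen in a two-line sketch. The double-printing shown here is the GNU sed behavior; as the note above says, other implementations may print only once. Variable names are illustrative.

```shell
# Auto-print enabled: GNU sed emits the line once for p and once at end of cycle.
doubled=$(printf 'x\n' | sed p)

# With -n, the explicit p is the only source of output (the portable form).
once=$(printf 'x\n' | sed -n p)

printf '%s\n' "$doubled" "$once"
```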
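n
    If auto-print is not disabled, print the pattern space, then, regardless, replace it with the next line of input. If there is no more input then sed exits without processing any more commands.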
{ commands }
    A group of commands may be enclosed between { and } characters. This is particularly useful when you want a group of commands to be triggered by a single address (or address-range) match.
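As a brief sketch of grouping, the block below applies two substitutions only to lines matching one address. The sample input and variable name are made up for the example.

```shell
# Both s commands run only on lines that start with b.
# The ; before the closing } keeps the script acceptable to stricter seds.
grouped=$(printf 'a=1\nb=2\n' | sed '/^b/{s/2/two/;s/b/B/;}')

printf '%s\n' "$grouped"
```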
Though perhaps less frequently used than those in the previous section, some very small yet useful sed scripts can be built with these commands.
y/source-chars/dest-chars/
    (The / characters may be uniformly replaced by any other single character within any given y command.) Transliterate any characters in the pattern space which match any of the source-chars with the corresponding character in dest-chars. Instances of the / (or whatever other character is used in its stead), \, or newlines can appear in the source-chars or dest-chars lists, provided that each instance is escaped by a \. The source-chars and dest-chars lists must contain the same number of characters (after de-escaping).
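Transliteration with y can be sketched as follows; the inputs and variable names are arbitrary examples.

```shell
# Each l becomes 0 and each o becomes 1; other characters are untouched.
digits=$(printf 'hello world\n' | sed 'y/lo/01/')

# Using , as the delimiter lets / appear unescaped in the source list.
dashed=$(printf 'a/b\n' | sed 'y,/,-,')

printf '%s\n' "$digits" "$dashed"
```

Unlike s, the y command works character by character and takes no regular expression.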
a\
text
    Queue the lines of text which follow this command (each but the last ending with a \, which will be removed from the output) to be output at the end of the current cycle, or when the next input line is read.
    GNU extension: if, between the a and the newline there is other than a whitespace-\ sequence, then the text of this line, starting at the first non-whitespace character after the a, is taken as the first line of the text block. (This enables a simplification in scripting a one-line add.) This extension also works with the i and c commands.
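The two ways of writing an append can be compared in a small sketch. The first form is the traditional one; the second uses the GNU one-line extension described above and so assumes GNU sed. Input text and variable names are illustrative.

```shell
# Traditional form: a\ followed by a newline, then the text.
portable=$(printf 'one\ntwo\n' | sed '1a\
inserted')

# GNU extension: the text may follow the a on the same line.
one_liner=$(printf 'one\ntwo\n' | sed '1a inserted')

printf '%s\n' "$portable" "$one_liner"
```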
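i\
text
    Immediately output the lines of text which follow this command (each but the last ending with a \, which will be removed from the output).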
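c\
text
    Delete the lines matching the address or address-range, and output the lines of text which follow this command (each but the last ending with a \, which will be removed from the output) in place of the last line (or in place of each line, if no addresses were specified). A new cycle is started after this command is done, since the pattern space will have been deleted.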
l
    Print the pattern space in an unambiguous form: non-printable characters (and the \ character) are printed in C-style escaped form; long lines are split, with a trailing \ character to indicate the split; the end of each line is marked with a $.
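A quick sketch of the l command's escaped output, using an input line that contains a tab and a backslash. The input and the variable name are chosen only for illustration.

```shell
# The input is: a, TAB, b, backslash, c.
# l shows the tab as \t, doubles the backslash, and marks the line end with $.
shown=$(printf 'a\tb\\c\n' | sed -n 'l')

printf '%s\n' "$shown"
```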
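w filename
    Write the pattern space to filename. The filename will be created (or truncated) before the first input line is read; all w commands (including instances of the w flag on successful s commands) which refer to the same filename are output through the same FILE stream.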
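N
    Add a newline to the pattern space, then append the next line of input to the pattern space. If there is no more input then sed exits without processing any more commands.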
sed programmers
In most cases, use of these commands indicates that you are probably better off programming in something like perl. But occasionally one is committed to sticking with sed, and these commands can enable one to write quite convoluted scripts.
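: label
    Specify the location of label for the b and t commands. In all other respects, a no-op.
b label
    Unconditionally branch to label. The label may be omitted, in which case the next cycle is started.
t label
    Branch to label only if there has been a successful substitution since the last input line was read or t branch was taken. The label may be omitted, in which case the next cycle is started.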
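Labels and branches can be illustrated with two small scripts. The label name again, the inputs, and the variable names are all invented for the example; the commands are written one per line, which is the most portable way to use labels.

```shell
# Loop: keep replacing a double space with a single space until s fails.
# t branches back to :again only after a successful substitution.
squeezed=$(printf 'a    b\n' | sed ':again
s/  / /
t again')

# b with no label skips the rest of the script for matching lines,
# so only the non-matching line gets the trailing "!".
skipped=$(printf 'keep\nskip me\n' | sed '/skip/b
s/$/!/')

printf '%s\n' "$squeezed" "$skipped"
```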
[[Not this release, sorry. But check out the scripts in the testsuite directory, and the amazing `dc.sed' script in the top-level directory of this distribution.]]
For those who want to write portable sed scripts, be aware that some implementations have been known to limit line lengths (for the pattern and hold spaces) to be no more than 4000 bytes. The POSIX.2 standard specifies that conforming sed implementations shall support at least 8192 byte line lengths. GNU sed has no built-in limit on line length; as long as sed can malloc() more (virtual) memory, it will allow lines as long as you care to feed it (or construct within it).
In addition to several books that have been written about sed (either specifically or as chapters in books which discuss shell programming), one can find out more about sed (including suggestions of a few books) from the FAQ for the sed-users mailing list, available from any of:
http://www.cornerstonemag.com/sed/sedfaq.html
http://www.dbnet.ece.ntua.gr/~george/sed/sedfaq.html
http://seders.icheme.org/tutorials/sedfaq.html
http://www.ptug.org/sed/sedfaq.html
Also of interest is http://seders.icheme.org/. This site includes sed tutorials under its http://seders.icheme.org/tutorials/ section, in addition to other sed-related goodies.
There is a "sed-users" mailing list maintained by Eric Pement. To subscribe, send e-mail to Majordomo@jpusa.chi.il.us with an arbitrary Subject: line and the command "subscribe sed-users" in the body of your email message.
Email bug reports to bug-gnu-utils@gnu.org. Be sure to include the word "sed" somewhere in the Subject: field. Also, please include the output of sed --version in the body of your report if at all possible.
This document was generated on 27 May 2001 using the texi2html translator version 1.52.