GNU Coreutils - manual - 8. Operating on fields

8. Operating on fields


8.1 cut: Print selected parts of lines

cut writes to standard output selected parts of each line of each input file, or standard input if no files are given or for a file name of `-'. Synopsis:

cut option… [file]…

In the table which follows, the byte-list, character-list, and field-list are one or more numbers or ranges (two numbers separated by a dash) separated by commas. Bytes, characters, and fields are numbered starting at 1. Incomplete ranges may be given: `-m' means `1-m'; `n-' means `n' through end of line or last field. The list elements can be repeated, can overlap, and can be specified in any order; but the selected input is written in the same order that it is read, and is written exactly once.
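The list rules above can be seen in a minimal sketch. The sample line `abcdefgh` is an assumption chosen for illustration; the list is deliberately out of order and overlapping.

```shell
# Assumed sample line; positions are counted from 1.
line='abcdefgh'

# Out-of-order, overlapping list: cut still writes the selected
# bytes in input order, and each selected byte exactly once.
reordered=$(printf '%s\n' "$line" | cut -b 5-7,1,6)

# Incomplete ranges: -3 means 1-3; 6- means 6 through end of line.
head3=$(printf '%s\n' "$line" | cut -b -3)
tail6=$(printf '%s\n' "$line" | cut -b 6-)

printf '%s\n' "$reordered" "$head3" "$tail6"
# prints aefg, abc and fgh on separate lines
```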

The program accepts the following options. Also see Common options.

`-b byte-list'
`--bytes=byte-list'

Select for printing only the bytes in positions listed in byte-list. Tabs and backspaces are treated like any other character; they take up 1 byte. If an output delimiter is specified (see the description of `--output-delimiter'), then output that string between ranges of selected bytes.

`-c character-list'
`--characters=character-list'

Select for printing only the characters in positions listed in character-list. The same as `-b' for now, but internationalization will change that. Tabs and backspaces are treated like any other character; they take up 1 character. If an output delimiter is specified (see the description of `--output-delimiter'), then output that string between ranges of selected bytes.

`-f field-list'
`--fields=field-list'

Select for printing only the fields listed in field-list. Fields are separated by a TAB character by default. Also print any line that contains no delimiter character, unless the `--only-delimited' (`-s') option is specified.

Note that awk supports more sophisticated field processing. By default it uses (and discards) runs of blank characters to separate fields, and ignores leading and trailing blanks.

awk '{print $2}'    # print the second field
awk '{print $(NF-1)}' # print the penultimate field
awk '{print $2,$1}' # reorder the first two fields

In the unlikely event that awk is unavailable, one can use the join command to process blank characters as awk does above.

join -a1 -o 1.2     - /dev/null # print the second field
join -a1 -o 1.2,1.1 - /dev/null # reorder the first two fields

`-d input_delim_byte'
`--delimiter=input_delim_byte'

With `-f', use the first byte of input_delim_byte as the input fields separator (default is TAB).
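As a small illustration of `-d' together with `-f', the sketch below splits a colon-separated record. The record is a hypothetical one in the style of /etc/passwd, not taken from a real system.

```shell
# Hypothetical colon-separated record (passwd style).
record='alice:x:1000:1000:Alice:/home/alice:/bin/sh'

# Select fields 1 and 7, splitting on ':' instead of the default TAB.
# The output separator defaults to the input delimiter.
name_shell=$(printf '%s\n' "$record" | cut -d : -f 1,7)

printf '%s\n' "$name_shell"
# prints alice:/bin/sh
```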

`-n'

Do not split multi-byte characters (no-op for now).

`-s'
`--only-delimited'

For `-f', do not print lines that do not contain the field separator character. Normally, any line without a field separator is printed verbatim.
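The difference `-s' makes can be sketched with two assumed input lines, only one of which contains the default TAB delimiter:

```shell
# Two assumed input lines: the first has a TAB, the second has none.
input=$(printf 'key\tvalue\nno delimiter here')

# Default: the line without a delimiter is printed verbatim.
default_out=$(printf '%s\n' "$input" | cut -f 2)

# With -s: the line without a delimiter is suppressed.
strict_out=$(printf '%s\n' "$input" | cut -s -f 2)

printf 'default:\n%s\nwith -s:\n%s\n' "$default_out" "$strict_out"
```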

`--output-delimiter=output_delim_string'

With `-f', output fields are separated by output_delim_string. The default with `-f' is to use the input delimiter. When using `-b' or `-c' to select ranges of byte or character offsets (as opposed to ranges of fields), output output_delim_string between non-overlapping ranges of selected bytes.
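The following sketch shows both uses of `--output-delimiter' described above. It assumes GNU cut, since other implementations may not provide this option; the input strings are made up for illustration.

```shell
# With -f: replace the ':' input delimiter with ' -> ' on output.
fields=$(printf 'a:b:c\n' | cut -d : -f 1,3 --output-delimiter=' -> ')

# With -b: put '/' between the two non-overlapping byte ranges.
ranges=$(printf 'abcdefgh\n' | cut -b 1-2,5-6 --output-delimiter=/)

printf '%s\n' "$fields" "$ranges"
# prints "a -> c" and "ab/ef"
```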

`--complement'

This option is a GNU extension. Select for printing the complement of the bytes, characters or fields selected with the `-b', `-c' or `-f' options. In other words, do not print the bytes, characters or fields specified via those options. This option is useful when you have many fields and want to print all but a few of them.
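For instance, to drop a single column and keep the rest (a sketch assuming GNU cut and a made-up comma-separated header line):

```shell
# Print every field except the third.
rest=$(printf 'id,name,secret,email\n' | cut -d , -f 3 --complement)

printf '%s\n' "$rest"
# prints id,name,email
```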

An exit status of zero indicates success, and a nonzero value indicates failure.


8.2 paste: Merge lines of files

paste writes to standard output lines consisting of sequentially corresponding lines of each given file, separated by a TAB character. Standard input is used for a file name of `-' or if no input files are given.

For example:

$ cat num2
1
2
$ cat let3
a
b
c
$ paste num2 let3
1       a
2       b
        c

Synopsis:

paste [option]… [file]…

The program accepts the following options. Also see Common options.

`-s'
`--serial'

Paste the lines of one file at a time rather than one line from each file. Using the above example data:

$ paste -s num2 let3
1       2
a       b       c

`-d delim-list'
`--delimiters=delim-list'

Consecutively use the characters in delim-list instead of TAB to separate merged lines. When delim-list is exhausted, start again at its beginning. Using the above example data:

$ paste -d '%_' num2 let3 num2
1%a_1
2%b_2
%c_
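The two options can also be combined. The sketch below reads four assumed lines from standard input (file name `-') and joins them serially, cycling through a two-character delim-list:

```shell
# Serial mode with a cycling delimiter list: ',' then ';' then ',' again.
serial=$(printf '1\n2\n3\n4\n' | paste -s -d ',;' -)

printf '%s\n' "$serial"
# prints 1,2;3,4
```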

An exit status of zero indicates success, and a nonzero value indicates failure.


8.3 join: Join lines on a common field

join writes to standard output a line for each pair of input lines that have identical join fields. Synopsis:

join [option]… file1 file2

Either file1 or file2 (but not both) can be `-', meaning standard input. file1 and file2 should be sorted on the join fields.

Normally, the sort order is that of the collating sequence specified by the LC_COLLATE locale. Unless the `-t' option is given, the sort comparison ignores blanks at the start of the join field, as in sort -b. If the `--ignore-case' option is given, the sort comparison ignores the case of characters in the join field, as in sort -f.

The sort and join commands should use consistent locales and options if the output of sort is fed to join. You can use a command like `sort -k 1b,1' to sort a file on its default join field, but if you select a non-default locale, join field, separator, or comparison options, then you should do so consistently between join and sort. If `join -t ''' is specified, then the whole line is considered, which matches the default operation of sort.
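A minimal sketch of this workflow follows. The file names and contents are assumptions created in a temporary directory, and `LC_ALL=C' is exported so that sort and join agree on the collating sequence.

```shell
# Use one locale for both sort and join.
export LC_ALL=C

tmp=$(mktemp -d)
printf 'c c1\na a1\nb b1\n' > "$tmp/file1"
printf 'b b2\nc c2\na a2\n' > "$tmp/file2"

# Sort each file on the default join field, ignoring leading blanks.
sort -k 1b,1 "$tmp/file1" > "$tmp/file1.sorted"
sort -k 1b,1 "$tmp/file2" > "$tmp/file2.sorted"

joined=$(join "$tmp/file1.sorted" "$tmp/file2.sorted")
printf '%s\n' "$joined"

rm -r "$tmp"
```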

If the input has no unpairable lines, a GNU extension is available; the sort order can be any order that considers two fields to be equal if and only if the sort comparison described above considers them to be equal. For example:

$ cat file1
a a1
c c1
b b1
$ cat file2
a a2
c c2
b b2
$ join file1 file2
a a1 a2
c c1 c2
b b1 b2

If the `--check-order' option is given, unsorted inputs will cause a fatal error message. If the option `--nocheck-order' is given, unsorted inputs will never cause an error message. If neither of these options is given, wrongly sorted inputs are diagnosed only if an input file is found to contain unpairable lines, and when both input files are nonempty. If an input file is diagnosed as being unsorted, the join command will exit with a nonzero status (and the output should not be used).
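The effect of `--check-order' on the exit status can be sketched as follows. The two small files are assumptions, and the first one is deliberately out of order.

```shell
export LC_ALL=C

tmp=$(mktemp -d)
printf 'b b1\na a1\n' > "$tmp/unsorted"   # wrongly ordered on purpose
printf 'a a2\nb b2\n' > "$tmp/sorted"

# Capture the exit status; the diagnostic goes to a scratch file.
status=0
join --check-order "$tmp/unsorted" "$tmp/sorted" > /dev/null 2> "$tmp/err" || status=$?

echo "exit status: $status"

rm -r "$tmp"
```

A script that depends on join output can test this status before using the result.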

Forcing join to process wrongly sorted input files containing unpairable lines by specifying `--nocheck-order' is not guaranteed to produce any particular output. The output will probably not correspond with whatever you hoped it would be.

The defaults are:

  • the join field is the first field in each line;
  • fields in the input are separated by one or more blanks, with leading blanks on the line ignored;
  • fields in the output are separated by a space;
  • each output line consists of the join field, the remaining fields from file1, then the remaining fields from file2.

The program accepts the following options. Also see Common options.

`-a file-number'

Print a line for each unpairable line in file file-number (either `1' or `2'), in addition to the normal output.
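For example, with two assumed files that share only the key `a', `-a' adds the unpairable lines to the normal output:

```shell
export LC_ALL=C

tmp=$(mktemp -d)
printf 'a a1\nb b1\n' > "$tmp/file1"
printf 'a a2\nc c2\n' > "$tmp/file2"

# Normal output plus unpairable lines from file 1 only.
left=$(join -a 1 "$tmp/file1" "$tmp/file2")

# Normal output plus unpairable lines from both files.
both=$(join -a 1 -a 2 "$tmp/file1" "$tmp/file2")

printf 'left:\n%s\nboth:\n%s\n' "$left" "$both"

rm -r "$tmp"
```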

`--check-order'

Fail with an error message if either input file is wrongly ordered.

`--nocheck-order'

Do not check that both input files are in sorted order. This is the default.

`-e string'

Replace those output fields that are missing in the input with string, that is, missing fields specified with the `-1', `-2', `-j', or `-o' options.
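A short sketch of `-e' combined with `-a' and `-o' follows. The files and the filler string `NA' are assumptions for illustration.

```shell
export LC_ALL=C

tmp=$(mktemp -d)
printf 'a a1\nb b1\n' > "$tmp/file1"
printf 'a a2\nc c2\n' > "$tmp/file2"

# Keep unpairable lines from both files and fill the gaps with NA.
filled=$(join -a 1 -a 2 -e NA -o 0,1.2,2.2 "$tmp/file1" "$tmp/file2")

printf '%s\n' "$filled"
# prints:
# a a1 a2
# b b1 NA
# c NA c2

rm -r "$tmp"
```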

`--header'

Treat the first line of each input file as a header line. The header lines will be joined and printed as the first output line. If `-o' is used to specify output format, the header line will be printed according to the specified format. The header lines will not be checked for ordering even if `--check-order' is specified. Also if the header lines from each file do not match, the heading fields from the first file will be used.
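As an illustration, the sketch below joins two assumed files whose first lines are column headings. It needs a join that supports `--header'.

```shell
export LC_ALL=C

tmp=$(mktemp -d)
printf 'id name\n1 alice\n2 bob\n' > "$tmp/people"
printf 'id dept\n1 eng\n2 ops\n'   > "$tmp/depts"

# The header lines are joined with each other and printed first.
with_header=$(join --header "$tmp/people" "$tmp/depts")

printf '%s\n' "$with_header"

rm -r "$tmp"
```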

`-i'
`--ignore-case'

Ignore differences in case when comparing keys. With this option, the lines of the input files must be ordered in the same way. Use `sort -f' to produce this ordering.
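A minimal sketch with assumed keys that differ only in case. Both files are already in the order that `sort -f' would produce.

```shell
export LC_ALL=C

tmp=$(mktemp -d)
printf 'a 1\nB 2\n' > "$tmp/file1"
printf 'A x\nb y\n' > "$tmp/file2"

# Keys are compared without regard to case.
folded=$(join -i "$tmp/file1" "$tmp/file2")

printf '%s\n' "$folded"

rm -r "$tmp"
```

In the output the join field appears as spelled in the first file.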

`-1 field'

Join on field field (a positive integer) of file 1.

`-2 field'

Join on field field (a positive integer) of file 2.

`-j field'

Equivalent to `-1 field -2 field'.
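For example, when the key is the second field of one file and the first field of the other (both files are assumptions, sorted on their respective join fields):

```shell
export LC_ALL=C

tmp=$(mktemp -d)
printf 'alice 10\nbob 20\n' > "$tmp/people"   # key is field 2
printf '10 eng\n20 ops\n'   > "$tmp/depts"    # key is field 1

# Join field 2 of file 1 against field 1 of file 2.
by_id=$(join -1 2 -2 1 "$tmp/people" "$tmp/depts")

printf '%s\n' "$by_id"
# prints:
# 10 alice eng
# 20 bob ops

rm -r "$tmp"
```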

`-o field-list'
`-o auto'

If the keyword `auto' is specified, infer the output format from the first line in each file. This is the same as the default output format but also ensures the same number of fields are output for each line. Missing fields are replaced with the string given by the `-e' option, and extra fields are discarded.

Otherwise, construct each output line according to the format in field-list. Each element in field-list is either the single character `0' or has the form m.n where the file number, m, is `1' or `2' and n is a positive field number.

A field specification of `0' denotes the join field. In most cases, the functionality of the `0' field spec may be reproduced using the explicit m.n that corresponds to the join field. However, when printing unpairable lines (using either of the `-a' or `-v' options), there is no way to specify the join field using m.n in field-list if there are unpairable lines in both files. To give join that functionality, POSIX invented the `0' field specification notation.

The elements in field-list are separated by commas or blanks. Blank separators typically need to be quoted for the shell. For example, the commands `join -o 1.2,2.2' and `join -o '1.2 2.2'' are equivalent.

All output lines, including those printed because of any `-a' or `-v' option, are subject to the specified field-list.
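The two forms of `-o' can be sketched as follows. The files are assumptions with deliberately uneven field counts, and `-' is an arbitrary filler for `-e'.

```shell
export LC_ALL=C

tmp=$(mktemp -d)
printf 'a 1 2\nb 3\n' > "$tmp/file1"
printf 'a x\nb y z\n' > "$tmp/file2"

# Comma-separated and blank-separated field lists are equivalent.
by_comma=$(join -o 1.2,2.2 "$tmp/file1" "$tmp/file2")
by_blank=$(join -o '1.2 2.2' "$tmp/file1" "$tmp/file2")

# auto: the first lines fix the shape at 3 fields from file 1 and
# 2 fields from file 2; missing fields are filled, extras dropped.
auto=$(join -o auto -e - "$tmp/file1" "$tmp/file2")

printf '%s\n' "$by_comma" "$auto"

rm -r "$tmp"
```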

`-t char'

Use character char as the input and output field separator. Treat as significant each occurrence of char in the input file. Use `sort -t char', without the `-b' option of `sort', to produce this ordering. If `join -t ''' is specified, the whole line is considered, matching the default operation of sort. If `-t '\0'' is specified then the ASCII NUL character is used to delimit the fields.
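A sketch with comma-separated input, where blanks inside a field must be preserved (the files are assumptions):

```shell
export LC_ALL=C

tmp=$(mktemp -d)
printf '1,alice smith\n2,bob jones\n' > "$tmp/people.csv"
printf '1,eng\n2,ops\n'               > "$tmp/depts.csv"

# ',' separates fields on input and output; blanks are ordinary data.
csv=$(join -t , "$tmp/people.csv" "$tmp/depts.csv")

printf '%s\n' "$csv"
# prints:
# 1,alice smith,eng
# 2,bob jones,ops

rm -r "$tmp"
```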

`-v file-number'

Print a line for each unpairable line in file file-number (either `1' or `2'), instead of the normal output.
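For example, `-v' can list the keys present in only one of two assumed files:

```shell
export LC_ALL=C

tmp=$(mktemp -d)
printf 'a a1\nb b1\n' > "$tmp/file1"
printf 'a a2\nc c2\n' > "$tmp/file2"

# Only the unpairable lines; the matched line for 'a' is not printed.
only1=$(join -v 1 "$tmp/file1" "$tmp/file2")
only2=$(join -v 2 "$tmp/file1" "$tmp/file2")

printf '%s\n' "$only1" "$only2"
# prints "b b1" and "c c2"

rm -r "$tmp"
```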

An exit status of zero indicates success, and a nonzero value indicates failure.


This document was generated by root on May, 18 2011 using texi2html 1.76.

 