Last update: 2015-08-14. The up-to-date original is at Greg's wiki.




How can I split a file into line ranges, e.g. lines 1-10, 11-20, 21-30?

Some Unix systems provide the split utility for this purpose. The long options shown here are GNU coreutils syntax:

    split --lines 10 --numeric-suffixes input.txt output-
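
Where the long options are not available, the short option -l specified by POSIX should do the same job. The following is a minimal sketch, not a definitive recipe; the function name split_by_lines is made up for illustration. Without --numeric-suffixes the output names end in alphabetic suffixes (aa, ab, ...) rather than numbers.

```shell
# split_by_lines FILE N PREFIX   (hypothetical helper name)
# Writes PREFIXaa, PREFIXab, ... each holding at most N lines of FILE.
split_by_lines() {
    split -l "$2" "$1" "$3"
}
```

For example, split_by_lines input.txt 10 output- would write lines 1-10 to output-aa, lines 11-20 to output-ab, and so on.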

For more flexibility you can use sed. For example, this sed command prints lines 1-10 of its input:

    sed -n -e '1,10p' -e '10q'

The -n option stops sed from printing every line by default. The address range 1,10 selects lines 1-10, and p prints them. The second expression, 10q, makes sed quit after line 10, so it does not read the rest of the input.
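
The same command can be wrapped in a small function so that the range is not hard-coded. This is only a sketch: the name print_range and its argument order are assumptions made for illustration.

```shell
# print_range FILE FIRST LAST   (hypothetical helper name)
# Prints lines FIRST through LAST of FILE, then stops reading.
print_range() {
    sed -n -e "${2},${3}p" -e "${3}q" "$1"
}
```

For example, print_range /etc/passwd 11 20 prints lines 11 through 20. If the file has fewer than LAST lines, sed simply prints up to the end of the file.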

We can now use this in a loop to split a file into consecutive chunks of a fixed number of lines:

    # POSIX shell
    file=/etc/passwd
    range=10                        # lines per chunk
    cur=1                           # first line of the current chunk
    last=$(wc -l < "$file")         # count number of lines
    chunk=1                         # number used in the output file name
    # -le, so that a final chunk starting on the last line is still written
    while [ $cur -le $last ]
    do
        endofchunk=$((cur + range - 1))
        sed -n -e "$cur,${endofchunk}p" -e "${endofchunk}q" "$file" > "chunk.$(printf %04d "$chunk")"
        chunk=$((chunk + 1))
        cur=$((cur + range))
    done
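
One way to check that a loop like this neither lost nor duplicated any lines is to concatenate the chunks and compare the result with the input. The sketch below assumes the chunk.NNNN naming used above (four digits, so at most 9999 chunks) and that the chunks are in the current directory; the function name verify_chunks is made up for illustration.

```shell
# verify_chunks FILE   (hypothetical helper name)
# Succeeds (exit status 0) if the chunk.NNNN files in the current directory,
# concatenated in glob order, are byte-for-byte identical to FILE.
verify_chunks() {
    cat chunk.[0-9][0-9][0-9][0-9] | cmp -s - "$1"
}
```

It could be used as: verify_chunks /etc/passwd && echo "chunks are complete". The zero-padded names matter here, because they make the glob expand in numeric order.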

The previous example uses POSIX arithmetic expansion and $(...) command substitution, which legacy Bourne shells do not have. For those shells, use expr and backticks instead:

    # legacy Bourne shell; assume no printf either
    file=/etc/passwd
    range=10                        # lines per chunk
    cur=1                           # first line of the current chunk
    last=`wc -l < "$file"`          # count number of lines
    chunk=1                         # number used in the output file name
    # -le, so that a final chunk starting on the last line is still written
    while test $cur -le $last
    do
        endofchunk=`expr $cur + $range - 1`
        sed -n -e "$cur,${endofchunk}p" -e "${endofchunk}q" "$file" > chunk.$chunk
        chunk=`expr $chunk + 1`
        cur=`expr $cur + $range`
    done

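Awk can also be used to produce a more or less equivalent result. Here the output files are named after the input file (file.1, file.2, ...). The redirection target is parenthesized because an unparenthesized concatenation after > is not parsed the same way by every awk:

    awk -v range=10 '{print > (FILENAME "." (int((NR - 1) / range) + 1))}' file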
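
Some awk implementations may limit how many output files can be open at once, which could matter when a large input produces many chunks. A possible variant, sketched below, closes each output file as soon as the next chunk begins. The function name split_awk and its argument order are assumptions made for illustration.

```shell
# split_awk FILE N   (hypothetical helper name)
# Writes FILE.1, FILE.2, ... each holding at most N lines of FILE,
# keeping only one output file open at a time.
split_awk() {
    awk -v range="$2" '
        {
            out = FILENAME "." (int((NR - 1) / range) + 1)
            if (out != prev) {              # this line starts a new chunk
                if (prev != "") close(prev) # finished with the previous file
                prev = out
            }
            print > out
        }' "$1"
}
```

For example, split_awk file 10 writes lines 1-10 to file.1, lines 11-20 to file.2, and so on.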




2015-08-01 04:05