How to delete two lines after every three lines in a file that contains a very large number of lines?

For example, if I have:

1st line (keep) 
2nd line (keep) 
3rd line (keep) 
4rth lines (delete) 
5th (del) 
6th (keep) 
7nth (keep) 
8th lines (keep) 
9th (del) 
10th (del) 
11th (keep) 
12th (keep) 
13th (keep) 
14th (del) 
15th (del)

and so on.

Tags: bash shell awk sed

asked yesterday by Jaguar Jom (new contributor)
edited 36 mins ago by Prvt_Yadv

1

Increment a zero-indexed line counter for each line read, and print the line when (line counter modulo 5) < 3.

– ChuckCottrill
yesterday

Can you please clarify further?

– Jaguar Jom
yesterday

3

Possible duplicate of How to print lines number 15 and 25 out of each 50 lines?

– Sundeep
21 hours ago

1

The duplicate is worded slightly differently, but it is the same problem seen from a different angle: this question amounts to printing lines 1, 2 and 3 out of each 5 lines. For example: seq 15 | awk 'BEGIN{a[1]; a[2]; a[3]} NR % 5 in a' and seq 15 | sed -n 'p;n;p;n;p;n;n'

– Sundeep
21 hours ago

Also, the sed version above might be faster than the awk one for large files.

– Sundeep
21 hours ago


6 Answers

Try:

awk '(NR-1)%5<3' file

For example:

$ awk '(NR-1)%5<3' file
1st line (keep)
2nd line (keep)
3rd line (keep)
6th (keep)
7nth (keep)
8th lines (keep)
11th (keep)
12th (keep)
13th (keep)

How it works

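The program consists of the single condition (NR-1)%5<3 with no action, so awk applies its default action and prints every line for which the condition is true. In awk, NR is the current line number, with the first line counting as 1. Within each group of five lines, (NR-1)%5 takes the values 0, 1, 2, 3, 4, so the condition is true for the first three lines and false for the last two.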
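To make the condition easier to follow, here is a small sketch that prints NR, the value of (NR-1)%5 and the resulting decision for ten generated lines, and then applies the filter to the same input. The function name keep_first_three_of_five is made up for this illustration, and seq is assumed to be available for generating test input.

```shell
# Illustrative wrapper (hypothetical name) around the one-liner above.
keep_first_three_of_five() {
    awk '(NR - 1) % 5 < 3' "$@"
}

# Show NR, (NR-1)%5 and the keep/delete decision for ten input lines.
seq 10 | awk '{
    r = (NR - 1) % 5
    printf "NR=%d\t(NR-1)%%5=%d\t%s\n", NR, r, (r < 3 ? "keep" : "delete")
}'

# Apply the filter itself to the same input.
seq 10 | keep_first_three_of_five
```

With no file argument the wrapper reads standard input, so it can also be used on a stream.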

answered yesterday by John1024 (edited 20 hours ago by Kusalananda♦)

Basically, you want something like 'Fizz-Buzz' in awk:

awk '{ if (i++ % 5 < 3) print $0 }'

To show that this works:

for x in 1 2 3 4 5 6 7 8 9 10; do echo "$x"; done |
awk '{ if (i++ % 5 < 3) print $0 }'

If your file is named 'mybigfile.csv':

awk '{ if (i++ % 5 < 3) print $0 }' < mybigfile.csv > mybigfile-123.csv
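If the goal is to replace the original file rather than write a second one, one possible approach is to filter into a temporary file and rename it over the original once awk has succeeded. This is only a sketch: the function name keep_3_of_5_in_place is hypothetical, and it assumes mktemp is available and that the directory holding the file is writable.

```shell
# Hypothetical helper: filter a file through the awk program and replace it.
keep_3_of_5_in_place() {
    target=$1
    # Create the temporary file next to the target so mv stays on one filesystem.
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if awk '{ if (i++ % 5 < 3) print $0 }' < "$target" > "$tmp"; then
        # Only replace the original once the filtered copy is complete.
        mv -- "$tmp" "$target"
    else
        rm -f -- "$tmp"
        return 1
    fi
}
```

It would be called as keep_3_of_5_in_place mybigfile.csv. The replaced file is a new file, so its permissions and ownership may differ from the original's.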

answered yesterday by ChuckCottrill

You could use NR, or just rely on i defaulting to zero :-) (code golf)

– ChuckCottrill
yesterday

A simple command is:

awk '{ if ((NR-1) % 5 <= 2) print $0 }' file

It prints only the first 3 lines of every group of 5 lines: (NR-1)%5 cycles through the values 0 1 2 3 4, and only the first three of those are less than or equal to 2.

Run on the sample file from the question, it prints the nine lines marked (keep).

Or, as suggested in the comments, you can use:

awk '(NR - 1) % 5 <= 2' file
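The same condition generalises to other group sizes by passing the numbers in as awk variables. The sketch below assumes a hypothetical wrapper named keep_first_n_of_every_m; the variable names keep and period are also chosen just for this example.

```shell
# Hypothetical wrapper: keep the first "keep" lines of every "period" lines.
# Usage: keep_first_n_of_every_m KEEP PERIOD [FILE...]
keep_first_n_of_every_m() {
    keep=$1
    period=$2
    shift 2
    awk -v keep="$keep" -v period="$period" '(NR - 1) % period < keep' "$@"
}

# The case from the question: keep 3 lines, then drop 2.
seq 15 | keep_first_n_of_every_m 3 5
```

Values passed with -v that look like numbers are compared numerically, so the condition behaves the same as the hard-coded version.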

answered yesterday by Prvt_Yadv (edited 19 hours ago)

2

Or, with idiomatic use of awk syntax: awk '(NR - 1) % 5 <= 2' file

– Kusalananda♦
20 hours ago

Thanks, I didn't know that.

– Prvt_Yadv
19 hours ago

A generic solution for masking out a particular pattern of lines from a file:

#!/bin/sh

# The pattern is given on the command line.
pattern=$1

# The period is simply the length of the pattern.
period=${#pattern}

# Use bc to convert the binary pattern to an integer.
mask=$( printf 'ibase=2; %s\n' "$pattern" | bc )

awk -v mask="$mask" -v period="$period" '
    BEGIN { p = lshift(1, period-1) }
    and(rshift(p, (FNR-1) % period), mask)'

This relies on awk implementing the non-standard functions and() (bitwise AND), rshift() and lshift() (bitwise right and left shift), which GNU awk and some BSD implementations of awk do, but mawk does not.

This takes a pattern, which is a binary number representing both the cyclic period and what lines within each period should be kept or masked out. A 1 means "keep" and a 0 means "delete".

For example, the pattern of lines that applies to your question is 11100, which means "for each set of five lines, keep the first three and delete the others".

Using 01001000 would delete all but the 2nd and 5th lines in every 8 lines.

The awk program could also be written without the BEGIN block as

and(lshift(1, (period-1) - (FNR-1) % period), mask)

Left-shifting 1 by (period-1) - (FNR-1) % period positions is the same as calculating 2 to that power, but I'm using lshift() since awk does its arithmetic in floating point rather than in exact integer arithmetic.

Since the code relies on the binary representation of the pattern, very long patterns may not work well.
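For awk implementations without the bitwise functions, or for very long patterns, a different technique may be an option: treat the pattern as a string and look up the character for the current line with substr(). This is not the bitmask approach of the script above, only a sketch of an alternative; the function name mask_lines is hypothetical.

```shell
# Hypothetical alternative to the bitmask script: index the pattern as a string.
# Usage: mask_lines PATTERN [FILE...]   (PATTERN consists of 1s and 0s)
mask_lines() {
    pattern=$1
    shift
    awk -v pattern="$pattern" '
        BEGIN { period = length(pattern) }
        # Keep the line if the character at its position in the cycle is "1".
        substr(pattern, (FNR - 1) % period + 1, 1) == "1"' "$@"
}

# The pattern from the question: keep three lines, delete two.
seq 10 | mask_lines 11100
```

Because the pattern is never converted to a number, neither bc nor the non-standard and(), lshift() and rshift() functions are needed.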

Testing:

Removing the lines you want to remove:

$ sh script.sh 11100 <file
1st line (keep)
2nd line (keep)
3rd line (keep)
6th (keep)
7nth (keep)
8th lines (keep)
11th (keep)
12th (keep)
13th (keep)

Inverting the pattern:

$ sh script.sh 00011 <file
4rth lines (delete)
5th (del)
9th (del)
10th (del)
14th (del)
15th (del)

answered 19 hours ago by Kusalananda♦ (edited 15 hours ago)

This can be solved using GNU sed:

sed '4~5,5~5d' file

Note that this uses a GNU-specific extension to the sed standard, and thus doesn't work with e.g. BSD sed on macOS. However, GNU sed can be installed on macOS using brew, after which it can be used as gsed. On Linux, GNU sed is the default.

This deletes the fourth through fifth line of every five lines and prints everything else. For a clearer example, sed '3~10,6~10d' will select lines 1, 2, 7, 8, 9, 10 of every group of 10 lines by deleting lines 3 through 6.
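Both address ranges can be tried on generated input. The sketch below assumes GNU sed (the first~step address form is a GNU extension) and uses seq to produce numbered lines.

```shell
# Requires GNU sed: first~step addresses are a GNU extension.

# Delete lines 4-5 of every 5 lines (the case from the question).
seq 10 | sed '4~5,5~5d'

# Delete lines 3-6 of every 10 lines.
seq 20 | sed '3~10,6~10d'
```

In each range, the first address starts the deletion and the next line matching the second address ends it, so the range restarts once per group.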

The top-voted answer suggests using awk '(NR-1)%5<3'. On my machine, on a file containing the numbers 1 through 2 million, this takes about 0.6 seconds, while the sed solution in this answer takes about 0.35 seconds. This is plausible, since sed is in general a simpler tool and can thus work faster than the more complicated, but more full-featured, awk.
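A comparison like this can be reproduced roughly as follows. The sketch first checks that both commands produce identical output on generated input; the helper name same_output is hypothetical, GNU sed is assumed, and actual timings will depend on the machine, the awk and sed implementations, and the locale.

```shell
# Hypothetical helper: check that awk and sed agree on N generated lines.
# Requires GNU sed for the first~step addresses.
same_output() {
    n=$1
    dir=$(mktemp -d) || return 1
    seq "$n" > "$dir/input"
    awk '(NR-1)%5<3' "$dir/input" > "$dir/awk.out"
    sed '4~5,5~5d' "$dir/input" > "$dir/sed.out"
    cmp -s "$dir/awk.out" "$dir/sed.out"
    status=$?
    rm -rf -- "$dir"
    return "$status"
}

same_output 2000000 && echo "outputs match"

# For timing, run each command on its own input file under time, e.g.:
#   time awk '(NR-1)%5<3' input > /dev/null
#   time sed '4~5,5~5d' input > /dev/null
```

Sending the output to /dev/null keeps terminal or disk writes from dominating the measurement.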

answered 15 hours ago by tomsmeding (new contributor)

2

+1 ... or 4~5{N;d;}

– steeldriver
13 hours ago

I tried the command below, and it worked fine:

for ((i=1; i<=20; i+=5)); do sed -n "${i},$((i+2))p" filename; done

Output:

1st line (keep)
2nd line (keep)
3rd line (keep)
6th (keep)
7nth (keep)
8th lines (keep)
11th (keep)
12th (keep)
13th (keep)

answered 21 hours ago by Praveen Kumar BS

1

That is nice, but you have to know in advance how many lines you have, and you're rereading the file from the beginning on each round. It cannot be used on a stream, and it gets more inefficient the bigger the data gets, so since the OP says the number of lines is very large, this is not the best solution.

– Law29
17 hours ago
