Bash How to Read a File Line by Line


It's pretty easy to read the contents of a Linux text file line by line in a shell script, as long as you deal with some subtle gotchas. Here's how to do it the safe way.

Files, Text, and Idioms

Each programming language has a set of idioms. These are the standard, no-frills ways to accomplish a set of common tasks. They're the elementary or default way to use one of the features of the language the programmer is working with. They become part of a programmer's toolkit of mental blueprints.

Actions like reading data from files, working with loops, and swapping the values of two variables are good examples. The programmer will know at least one way to achieve their ends in a generic or vanilla fashion. Perhaps that will suffice for the requirement at hand. Or perhaps they'll embellish the code to make it more efficient or applicable to the specific solution they are developing. But having the building-block idiom at their fingertips is a great starting point.

Knowing and understanding idioms in one language makes it easier to pick up a new programming language, too. Knowing how things are constructed in one language and looking for the equivalent (or the closest thing) in another language is a good way to appreciate the similarities and differences between programming languages you already know and the one you're learning.

Reading Lines From a File: The One-Liner

In Bash, you can use a while loop on the command line to read each line of text from a file and do something with it. Our text file is called "data.txt." It holds a list of the months of the year.

January
February
March
.
.
October
November
December

Our simple one-liner is:

while read line; do echo $line; done < data.txt

The while loop reads a line from the file, and the execution flow of the little program passes to the body of the loop. The echo command writes the line of text in the terminal window. The read attempt fails when there are no more lines to be read, and the loop is done.

One neat trick is the ability to redirect a file into a loop. In other programming languages, you'd need to open the file, read from it, and close it again when you'd finished. With Bash, you can simply use file redirection and let the shell handle all of that low-level stuff for you.
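To make the contrast concrete, here is a minimal sketch of what the redirection saves you. It reads the same file twice: once with the file redirected into the loop, and once by opening, reading, and closing file descriptor 3 by hand. The temporary file and the variable names (demo_file, redirected, manual) are assumptions made for this illustration and are not part of the article's scripts.

```shell
#!/bin/bash
# Hypothetical demo file with two lines.
demo_file=$(mktemp)
printf 'January\nFebruary\n' > "$demo_file"

# Redirection on the loop: the shell opens and closes the file for us.
redirected=""
while read line; do
    redirected+="$line;"
done < "$demo_file"

# The same work done by hand with file descriptor 3.
manual=""
exec 3< "$demo_file"          # open the file for reading
while read -u 3 line; do      # read each line from descriptor 3
    manual+="$line;"
done
exec 3<&-                     # close the descriptor

echo "$redirected"
echo "$manual"

rm -f "$demo_file"
```

Both echo commands print January;February; so the two loops are equivalent. The redirected form is shorter and cannot leave the descriptor open by mistake.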

Of course, this one-liner isn't terribly useful. Linux already provides the cat command, which does exactly that for us. We've created a long-winded way to replace a three-letter command. But it does visibly demonstrate the principles of reading from a file.

That works well enough, up to a point. Suppose we have another text file that contains the names of the months. In this file, the escape sequence for a newline character has been appended to each line. We'll call it "data2.txt."

January\n
February\n
March\n
.
.
October\n
November\n
December\n

Let's use our one-liner on our new file.

while read line; do echo $line; done < data2.txt

The backslash escape character "\" has been discarded. The result is that an "n" has been appended to each line. Bash is interpreting the backslash as the start of an escape sequence. Often, we don't want Bash to interpret what it is reading. It can be more convenient to read a line in its entirety, backslash escape sequences and all, and choose what to parse out or replace yourself, within your own code.
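The effect is easy to reproduce on a single string. The sketch below feeds one line from the example into read twice, once as in the one-liner and once with the -r option, which tells read not to treat the backslash as an escape character. The variable names (sample, plain, raw) are assumptions chosen for this illustration.

```shell
#!/bin/bash
# Single quotes keep the backslash literal in the test string.
sample='January\n'

# Without -r, read treats the backslash as an escape character and drops it.
read plain <<< "$sample"

# With -r, the backslash is kept as an ordinary character.
read -r raw <<< "$sample"

printf 'without -r: %s\n' "$plain"
printf 'with -r:    %s\n' "$raw"
```

The first line of output ends in Januaryn and the second in January\n, which matches what the one-liner did to data2.txt.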

If we want to do any meaningful processing or parsing on the lines of text, we'll need to use a script.

Reading Lines From a File With a Script

Here's our script. It's called "script1.sh."

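#!/bin/bash

# Number of lines processed so far
Counter=0

# Read each line exactly as written; the -n test catches a final line with no trailing newline
while IFS='' read -r LinefromFile || [[ -n "${LinefromFile}" ]]; do

    ((Counter++))
    echo "Accessing line $Counter: ${LinefromFile}"

done < "$1"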

We set a variable called Counter to zero, then we define our while loop.

The first statement on the while line is IFS='' . IFS stands for internal field separator. It holds the characters that Bash uses to identify word boundaries. By default, the read command strips off leading and trailing whitespace. If we want to read the lines from the file exactly as they are, we need to set IFS to be an empty string.

We could set this once outside of the loop, just like we're setting the value of Counter . But with more complex scripts, especially those with many user-defined functions in them, it is possible that IFS could be set to different values elsewhere in the script. Ensuring that IFS is set to an empty string each time the while loop iterates guarantees that we know what its behavior will be. Because the assignment is written as a prefix to the read command, it applies only to that command and leaves IFS unchanged for the rest of the script.
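A small sketch shows both effects: the whitespace trimming that happens with the default IFS, and the way an empty IFS preserves the line. It also checks that IFS itself is untouched afterwards. The variable names (sample, trimmed, exact, saved_ifs) are assumptions made for this illustration.

```shell
#!/bin/bash
saved_ifs=$IFS                     # remember the current IFS for comparison
sample='   indented line   '

# Default IFS: read trims the leading and trailing spaces.
read -r trimmed <<< "$sample"

# Empty IFS for this one read only: the spaces are preserved.
IFS='' read -r exact <<< "$sample"

printf '[%s]\n' "$trimmed"
printf '[%s]\n' "$exact"

# The prefix assignment did not change IFS for the rest of the script.
if [[ "$IFS" == "$saved_ifs" ]]; then
    echo "IFS is unchanged"
fi
```

The brackets in the output make the difference visible: the first line prints [indented line] and the second keeps the three spaces on each side.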

We're going to read a line of text into a variable called LinefromFile . We're using the -r (read backslash as a normal character) option so that backslashes are not interpreted as escape characters. They'll be treated just like any other character and won't receive any special treatment.

There are two conditions that will satisfy the while loop and allow the text to be processed by the body of the loop:

  • read -r LinefromFile : When a line of text is successfully read from the file, the read command sends a success signal to the while , and the while loop passes the execution flow to the body of the loop. Note that the read command needs to see a newline character at the end of the line of text in order to consider it a successful read. If the file is not a POSIX-compliant text file, the last line may not include a newline character. If the read command sees the end of file marker (EOF) before the line is terminated by a newline, it will not treat it as a successful read. If that happens, the last line of text will not be passed to the body of the loop and will not be processed.
  • [[ -n "${LinefromFile}" ]] : We need to do some extra work to handle non-POSIX-compliant files. This test checks whether any text was read from the file into the variable. If the text isn't terminated with a newline character, the read fails, but this test still returns success to the while loop because the variable is not empty. This ensures that any trailing line fragments are processed by the body of the loop.

These two clauses are separated by the OR logical operator "||" so that if either clause returns success, the retrieved text is processed by the body of the loop, whether there is a newline character or not.
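The difference between the two forms shows up as soon as a file lacks a final newline. The sketch below writes a three-line file whose last line is unterminated, then counts how many lines each loop processes. The temporary file and the counter names (demo_file, plain_count, safe_count) are assumptions made for this illustration.

```shell
#!/bin/bash
# Hypothetical demo file: three lines, the last one with no trailing newline.
demo_file=$(mktemp)
printf 'October\nNovember\nDecember' > "$demo_file"

# read on its own: the unterminated last line never reaches the loop body.
plain_count=0
while IFS='' read -r line; do
    plain_count=$((plain_count + 1))
done < "$demo_file"

# read plus the -n test: the trailing fragment is processed too.
safe_count=0
while IFS='' read -r line || [[ -n "$line" ]]; do
    safe_count=$((safe_count + 1))
done < "$demo_file"

echo "read only:         $plain_count lines"
echo "read with -n test: $safe_count lines"

rm -f "$demo_file"
```

The first loop reports 2 lines and the second reports 3, because read still stores "December" in the variable even though it returns a failure status at the end of the file.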

In the body of our loop, we're incrementing the Counter variable by one and using echo to send some output to the terminal window. The line number and the text of each line are displayed.

We can still use our redirection trick to redirect a file into a loop. In this case, we're redirecting $1, a variable that holds the first command line parameter that was passed to the script. Using this trick, we can easily pass in the name of the data file that we want the script to work on.

Copy and paste the script into an editor and save it with the filename "script1.sh." Use the chmod command to make it executable.

chmod +x script1.sh

Let's see what our script makes of the data2.txt text file and the backslashes contained within it.

./script1.sh data2.txt

Every character in the line is displayed verbatim. The backslashes are not interpreted as escape characters. They're printed as regular characters.

Passing the Line to a Function

We're still just echoing the text to the screen. In a real-world programming scenario, we'd likely want to do something more interesting with the line of text. In most cases, it is good programming practice to handle the further processing of the line in another function.

Here's how we could do it. This is "script2.sh."

#!/bin/bash

# Number of lines processed so far
Counter=0

# Receives one line of text as $1
function process_line() {

    echo "Processing line $Counter: $1"

}

while IFS='' read -r LinefromFile || [[ -n "${LinefromFile}" ]]; do

    ((Counter++))
    process_line "$LinefromFile"

done < "$1"

We define our Counter variable as before, and then we define a function called process_line() . The definition of a function must appear before the function is first called in the script.

Our function is going to be passed the newly read line of text in each iteration of the while loop. We can access that value inside the function by using the $1 variable. If there were two arguments passed to the function, we could access those values using $1 and $2 , and so on for more arguments.

The while loop is mainly the same. There is only one change inside the body of the loop. The echo line has been replaced by a call to the process_line() function. Note that you don't need to use the "()" brackets in the name of the function when you are calling it.

The name of the variable holding the line of text, LinefromFile , is wrapped in quotation marks when it is passed to the function. This caters for lines that have spaces in them. Without the quotation marks, the first word is treated as $1 by the function, the second word is considered to be $2 , and so on. Using quotation marks ensures that the entire line of text is handled, as a whole, as $1. Note that this is not the same $1 that holds the name of the data file passed to the script.
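The effect of the quotation marks can be checked by counting the arguments a function receives. This is a minimal sketch; the count_args helper and the variables unquoted and quoted are hypothetical names introduced for this illustration, and the default IFS is assumed.

```shell
#!/bin/bash
# Hypothetical helper that reports how many arguments it was given.
count_args() {
    echo "$#"
}

LinefromFile='More text at the end of the line'

# Unquoted: the line is split on whitespace, one argument per word.
unquoted=$(count_args $LinefromFile)

# Quoted: the whole line arrives as a single argument, $1.
quoted=$(count_args "$LinefromFile")

echo "unquoted: $unquoted arguments"
echo "quoted:   $quoted argument"
```

The unquoted call reports 8 arguments, one for each word, while the quoted call reports 1.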

Because Counter has been declared in the main body of the script and not inside a function, it can be referenced inside the process_line() function.
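As a rough illustration of that scoping, the sketch below reads a variable set in the main body from inside one function, and contrasts it with a second function that uses Bash's local keyword to make its own private copy. The function names (show_counter, shadow_counter) and the variables seen and shadowed are assumptions made for this example and do not appear in script2.sh.

```shell
#!/bin/bash
Counter=5                     # set in the main body, so it is visible everywhere

# Reads the script-level Counter directly.
show_counter() {
    echo "$Counter"
}

# Declares its own Counter; the script-level one is not touched.
shadow_counter() {
    local Counter=99
    echo "$Counter"
}

seen=$(show_counter)
shadowed=$(shadow_counter)

echo "function sees: $seen"
echo "local copy:    $shadowed"
echo "global after:  $Counter"
```

The output shows 5, then 99, then 5 again: the first function sees the script-level value, and the local copy in the second function does not leak out.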

Copy or type the script above into an editor and save it with the filename "script2.sh." Make it executable with chmod :

chmod +x script2.sh

Now we can run it and pass in a new data file, "data3.txt." This has a list of the months in it, and one line with many words on it.

January
February
March
.
.
October
November \nMore text "at the end of the line"
December

Our command is:

./script2.sh data3.txt

The lines are read from the file and passed one by one to the process_line() function. All the lines are displayed correctly, including the odd one with the backslash, quotation marks, and multiple words in it.

Building Blocks Are Useful

There's a train of thought that says that an idiom must contain something unique to that language. That's not a belief that I subscribe to. What's important is that it makes good use of the language, is easy to remember, and provides a reliable and robust way to implement some functionality in your code.


Source: https://www.howtogeek.com/709838/how-to-process-a-file-line-by-line-in-a-linux-bash-script/
