A lot of string manipulation can be done with regular expressions, but in many cases the built-in string functions are more straightforward and take less time to execute.
To begin with, let's see the syntax forms available for the Perl substr function:

    substr EXPR, OFFSET, LENGTH, REPLACEMENT
    substr EXPR, OFFSET, LENGTH
    substr EXPR, OFFSET

where:
- EXPR is a string expression from which the substring will be extracted
- OFFSET is an index from where the substring to be extracted starts
- LENGTH is the length of the substring to extract
- REPLACEMENT is a string that will replace the substring
As with other functions, the parentheses are optional, so use them or not as you wish. As the forms above show, some arguments are mandatory and others are optional: you must supply at least the string expression (EXPR) and the position (OFFSET) where the substring to be extracted starts.
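As a minimal sketch of the point about parentheses and optional arguments, the snippet below calls substr once with parentheses and once without. The variable names ($text, $tail, $head) are only illustrative and are not part of the original examples.

```perl
use strict;
use warnings;

my $text = "Hello, world";

# With parentheses, EXPR and OFFSET only: everything from index 7 onward.
my $tail = substr($text, 7);

# Without parentheses, with the optional LENGTH argument.
my $head = substr $text, 0, 5;

print "$tail\n";    # prints world
print "$head\n";    # prints Hello
```

Both styles compile to the same call, so the choice is purely a matter of readability.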
Before reviewing the Perl substr function parameters, I want to remind you that in Perl the first character of a string has the index 0, the second 1, and so on. Older versions of Perl let you change this base through the special variable $[, whose default value is 0. That feature has long been deprecated, and since Perl 5.30 assigning a non-zero value to $[ is a fatal error, so it is best to leave it alone.
Take a moment to look at the following example of how to use the Perl substr function:

    my $names = "John Peterson Anne Mike";
    my $oneName = substr($names, 5, 8);
    print "$names\n";    # it prints John Peterson Anne Mike
    print "$oneName\n";  # it prints Peterson

Please note that the value of $names didn't change after calling the Perl substr function.
We can use the Perl substr function either as an ordinary value (for example in comparisons) or as an lvalue, such as the left-hand side of an assignment. In the latter case, if EXPR is a string variable, the value of that variable is modified. See the next block of code:

    my $names = "Alin Fred John";
    substr($names, 5, 4) = "Mary";
    print "$names\n";    # it prints Alin Mary John

And now let's get back to our parameters.
OFFSET could be:
- positive – the substring starts that far from the beginning of the string
- negative – the substring starts that far from the end of the string
- 0 – the substring starts at the first character of the string
If substr is used as an ordinary value and the requested substring lies entirely outside the string, the function returns undef and issues a warning. If substr is used as an lvalue and the OFFSET is entirely outside the string, a fatal error is raised. The following snippet illustrates the cases discussed above:

    my $names = "Alin Fred Peter";
    my $oneName = substr($names, -10, 4);
    print "Name: $oneName\n";
    $oneName = substr $names, -100, 4;
    print "Name: $oneName\n";
    substr($names, 20, 5) = "Alice";

This code produces the following result:

    Name: Fred
    Name: 
    substr outside of string at 1.pl line 6.

Here 1.pl is the name of my script. If you run the script from a command prompt (on a Windows machine in my case) with the -w switch (perl -w 1.pl), you'll get the warnings too.
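If you would rather not have the script die on an out-of-range assignment, one option is to test the result of a read with defined and to wrap the assignment in eval. This is only a sketch of that idea, reusing the string from the example above; the variable names $part, $ok and $error are my own.

```perl
use strict;
use warnings;

my $names = "Alin Fred Peter";    # 15 characters

# Reading entirely outside the string yields undef (plus a warning, silenced
# here), so check the result with defined() before using it.
my $part = do { no warnings 'substr'; substr($names, -100, 4) };
print defined $part ? "Name: $part\n" : "Name: (offset out of range)\n";

# Assigning entirely outside the string dies, so trap the exception with eval.
my $ok = eval { substr($names, 20, 5) = "Alice"; 1 };
my $error = $@;
print "Assignment skipped: $error" unless $ok;

print "$names\n";    # prints Alin Fred Peter (the string is unchanged)
```

The eval block returns 1 only when the assignment succeeds, which makes the success test independent of whatever $@ happens to contain.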
LENGTH could be:
- omitted – the function returns all the characters from the OFFSET position up to the end of the string
- positive – the function returns at most LENGTH characters of the string, beginning at the OFFSET position
- negative – the function returns the substring starting at the OFFSET position but leaves that many characters off the end of the string
- 0 – the returned substring is empty, and no error or warning is issued

    my $names = "Alex James Abby Shannon Monica";
    my $strNames = substr $names, 11;     # length omitted
    print "$strNames\n";                  # prints Abby Shannon Monica
    $strNames = substr $names, 24, 100;   # length = 100
    print "$strNames\n";                  # prints Monica
    $strNames = substr $names, 24, -2;    # length = -2
    print "$strNames\n";                  # prints Moni

And now some examples using substr as an lvalue:

    my $names = "Alex James Abby Shannon Monica";
    substr($names, 11, 4) = "Alexandra";
    print "$names\n";    # prints Alex James Alexandra Shannon Monica

In the example above, the substring "Abby" (found at offset 11) is replaced by "Alexandra", even though the new text is longer than 4 (the LENGTH supplied to the substr function). As a result, $names now has more characters than it had initially.
Next, look at an example where the assigned string is shorter than the LENGTH supplied to the Perl substr function:

    my $names = "Alex James Abby Shannon Monica";
    substr($names, 11, 4) = "Tom";
    print "$names\n";    # prints Alex James Tom Shannon Monica

After the assignment, $names is shorter than the initial string.
You can play around with these examples to see how the Perl substr function works in other similar situations.
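Two similar situations worth trying are a LENGTH of 0 and an empty replacement string. The sketch below, which uses a shortened version of the names string and a helper variable ($afterInsert) of my own, suggests how the first inserts text without removing anything and the second deletes text.

```perl
use strict;
use warnings;

my $names = "Alex James Abby";

# LENGTH 0 selects nothing, so the new text is inserted at offset 11.
substr($names, 11, 0) = "Tom ";
my $afterInsert = $names;
print "$afterInsert\n";    # prints Alex James Tom Abby

# Assigning an empty string deletes the 6 selected characters ("James ").
substr($names, 5, 6) = "";
print "$names\n";          # prints Alex Tom Abby
```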
REPLACEMENT
When a fourth argument is supplied, the substring selected by OFFSET and LENGTH is replaced by the REPLACEMENT string directly in EXPR:

    my $names = "Alex James Abby Shannon Monica";
    substr $names, 16, 7, "Alexandra";   # it will replace "Shannon" with "Alexandra"
    print "$names\n";                    # it prints Alex James Abby Alexandra Monica

As you have seen in the example above, the value of $names changes after the replacement, just as it does when substr is used as an lvalue.
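One detail the four-argument form adds is its return value: the call hands back the text that was replaced. The following sketch illustrates this and compares the result with the lvalue form; $removed and $copy are names I introduced for the illustration.

```perl
use strict;
use warnings;

my $names = "Alex James Abby Shannon Monica";

# The four-argument form returns the text it removed.
my $removed = substr($names, 16, 7, "Alexandra");
print "$removed\n";    # prints Shannon
print "$names\n";      # prints Alex James Abby Alexandra Monica

# The lvalue form leaves the string in the same state.
my $copy = "Alex James Abby Shannon Monica";
substr($copy, 16, 7) = "Alexandra";
print "identical\n" if $copy eq $names;    # prints identical
```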
Finally, I'll show you a small script in which the Perl substr function is used together with other string functions.
How to use substr to get the column fields from a flat file database
A flat file database consists of a number of records delimited by a separator, which in most cases is the newline ("\n") character. In this case we say that each record is specified on a single line. Each record consists of one or more fields, either of fixed width or delimited by some special character such as whitespace or a comma.
For instance, let's suppose that each record of the file customers.txt includes the fields Name, Phone and ZipCode and that the entire file has only three records, as in the following table:
| Name | Phone | ZipCode |
| John Abbot | 872-321-1212 | 55416 |
| Clark Eliot | 205-321-1200 | 20037 |
| Johnny Randolph | 345-767-3476 | 33702 |
Fixed-width columns
We'll first examine the case where the fields have fixed width: Name – 20, Phone – 12 and ZipCode – 5. If we print the file, we get something like this:

    John Abbot          872-321-121255416
    Clark Eliot         205-321-120020037
    Johnny Randolph     345-767-347633702

The following block of code reads the file and prints each record on a single line, with the fields separated by commas:

    open(my $fh, '<', 'customers.txt') or die "Cannot open customers.txt: $!";
    while (<$fh>) {
        # chomp off the possible trailing newline from $_
        chomp;
        my $name = substr($_, 0, 20);
        # trim the trailing spaces
        $name =~ s/ +$//;
        my $phone = substr($_, 20, 12);
        # delete all '-' characters
        $phone =~ s/-//g;
        my $zipCode = substr($_, 20 + 12, 5);
        print $name, ",", $phone, ",", $zipCode, "\n";
    }
    close $fh;

Running this snippet will produce the following output:

    John Abbot,8723211212,55416
    Clark Eliot,2053211200,20037
    Johnny Randolph,3457673476,33702

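The offsets and widths can also be kept in one place instead of being repeated in each substr call. The sketch below is one possible arrangement, not part of the original script: the @layout table and the parse_fixed helper are hypothetical names, and the record is built in memory with sprintf so the example runs without customers.txt.

```perl
use strict;
use warnings;

# Hypothetical layout table: field name, offset, width.
my @layout = (
    [ name    => 0,  20 ],
    [ phone   => 20, 12 ],
    [ zipCode => 32, 5  ],
);

# Cut each field out of a fixed-width line and return a hash reference.
sub parse_fixed {
    my ($line) = @_;
    my %record;
    for my $field (@layout) {
        my ($key, $offset, $width) = @$field;
        my $value = substr($line, $offset, $width);
        $value =~ s/ +$//;    # trim the trailing padding
        $record{$key} = $value;
    }
    $record{phone} =~ s/-//g; # delete all '-' characters
    return \%record;
}

# Build one padded record in memory (20 + 12 + 5 characters).
my $line = sprintf("%-20s%-12s%-5s", "John Abbot", "872-321-1212", "55416");
my $rec  = parse_fixed($line);
my $csv  = join(",", @{$rec}{qw(name phone zipCode)});
print "$csv\n";    # prints John Abbot,8723211212,55416
```

Keeping the layout in a table means a change in column widths touches only one place in the script.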
Columns delimited by a separator
The next example illustrates the case where the fields are delimited by a separator character such as a comma. In this case the content of our file will be:

    John Abbot,872-321-1212,55416
    Clark Eliot,205-321-1200,20037
    Johnny Randolph,345-767-3476,33702

Because we want to show you how to use the Perl substr function to access the fields of a record, we won't use the split function here (although it would be easier). The next sample shows one way to implement it:

    open(my $fh, '<', 'customers.txt') or die "Cannot open customers.txt: $!";
    while (<$fh>) {
        # chomp off the possible trailing newline
        chomp;
        # position of the first comma: the name ends just before it
        my $pos1 = index($_, ",");
        my $name = substr($_, 0, $pos1);
        # position of the second comma: the phone lies between the two
        my $pos2 = index $_, ",", $pos1 + 1;
        my $phone = substr($_, $pos1 + 1, $pos2 - $pos1 - 1);
        # delete all '-' characters
        $phone =~ s/-//g;
        # LENGTH omitted: take everything after the second comma
        my $zipCode = substr($_, $pos2 + 1);
        print $name, ",", $phone, ",", $zipCode, "\n";
    }
    close $fh;

The output is the same as in the previous example.
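As a quick cross-check, the fields found with index and substr can be compared with what split returns for the same record. This is only a sketch on a single in-memory record; the variables $record, $name2, $phone2 and $zip2 are illustrative names of my own.

```perl
use strict;
use warnings;

my $record = "Clark Eliot,205-321-1200,20037";

# Fields located with index and cut out with substr, as in the loop above.
my $pos1    = index($record, ",");
my $pos2    = index($record, ",", $pos1 + 1);
my $name    = substr($record, 0, $pos1);
my $phone   = substr($record, $pos1 + 1, $pos2 - $pos1 - 1);
my $zipCode = substr($record, $pos2 + 1);

# The same fields obtained with split, for comparison.
my ($name2, $phone2, $zip2) = split /,/, $record;

print "same fields\n"
    if "$name|$phone|$zipCode" eq "$name2|$phone2|$zip2";    # prints same fields
```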