   1  =head1 NAME
   2  
   3  perlreftut - Mark's very short tutorial about references
   4  
   5  =head1 DESCRIPTION
   6  
   7  One of the most important new features in Perl 5 was the capability to
   8  manage complicated data structures like multidimensional arrays and
   9  nested hashes.  To enable these, Perl 5 introduced a feature called
  10  `references', and using references is the key to managing complicated,
  11  structured data in Perl.  Unfortunately, there's a lot of funny syntax
  12  to learn, and the main manual page can be hard to follow.  The manual
  13  is quite complete, and sometimes people find that a problem, because
  14  it can be hard to tell what is important and what isn't.
  15  
  16  Fortunately, you only need to know 10% of what's in the main page to get
  17  90% of the benefit.  This page will show you that 10%.
  18  
  19  =head1 Who Needs Complicated Data Structures?
  20  
  21  One problem that came up all the time in Perl 4 was how to represent a
  22  hash whose values were lists.  Perl 4 had hashes, of course, but the
  23  values had to be scalars; they couldn't be lists.
  24  
  25  Why would you want a hash of lists?  Let's take a simple example: You
  26  have a file of city and country names, like this:
  27  
  28      Chicago, USA
  29      Frankfurt, Germany
  30      Berlin, Germany
  31      Washington, USA
  32      Helsinki, Finland
  33      New York, USA
  34  
  35  and you want to produce an output like this, with each country mentioned
  36  once, and then an alphabetical list of the cities in that country:
  37  
  38      Finland: Helsinki.
  39      Germany: Berlin, Frankfurt.
  40      USA: Chicago, New York, Washington.
  41  
  42  The natural way to do this is to have a hash whose keys are country
  43  names.  Associated with each country name key is a list of the cities in
  44  that country.  Each time you read a line of input, split it into a country
  45  and a city, look up the list of cities already known to be in that
  46  country, and append the new city to the list.  When you're done reading
  47  the input, iterate over the hash as usual, sorting each list of cities
  48  before you print it out.
  49  
  50  If hash values can't be lists, you lose.  In Perl 4, hash values can't
  51  be lists; they can only be strings.  You lose.  You'd probably have to
  52  combine all the cities into a single string somehow, and then when
  53  time came to write the output, you'd have to break the string into a
  54  list, sort the list, and turn it back into a string.  This is messy
  55  and error-prone.  And it's frustrating, because Perl already has
  56  perfectly good lists that would solve the problem if only you could
  57  use them.
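
To see how awkward that workaround gets, here is a rough sketch of it,
written in modern Perl for convenience.  The separator C<"\034"> and the
names C<$SEP> and C<%cities_for> are my own assumptions for illustration;
they are not taken from any particular Perl 4 program.

```perl
use strict;
use warnings;

# Hypothetical workaround: pack every city for a country into one string,
# joined by a separator that we assume never appears in a city name.
my $SEP = "\034";
my %cities_for;

for my $line ("Chicago, USA", "Frankfurt, Germany", "Berlin, Germany") {
    my ($city, $country) = split /, /, $line;
    $cities_for{$country} = defined $cities_for{$country}
        ? $cities_for{$country} . $SEP . $city
        : $city;
}

# To print, each string has to be unpacked, sorted, and rejoined.
my @unpacked = split /\Q$SEP\E/, $cities_for{Germany};
my $germany  = join ', ', sort @unpacked;
print "Germany: $germany.\n";    # Germany: Berlin, Frankfurt.
```

Every reader and writer of C<%cities_for> has to agree on the separator,
and a city name containing it would silently corrupt the data.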
  58  
  59  =head1 The Solution
  60  
  61  By the time Perl 5 rolled around, we were already stuck with this
  62  design: Hash values must be scalars.  The solution to this is
  63  references.
  64  
  65  A reference is a scalar value that I<refers to> an entire array or an
  66  entire hash (or to just about anything else).  Names are one kind of
  67  reference that you're already familiar with.  Think of the President
  68  of the United States: a messy, inconvenient bag of blood and bones.
  69  But to talk about him, or to represent him in a computer program, all
  70  you need is the easy, convenient scalar string "George Bush".
  71  
  72  References in Perl are like names for arrays and hashes.  They're
  73  Perl's private, internal names, so you can be sure they're
  74  unambiguous.  Unlike "George Bush", a reference only refers to one
  75  thing, and you always know what it refers to.  If you have a reference
  76  to an array, you can recover the entire array from it.  If you have a
  77  reference to a hash, you can recover the entire hash.  But the
  78  reference is still an easy, compact scalar value.
  79  
  80  You can't have a hash whose values are arrays; hash values can only be
  81  scalars.  We're stuck with that.  But a single reference can refer to
  82  an entire array, and references are scalars, so you can have a hash of
  83  references to arrays, and it'll act a lot like a hash of arrays, and
  84  it'll be just as useful as a hash of arrays.
  85  
  86  We'll come back to this city-country problem later, after we've seen
  87  some syntax for managing references.
  88  
  89  
  90  =head1 Syntax
  91  
  92  There are just two ways to make a reference, and just two ways to use
  93  it once you have it.
  94  
  95  =head2 Making References
  96  
  97  =head3 B<Make Rule 1>
  98  
  99  If you put a C<\> in front of a variable, you get a
 100  reference to that variable.
 101  
 102      $aref = \@array;         # $aref now holds a reference to @array
 103      $href = \%hash;          # $href now holds a reference to %hash
 104      $sref = \$scalar;        # $sref now holds a reference to $scalar
 105  
 106  Once the reference is stored in a variable like $aref or $href, you
 107  can copy it or store it just the same as any other scalar value:
 108  
 109      $xy = $aref;             # $xy now holds a reference to @array
 110      $p[3] = $href;           # $p[3] now holds a reference to %hash
 111      $z = $p[3];              # $z now holds a reference to %hash
 112  
 113  
 114  These examples show how to make references to variables with names.
 115  Sometimes you want to make an array or a hash that doesn't have a
 116  name.  This is analogous to the way you like to be able to use the
 117  string C<"\n"> or the number 80 without having to store it in a named
 118  variable first.
 119  
 120  =head3 B<Make Rule 2>
 121  
 122  C<[ ITEMS ]> makes a new, anonymous array, and returns a reference to
 123  that array.  C<{ ITEMS }> makes a new, anonymous hash, and returns a
 124  reference to that hash.
 125  
 126      $aref = [ 1, "foo", undef, 13 ];
 127      # $aref now holds a reference to an array
 128  
 129      $href = { APR => 4, AUG => 8 };
 130      # $href now holds a reference to a hash
 131  
 132  
 133  The references you get from rule 2 are the same kind of
 134  references that you get from rule 1:
 135  
 136      # This:
 137      $aref = [ 1, 2, 3 ];
 138  
 139      # Does the same as this:
 140      @array = (1, 2, 3);
 141      $aref = \@array;
 142  
 143  
 144  The first line is an abbreviation for the following two lines, except
 145  that it doesn't create the superfluous array variable C<@array>.
 146  
 147  If you write just C<[]>, you get a new, empty anonymous array.
 148  If you write just C<{}>, you get a new, empty anonymous hash.
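
The two Make Rules can be tried side by side in a short sketch.  The
variable names here are made up for illustration, and the sketch peeks
ahead at the C<ref> function, which is described under L</The Rest>.

```perl
use strict;
use warnings;

# Make Rule 1: a backslash in front of a named variable.
my @array      = (1, 2, 3);
my $named_aref = \@array;

# Make Rule 2: brackets or braces build an anonymous array or hash.
my $anon_aref  = [ 1, 2, 3 ];
my $anon_href  = { APR => 4, AUG => 8 };
my $empty_aref = [];
my $empty_href = {};

# ref() reports the kind of thing each reference points to.
my @kinds = map { ref($_) }
    $named_aref, $anon_aref, $anon_href, $empty_aref, $empty_href;
print "@kinds\n";    # ARRAY ARRAY HASH ARRAY HASH
```

Both rules produce the same kind of reference; the only difference is
whether the array or hash also has a name of its own.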
 149  
 150  
 151  =head2 Using References
 152  
 153  What can you do with a reference once you have it?  It's a scalar
 154  value, and we've seen that you can store it as a scalar and get it back
 155  again just like any scalar.  There are just two more ways to use it:
 156  
 157  =head3 B<Use Rule 1>
 158  
 159  You can always use an array reference, in curly braces, in place of
 160  the name of an array.  For example, C<@{$aref}> instead of C<@array>.
 161  
 162  Here are some examples of that:
 163  
 164  Arrays:
 165  
 166  
  167      @a              @{$aref}            An array
  168      reverse @a      reverse @{$aref}    Reverse the array
  169      $a[3]           ${$aref}[3]         An element of the array
  170      $a[3] = 17      ${$aref}[3] = 17    Assigning an element
 171  
 172  
 173  On each line are two expressions that do the same thing.  The
 174  left-hand versions operate on the array C<@a>.  The right-hand
 175  versions operate on the array that is referred to by C<$aref>.  Once
 176  they find the array they're operating on, both versions do the same
 177  things to the arrays.
 178  
 179  Using a hash reference is I<exactly> the same:
 180  
  181      %h                %{$href}                A hash
  182      keys %h           keys %{$href}           Get the keys from the hash
  183      $h{'red'}         ${$href}{'red'}         An element of the hash
  184      $h{'red'} = 17    ${$href}{'red'} = 17    Assigning an element
 185  
 186  Whatever you want to do with a reference, B<Use Rule 1> tells you how
 187  to do it.  You just write the Perl code that you would have written
 188  for doing the same thing to a regular array or hash, and then replace
 189  the array or hash name with C<{$reference}>.  "How do I loop over an
 190  array when all I have is a reference?"  Well, to loop over an array, you
 191  would write
 192  
 193          for my $element (@array) {
 194             ...
 195          }
 196  
 197  so replace the array name, C<@array>, with the reference:
 198  
 199          for my $element (@{$aref}) {
 200             ...
 201          }
 202  
 203  "How do I print out the contents of a hash when all I have is a
 204  reference?"  First write the code for printing out a hash:
 205  
 206          for my $key (keys %hash) {
 207            print "$key => $hash{$key}\n";
 208          }
 209  
 210  And then replace the hash name with the reference:
 211  
 212          for my $key (keys %{$href}) {
 213            print "$key => ${$href}{$key}\n";
 214          }
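
Here is a small runnable sketch that applies B<Use Rule 1> to a few
common operations.  The data and variable names are invented for the
example.

```perl
use strict;
use warnings;

my $aref = [ 3, 1, 2 ];
my $href = { red => 17, blue => 4 };

# Whole-array operations: write @{$aref} where you would write @array.
my @sorted = sort { $a <=> $b } @{$aref};
my $count  = scalar @{$aref};
push @{$aref}, 4;

# Element access: write ${$aref}[N] where you would write $array[N].
${$aref}[0] = 30;

# Hashes work the same way with %{$href} and ${$href}{KEY}.
my @names = sort keys %{$href};
${$href}{green} = 9;

print "sorted: @sorted\n";           # sorted: 1 2 3
print "array now: @{$aref}\n";       # array now: 30 1 2 4
print "hash keys were: @names\n";    # hash keys were: blue red
```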
 215  
 216  =head3 B<Use Rule 2>
 217  
 218  B<Use Rule 1> is all you really need, because it tells you how to do
 219  absolutely everything you ever need to do with references.  But the
 220  most common thing to do with an array or a hash is to extract a single
 221  element, and the B<Use Rule 1> notation is cumbersome.  So there is an
 222  abbreviation.
 223  
 224  C<${$aref}[3]> is too hard to read, so you can write C<< $aref->[3] >>
 225  instead.
 226  
 227  C<${$href}{red}> is too hard to read, so you can write
 228  C<< $href->{red} >> instead.
 229  
 230  If C<$aref> holds a reference to an array, then C<< $aref->[3] >> is
 231  the fourth element of the array.  Don't confuse this with C<$aref[3]>,
 232  which is the fourth element of a totally different array, one
 233  deceptively named C<@aref>.  C<$aref> and C<@aref> are unrelated the
 234  same way that C<$item> and C<@item> are.
 235  
 236  Similarly, C<< $href->{'red'} >> is part of the hash referred to by
 237  the scalar variable C<$href>, perhaps even one with no name.
 238  C<$href{'red'}> is part of the deceptively named C<%href> hash.  It's
 239  easy to leave out the C<< -> >> by accident, and if you do, you'll get
 240  bizarre results when your program gets array and hash elements out of
 241  totally unexpected hashes and arrays that weren't the ones you wanted
 242  to use.
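
The difference between C<< $aref->[3] >> and C<$aref[3]> is easy to
check with a sketch like the following.  The array C<@aref> is declared
on purpose here, only to show that it has nothing to do with C<$aref>;
the contents are made up.

```perl
use strict;
use warnings;

my $aref = [ 'a', 'b', 'c', 'd' ];    # reference to an anonymous array
my @aref = ( 'w', 'x', 'y', 'z' );    # unrelated array that shares the name

my $via_rule1 = ${$aref}[3];    # Use Rule 1 form
my $via_arrow = $aref->[3];     # Use Rule 2 abbreviation, same element
my $other     = $aref[3];       # element of @aref, not of the referenced array

my $href = { red => 17 };
my $red  = $href->{red};        # same as ${$href}{red}

print "$via_rule1 $via_arrow $other $red\n";    # d d z 17
```

With C<use strict> in effect, writing C<$aref[3]> when no C<@aref> has
been declared is a compile-time error, which catches this slip early.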
 243  
 244  
 245  =head2 An Example
 246  
 247  Let's see a quick example of how all this is useful.
 248  
 249  First, remember that C<[1, 2, 3]> makes an anonymous array containing
 250  C<(1, 2, 3)>, and gives you a reference to that array.
 251  
 252  Now think about
 253  
  254      @a = ( [1, 2, 3],
  255             [4, 5, 6],
  256             [7, 8, 9]
  257           );
 258  
 259  @a is an array with three elements, and each one is a reference to
 260  another array.
 261  
 262  C<$a[1]> is one of these references.  It refers to an array, the array
 263  containing C<(4, 5, 6)>, and because it is a reference to an array,
 264  B<Use Rule 2> says that we can write C<< $a[1]->[2] >> to get the
 265  third element from that array.  C<< $a[1]->[2] >> is the 6.
 266  Similarly, C<< $a[0]->[1] >> is the 2.  What we have here is like a
 267  two-dimensional array; you can write C<< $a[ROW]->[COLUMN] >> to get
 268  or set the element in any row and any column of the array.
 269  
 270  The notation still looks a little cumbersome, so there's one more
 271  abbreviation:
 272  
 273  =head2 Arrow Rule
 274  
 275  In between two B<subscripts>, the arrow is optional.
 276  
 277  Instead of C<< $a[1]->[2] >>, we can write C<$a[1][2]>; it means the
 278  same thing.  Instead of C<< $a[0]->[1] = 23 >>, we can write
 279  C<$a[0][1] = 23>; it means the same thing.
 280  
 281  Now it really looks like two-dimensional arrays!
 282  
 283  You can see why the arrows are important.  Without them, we would have
 284  had to write C<${$a[1]}[2]> instead of C<$a[1][2]>.  For
 285  three-dimensional arrays, they let us write C<$x[2][3][5]> instead of
 286  the unreadable C<${${$x[2]}[3]}[5]>.
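
The three notations can be compared directly.  In this sketch the array
is called C<@grid> rather than C<@a>; that name, like the other variable
names, is just an assumption for the example.

```perl
use strict;
use warnings;

my @grid = ( [1, 2, 3],
             [4, 5, 6],
             [7, 8, 9] );

my $six_long  = ${$grid[1]}[2];    # Use Rule 1
my $six_arrow = $grid[1]->[2];     # Use Rule 2
my $six_short = $grid[1][2];       # Arrow Rule

$grid[0][1] = 23;                  # assignment uses the same notation

# Print the grid row by row; each row is itself an array reference.
for my $row (@grid) {
    print join(' ', @{$row}), "\n";
}
```

All three expressions fetch the same element, so which one to write is
purely a matter of readability.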
 287  
 288  =head1 Solution
 289  
 290  Here's the answer to the problem I posed earlier, of reformatting a
 291  file of city and country names.
 292  
 293      1   my %table;
 294  
 295      2   while (<>) {
  296      3     chomp;
 297      4     my ($city, $country) = split /, /;
 298      5     $table{$country} = [] unless exists $table{$country};
 299      6     push @{$table{$country}}, $city;
 300      7   }
 301  
 302      8   foreach $country (sort keys %table) {
 303      9     print "$country: ";
 304     10     my @cities = @{$table{$country}};
 305     11     print join ', ', sort @cities;
 306     12     print ".\n";
  307     13   }
 308  
 309  
 310  The program has two pieces: Lines 2-7 read the input and build a data
 311  structure, and lines 8-13 analyze the data and print out the report.
 312  We're going to have a hash, C<%table>, whose keys are country names,
 313  and whose values are references to arrays of city names.  The data
 314  structure will look like this:
 315  
 316  
 317             %table
 318          +-------+---+
 319          |       |   |   +-----------+--------+
 320          |Germany| *---->| Frankfurt | Berlin |
 321          |       |   |   +-----------+--------+
 322          +-------+---+
 323          |       |   |   +----------+
 324          |Finland| *---->| Helsinki |
 325          |       |   |   +----------+
 326          +-------+---+
 327          |       |   |   +---------+------------+----------+
 328          |  USA  | *---->| Chicago | Washington | New York |
 329          |       |   |   +---------+------------+----------+
 330          +-------+---+
 331  
 332  We'll look at output first.  Supposing we already have this structure,
 333  how do we print it out?
 334  
 335      8   foreach $country (sort keys %table) {
 336      9     print "$country: ";
 337     10     my @cities = @{$table{$country}};
 338     11     print join ', ', sort @cities;
 339     12     print ".\n";
  340     13   }
 341  
 342  C<%table> is an
 343  ordinary hash, and we get a list of keys from it, sort the keys, and
 344  loop over the keys as usual.  The only use of references is in line 10.
 345  C<$table{$country}> looks up the key C<$country> in the hash
 346  and gets the value, which is a reference to an array of cities in that country.
 347  B<Use Rule 1> says that
 348  we can recover the array by saying
 349  C<@{$table{$country}}>.  Line 10 is just like
 350  
 351      @cities = @array;
 352  
 353  except that the name C<array> has been replaced by the reference
 354  C<{$table{$country}}>.  The C<@> tells Perl to get the entire array.
 355  Having gotten the list of cities, we sort it, join it, and print it
 356  out as usual.
 357  
 358  Lines 2-7 are responsible for building the structure in the first
 359  place.  Here they are again:
 360  
 361      2   while (<>) {
  362      3     chomp;
 363      4     my ($city, $country) = split /, /;
 364      5     $table{$country} = [] unless exists $table{$country};
 365      6     push @{$table{$country}}, $city;
 366      7   }
 367  
 368  Lines 2-4 acquire a city and country name.  Line 5 looks to see if the
 369  country is already present as a key in the hash.  If it's not, the
 370  program uses the C<[]> notation (B<Make Rule 2>) to manufacture a new,
 371  empty anonymous array of cities, and installs a reference to it into
 372  the hash under the appropriate key.
 373  
 374  Line 6 installs the city name into the appropriate array.
 375  C<$table{$country}> now holds a reference to the array of cities seen
 376  in that country so far.  Line 6 is exactly like
 377  
 378      push @array, $city;
 379  
 380  except that the name C<array> has been replaced by the reference
 381  C<{$table{$country}}>.  The C<push> adds a city name to the end of the
 382  referred-to array.
 383  
 384  There's one fine point I skipped.  Line 5 is unnecessary, and we can
 385  get rid of it.
 386  
 387      2   while (<>) {
  388      3     chomp;
 389      4     my ($city, $country) = split /, /;
 390      5   ####  $table{$country} = [] unless exists $table{$country};
 391      6     push @{$table{$country}}, $city;
 392      7   }
 393  
 394  If there's already an entry in C<%table> for the current C<$country>,
 395  then nothing is different.  Line 6 will locate the value in
 396  C<$table{$country}>, which is a reference to an array, and push
 397  C<$city> into the array.  But
 398  what does it do when
 399  C<$country> holds a key, say C<Greece>, that is not yet in C<%table>?
 400  
 401  This is Perl, so it does the exact right thing.  It sees that you want
 402  to push C<Athens> onto an array that doesn't exist, so it helpfully
 403  makes a new, empty, anonymous array for you, installs it into
 404  C<%table>, and then pushes C<Athens> onto it.  This is called
 405  `autovivification'--bringing things to life automatically.  Perl saw
 406  that the key wasn't in the hash, so it created a new hash entry
 407  automatically. Perl saw that you wanted to use the hash value as an
 408  array, so it created a new empty array and installed a reference to it
 409  in the hash automatically.  And as usual, Perl made the array one
 410  element longer to hold the new city name.
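
To try the shortened program without preparing an input file, something
like the following should do.  The C<@lines> array holding the sample
input and the C<$report> string are my additions for the sketch; the
logic is otherwise the same as the program above with line 5 removed.

```perl
use strict;
use warnings;

# Sample input from the start of the article, inlined so the sketch
# does not depend on a file or on standard input.
my @lines = (
    "Chicago, USA",
    "Frankfurt, Germany",
    "Berlin, Germany",
    "Washington, USA",
    "Helsinki, Finland",
    "New York, USA",
);

my %table;
for my $line (@lines) {
    my ($city, $country) = split /, /, $line;
    push @{$table{$country}}, $city;    # autovivifies the array on first use
}

# Build the report: countries in sorted order, cities sorted within each.
my $report = '';
for my $country (sort keys %table) {
    $report .= "$country: " . join(', ', sort @{$table{$country}}) . ".\n";
}
print $report;

# A key that has never been seen springs into existence on the first push.
my $had_greece = exists $table{Greece} ? 1 : 0;
push @{$table{Greece}}, "Athens";
```

After the last line, C<$table{Greece}> holds a reference to a
one-element array, even though nothing ever assigned C<[]> to it.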
 411  
 412  =head1 The Rest
 413  
 414  I promised to give you 90% of the benefit with 10% of the details, and
 415  that means I left out 90% of the details.  Now that you have an
 416  overview of the important parts, it should be easier to read the
 417  L<perlref> manual page, which discusses 100% of the details.
 418  
 419  Some of the highlights of L<perlref>:
 420  
 421  =over 4
 422  
 423  =item *
 424  
 425  You can make references to anything, including scalars, functions, and
 426  other references.
 427  
 428  =item *
 429  
 430  In B<Use Rule 1>, you can omit the curly brackets whenever the thing
 431  inside them is an atomic scalar variable like C<$aref>.  For example,
 432  C<@$aref> is the same as C<@{$aref}>, and C<$$aref[1]> is the same as
 433  C<${$aref}[1]>.  If you're just starting out, you may want to adopt
 434  the habit of always including the curly brackets.
 435  
 436  =item *
 437  
 438  This doesn't copy the underlying array:
 439  
 440          $aref2 = $aref1;
 441  
 442  You get two references to the same array.  If you modify
 443  C<< $aref1->[23] >> and then look at
 444  C<< $aref2->[23] >> you'll see the change.
 445  
 446  To copy the array, use
 447  
 448          $aref2 = [@{$aref1}];
 449  
 450  This uses C<[...]> notation to create a new anonymous array, and
 451  C<$aref2> is assigned a reference to the new array.  The new array is
 452  initialized with the contents of the array referred to by C<$aref1>.
 453  
 454  Similarly, to copy an anonymous hash, you can use
 455  
 456          $href2 = {%{$href1}};
 457  
 458  =item *
 459  
 460  To see if a variable contains a reference, use the C<ref> function.  It
 461  returns true if its argument is a reference.  Actually it's a little
 462  better than that: It returns C<HASH> for hash references and C<ARRAY>
 463  for array references.
 464  
 465  =item *
 466  
 467  If you try to use a reference like a string, you get strings like
 468  
 469      ARRAY(0x80f5dec)   or    HASH(0x826afc0)
 470  
 471  If you ever see a string that looks like this, you'll know you
 472  printed out a reference by mistake.
 473  
 474  A side effect of this representation is that you can use C<eq> to see
 475  if two references refer to the same thing.  (But you should usually use
 476  C<==> instead because it's much faster.)
 477  
 478  =item *
 479  
 480  You can use a string as if it were a reference.  If you use the string
 481  C<"foo"> as an array reference, it's taken to be a reference to the
 482  array C<@foo>.  This is called a I<soft reference> or I<symbolic
 483  reference>.  The declaration C<use strict 'refs'> disables this
 484  feature, which can cause all sorts of trouble if you use it by accident.
 485  
 486  =back
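
A few of the highlights above (copying, C<ref>, and comparing
references with C<==>) can be checked with a short sketch.  The variable
names C<$alias> and C<$copy> are assumptions made for the example.

```perl
use strict;
use warnings;

my $aref1 = [ 1, 2, 3 ];
my $alias = $aref1;           # second reference to the same array
my $copy  = [ @{$aref1} ];    # new array with the same contents

$aref1->[0] = 99;

print "alias sees: $alias->[0]\n";    # alias sees: 99
print "copy sees:  $copy->[0]\n";     # copy sees:  1

# ref() tells the kinds of reference apart; it is false for a plain scalar.
my @kinds = ( ref($aref1), ref({}), ref("plain string") );

# == compares the addresses that the references hold.
my $same      = ($alias == $aref1) ? "yes" : "no";
my $different = ($copy  == $aref1) ? "yes" : "no";
print "alias same array? $same; copy same array? $different\n";
```

Note that C<[@{$aref1}]> copies only the top level.  If the elements are
themselves references, the copy and the original still share the
arrays or hashes those elements refer to.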
 487  
 488  You might prefer to go on to L<perllol> instead of L<perlref>; it
 489  discusses lists of lists and multidimensional arrays in detail.  After
 490  that, you should move on to L<perldsc>; it's a Data Structure Cookbook
 491  that shows recipes for using and printing out arrays of hashes, hashes
 492  of arrays, and other kinds of data.
 493  
 494  =head1 Summary
 495  
 496  Everyone needs compound data structures, and in Perl the way you get
 497  them is with references.  There are four important rules for managing
 498  references: Two for making references and two for using them.  Once
 499  you know these rules you can do most of the important things you need
 500  to do with references.
 501  
 502  =head1 Credits
 503  
 504  Author: Mark Jason Dominus, Plover Systems (C<mjd-perl-ref+@plover.com>)
 505  
 506  This article originally appeared in I<The Perl Journal>
 507  ( http://www.tpj.com/ ) volume 3, #2.  Reprinted with permission.
 508  
 509  The original title was I<Understand References Today>.
 510  
 511  =head2 Distribution Conditions
 512  
 513  Copyright 1998 The Perl Journal.
 514  
 515  This documentation is free; you can redistribute it and/or modify it
 516  under the same terms as Perl itself.
 517  
 518  Irrespective of its distribution, all code examples in these files are
 519  hereby placed into the public domain.  You are permitted and
 520  encouraged to use this code in your own programs for fun or for profit
 521  as you see fit.  A simple comment in the code giving credit would be
 522  courteous but is not required.
 523  
 524  
 525  
 526  
 527  =cut

