<!DOCTYPE html
  PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "">
5   <head>
6      <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
8      <title>16.3.&nbsp;Filtering lists revisited</title>
9      <link rel="stylesheet" href="../diveintopython.css" type="text/css">
10      <link rev="made" href="">
11      <meta name="generator" content="DocBook XSL Stylesheets V1.52.2">
12      <meta name="keywords" content="Python, Dive Into Python, tutorial, object-oriented, programming, documentation, book, free">
13      <meta name="description" content="Python from novice to pro">
14      <link rel="home" href="../toc/index.html" title="Dive Into Python">
15      <link rel="up" href="index.html" title="Chapter&nbsp;16.&nbsp;Functional Programming">
16      <link rel="previous" href="finding_the_path.html" title="16.2.&nbsp;Finding the path">
17      <link rel="next" href="mapping_lists.html" title="16.4.&nbsp;Mapping lists revisited">
18   </head>
19   <body>
20      <table id="Header" width="100%" border="0" cellpadding="0" cellspacing="0" summary="">
25         <tr>
26            <td colspan="3" id="logocontainer">
27               <h1 id="logo"><a href="../index.html" accesskey="1">Dive Into Python</a></h1>
28               <p id="tagline">Python from novice to pro</p>
29            </td>
35         </tr>
36      </table>
55      <div class="section" lang="en">
56         <div class="titlepage">
57            <div>
58               <div>
59                  <h2 class="title"><a name="regression.filter"></a>16.3.&nbsp;Filtering lists revisited
60                  </h2>
61               </div>
62            </div>
63            <div></div>
64         </div>
65         <div class="abstract">
66            <p>You're already familiar with <a href="../power_of_introspection/filtering_lists.html" title="4.5.&nbsp;Filtering Lists">using list comprehensions to filter lists</a>.  There is another way to accomplish this same thing, which some people feel is more expressive.
67            </p>
68         </div>
         <p><span class="application">Python</span> has a built-in <tt class="function">filter</tt> function which takes two arguments, a function and a list, and returns a list.<sup>[<a name="d0e35697" href="#ftn.d0e35697">7</a>]</sup>  The function passed as the first argument to <tt class="function">filter</tt> must itself take one argument.  The list that <tt class="function">filter</tt> returns contains every element of the original list for which that function returns a true value.
70         </p>
71         <p>Got all that?  It's not as difficult as it sounds.</p>
         <div class="example"><a name="d0e35724"></a><h3 class="title">Example&nbsp;16.7.&nbsp;Introducing <tt class="function">filter</tt></h3><pre class="screen">
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput"><span class='pykeyword'>def</span><span class='pyclass'> odd</span>(n):</span>                 <a name="regression.filter.1.1"></a><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12">
<tt class="prompt">... </tt><span class="userinput">    <span class='pykeyword'>return</span> n % 2</span>
<tt class="prompt">...     </tt>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput">li = [1, 2, 3, 5, 9, 10, 256, -3]</span>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput">filter(odd, li)</span>             <a name="regression.filter.1.2"></a><img src="../images/callouts/2.png" alt="2" border="0" width="12" height="12">
<span class="computeroutput">[1, 3, 5, 9, -3]</span>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput">[e <span class='pykeyword'>for</span> e <span class='pykeyword'>in</span> li <span class='pykeyword'>if</span> odd(e)]</span>   <a name="regression.filter.1.3"></a><img src="../images/callouts/3.png" alt="3" border="0" width="12" height="12">
<span class="computeroutput">[1, 3, 5, 9, -3]</span>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput">filteredList = []</span>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput"><span class='pykeyword'>for</span> n <span class='pykeyword'>in</span> li:</span>                <a name="regression.filter.1.4"></a><img src="../images/callouts/4.png" alt="4" border="0" width="12" height="12">
<tt class="prompt">... </tt><span class="userinput">    <span class='pykeyword'>if</span> odd(n):</span>
<tt class="prompt">... </tt><span class="userinput">        filteredList.append(n)</span>
<tt class="prompt">...     </tt>
<tt class="prompt">&gt;&gt;&gt; </tt><span class="userinput">filteredList</span>
<span class="computeroutput">[1, 3, 5, 9, -3]</span></pre><div class="calloutlist">
87               <table border="0" summary="Callout list">
88                  <tr>
89                     <td width="12" valign="top" align="left"><a href="#regression.filter.1.1"><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12"></a>
90                     </td>
                     <td valign="top" align="left"><tt class="function">odd</tt> uses the modulo operator &#8220;<span class="quote"><tt class="literal">%</tt></span>&#8221; to return <tt class="literal">1</tt> (a true value) if <tt class="varname">n</tt> is odd and <tt class="literal">0</tt> (a false value) if <tt class="varname">n</tt> is even.
92                     </td>
93                  </tr>
94                  <tr>
95                     <td width="12" valign="top" align="left"><a href="#regression.filter.1.2"><img src="../images/callouts/2.png" alt="2" border="0" width="12" height="12"></a>
96                     </td>
97                     <td valign="top" align="left"><tt class="function">filter</tt> takes two arguments, a function (<tt class="function">odd</tt>) and a list (<tt class="varname">li</tt>).  It loops through the list and calls <tt class="function">odd</tt> with each element.  If <tt class="function">odd</tt> returns a true value (remember, any non-zero value is true in <span class="application">Python</span>), then the element is included in the returned list, otherwise it is filtered out.  The result is a list of only the odd
98                        numbers from the original list, in the same order as they appeared in the original.
99                     </td>
100                  </tr>
101                  <tr>
102                     <td width="12" valign="top" align="left"><a href="#regression.filter.1.3"><img src="../images/callouts/3.png" alt="3" border="0" width="12" height="12"></a>
103                     </td>
104                     <td valign="top" align="left">You could accomplish the same thing using list comprehensions, as you saw in <a href="../power_of_introspection/filtering_lists.html" title="4.5.&nbsp;Filtering Lists">Section&nbsp;4.5, &#8220;Filtering Lists&#8221;</a>.
105                     </td>
106                  </tr>
107                  <tr>
108                     <td width="12" valign="top" align="left"><a href="#regression.filter.1.4"><img src="../images/callouts/4.png" alt="4" border="0" width="12" height="12"></a>
109                     </td>
                     <td valign="top" align="left">You could also accomplish the same thing with a <tt class="literal">for</tt> loop.  Depending on your programming background, this may seem more &#8220;<span class="quote">straightforward</span>&#8221;, but functions like <tt class="function">filter</tt> are much more expressive.  Not only are they easier to write, they're easier to read, too.  Reading the <tt class="literal">for</tt> loop is like standing too close to a painting; you see all the details, but it may take a few seconds to step
                        back and see the bigger picture: &#8220;<span class="quote">Oh, you're just filtering the list!</span>&#8221;
112                     </td>
113                  </tr>
114               </table>
115            </div>
116         </div>
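         <p>The session in Example&nbsp;16.7 is written for the <span class="application">Python</span> 2 interpreter, where <tt class="function">filter</tt> returns a list.  As a minimal sketch, assuming <span class="application">Python</span> 3 (where <tt class="function">filter</tt> returns a lazy iterator that <tt class="function">list</tt> must consume), the three approaches can be compared in a script.  The names <tt class="varname">truth_values</tt>, <tt class="varname">with_filter</tt>, <tt class="varname">with_comprehension</tt>, and <tt class="varname">with_loop</tt> are illustrative and do not come from the original example.</p>

```python
def odd(n):
    # n % 2 is 1 for odd integers and 0 for even ones,
    # including negative integers such as -3
    return n % 2

li = [1, 2, 3, 5, 9, 10, 256, -3]

# What odd actually returns: integers, not True/False.
# filter only cares whether each value is true or false.
truth_values = [odd(n) for n in li]

# Assumption: Python 3, so filter is wrapped in list()
with_filter = list(filter(odd, li))

# The same selection as a list comprehension
with_comprehension = [e for e in li if odd(e)]

# The same selection as an explicit loop
with_loop = []
for n in li:
    if odd(n):
        with_loop.append(n)

print(truth_values)   # [1, 0, 1, 1, 1, 0, 0, 1]
print(with_filter)    # [1, 3, 5, 9, -3]
```

         <p>All three results hold the same elements in the same order, which is the point the callouts make: the approaches differ in style, not in outcome.</p>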
         <div class="example"><a name="d0e35864"></a><h3 class="title">Example&nbsp;16.8.&nbsp;<tt class="function">filter</tt> in <tt class="filename"></tt></h3><pre class="programlisting">
    files = os.listdir(path)                                <a name="regression.filter.2.1"></a><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12">
    test = re.compile(<span class='pystring'>"test\.py$"</span>, re.IGNORECASE)           <a name="regression.filter.2.2"></a><img src="../images/callouts/2.png" alt="2" border="0" width="12" height="12">
    files = filter(test.search, files)                      <a name="regression.filter.2.3"></a><img src="../images/callouts/3.png" alt="3" border="0" width="12" height="12"></pre><div class="calloutlist">
121               <table border="0" summary="Callout list">
122                  <tr>
123                     <td width="12" valign="top" align="left"><a href="#regression.filter.2.1"><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12"></a>
124                     </td>
                     <td valign="top" align="left">As you saw in <a href="finding_the_path.html" title="16.2.&nbsp;Finding the path">Section&nbsp;16.2, &#8220;Finding the path&#8221;</a>, <tt class="varname">path</tt> may contain the full or partial pathname of the directory of the currently running script, or it may contain an empty string
                        if the script is being run from the current directory.  Either way, <tt class="varname">files</tt> will end up with the names of the files in the same directory as the script you're running.
127                     </td>
128                  </tr>
129                  <tr>
130                     <td width="12" valign="top" align="left"><a href="#regression.filter.2.2"><img src="../images/callouts/2.png" alt="2" border="0" width="12" height="12"></a>
131                     </td>
132                     <td valign="top" align="left">This is a compiled regular expression.  As you saw in <a href="../refactoring/refactoring.html" title="15.3.&nbsp;Refactoring">Section&nbsp;15.3, &#8220;Refactoring&#8221;</a>, if you're going to use the same regular expression over and over, you should compile it for faster performance.  The compiled
133                        object has a <tt class="function">search</tt> method which takes a single argument, the string to search.  If the regular expression matches the string, the <tt class="function">search</tt> method returns a <tt class="classname">Match</tt> object containing information about the regular expression match; otherwise it returns <tt class="literal">None</tt>, the <span class="application">Python</span> null value.
134                     </td>
135                  </tr>
136                  <tr>
137                     <td width="12" valign="top" align="left"><a href="#regression.filter.2.3"><img src="../images/callouts/3.png" alt="3" border="0" width="12" height="12"></a>
138                     </td>
139                     <td valign="top" align="left">For each element in the <tt class="varname">files</tt> list, you're going to call the <tt class="function">search</tt> method of the compiled regular expression object, <tt class="varname">test</tt>.  If the regular expression matches, the method will return a <tt class="classname">Match</tt> object, which <span class="application">Python</span> considers to be true, so the element will be included in the list returned by <tt class="function">filter</tt>.  If the regular expression does not match, the <tt class="function">search</tt> method will return <tt class="literal">None</tt>, which <span class="application">Python</span> considers to be false, so the element will not be included.
140                     </td>
141                  </tr>
142               </table>
143            </div>
144         </div>
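         <p>To see the <tt class="function">search</tt> method acting as the filter function without touching the filesystem, here is a small sketch.  It assumes <span class="application">Python</span> 3 (hence the <tt class="function">list</tt> call around <tt class="function">filter</tt>), and the filenames in <tt class="varname">files</tt> are hypothetical stand-ins for whatever <tt class="function">os.listdir</tt> would return.</p>

```python
import re

# Hypothetical directory listing, standing in for os.listdir(path)
files = ["romantest.py", "regression.py", "KGPTEST.PY", "test.pyc", "notes.txt"]

# Matches names that end in "test.py", ignoring case
test = re.compile(r"test\.py$", re.IGNORECASE)

# search returns a match object (true) or None (false)
hit = test.search("romantest.py")
miss = test.search("notes.txt")

# Assumption: Python 3, so the iterator from filter is wrapped in list()
matches = list(filter(test.search, files))

print(miss)      # None
print(matches)   # ['romantest.py', 'KGPTEST.PY']
```

         <p>Note that <tt class="filename">test.pyc</tt> is rejected because the pattern is anchored with <tt class="literal">$</tt>, and <tt class="filename">KGPTEST.PY</tt> is accepted because of <tt class="literal">re.IGNORECASE</tt>.</p>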
145         <p><b>Historical note.&nbsp;</b>Versions of <span class="application">Python</span> prior to 2.0 did not have <a href="../native_data_types/mapping_lists.html" title="3.6.&nbsp;Mapping Lists">list comprehensions</a>, so you couldn't <a href="../power_of_introspection/filtering_lists.html" title="4.5.&nbsp;Filtering Lists">filter using list comprehensions</a>; the <tt class="function">filter</tt> function was the only game in town.  Even with the introduction of list comprehensions in 2.0, some people still prefer the
146            old-style <tt class="function">filter</tt> (and its companion function, <tt class="function">map</tt>, which you'll see later in this chapter).  Both techniques work at the moment, so which one you use is a matter of style.
147             There is discussion that <tt class="function">map</tt> and <tt class="function">filter</tt> might be deprecated in a future version of <span class="application">Python</span>, but no decision has been made.
148         </p>
         <div class="example"><a name="d0e35972"></a><h3 class="title">Example&nbsp;16.9.&nbsp;Filtering using list comprehensions instead</h3><pre class="programlisting">
    files = os.listdir(path)
    test = re.compile(<span class='pystring'>"test\.py$"</span>, re.IGNORECASE)
    files = [f <span class='pykeyword'>for</span> f <span class='pykeyword'>in</span> files <span class='pykeyword'>if</span> test.search(f)] <a name="regression.filter.3.1"></a><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12"></pre><div class="calloutlist">
153               <table border="0" summary="Callout list">
154                  <tr>
155                     <td width="12" valign="top" align="left"><a href="#regression.filter.3.1"><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12"></a>
156                     </td>
157                     <td valign="top" align="left">This will accomplish exactly the same result as using the <tt class="function">filter</tt> function.  Which way is more expressive?  That's up to you.
158                     </td>
159                  </tr>
160               </table>
161            </div>
162         </div>
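         <p>The claim that both styles give the same result can be checked directly.  The sketch below wraps each style in a function and compares them; the function names <tt class="function">select_with_filter</tt> and <tt class="function">select_with_comprehension</tt> and the sample filenames are hypothetical, and <span class="application">Python</span> 3 is assumed.</p>

```python
import re

def select_with_filter(names, pattern):
    # Assumption: Python 3, where filter returns an iterator
    return list(filter(pattern.search, names))

def select_with_comprehension(names, pattern):
    # Keeps each name for which pattern.search finds a match
    return [f for f in names if pattern.search(f)]

# Hypothetical filenames used only for this comparison
sample = ["apihelpertest.py", "apihelper.py", "ODBCHELPERTEST.PY", "readme.txt"]
test = re.compile(r"test\.py$", re.IGNORECASE)

by_filter = select_with_filter(sample, test)
by_comprehension = select_with_comprehension(sample, test)

print(by_filter == by_comprehension)   # True
print(by_comprehension)                # ['apihelpertest.py', 'ODBCHELPERTEST.PY']
```

         <p>The list comprehension produces a list in both <span class="application">Python</span> 2 and <span class="application">Python</span> 3, which is one practical reason some readers may prefer it.</p>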
163         <div class="footnotes">
164            <h3 class="footnotetitle">Footnotes</h3>
165            <div class="footnote">
166               <p><sup>[<a name="ftn.d0e35697" href="#d0e35697">7</a>] </sup>Technically, the second argument to <tt class="function">filter</tt> can be any sequence, including lists, tuples, and custom classes that act like lists by defining the <tt class="function">__getitem__</tt> special method.  If possible, <tt class="function">filter</tt> will return the same datatype as you give it, so filtering a list returns a list, but filtering a tuple returns a tuple.
167               </p>
168            </div>
169         </div>
170      </div>
171      <table class="Footer" width="100%" border="0" cellpadding="0" cellspacing="0" summary="">
172         <tr>
173            <td width="35%" align="left"><br><a class="NavigationArrow" href="finding_the_path.html">&lt;&lt;&nbsp;Finding the path</a></td>
174            <td width="30%" align="center"><br>&nbsp;<span class="divider">|</span>&nbsp;<a href="index.html#regression.divein" title="16.1.&nbsp;Diving in">1</a> <span class="divider">|</span> <a href="finding_the_path.html" title="16.2.&nbsp;Finding the path">2</a> <span class="divider">|</span> <span class="thispage">3</span> <span class="divider">|</span> <a href="mapping_lists.html" title="16.4.&nbsp;Mapping lists revisited">4</a> <span class="divider">|</span> <a href="data_centric.html" title="16.5.&nbsp;Data-centric programming">5</a> <span class="divider">|</span> <a href="dynamic_import.html" title="16.6.&nbsp;Dynamically importing modules">6</a> <span class="divider">|</span> <a href="all_together.html" title="16.7.&nbsp;Putting it all together">7</a> <span class="divider">|</span> <a href="summary.html" title="16.8.&nbsp;Summary">8</a>&nbsp;<span class="divider">|</span>&nbsp;
175            </td>
176            <td width="35%" align="right"><br><a class="NavigationArrow" href="mapping_lists.html">Mapping lists revisited&nbsp;&gt;&gt;</a></td>
177         </tr>
178         <tr>
179            <td colspan="3"><br></td>
180         </tr>
181      </table>
182      <div class="Footer">
183         <p class="copyright">Copyright &copy; 2000, 2001, 2002, 2003, 2004 <a href="">Mark Pilgrim</a></p>
184      </div>
185   </body>