| 1 |
|
|---|
| 2 | <!DOCTYPE html
|
|---|
| 3 | PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
|
|---|
| 4 | <html>
|
|---|
| 5 | <head>
|
|---|
| 6 | <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
|
|---|
| 7 |
|
|---|
| 8 | <title>16.3. Filtering lists revisited</title>
|
|---|
| 9 | <link rel="stylesheet" href="../diveintopython.css" type="text/css">
|
|---|
| 10 | <link rev="made" href="mailto:f8dy@diveintopython.org">
|
|---|
| 11 | <meta name="generator" content="DocBook XSL Stylesheets V1.52.2">
|
|---|
| 12 | <meta name="keywords" content="Python, Dive Into Python, tutorial, object-oriented, programming, documentation, book, free">
|
|---|
| 13 | <meta name="description" content="Python from novice to pro">
|
|---|
| 14 | <link rel="home" href="../toc/index.html" title="Dive Into Python">
|
|---|
| 15 | <link rel="up" href="index.html" title="Chapter 16. Functional Programming">
|
|---|
| 16 | <link rel="previous" href="finding_the_path.html" title="16.2. Finding the path">
|
|---|
| 17 | <link rel="next" href="mapping_lists.html" title="16.4. Mapping lists revisited">
|
|---|
| 18 | </head>
|
|---|
| 19 | <body>
|
|---|
| 20 | <table id="Header" width="100%" border="0" cellpadding="0" cellspacing="0" summary="">
|
|---|
| 25 | <tr>
|
|---|
| 26 | <td colspan="3" id="logocontainer">
|
|---|
| 27 | <h1 id="logo"><a href="../index.html" accesskey="1">Dive Into Python</a></h1>
|
|---|
| 28 | <p id="tagline">Python from novice to pro</p>
|
|---|
| 29 | </td>
|
|---|
| 35 | </tr>
|
|---|
| 36 | </table>
|
|---|
| 54 |
|
|---|
| 55 | <div class="section" lang="en">
|
|---|
| 56 | <div class="titlepage">
|
|---|
| 57 | <div>
|
|---|
| 58 | <div>
|
|---|
| 59 | <h2 class="title"><a name="regression.filter"></a>16.3. Filtering lists revisited
|
|---|
| 60 | </h2>
|
|---|
| 61 | </div>
|
|---|
| 62 | </div>
|
|---|
| 63 | <div></div>
|
|---|
| 64 | </div>
|
|---|
| 65 | <div class="abstract">
|
|---|
| 66 | <p>You're already familiar with <a href="../power_of_introspection/filtering_lists.html" title="4.5. Filtering Lists">using list comprehensions to filter lists</a>. There is another way to accomplish the same thing, which some people find more expressive.
|
|---|
| 67 | </p>
|
|---|
| 68 | </div>
|
|---|
| 69 | <p><span class="application">Python</span> has a built-in <tt class="function">filter</tt> function which takes two arguments, a function and a list, and returns a list.<sup>[<a name="d0e35697" href="#ftn.d0e35697">7</a>]</sup> The function you pass as the first argument to <tt class="function">filter</tt> must itself take one argument; <tt class="function">filter</tt> calls it on each element of the list and returns a list containing every element for which it returned a true value.
|
|---|
| 70 | </p>
|
|---|
| 71 | <p>Got all that? It's not as difficult as it sounds.</p>
|
|---|
| 72 | <div class="example"><a name="d0e35724"></a><h3 class="title">Example 16.7. Introducing <tt class="function">filter</tt></h3><pre class="screen">
|
|---|
| 73 | <tt class="prompt">>>> </tt><span class="userinput"><span class='pykeyword'>def</span><span class='pyclass'> odd</span>(n):</span> <a name="regression.filter.1.1"></a><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12">
|
|---|
| 74 | <tt class="prompt">... </tt><span class="userinput">    <span class='pykeyword'>return</span> n % 2</span>
|
|---|
| 75 | <tt class="prompt">... </tt>
|
|---|
| 76 | <tt class="prompt">>>> </tt><span class="userinput">li = [1, 2, 3, 5, 9, 10, 256, -3]</span>
|
|---|
| 77 | <tt class="prompt">>>> </tt><span class="userinput">filter(odd, li)</span> <a name="regression.filter.1.2"></a><img src="../images/callouts/2.png" alt="2" border="0" width="12" height="12">
|
|---|
| 78 | <span class="computeroutput">[1, 3, 5, 9, -3]</span>
|
|---|
| 79 | <tt class="prompt">>>> </tt><span class="userinput">[e <span class='pykeyword'>for</span> e <span class='pykeyword'>in</span> li <span class='pykeyword'>if</span> odd(e)]</span> <a name="regression.filter.1.3"></a><img src="../images/callouts/3.png" alt="3" border="0" width="12" height="12">
|
|---|
| | <span class="computeroutput">[1, 3, 5, 9, -3]</span>
|
|---|
| 80 | <tt class="prompt">>>> </tt><span class="userinput">filteredList = []</span>
|
|---|
| 81 | <tt class="prompt">>>> </tt><span class="userinput"><span class='pykeyword'>for</span> n <span class='pykeyword'>in</span> li:</span> <a name="regression.filter.1.4"></a><img src="../images/callouts/4.png" alt="4" border="0" width="12" height="12">
|
|---|
| 82 | <tt class="prompt">... </tt><span class="userinput">    <span class='pykeyword'>if</span> odd(n):</span>
|
|---|
| 83 | <tt class="prompt">... </tt><span class="userinput">        filteredList.append(n)</span>
|
|---|
| 84 | <tt class="prompt">... </tt>
|
|---|
| 85 | <tt class="prompt">>>> </tt><span class="userinput">filteredList</span>
|
|---|
| 86 | <span class="computeroutput">[1, 3, 5, 9, -3]</span></pre><div class="calloutlist">
|
|---|
| 87 | <table border="0" summary="Callout list">
|
|---|
| 88 | <tr>
|
|---|
| 89 | <td width="12" valign="top" align="left"><a href="#regression.filter.1.1"><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12"></a>
|
|---|
| 90 | </td>
|
|---|
| 91 | <td valign="top" align="left"><tt class="function">odd</tt> uses the modulo operator “<span class="quote"><tt class="literal">%</tt></span>” to return <tt class="literal">1</tt> (a true value) if <tt class="varname">n</tt> is odd and <tt class="literal">0</tt> (a false value) if <tt class="varname">n</tt> is even.
|
|---|
| 92 | </td>
|
|---|
| 93 | </tr>
|
|---|
| 94 | <tr>
|
|---|
| 95 | <td width="12" valign="top" align="left"><a href="#regression.filter.1.2"><img src="../images/callouts/2.png" alt="2" border="0" width="12" height="12"></a>
|
|---|
| 96 | </td>
|
|---|
| 97 | <td valign="top" align="left"><tt class="function">filter</tt> takes two arguments, a function (<tt class="function">odd</tt>) and a list (<tt class="varname">li</tt>). It loops through the list and calls <tt class="function">odd</tt> with each element. If <tt class="function">odd</tt> returns a true value (remember, any non-zero value is true in <span class="application">Python</span>), then the element is included in the returned list, otherwise it is filtered out. The result is a list of only the odd
|
|---|
| 98 | numbers from the original list, in the same order as they appeared in the original.
|
|---|
| 99 | </td>
|
|---|
| 100 | </tr>
|
|---|
| 101 | <tr>
|
|---|
| 102 | <td width="12" valign="top" align="left"><a href="#regression.filter.1.3"><img src="../images/callouts/3.png" alt="3" border="0" width="12" height="12"></a>
|
|---|
| 103 | </td>
|
|---|
| 104 | <td valign="top" align="left">You could accomplish the same thing using list comprehensions, as you saw in <a href="../power_of_introspection/filtering_lists.html" title="4.5. Filtering Lists">Section 4.5, “Filtering Lists”</a>.
|
|---|
| 105 | </td>
|
|---|
| 106 | </tr>
|
|---|
| 107 | <tr>
|
|---|
| 108 | <td width="12" valign="top" align="left"><a href="#regression.filter.1.4"><img src="../images/callouts/4.png" alt="4" border="0" width="12" height="12"></a>
|
|---|
| 109 | </td>
|
|---|
| 110 | <td valign="top" align="left">You could also accomplish the same thing with a <tt class="literal">for</tt> loop. Depending on your programming background, this may seem more “<span class="quote">straightforward</span>”, but functions like <tt class="function">filter</tt> are much more expressive. Not only are they easier to write, they're easier to read, too. Reading the <tt class="literal">for</tt> loop is like standing too close to a painting; you see all the details, but it may take a few seconds to be able to step
|
|---|
| 111 | back and see the bigger picture: “<span class="quote">Oh, you're just filtering the list!</span>”
|
|---|
| 112 | </td>
|
|---|
| 113 | </tr>
|
|---|
| 114 | </table>
|
|---|
| 115 | </div>
|
|---|
| 116 | </div>
|
|---|
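<p>The three approaches in Example 16.7 can also be compared in a script rather than at the interactive prompt. The following is a minimal sketch, assuming Python 3, where <tt class="function">filter</tt> returns a lazy iterator instead of a list, so the result is wrapped in <tt class="function">list</tt>. The variable names <tt class="varname">with_filter</tt>, <tt class="varname">with_comprehension</tt>, and <tt class="varname">with_loop</tt> are illustrative and do not come from the book's code.</p>

```python
def odd(n):
    # n % 2 is 1 for odd integers (negative ones included) and 0 for even ones
    return n % 2

li = [1, 2, 3, 5, 9, 10, 256, -3]

# filter calls odd on each element and keeps those with a true result.
# list() is an assumption for Python 3, where filter returns an iterator.
with_filter = list(filter(odd, li))

# The same selection written as a list comprehension
with_comprehension = [e for e in li if odd(e)]

# And written as an explicit for loop
with_loop = []
for n in li:
    if odd(n):
        with_loop.append(n)

print(with_filter)                                     # [1, 3, 5, 9, -3]
print(with_filter == with_comprehension == with_loop)  # True
```

<p>All three produce the odd numbers in their original order; the difference is only in how directly each form states the intent.</p>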
| 117 | <div class="example"><a name="d0e35864"></a><h3 class="title">Example 16.8. <tt class="function">filter</tt> in <tt class="filename">regression.py</tt></h3><pre class="programlisting">
|
|---|
| 118 | files = os.listdir(path) <a name="regression.filter.2.1"></a><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12">
|
|---|
| 119 | test = re.compile(<span class='pystring'>"test\.py$"</span>, re.IGNORECASE) <a name="regression.filter.2.2"></a><img src="../images/callouts/2.png" alt="2" border="0" width="12" height="12">
|
|---|
| 120 | files = filter(test.search, files) <a name="regression.filter.2.3"></a><img src="../images/callouts/3.png" alt="3" border="0" width="12" height="12"></pre><div class="calloutlist">
|
|---|
| 121 | <table border="0" summary="Callout list">
|
|---|
| 122 | <tr>
|
|---|
| 123 | <td width="12" valign="top" align="left"><a href="#regression.filter.2.1"><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12"></a>
|
|---|
| 124 | </td>
|
|---|
| 125 | <td valign="top" align="left">As you saw in <a href="finding_the_path.html" title="16.2. Finding the path">Section 16.2, “Finding the path”</a>, <tt class="varname">path</tt> may contain the full or partial pathname of the directory of the currently running script, or it may contain an empty string
|
|---|
| 126 | if the script is being run from the current directory. Either way, <tt class="varname">files</tt> will end up with the names of the files in the same directory as the script you're running.
|
|---|
| 127 | </td>
|
|---|
| 128 | </tr>
|
|---|
| 129 | <tr>
|
|---|
| 130 | <td width="12" valign="top" align="left"><a href="#regression.filter.2.2"><img src="../images/callouts/2.png" alt="2" border="0" width="12" height="12"></a>
|
|---|
| 131 | </td>
|
|---|
| 132 | <td valign="top" align="left">This is a compiled regular expression. As you saw in <a href="../refactoring/refactoring.html" title="15.3. Refactoring">Section 15.3, “Refactoring”</a>, if you're going to use the same regular expression over and over, you should compile it for faster performance. The compiled
|
|---|
| 133 | object has a <tt class="function">search</tt> method which takes a single argument, the string to search. If the regular expression matches the string, the <tt class="function">search</tt> method returns a <tt class="classname">Match</tt> object containing information about the regular expression match; otherwise it returns <tt class="literal">None</tt>, the <span class="application">Python</span> null value.
|
|---|
| 134 | </td>
|
|---|
| 135 | </tr>
|
|---|
| 136 | <tr>
|
|---|
| 137 | <td width="12" valign="top" align="left"><a href="#regression.filter.2.3"><img src="../images/callouts/3.png" alt="3" border="0" width="12" height="12"></a>
|
|---|
| 138 | </td>
|
|---|
| 139 | <td valign="top" align="left">For each element in the <tt class="varname">files</tt> list, you're going to call the <tt class="function">search</tt> method of the compiled regular expression object, <tt class="varname">test</tt>. If the regular expression matches, the method will return a <tt class="classname">Match</tt> object, which <span class="application">Python</span> considers to be true, so the element will be included in the list returned by <tt class="function">filter</tt>. If the regular expression does not match, the <tt class="function">search</tt> method will return <tt class="literal">None</tt>, which <span class="application">Python</span> considers to be false, so the element will not be included.
|
|---|
| 140 | </td>
|
|---|
| 141 | </tr>
|
|---|
| 142 | </table>
|
|---|
| 143 | </div>
|
|---|
| 144 | </div>
|
|---|
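<p>To see the truth-value behavior described in Example 16.8 without touching the file system, here is a small sketch that runs the same regular expression against a hard-coded list. The filenames are hypothetical stand-ins for the result of <tt class="function">os.listdir</tt>, the pattern is written as a raw string, and <tt class="function">list</tt> is wrapped around <tt class="function">filter</tt> on the assumption that the code runs under Python 3.</p>

```python
import re

# Hypothetical directory listing standing in for os.listdir(path)
files = [
    "apihelpertest.py",
    "regression.py",
    "romantest.py",
    "KGPTEST.PY",
    "test.pyc",
    "readme.txt",
]

# Same pattern as in the example, written as a raw string
test = re.compile(r"test\.py$", re.IGNORECASE)

# search returns a match object (treated as true) or None (treated as false)
hit = test.search("romantest.py")
miss = test.search("regression.py")
print(hit is not None)  # True
print(miss)             # None

# Passing the bound method test.search lets filter call it on every filename
matching = list(filter(test.search, files))
print(matching)         # ['apihelpertest.py', 'romantest.py', 'KGPTEST.PY']
```

<p>Note that <tt class="literal">test.pyc</tt> is rejected because the pattern is anchored to the end of the string, while <tt class="literal">KGPTEST.PY</tt> is accepted because of <tt class="literal">re.IGNORECASE</tt>.</p>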
| 145 | <p><b>Historical note. </b>Versions of <span class="application">Python</span> prior to 2.0 did not have <a href="../native_data_types/mapping_lists.html" title="3.6. Mapping Lists">list comprehensions</a>, so you couldn't <a href="../power_of_introspection/filtering_lists.html" title="4.5. Filtering Lists">filter using list comprehensions</a>; the <tt class="function">filter</tt> function was the only game in town. Even with the introduction of list comprehensions in 2.0, some people still prefer the
|
|---|
| 146 | old-style <tt class="function">filter</tt> (and its companion function, <tt class="function">map</tt>, which you'll see later in this chapter). Both techniques work at the moment, so which one you use is a matter of style.
|
|---|
| 147 | There is discussion that <tt class="function">map</tt> and <tt class="function">filter</tt> might be deprecated in a future version of <span class="application">Python</span>, but no decision has been made.
|
|---|
| 148 | </p>
|
|---|
| 149 | <div class="example"><a name="d0e35972"></a><h3 class="title">Example 16.9. Filtering using list comprehensions instead</h3><pre class="programlisting">
|
|---|
| 150 | files = os.listdir(path)
|
|---|
| 151 | test = re.compile(<span class='pystring'>"test\.py$"</span>, re.IGNORECASE)
|
|---|
| 152 | files = [f <span class='pykeyword'>for</span> f <span class='pykeyword'>in</span> files <span class='pykeyword'>if</span> test.search(f)] <a name="regression.filter.3.1"></a><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12"></pre><div class="calloutlist">
|
|---|
| 153 | <table border="0" summary="Callout list">
|
|---|
| 154 | <tr>
|
|---|
| 155 | <td width="12" valign="top" align="left"><a href="#regression.filter.3.1"><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12"></a>
|
|---|
| 156 | </td>
|
|---|
| 157 | <td valign="top" align="left">This will accomplish exactly the same result as using the <tt class="function">filter</tt> function. Which way is more expressive? That's up to you.
|
|---|
| 158 | </td>
|
|---|
| 159 | </tr>
|
|---|
| 160 | </table>
|
|---|
| 161 | </div>
|
|---|
| 162 | </div>
|
|---|
| 163 | <div class="footnotes">
|
|---|
| 164 | <h3 class="footnotetitle">Footnotes</h3>
|
|---|
| 165 | <div class="footnote">
|
|---|
| 166 | <p><sup>[<a name="ftn.d0e35697" href="#d0e35697">7</a>] </sup>Technically, the second argument to <tt class="function">filter</tt> can be any sequence, including lists, tuples, and custom classes that act like lists by defining the <tt class="function">__getitem__</tt> special method. If possible, <tt class="function">filter</tt> will return the same datatype as you give it, so filtering a list returns a list, but filtering a tuple returns a tuple.
|
|---|
| 167 | </p>
|
|---|
| 168 | </div>
|
|---|
| 169 | </div>
|
|---|
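<p>The footnote's point about list-like classes can be sketched as follows. <tt class="classname">Countdown</tt> is a hypothetical class invented for this illustration; it defines only <tt class="function">__getitem__</tt>. The sketch assumes Python 3, where <tt class="function">filter</tt> always returns an iterator regardless of the input type, so the caller rebuilds the desired datatype explicitly; the same-datatype behavior described in the footnote applies to the Python 2 interpreter the book uses.</p>

```python
class Countdown:
    """Hypothetical list-like class that supports indexing only."""

    def __init__(self, start):
        self.start = start

    def __getitem__(self, index):
        # Raising IndexError signals the end of the sequence to iteration
        if index < 0 or index > self.start:
            raise IndexError(index)
        return self.start - index


def odd(n):
    return n % 2


# filter iterates over the object by calling __getitem__ with 0, 1, 2, ...
from_custom = list(filter(odd, Countdown(5)))
print(from_custom)  # [5, 3, 1]

# Under Python 3 a tuple has to be rebuilt from the iterator by hand
from_tuple = tuple(filter(odd, (1, 2, 3, 4)))
print(from_tuple)   # (1, 3)
```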
| 170 | </div>
|
|---|
| 182 | <div class="Footer">
|
|---|
| 183 | <p class="copyright">Copyright © 2000, 2001, 2002, 2003, 2004 <a href="mailto:mark@diveintopython.org">Mark Pilgrim</a></p>
|
|---|
| 184 | </div>
|
|---|
| 185 | </body>
|
|---|
| 186 | </html> |
|---|