<HTML>
   <HEAD>
      <TITLE> Why to use half-open ranges </TITLE>
   </HEAD>

   <BODY>

      <H1> Why to use half-open ranges </H1>

      <pre>
$Header: /repository/documentation/ranges.htm,v 1.2 2006/04/29 00:24:44 joerg Exp $
      </pre>


      <H2> Introduction </H2>

      <P> A range is a list of things beginning from a start, being in a sequence up
         to an end. For example the numbers from 13 to 42, the days from 19.12.1919
         to 12.12.1942, objects in an array from the 13th up to the 42nd, objects in
         a list ... </P>

      <P> A range can be specified by giving its begin and end point. So for the
         numbers from 13 to 42, we can write [13, 42 ).
         For a range definition like [13, 42) there is the question, do the 13 and
         the 42 belong to the range or not. We use [] to include the begin and end
         and we use () to exclude them. So [13, 42) means the 13 is included and the
         42 is excluded. In programming science these kind of ranges are called
         "half-open". (For the mathematicians: in the topology of real numbers [13,
         42] is indeed closed whereas (13, 42) is open. But for a discrete topology
         like the set of natural numbers N each subset is both closed and open. So as
         a subset of N [13, 42) is open and closed at the same time. Because in
         programming science all sets are finite and thus discrete the word
         "half-open" makes no sense. Thus the programming science notion of half-open
         is inspired by but not identical to the topological one.) </P>

      <P> In daily life we normally use closed ranges. So when you say "i'm in
         vacation from the 18th to the 20th of August" you're 3 day of. And i know:
         your grandmother always used closed ranges and has lived for 87 years. </P>

      <P> One of the greatest results of programming science in the 1990th is C++'s
         STL, which is a library for containers, iterators and algorithms. The STL
         uses half-open ranges ( "Generic programming and the STL" by Matthew H.
         Austern, chapter 2.1.2) And one the best programmers of the known parts of
         our galaxy also votes for this asymmetric approach ( "C traps and pitfalls"
         Andrew Koenig, chapter 3.6). </P>y

      <P> So why? Matthew Austern: "What it really comes down to is convenience. The
         notation for half-open intervals may look unfamiliar at first, but it turns
         out to be easier, both for formulation definitions and for the more mundane
         purposes of writing and using algorithms, than the alternatives." </P>

      <H2> 16 good reasons to use half-open ranges </H2>

      <P> 0. handle differnt 'end of month dates' automatically. to find if something
         is in february say < 1.3. not <28.2. which is not always the complete truth. </P>

      <P> 1. Reduce the number of "off-by-one" errors. </P>

      <P> 2. The number of elements in the range [n, m) is just m-n (and not m-n+1). </P>

      <P> 3. The empty range is [n, n) (and not [n, n-1], which can be a problem if n
         is an iterator already pointing the first element of a list, or if n == 0). </P>

      <P> 4. For floats you can write [13, 42) (instead of [13, 41.999999999999] ). </P>

      <P> 5. If the precision of the floats change you can keep [13, 42) (and not
         change it to 41.999999999999999999999999] ). </P>

      <P> 6. The same arguments 4 and 5 for a range of strings ["a", "f") (and not
         ["a", "ezzzzzzzzzzzzzzzzz"] ). </P>

      <P> 7. And for strings: is 'z' really the largest char? In the ANSI ASCII coding
         "e~" > "ezzzzzzzzzzzzzzzzz". And if you say ["a", "e~~~~~~~~~~~~~~~~~~~~~"]
         the behavior depends on the character set and locales. </P>

      <P> 8. And for variable length strings (like the C++ string type)? It's not
         possible to write "e<infinite number of '~'s>". </P>

      <P> 9. And for dates we have reason 5. Additionally begin and end is equal for
         the hours if you don't need them: [2000.12.31.000000, 2001.01.13.000000)
         (instead of [2000.12.31.000000, 2001.01.12.235959] ). </P>

      <P> 10. Its easy to check for the completeness of a number of ranges: ["a".
         "f"), ["f". "k"), ["k". "o"), ["o". "q"), ["q". "u"), ["u". "w"), ["w". "z")
         (instead of ["a". "e"], ["f". "j"], ["k". "n"], ["o". "p"], ["q". "t"],
         ["u". "v"], ["w". "z"] ). </P>

      <P> 11. For dates there is a maximum (infinity), which is for example
         9999.12.31. In some range algorithms (for example "split range") there is
         created a new range just after another. So you have [b1, e1) and you want to
         create [e1, e2). In such algorithms, [e1, e2) is discarded, if e1 >= e2. The
         case that e1 = 9999.12.31 works perfect in these algorithms. (In case of
         []-ranges 9999.12.31 needs a special treatment, because 9999.12.31 + 1 is
         not possible.) </P>

      <P> 12. The +1 and -1 are almost never used, when handling ranges. This is an
         advantage if they are expensive (as it is for dates). </P>

      <P> 13. If you write a find in a range, the fact that there was nothing found
         can easily indicated by returning the end as the found position: if( find(
         [begin, end) ) == end ) nothing found. </P>

      <P> 14. If there is an insert_at function, appending at the end is just an
         insert_at( [begin, end), end) like appending at the beginning is a
         insert_at( [begin, end), begin). </P>

      <P> 15. In languages, which start the array subscripts with 0 (like C, C++,
         JAVA, NCL) the upper bound is equal to the size. </P>

      <H2> it's not a disadvantage that ... </H2>

      <P> 0. unexperienced programmers get confused. if bad programmers are confused,
         they will work slower and hence will produce less bad code. </P>

      <P> 1. you have to translate ranges for end-users. in any case the range [A,
         KZZZZZZZZZZZZZZZZZZZZZZZ] and [A, L) have to be translated to "A, K". </P>
     
      <HR>
      
      <P> <A HREF="index.htm">the silicon brain home page</A> </P>
      <P> <A HREF="mailto:info@siliconbrain.com">contact 
         SiliconBrain</A></P>
      
      <ADDRESS>
         <A HREF="mailto:info@siliconbrain.com">info@siliconbrain.com</A> 
      </ADDRESS>
      
      <P> this web page was designed by joerg kunze.</P>

      <P> copyright; 2000, 2006 joerg kunze </P>

      <P> this web page is free software; you can redistribute it and/or 
         modify it under the terms of the GNU General Public License
         as published by the Free Software Foundation; either version 2
         of the license or (at your opinion) any later version.</P>

      <P> this web page is distributed in the hope that it is useful, but
         <STRONG>without any warranty</STRONG>; without even the implied warranty of
         <STRONG>merchantability</STRONG> or <STRONG>fitness for a particular
         purpose</STRONG>. see the GNU General Public License for details.</P>

      <P> you should have received a <A HREF="gpl.html">copy of the GNU 
         General Public License</A>
         along with these web page; if not, write to the
         <A HREF="http://www.fsf.org/">Free Software Foundation</A>, Inc.,
         59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.</P>

      <hr>
      <pre>
$Log: ranges.htm,v $
Revision 1.2  2006/04/29 00:24:44  joerg
add Header and Log cvs keywords

      </pre>


   </BODY>
</HTML>