\input texinfo
@comment %**start of header
@setfilename siliconBrain.info
@settitle siliconBrain
@comment %**end of header
@c Copyright (C) 2002, 2003, 2004 Joerg Kunze
@c Permission is granted to copy, distribute and/or modify this document
@c under the terms of the GNU Free Documentation License, Version 1.1 or
@c any later version published by the Free Software Foundation; with no
@c Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.
@c Basic Installation
@set siliconBrainRelease $siliconBrainRelease: 0.2.3 $
@set siliconBrainRcsIdentifier $Id: siliconBrain.main.texinfo,v 1.55 2004/12/14 23:31:27 joerg Exp $
@set siliconBrainSaveStamp $siliconBrainSaveStamp: 2004/12/14 22:37:42, Joerg Kunze$
@include temporary/releaseInformation.texinfo
@titlepage
@title siliconBrain
@subtitle A generic open source make environment
@subtitle Version: @value{release}, @value{releaseDate}
@author Joerg Kunze
@page
@vskip 0pt plus 1filll
Copyright @copyright{} 2002, 2003, 2004
Joerg Kunze
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or
any later version published by the Free Software Foundation; with the
Invariant Section being ``History'', with no Front-Cover Texts, and with
no Back-Cover Texts. A copy of the license is included in the section
entitled ``GNU Free Documentation License''.
Version Information:@*
@value{siliconBrainRelease}@*
@value{siliconBrainRcsIdentifier}@*
@value{siliconBrainSaveStamp}
@end titlepage
@summarycontents
@contents
@c ---------------------------------------------------------------------------------------*
@c Top: *
@c ---------------------------------------------------------------------------------------*
@node top
@top siliconBrain
@include longDescription.snippet.texinfo
Version Information:@*
@value{siliconBrainRelease}@*
@value{siliconBrainRcsIdentifier}@*
@value{siliconBrainSaveStamp}
@menu
* Introduction:: A first glance
* Usage:: Usage
* Principles of Operation:: Principles of Operation
* Philosophy:: All my Ideas for a standard project
* Commands:: Commands
* Installation:: Installation
* GNU GENERAL PUBLIC LICENSE:: The GNU General Public License
* GNU Free Documentation License:: The GNU Free Documentation License
* Concept Index:: A menu covering many topics
@end menu
@c ****************************************************************************************
@c *
@c Introduction: *
@c *
@c ****************************************************************************************
@node Introduction
@chapter Introduction
@include introduction.snippet.texinfo
@c ****************************************************************************************
@c *
@c Usage: *
@c *
@c ****************************************************************************************
@node Usage
@chapter Usage
@cindex usage
This chapter describes how to use the siliconBrain project environment
to create and handle packages. This is the user manual.
@menu
* HowTo:: A step by step introduction
* Prerequisites:: What you need before you can start.
* Files:: Description of generated and self written files
* DocumentationUsage:: How to organize your documents
* Configuration:: How to specify the configuration of a package
@end menu
@c ---------------------------------------------------------------------------------------*
@c HowTo: *
@c ---------------------------------------------------------------------------------------*
@node HowTo
@section A step by step introduction
@cindex how to
This section describes how to create and compile a project
based on the siliconBrain make environment.
@subsection Set up your package root directory
@subsubsection Create a package root directory
@cindex setup a project
@cindex create first directory
The first thing to do is to create a directory that will contain
the complete sources of your package as well as all generated and
compiled files. The directory name automatically becomes your
package name. So to start a new siliconBrain project ``myProject'',
run:
@example
cd @var{directoryWhereYourProjectsdirShouldBeCreated}
mkdir myProject
cd myProject
@end example
@subsubsection set up the environment to use siliconBrain
@cindex environment
To use siliconBrain, it must be active in your environment. To
activate it, execute:
@example
export siliconBrainPath=@var{dirWhereSiliconBrainIsInstalled}
. $siliconBrainPath/setEnvironment
@end example
@var{dirWhereSiliconBrainIsInstalled} is where you installed
siliconBrain. These two lines are also printed after you have called
@samp{siliconBrainInstall} from inside the siliconBrain distribution.
You can put these lines into your @file{.bash_profile}, or into
@file{/etc/profile} if siliconBrain should be available to all users
on your system. You can also create a special @command{xterm} script
in which the environment is set. That way you can selectively activate
siliconBrain without polluting your standard environment.
@subsubsection create @file{makefile}
@cindex makefile
The only thing your makefile must contain is:
@example
include $(siliconBrainPath)/makefile
@end example
You can combine this step and the previous one if you only want to
call siliconBrain features from within your @file{makefile}. Then you
can leave your environment as it is and instead put two lines in your
@file{makefile}:
@example
export siliconBrainPath=@var{dirWhereSiliconBrainIsInstalled}
include $(siliconBrainPath)/makefile
@end example
@subsubsection create the frame of an empty project
@cindex directory tree (create)
Now that the root directory of your project exists, run
@command{make} for the first time. This sets up a directory tree and a
couple of files that every project should have.
@example
cd myProject
make
@end example
In your project's root directory you will now have the following files
and directories:
@example
AUTHORS
COPYING
COPYING.DOC
ChangeLog
INSTALL
INTRODUCTION
NEWS
README
RELEASE
TODO
distribution
documentation
makefile
siliconBrainInstall
temporary
@end example
You will never edit these files except @file{ChangeLog}. The meaning
of most of these files is described in
@file{README}. @file{distribution} contains all things which are
needed to use your package. @file{siliconBrainInstall} is a program
(actually it is a bash script), which installs your package
elsewhere.
In @file{documentation} you will now have the following files:
@example
install.snippet.texinfo
news.snippet.texinfo
shortDescription.snippet.texinfo
todo.snippet.texinfo
introduction.snippet.texinfo
readme.snippet.texinfo
myProject.main.texinfo
authors.snippet.texinfo
longDescription.snippet.texinfo
shortCopyingDoc.snippet.texinfo
standardFiles.snippet.texinfo
@end example
Here @file{myProject.main.texinfo} is a frame into which you put your
main documentation. For all the @emph{snippets}, see @ref{DocumentationUsage}.
@subsection First documentation steps
Some of the documents that are created as frames for your writing are
used in a lot of places to generate files, docs, html, @enddots{} So you
must not delete them, and some of them should have reasonable
content.
@table @file
@item shortDescription.snippet.texinfo
Put a one liner in here. This line will go into a lot of places.
@item introduction.snippet.texinfo
Here you put several pages of text that describe to a beginner what
your package is about.
@item myProject.main.texinfo
This will be your package's manual.
@item authors.snippet.texinfo
List the people who worked on this project.
@item longDescription.snippet.texinfo
Put a five liner here, to say what your package is.
@end table
@subsection A first program
In the project's root directory, write your hello world program in a
file named @file{helloWorld.main.c}. Before or after that, write a
program called @dfn{testMain}, which calls your helloWorld. Normally it
is implemented in @file{testMain.main.c} or @file{testMain.bash}.
If @command{testMain} exists, it is called automatically by the
@file{makefile}. Now you can compile @emph{and} run your program by
just entering @command{make}.
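As a minimal sketch of what a @file{testMain.bash} could contain, the
following script defines a hypothetical helper @code{checkOutput},
which runs one program and compares what it prints with an expected
text. The helper name and its messages are assumptions made for this
illustration; siliconBrain itself only expects that a
@command{testMain} exists.
@example
#!/bin/bash
# Sketch of a test driver. checkOutput is a hypothetical helper,
# it is not provided by siliconBrain.
checkOutput() (
    program=$1
    expected=$2
    actual=$("$program")
    if [ "$actual" = "$expected" ]; then
        echo "ok: $program"
    else
        echo "FAILED: $program printed '$actual'" >&2
        exit 1
    fi
)
@end example
A call such as @code{checkOutput ./distribution/programs/helloWorld "hello world"}
at the end of the script would then test the program. The path to the
binary is an assumption and depends on where @command{make} places it.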
@subsection Create a web version of your sources
To create a web version of your sources, run @command{make web}.
@c ---------------------------------------------------------------------------------------*
@c Prerequisites: *
@c ---------------------------------------------------------------------------------------*
@node Prerequisites
@section Prerequisites
@subsection tools
To run siliconBrain you need:
@table @asis
@item GNU make
@itemx Texinfo
@itemx emacs
@itemx bash
@itemx gawk
@itemx GNU m4
@itemx perl
@itemx glib-2.0
@itemx siliconBrainLib
@end table
@subsection skills
To use siliconBrain you should be familiar with:
@table @asis
@item Texinfo
The complete documentation is based on Texinfo. All @acronym{html} files or
printed output or @acronym{ascii} text files are generated out of Texinfo
files.
@end table
@subsection What we do not use
@subsubsection autoconf, automake and libtool
We do not use autoconf, automake and libtool. The reason is that
siliconBrain is @emph{not} used to create portable software. The
approach of siliconBrain is to use all the beautiful and advanced
features of Linux and the GNU tools. When someone wants to use a
siliconBrain package, she should switch to a GNU/Linux
system. siliconBrain does not want to make compromises just to be
portable to other platforms.
So we do not intend to write portable @command{sh} scripts or
portable @file{makefile}s. Instead we use the full features of bash
and GNU make. The same is true for all other programs we use, including
the operating system itself.
It is also great fun to use the GNU extensions, which would be reason
enough.
@c ---------------------------------------------------------------------------------------*
@c Files: *
@c ---------------------------------------------------------------------------------------*
@node Files
@section Files
siliconBrain projects can have the following kinds of files:
@subsection self written files
These are the files you edit and which
should go into @acronym{cvs}. These are the sources of your project. They are
all collected in the root directory of the project.
@subsection templated files
These are generated by siliconBrain if and only if
they do not exist yet, for example during the first initialization of a
project, or when a new version of siliconBrain includes additional
templated files. Templated files can be edited and changed. They should
be registered in @acronym{cvs}. They are like self written files, and so they are
part of the source of your project. But you should never delete them,
because other features generated by siliconBrain rely on their
existence.
@subsection target files
These are, for example, your executables and your @acronym{html} and info docs. They are placed in:
@table @file
@item data
Data needed to run the program. But only static data.
@item documentation
All kinds of documents.
@item programs
Contains all programs. These can be bash or awk
scripts, emacs lisp files, binaries and libraries.
@end table
@subsection temporary files
For example @file{*.o}. These files are only needed during the @command{make}
process and are not installed with @command{install}.
@subsection capital files
Capital files are @file{COPYING}, @file{COPYING.DOC}, @file{AUTHORS},
@file{INSTALL}, @file{INTRODUCTION}, @file{NEWS}, @file{README},
@file{TODO}.
These files are generated from templated texinfo snippets. You can
edit these snippets, but never change the capital files
themselves. Because the snippets are templated, you already have a good
idea of their structure.
Capital files are so important for understanding a distribution of
your project that they are not placed among the target files. But
they @emph{are} targets. Their names are capitalized so that they are
easy to find: they are sorted to the beginning of a
normal @command{ls}.
You find these files in other, non siliconBrain open source
packages as well. They are the files that are required or created by
@command{automake} (although we don't use @command{automake}).
These files should be registered in @acronym{cvs}, so that someone who checks
out a siliconBrain project can immediately read them
without building the project and without having to locate the underlying
texinfo snippets.
So these files are like sources, as they are in @acronym{cvs} and in the
project's root directory. They are like targets, as they are generated
by @command{make} and are never edited.
To say it again: all changes you make in a capital file will
vanish during the next @command{make}. Edit the underlying snippet
instead, for example @file{readme.snippet.texinfo} for @file{README}.
@file{COPYING} and @file{COPYING.DOC} are special in that they are
not generated from
snippets that lie in your project's directory. Instead they are
copied from files in siliconBrain. The idea behind this is that you should
not edit these files, as they reflect the current versions of GNU's
@acronym{gpl} and @acronym{fdl}.
@subsection the RELEASE file
This file just contains the release number. For example
@code{0.1.42}. This file should be registered in @acronym{cvs}.
This file is used by @command{make public}, which increases the last
number and automatically creates a @acronym{cvs} tag for the new release. If you want to
increase the first or second number, you have to edit this file.
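The increase of the last number can be pictured with the following
sketch. It is not the code used by @command{make public}; the function
name @code{nextRelease} is hypothetical, and the sketch assumes that
the last number is written without leading zeros.
@example
# nextRelease RELEASE: print RELEASE with its last number
# increased by one (hypothetical helper, for illustration only).
nextRelease() (
    release=$1
    # Everything up to and including the last dot, e.g. "0.1."
    prefix=$(printf '%s' "$release" | sed 's/[0-9]*$//')
    # The digits after the last dot, e.g. "42"
    last=$(printf '%s' "$release" | sed 's/.*\.//')
    printf '%s%s\n' "$prefix" "$((last + 1))"
)
nextRelease 0.1.42
@end example
For the example release @code{0.1.42} this prints @code{0.1.43}. The
first and second number are left untouched, which matches the rule
that they are only changed by editing @file{RELEASE} by hand.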
@c ---------------------------------------------------------------------------------------*
@c DocumentationUsage: *
@c ---------------------------------------------------------------------------------------*
@node DocumentationUsage
@section How to organize your documents
This section describes how to create and organize any kind of
document. This includes man pages, info files, html docs, printed
output and the output of programs.
@subsection Snippets
@dfn{Snippets} are Texinfo files which are intended to be included
into other Texinfo documents. They are building blocks of text.
So they do not include standard Texinfo
header information. An example of a snippet is
@file{authors.snippet.texinfo} which can look like this:
@example
The following persons have contributed to this package:
@@itemize
@@item
@@email@{joerg@@@@siliconbrain.com, Joerg Kunze@}
@@end itemize
@@c $siliconBrainRelease: 0.2.3 $
@@c $Id: siliconBrain.main.texinfo,v 1.55 2004/12/14 23:31:27 joerg Exp $
@@c $siliconBrainSaveStamp: 2004/12/14 22:37:42, Joerg Kunze$
@end example
The revision control information like @code{$siliconBrainRelease} is
just held in comments. They are not inside @samp{@@set} commands so
they don't overwrite settings done in the main document, which
includes snippets.
Snippet files are named with the two level extension
@file{.snippet.texinfo}, so that @command{make} knows what to do with
them. The extension @file{.texinfo} indicates that
they are Texinfo files, that emacs should highlight them
accordingly, and that they are processed with, for example,
@command{makeinfo}. The part @file{.snippet} tells
@command{make} that they are snippets.
One common way to use snippets as the building blocks of your
documents is just to include them in your main Texinfo files:
@example
@@include authors.snippet.texinfo
@end example
In that way you avoid redundancies in your Texinfo documents. But what
distinguishes snippets from other included Texinfo files is that
siliconBrain's @file{makefile} automatically generates a
text and a html version of them in @file{distribution/data}. So in our example
we will have the two files:
@example
distribution/data/authors.snippet.html
distribution/data/authors.snippet.text
@end example
If your application program reads and displays these files, its
output is based on the same Texinfo files as your documentation. This
further reduces the redundancy of document
information.
So for example we have the line
@example
cat "distribution/data/longDescription.snippet.html"
@end example
inside @samp{webify.bash} to reuse the same documentation part in
the creation of your project's web page.
If you want to use the snippet not while building a package, but while
running it, you should write:
@example
cat "$siliconBrainPath/data/longDescription.snippet.html"
@end example
if you reuse document parts from siliconBrain or
@example
cat "$yourPackagePath/data/longDescription.snippet.html"
@end example
if it is a part of your package.
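A program that displays such a generated file should cope with the
file being absent. The following is a minimal sketch of a hypothetical
bash helper @code{showSnippet}. The function name and the error
message are assumptions; only the @file{data} directory and the
@file{.snippet.text} naming are taken from the description above.
@example
# showSnippet DATADIR NAME: print the text version of a snippet.
# Hypothetical helper, not part of siliconBrain.
showSnippet() (
    file="$1/$2.snippet.text"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "snippet not found: $file" >&2
        exit 1
    fi
)
@end example
For example, @code{showSnippet "$yourPackagePath/data" shortDescription}
could be used for the startup message of a program.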
There is the special case of standard snippets, which are used to
create the capital files like @file{AUTHORS}. The capital files are
created by copying the generated text version of the corresponding
snippet into the project's root directory.
@table @file
@item AUTHORS
Created from @file{authors.snippet.texinfo}. Here you should describe
the authors who have contributed to your package.
@item INSTALL
Created from @file{install.snippet.texinfo}. Here you find the
description of how to install your package. Normally you can leave
this file as it has been created by siliconBrain's @file{makefile}.
@item INTRODUCTION
Created from @file{introduction.snippet.texinfo}. Here you should put
a short introduction to your package.
@item NEWS
Created from @file{news.snippet.texinfo}. Describe what is new in
your project.
@item README
Created from @file{readme.snippet.texinfo}. This file should be read
first by someone, who is new to your package.
@item TODO
Created from @file{todo.snippet.texinfo}. Open issues in your package.
@end table
When you create a new project, these snippets are automatically copied
into your project's root directory from templates. You should never
edit the capital files yourself (except @file{RELEASE}, which is
not generated from Texinfo snippets). Always edit the corresponding
Texinfo snippet.
There are other snippets, which are copied into your project from
templates:
@table @file
@item longDescription.snippet.texinfo
This file contains a long (one paragraph) description of your
package. This is included in the readme and other docs.
@item shortCopyingDoc.snippet.texinfo
This is a short version of the GNU Free Documentation License. This is
included in a lot of docs, because all docs of a siliconBrain project
should be published under that license.
@item shortDescription.snippet.texinfo
A one liner to describe your package. It is included in
longDescription and many others. This can be used in the startup
message of your programs as well.
@item standardFiles.snippet.texinfo.template
A list of the standard files in your distribution: the capital
files. This file is included in readme and normally should not be
changed.
@end table
All these files are also included in your main Texinfo document, which
is used for printed output and as the manual on your web site.
Using @dfn{snippets} thus guarantees that the capital files (like
@file{README}), your web pages, the printed manual and your programs'
output contain the same information, with no redundancy and
no double work.
@c ---------------------------------------------------------------------------------------*
@c Configuration:: *
@c ---------------------------------------------------------------------------------------*
@node Configuration
@section How to specify the configuration of a package
The configuration parameters of a package are like command options
that are valid for all commands of a package. There are three sources
of command line options: options defined by siliconBrain,
like @code{--help}, which are always available for all commands;
options defined for the package, which are valid for all
commands belonging to that package; and finally, command specific
options.
The first group, the options defined by siliconBrain, does not
have to be specified, and in fact cannot be.
The second group, the package general options, and the third, the
command specific options, are specified in exactly the same way. If the
package name is @code{packageName}, then the options of the command
@code{packageName} are the package general options. This works
independently of whether this command really exists. But if
there is a command with the name of the package, then this command is
treated as @emph{the main command} of the package and all its options
are the package general options.
The specification of options works via so called executable
specifications. Currently they have to be written in @acronym{c}. For
the command @code{myCommand} there is the program
@code{myCommand.specification.c}. This is a program containing a
@code{main} function. When it is called, it outputs the option
specification as @acronym{xml}. Because the configuration parameters
are just the options of a possibly virtual command with the name of
the package, @code{packageName.specification.c} contains the
configuration parameter specification. This specification is
indistinguishable from normal options. These sources are automatically
detected by siliconBrain's @file{makefile} and compiled into the
correct directory, which by the way is @file{temporary/}. This
program should accept the option @code{--xml} and, at least when this
option is given, should output @acronym{xml}.
The output looks like this:
@example
<command
name="siliconBrain"
release="$siliconBrainRelease: 0.2.3 $"
rcsIdentifier="$Id: siliconBrain.main.texinfo,v 1.55 2004/12/14 23:31:27 joerg Exp $"
saveStamp="$siliconBrainSaveStamp: 2004/08/06 18:39:38, Joerg Kunze$"
title="siliconBrain main configuration.">
<shortDescription> general datahandling application framework </shortDescription>
<longDescription> Based on some specification a general frame of standard interfaces ins generated </longDescription>
<option name="publishTargetHost" type="value">
<shortDescription> IP address or domain name where package should be put to </shortDescription>
<longDescription> Host to which the complete package is webified. This includes FTP directories for the tar.gz of this package and a directory for the HTML documentation and the syntaxhighlighted sources. </longDescription>
</option>
<option name="publishTargetArchive" type="value" oneCharacterName="a">
<shortDescription> directory on the publishTargetHost, where the tar.gz should be put to. </shortDescription>
<longDescription> this directory should already exist. It is reachable from where the FTP command puts one when logging in. </longDescription>
</option>
<option name="publishTargetWeb" type="value">
<shortDescription> directory on the publishTargetHost, where the HTMLilized sourcetree should be put to. </shortDescription>
<longDescription> this directory should already exist. It is reachable from where the FTP command puts one when logging in. </longDescription>
</option>
</command>
@end example
Here @code{type} can be @samp{value} or @samp{flag}.
The easiest way to implement this is to use siliconBrainLib's Printer:
@example
#define siliconBrainPrintShortNames
#include "siliconBrainLib"
static const char *siliconBrainRelease = "$siliconBrainRelease: 0.2.3 $";
static const char *siliconBrainRcsIdentifier = "$Id: siliconBrain.main.texinfo,v 1.55 2004/12/14 23:31:27 joerg Exp $";
static const char *siliconBrainSaveStamp = "$siliconBrainSaveStamp: 2004/12/14 22:37:42, Joerg Kunze$";
int main( int argc, const char *argv[] ) @{
SiliconBrainPrinter siliconBrainPrinter;
siliconBrainPrinterInit( &siliconBrainPrinter, argc, argv );
tag( "command" );
attribute( "name" , "configuration" );
attribute( "release", siliconBrainRelease );
attribute( "rcsIdentifier", siliconBrainRcsIdentifier );
attribute( "saveStamp", siliconBrainSaveStamp );
attribute( "title", "siliconBrain main configuration." );
tag( "shortDescription" );
text( "general datahandling application framework" );
end();
tag( "longDescription" );
text( "Based on some specification a general frame of standard interfaces ins generated" );
end();
tag( "option" );
attribute( "name" , "publishTargetHost" );
attribute( "type" , "value" );
tag( "shortDescription" );
text( "IP address or domain name where package should be put to" );
end();
tag( "longDescription" );
text(
"Host to which the complete package is webified. This includes FTP directories for the tar.gz of this package "
"and a directory for the HTML documentation and the syntaxhighlighted sources."
);
end();
end();
tag( "option" );
attribute( "name" , "publishTargetArchive" );
attribute( "type" , "value" );
tag( "shortDescription" );
text( "directory on the publishTargetHost, where the tar.gz should be put to." );
end();
tag( "longDescription" );
text(
"this directory should already exist. It is reachable from where the FTP command puts one "
"when logging in."
);
end();
end();
tag( "option" );
attribute( "name" , "publishTargetWeb" );
attribute( "type" , "value" );
tag( "shortDescription" );
text( "directory on the publishTargetHost, where the HTMLilized sourcetree should be put to." );
end();
tag( "longDescription" );
text(
"this directory should already exist. It is reachable from where the FTP command puts one "
"when logging in."
);
end();
end();
end();
return 0;
@}
@end example
In this way the option @code{--xml} is handled, and the output is pretty
printed, indented, and colorized if stdout is a tty. With other output
options like @code{--bash}, other output syntaxes are possible.
Taking the option @code{publishTargetHost} as an example, it
can now be given on the command line:
@example
myCommand --publishTargetHost=siliconbrain.com
@end example
can be given as an environment variable:
@example
export packageName_publishTargetHost=siliconbrain.com
@end example
or in an @acronym{xml} outputting executable configuration:
@example
<packageName>
<!-- publishTargetHost (value) : IP address or domain name where package should be put to -->
<publishTargetHost> siliconbrain.com </publishTargetHost>
<!-- publishTargetArchive (value) : directory on the publishTargetHost, where the tar.gz should be put to. -->
<publishTargetArchive> ftp/anon/pub </publishTargetArchive>
<!-- publishTargetWeb (value) : directory on the publishTargetHost, where the HTMLilized sourcetree should be put to. -->
<publishTargetWeb> www/htdocs </publishTargetWeb>
<!-- help (flag ) : Print a short help message, listing the options. -->
<!-- verbose (flag ) : Let this command talk to you a lot. -->
<!-- version (flag ) : Display version information. -->
<!-- output (value) : File to which output is written. -->
<!-- complete (flag ) : Indicate completeness of specified options. No further lookup in configuration chain. -->
</packageName>
@end example
These executable configurations are searched for in the directories
specified in the environment variable
@code{$siliconBrainConfigurationPath}, which has the default
@code{.:~:/etc:}, where the last entry of length 0 means that a
configuration is searched for via @code{$PATH}. The executable configurations have the
name @code{packageName.configuration}. They can be written in any
language. Again they can use siliconBrainLib's printer:
@example
#define siliconBrainPrintShortNames
#include "siliconBrainLib"
// static const char *siliconBrainRelease = "\$siliconBrain" "Release: 0.0.7 \$";
// static const char *siliconBrainRcsIdentifier = "\$I" "d: myProject.configuration.main.c,v 1.3 2003/06/07 21:59:58 joerg Exp \$";
// static const char *siliconBrainSaveStamp = "\$siliconBrain" "SaveStamp: 2003/06/17 23:47:40, Joerg Kunze\$";
int main( int argc, const char *argv[] ) @{
SiliconBrainPrinter siliconBrainPrinter;
siliconBrainPrinterInit( &siliconBrainPrinter, argc, argv );
tag( "myProject" );
keyBool( "standardFlag", true );
keyBool( "verbose" , true );
key( "oops" , "well, ehm ..." );
end();
return 0;
@}
@end example
or they can use other means, like the trivial configuration
implementation:
@example
#!/bin/bash
cat <<EOF
<siliconBrain>
<!-- publishTargetHost (value) : IP address or domain name where package should be put to -->
<publishTargetHost> siliconbrain.com </publishTargetHost>
<!-- publishTargetArchive (value) : directory on the publishTargetHost, where the tar.gz should be put to. -->
<publishTargetArchive> ftp/anon/pub </publishTargetArchive>
<!-- publishTargetWeb (value) : directory on the publishTargetHost, where the HTMLilized sourcetree should be put to. -->
<publishTargetWeb> www/htdocs </publishTargetWeb>
<!-- complete (flag ) : Indicate completeness of specified options. No further lookup in configuration chain. -->
<complete/>
</siliconBrain>
EOF
@end example
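The search through @code{$siliconBrainConfigurationPath} can be
pictured with the following sketch. It is an illustration written for
this manual, not siliconBrain's own lookup code. The function name
@code{findConfiguration} is hypothetical, and the sketch assumes that
the @code{$PATH} lookup triggered by an empty entry happens after all
named directories have been tried.
@example
# findConfiguration PACKAGE SEARCHPATH: print the first executable
# PACKAGE.configuration found (hypothetical helper).
findConfiguration() (
    name="$1.configuration"
    searchPath=$2
    # An empty entry (leading, trailing or doubled colon)
    # requests an additional lookup via $PATH.
    case ":$searchPath:" in
        *::*) usePath=yes ;;
        *)    usePath=no ;;
    esac
    IFS=:
    set -f
    for directory in $searchPath; do
        [ -n "$directory" ] || continue
        # A literal ~ from the default is replaced by $HOME.
        [ "$directory" = "~" ] && directory=$HOME
        if [ -x "$directory/$name" ]; then
            echo "$directory/$name"
            exit 0
        fi
    done
    if [ "$usePath" = yes ]; then
        command -v "$name" && exit 0
    fi
    exit 1
)
@end example
With the default, a call would look like
@code{findConfiguration packageName ".:~:/etc:"}.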
At each stage, the @code{--complete} option blocks any further
lookup. If it is given on the command line, no environment variables
are inspected; if it is given in the environment, no lookup in
configurations is done.
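The order of the three stages and the effect of @code{--complete} can
be summarized in a small sketch. The function @code{resolveOption} is
hypothetical and only models the rule described here; how the real
programs obtain the three candidate values is not shown.
@example
# resolveOption CMDLINE CMDLINE_COMPLETE ENV ENV_COMPLETE CONFIG
# An empty string means "not given at this stage"; the two
# COMPLETE arguments are "yes" or "no". Hypothetical helper.
resolveOption() (
    # Stage 1: the command line
    if [ -n "$1" ]; then echo "$1"; exit 0; fi
    [ "$2" = yes ] && exit 1
    # Stage 2: the environment
    if [ -n "$3" ]; then echo "$3"; exit 0; fi
    [ "$4" = yes ] && exit 1
    # Stage 3: the executable configuration
    if [ -n "$5" ]; then echo "$5"; exit 0; fi
    exit 1
)
@end example
So @code{resolveOption "" no "" no siliconbrain.com} prints the value
from the configuration, while @code{resolveOption "" yes "" no siliconbrain.com}
prints nothing and fails, because the lookup stops after the command line.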
The program @code{packageName.configurationReader} handles all options
like a normal command and outputs the result in @acronym{xml},
@code{bash} or @code{perl} syntax.
The result can be pumped into the environment with:
@example
eval "$(packageName.configurationReader -- --bashEnvironment)"
@end example
With:
@example
eval "$(packageName.configurationReader $* -- --bash)"
@end example
The configuration can be used inside a @code{bash} script.
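The following sketch shows this pattern with a stand-in function,
because the exact text printed by the configuration reader is not
reproduced in this manual. That the reader prints plain shell variable
assignments is an assumption, and @code{fakeConfigurationReader} is a
hypothetical name.
@example
#!/bin/bash
# Stand-in for: packageName.configurationReader -- --bash
# Assumption: the reader prints shell variable assignments.
fakeConfigurationReader() (
    echo 'publishTargetHost="siliconbrain.com"'
    echo 'publishTargetArchive="ftp/anon/pub"'
)
# Turn the printed assignments into variables of this script.
eval "$(fakeConfigurationReader)"
echo "target: $publishTargetHost, archive: $publishTargetArchive"
@end example
The double quotes around the command substitution keep the printed
text intact until @command{eval} parses it.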
@c ****************************************************************************************
@c *
@c Principles of Operation: *
@c *
@c ****************************************************************************************
@node Principles of Operation
@chapter Principles of Operation
In this chapter I describe the exact details of how
siliconBrain works. Read this chapter if you want to know everything.
@menu
* Directory:: Directory tree
@end menu
@c ---------------------------------------------------------------------------------------*
@c Directory: *
@c ---------------------------------------------------------------------------------------*
@node Directory
@section Directory
The main idea for the directory tree is that all generated output is
placed somewhere below @file{temporary} or @file{distribution}. @dfn{Generated} here means
the output of a compile, a @command{cp} from another file, the
result of a @command{sed} or @command{awk} script, the result of
@command{makeinfo} or @command{texi2dvi} or whatever. All files that
can be reproduced automatically, or that are produced as a side
effect of some action, should go into @file{temporary} or
@file{distribution}.
@file{distribution} is used for results that will be needed later,
when your package is used. @file{temporary} is for interim results,
which are used during the build process only.
There are some exceptions:
@table @file
@item capital files
These are the files @file{AUTHORS}, @file{COPYING},
@file{COPYING.DOC}, @file{INSTALL}, @file{INTRODUCTION}, @file{NEWS},
@file{README} and @file{TODO}. These files are generated from
Texinfo files. But they are important for understanding a package after it
is downloaded, and so they are placed in the project's root dir and
are registered in CVS.
@item CVS
siliconBrain projects are handled by @acronym{cvs}. And so there has to be a
@file{CVS} directory.
@item RELEASE
This file is sometimes updated automatically. But the major release
numbers are edited manually.
@item TAGS
This is the emacs tag file. It should be placed in the current dir.
@item *~
Emacs backup files are placed near the source.
@item *.log
Some log files go to the current dir. We should avoid that, but it
was not always possible.
@end table
As a consequence @command{make clean} does the following:
@example
export siliconBrainTemporary := distribution\|temporary\|.*\.log$$\|TAGS\|.*~$$
@dots{}
-rm --force --recursive $$(ls | grep $$siliconBrainTemporary)
@end example
@samp{$siliconBrainTemporary} is also used by other scripts, which
want to exclude temporary files, like @command{webify} or @command{publish}.
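To see which names the pattern selects, it can be tried on a made up
list of file names. The sketch below uses the pattern from above with
make's @code{$$} written as a plain @code{$}; the file names are
invented for the illustration, and GNU @command{grep} is assumed
because of the @code{\|} alternation.
@example
# The pattern from the makefile, as the shell sees it.
siliconBrainTemporary='distribution\|temporary\|.*\.log$\|TAGS\|.*~$'
# An invented directory listing instead of a real ls.
printf '%s\n' makefile temporary build.log TAGS notes.txt notes.txt~ |
    grep "$siliconBrainTemporary"
@end example
This selects @file{temporary}, @file{build.log}, @file{TAGS} and
@file{notes.txt~}, and leaves @file{makefile} and @file{notes.txt}
alone. Note that the alternatives @code{distribution},
@code{temporary} and @code{TAGS} are not anchored, so any name that
merely contains one of these words is selected as well.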
@c ****************************************************************************************
@c *
@c Philosophy: *
@c *
@c ****************************************************************************************
@node Philosophy
@chapter Philosophy
@cindex Philosophy
In my long life as a programmer, I have collected a huge number of ideas about how a good software
project for a normal business application should be structured. With siliconBrain I
implement all these ideas in automated tools.
@menu
* basicPrinciples:: The main basic ideas and principles
* testPhilosophy:: Test is the single most important topic
* processPhilosophy:: Process, most important dynamic unit
* commandPhilosophy:: Functionality is available by commands
* Documents:: How a siliconBrain project is documented
* Tools:: Which tools are required and used
* Source Tree:: Structure of the source tree
* Installation Principles:: How to install a siliconBrain project
* Nameing:: Definition of naming conventions
* Generators:: Differences from the 1980s case tools and generators
* HTML:: Thoughts about web applications
* Performance:: Performance is one of the most important things
* Java:: @acronym{java}
* ConfigurationPhilosophy:: Configuration
@end menu
@c ---------------------------------------------------------------------------------------*
@c basicPrinciples: *
@c ---------------------------------------------------------------------------------------*
@node basicPrinciples
@section The main basic ideas and principles
There are a couple of main ideas behind all the other topics in this
chapter. The others can be considered as derived from these basic
principles. Here is a list:
@itemize
@item
@dfn{Runnable from CD-ROM}: it should be possible to execute a piece
of software on a computer without polluting it by a complex installation, which
floods several directories with files.
@item
@dfn{paradigm of total automation}:
@item
@dfn{redundancy is Satan}:
@item
@dfn{quality is velocity of change}:
@item
@dfn{the ``invent a new OS'' anti pattern}: this is against JAVA
because @acronym{j2ee} @emph{is} an operating system.
@item
@dfn{compile time decisions always better than run time decisions}:
this is against JAVA, because @acronym{java} is interpreted and uses
things like reflection @acronym{api}. This is a strong advantage of generators.
@end itemize
@c ---------------------------------------------------------------------------------------*
@c testPhilosophy: *
@c ---------------------------------------------------------------------------------------*
@node testPhilosophy
@section Test is the single most important topic
@cindex test
The single most important thing in software development is
@emph{testing}.
@cindex automatic testing
@cindex refactoring
@cindex reusage
@cindex test driver
And the most important thing about testing is @emph{automatic
testing}. Automatic testing is the basis of almost all other aims and
goals of software development. You cannot refactor a piece of code
without being able to rely on extensive tests. You cannot change a
highly reused module, if you cannot be sure that all users or clients
of that module will continue to work. A @dfn{test driver} for something is a program
whose task is to test this something automatically.
Automatic testing should take place on all levels of software: the
single C-function (better: <programming language>-function) should have
a test driver, each program, each library, each module, each
class. This includes internal or private classes as well. On the other
end, the creation of @acronym{html} pages or @command{man} entries
should be tested automatically. Every perl and bash script, every
@acronym{sql} and @acronym{xml} piece should have its own test driver.
Automatic testing should include the release, publish, build make
version and most notably the install phase of a package.
@cindex commands, testing
This is one reason for the generation of @samp{main} wrappers around
commands: then it is easy to write test @command{bash}-scripts for
those commands.
The test drivers are written in a form that is callable without a user
interface, in a kind of batch mode. Thus it is possible to start the
tests from within @command{crontab}.
And all these test drivers should be chained and bound together in one
huge test suite, so that it is a single command to be entered on the
command line to test everything.
@cindex @file{makeTest.bash}
To further force the everlasting test, siliconBrain's
@command{makefile} has an automatic call to @command{makeTest}, if it
exists in @file{distribution/programs}. Normally this is created from
a @file{makeTest.perl} or @file{makeTest.bash} but can be a program in
any language. After having built all targets, siliconBrain
automatically calls @command{makeTest}. Hence with every
@command{make} there is a test.
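What such a @file{makeTest.bash} could look like is sketched below. This
is not the script shipped with siliconBrain. The convention that every
executable whose name ends in @samp{Test} is a test driver, the function
name @code{runAllTests} and the two sample drivers are assumptions made
for illustration.

```shell
# Run every test driver found in a directory; the return status is 0 only
# if all drivers succeeded. Hypothetical convention: a test driver is an
# executable file whose name ends in "Test".
runAllTests() {
    driverDirectory=$1
    failed=0
    for driver in "$driverDirectory"/*Test; do
        [ -x "$driver" ] || continue
        if "$driver" > /dev/null 2>&1; then
            echo "passed: $(basename "$driver")"
        else
            echo "FAILED: $(basename "$driver")"
            failed=1
        fi
    done
    return "$failed"
}

# Demonstration with two invented drivers, one passing and one failing.
demoDirectory=$(mktemp -d)
printf '#!/bin/sh\nexit 0\n' > "$demoDirectory/addressTest"
printf '#!/bin/sh\nexit 1\n' > "$demoDirectory/accountTest"
chmod +x "$demoDirectory/addressTest" "$demoDirectory/accountTest"

if report=$(runAllTests "$demoDirectory"); then
    suiteStatus=0
else
    suiteStatus=1
fi
printf '%s\n' "$report"
echo "suite status: $suiteStatus"
rm -r "$demoDirectory"
```

Because the function needs no terminal and reports its result through
the exit status, the same script can be called from @command{make} and
from @command{crontab}.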
In my personal environment I have a keystroke, which invokes
@command{make} from within my editor. So often I include some calls to
programs or scripts at the beginning of my @file{makeTest.bash}, so I
can make a change with my editor, and with a single key, it is
compiled and tested.
@cindex test unit
@emph{To do} the automatic testing is more important, than to use one
of these sexy test units or frames, which are available. So, if a test
frame does help you to put your tests into @command{make} and
@command{crontab} then use it. If it is not runnable in batch, don't
use it. And if it is not available for some or all parts of your
project, write your own test drivers.
@c ---------------------------------------------------------------------------------------*
@c processPhilosophy: *
@c ---------------------------------------------------------------------------------------*
@node processPhilosophy
@section Process: most important dynamic unit
A @dfn{process} is a part of an application (a program), which (1) possibly
runs in parallel to other parts in a preemptive way, that is, it is a thread
of execution, (2) communicates with the other parts (processes) via character
streams (pipes), (3) is memory isolated from the other processes (memory
protection, region, address space) and (4) cleans up all the resources it
used when exiting.
A @acronym{unix} process definitely is a process as just
defined. On a @emph{real} UNIX (like Linux on an i386) the preemptiveness
and the memory protection are supported by the CPU.
A good choice for defining, which parts will become the processes of
an application, is the commands as described in the next section.
It is said there that one strategic property of a command is that it is
callable as a C-function in addition to being a process (which in C
means that there is a starting function @code{int main()}). This is
important to be able to use commands in critical places without
restarting a process too often.
But it is equally important to have the commands available as processes.
The char stream interface (pipe) guarantees a simple interface: an
interface that can be used from within different programming
environments. For example a structured and complex list structure in a
@acronym{lisp} program can never be read or written by a
@command{bash} script. So even when the C-function versions of
commands communicate with @acronym{c} @code{struct}s, these
@code{struct}s are just mapped onto simple char stream data areas. The
structs are used to accelerate parsing, because the stream uses a fixed
width format.
(The format is
<delimiter><fixedWidth><delimiter><fixedWidth><delimiter><fixedWidth><delimiter>@dots{}
so that the delimiters are always in the same columns and missing data
is padded with blanks. The delimiter is inserted for readability by humans
and for other programs, which can easily handle delimiters but not fixed
widths, like Microsoft's @acronym{excel}.)
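A minimal sketch of such a record, written with the shell's
@command{printf}. The field widths (10, 8 and 5 columns), the delimiter
@samp{|}, the function name @code{formatRecord} and the sample data are
assumptions for illustration, not the layout siliconBrain actually uses.

```shell
# Write one record: every field is padded with blanks or truncated to a
# fixed width, and a delimiter stands before, between and after the fields.
formatRecord() {
    printf '|%-10.10s|%-8.8s|%-5.5s|\n' "$1" "$2" "$3"
}

formatRecord 'Alice' 'Hamburg' '20095'
formatRecord 'Bob' '' '10115'          # the missing city is padded with blanks

record=$(formatRecord 'Alice' 'Hamburg' '20095')

# A fixed width reader picks the third field by column position ...
zipByColumn=$(printf '%s\n' "$record" | cut -c 22-26)
# ... a delimiter based reader picks it by counting delimiters.
zipByDelimiter=$(printf '%s\n' "$record" | cut -d '|' -f 4)
echo "$zipByColumn $zipByDelimiter"
```

Both readers find the same value, which is the point of combining fixed
widths with delimiters.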
The really extremely important thing about processes is:
they cannot disturb each other and they don't leak. And so if one of
the commands or processes (the parts of your application) is written
not too well, it does not harm the application as a whole. If a
subcommand leaks memory, all memory is back again after the command
has ended (provided, of course, that the operating system does not leak).
The prototype of this kind of architecture is the @command{bash}:
@command{bash} is a frame for a system administration application, or
a program development application (depending on the subcommands
used). Suppose @command{bash} itself neither leaks nor crashes. Then
a leaking @command{gcc}, @command{find} or @command{vi} does not
disturb the overall application. After @command{find} exits,
everything evil is gone.
Using the @code{atexit()} facility, other resources (like database
connections) can be made equally leak-proof.
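The shell shows both effects in miniature. In the following sketch the
parentheses start a subshell, which runs as a separate process, and
@code{trap @dots{} EXIT} plays the role that @code{atexit()} plays in C.
The variable names and the scratch file are invented for illustration.

```shell
state='clean'
scratchFile=$(mktemp)
childStatus=0

# The parentheses start a subshell, which runs as a separate process.
(
    # Shell counterpart of atexit(): runs whenever this process exits.
    trap 'rm -f "$scratchFile"' EXIT
    state='damaged'    # changes only the child's copy of the variable
    exit 3             # simulated crash of a badly written command
) || childStatus=$?

echo "child exit status: $childStatus"
echo "parent state: $state"
if [ -e "$scratchFile" ]; then
    echo "scratch file leaked"
else
    echo "scratch file released"
fi
```

The parent keeps its own state, learns about the failure only through
the exit status, and the resource held by the child is released although
the child ended abnormally.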
So @command{bash}'s @samp{ls -l | grep regExp} is the template, the
basic architecture, the core of each application.
In a graphical environment the command line @command{bash} would be
replaced by a graphical frame, and instead of typing the command
names, the user would click in menus, but this would invoke real
processes, which are plugged into the graphical frame as flexibly
as commands are integrated into @command{bash}.
Of course there should not be the creation of millions of processes in a
short time. But for example one process for each OK-button and each
menu item is certainly OK.
This extreme importance of processes makes it worthwhile to support them
syntactically. Again the @command{bash} is a perfect example of how
simple the syntax of command and pipe can be.
The siliconBrain frame should provide some C-macros or functions to
support bash, gawk or perl style forking.
@c ---------------------------------------------------------------------------------------*
@c commandPhilosophy: *
@c ---------------------------------------------------------------------------------------*
@node commandPhilosophy
@section Functionality is available by commands
@cindex command
All kinds of functionality are available as @dfn{commands}. As a model
you can think of them as the commands you can enter on the
shell. @dfn{Commands} implement the basic or atomic functions of a
software package. Business processes are a kind of script, which
combines several commands into an application.
Commands are the building blocks of your application. Commands will be
available in several versions: as a real shell command line
command, as a C-function and as a window. All of these versions are based
on the same C-function. They just represent different user interfaces.
@cindex pipe
Because of the shell user interface it is possible to write
bash-scripts, which combine the commands of your application. This
makes it easy to write batches or test scripts.
Commands can be combined in traditional C-code and in a pipe
fashion. They also can be combined with GNU/Linux commands inside a
C-program. Although this is less efficient, because the GNU/Linux
command will start its own process, it allows you to combine your
commands with everything already found on GNU/Linux.
@subsection Command interface
Commands have three data structures to handle:
@itemize @bullet
@item
@dfn{Options} control the behavior of the command. In the command
line version, options are the command line options like @samp{--help}.
@item
@dfn{Input} is the input data. Conceptually it is comparable to stdin.
@item
@dfn{Output} is the output data, which is comparable to stdout.
@end itemize
A command that has both input and output is called a @dfn{filter}. A
command without input is called a @dfn{source}, and one without output
a @dfn{sink}. A command having neither input nor output is called a
@dfn{function}.
An application can use @dfn{sequencing} of commands. That is one
command is executed and then another is executed independently of the
former one.
Another way to combine commands is to @dfn{pipe} them: in this case
the output of the first command is automatically handled by the second
command as input.
Commands can work in @dfn{single record} or in @dfn{list} mode. In
@dfn{single record mode} only one instance of the input and/or output
data is handled. In @dfn{list mode} many input/output records are
handled.
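In shell terms the roles and the two ways of combining commands can be
sketched as follows. The three functions are invented stand-ins, not
siliconBrain commands.

```shell
listNames() { printf '%s\n' 'carol' 'alice' 'bob'; }   # source: output only
upperCase() { tr '[:lower:]' '[:upper:]'; }            # filter: input and output
storeIn()   { cat > "$1"; }                            # sink: input only

# Sequencing: the second command starts after the first one has finished
# and does not see its output.
sequenced=$(listNames; listNames)

# Piping: the output of each command is the input of the next one.
target=$(mktemp)
listNames | upperCase | storeIn "$target"
piped=$(cat "$target")
rm -f "$target"

printf '%s\n' "$sequenced" | wc -l
printf '%s\n' "$piped"
```

The sequence simply yields the three names twice, while the pipe yields
them once, transformed by the filter and stored by the sink.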
@subsection Data commands
@dfn{Data commands} are commands that read data from outside the
application or write data to the outer world. For each data structure
there are commands to retrieve data corresponding to such a structure
and those writing data for this structure. The simplest way to think
of the world outside the application is to think of files. Writing
data records to a file writes this data out of the application. For
each data structure there are data commands to read this kind of data
from a file and to write it to a file.
For the same data structure the data can be written in various
formats: Fixed length records, comma separated data, XML and
others. But Berkeley DB, mySQL and other databases are just other
formats of the same data structure handled by the same data
command. Which format is chosen is controlled by @dfn{options}.
So the @dfn{data structure} is the logical description of one data
record. The @dfn{data format} is the way one record is read or
written. Sometimes the data format is also called @dfn{data base}.
For example, suppose there is a data structure named @samp{address}. Then
there are the data commands @samp{readAddress} and
@samp{writeAddress}. When using the command line interface you can
load a comma separated file into a Berkeley DB by:
@example
readAddress --commaSeparated | writeAddress --berkleyDb
@end example
Here the piping of the shell is used, which is possible because for each
command there @emph{is} a command line version. Used inside a C program,
@dfn{piping} is conceptually the same, but is written in a different
syntax, and it does not create additional processes.
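How one command can serve several formats, selected by an option, is
sketched below with shell functions. The functions only borrow the names
@samp{readAddress} and @samp{writeAddress}. Their bodies, the tab
separated intermediate stream and the option @samp{--fixedWidth} are
assumptions for illustration, and a Berkeley DB writer is beyond a short
example.

```shell
tab=$(printf '\t')

# Stand-in for readAddress: the option selects the input format, the
# output is always the same tab separated stream (an assumption).
readAddress() {
    case $1 in
        --commaSeparated) tr ',' '\t' ;;
        *) echo "readAddress: unknown format: $1" >&2; return 1 ;;
    esac
}

# Stand-in for writeAddress: same stream in, output format chosen by option.
writeAddress() {
    case $1 in
        --commaSeparated) tr '\t' ',' ;;
        --fixedWidth)
            while IFS=$tab read -r name city; do
                printf '|%-10.10s|%-8.8s|\n' "$name" "$city"
            done ;;
        *) echo "writeAddress: unknown format: $1" >&2; return 1 ;;
    esac
}

converted=$(printf 'Alice,Hamburg\nBob,Berlin\n' |
    readAddress --commaSeparated | writeAddress --fixedWidth)
printf '%s\n' "$converted"
```

Adding a further format then means adding one more branch to each
function, while the pipe between the two commands stays unchanged.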
@subsection Screens and Windows
A special place to write data to and read it from is a human's
brain. To read data from and write data to brains, data commands have special
data formats called @dfn{brain formats}. Brain formats are GTK, curses
and HTML. Thus you can also say: a brain format is a window or a
screen. There is a special command line brain format, which prompts the
user for data input when reading. When writing, it sends data to stdout using syntax
highlighting like @samp{ls --color}.
So:
@example
readAddress --mySql | writeAddress --brainGtk
@end example
will display an address in a @acronym{gtk} graphical window in single
record mode, and display a list of addresses in a graphical window in
list mode.
Hence windows are just a special form of a data base or data format,
to which data is written or read from.
All data formats and all modes are automatically generated out of a
data record specification.
@c ---------------------------------------------------------------------------------------*
@c Documents: *
@c ---------------------------------------------------------------------------------------*
@node Documents
@section Documents
@subsection Texinfo
@cindex texinfo
@cindex documents
@cindex documentation
@cindex @TeX{}
All self written docs are created by writing texinfo files. texinfo is
the documentation tool of the GNU project
@uref{http://www.gnu.org}. texinfo files can be processed by @TeX{} or
by @command{makeinfo}. See @uref{http://www.texinfo.org} for more
details.
HTML files or @file{info} files can be created out of texinfo docs.
For all commands and applications, further texinfo files are generated. The
reasons for using texinfo are explained well enough in texinfo's info:
@quotation
@TeX{} works with virtually all printers; Info works with virtually all
computer terminals; the HTML output works with virtually all web
browsers. Thus Texinfo can be used by almost any computer user.
@end quotation
@subsection HTML
@cindex HTML
@cindex open source
Out of the self written and generated texinfo docs siliconBrain
generates HTML docs. Because the idea is to support open source
projects, the source tree is also HTMLified. The complete HTML is
generated in a local WEB, which lives below the @file{distribution/documentation} subdir. This then
is automatically installed into a WEB target.
@subsection Source Browsing
@cindex source tree
@cindex HTMLify
@cindex syntax high lighting
@cindex font lock
The complete source tree (without the @file{temporary} and @file{distribution} subdirs) is HTMLified. That
is, for each directory siliconBrain generates an HTML file with a link
for each file. This file is a syntax highlighted version of the @command{ls}
output. Here each file name is a link to an HTMLified version of that file.
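The directory page can be pictured with the following sketch. It is not
siliconBrain's generator: the function name @code{htmlifyDirectory}, the
link target @samp{<name>.html} and the sample project are assumptions,
and neither syntax highlighting nor escaping of special characters in
file names is attempted.

```shell
# Print an HTML page with one link per entry of a directory, leaving out
# the generated subdirs temporary and distribution.
htmlifyDirectory() {
    directory=$1
    echo '<html><body><ul>'
    for entry in "$directory"/*; do
        [ -e "$entry" ] || continue
        name=$(basename "$entry")
        case $name in
            temporary|distribution) continue ;;
        esac
        printf '<li><a href="%s.html">%s</a></li>\n' "$name" "$name"
    done
    echo '</ul></body></html>'
}

# Demonstration on an invented project directory.
demoProject=$(mktemp -d)
mkdir "$demoProject/temporary" "$demoProject/distribution"
touch "$demoProject/README" "$demoProject/main.c"
index=$(htmlifyDirectory "$demoProject")
rm -r "$demoProject"
printf '%s\n' "$index"
```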
@c ---------------------------------------------------------------------------------------*
@c Tools: *
@c ---------------------------------------------------------------------------------------*
@node Tools
@section Tools
@subsection What tools
@cindex GNU
For almost all purposes I use tools from the GNU project provided by
the Free Software Foundation. The reason: they are fast, reliable, full
of features, cheap and FREE. Linux has the same properties. For
formatting of printed output we use @TeX{} or postScript.
@cindex bash
@cindex tools
@cindex GNUmake
@cindex make
@cindex emacs
@cindex gcc
@cindex texinfo
@cindex GNU UN*X commands
@cindex GNUsed
@cindex sed
@cindex CVS
@cindex GNU/Linux
@cindex ghostscript
@cindex @TeX{}
@cindex postScript
@table @asis
@item bash
For scripts, and programs heavily depending on UN*X commands.
@item GNUmake
For the build process.
@item emacs
For editing, maintenance of @file{ChangeLog} and for lisp
programs, which are designed to change text.
@item gcc C
For the core programs.
@item texinfo
For documentation. All HTML files are generated out of
it, and the info files as well.
@item GNU UN*X commands
For many purposes we use the GNU versions of
UN*X commands. Whenever possible we use the long version of options.
@item GNUsed
For simple text replacements.
@item CVS
For version control.
@item GNU/Linux
As the operating system.
@item ghostscript
To view postScript files.
@item @TeX{}
to produce printable output.
@item postScript
to produce printable output.
@end table
@subsection Why GNU
@cindex portability
siliconBrain projects are not portable in the sense that they run on every
computer. They are not even intended to run on every UN*X machine. I
have chosen GNU for almost all tools, not only because it is open
software, but also because its quality is much better than that of
many other vendors' tools.
@cindex GNU extensions
So whenever it is of advantage I use GNU extensions and facilities,
which are not standard. I use gcc's non-ANSI C language extensions. I use
GNUmake with all the facilities it has. And I use bash extensions, which
are not present in the Bourne or Korn shell.
@cindex GNU/Linux
@cindex Linux
@cindex GNU/Hurd
@cindex Hurd
In the same way I develop the project on the GNU/Linux operating system. I
will never sacrifice a feature found on a GNU/Linux system, just to be
able to run on another OS. Eventually I will try to
compile siliconBrain on a GNU/Hurd system.
On the other hand: because the GNU tools are widely portable, it is
likely that siliconBrain projects are portable to systems where GNU
software is running.
@subsection autoconf and automake
@cindex automake
@cindex autoconf
@command{autoconf} is designed for developing portable
software. It does so by generating the makefiles and by replacing non
existing features. In a @file{makefile} used in an autoconf driven
project, you are not allowed to use all the nice features GNUmake
provides to you. But why should I give them up? Everyone can run
a GNU/Linux system and use GNU software. It is cheap to get, if you
like, and you have the right to do that.
@cindex generator
But @command{autoconf} and @command{automake} have an interesting
feature: they generate a lot before the make process is started. In
siliconBrain projects generators are used before the compile. That is
to avoid all redundancies.
@c ---------------------------------------------------------------------------------------*
@c Source Tree: *
@c ---------------------------------------------------------------------------------------*
@node Source Tree
@section Source Tree
@cindex source tree
There is a very standardized form of a siliconBrain project's source
tree. Structurally the most important fact is that the real sources are
kept clean from any generation or compilation results. These are all
put into a subdir @file{temporary} for results just needed for make
and @file{distribution} for the final results.
@cindex temporary
Currently all sources are in the project's home directory. Normally
this is @file{.}, when you are editing and making. Everything that
is generated, compiled or otherwise derived is put into the directory
@file{temporary} or @file{distribution} or their subdirectories. Unfortunately this is not true
for the @file{TAGS} file and not for emacs backup files.
Directory @file{temporary} contains all intermediate files (like
@file{.o}-files). @file{distribution} and some subdirectories contain all files which are
needed when using a package, like documents and programs.
The following subdirectories of @file{distribution} are standard:
@cindex data
@cindex documentation
@table @file
@item data
Containing files, which act as data not programs. These
data files should not be changed by the running package. Databases or
similar files should be allocated somewhere in @file{/var/@dots{}} or
elsewhere, but not inside the package directory tree.
@item documentation
All kinds of documentation: HTML files, info
files, man pages and manuals as dvi-files. Here we have the following
subsubdirectories:
@table @file
@item man
the man pages.
@item web
all generated HTML files. Mainly @file{index.html} the
starting page of a project.
@item web/info
HTML-files generated out of texinfo documents.
@item web/sourceTree
all sources of a project, HTMLified and syntax highlighted.
@end table
@item programs
All executable programs. This includes binaries as
well as bash or awk scripts. The latter don't have any special
extension. Programs, which cannot be started from within a command
shell (like EMACS lisp programs) are also included here. There is
conceptually no difference between a program running in a shell or in
EMACS. You can look at EMACS as just being another kind of shell.
@end table
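The layout described in this section can be summarized as a sketch that
creates the standard directories. The directory names are taken from the
text above. The temporary project root is only there to make the example
harmless to run.

```shell
# Hypothetical project root; in a real project this would be ".".
projectRoot=$(mktemp -d)

for subdirectory in \
    temporary \
    distribution/data \
    distribution/documentation/man \
    distribution/documentation/web/info \
    distribution/documentation/web/sourceTree \
    distribution/programs
do
    mkdir -p "$projectRoot/$subdirectory"
done

# Show the resulting tree relative to the project root.
tree=$(cd "$projectRoot" && find . -type d | sort)
printf '%s\n' "$tree"
directoryCount=$(printf '%s\n' "$tree" | wc -l | tr -d ' ')
rm -r "$projectRoot"
```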
@c ---------------------------------------------------------------------------------------*
@c Installation Principles: *
@c ---------------------------------------------------------------------------------------*
@node Installation Principles
@section Installation Principles
@cindex installation
Normally siliconBrain projects are not intermixed with all the other
software on your computer. So dirs like @file{/usr/local/bin} are
untouched. Instead we create a @file{/usr/<package>} or
@file{/usr/local/<package>} dir, where we put all things, which are
relevant for that package. The package name includes the version, so
that it is easy to run several versions of the same tool or
application on the same computer. The siliconBrain generation tool, for
example, will be installed into @file{/usr/siliconBrain_0_0_0}.
@cindex version, current
@cindex current version
You can define a link @file{/usr/siliconBrain} pointing to the current
version, so that, if you like, you can create references to a
siliconBrain application, without considering the current version.
In @file{@var{packagePath}/configuration} you will find a @file{setEnvironment}, which
you can include somewhere. This file has to be included
(executed with a prefixed @code{. }) in every environment which uses
this siliconBrain application.
As a developer of a siliconBrain application, you will use
@code{siliconBrain}. So you will have something like
@code{. /usr/siliconBrain/configuration/setEnvironment} in your
@file{~/.bash_profile}. In that @file{setEnvironment},
@file{@var{packagePath}/programs} is appended to the path, so that as a developer you can
use your own package's executables.
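The pieces described so far (versioned directory, version independent
link, @file{setEnvironment}) fit together as in the following sketch. A
temporary directory stands in for @file{/usr}. The contents of
@file{setEnvironment} and the program @command{someCommand} are
assumptions for illustration, not the files siliconBrain installs.

```shell
installRoot=$(mktemp -d)                       # stands in for /usr
packagePath=$installRoot/siliconBrain_0_0_0    # package name includes the version
mkdir -p "$packagePath/configuration" "$packagePath/programs"

# Link pointing to the current version.
ln -s siliconBrain_0_0_0 "$installRoot/siliconBrain"
linkTarget=$(readlink "$installRoot/siliconBrain")

# Assumed minimal setEnvironment: append the programs dir to the path.
cat > "$packagePath/configuration/setEnvironment" <<EOF
PATH=\$PATH:$installRoot/siliconBrain/programs
export PATH
EOF

# An invented program of the package, without any prefix in its name.
printf '#!/bin/sh\necho hello from someCommand\n' > "$packagePath/programs/someCommand"
chmod +x "$packagePath/programs/someCommand"

# What a developer would put into ~/.bash_profile:
. "$installRoot/siliconBrain/configuration/setEnvironment"

greeting=$(someCommand)
echo "$greeting"
rm -r "$installRoot"
```

Switching to another version then only means moving the link. The line
in @file{~/.bash_profile} stays the same.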
@cindex version, different
To be able to use two siliconBrain applications in parallel, which are
based on different versions of @code{siliconBrain}, all used libs,
scripts or whatever will be installed into the application's dir.
The generated WEB is to be installed via @acronym{ftp}. The idea behind this is an
open source application developer, who works on a computer with a
changing IP address and has an Internet provider, which gives her
some web space. Or a company developer who has to install the WEB on a
special intranet server.
@c ---------------------------------------------------------------------------------------*
@c Nameing: *
@c ---------------------------------------------------------------------------------------*
@node Nameing
@section Naming
@subsection Prefix
@cindex naming
The siliconBrain tool and siliconBrain applications are installed
outside the normal @file{bin} dir's. Also their @file{lib}s are
excluded from the others. So names do not have a prefix. That is, if
you execute @command{someCommand} while being in a siliconBrain
development environment, siliconBrain's version of
@command{someCommand} will be executed. Outside the siliconBrain
context the commands are not visible. In that way we do not pollute
the command namespace of a computer.
@cindex prefixes
There are other reasons not to use prefixes. They have to be long to be
unique. It is very unintuitive to have prefixes before every command. It is a burden
to type them, and even more so if the prefixes are long.
One intention of the organization of siliconBrain projects is, to be
able to run several versions in parallel on one computer. Here
prefixes to command names would not help. And so we had to look
for another solution. This solution then also gives the possibility to
work without prefixes.
Sometimes there are exceptions. When there are commands or variables,
which have a relatively common name, such as ``packageName'' or ``install'', which in
turn probably will interfere with operating system commands or
commands from other products, they are prefixed by
``siliconBrain''. This is especially true for operating system
environment variables, which are @code{export}'ed.
So the installation procedure of siliconBrain is @command{siliconBrainInstall}.
@subsection Abbreviations
@cindex abbreviations
Abbreviations are easy to type but not so easy to remember. And they
are even harder to read and understand. In a world of command line
completion and similar facilities of editors, to be short is no longer
an advantage. So we use long names, which describe the objects they name.
Very common abbreviations like @acronym{cvs}, @acronym{html} or @acronym{ftp} are exceptions, of
course. In the meantime they have the status of words rather than
abbreviations. They are treated as words.
This is true for directory names as well. So instead of @file{tmp} we
name it @file{temporary}.
@subsection Say what you mean
@cindex names
A directory containing configuration should not be named
@file{etc}. It is named instead @file{configuration}. A directory
containing executable programs is not named @file{binary}, as would be
the not abbreviated @file{bin}, but is called @file{programs}. A bash
script for example is not a binary file. It is pure readable
text. Nevertheless it is a program.
Normally objects are not named according to their type. For example an
account number is not called @code{number} but @code{account}.
An implication for options for commands is, that we always use
long-format options like @samp{rm --force}. All commands we define for
siliconBrain or a siliconBrain project have at least a long
format. In all scripts or wherever we call a command we use the
long-format options. Maybe for some options there is a short
alternative like @samp{rm -f}. But this form is just for humans, who
like to type a command interactively into a command line. When an option
has its equivalent in a configuration file, we use a long
descriptive word there as well.
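A command following this convention could treat its options as in the
sketch below. The command name @code{exampleCommand} and its single
option are invented for illustration.

```shell
# Hypothetical command with one option that has a long and a short form.
exampleCommand() {
    force=no
    for argument in "$@"; do
        case $argument in
            --force|-f) force=yes ;;
            --*|-*) echo "exampleCommand: unknown option: $argument" >&2
                    return 2 ;;
        esac
    done
    echo "force=$force"
}

exampleCommand --force someFile    # the form used in scripts
exampleCommand -f someFile         # same meaning, for interactive typing
```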
@subsection Extensions
@cindex extensions
File extensions like @file{c} in @file{myProgram.c} are used to
specify how to handle the file during the @command{make} process. The
rightmost extension is also used to indicate the editor style to
use. If there are different types of handling for the same language,
second level extensions are used. For example, there are two different kinds
of C-sources: those which contain a @code{main} and will be linked to
a command, and those which are designed to be part of a library. The
first get the extension @file{.main.c}, the second @file{.lib.c}.
@cindex executables, extension of
Executable programs residing in @file{programs} never have an
extension. It should be completely transparent (invisible) for the
user of a program, in which language it is written, whether it is C,
awk or bash.
@cindex bash programs, extension of
For bash-programs that means: in the source tree they do have the
extension @file{.bash}. Doing so, @command{make} knows what to do with
it. Compiling them means to copy them to the
directory @file{programs} without any extension.
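This ``compile'' step can be sketched as a small shell function. The
function name @code{compileBashProgram} and the sample script are
invented, and making the copy executable is an assumption, since the
text above only mentions the copy.

```shell
# "Compile" one bash source: copy it into the programs directory under
# its name without the .bash extension, and make the copy executable.
compileBashProgram() {
    sourceFile=$1
    programsDirectory=$2
    programName=$(basename "$sourceFile" .bash)
    mkdir -p "$programsDirectory"
    cp "$sourceFile" "$programsDirectory/$programName"
    chmod +x "$programsDirectory/$programName"
}

# Demonstration on an invented project directory.
demoProject=$(mktemp -d)
printf '#!/bin/sh\necho all tests passed\n' > "$demoProject/makeTest.bash"
compileBashProgram "$demoProject/makeTest.bash" "$demoProject/distribution/programs"

installed=$(ls "$demoProject/distribution/programs")
result=$("$demoProject/distribution/programs/makeTest")
rm -r "$demoProject"
echo "$installed: $result"
```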
@c ---------------------------------------------------------------------------------------*
@c Generators: *
@c ---------------------------------------------------------------------------------------*
@node Generators
@section Differences from the 1980s case tools and generators
In the 1980s there were a couple of @emph{generators} or @emph{case
tools} in use. Those tools had some good ideas but also some
bad ones. Some of the good ideas are used by siliconBrain. The bad
ones it tries to avoid:
@itemize @bullet
@item
@dfn{language}s are not newly invented or created. siliconBrain is
completely based on existing languages, mainly C (and lisp, bash, awk,
@dots{}).
@item
@dfn{compiler}s are not written. siliconBrain is completely based on
existing compilers and interpreters.
@item
@dfn{binary format} of sources is forbidden. All sources are stored as
ascii text. Thus sources are able to be handled by CVS, grep and
emacs.
@item
@dfn{specification screen}s are not used. Instead all sources of a
siliconBrain project can be created and changed by emacs, vi or
whatever editor the programmers like.
@item
@dfn{portability} is not a goal. There are no complex layers
encapsulating system specific properties.
@item
@dfn{strange tools} are not used. Everything is based on very common
tools like gcc, emacs, GNU make, GNU awk, @dots{} So it should be easy to
integrate with other tools, libs and software.
@item
@dfn{runtime environment}s are avoided. All components of a
siliconBrain project are available as fast command lines (in addition
to being available as screens or windows). So there is no virtual machine
or runtime environment, which has to be started either explicitly or
implicitly before a program can run. So it is cheap to call a
siliconBrain based program within a shell.
@end itemize
@c ---------------------------------------------------------------------------------------*
@c HTML: *
@c ---------------------------------------------------------------------------------------*
@node HTML
@section Thoughts about web applications
A lot of modern applications are written for use with a web
browser. siliconBrain will later generate an @acronym{http} version of
the defined applications. For web applications in general there are
several things to consider.
@subsection Generation of @acronym{html}
Sometimes the layout and contents of a web page are dynamic. That is,
at compile time the @acronym{html} pages are not yet defined. Rather
they have to be generated later out of the data. In some applications
I have seen, this leads to a general approach, that all pages are
generated while the application is running. Even those pages, which
are static. This is to simplify the overall architecture of the
application.
But according to the general principle of @dfn{compile time decisions
outperform runtime decisions} this idea is deprecated. Instead we
distinguish between three basic classes of @acronym{html} output:
@table @dfn
@item static
These are @acronym{html} pages, which are completely defined at compile
time and never change. Those pages are installed as ready to display
@acronym{html} sources.
@item semi static
These are pages, which normally do not change, but have to be rebuilt
from time to time. They are based on data, which is not defined at
compile time, but varies only rarely. For those pages there is a
special program, which rebuilds all the @acronym{html} pages out of
semi static data. This program can run for example every day or week.
@item dynamic
This are those pages which really depend on the input the user of an
application has typed into the application. Only those pages are
generated while the user is waiting for a response.
@end table
@subsection Reusable pieces for the generation of @acronym{html}
Whether @acronym{html} pages are @dfn{static}, @dfn{semi static} or
@dfn{dynamic}, they will all be created with the same kind of styles
and techniques. There will be common @code{javaScript} pieces, the same
@code{<title>} and other pieces that are identical in all pages of one
application.
For these pieces there should be only @emph{one} place of
definition. Not one bit should be coded redundantly in several
pages. So there will be an @acronym{html} generator, which
generates all three kinds of pages. The generator's C functions will be
active in the @command{make} of an application to generate the static
pages, in the program that runs every week or day to generate the semi
static pages, and again in the running application to generate the
dynamic pages.
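As a minimal sketch of the ``one place of definition'' idea, the
following @code{bash} function builds the common page frame once and
can be called from all three contexts. It is only an illustration:
siliconBrain intends to provide C functions for this, and the name
@code{html_page} as well as the page contents are assumptions.
@example
#!/bin/bash
# Hypothetical sketch, not the siliconBrain generator itself.
# html_page TITLE BODY: print a complete page around BODY.
# The ( ) body runs in a subshell, so nothing leaks into the caller.
html_page() (
    printf '<html>\n<head><title>%s</title></head>\n' "$1"
    printf '<body>\n%s\n</body>\n</html>\n' "$2"
)
# static: called once from make, the result is installed as a file:
#   html_page "About" "<p>Fixed text.</p>" > about.html
# semi static: called from a daily or weekly job:
#   html_page "Price list" "$(cat prices.fragment)" > prices.html
# dynamic: called while the user is waiting for a response:
html_page "Result" "<p>Generated on request.</p>"
@end example
The same function, and therefore the same page frame, is used in all
three cases; only the moment of the call differs.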
@c ---------------------------------------------------------------------------------------*
@c Performance: *
@c ---------------------------------------------------------------------------------------*
@node Performance
@section Performance is one of the most important things
I often hear sentences like ``today performance does not matter any
longer'', for example in discussions about @acronym{java}. Normally in
the same discussion someone explains to me the complex techniques used
to establish @dfn{load balancing}. That is because performance does
matter.
It is annoying that faster and faster computers do not really
benefit the end users, because the incredibly high performance of
the new hardware is almost completely consumed by worse and worse
programs.
What I really like, for example, is opening an @acronym{excel} version
3 application on a modern computer. It opens the very second the double
click is made.
Any application should be as fast as possible. There are two reasons
for this:
@itemize @bullet
@item
The user should wait for the computer as little as possible. Instead the
computer should wait for the user. The user is a human, the computer a
machine.
@item
It should be possible to reuse any application as a component in other
applications. That implies that an application originally designed to
be used via a user interface could become a component of a batch
application that calls it in a loop many thousands of times. It would
then be a mess if the original application were slow.
@end itemize
@subsection Load Balancing
Complex techniques such as load balancing are, most of the time, an
attempt to solve a problem inside the application from outside it. The
better way would be to improve the application.
It is also a bad habit to require application programmers to
implement complex performance techniques just because the chosen
programming framework is slow. There are some simple means to guarantee
fast programs:
@itemize @bullet
@item
@emph{C}: Use the language C. It generates fast code.
@item
@emph{GNU/Linux}: Use GNU programs and the GNU/Linux operating system.
@item
@emph{Compile time decisions}: Any decision, which can be made at
compile time, should not be postponed to the run time.
@end itemize
@subsection Compile Time Decisions
To get an application as fast as possible, it is important to make
every decision that can be made before run time at compile time. That
means every decision that can logically be taken then. So don't argue
that a certain decision could logically be taken at compile time, but
is easier to implement at run time.
This has some consequences:
@itemize @bullet
@item
Compilers outperform interpreters. The task of interpreting the code
can be done at the build time of an application. So using
@code{bash} scripts is generally a bad idea. This is @emph{not} a
reason to avoid scripting languages altogether, but in general it
should be possible to compile the scripting language into a real binary
executable. An exception are scripts that will be written by the end
user to extend your application.
@item
Machine code outperforms byte code. Many modern languages have
the concept of a @dfn{byte code} or something similar, which is
interpreted at run time. This is halfway from interpreter to
compiler. But why not compile it further to machine code? The usual
answer is: because it should be portable. But as @command{gcc} shows, a
compiler can be just as portable.
An exception to this is the situation, as in web browsers, where you
provide applications at click time to machines you don't know. Then a
byte code is the solution. That means: @acronym{java} is good for web
browser applets, but not for server applications.
@item
Real machines outperform virtual machines. This is more or less the
same reasoning as for the machine code. The byte code interpreter can
also be called a @dfn{virtual @acronym{cpu}}.
@item
Static outperforms dynamic @acronym{html}. So, if you want to reuse
components designed for generating dynamic @acronym{html} pages
in the creation of static pages (which generally is a good idea), then
you should incorporate these components into the build process of
your application. The @acronym{html} building components are then
active at compile time to create the static @acronym{html}, and
at run time to create the actual dynamic pages.
@end itemize
@subsection Creating new Processes
I want to add a special note on @dfn{process creation}. On GNU/Linux
machines the creation of new processes is relatively fast, and
@code{bash} programs create a lot of processes. I call them
@emph{programs}, because they are programs. Often they are called
@emph{scripts}, because these programs are interpreted, and because
there is an easy interface to the underlying operating system.
In @code{bash} programs a process is created for each command that
is not an internal command: for each line that is such a command,
for each backquote and for each pipe. These mechanisms make it easy to
write @code{bash}, because existing executables can be reused to build
new functionality. But often there is no special reason to create
a process; it just happens because @code{bash} is designed that way.
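The difference can be illustrated with a small sketch that counts the
non-empty lines of a file in two ways. The file contents and the
variable names are made up for this illustration; the helper commands
@command{mktemp} and @command{rm} are themselves external programs and
only serve to set up and clean up the demonstration.
@example
#!/bin/bash
# Sketch: count the non-empty lines of a file in two ways.
file=$(mktemp)
printf 'one\n\ntwo\nthree\n' > "$file"
# Variant 1: a subshell for $(...), plus the external programs
# cat and grep connected by a pipe.
count_external=$(cat "$file" | grep -c .)
# Variant 2: only internal commands (while, read, [ and arithmetic);
# bash creates no new process for this loop.
count_builtin=0
while read -r line; do
    if [ -n "$line" ]; then
        count_builtin=$((count_builtin + 1))
    fi
done < "$file"
rm -f "$file"
echo "$count_external $count_builtin"
@end example
Both variants give the same count, but only the first one pays for
process creation each time it runs.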
In @code{perl}, another scripting language, a lot of work has been
done to avoid creating processes again and again. Many
@acronym{unix} commands and C library functions are available as
internal functions. This is the better way.
siliconBrain builds a separate executable, which runs as its own
process, for each siliconBrain command. This makes it possible to test
those commands from within, for example, @code{bash} scripts, and it
lets a user run those commands by just typing them into
her command line. It also gives the flexibility to write shell script
programs using siliconBrain commands, but this is @emph{not}
recommended. Instead use C and the siliconBrain environment to create
new applications, which are themselves commands.
@c ---------------------------------------------------------------------------------------*
@c Java: *
@c ---------------------------------------------------------------------------------------*
@node Java
@section @acronym{java}
These days it is rather popular to do things in
@acronym{java}. siliconBrain is implemented completely in C, bash,
perl, makefile, @enddots{} I think @acronym{java} brings a lot of
improvements, but also a lot of drawbacks. And it is not only the
language itself that I criticize, but also things like
@acronym{jsp}, which is in my mind the wrong way to go.
In the siliconBrain project, I intend to do a lot of things contrary
to what is done today. So for example instead of writing
@acronym{html} pages, which eventually call programs to get data, I
write programs, which handle data and eventually generate
@acronym{html}.
A further reason not to use @acronym{java} in siliconBrain is that it
does not fit well with @command{grep}, @command{sed}, pipes, files,
@command{make}, the features of @acronym{unix}. For example one of my
ideas is to use @acronym{unix} processes to implement user sessions,
which then automatically inherit the @acronym{unix} access right
possibilities as well as the mutual isolation of sessions: one session
can neither read nor modify the data of another session.
My @acronym{java} critique is as follows:
I dislike:
@itemize @bullet
@item
It has no processes, that is, no resource leak free, mutually
non-disturbing threads of execution owned by a user.
@item
It does not reside seamlessly in the hosting @acronym{os}. There are
no easy ways to access operating system functions, like pipes, memory,
processes, files, @enddots{}
@item
Byte code compilation instead of real machine code compilation is not
reasonable for server applications.
@item
The reflection @acronym{api} increases the number of interpreted and
typeless applications.
@item
The @acronym{vm} @emph{is} a new operating system, but not all of its
features are implemented. For example ``pipes'' and, more seriously,
``processes'' do not exist.
@item
It is not possible to write command line versions of programs.
@item
It does not implement some of the most powerful concepts of older
languages, like the intrinsic support of regular expressions of
@acronym{awk}, or the extremely fast automatic storage of @acronym{c},
or the powerful support of ``pipes'' and ``redirection'' of
@command{bash}.
@item
Performance seems not to be so good, because people always want to
tell me things about load balancing, connection pooling, object
reuse, @enddots{}
@end itemize
I like:
@itemize @bullet
@item
It is highly portable.
@item
@acronym{java} brings a rich, standardized @acronym{api}. This is
itself extremely portable.
@item
The language is strictly and elegantly designed.
@end itemize
@c ---------------------------------------------------------------------------------------*
@c Configuration: *
@c ---------------------------------------------------------------------------------------*
@node ConfigurationPhilosophy
@section Configuration
There is no real difference between a command line option and a
configuration parameter. In a package, which has one or many commands,
each command has several options. Some of the options are command
specific others are inherited from the package. A third group of
options is inherited from the siliconBrain frame (like the
@code{--help} option). The options, which are inherited from the
package are the configuration parameters.
The normal command specific options are specified in a command
specification program (executable specification), which outputs the
option description in @acronym{xml} or other formats. The @emph{package
configuration parameters} are simply the options of a command with the
same name as the package. Whether the command really exists or not does
not matter. These options are generated into each command
implementation as inherited options. So each configuration parameter
can be overridden by using an option.
So the first source of configuration is the command line. In the
option handling of each command, the second place to look for options
is the environment. For each option, and thus for each configuration
parameter too, there is a corresponding environment variable
@samp{packageName_optionName}.
The third place are @emph{executable configurations}: files with the
name @samp{packageName.configuration} residing in certain directories
(current, home, $PATH, etc). These programs (implemented in any
language) output the option contents in @acronym{xml} format. This
output is read by the standard option handling generated by
siliconBrain. The list of directories can be configured by setting the
environment variable @samp{siliconBrainConfigurationPath} to a colon
separated list of directories, with the entry @samp{""} being the
@samp{$PATH}. The default is @samp{.:~:/etc:$PATH} for
(current, home, $PATH, etc).
At the beginning I thought that the very last location for a
configuration lookup should be the package's path. But I decided
against this, because it should be possible to just delete a package
directory and then reinstall it, keeping all configuration
information. So there should be nothing changeable
inside the package directory. Always think of the package directory
as residing on a read only CD-ROM. The second reason is that the normal
case presumably is that with a new version of a package the old
configuration should still be used. If this is @emph{not} wanted,
@samp{siliconBrainConfigurationPath} can be used to handle different
configurations.
An option will have the value that has been found first. So command
line arguments override environment variables, which override the
configuration in the current directory, which overrides @enddots{} Flag
options are normally specified without a value (@samp{--myFlag}). To be
able to set an option to @emph{false} even if it is configured to
@emph{true} in a subsequent configuration, it is possible to set a
flag option explicitly to false: @samp{--myFlag=false} or
@samp{--no-myFlag}. And for reasons of symmetry @samp{--myFlag=true}
is also allowed.
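The four spellings of a flag option can be mapped to a value as in the
following sketch. It is not the option handling generated by
siliconBrain, which is C code; the function name @code{parse_flag} and
its output format are assumptions made for this illustration.
@example
#!/bin/bash
# Hypothetical sketch: map one flag argument to "name=true" or
# "name=false". The ( ) body runs in a subshell, so exit only
# ends the function.
parse_flag() (
    case "$1" in
        --*=true)  name=$(printf '%s\n' "$1" | sed 's/^--//; s/=true$//')
                   value=true ;;
        --*=false) name=$(printf '%s\n' "$1" | sed 's/^--//; s/=false$//')
                   value=false ;;
        --*=*)     echo "not a flag value: $1" >&2; exit 1 ;;
        --no-*)    name=$(printf '%s\n' "$1" | sed 's/^--no-//')
                   value=false ;;
        --*)       name=$(printf '%s\n' "$1" | sed 's/^--//')
                   value=true ;;
        *)         echo "not an option: $1" >&2; exit 1 ;;
    esac
    echo "$name=$value"
)
parse_flag --myFlag          # prints myFlag=true
parse_flag --myFlag=false    # prints myFlag=false
parse_flag --no-myFlag       # prints myFlag=false
parse_flag --myFlag=true     # prints myFlag=true
@end example
A flag whose own name starts with @samp{no-} would be ambiguous in
this sketch; how the real option handling treats that case is not
covered here.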
The @samp{--complete} option avoids reading all stages every
time. Whenever the @samp{--complete} option is set in one of the
places to look, the lookup in the subsequent places is suppressed. So
for example, if you specify @samp{--complete} on the command line, no
further option or configuration lookup is performed.
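The lookup order and the effect of @samp{--complete} can be sketched
in @code{bash} as follows. This is an illustration under assumptions,
not the generated C code: the function name @code{lookup_option} is
hypothetical, command line arguments are assumed to have been checked
before, the special @samp{""} path entry is not handled, and a
configuration is assumed to signal completeness by printing
@code{<complete/>}.
@example
#!/bin/bash
# Hypothetical sketch of the lookup for one option.
# lookup_option PACKAGE OPTION: print the first value found.
lookup_option() (
    package=$1
    option=$2
    # Second place: environment variable packageName_optionName.
    if value=$(printenv "$package"_"$option"); then
        echo "$value"
        exit 0
    fi
    # --complete in the environment suppresses all later places.
    if printenv "$package"_complete > /dev/null; then
        exit 1
    fi
    # Third place: executable configurations, directory by directory.
    path=$siliconBrainConfigurationPath
    [ -n "$path" ] || path=".:$HOME:/etc:$PATH"
    IFS=:
    for dir in $path; do
        conf=$dir/$package.configuration
        [ -x "$conf" ] || continue
        xml=$("$conf")
        value=$(printf '%s\n' "$xml" |
            sed -n "s|^ *<$option> *\(.*[^ ]\) *</$option> *\$|\1|p")
        if [ -n "$value" ]; then
            echo "$value"
            exit 0
        fi
        # Assumption: completeness is signalled as <complete/>.
        case "$xml" in
            *"<complete/>"*) exit 1 ;;
        esac
    done
    exit 1
)
# Usage: lookup_option siliconBrain publishTargetHost
@end example
The first place that yields a value wins, and a @samp{--complete}
found on the way ends the search early, so later directories are
never executed.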
In each package there is a generated program
@samp{packageName.configurationReader}, which does the standard option
handling (reading command line arguments, reading environment
variables, reading executable configuration output and each time
stopping, if @samp{--complete} is found). This program then does
nothing but output the aggregated information in various formats:
xml, bash, environment, perl and others.
This output can be included in other languages, for example in @code{bash}:
@example
eval $(packageName.configurationReader $* -- --bash)
@end example
The configuration can be pumped into the environment by:
@example
eval $(packageName.configurationReader -- --bashEnvironment)
@end example
where @samp{export} statements are generated. The configurationReader
always sets the @samp{--complete} option. So after executing the last
example, all commands of that package will stop after reading the
environment. The above eval statement can for example be placed in a
@code{.bashrc}.
Before starting I thought, that the output of the @emph{executable
configurations} should be of the form:
@example
# comment
key = value
anotherKey = another value
@end example
rather than @acronym{xml}. The reason for that was that it should be
easy for other programs in arbitrary languages to read and process the
configurations. Other languages include @code{bash}, @code{emacs lisp} and
@code{perl}. Using @acronym{xml} would require an
@acronym{xml} parser in every language. But in order to have a
consistent and common option, environment @emph{and} configuration
handling I have invented the @emph{configurationReader}. This handles
options, environment and configurations and then outputs the result in
a host language friendly way. This means that the configurations can
output @acronym{xml} without the need to process this
@acronym{xml} in all the different languages. On the other hand,
producing @acronym{xml} is far easier than parsing it. The following
example of a @code{siliconBrain.configuration} file
is the so called @emph{trivial configuration implementation}:
@example
#!/bin/bash
cat <<EOF
<siliconBrain>
<publishTargetHost> siliconbrain.com </publishTargetHost>
<publishTargetArchive> ftp/anon/pub </publishTargetArchive>
<publishTargetWeb> www/htdocs </publishTargetWeb>
</siliconBrain>
EOF
@end example
With this concept we fulfill the following requirements of a
configuration environment:
@itemize @bullet
@item
@emph{Configurations are held in files, which in turn are recognized
by their file extension}: The @emph{executable} configurations have
the extension @code{.configuration}.
@item
@emph{Configuration content is pure @acronym{ascii} and thus is easy
to create in any language}: the configuration content is
@acronym{xml}, which is easy to build in any language. We do not use
@acronym{xml} attributes, which makes the creation even easier.
@item
@emph{Configurations do reside in @acronym{cvs} and can be created
with an arbitrary editor (as opposed to a special program)}: For each
configuration, which is a program, there is a program source, which can
be stored in @acronym{cvs} as usual. Furthermore configurations can be
written in any language (C, bash, perl, @dots{}) so an administrator
can choose the language that fits best in her environment.
@item
@emph{Configurations do cascade in site specific, user specific,
local}: The configurationReader and the standard option handling for
C programs first read the command line, then the environment
variables, then a configuration found in the current working
directory, then one in the user's @samp{$HOME} directory, then in the
@samp{$PATH}, then in @file{/etc}.
@item
@emph{The same package can be run with different configurations on the
same computer with the same userid}: By using different configurations
in different current working directories, this is possible. If the
@file{./myPackage.configuration} does not set the @code{--complete}
flag, then a higher level configuration can set the common parameters.
@item
@emph{After a version change, the old configuration files should
still be supported}: the option handling @acronym{xml} parser just
ignores options that are unknown. This gives a package author a
little support in fulfilling this request.
@item
@emph{Sometimes a configuration configures the build phase of a
package. In that case the configuration should be part of the
@acronym{cvs} tree}: Because configurations are programs, their source
can reside in the normal @acronym{cvs} tree. If they are C programs,
the make process just has to make sure that they are compiled into the
package root directory.
@item
@emph{It should be possible and easy to use the configurations from
all languages}: The configurationReader can output the found and
aggregated parameters in @acronym{xml}, @code{bash}, @code{perl} or
other languages.
@item
@emph{It should be possible to gain performance by suppressing a
complex configuration hierarchy parsing}: The @samp{--complete} option
can suppress any further lookup in higher levels. So for example
setting the environment variable @code{myPackage_complete} will
suppress the search, execution and parsing of executable
configurations. With the configurationReader's @samp{--bashEnvironment}
format, it is easy to pump a configuration into one's environment.
@item
@emph{It should be possible to read all or part of a configuration
from a database}: because a configuration is a program, information
can be read from anywhere, to be converted to @acronym{xml}.
@item
@emph{The configuration language should provide means to avoid
redundancies}: There is no extra configuration language
(observing the ``never invent a new language'' rule). Instead any
language can be used (the option handling does not interpret the
source in this language, but rather its output, which should be
@acronym{xml}). So all means of programming are available. You can
write your configurations in @acronym{c++} and use @acronym{oo} techniques.
@item
@emph{Configuration syntax should be easy for non programmers to
write}: the trivial implementation (see example above) is a
@samp{bash} program, which just uses @samp{cat} to output the
@acronym{xml}.
@item
@emph{Configurations should be processable with @samp{sed} and
@samp{grep}}: To be honest I am not sure. The output of a
configuration is @acronym{xml}. Is this processable with @code{grep}?
@item
@emph{Configurations should be writable in any language}: This is true
as long as the language allows writing @acronym{xml} to @code{stdout}.
@item
@emph{Configurations should be allowed to be implemented in mixed
languages}: (For example a company admin writes a complex @code{perl}
configuration file and completes it with a short trivial @code{bash}
based configuration snippet.) This can be accomplished by chaining the
configurations in the sense that one calls a second, modifies its
contents and then outputs the final @acronym{xml}. Or simply by
placing the complex @code{perl} into
@file{/etc/packageName.configuration} and the trivial @code{bash} into
@file{./packageName.configuration}. This is the intended solution.
@end itemize
There is an environment variable
@samp{siliconBrainConfigurationPath} with a colon separated list of
directories, where to search for configurations.
The option specification that describes the configuration parameters
of a package is the same as (not just equal to) the option specification
of a command with the name of the package. So if the package
packageName has a command @code{packageName.command} then the options
of that command are the same as the configuration parameters of the
package. This is intended for packages with a definite main command,
which for example starts @emph{the} server.
The @emph{configurationReader} uses siliconBrainLib's xmlPrinter to
output its information. This means that it can do the output in
various formats (for example @acronym{xml}), and it colorizes the
output if stdout is a tty. The @emph{configurationReader} also
uses the @emph{short descriptions} of the options to create comments
in the generated output, so each option is described a little. Only
those options that have been specified somewhere are output (this
includes configuration parameters; remember that they are available as
inherited command line options). The rest is present just via its
commented short description:
@example
~/sandbox/siliconBrain> siliconBrain.configurationReader -- --xml
<!-- siliconBrain main configuration.
general datahandling application framework -->
<siliconBrain>
<!-- publishTargetHost (value) : IP address or domain name where package should be put to -->
<publishTargetHost> siliconbrain.com </publishTargetHost>
<!-- publishTargetArchive (value) : directory on the publishTargetHost, where the tar.gz should be put to. -->
<publishTargetArchive> ftp/anon/pub </publishTargetArchive>
<!-- publishTargetWeb (value) : directory on the publishTargetHost, where the HTMLilized sourcetree should be put to. -->
<publishTargetWeb> www/htdocs </publishTargetWeb>
<!-- help (flag ) : Print a short help message, listing the options. -->
<!-- verbose (flag ) : Let this command talk to you a lot. -->
<!-- version (flag ) : Display version information. -->
<!-- output (value) : File to which output is written. -->
<!-- complete (flag ) : Indicate completeness of specified options. No further lookup in configuration chain. -->
<complete/>
</siliconBrain>
~/sandbox/siliconBrain>
@end example
It is possible to chain configurations: a user can write a
configuration in her home directory, which calls a standard
configuration and then uses @samp{sed} or @samp{awk} to manipulate
some entries.
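A chained configuration of this kind might look like the following
sketch of a user's @file{siliconBrain.configuration}. The helper name
@code{override_entry}, the location
@file{/etc/siliconBrain.configuration} of the standard configuration
and the replacement value @samp{www/test} are assumptions made for this
illustration.
@example
#!/bin/bash
# Hypothetical user configuration: reuse the site wide configuration
# and change a single entry on the way through.
# override_entry OPTION VALUE: filter that replaces the value of OPTION.
# VALUE must not contain the characters | and & in this simple form.
override_entry() (
    sed "s|<$1>.*</$1>|<$1> $2 </$1>|"
)
# Assumed location of the standard configuration.
base=/etc/siliconBrain.configuration
if [ -x "$base" ]; then
    "$base" | override_entry publishTargetWeb www/test
fi
@end example
All other entries of the standard configuration pass through
unchanged, so nothing has to be copied into the user's file.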
Be aware that the suffix @file{.configuration} does not indicate the
format of the file, so you cannot use it to choose a certain editor or
editing mode. Rather this suffix indicates that it is an executable
file, which, when called, outputs @acronym{xml}. If the
configuration is implemented in @acronym{c}, the configuration file
itself is not readable at all. In this case you need the source.
It is recommended to use @emph{siliconBrainLib}'s
@code{SiliconBrainPrinter} @acronym{aka} @code{xmlPrinter} to write
the sources of your configuration, if they tend to be a little
complex. This is a @acronym{c} library of functions (also used by the
@emph{configurationReader}) to output @acronym{xml}. If you do, you
are programming in @acronym{c}, you have all the possibilities of
@acronym{c}, and the resulting program outputs @acronym{xml} in color
and in a standardized indented manner, as well as in other syntaxes
(like @code{perl} or @code{bash}).
I have decided on @emph{executable configurations} to be able to
use a real full blown programming language for writing the
configuration. So in creating configurations you have all means of
reusability: avoiding copied code, factoring out common information,
using loops, defining variables, commenting. In addition you can
read databases or other sources of information to support a kind of
dynamic configuration.
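As a sketch of what this means in practice, the following hypothetical
@file{siliconBrain.configuration} factors out a common directory
prefix into a variable and generates two entries in a loop. The option
names are those of the trivial configuration; the prefix value and the
function name @code{print_configuration} are assumptions.
@example
#!/bin/bash
# Hypothetical executable configuration using a variable and a loop.
print_configuration() (
    # Common information, defined in one place only.
    prefix=projects/siliconBrain
    echo "<siliconBrain>"
    echo "<publishTargetHost> siliconbrain.com </publishTargetHost>"
    # One line per entry: option suffix and directory below the prefix.
    while read -r name dir; do
        echo "<publishTarget$name> $prefix/$dir </publishTarget$name>"
    done <<EOF
Archive ftp/anon/pub
Web www/htdocs
EOF
    echo "</siliconBrain>"
)
print_configuration
@end example
Changing the prefix now needs one edit instead of one per entry,
which a plain @acronym{xml} file could not offer.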
If you use @acronym{xml} as a language for sources, or to put it in
other words, if you have @acronym{xml} files in @acronym{cvs}, you
normally violate a lot of programming rules, above all the ``never copy
code'' rule. But @acronym{xml} is a perfect protocol language, a
language used by programs to communicate with each other. By deciding
on executable configurations, I combine the advantages of full
programming languages with the advantages of @acronym{xml} as a protocol.
@c ****************************************************************************************
@c *
@c Commands: *
@c *
@c ****************************************************************************************
@node Commands
@chapter Commands
@include commands.texinfo
@c ****************************************************************************************
@c *
@c Installation: *
@c *
@c ****************************************************************************************
@node Installation
@chapter Installation
@section How to install
@include install.snippet.texinfo
@section Overview
@include readme.snippet.texinfo
@section NEWS
@include news.snippet.texinfo
@section TODO
@include todo.snippet.texinfo
@section AUTHORS
@include authors.snippet.texinfo
@include gnuGpl.include.texinfo
@include gnuFdl.include.texinfo
@node Concept Index
@unnumbered Concept Index
@printindex cp
@bye
@c $Log: siliconBrain.main.texinfo,v $
@c Revision 1.55 2004/12/14 23:31:27 joerg
@c published for new release 0.2.3
@c
@c Revision 1.54 2004/12/14 23:17:05 joerg
@c published for new release 0.2.2
@c
@c Revision 1.53 2004/12/14 22:42:23 joerg
@c allFiles: all sources have a Log CVS keyword at the end now.
@c