This chapter first provides a mathematical formalization of an image. Then the development of ICL's image class is presented and discussed step by step.\\
Within ICL, an image is represented as a set of equally sized image channels \seefig{image-sketch}.
\iclfigure{image-sketch}{ICL image scheme}{height=170px}
%\section{A Functional View on Images\label{sec:the-image-function}}
More generally, an image can be represented as the following function:
\begin{equation}
I(\mathbb{R} \times{} \mathbb{R} \times{} \mathbb{N}^+ )\rightarrow\mathbb{R}
\end{equation}
This image function expects three arguments. The first two arguments reference a location in the 2D image plane (the spatial domain), and the last argument references a single image channel. Of course, this formulation is more general than the image scheme presented in \reffig{image-sketch}. Image processing must be handled in a numerically efficient way. Therefore, we now provide a less complex functional description that restricts image locations to discrete pixel positions:
\begin{equation}
I(\mathbb{N}^+ \times{} \mathbb{N}^+ \times{} \mathbb{N}^+ )\rightarrow\mathbb{R}
\end{equation}
This function can be understood in two different ways:
\begin{enumerate}
\item It expects three arguments and returns a single pixel value from the referenced channel.
\item It expects only two arguments and returns a vector in $\mathbb{R}^C$\footnote{Here, $C$ is the channel count of the image.}.
\end{enumerate}
\todo{Correct with Jonathan from here!}
We decided to implement the first view because of a fundamental design decision: ICL uses a planar data layout, which does not combine well with the second view of the image function. With the second view, a pixel access function that returns the pixel value at an image location $(x,y)$ would have to gather the values from all channels and return some vector structure (e.g. a \inlinecode{std::vector}), which is certainly possible, but not very efficient in C++.
Of course, we provide an operator (\inlinecode{operator()(int x, int y)}) that allows addressing a pixel with all its channels, but this is just a convenience feature for code that has no performance constraints.\\
In addition, the spatial domain of an image is always limited to the image size, which again restricts the image function. Together with a positive but finite channel count, we arrive at this function:
\begin{equation}
I(\{0,\ldots,W_I-1\} \times{} \{0,\ldots,H_I-1\} \times{} \{0,\ldots,C_I-1\} ) \rightarrow \mathbb{R}
\end{equation}
where the image's dimension is $W_I \times{} H_I$ and it has $C_I$ channels. In particular, we can see here that the origin of the image coordinates is pixel $(0,0)$ and not $(1,1)$ as e.g. in Matlab\footnote{see www.mathworks.de}.
\section{Value-Domain\label{sec:image-value-domain}}
Next, we have to take a look at the value domain of images. Of course, we cannot use $\mathbb{R}$ itself, since a computer can only represent finitely many values. The native data type that comes closest to the real numbers is a maximum precision floating point number (C++: \inlinecode{long double}, whose size is platform dependent, e.g. 96 bits of storage on 32-bit x86 systems). Unfortunately, floating point processing is still slower than integer processing on common CPUs. This is the reason why most frameworks and algorithms work on byte-valued image data. On common systems, a byte (mostly 8 bits, realized in C++ by the \inlinecode{unsigned char} data type) is the smallest directly addressable data unit. Of course, you can access even single bits, but in most cases algorithm performance is worse here due to the need for bit-shifting and masking on the byte-aligned data.\\
In short, bytes have a very limited range $\{0,\ldots,255\}$, but processing performance is optimal. Furthermore, most input and output devices (cameras, frame grabbers) and even most common file formats use a 24-bit color resolution, which can be implemented very naturally using 3 byte-valued image channels (e.g. RGB).
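
The restricted image function and the byte-valued planar layout discussed so far can be illustrated with a minimal sketch. It is not ICL's actual implementation: the type \inlinecode{PlanarByteImage} and its members \inlinecode{at} and \inlinecode{pixel} are hypothetical names chosen for illustration, and row-major ordering within each channel is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch (not ICL's Img class): a byte-valued image in planar layout.
struct PlanarByteImage {
    int width, height, channels;
    // One separate data block per channel (planar layout).
    std::vector<std::vector<std::uint8_t>> planes;

    PlanarByteImage(int w, int h, int c)
        : width(w), height(h), channels(c),
          planes(c, std::vector<std::uint8_t>(static_cast<std::size_t>(w) * h, 0)) {}

    // First view of the image function: I(x, y, c) yields a single value.
    // Coordinates and channel indices start at 0.
    std::uint8_t& at(int x, int y, int c) {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        assert(c >= 0 && c < channels);
        // Assumption: pixels of one channel are stored row by row.
        return planes[c][static_cast<std::size_t>(y) * width + x];
    }

    // Second view: I(x, y) yields a vector of C values. The values have to be
    // gathered from every plane, which is why this is only a convenience path.
    std::vector<std::uint8_t> pixel(int x, int y) {
        std::vector<std::uint8_t> values(channels);
        for (int c = 0; c < channels; ++c) values[c] = at(x, y, c);
        return values;
    }
};
```

In this sketch, \inlinecode{at} touches exactly one memory location, whereas \inlinecode{pixel} allocates a temporary vector and reads from every channel block for each call.
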
Unfortunately, there are many situations where the 8-bit range of bytes is too limited, for example if one wants to create a feature map\footnote{Here, a map is nothing else than an ordinary image; consider a city map that contains some locally arranged information.}. Often, the range of image features cannot be mapped to $\{0,\ldots,255\}$, especially if feature values need a good resolution between $0$ and $1$ as well as for very high values. In short, there is sometimes also a need for float-valued pixels. Hence, we decided to provide a flexible image structure called \inlinecode{Img}\iclclassref{Core}{Img} that uses C++ templates to support a variety of data types.
\section{Image Features\label{sec:image-features}}
Having clarified the basics of the image representation, we can now have a look at some additional image features. Although size and channel count have already been mentioned above, here is a list of native features\footnote{This must not be misunderstood: these are not features like intensity or image moments, but features of the image structure.} of an ICL image:
\begin{enumerate}
\item \textbf{Size} \\
The 2D spatial extent of the image. A common size is e.g. $640 \times{} 480$, which is called \emph{VGA}. Sizes are represented by a special \inlinecode{Size} structure \iclclassref{Core}{Size}. An uninitialized image has a size of $0 \times{} 0$. The image size can be defined in the constructor, or it can be adapted/obtained using '\inlinecode{void setSize(const Size&)}' and '\inlinecode{const Size &getSize() const}' respectively. Please note that changing an image's size using \inlinecode{setSize} involves a complete loss of all image data. For also scaling the image's contents, a special function named \inlinecode{scale} is provided.
\item \textbf{Channels}\\
The channel count of an image is represented by a single \inlinecode{int} value.
We do not use an \inlinecode{unsigned int}, as we sometimes use a negative channel index to indicate that e.g. a function should work on all image channels. In contrast to most other image features, the channel count depends on another feature: the image \emph{format} (see below for more details). An uninitialized image has $0$ channels; the channel count can be set/obtained using '\inlinecode{void setChannels(int num)}'/'\inlinecode{int getChannels() const}'. If the channel count is increased by a \inlinecode{setChannels} call, the old channels are not touched. If, on the other hand, the channel count is decreased, only the last channels are removed; the other channels remain untouched (their image data is neither changed nor reallocated).
\item \textbf{Format}\\
This additional feature defines an image's \emph{color format}. Currently, 6+1 formats are defined by the \inlinecode{format} enum \iclnamespaceref{Core}. 6+1 means that there are 6 \emph{real} color formats\footnote{Yes, gray is also a color format here.} (Gray, RGB, YUV, HLS, LAB, Chroma), each of which implies a related channel count (see above), and a dedicated format named \inlinecode{formatMatrix} that may have any number of channels. Setting an image's format using \inlinecode{void setFormat(format fmt)} also adapts the channel count if necessary. Changing an image's channel count to a value that does not match the channel count associated with its current format automatically sets the image format to \inlinecode{formatMatrix}.\\
Furthermore, it is important to mention that all color formats use a value range of $[0,255]$ regardless of the current pixel data type, i.e. \inlinecode{float} images also expect e.g. RGB values in the range $[0,255]$ and not, as one might expect, in the range $[0,1]$. Of course, you can set the pixels of a \inlinecode{float} image to any desired values, but the color conversion functions \iclnamespaceref{CC} and the image visualization work with this range.
\item \textbf{Region-of-Interest}\\
In many image processing applications, there are steps within the processing loop where further processing can be restricted to a small image region. Most of the time, this so-called \emph{Region-of-Interest}, or \textbf{ROI} for short, is represented by a rectangular data structure. This makes it possible to specify a region of interest without having to extract the image's ROI by deeply copying its ROI pixels into another (smaller) image.\\
In ICL, each image is equipped with a ROI rectangle of type \inlinecode{Rect} \iclclassref{Core}{Rect}. Nearly all ICL functions and operators are able to process an image's ROI only (e.g. filters, which are represented by the \inlinecode{UnaryOp} interface \iclclassref{Filter}{UnaryOp}, can be set up to apply their specific operation only to the ROI pixels of an image). One exception is e.g. the \inlinecode{deepCopy} function of an image, which creates a new image of identical size (and identical ROI size). If one wants to copy only an image's ROI, \inlinecode{deepCopyROI} must be used instead.
\item \textbf{Time-Stamp}\\
Another useful image parameter is a time stamp, which is commonly given by the time at which the image's content was grabbed or recorded. ICL uses a \inlinecode{Time} \iclclassref{Utils}{Time} structure here. Most camera grabbers set the time stamp of a grabbed image to the moment at which it was grabbed from the camera device.
\item \textbf{Image-Pixel Data Type (depth)}\\
The image depth parameter is used to determine an image's pixel data type \textbf{at runtime}. As we will see in the following sections, in most cases the image pixel data type (from now on \emph{image depth}) is already fixed at compilation time due to the use of C++ templates. This feature will become clearer once the image implementation has been discussed in detail (see Section \ref{sec:image}). The image depth can be obtained using \inlinecode{depth getDepth() const}, and depth values are defined by an enum of the same name \iclheaderref{Core}{Types}.
Furthermore, it is important to note that there is no \inlinecode{setDepth} function; the reason for this will again become clear in the next sections. An image's depth can be adapted with a group of extra functions \iclnamespaceref{Core}. For example, the function \inlinecode{ensureDepth} \iclheaderref{Core}{Core} can be used to set an image's depth to a certain value.
\end{enumerate}
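
The interplay of channel count, format and depth described in the list above can be sketched as follows. This is a simplified illustration under stated assumptions, not ICL's actual code: \inlinecode{SketchImg}, \inlinecode{SketchFormat}, \inlinecode{SketchDepth}, \inlinecode{DepthOf} and \inlinecode{channelsOf} are hypothetical names, only two color formats and two depths are modeled, and row-major ordering within a channel is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// All names in this block are hypothetical stand-ins, not ICL declarations.
enum SketchFormat { sketchGray, sketchRGB, sketchMatrix };
enum SketchDepth { sketchDepth8u, sketchDepth32f };

// Channel count implied by a format; 0 stands for "any number of channels".
inline int channelsOf(SketchFormat f) {
    switch (f) {
        case sketchGray: return 1;
        case sketchRGB:  return 3;
        default:         return 0;
    }
}

// Maps the compile-time pixel type to a depth value that can be queried at runtime.
template <class T> struct DepthOf;
template <> struct DepthOf<unsigned char> {
    static SketchDepth get() { return sketchDepth8u; }
};
template <> struct DepthOf<float> {
    static SketchDepth get() { return sketchDepth32f; }
};

template <class T>
class SketchImg {
public:
    SketchImg(int w, int h, SketchFormat f) : m_w(w), m_h(h), m_fmt(f) {
        resizeChannels(channelsOf(f));
    }

    int getChannels() const { return static_cast<int>(m_planes.size()); }
    SketchFormat getFormat() const { return m_fmt; }

    // The depth follows from the template parameter, so there is no setter.
    SketchDepth getDepth() const { return DepthOf<T>::get(); }

    // A channel count that does not match the current format turns the
    // image into a "matrix" image.
    void setChannels(int n) {
        resizeChannels(n);
        if (channelsOf(m_fmt) != n) m_fmt = sketchMatrix;
    }

    // Setting a color format adapts the channel count if necessary.
    void setFormat(SketchFormat f) {
        m_fmt = f;
        if (f != sketchMatrix) resizeChannels(channelsOf(f));
    }

    // Assumption: pixels of one channel are stored row by row.
    T& at(int x, int y, int c) {
        return m_planes[c][static_cast<std::size_t>(y) * m_w + x];
    }

private:
    // Growing appends zero-initialized channels, shrinking drops the last
    // ones; the remaining channels keep their contents.
    void resizeChannels(int n) {
        m_planes.resize(n, std::vector<T>(static_cast<std::size_t>(m_w) * m_h, T()));
    }

    int m_w, m_h;
    SketchFormat m_fmt;
    std::vector<std::vector<T>> m_planes;
};
```

For example, a \inlinecode{SketchImg<float>} created with \inlinecode{sketchRGB} reports three channels; calling \inlinecode{setChannels(4)} on it switches the format to \inlinecode{sketchMatrix} while the values in the first three channels are preserved.
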