\documentclass[a4paper,11pt]{report}
\usepackage{color,graphicx,pslatex,rcs,gensymb,bigints,listings,rotating}
\usepackage[colorlinks,urlcolor=red,bookmarks=false,breaklinks=false]{
hyperref
}
\usepackage[skip=0.5em]{subcaption}
\usepackage[onlyrm,veryoldstyle]{kpfonts}
%\usepackage[firstpage,stamp]{draftwatermark}
\usepackage[T1]{fontenc}
% Reduce margins
\addtolength{\oddsidemargin}{-1.5cm}
\addtolength{\evensidemargin}{-1.5cm}
\addtolength{\textwidth}{3cm}
\addtolength{\topmargin}{-2cm}
\addtolength{\textheight}{3cm}
\setlength{\marginparwidth}{2cm}
\usepackage{todonotes}
\graphicspath{{figs/}}
\definecolor{lstbg}{gray}{0.9}
\definecolor{lstfg}{rgb}{0,0,0.5}
\lstset{language=C, frame=single, backgroundcolor=\color{lstbg},
% xleftmargin=5pt, xrightmargin=5pt,
% aboveskip=20pt, belowskip=20pt,
numbers=none, captionpos=b, basicstyle=\footnotesize\color{lstfg},
breaklines=true, breakatwhitespace=true, showstringspaces=false,
escapeinside={(*@}{@*)}
}
\renewcommand{\familydefault}{\sfdefault}
\newcommand{\radiance}{\textsc{radiance}}
\newcommand{\radClassic}{\textsc{radiance classic$^\textsf{\tiny TM}$}}
\newcommand{\var}[1]{$\mathit{#1}$}
\newcommand{\lit}[1]{\textit{#1}}
\newcommand{\opt}[1]{\textbf{#1}}
\newcommand{\cmd}[1]{\textit{\textbf{#1}}\/}
\newcommand{\realLife}{\textsc{RealLife$^\textsf{\tiny TM}$}}
\newcommand{\mkpmap}{\cmd{mkpmap}}
\newcommand{\rcontrib}{\cmd{rcontrib}}
\newcommand{\rtrace}{\cmd{rtrace}}
\newcommand{\rcClassic}{\cmd{rcontrib classic$^\textsf{\tiny TM}$}}
\RCS $Revision: 1.13 $
\RCS $Date: 2023/06/06 15:51:20 $
\fontfamily{jkp}\selectfont
\title{The \radiance{} Precomputed Contribution Photon Map\\
--- Technical Report ---\\[5mm]
{\fontfamily{jkpvos}\selectfont\Large
\textrm{
\textit{%
Being an Account of the final remarkable Enterprises=
and desperate Actions=, \&c, undertaken by the shipwreck'd Crew
of the founder'd CC~EASE,
in which is= describ'd a most fantastick Design,
viz. furnishing
a compleat computed Day-Light, by the Season,
to the RADIANCE Publick
}
}
}
}
\fontfamily{jkp}\selectfont
\author{
Roland Schregle (roland.schregle@\{hslu.ch, gmail.com\})\\
CC Building Envelopes and Civil Engineering\\
Lucerne University of Applied Sciences and Arts
}
\date{
Revision \RCSRevision\\
\today
}
\begin{document}
\maketitle
\begin{abstract}
The \rcontrib{} tool included with the \radiance{} lighting simulation
and rendering software suite is useful for calculating daylight
coefficients for climate-based modelling, notably to assess annual
daylight availability under representative sky conditions.
This requires ``binning'' contributions from grouped sky directions
(``patches'') and sun positions. This can be a slow process as
\rcontrib{} does not cache irradiance, resulting in redundant rays
being traced for neighbouring sensor positions.
\radiance{} includes a contribution photon mapping module to bin
contributions by tracing rays from the light sources (forward
raytracing), in contrast to \radiance's standard backward raytracing
approach from the sensors (referred to here as \rcClassic). Photon
mapping is particularly efficient to simulate redirection through
complex shading devices using data-driven BSDFs, such as prismatic
films.
This technical report documents the further development of the
\radiance{} contribution photon map to support precomputed
contributions to reduce redundancy in \rcontrib.
To this end, the contribution photon map utilises a wavelet
compression and efficient coefficient encoding to compactly
represent the large volume of data the contributions incur.
The encoded coefficients are then paged on-demand from disk
and cached to reduce the in-core footprint of the contributions
in \rcontrib.
This document focuses on the implementation details of
the photon mapping software in support of this new functionality.
\end{abstract}
\tableofcontents
% ---------------------------------------------------------------------------
%\clearpage
\chapter{Introduction}
\section{Motivation}
Daylight coefficients and contributions are used in \radiance{} workflows
to simulate and analyse annual daylight availability subject to
seasonal variations. This includes accumulating contributions or
daylight coefficients (the latter corresponding to normalised
contributions) from solar sources and sky ``patches'', the latter
corresponding to discrete subdivisions of the incident hemisphere
on the receiving surface. These coefficients or contributions are
then propagated in the scene to assess predictions of indirect
daylight availability on work planes, for example.
In a static scene, these daylight coefficients can be subsequently
scaled a posteriori with
discretised sky luminance vectors obtained either from empirical sky
models or climate data measured at the proposed site.
This process can be repeated as desired for a time-series of luminance
distributions without the need to recalculate the coefficients, since
these remain static along with the geometry. The resulting vector
calculations can be easily parallelised on modern computing platforms
for very large time-series, affording a high temporal resolution.
Calculating the coefficients at sufficiently high resolution in the
first place can, however, be computationally expensive.
\radiance{} uses the \rcontrib{} tool to accumulate contributions
or coefficients from light sources as well as arbitrary objects. It
accounts for the light transport paths contributed by a set of
user-specified objects (identified by their modifiers), optionally
discretising their incident directions into ``bins'', and outputs these
either to one or more files, or to the console. In the context of contributions,
the incident direction is that of the first ray emitted or scattered
from an object with a modifier whose contributions are sought.
The \radiance{} photon map was extended to support annual daylight
simulations using contribution photons when it became part of the
official \radiance{} software distribution
\cite{schregle-techreport-2015}. Its primary purpose was to efficiently
compute daylight coefficients for daylight redirecting components with
predominantly specular reflection or transmission.
Initial results using contribution
photon mapping indicated -- unsurprisingly -- that the number of
required photons to adequately predict lighting levels
scales linearly with the number of
timestamps / light sources \cite{schregle-cisbat-2015}.
This in turn motivated the development of an \textit{out-of-core} photon
map that maintains its photons entirely on disk, and
dynamically loads subsets of photons on demand
\cite{schregle-oocpmap-jbps-2016, schregle-techreport-2016}.
By loading photons on demand, the resident memory footprint is reduced to
those photons which actually contribute to the sensor points under
consideration. By doing so, complex annual daylight simulations with
large photon maps can be efficiently performed on commodity office PCs.
Yay.
This technical report supplements the user documentation found in the
\radiance{} photon map manual \cite{schregle-pmapManual-2022}
and \mkpmap{} and \rcontrib{} manpages. It is primarily intended
for researchers interested in extending the code, and serves as
primary documentation of the source code.
% ---------------------------------------------------------------------------
\section{Overview}
An overview of the precomputed contribution photon map toolchain is shown
in figure \ref{fig:overview}. It follows the general \radiance{} photon
mapping workflow consisting of a \mkpmap{} preprocess to generate the
photon map, and a modified \radiance{} tool which evaluates
irradiance from the photons for a given set of virtual
sensor positions (often arranged in a grid).
\begin{figure}[p]
\centering
\includegraphics[width=\linewidth]{contribpmap-overview3-crop}
\parbox{\linewidth}{%
\caption{%
\label{fig:overview}
Overview of contribution photon mapping workflow.
Green paths denote inputs (parameters and sensor positions),
while red paths denote output. Light source contributions for
modifier mod are binned using an
$\left\lfloor\sqrt{nbins}\right\rfloor^2$
Shirley-Chiu disk-to-square mapping, and wavelet compressed
for a fraction \var{precomp} of photons by \mkpmap.
These contributions are saved along with the corresponding
photons in separate files for each modifier, grouped in a
subdirectory under the parent photon map \var{pmapfile}. The
photons and their precomputed contributions are subsequently
paged on demand, uncompressed, and cached by \rcontrib,
which passes the contributions to the standard contribution
calculation used by \rcClassic.
}
}
\end{figure}
The photon map represents
a precomputed radiance distribution which can be evaluated multiple
times under the assumption that the geometry and lighting remain static.
This aspect of photon mapping amortises the expense of precomputation,
thus the payoff scales with the number of sensor positions. Contribution
photon mapping is then significantly faster than ``classic'' \rcontrib{}
in the presence of complex geometry and specular materials, giving rise
to multiple scattering events. This effect is further amplified by the
fact that the irradiance cache is disabled in \rcontrib.
In precomputed contribution photon mapping mode, \mkpmap{} is designed
to behave similarly to \rcontrib, and accepts a subset of its parameters to
``bin'' (discretise) contributions based on their incident directions.
The contributors are identified by their modifiers with the \opt{-m}
option, but unlike \rcClassic, they are restricted to light
sources only. The binned direction is the incident direction of a
photon's first interaction with a scene object after emission from a
contributing light source. This corresponds to the \textit{photon primaries}
in the earlier contribution photon mapping implementation, although these
are no longer stored as they are no longer evaluated in \rcontrib.
The workflow in figure \ref{fig:overview} can be divided into the
following stages, each comprising distinct components as indicated
in the gray boxes:
\begin{enumerate}
\item Photon map generation with \mkpmap:
This precomputes the contribution photons and compresses them
for use by \rcontrib. Under the hood, \mkpmap{} performs
the following steps:
\begin{enumerate}
\item Distribution of \var{nphotons} photons as specified with
the \opt{-apC} option.
\item Binning of photon incident directions into \var{nbins} bins
(as specified with the \opt{-bn} option) using a Shirley-Chiu
disk-to-square mapping \cite{shirleyChiu-1997}. Technically,
this already occurs on-the-fly during photon distribution.
\item Precomputation of binned contributions for a fraction
\var{precomp} of the distributed photons, as specified with
the \opt{-apP} option (this option already served the same
purpose for the existing precomputed global photon map).
This involves performing a density estimate by locating the
\var{bwidth} nearest photons in the vicinity of a precomputed
photon, and accumulating their pre-binned contributions
in the mapped Shirley-Chiu square.
\item A wavelet transform over the binned contributions,
resulting in a set of approximation and detail coefficients.
\item Compression of wavelet detail coefficients by
thresholding, i.e. discarding the (\var{comp})\% least
significant coefficients, as specified with the \opt{-apC}
option.
\item Compact encoding of thresholded wavelet coefficients
using a customised 32-bit \textit{mRGBE} format similar to
the RGBE format used for \radiance{} HDR files
\cite{ward-RGBE-1994}, with an additional coefficient index
to reconstruct the original bin ordering.
\item Saving of the precomputed photons and their corresponding
compressed wavelet coefficients; the previously distributed
photons are no longer needed and discarded.
\end{enumerate}
\item The out-of-core precomputed contribution photon map on disk:
This consists of the following files:
\begin{enumerate}
\item A \emph{parent} photon map file \var{pmapfile} as
specified with the \opt{-apC} option.
This file does not actually contain any photons, and merely
serves as a header for \cmd{getinfo}.
\item An option file \var{pmapfile.opt} containing the binning
options used to precompute the contribution photons; this is
passed to \rcontrib{} via its \opt{@} option to ensure consistent
binning when evaluating the contributions.
\item A subdirectory \var{pmapfile.rc} containing the actual
precomputed contribution photon maps.
\item Per-modifier \emph{child} photon maps \var{mod_i.pm}
for each modifier \var{mod_i} specified with the \opt{-m}
option. These use the existing out-of-core photon map format
\cite{schregle-techreport-2016} and include a companion
\var{mod_i.pm.leaf} file containing the actual photon payload.
\item Per-modifier compressed wavelet coefficients file
\var{mod_i.wvt}, organised as aggregate records of fixed size
per photon. The ordering is identical to that of the out-of-core
leaf file to facilitate on-demand-paging.
\end{enumerate}
\item Evaluation of precomputed contributions with \rcontrib:
\begin{enumerate}
\item Lookup for single closest precomputed photon at each
given sensor position, paging photons
from the out-of-core leaf file as required. This returns
a photon index $p$ corresponding to the photon's record number
in the leaf file.
\item Paging of the photon's associated mRGBE-encoded,
compressed wavelet coefficients, which are located in the
$p$-th record in the wavelet coefficient file.
\item mRGBE decoding of the wavelet coefficients to floating
point, plus coefficient index.
\item Expansion of compressed wavelet coefficients by populating
the coefficient matrix with the decoded coefficients in the
ordering indicated by their decoded indices.
\item Inverse wavelet transform to recover the original binned
contributions in the Shirley-Chiu square.
\item Caching of reconstructed contributions to hide latency
incurred by paging and reconstruction. This cache is
derived from that used by the existing out-of-core photon map,
and contains \var{ncached} pages (as specified with the
\opt{-aC} option), each containing a set of
fully decoded binned contributions for a single precomputed
photon.
\item Passing of reconstructed contributions to \rcontrib's
internal contribution lookup table, at which point control
is handed over to \textit{Don Gregorio} to produce the usual
binned contributions per sensor position according to the
specified output format.
\end{enumerate}
\end{enumerate}
Each stage and its constituents are detailed in section
\ref{sec:implementation}.
% ---------------------------------------------------------------------------
%\clearpage
\chapter{Proof of Concept}
\section{Rationale and Prototyping}
The design choices to implement the workflow in figure \ref{fig:overview}
were informed by the author's cumulative experience with wavelets.
This dates back to a first exposure during a guest lecture on Wavelet
Radiosity held by Prof. H.P. Seidel hosted by the University of Bonn in
the mid-90s \cite{stamminger-waveletRadiosity-1995}. Further exposure
followed as the 90s saw a flurry of activity in applying wavelets to
computer graphics, notably the development of the lifting scheme
\cite{sweldens-lifting-1996}, and the adaptation of wavelets to the
spherical topology \cite{schroeder-sphWavelets-1995}.
The latter publication inspired the author's first own application of
wavelets in developing a 4D BSDF compression scheme as part of an
``adjunct project'' (read: digression) to the development of the
original photon map extension at Fraunhofer ISE.
This unfinished work served as a proof of concept, but was shelved and
never formally published. However, it was retrospectively
presented at the \radiance{} workshop almost a decade later as part
of the short-lived%
\footnote{Extremely short-lived, in fact, since this was the only
instalment!
}
lecture series, \lit{Secret Weapons of RADIANCE:
Stuff That Never Took Off} \cite{schregle-bsdfComp-2011}.
It was this initial, hands-on experience
with wavelets that made the author aware of their inherent power as
an analysis and compression tool.
Some two decades later, the author's involvement in the SNSF-funded
project \lit{Lightfields for glare assessment}
rekindled his interest in wavelets.
This was chiefly due to their use by his very talented
colleague, Stephen Wasilewski, to adaptively sample spatio-temporal
luminance distributions to represent a lightfield.
This sampling scheme was based on the
\cmd{pywavelets} Python module and led to the development of the
\lit{Raytraverse} tool \cite{wasilewski-raytraverse-2021}.
When a proposition was therefore made within the framework of the project
to extend the existing contribution photon map with precomputed
contributions, an application of wavelets was immediately apparent.
The availability of \cmd{pywavelets} as a convenient prototyping tool
quickly prompted the development of a concept and initial experimentation
to prove the feasibility of the proposition.
A detailed account of the initial concept that led to the development
of the precomputed contribution photon map can be found -- along with
choice expletives documenting frustrating dead ends for a dose of realism
-- in a separate proposal document
\cite{schregle-preCompContribPmapProposal-2021}.
It should be acknowledged that during the development of precomputed
contributions, a number of existing -- and some \emph{very} old -- bugs
were discovered in various places in the photon mapping code.
This work therefore not only extended the photon map's functionality,
but also made the code more robust in general.
% ---------------------------------------------------------------------------
\section{Initial Compression Tests with \cmd{pywavelets}}
\label{sec:waveletPoC}
An initial back-of-the-envelope calculation made it clear that the
storage required for precomputed contributions would be prohibitive
even for current office workstations and laptops; 10M precomputed
photons (a realistic number for a complex simulated environment) with
32-bit RGBE-encoded contributions in 2305~bins (corresponding to a
Reinhart~MF:4 mapping \cite{bourgeoisReinhart-2008}) would occupy
92.2~GB on disk.
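This figure follows directly from the product of photon count, bin count,
and bytes per RGBE-encoded contribution:
\[
   10^7 \times 2305 \times 4~\mathrm{bytes}
   \approx 9.2\times 10^{10}~\mathrm{bytes} \approx 92.2~\mathrm{GB}.
\]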
Note that this calculation omits the storage occupied by the photons
themselves, which would amount to a relatively insignificant 240~MB.
This simple example makes it clear that an effective compression scheme
is needed.
A quick statistical analysis of contributions from test scenes
immediately ruled out a na\"ive run-length encoding of
contiguous empty (zero) contributions. Instead, a wavelet compression
scheme was favoured based on prior experience.
To assess viable candidate compression techniques prior to hard-core
C coding, preliminary tests were conducted with a simple Python prototype
(\cmd{wavelet\_test.py}) using Filip Wasilewski's excellent
\cmd{pywavelets} (\cmd{pywt}) package \cite{Lee2019}.
The \rcontrib{} tool assumes a linear ordering of bins from
what is essentially a 3D domain (hemisphere of incident directions)
which, due to dimensional redundancy, can be reduced to a 2D
domain ($\theta, \phi$), and in turn serialised to a 1D domain,
as is the case with the popular Reinhart mapping.
This is inherently a compromise, since the original topology,
and therefore any multidimensional correlation in the contributions,
is then lost.
In an initial proof-of-concept, max-normalised scalar contributions from
a uniform sky in a bilaterally lit test scene with skylight were
binned using a Reinhart MF:4 mapping (2305 bins) and passed to a 1D
wavelet transform. The resulting wavelet coefficients were then
thresholded against a user-specified value to achieve what amounts to
a lossy compression, effectively setting the thresholded coefficients
to zero. Finally, the thresholded coefficients were reconstructed by
applying the inverse wavelet transform, and their deviations assessed
relative to the original.
The initial results revealed that the type of wavelet used (and
\cmd{pywt} offers many!) has a less significant impact on the
reconstruction error compared to the linear ordering of the coefficients.
In their original order, localised peaks in the linearised
contributions were significantly attenuated with high compression
(typically 90--98\% in these tests).
Ordering the contributions by their value to obtain a monotonically
increasing signal significantly reduced these artefacts, as it effectively
imposes correlation on the input. Similar tests were also conducted
with solar sources, which introduced higher dynamic range, more
pronounced peaks, and major artefacts without sorting (see figure
\ref{fig:solarcont-mf4}).
\begin{figure}[htb]
\centering
\includegraphics[width=0.45\linewidth]{%
solarcont-mf4-db4-90%
}
\includegraphics[width=0.45\linewidth]{%
solarcont-mf4-sort-db4-98-detail%
}\\
\parbox{0.9\linewidth}{%
\caption{%
\label{fig:solarcont-mf4}
Daubechies wavelet transform of Reinhart MF:4 mapped solar
contributions (2305 bins) after thresholding 90\% of coefficients.
In their original linear ordering (left), the binned contributions
exhibit isolated discontinuities (peaks) which are poorly
decorrelated by the
wavelet transform, resulting in obvious compression artefacts.
Ordering the contributions by increasing magnitude (right)
imposes correlation and significantly reduces these artefacts.
This reordering
however complicates the reconstruction of the contributions,
and doesn't decorrelate the contributions in their original
hemispherical domain.
}
}
\end{figure}
Other 1D bin orderings using 2D space-filling curves such as Morton codes
or Hilbert indices yielded similarly disappointing results with peaky
contributions.
In addition, sorting the contributions complicates their reconstruction,
as this requires storing the original bin order.
Furthermore, this still doesn't decorrelate the contributions
in their original 2D hemispherical domain, which results in suboptimal
compression. A 1D wavelet compression was therefore abandoned in
favour of a more elaborate 2D mapping of the hemisphere, and a wavelet
transform in this domain.
Furthermore, it was desirable not to limit the number of bins to a
power of two. This requires extending the contributions at the boundaries,
as described in section \ref{sec:boundaryExt}. Boundary extensions in
the context of wavelet transforms are poorly documented in the literature;
consequently, a detailed analysis of the \cmd{pywt} C source code was
conducted in order to obtain an insight into how the different boundary
extension modes offered by the module are implemented, notably in a
multi-dimensional context. This revealed the necessity of introducing
additional \textit{padding coefficients} at the boundaries, consequently
increasing the number of coefficients beyond the number of bins.
Because it is so comprehensive and highly
optimised, the \cmd{pywt} source code became the primary basis for the
wavelets module that was developed for \radiance{}, which is described in
section \ref{sec:wavelet3.c}.
% ---------------------------------------------------------------------------
\section{Initial Wavelet Coefficient Encoding Tests}
\label{sec:mrgbePoC}
An initial analysis of the dynamic range of normalised, scalar wavelet
coefficients generated by the test script using \cmd{pywt} indicated
that the majority of these spanned ca. 10 orders of magnitude (see
figure~\ref{fig:coeffRange}).
\begin{figure}[htb]
\centering
\includegraphics[width=0.9\linewidth]{skycont-mf4-sort-db4-coeffs}
\parbox{0.9\linewidth}{%
\caption{%
\label{fig:coeffRange}
Logarithmic plot of absolute wavelet coefficient magnitudes after
normalisation. The coefficients (magenta crosses) are clustered
in bands of increasing frequency from left to right, with the
number of coefficients doubling in consecutive bands due to the
wavelet transform's multiresolution analysis. In this example,
the dynamic range of the coefficients is limited to ca. 10 orders
of magnitude. It is also evident that the coefficients at the
bottom between $10^{-19}$ and $10^{-20}$
are negligible and can be omitted. It follows therefore that
it suffices to encode coefficients up to the green line at ca.
$4.6\times10^{-10}$; this corresponds to an effective range of
$[2^{-31}, 1]$, which can be covered by a 5-bit binary exponent.
In practice, thresholding the coefficients will further reduce this
dynamic range.
}
}
\end{figure}
By clamping the maximum absolute
coefficient magnitude to 1, it is possible to map the coefficients as
3-tuples to a normalised range and encode these as integer RGB
mantissae%
\footnote{Yeah, that \emph{is} the plural.}
with a common exponent, thus accounting for 3 colour channels, as is done
with \radiance's 32-bit RGBE format \cite{ward-RGBE-1994}.
Furthermore, because the coefficients are thresholded, their minimum
absolute magnitude is likewise bounded. This insight, coupled with the fact
that the coefficients in the range $10^{-19}$ to $10^{-20}$ are clearly
insignificant, indicated the viability of an encoding with limited
precision. This encoding, which would require pre-normalising the
coefficients and storing corresponding independent normalisation factors
per colour
channel, would also need to accommodate a coefficient index to indicate
which coefficients remain after thresholding. This payload data would
be incorporated into the modified RGBE encoding (or mRGBE, as we like
to call it%
\footnote{
Actually, it can stand for \{micro, modified, minimalist\} RGBE;
take your pick.
It just has to sound cool nowadays; style over substance, you know.
iRGBE would've been trendy too, but probably infringes on some
trademark from Cupertino.
}) as a dedicated field of fixed width, which of course limits the
number of bits allocated to the RGB mantissae and the exponent, thus
inherently reducing
the precision compared to \radiance's native RGBE, which allocates
8 bits to each field.
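For illustration, the following minimal sketch packs three signed 4-bit
mantissae, a shared 5-bit (implicitly negative) exponent, and a 12-bit
payload into a 32-bit word as outlined above. It merely stands in for the
actual \cmd{mrgbe.c} module described in section \ref{sec:mrgbe.c}; the
field layout, rounding, and normalisation used there may well differ.
\begin{lstlisting}
#include <math.h>
#include <stdint.h>

/* Hedged sketch of a 32-bit mRGBE-style encoding: 3 x (4-bit mantissa
   + 1 sign bit), a shared 5-bit implicitly negative exponent, and a
   12-bit payload (e.g. a coefficient index). Illustration only. */
uint32_t mrgbeEncode (const double rgb [3], unsigned payload)
{
   double   absMax = 0;
   int      c, e = 0;
   uint32_t enc = ((uint32_t)payload & 0xfff) << 20; /* 12-bit payload */

   for (c = 0; c < 3; c++)
      if (fabs(rgb [c]) > absMax)
         absMax = fabs(rgb [c]);

   frexp(absMax, &e);      /* absMax = f * 2^e, f in [0.5, 1) */
   if (e > 0)
      e = 0;               /* inputs assumed clamped to [-1, 1] */
   if (e < -31)            /* below representable range: encode zero */
      return enc;

   for (c = 0; c < 3; c++) {
      unsigned m = (unsigned)(fabs(rgb [c]) * ldexp(16.0, -e));
      if (m > 15)
         m = 15;                            /* 4-bit mantissa */
      m |= (rgb [c] < 0) << 4;              /* 1 sign bit     */
      enc |= (uint32_t)m << (5 * c);        /* channels at bits 0/5/10 */
   }
   return enc | ((uint32_t)(-e) << 15);     /* 5-bit exponent, bits 15-19 */
}
\end{lstlisting}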
\begin{figure}[!htb]
\centering
\includegraphics[width=0.8\linewidth]{%
skycont-mf4-sort-db4-bincoeffs%
}\\
\includegraphics[width=0.8\linewidth]{%
skycont-mf4-sort-db4-bincoeffs-detail%
}\\
\parbox{0.9\linewidth}{%
\caption{%
\label{fig:mrgbeTest}
Logarithmic plot of wavelet coefficient range in figure
\ref{fig:coeffRange} (top, detail inset at bottom) with
encoded/decoded mRGBE coefficients superimposed as blue circles.
Despite the limited precision
(4 RGB mantissa bits + 1 sign, 5 exponent bits), the mRGBE
encoding correlates well with the original floating point values
within the constrained dynamic range $[2^{-31}, 1]$ covered by
the exponent.
}
}
\end{figure}
A prototype mRGBE encoding/decoding script (\cmd{bincoeff\_test.py},
importing routines from \cmd{wave\-let\_test.py}) was therefore developed in
Python to check the accuracy (read: loss of precision) of the proposed
encoding given the previous practical sample data. Figure
\ref{fig:mrgbeTest} shows initial results of an mRGBE encoding using
5 bits per RGB mantissa (including 1 bit each for the sign) and the
exponent (implicitly negative), leaving 12 bits for the payload data,
which was not used in this test. Despite the limited precision, the
encoded and decoded coefficients exhibit remarkable coherence with the
original floating point values. However, it should be noted that the
RGB colour channels are identical in this test; although highly saturated
luminance is unlikely in realistic scenarios, higher deviations were
expected when the R, G, and B channels differ. This was further
investigated and quantified in the unit test of the C module
developed on the basis of the Python prototype, which is detailed in
section \ref{sec:mrgbe.c}.
% ---------------------------------------------------------------------------
\section{Design Goals}
The conclusions drawn from the initial tests helped set the design goals
for the precomputed contribution photon map. These included:
\begin{itemize}
\item A fixed binning of photon incident directions to a square matrix
using the area-preserving Shirley-Chiu disk-to-square mapping
which enables decorrelating the contributions in 2D in
a subsequent wavelet transform.
\item Compression in 2D via a computationally lightweight 4-tap
(having a support of 4 samples) wavelet transform, the Daubechies
DB2 wavelet being a popular choice.
\item An arbitrary number of bins (not just powers of 2). This requires
boundary extension of the contributions during the wavelet
transform.
\item Reuse or adaptation of existing photon map code where
possible; this included modifying the existing out-of-core photon
cache, the original contribution photon distribution routines,
the precomputation routines for global photons,
photon lookup routines, out-of-core caching routines, and
routines interfacing to \rcontrib.
We doan' need to reinvent da wheel.
\item Compact encoding (mRGBE) of 3-tuple wavelet coefficients and
their indices in a 32-bit envelope in a separate wavelet
coefficient file, with at least 5 bits per RGB mantissa.
\item Thresholding of wavelet coefficients by dropping a fixed number;
this results in a fixed size of compressed coefficients per
photon, which facilitates directly accessing contribution records
in the wavelet coefficient file.
\item Binning settings in the precomputed photon map override those
in \rcontrib{} when the photon map is loaded; these are passed
via an option file generated by \mkpmap.
\item Out-of-core storage of photons and their wavelet coefficients,
with on-demand paging in \rcontrib,
as already done by the standard out-of-core photon map.
Consequently, the in-core kd-tree data structure is no longer
supported with contribution photons.
\item Caching of reconstructed contributions after mRGBE decoding and
inverse wavelet transform in \rcontrib.
\end{itemize}
% ---------------------------------------------------------------------------
%\clearpage
\chapter{Implementation}
\label{sec:implementation}
The following sections present implementation details of each
component of the precomputed contribution photon map workflow
in reference to figure \ref{fig:overview}.
Each section refers to the associated source code modules
and relevant functions. The reader is referred to
the software architecture graphs in section \ref{sec:swarch} in the
appendix
for an overview of the modules, how they are embedded within the
\rcontrib{} framework, and how they interact.
The exposition also frequently refers to data types defined in the source
code, which are summarised in listings \ref{lst:codeDefs1},
\ref{lst:codeDefs2}, and \ref{lst:codeDefs3} for the interested reader.
% ---------------------------------------------------------------------------
\section{Contribution Photon Generation \& Precomputation with
\mkpmap}
Precomputed contribution photon maps are generated with the \mkpmap{}
tool. This entails distributing (and binning) the photons,
precomputing their binned contributions by density estimation,
compressing the resulting contributions via wavelet transform (handling
potential boundary artefacts), and encoding the wavelet coefficients.
\subsection{Contribution Photon Distribution}
Contribution photons have their own photon distribution routine,
\cmd{pmcontrib2:distribPhotonContrib()}.
It differs from the standard photon distribution routine
\cmd{pmap:distribPhotons()} in that each source contributes
(approximately) the same number of photons
(i.e. $nemit \approx nphotons \: /\, nsrc$, where $nsrc$ is the number of
contributing sources).
This measure is intended to
balance the density of photons contributed by each source whose
contributions are sought (as specified with the \opt{-m} option to
\mkpmap).
As in the standard photon
distribution routine, the number of photons to emit is adjusted
to obtain the desired target photon count
by emitting a fraction of the photons in a prepass, and then
extrapolating this
in proportion to the stored photons vs. the total target count.
Since the number of photons emitted from each source no longer correlates
with its emitted flux, as is the case with \cmd{distribPhotons()},
the resulting flux per photon must be adjusted individually for
each source to compensate for bias, at the expense of increased variance.
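As a minimal illustration of this extrapolation, the following sketch
derives a per-source emission count from a hypothetical prepass; it
follows the description above rather than the actual
\cmd{distribPhotonContrib()} code, and all names are illustrative.
\begin{lstlisting}
/* Hedged sketch: extrapolate the number of photons to emit for one
   source from a prepass, so that each source contributes approx.
   nPhotons / nSrc stored photons. */
unsigned long extrapolateEmission (
   unsigned long nPhotons,    /* global target photon count */
   unsigned      nSrc,        /* number of contributing light sources */
   unsigned long preEmit,     /* photons emitted in the prepass */
   unsigned long preStored    /* photons stored during the prepass */
)
{
   const double targetCnt = (double)nPhotons / nSrc;

   if (!preStored)
      return 0;      /* source yields no stored photons; skip it */

   /* Scale the emission count in proportion to the prepass yield */
   return (unsigned long)(targetCnt / preStored * preEmit + 0.5);
}
\end{lstlisting}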
% ---------------------------------------------------------------------------
\subsection{Binning of Contributions}
\label{sec:binning}
When a contribution photon is emitted, its \textit{contribution source}
is temporarily set in the photon map's \cmd{PhotonMap.lastContribSrc}
field, consisting of a 16-bit integer
source index and bin. The latter is obtained by calling
\cmd{pmapcontrib:ray2bin()} via \cmd{pmapcontrib:contribSourceBin()},
which performs a
Shirley-Chiu disk-to-square mapping (see figure \ref{fig:shirleyChiu})
of the photon's emitted direction
vector.
\begin{figure}[htb]
\centering
\includegraphics[width=0.8\linewidth]{shirley-chiu2-crop}\\
\parbox{0.9\linewidth}{%
\caption{%
\label{fig:shirleyChiu}
Shirley-Chiu disk-to-square mapping used to bin
contribution photon directions. This mapping has the
desirable property of preserving adjacency and fractional area.
The disk is obtained by projecting the
hemisphere of photon directions onto the plane defined by the
surface normal $[rNx,\: rNy,\: rNz]$. The polar angle origin
$\phi = 0$ in the disk is defined by the up vector
$[Ux,\: Uy,\: Uz]$.
}
}
\end{figure}
The disk coordinates are obtained by projecting the photon's direction
onto the disk plane as defined by its normal, $[rN_x,\: rN_y,\: rN_z]$
(see below). The mapped square coordinates $[sc_x,\: sc_y]$
are, in turn, mapped to a linear bin
\begin{equation}
b_p = sc_x \cdot scdim + sc_y,
\label{eq:linearIdx}
\end{equation}
where $scdim = \left\lfloor\sqrt{nbins}\right\rfloor$ is the dimension
of the Shirley-Chiu square,
and is derived from the number of bins specified to \mkpmap{}
with the \opt{-bn} option. If the latter is not an integer square,
it will be rounded down to the nearest such number by \mkpmap.
If the Shirley-Chiu mapping fails (e.g.\ if the photon's direction
lies in the plane or is incident from the back), the bin is set to
-1 as invalid. This mapping is fixed for contribution photons and
cannot be modified by the user, as the resulting square topology
is immediately applicable to a 2D wavelet transform in matrix form.
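As an illustration of equation \ref{eq:linearIdx}, the following sketch
discretises continuous Shirley-Chiu square coordinates (assumed here to
lie in $[0, 1)^2$) into an \var{scdim} $\times$ \var{scdim} grid and
derives the linear bin; this is not the actual \cmd{ray2bin()} code, and
the names are illustrative.
\begin{lstlisting}
#include <math.h>

/* Hedged sketch: map Shirley-Chiu square coords to a linear bin;
   returns -1 if the mapping failed (invalid bin). */
int squareCoordsToBin (const double sqCoord [2], unsigned nbins)
{
   const unsigned scdim = (unsigned)floor(sqrt((double)nbins));
   unsigned scx, scy;

   if (sqCoord [0] < 0 || sqCoord [0] >= 1 ||
       sqCoord [1] < 0 || sqCoord [1] >= 1)
      return -1;              /* mapping failed: invalid bin */

   /* Discretise square coords in [0,1) to cell indices in [0, scdim) */
   scx = (unsigned)(sqCoord [0] * scdim);
   scy = (unsigned)(sqCoord [1] * scdim);

   return scx * scdim + scy;  /* linear bin index b_p */
}
\end{lstlisting}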
\cmd{ray2bin()} evaluates the following variables to reorient the
disk-to-square mapping:
\begin{itemize}
\item{$RHS$}: 1 for right-handed coordinate system
(default), -1 for left-handed.
\item{$rNx,\: rNy,\: rNz$}: disk plane surface normal
(defines $\theta = 0$, default [0, 0, 1])
\item{$Ux,\: Uy,\: Uz$}: up vector (defines $\phi = 0$,
default [0, 1, 0])
\end{itemize}
The contribution source is set speculatively for each emitted photon,
in the anticipation that its path will lead to stored photons. If the
path spawns no photons (e.g. because the photon was backscattered
from a port and left the scene, or because the materials have zero
reflectance), the contribution source is simply
discarded and overwritten by the next emitted photon.
On the other hand, when a photon is created, the photon's contribution
source field \cmd{Pho\-ton.\-aux.\-con\-trib\-Src} is set from
\cmd{PhotonMap.lastContribSrc}, which remains constant for the photon
path until the next photon is emitted. Note that \cmd{Photon.aux} is
designated as an auxiliary data field which is specific to the photon type.
It is defined as a union structure and can also store a unique photon
path ID for regular photons, and a photon's time of flight (expressed
as distance travelled) for transient photons.
The contribution sources stored with each photon are subsequently
accumulated in bins when the contributions are precomputed, as
detailed in the next section.
% ---------------------------------------------------------------------------
\subsection{Precomputation of Contribution Photons}
\label{sec:precomp}
Once the approximate target number of photons \var{nphotons} has been
reached, a fraction \var{precomp} of these (specified with the \opt{-apP}
option to \mkpmap) is drawn uniformly at random as candidates for
precomputation, thus preserving the relative distribution of photons in
the scene. The entire precomputation is wrapped by the routine
\cmd{pmapcontrib:preComputeContrib()}.
The contributions for a candidate precomputed photon
are collected by function
\cmd{pmap\-con\-trib:\-get\-Pho\-ton\-Con\-trib()}. This
routine extracts the candidate photon's emitting light source from its
contribution source field \cmd{Photon.aux.contribSrc.sourceIdx},
from which, in turn, the source modifier is obtained. If the modifier is
among those whose contributions are sought, it is passed to the
standard photon lookup routine \cmd{pmapdata:findPhotons()}, which in
turn calls the lookup routine specific to the out-of-core data structure
(sorry, no OOP-style overloading here), \cmd{pmapooc:OOC\_FindPhotons()}.
This routine performs a nearest neighbour search for the \var{bwidth}
closest photons. It also sets a filter of instance
\cmd{pmapfilt:OOC\_SearchFilterState}, which is passed to a callback
function \cmd{pmapfilt:filterPhoton()} called by the lookup routine
to accept/reject photons on the fly based on the filter's criteria.
This function ordinarily tests only photon normals to reject photons
incident from backfaces. However, with contribution photons, the
filtering callback further adds the constraint that the photons are
emitted from a given light source modifier. This ensures the
lookup collects only contributions for the same modifier as the candidate
precomputed photon.
If the photon search is successful, \cmd{getPhotonContrib()} performs
a modified density estimate of the photons, accumulating their
contributions (obtained from their \cmd{Photon.flux} field) in
the preassigned bins indicated by each photon's contribution source
field \cmd{Photon.aux.contribSrc.sourceBin} (this incidentally also
includes the candidate precomputed photon's own binned contribution).
The binned contributions are then divided by the area of the disk
intercepted by the search volume, with radius corresponding to the
maximum found photon distance.
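As a minimal illustration of this final normalisation, the following
sketch divides the binned contribution sums by the area of the
intercepted disk; it mirrors the description above rather than the actual
\cmd{getPhotonContrib()} code, and the names are illustrative.
\begin{lstlisting}
#include <math.h>

/* Hedged sketch: normalise binned RGB contribution sums by the area
   of the disk intercepted by the photon search volume. The parameter
   maxPhotonDist2 is the squared distance to the furthest photon found,
   hence the intercepted disk has area pi * maxPhotonDist2. */
void normaliseBinnedContribs (float binnedContrib [][3], unsigned nBins,
                              double maxPhotonDist2)
{
   const double invArea = 1 / (M_PI * maxPhotonDist2);
   unsigned b, c;

   for (b = 0; b < nBins; b++)
      for (c = 0; c < 3; c++)
         binnedContrib [b][c] *= invArea;
}
\end{lstlisting}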
The precomputation routine allocates a lookup table (\cmd{struct LUTAB},
defined in the stock \radiance{} module \cmd{lookup.c}) of per-modifier
\emph{child} photon maps, and assigns it to the \emph{parent} photon map's
\cmd{PhotonMap.preCompContribTab} field. This enables the parent photon
map to act as a container for the per-modifier child photon maps, with
the actual precomputed photons stored in the latter. The modifier name
then acts as the LUT key, as is also the case with \rcontrib's
contribution table \cmd{rcontrib:modconttab}.
The precomputation routine also allocates a container
\cmd{PhotonMap.preCompContrib} of type \cmd{PreComputedContrib} in
each child photon map (see listing \ref{lst:codeDefs2}, which serves
as scratch space for the subsequent wavelet transform and compression.
This includes an instance of \cmd{wavelet2.h:WaveletMatrix2}, which
is a 2D array of floating point 3-tuples, corresponding to the
RGB wavelet coefficients in each colour channel.
In addition, the wavelet transform requires a second, transposed
matrix for intermediate results, which is also preallocated
and initialised to 0 prior to every transform.
Note that these data structures are allocated and dimensioned separately
per modifier (which is unique to each child photon map), as the binning
parameters may differ for each.
The preallocated data structure container of type \cmd{PreComputedContrib}
is then passed to the main compression/encoding routine
\cmd{pmapcontrib:encodeContribs()} along with the user-specified
compression ratio \var{comp}.
This routine performs the actual wavelet transform and
subsequent coefficient thresholding.
% ---------------------------------------------------------------------------
\subsection{2D Wavelet Transform}
\label{sec:wavelet3.c}
Wavelets are a science unto themselves and a very broad topic that is
mostly opaque for non-ma\-the\-ma\-ti\-ci\-ans and even computer
scientists.
Once one has grasped the fundamental concept of wavelets, however, the
theory is relatively intuitive, if far from simple. The reader is
referred to \cite{Graps:1995} for an excellent introduction with
notable applications.
Much like the Fourier transform, the wavelet transform essentially
represents a given input signal as a series of sums of scaled
(in terms of amplitude) basis functions that have been dilated and
translated along the signal's propagation axis.
These functions are referred to as the actual wavelets, though in
practice they aren't necessarily always undulating (see that funny
squiggly thing in figure \ref{fig:waveletFunc} for an example).
Unlike Fourier
basis functions, wavelets have a finite support, i.e. a defined
non-zero region of influence. Based on this principle, it is
possible to decompose a given signal into a set of coefficients
(the scaling terms) at different dilations and translations.
The dilations and translations can be considered sliding windows
depicting the signal at varying zoom levels, or resolutions.
Consequently, wavelets are a form of \textit{multiresolution
analysis}.
Signals can be decomposed by a wavelet transform in arbitrary
dimensions, though applications in 1 and 2 dimensions are most common
(with image processing and compression being notable applications of
the latter).
The precomputed contribution photon map decomposes the pre-binned
contributions in their native 2D domain to optimally leverage
corellation along each axis of the Shirley-Chiu square, effectively
treating it as a square matrix.
Once the contributions for each precomputed photon have been accumulated,
they are transferred as input to the preallocated wavelet matrix. In the
general case, this requires a \cmd{memcpy()} of each matrix row from the
linear contribution array, effectively reverting the linearisation
that follows the Shirley-Chiu mapping.%
\footnote{In the special case where input and output sizes of the
transform are the same, i.e. when no padding occurs, the matrix
rows (which are \textit{Iliffe} vectors, or pointers to arrays) can
simply be set up to map directly into the corresponding segments in
the linear contribution array, obviating the need to push data about.
Obviously this is much more efficient. This can be done in
conjunction with the specialised 2D wavelet transform function
\cmd{wavelet2:waveletXform2()}, which only operates on input
sizes of powers of 2 and does not generate padding coefficients.
However, due to the major restriction this imposes on the input size
(and the number of bins), this function is not used by the precomputed
contribution photon map.
}
The routine \cmd{pmapcontrib:encodeContribs()} calls the full 2D
wavelet transform wrapper routine \cmd{wavelet3:padWaveletXform2()}.
This routine, in turn, calls \cmd{padD2Step2()} to perform one pass of the
transform along a fixed (horizontal) axis. The latter function accepts
an output matrix of the same size as the input, and transposes the
resulting coefficients from the iteration ``on the fly'' as they are
written to the output matrix.
With the input and output matrices now swapped, the next iteration of
this function will perform another horizontal pass over the transposed
matrix, which is tantamount to a vertical pass over the original matrix.
With the output matrix now transposed a second time, the original
orientation is obtained, and the matrices swapped a second time.
Consequently, each pair of invocations of a transform step constitutes a
complete horizontal/vertical transform pass at a given resolution (see
figure \ref{fig:waveletStep}).
\begin{figure}[htb]
\centering
\includegraphics[width=0.7\linewidth]{Daubechies4-functions}
\parbox{0.9\linewidth}{%
\caption{\label{fig:waveletFunc}
Daubechies DB2
``4-tap'' wavelet function. The scaling function
(blue) decorrelates the input signal as approximation
coefficients. The wavelet function (red) -- itself orthogonal
to the scaling function -- decorrelates the input as detail
coefficients. Note the asymmetry of this wavelet family. This
wavelet has a support of 4 adjacent samples; it trades poorer
decorrelation for computational efficiency and reduced boundary
effects compared to wavelets with larger supports, which decorrelate
better, but at higher computational expense.
}
}
\end{figure}
\begin{figure}[p]
\centering
\includegraphics[width=0.7\linewidth]{%
wavelet3-test-16x16-steps-reaspect-crop%
}\\
\parbox{\linewidth}{%
\caption{\label{fig:waveletStep}
Sample output of wavelet unit test for 16$\times$16 bins,
showing the first two resolution levels of the 2D wavelet
transform,
each consisting of transform steps over the horizontal and
vertical axes. The output of each transform step
(right of arrows)
becomes the input for the next step (left of arrows).
Starting with the original input samples $y_{i,j}$, each step
generates a set of approximation coefficients $s_{i,j}$ (red
arrows) and detail coefficients $d_{i,j}$ (green arrows). To
alternate the transform axes, the implementation transposes the
output matrix on the fly (note the reversed output coefficient
indices), consequently the transform need only be performed
along one axis, which simplifies indexing; the original matrix
orientation is then restored every two steps. After each
iteration (horizontal/vertical transform pair), the
approximation coefficients $s(\ldots (s_{i,j})\ldots ) =
(s_{i,j})^k$ in the upper left of the output matrix become the
input for the next (halved) resolution level. The sizes of the
progressively smaller output submatrices (indicated by square
brackets) include padding coefficients. Consequently, the
output matrix leaves unoccupied cells (indicated by dots) to
accommodate these additional coefficients, which accumulate in
the upper left of the matrix in subsequent resolution levels.
}
}
\end{figure}
Each transform pass reduces the number of output
coefficients, thereby yielding a coarser representation of the
original input signal, corresponding to a lower frequency band. This
is termed
\textit{multiresolution analysis} and is a key characteristic of
the wavelet transform, namely that it decorrelates the input signal
at various resolutions, or frequency bands (in filtering parlance).
The alternating transposition of the matrix effects a decorrelation
along the horizontal and vertical axes.
\begin{figure}[htb]
\centering
\includegraphics[width=0.75\linewidth]{%
wavelet3-test-16x16-full-reaspect-crop%
}
\parbox{0.9\linewidth}{%
\caption{\label{fig:waveletCoeffsFull}
Sample output of wavelet unit test for 16$\times$16 bins,
showing the wavelet coefficient matrix after a full transform.
The coloured fields identify the coefficient type (
red = approximation $s$, green = detail $d$) and the successive
transform steps, from right to left. Each iteration generates
coefficients from those of prior iterations at higher resolutions:
approximations of prior details $sd$ (lower left),
details of prior approximations $ds$ (upper right),
details of prior details $dd$ (lower right),
approximations of prior approximation $ss$ (upper left).
Each subsequent iteration then recurses into the upper left
submatrix, using the approximations $ss$ as
increasingly coarse representations of the original contributions
as input. After the final iteration, the red submatrix in the
upper left corner contains the 3$\times$3 coarsest approximations.
}
}
\end{figure}
The wavelet transform projects the input signal $y_{i,j}$ onto a pair of
orthogonal basis functions specific to each wavelet family (see figure
\ref{fig:waveletFunc} for a simple example from the Daubechies family%
\footnote{The naming convention within each wavelet family is not firmly
established. The Daubechies wavelet in figure \ref{fig:waveletFunc}
is either denoted \textit{DB2}, referring to its 2 vanishing moments,
or \textit{D4}, referring to its support of 4 (having 4 ``taps'',
in filter parlance).
The \lit{pywavelets} module adheres to the former convention,
consequently so does the module \cmd{wavelet3.c}.
}),
termed the \textit{scaling function} $\Phi_k(j)$ and the \textit{wavelet
function} $\Psi_k(j)$, where $k$ is the resolution (or dilation) of the
function, and $j$ is its position along the input signal's transform axis.
Unlike Fourier basis functions, wavelets have finite support and are
localised in space, i.e. they are only non-zero for a finite subset of
sample positions $j$. Larger supports decorrelate over more samples and
therefore offer improved compression (smaller coefficients), but at
higher computational expense.
The scaling and wavelet functions $\Phi_k(j)$ resp. $\Psi_k(j)$
decompose the $k$-th input matrix along a fixed axis $i$ (matrix row or
column) into a pair of approximation coefficients $s^k_{i,j}$ and
detail coefficients $d^k_{i,j}$. The input matrix is, itself, the set of
approximation coefficients $s^{k-1}_{i,j}$ from the previous iteration.
The set of approximation coefficients $s^k$ always represent the
signal at half the resolution compared to those from the previous
iteration, $s^{k-1}$. In the initial iteration $k=0$, the approximations
are equal to the original input signal: $s^0_{i,j}$ = $y_{i,j}$.
For a 4-tap function as shown in figure \ref{fig:waveletFunc}, the
$k$-th iteration of the wavelet transform is as follows:
\begin{eqnarray}
\label{eq:d2FwdXform-s}
s^k_{i,j} &=& h_0\ s^{k-1}_{i,2j} \ +\ h_1\ s^{k-1}_{i,2j+1}\ +\
h_2\ s^{k-1}_{i,2j+2}\ +\ h_3\ s^{k-1}_{i,2j+3}\\
\label{eq:d2FwdXform-d}
d^k_{i,j} &=& g_0\ s^{k-1}_{i,2j} \ +\ g_1\ s^{k-1}_{i,2j+1}\ +\
g_2\ s^{k-1}_{i,2j+2}\ +\ g_3\ s^{k-1}_{i,2j+3},
\end{eqnarray}
where $h_j$ and $g_j$ are constants obtained by evaluating
$\Phi_k(j)$ and $\Psi_k(j)$ at the corresponding positions of the input
signal samples that fall within the functions' support.
For the 4-tap Daubechies wavelet, these are:
\begin{equation}
\label{eq:d2-hCoeffs}
h_0 = \frac{1 + \sqrt{3}}{4\sqrt{2}} ;\quad
h_1 = \frac{3 + \sqrt{3}}{4\sqrt{2}} ;\quad
h_2 = \frac{3 - \sqrt{3}}{4\sqrt{2}} ;\quad
h_3 = \frac{1 - \sqrt{3}}{4\sqrt{2}}
\end{equation}
\begin{equation}
\label{eq:d2-gCoeffs}
g_0 = h_3 ;\quad
g_1 = -h_2 ;\quad
g_2 = h_1 ;\quad
g_3 = -h_0
\end{equation}
Note that the transform axis index $j$ is doubled on the RHS of
equations \ref{eq:d2FwdXform-s} and \ref{eq:d2FwdXform-d};
this is because each iteration of the
transform reduces the number of approximation coefficients compared to the
input, yielding the coarser approximation.
Note also that, for the sake of clarity, equations \ref{eq:d2FwdXform-s}
and \ref{eq:d2FwdXform-d}
omit the on-the-fly transposition shown in figure \ref{fig:waveletStep};
the actual implementation swaps the indices $i, j$ on the LHS
during assignment.
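For illustration, the following sketch performs a single transform step
along one row according to equations \ref{eq:d2FwdXform-s} and
\ref{eq:d2FwdXform-d} for a scalar (single-channel) signal; it uses a
simple constant extension at the right boundary and omits the padding
coefficients and on-the-fly transposition performed by the actual
\cmd{wavelet3.c} code.
\begin{lstlisting}
#include <math.h>

/* Hedged sketch: one DB2 transform step along a row of len samples
   s [], producing len/2 approximation coefficients sNext [] and
   detail coefficients d [], with constant extension at the right
   boundary (padding coefficients omitted). */
void d2FwdStep (const double *s, unsigned len, double *sNext, double *d)
{
   /* DB2 scaling (h) and wavelet (g) filter coefficients as above */
   const double h [4] = {
      (1 + sqrt(3)) / (4 * sqrt(2)),  (3 + sqrt(3)) / (4 * sqrt(2)),
      (3 - sqrt(3)) / (4 * sqrt(2)),  (1 - sqrt(3)) / (4 * sqrt(2))
   };
   const double g [4] = { h [3], -h [2], h [1], -h [0] };
   unsigned i, j, t;

   for (j = 0; j < len / 2; j++) {
      sNext [j] = d [j] = 0;

      for (t = 0; t < 4; t++) {
         /* Constant extension: clamp sample index at the boundary */
         i = 2 * j + t < len ? 2 * j + t : len - 1;
         sNext [j] += h [t] * s [i];
         d     [j] += g [t] * s [i];
      }
   }
}
\end{lstlisting}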
In the case of the precomputed contribution photon map, the contributions
contain RGB radiometric data. Consequently, the above decomposition
is extended to the three colour channels. Thus the coefficients
$s^k_{i,j}$ and $d^k_{i,j}$ are in fact 3-tuples.
The colour channels are treated completely independently, and no
decorrelation occurs between them.
Each subsequent iteration of the wavelet transform recurses on the
approximation coefficients of the previous iteration; in a 2D matrix,
this corresponds to the upper left subquadrant of the wavelet
coefficient matrix (see figure \ref{fig:waveletCoeffsFull}). Detail
coefficients are accumulated in the remaining subquadrants, starting at
the corners. In the 2D context, this implies that detail coefficients
are also obtained from details and approximations of the previous
iteration, and likewise, approximations from previous approximations and
details, thus fully decorrelating the input signal in every dimension.
% ---------------------------------------------------------------------------
\subsection{Wavelet Transform Boundary Extension}
\label{sec:boundaryExt}
In an ideal world (a.k.a. theory), input signals are assumed to be
infinite and life is simple. In \realLife, everything has a beginning
and an end, including life itself, and incidentally, the input to a
wavelet transform.%
\footnote{... which inevitably raises that perpetual philosophical
question, ``why bother?''
}
Because wavelets have finite support, they will partly extend beyond the
given input signal. There are many sensible (and equally less so) ways to
extend a given input, but this really depends on assumptions about
the nature of the underlying signal, i.e. whether it is periodic,
monotonic, or singular (peaky).
If the input size is a power of two, the signal can simply be wrapped
around, and the number of approximation coefficients is halved in each
iteration.%
\footnote{The specialised 2D wavelet transform function
\cmd{wavelet2:d4Step2()} only handles input sizes of powers of 2,
and is hardwired to perform a cyclic (wraparound) extension of the
input signal, without the need for padding coefficients. This is
obviously
far more efficient and compact, since the number of resulting
coefficients is the same size as the input. While not used by the
precomputed contribution photon map, this function is available for
use by other \radiance{} applications where this restriction on input
sizes is acceptable.
}
However, such a restriction on the input size may be deemed unacceptable
for many applications, including precomputed contributions, which then
warrants a boundary treatment of the input signal.
\begin{figure}[htb]
\centering
\includegraphics[width=0.45\linewidth]{boundaryext-crop}
\parbox{0.9\linewidth}{%
\caption{\label{fig:boundaryExt}
Example of boundary extension at the right edge of a signal
of length $l$, consisting of samples $y_{i,0} \ldots y_{i,l-1}$.
Common extension modes include zero (gray), symmetric/reflection
(green), constant (red), and 1st order gradient (blue).
}
}
\end{figure}
Boundary issues are handled by extrapolating
the input signal beyond the left and right boundaries. The wavelet
module \cmd{wavelet3.c} offers the following boundary extension modes
(see also the definitions in \cmd{wavelet2.h}, and figure
\ref{fig:boundaryExt} for examples):
\begin{description}
\item[WAVELET\_EXTEND\_CONST:] the input signal is assumed to be
constant beyond
the boundary, i.e. the first resp. last value is repeated as
required. This is probably the safest choice in most cases, unless
there is a significant gradient at the boundaries.
\item[WAVELET\_EXTEND\_GRAD1:] the input signal is linearly extrapolated
according to its 1st order gradient (slope).
\item[WAVELET\_EXTEND\_GRAD2:] the input signal is linearly extrapolated
according to its 2nd order gradient.
\item[WAVELET\_EXTEND\_CYCL:] the input signal is assumed to be cyclic
(i.e. periodic) and wrapped around at either boundary. This may
not always be the optimal choice.
\item[WAVELET\_EXTEND\_SYMM:] the input signal is assumed to be
symmetric,
and reflected at either boundary. This may not always be the optimal
choice.
\item[WAVELET\_EXTEND\_ZERO:] the input signal is simply set to zero
beyond
the boundary. This can lead to large coefficients if the absolute
input values are large near the boundary, which, in turn, can lead to
poor compression. This is optimal only for input signals consisting of
a singularity (i.e. an isolated peak tapering towards the boundaries).
\item[WAVELET\_EXTEND\_INV:] the input signal is reflected at the
boundary,
similarly to \cmd{WAVE\-LET\_\-EXT\-END\_\-SYMM}, but negated,
resulting in anti-symmetry.
This extension artificially boosts boundary detail coefficients
to reduce the likelihood that they will be thresholded compared to
the interior coefficients, thus preserving them. In practice, though,
it turns out to be crap.
\end{description}
The boundary extension mode can be defined by setting
\cmd{WAVELET\_EXTEND\_MODE} to one of the above. If undefined, the
extension mode defaults to \cmd{WAVELET\_EXTEND\_CONST}. Several of these
modes can also be found (possibly under different names) in the
\lit{pywavelets} package, which partly inspired them.
Because this is a compile-time option, it is not user-selectable. While
this lacks flexibility, in practice most users would be overwhelmed if
they had to choose a suitable boundary extension mode, since probably
neither they nor the author can anticipate the nature of the resulting
input to the wavelet transform.
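As an illustration of a few of these modes, the following sketch resolves
a sample index beyond the right boundary of a signal; it mirrors figure
\ref{fig:boundaryExt} rather than the actual implementation, and the
enumeration and names are purely illustrative.
\begin{lstlisting}
/* Illustrative extension modes (not the names in wavelet2.h) */
typedef enum { EXT_ZERO, EXT_CONST, EXT_GRAD1, EXT_CYCL, EXT_SYMM } ExtMode;

/* Hedged sketch: return the extrapolated sample y [i] for an index
   i >= len (len >= 2 assumed) */
double extendRight (const double *y, unsigned len, unsigned i, ExtMode mode)
{
   const unsigned over = i - len;   /* #samples beyond the boundary */

   switch (mode) {
      case EXT_CONST:   /* repeat the last sample */
         return y [len - 1];
      case EXT_GRAD1:   /* extrapolate the 1st order gradient (slope) */
         return y [len - 1] + (over + 1) * (y [len - 1] - y [len - 2]);
      case EXT_CYCL:    /* wrap around (periodic signal) */
         return y [over % len];
      case EXT_SYMM:    /* reflect at the boundary (over < len assumed) */
         return y [len - 1 - over];
      default:          /* zero extension */
         return 0;
   }
}
\end{lstlisting}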
% ---------------------------------------------------------------------------
\subsection{Padding Coefficients}
Boundary extension implies the generation of additional approximation and
detail coefficient pairs beyond the halved length of the input. These
\textit{padding coefficients} are generated at the boundaries,
and are essential to reduce artefacts at the edges of the reconstructed
input signal during the inverse wavelet transform.
The precomputed contribution module uses the more general 2D padded wavelet
transform module \cmd{wavelet3.c} to compress contributions, which lifts
the restriction on the input size imposed by the 2D non-padded transform
in \cmd{wavelet2.c}.
The number of padding coefficients depends on the support of the wavelet;
the larger the support, the larger the number of padding coefficients.
Furthermore, these are accumulated at every iteration of the transform,
i.e. over multiple resolutions.
This means the total number of wavelet coefficients (approximation and
details) can be significantly \emph{larger} than the input,
which makes an effective compression all the more necessary in order
for the wavelet encoding to be viable.
With padding, the number of approximation (and detail) coefficients
generated from an input of length $l^k$ samples at iteration $k$ is:
\begin{equation}
l^{k+1} = \left\lfloor\frac{l^k + \lambda - 1}{2}\right\rfloor,
\label{eq:padLen}
\end{equation}
where $\lambda$ is the size of the wavelet's support.
Based on this, it is possible to predict the additional space required in
the wavelet coefficient matrix for the padding coefficients, and dimension
it accordingly in order to accommodate the latter.
This is why the coefficient matrix in figure
\ref{fig:waveletCoeffsFull} exhibits unoccupied (read: wasted) space.
On modern systems, however, a few wasted kilobytes are an acceptable
price for a simpler implementation.%
\footnote{... unless you
plan on porting this to your ZX81 or KIM-1, in which case I wish you
luck!
}
To support the allocation of an array of suitable dimensions,
the function \cmd{wavelet2:padD2Step()} returns the output length
for a given input, including padding, if either the input or
output array is NULL. Similarly, the full transform function
\cmd{wavelet2:padWaveletXform2()}, which calls the former, can sum this
output length over all iterations if either its input or output array is
NULL. The function \cmd{pmapcontrib:preComputeContrib()} relies on
this functionality when allocating the wavelet coefficient matrices for
each contribution modifier.
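As a minimal sketch of the space requirement implied by equation
\ref{eq:padLen} (not the actual query mechanism described above, and
with an assumed stopping criterion once the coarsest approximation of
length $\lambda - 1$ is reached), the coefficient pair count could be
tallied as follows:
\begin{lstlisting}
/* Sketch: tally the number of approximation/detail coefficient pairs
   generated over all iterations for an input of length len and wavelet
   support lambda, per the equation above. Purely illustrative; the
   actual query passes NULL arrays to padD2Step() / padWaveletXform2(),
   and the stopping criterion here is an assumption. */
static unsigned long paddedCoeffPairs (unsigned long len, unsigned lambda)
{
   unsigned long numPairs = 0;

   /* Iterate until the coarsest approximation (length lambda - 1,
      i.e. 3 for the 4-tap Daubechies wavelet) is reached */
   while (len > lambda - 1) {
      /* Padded half-length = number of coefficient pairs generated */
      len = (len + lambda - 1) / 2;
      numPairs += len;
   }

   return numPairs;
}
\end{lstlisting}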
% ---------------------------------------------------------------------------
\subsection{Wavelet Coefficient Thresholding}
Contributions are compressed by thresholding their corresponding
detail coefficients after the wavelet transform is complete. This entails
removing the (\var{comp})\% least significant such coefficients (i.e.
smallest in terms of absolute value, since coefficients can
be negative). This is demonstrated in figure \ref{fig:waveletCoeffsThresh}
for a compression of 75\%. In general, the finest-scale detail
coefficients in the lower right subquadrants will be the smallest, and
therefore the most likely to be thresholded.
\begin{figure}[htb]
\centering
\includegraphics[width=0.75\linewidth]{%
wavelet3-test-16x16-thresh75-reaspect-crop%
}
\parbox{0.9\linewidth}{%
\caption{\label{fig:waveletCoeffsThresh}
The coefficient matrix from figure \ref{fig:waveletCoeffsFull}
after thresholding the 75\% of coefficients with the lowest absolute
value, implicitly setting these to zero (indicated by bracketed
dots). For contributions with a low gradient, as is typically
the case with sky luminance distributions, the lowest
coefficients will mostly be the details in the submatrices
highlighted in green. The coarsest approximation coefficients
in the upper left submatrix (highlighted in red) are essential
for reconstructing the original matrix, and are therefore never
thresholded.
}
}
\end{figure}
Once coefficients have been selected for thresholding, they can be omitted
when storing or transmitting the wavelet encoded data, thereby saving
mass storage or bandwidth.
The rationale behind thresholding is that the omission of insignificant
coefficients will not appreciably alter the reconstructed signal.
In the context of the reconstruction, the omitted coefficients are
implicitly treated as zero.
\begin{samepage}
Thresholding lies at the heart of wavelet compression, which is why a
good thresholding strategy is important.
There are many ways to threshold coefficients, such as:
\begin{enumerate}
\item using a fixed threshold (hard thresholding, possibly adapted
to the resolution),
\item attenuating coefficients if they exceed the threshold (soft
thresholding), and
\item dropping a fixed fraction of the smallest coefficients.
\end{enumerate}
\end{samepage}
Options (1) and (2) generate a variable number
of thresholded coefficients, which better adapts to the frequency content
of the original signal. The primary disadvantage of this strategy is that
it complicates the paging of coefficients from disk, as it requires a
dedicated indexing structure.
Option (3) is the least optimal, since the user has no control over the
magnitude of thresholded coefficients, and therefore the incurred
error. It does, however, afford control over the \emph{number} of
thresholded coefficients, as it is constant. This in turn greatly
simplifies the paging mechanism (no index is required, since the record
size is fixed and known beforehand). This strategy was therefore chosen
for the precomputed contribution photon map to simplify the implementation.
It is important to note that the coarsest level approximation coefficients
(highlighted in the upper 3$\times$3 submatrix in figure
\ref{fig:waveletCoeffsThresh}) are \emph{never} thresholded. These
are essential for the reconstruction, since
the entire matrix containing the original input signal is derived from
them by the inverse wavelet transform, in conjunction with the (thresholded)
detail coefficients. These approximation coefficients are therefore
unconditionally preserved by the thresholding.
Thresholding in the precomputed contribution photon map is performed
in a dedicated routine \cmd{pmapcontrib:thresholdContribs()}.
This transfers all non-zero \emph{detail} coefficients (i.e. omitting
the unoccupied space in the wavelet coefficient matrix left over by
padding) to a thresholding buffer, consisting of an array of
struct \cmd{PreComputedContribCoeff}. Each entry in this buffer contains
a pointer to the corresponding detail coefficient in the wavelet
coefficient matrix, and its linearised 2D matrix index, using the
mapping in equation \ref{eq:linearIdx} to linearise 2D bin indices.
The index is necessary to identify which coefficients were removed
after thresholding.
The thresholding routine ``knows'' how many non-zero detail coefficients
to expect based on the summed number of padded coefficients (equation
\ref{eq:padLen}) minus the $3\times 3 = 9$ approximation coefficients,
which are excluded from thresholding. If the wavelet transform
actually produced some zero coefficients (which happens more often than
one might expect), the thresholding buffer will contain fewer
coefficients than expected. In this case, the remaining buffer is
simply filled with as many duplicates of a zero coefficient as
required; specifically, the coefficient in the lower right corner is
chosen, since it is guaranteed to be unoccupied
(see figure \ref{fig:waveletCoeffsFull}),
and will therefore be thresholded anyway.
The actual thresholding consists of partitioning the coefficients
in the thresholding buffer so that all coefficients at positions
$[0,\ l(1-comp/100)-1]$ have larger magnitudes than those in positions
\mbox{$[l (1-comp/100),\ l-1]$}. This is performed by a separate
recursive routine with the ever so unlikely name
\cmd{pmap\-cont\-rib:\-co\-eff\-Par\-tit\-ion()}, which swaps
out-of-order buffer entries, much like quicksort. Unlike quicksort,
however, the coefficients within each partition need not be sorted,
which significantly reduces the number of recursive calls. In addition,
swapping buffer entries (i.e. pointers to coefficients and their matrix
indices) is more efficient than swapping the actual RGB floating point
coefficients.
The coefficient magnitude is evaluated as the dot product of the RGB
tuple with itself, i.e. its squared vector magnitude.
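The following sketch illustrates this quickselect-style partitioning on
an array of coefficient references; the type and helper names are
hypothetical, and the actual \cmd{coeffPartition()} implementation may
differ in detail.
\begin{lstlisting}
/* Sketch of quickselect-style partitioning (not the actual
   pmapcontrib:coeffPartition()): rearrange a buffer of coefficient
   references such that the nKeep entries of largest squared RGB
   magnitude occupy positions [0, nKeep-1]. Entries within each
   partition remain unsorted, and only references (pointer + matrix
   index) are swapped, never the RGB coefficients themselves. */
typedef struct {
   float    *coeff;     /* Pointer to RGB coefficient in wavelet matrix */
   unsigned idx;        /* Linearised 2D matrix index */
} CoeffRef;

static float sqMag (const float *c)
{
   /* Squared magnitude as dot product over RGB */
   return c [0] * c [0] + c [1] * c [1] + c [2] * c [2];
}

static void partitionCoeffs (CoeffRef *buf, long lo, long hi, long nKeep)
{
   long        i = lo, j = hi;
   const float pivot = sqMag(buf [(lo + hi) / 2].coeff);

   while (i <= j) {
      /* Scan for out-of-order entries on either side of the pivot */
      while (sqMag(buf [i].coeff) > pivot)   i++;
      while (sqMag(buf [j].coeff) < pivot)   j--;

      if (i <= j) {
         /* Swap references only */
         CoeffRef tmp = buf [i];
         buf [i++] = buf [j];
         buf [j--] = tmp;
      }
   }

   /* Unlike quicksort, recurse only into the partition containing the
      cut at position nKeep */
   if (nKeep <= j && lo < j)
      partitionCoeffs(buf, lo, j, nKeep);
   else if (nKeep > i && i < hi)
      partitionCoeffs(buf, i, hi, nKeep);
}
\end{lstlisting}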
Once the coefficients have been partitioned in the thresholding buffer,
only the most significant are kept, i.e. those in the partition
$[0,\ l(1-comp/100)-1]$. These are subsequently
sorted by their coefficient indices using \cmd{qsort()} from the
standard C library, again by swapping pointers to coefficients and their
matrix indices, instead of the RGB coefficients themselves. This results in
an array of coefficients (well, pointers to them) with monotonically
increasing matrix indices, which is required by the subsequent mRGBE
coefficient encoding on disk.
% ---------------------------------------------------------------------------
\subsection{mRGBE Wavelet Coefficient Encoding}
\label{sec:mrgbe.c}
Once the wavelet coefficients have been thresholded, they are encoded
using the modified
RGBE encoding (mRGBE) prototyped in section \ref{sec:mrgbePoC}.
The C implementation in module \cmd{mrgbe.c} as part of the contribution
photon map is essentially a direct port of the Python prototype, and
is based on the same assumptions.
The mRGBE fields and their corresponding 32-bit integer value are defined
as a union of type \cmd{mRGBE}, whose
bit field configurations are defined by the macros \cmd{MANTBITS},
\cmd{EXPBITS}, and \cmd{DATABITS}.
The default bit field configuration is shown in figure
\ref{fig:mrgbeStruct}, but can be redefined to suit applications that
require more precision at the expense of a reduced payload data range,
or vice versa. In practice, the default presents a compromise suitable
for most applications.
\begin{figure}[htb]
\centering
\includegraphics[width=0.95\linewidth]{mrgbe-crop}\\
\parbox{0.9\linewidth}{%
\caption{\label{fig:mrgbeStruct}
Structure of 32-bit mRGBE encoding for wavelet coefficients.
The encoding consists of three mantissae, one per RGB colour channel,
a common exponent (base 2), and an associated payload data field
to store the coefficient index (linearised from its 2D matrix
indices). The bits can be allocated within the 32-bit envelope
at compile-time to trade off precision, encoding range, and
payload data range. The default configuration,
\cmd{MANTBITS} = 6, \cmd{EXPBITS} = 5, \cmd{DATABITS} = 9
(abbreviated 6:6:6:5:9), balances
these conflicting requirements.
}
}
\end{figure}
The primary contribution compression/encoding function
\cmd{pmapcontrib:encodeContribs()} keeps track of the per-colour channel
range $[d_{min,i},\: d_{max,i}]$ of the wavelet coefficients' absolute
values, and passes these in a struct \cmd{mRGBERange} to the encoding
initialisation function \cmd{mrgbe:mRGBEinit()}.
This function is responsible for setting the per-colour channel
normalisation factor $d_{norm,i}$, which is returned in
\cmd{mRGBERange.norm}:
\begin{equation}
d_{norm,i} = \frac{1 - 2^{-\left(2^\textrm{EXPBITS}\right)}}{
d_{max,i} - d_{min,i}
},\quad i\in\lbrace r,g,b\rbrace,
\end{equation}
where \cmd{EXPBITS} is the number of bits allocated to the shared exponent
in the mRGBE encoding, as shown in figure \ref{fig:mrgbeStruct}.
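A minimal sketch of this initialisation is shown below, assuming the
range extrema are passed as per-channel arrays; the actual
\cmd{mrgbe:mRGBEinit()} operates on struct \cmd{mRGBERange}, and
degenerate (empty) ranges are not handled here.
\begin{lstlisting}
#include <math.h>

/* Sketch of the per-channel normalisation factor per the equation
   above. Illustrative only. */
#define EXPBITS 5

static void mrgbeInitSketch (const float dmin [3], const float dmax [3],
                             float dnorm [3])
{
   /* Largest normalised value representable by the encoding:
      1 - 2^-(2^EXPBITS) */
   const float dnormMax = 1 - ldexpf(1, -(1 << EXPBITS));
   int i;

   for (i = 0; i < 3; i++)
      dnorm [i] = dnormMax / (dmax [i] - dmin [i]);
}
\end{lstlisting}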
Once the mRGBE normalisation is initialised, \cmd{encodeContribs()} calls
\cmd{mRGBEencode()} for each RGB wavelet detail coefficient
$d = [d_r,\; d_g,\; d_b]$
in the thresholding buffer, passing the initialised \cmd{mRGBERange}
instance containing the normalisation factor $d_{norm}$, to obtain the
corresponding mRGBE encoding consisting of per-colour channel mantissae
$m_i$, and a shared base-2 exponent $x$:
\begin{eqnarray}
\label{eq:mrgbe}
m_i &=& \mathrm{sgn}\,\left(d_i\right)
\left\lfloor \overline{m} \cdot \overline{d}_i\, +\, \epsilon
\right\rfloor\, +\, m_{max}, \quad i\in\lbrace r,\, g,\, b\rbrace,
\quad\epsilon \in [0, 1)\\
\overline{m} &=& m \frac{m_{max}}{\overline{d}}\\
(m, x) &=& \mathrm{frexp}\left(\overline{d}\right)\\
\overline{d} &=& \max\left(
\overline{d}_r,\; \overline{d}_g,\; \overline{d}_b
\right)\\
\overline{d}_i &=& \left(\lvert d_i\rvert\, -\, d_{min,i}\right)\,
d_{norm,i}, \quad i\in\lbrace r,\, g,\, b\rbrace\\
m_{max} &=& 2^\textrm{MANTBITS-1},
\end{eqnarray}
where overbars denote normalised values, and \cmd{MANTBITS} specifies
the number of bits allocated to each mantissa $m_i$ in the mRGBE encoding
(including sign bit).
Note that each coefficient $d_i$ is offset by its corresponding mRGBE
range minimum, $d_{min,i}$, before being normalised by $d_{norm,i}$ to
obtain $\overline{d}_i$. Similarly, $\overline{d}$ is the latter's
maximum over RGB.
The $\mathrm{frexp}(\overline{d})$ function is part of the
standard C library and returns a floating point mantissa $m$ and an
integer base-2 exponent $x$ such that $m 2^x = \overline{d}$. Note that
the absolute value $\lvert x \rvert$ is stored in the mRGBE
exponent field, since $\overline{d} \leq 1$ implies a consistently
negative exponent.
Note also that each mantissa $m_i$ is rounded to the nearest integer via
an optional constant $\epsilon$ (defaulting to 0), and then offset
by the signed mantissa maximum, $m_{max}$, which corresponds to half the
encoding range accommodated by \cmd{MANTBITS}.
This offset encodes the mantissa's sign, with all values below
$m_{max}$ being negative.
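The following sketch traces the encoding equations \ref{eq:mrgbe}ff for
a single RGB coefficient; it omits the payload data field and the
special-case handling (e.g. zero coefficients, exponent overflow) of
the actual \cmd{mRGBEencode()}, and the type and helper names are
hypothetical.
\begin{lstlisting}
#include <math.h>

/* Sketch of the mRGBE encoding per the equations above for a single
   RGB coefficient d; dmin/dnorm are the range minima and normalisation
   factors, eps the optional rounding constant. Illustrative only. */
#define MANTBITS  6
#define MMAX      (1 << (MANTBITS - 1))  /* Signed mantissa offset m_max */

typedef struct {unsigned mant [3], exp;} mRGBEsketch;

static mRGBEsketch mrgbeEncodeSketch (const float d [3],
                                      const float dmin [3],
                                      const float dnorm [3], float eps)
{
   mRGBEsketch enc;
   float       dbar [3], dbarMax = 0, m, mbar;
   int         i, x;

   /* Per-channel normalisation: dbar_i = (|d_i| - dmin_i) * dnorm_i */
   for (i = 0; i < 3; i++) {
      dbar [i] = (fabsf(d [i]) - dmin [i]) * dnorm [i];
      if (dbar [i] > dbarMax)
         dbarMax = dbar [i];
   }

   /* Shared base-2 exponent from the largest normalised channel:
      dbarMax = m * 2^x, with m in [0.5, 1) */
   m = frexpf(dbarMax, &x);
   /* Normalised mantissa scale: mbar = m * m_max / dbarMax */
   mbar = m * MMAX / dbarMax;
   /* Exponent is negative since dbarMax <= 1; store its absolute value */
   enc.exp = x < 0 ? -x : x;

   /* Per-channel mantissae, offset by m_max to encode the sign */
   for (i = 0; i < 3; i++) {
      const int mi = (int)floorf(mbar * dbar [i] + eps);

      enc.mant [i] = (d [i] < 0 ? -mi : mi) + MMAX;
   }

   return enc;
}
\end{lstlisting}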
\cmd{mRGBEencode()} accepts each wavelet coefficient's linear index as
payload data, which increases mono\-ton\-ical\-ly since the coefficients
were previously sorted with respect to their indices.
Consequently, the coefficient index is \emph{incrementally} encoded,
i.e. as the difference to that of its immediate predecessor, starting at
0 (hence only the first coefficient index is absolute).
This incremental index encoding generally requires fewer bits to encode
than an absolute index, which reduces the likelihood of
overflowing the encoding range of the mRGBE payload data field.
However, it is important to realise that the likelihood of overflowing the
mRGBE data field increases with the index increments, notably when the
thresholded wavelet coefficient matrix becomes sparsely populated due to
a high compression ratio and/or number of bins. This cannot be predicted as
it depends on the distribution of the thresholded coefficients, which in
turn depends on the nature and frequency content of the wavelet transformed
contributions. If this does occur, there is currently no recovery
strategy, and the contribution precomputation is aborted with an error.
This is far from optimal, and an issue that
perhaps could have been more elegantly handled with more time and budget.
As a partial mitigation, \mkpmap{} will at least warn
if the total number of coefficients (including padding) exceeds the
payload data field range, i.e. $2^\textrm{DATABITS}$, indicating that,
under the aforementioned conditions, the mRGBE data field
\emph{could} theoretically overflow.
The encoding function \cmd{mRGBEencode()} sets the per-colour-channel
mantissae $m_i$ in each of the bitfields \cmd{mRGBE.\{red,green,blue\}},
and the shared exponent $x$ in \cmd{mRGBE.exp}. Furthermore, the
incremental coefficient index is set in bitfield \cmd{mRGBE.dat}.
Together, these occupy a 32-bit envelope, which can be conveniently
accessed as a scalar integer value \cmd{mRGBE.all} via the union
declaration.
Each such 32-bit mRGBE-encoded wavelet coefficient is appended to a
temporary array of struct \cmd{mRGBE} in the container
\cmd{preComputedContrib}, along with the instance of \cmd{mRGBERange}
containing the encoding normalisation and offset. These are the final
return values of the contribution compression/encoding routine
\cmd{encodeContribs()}.
Once the wavelet detail coefficients have been compressed and encoded,
\cmd{preComputeContrib()} prepends them with the
approximation coefficients in the upper $3\times 3$ submatrix
of the wavelet coefficient matrix, followed by the per-colour-channel
mRGBE range minimum and maxima, as returned in the \cmd{mRGBERange}
instance by \cmd{encodeContribs()}. Since the approximation coefficients
are not thresholded, their positions in the matrix are known; moreover,
they demand higher precision than mRGBE affords, and are consequently
encoded in \radiance's standard 32-bit RGBE format.
Similarly, the mRGBE encoding range is also encoded as RGBE to
preserve the normalisation and offset with sufficient precision,
thereby minimising the decoding error. Since the approximation
coefficients can (rather surprisingly) be negative, but 32-bit RGBE
can only encode positive values, a hack is employed whereby the least
significant bit (bit 0) of each mantissa is sacrificed as a sign bit
(see macro \cmd{PMAP\_CONTRIB\_SET\_RGBE32\_SGN()} in
\cmd{pmapcontrib.h}), resulting in a slight reduction in precision.
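The following sketch illustrates the idea behind this sign bit hack;
the actual macros \cmd{PMAP\_CONTRIB\_SET\_RGBE32\_SGN()} and
\cmd{PMAP\_CONTRIB\_GET\_RGBE32\_SGN()} in \cmd{pmapcontrib.h} may
differ in their details.
\begin{lstlisting}
/* Illustrative sketch of the sign bit hack for 32-bit RGBE encoded
   approximation coefficients: bit 0 of each 8-bit RGBE mantissa is
   reused as a sign flag, at the cost of one bit of mantissa
   precision. Not the actual macros in pmapcontrib.h. */
static void setRGBE32signs (unsigned char rgbe [4], const float rgb [3])
{
   int i;

   for (i = 0; i < 3; i++)
      if (rgb [i] < 0)
         rgbe [i] |= 1;      /* Set bit 0: negative channel */
      else
         rgbe [i] &= ~1;     /* Clear bit 0: positive channel */
}

static float getRGBE32sign (const unsigned char rgbe [4], int i)
{
   /* Recover the sign of channel i on decoding */
   return rgbe [i] & 1 ? -1.f : 1.f;
}
\end{lstlisting}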
At this point, a new photon is created with the same attributes
as the original photon selected for precomputation, by calling
\cmd{pmapdata:newPhoton()}. This function sets specific attributes
for various photon types; for precomputed contribution photons, it
sets the auxiliary data field \cmd{Photon.aux.contribSrc}
to the current light source index (passed via the photon ray's \cmd{rsrc}
field), in order to identify the precomputed photon's emitting light
source.
The RGBE encoded approximation coefficients and mRGBE range, as well as
the mRGBE encoded thresholded detail coefficients, are passed to
\cmd{newPhoton()} and (if not NULL) accumulated in a contribution
buffer \cmd{PhotonMap.contribHeapBuf}. This is analogous to the
accumulation of photons in the heap buffer \cmd{PhotonMap.heapBuf}
which already takes place in \cmd{newPhoton()}; in this case it
holds the precomputed photons associated with the contributions, in
the same order.
Once the heap buffer is full (both photon and contribution buffers
hold the same number of records), they are flushed to separate
heap files on disk via the lazily initialised file pointers,
\cmd{PhotonMap.\{heap, contribHeap\}}. This ensures that
photon map construction is consistently out-of-core.
Once \cmd{preComputeContrib()} has iterated over all precomputed
photons, and flushed the photons and contributions for each precomputed
child photon map to disk, it discards the original
photon map. Iterating over each child photon map, it calls the
contribution photon map specific build routine,
\cmd{buildPreCompContribPmap()}. This, along with the
saving routines, resides in a separate module \cmd{pmcontrib3.c},
and is described in the next section.
% ---------------------------------------------------------------------------
\subsection{Building and Saving Precomputed Contribution Photon Maps}
\label{sec:buildSave}
The top-level build routine for per-modifier child photon maps,
\cmd{buildPreCompContribPmap()}, sets up the output subdirectory
from the parent photon map's name (as specified to \mkpmap{} via
the \opt{-apC} option), and the filename of each child photon map and
its wavelet coefficient file by appending the corresponding modifier
name, as specified by the \opt{-m} option.
In addition, the routine recursively deletes any previous files
in the subdirectory using the POSIX file tree walk function
\cmd{nftw()}.%
\footnote{
   This function supersedes \cmd{ftw()}, which
   is deprecated as of POSIX.1-2008. Neither function is available in
   the native Windows API, and therefore no cleanup
   takes place on this platform.
}
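A minimal sketch of such a cleanup using \cmd{nftw()} is shown below;
it is illustrative only and not the code in \cmd{pmcontrib3.c}.
\begin{lstlisting}
/* Sketch of recursive subdirectory cleanup via nftw(). FTW_DEPTH
   visits a directory's contents before the directory itself, so each
   directory is empty by the time it is removed. */
#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>

static int rmEntry (const char *path, const struct stat *sb,
                    int typeflag, struct FTW *ftwbuf)
{
   /* remove() handles both regular files and (empty) directories;
      a nonzero return value aborts the walk */
   return remove(path);
}

static int cleanupDir (const char *dir)
{
   /* Depth-first walk, do not follow symlinks, at most 16 open fds */
   return nftw(dir, rmEntry, 16, FTW_DEPTH | FTW_PHYS);
}
\end{lstlisting}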
Each child photon map is stored on disk in a subdirectory
\lit{<pmapfile>.rc/} derived from the parent photon map's name,
\lit{<pmapfile>}. The photons themselves are stored out-of-core in a
file \lit{<pmapfile>.rc/<mod>.leaf}, while the compressed wavelet
coefficients are stored in a separate file \lit{<pmapfile>.rc/<mod>.wvt}
(see also figure \ref{fig:overview}).
In addition, all binning parameters
relevant to \rcontrib, i.e. all instances of the options \opt{-bn},
\opt{-e}, \opt{\mbox{-m}}, \opt{-M} and \opt{-p} passed to \mkpmap,
are collected from the command line and saved to an option file
\lit{<pmapfile>.opt} in the parent directory. This also includes options
to specify a Shirley-Chiu mapping via the \cmd{disk2square.cal} function
file, which is functionally equivalent to the internal contribution
photon binning routine, \cmd{pmapcontrib:ray2bin()}. For
definitions of the relevant option strings, see macros
\cmd{mkpmap:PMAP\_CONTRIB\_RCOPT\_$\mathbf{*}$}.
Instances of these option strings are appended to a cumulative string by
\mkpmap{} as it parses the command line, which is then passed to the main
photon distribution routine \cmd{pmcontrib2:distribPhotonContrib()}. The
latter assigns the option string to the generated photon map's
\cmd{PhotonMap.rcOpts} field for subsequent saving to disk.
The option file is intended to be passed to \rcontrib{} via the \opt{@}
option to ensure the binning parameters are consistent with those in the
photon map (the technical reasons why this isn't handled automatically
are detailed in section \ref{sec:load}).
Once the output filenames have been initialised,
\cmd{buildPreCompContribPmap()} calls the standard photon map building
routine, \cmd{pmapdata:buildPhotonMap()}. This consolidates the
photon and contribution heaps generated by the multiple processes
forked by the \opt{-n} option, and normalises the
photon flux if applicable (skipped if daylight coefficients are enabled
with the \opt{-V} option, in which
case the photon flux is already normalised). Since the
contribution photon map requires out-of-core storage,
this function calls the specific out-of-core build routine,
\cmd{pmapooc:OOC\_BuildPhotonMap()}. With a regular photon map, this
calls \cmd{oocsort:OOC\_Sort()}, which sorts photons out-of-core
according to the Morton code derived from the photons' 3D positions
\cite{schregle-techreport-2016}. The photons are then saved to the
out-of-core octree leaf file \lit{<pmapfile>.rc/<mod>.leaf} before
the out-of-core octree structure to index the leaves is built by
calling \cmd{oocbuild:OOC\_Build()}.
In the case of a contribution photon map, an extended out-of-core
sorting function \cmd{pmcontribsort:contribPhotonSort()} is called,
which is derived from \cmd{OOC\_Sort()} and uses the same low-level
operations to access intermediate files.
Besides sorting the photons in their consolidated heap file, the
adapted routine also sorts the corresponding wavelet coefficients in
their likewise consolidated heap file. While the photons are unordered
w.r.t their Morton codes within each heap, both heaps are ordered w.r.t
each other, since they were synchronously flushed to disk by
\cmd{pmapcontrib:preComputeContrib()} during precomputation.
Thus, sorting both heaps w.r.t the photon's Morton indices maintains the
correspondence between photons and their wavelet coefficients.
Because the number of thresholded coefficients is fixed, the size of each
set of coefficients is too, which greatly simplifies this step of the
build process.
Like the original out-of-core sorting routine, the contribution photon
sorting routine employs an external mergesort \cite{Knuth:1998:ACP:280635,
Seyedafsari:2010}. This recursively subdivides the unordered
photon and contribution heap files into progressively smaller
subblocks (maintaining these out-of-core in temporary files), until they
are small enough to be quicksorted in-core (in parallel, if the
\opt{-n} option is passed to \mkpmap). The sorted blocks are
then merged into progressively larger out-of-core blocks as the recursion
unwinds. On exiting the sorting routine, the sorted photons are returned
in the out-of-core octree leaf file \lit{<pmapfile>.rc/<mod>.leaf},
while the corresponding wavelet coefficients are returned in the
wavelet file \lit{<pmapfile>.rc/<mod>.wvt}. At this point, the
out-of-core octree to index the photons in the leaf file
is built by \cmd{oocbuild:OOC\_Build()}, as with any out-of-core
photon map.
Once built, precomputed contribution photon maps are saved by
\cmd{pm\-cont\-rib3:save\-Cont\-rib\-Pho\-ton\-Map()}, which is called
for the (parent) contribution photon map by the standard
photon saving routine \cmd{pmapio:\-save\-Photon\-Map()}.
\cmd{saveContribPhotonMap()} saves the binning parameters specified
in the parent photon map's \cmd{rcOpts} string to the option file
for later use with \rcontrib. The routine subsequently iterates over the
(child) photon maps referenced in the parent's \cmd{preCompContribTab}
using the standard
lookup tables iterator routine \cmd{look\-up:lu\_do\-all()},
which calls the saving routine \cmd{pmcontrib3:savePreCompContrib()}
for each per-modifier child photon map in the LUT. This latter routine
calls, in turn, the standard photon map saving routine
\cmd{pmapio:\-save\-Photon\-Map()} again, this time to save the actual
per-modifier photons.%
\footnote{
Hey, are we confused yet? And in case you're wondering,
\cmd{savePhotonMap()} will \emph{not} wind up in a recursive loop, as
each child photon map's \cmd{preCompContribTab} field will be NULL,
since, in the na\"ively ideal world of the \radiance{} photon map
at least, kids don't have kids!
}
To this end, \cmd{save\-Photon\-Map()} was modified to include
contribution-specific info in the file header (number of coefficients,
compression rate, etc.), and then saves the photon map itself. Saving
the photon map entails encoding the
indexing structure of the out-of-core octree to reference the photons
in its leaves (which physically reside on disk in the leaf file).
Once all child contribution photon maps and their parent are saved,
the precomputation is concluded and \mkpmap{} cleans up and terminates.
% ---------------------------------------------------------------------------
\subsection{Logarithmic vs. Linear Encoding of Contributions}
\label{sec:logEncoding}
Contributions can optionally be encoded logarithmically during the
wavelet transform via the \cmd{PMAP\_\-CON\-TRIB\_\-LOG} compile-time
option (see \cmd{pmapcontrib.h}). If defined,
\cmd{pmap\-contrib:\-en\-code\-Con\-tribs()}
applies a natural logarithm to every RGB contribution tuple prior
to the wavelet transform.
The advantage of this encoding is that it reduces the range of the
input, and consequently that of the resulting wavelet coefficients, which
improves the precision of the subsequent mRGBE encoding.
It also elegantly handles potential negative values resulting from
compression artefacts during the inverse wavelet transform, since the
contributions will always be positive after inverting the log
encoding through exponentiation.
The downside of this logarithmic encoding is that it complicates the
boundary extension during the wavelet transform, as most extension modes
assume linear input data, particularly the gradient extension modes.
Consequently, logarithmic encoding should only be used with a ``safe''
extension mode, such as \cmd{WAVELET\_EXTEND\_CONST}. A further downside
is the increased sensitivity of the encoding to compression artefacts,
as well as jitter due to the limited precision of the mRGBE encoding.
% ---------------------------------------------------------------------------
\subsection{Achtung, Baby: Sparsely Populated Bins}
\label{sec:emptyBins}
A potential anomaly with contribution photon mapping is that some bins
may not be populated during precomputation. This can happen in situations
with weak illuminance, and is difficult to predict. If the ratio of
populated (i.e. nonzero) bins is too low, bias may result from the
sparsely populated bins (see figure \ref{fig:cpmapTest-emptyBinsHDR}).
Note that this issue is solely dependent on the number of bins and
photons, not the compression ratio.
\begin{figure}[htb]
\centering
\includegraphics[width=0.7\linewidth]{cpmapTest-emptyBins}
\parbox{0.7\linewidth}{%
\caption{%
\label{fig:cpmapTest-emptyBinsHDR}
Bias caused by sparsely binned contributions. With 64 bins
(right, shown for bin 63), an average of 90\% of bins contain
nonzero contributions.
With 256 bins (left, shown for bin 255), this ratio drops to
under 50\%, leading to obvious bias, notably on the left wall
and ceiling. See section \ref{sec:cpmapTestHDR} for a
description of the scene and a composite rendering of all bins.
}
}
\end{figure}
The average fraction of empty bins
can be obtained by compiling with the
\cmd{PMAP\_\-CONT\-RIB\_\-BIN\-HIS\-TO}
macro. Plotting this against the number of bins (see figure
\ref{fig:cpmapTest-emptyBinsPlot}) reveals a pronounced
discontinuity at the point where bias sets in due to sparsely populated
bins. This suggests an ill-posed solution if the average fraction of
empty bins rises above ca. 25\%.
\begin{figure}[htb]
\centering
\includegraphics[width=0.7\linewidth]{cpmapTest-64m-bn64-emptyBins}
\parbox{0.7\linewidth}{%
\caption{%
\label{fig:cpmapTest-emptyBinsPlot}
Graph of empty bin ratio (averaged over all precomputed photons)
as a function of the number of bins. This ratio rises sharply
above 9$\times$9 = 81 bins, indicating the majority of
precomputed contributions include unpopulated bins, which in
turn manifests itself as visible bias, as shown in figure
\ref{fig:cpmapTest-emptyBinsHDR}.
}
}
\end{figure}
As a consequence, the contribution photon
density estimate routine \cmd{getPhotonContrib()} issues a warning if
fewer than 50\% of bins are populated (doubling the empirical threshold
for good measure). The frequency of these warnings
indicates the necessity to increase the overall number of photons used
for precomputation, as well as the photon lookup bandwidth. However,
there is no guarantee this will remedy the situation, as the actual
luminance distribution may simply not cover the entire incident
hemisphere.
% ---------------------------------------------------------------------------
\clearpage
\section{Unit Tests}
The main constituent modules of the precomputed contribution photon map
contain optional unit tests which can be built at compile time. These
consist of a \cmd{main()} function which is enabled by compiler macros
to produce a standalone binary for individual testing of each module's
basic functionality.
\subsection{Contribution Binning Unit Test}
\label{sec:pmapcontrib-test}
The \cmd{pmapcontrib} module contains a unit test which can be built
with the command
\begin{center}
\begin{minipage}{0.7\linewidth}
\begin{lstlisting}
rmake pmapcontrib-test
\end{lstlisting}
\end{minipage}
\end{center}
where the compilation target \cmd{pmapcontrib-test} enables the macro
\cmd{PMAP\_\-CONT\-RIB\_\-TEST}. The test verifies
the Shirley-Chiu binning function and its orientation variables by calling
\cmd{ray2bin()} for a set of sample rays in the incident hemisphere, and
outputs the corresponding linearised bin numbers.
The unit test is invoked as follows:
\begin{center}
\begin{minipage}{0.7\linewidth}
\begin{lstlisting}
pmapcontrib-test <scdim> <nsamp> [<var>=<value>; ..]
\end{lstlisting}
\end{minipage}
\end{center}
where \lit{scdim} is the dimension of the Shirley-Chiu square, and
\lit{nsamp} is the number of sample rays. The sample rays are distributed
over the hemisphere by stratifying in $\theta$ and $\phi$. The orientation
of the Shirley-Chiu mapping can be modified by appending optional
variable assignments to the command line (see section \ref{sec:binning}
for the list of relevant variables). Figure \ref{fig:pmapcontrib-test}
shows a sample output of this unit test.
\begin{figure}[htb]
\centering
\includegraphics[width=0.8\linewidth]{pmapcontrib-test}
\parbox{0.9\linewidth}{%
\caption{%
\label{fig:pmapcontrib-test}
Output of contribution binning unit test
for an 8$\times$8 Shirley-Chiu mapping.
}
}
\end{figure}
\subsection{Wavelet Unit Test}
\label{sec:wavelet3-test}
The \cmd{wavelet3} module contains a unit test of the 2D padded wavelet
transform which can be built with the command
\begin{center}
\begin{minipage}{0.7\linewidth}
\begin{lstlisting}
rmake wavelet3-test
\end{lstlisting}
\end{minipage}
\end{center}
where the compilation target \cmd{wavelet3-test} enables the macro
\cmd{WAVE\-LET\_\-TEST\_\-2D\-PAD}.\footnote{
The \cmd{wavelet} and \cmd{wavelet2} modules also contain unit tests
of the 1D and 2D unpadded wavelet transforms, which are enabled at
compile time with the \cmd{WAVE\-LET\_\-TEST\_\-1D} and
\cmd{WAVE\-LET\_\-TEST\_\-2D} macros, respectively. These are
functionally similar to module \cmd{wavelet3}'s unit test.
}
The unit test is invoked as follows:
\begin{center}
\begin{minipage}{0.7\linewidth}
\begin{lstlisting}
wavelet3-test <scdim> [threshold] [dataFile]
\end{lstlisting}
\end{minipage}
\end{center}
where \lit{scdim} is the dimension of the Shirley-Chiu square,
\lit{threshold} is an optional (hard) threshold which sets all wavelet
coefficients below this value to zero, and \lit{dataFile} is an optional
ASCII file containing the values for an input matrix, one matrix row per
line, which must contain at least \lit{scdim} lines and as many values
per line (the excess being ignored). This is particularly useful to import
and wavelet transform actual contributions output by \rcontrib{} for
the same binning configuration.
The test allocates and initialises a 2D input matrix of the specified size
\lit{scdim} $\times$ \lit{scdim}. The matrix is either initialised from
the optional \lit{dataFile} (if specified), or with generated values as
determined by the \cmd{WAVELET\_TEST\_INIT} macro, which supports the
following settings:
\begin{enumerate}
\item Random data, with independent colour channels:
$y_{i,j} = [\xi_r,\; \xi_g,\; \xi_b]$.
\item Random data, with correlated colour channels:
$y_{i,j} = [\xi_r,\; (0.1+0.9\xi_g)\:\xi_r,\; (0.1+0.9\xi_b)\:\xi_r]$,
i.e. RGB differ by less than a factor of 10.
\item Random data, identical for all colour channels:
$y_{i,j} = [\xi_r,\; \xi_r,\; \xi_r]$.
\item Monotonically increasing along 1st axis:
$y_{i,j} = [i + 1, \cdots ,\cdots]$, where an offset of 1 avoids
taking the logarithm of zero if log encoding of coefficients is
enabled (see section \ref{sec:logEncoding}).
\item Monotonically increasing along both axes:
$y_{i,j} = [ij+1, \cdots, \cdots]$, with an offset of 1 again to
avoid taking the log of zero.
\item Monotonically increasing by linear index:
$y_{i,j} = [i \cdot scdim + j + 1, \cdots, \cdots]$,
i.e. serialised matrix rows.
\item Symmetric ``bump'' function:
$y_{i,j} = \left[
\left(1.1 - \frac{\lvert j - scdim/2 + 0.5\rvert}
{scdim/2 - 0.5}
\right)
\left(1.1 - \frac{\lvert i - scdim/2 + 0.5\rvert}
{scdim/2 - 0.5}
\right),
\cdots,
\cdots
\right]$, with an offset of 0.1 to avoid taking the log of zero.
\end{enumerate}
Note that $\xi_i \in [0,1]$ denotes independent random variables per
colour channel, and ellipses indicate repeated values for the remaining
colour channels. These initialisation options are useful to compare the
reconstructed matrix against a known (simple) reference, as opposed to
a more complex distribution from the \lit{dataFile}.
Regardless of the source,
the input array is optionally log-encoded if \cmd{WAVELET\_TEST\_LOG} is
enabled.
\begin{figure}[p]
\centering
\includegraphics[width=\linewidth]{wavelet3-test}
\parbox{\linewidth}{%
\caption{%
\label{fig:wavelet3-test}
Output of the wavelet transform unit
test for a 5$\times$5 input matrix with thresholding (the
mRGBE output is omitted for the sake of brevity).
Space for padding coefficients in the matrix is indicated by
dots, while thresholded coefficients are indicated by
dots surrounded by square brackets.
}
}
\end{figure}
The unit test then allocates an output matrix and its transpose according
to the padded size returned by \cmd{padWaveletXform2()} with NULL input
arrays. It then performs a full 2D padded wavelet transform using this
function. If \lit{threshold} was specified on the command line,
the resulting wavelet detail coefficients with absolute value%
\footnote{
The scalar absolute value is obtained as the dot product of
the RGB tuple with itself.
}
below \lit{threshold} are
set to zero; similarly to \cmd{pmapcontrib:thresh\-old\-Cont\-ribs()},
the approximation coefficients in the upper left of the output matrix are
not thresholded.
The test outputs the wavelet coefficient matrix to the console,
marking the thresholded coefficients as bracketed dots (see figure
\ref{fig:wavelet3-test}).
If \cmd{WAVELET\_TEST\_mRGBE} is defined, the unit test also encodes the
(thresholded) wavelet coefficients to mRGBE, placing these in
an additional output matrix, which is also output below the original
floating point coefficients for comparison. Again, the approximation
coefficients in the upper left of the matrix are preserved.
Finally, the test inverts the wavelet transform using the (thresholded)
wavelet coefficients by calling \cmd{padWaveletInvXform2()}, and outputs
the reconstructed data along with the root mean square error (RMSE)
compared to the original input. This process is optionally repeated for
the mRGBE-encoded coefficients.
As a convenience for subsequent analysis, the (thresholded) wavelet
coefficients can also be output to a file defined by the macro
\cmd{WAVELET\_TEST\_COEFFOUT}, which defaults to
\cmd{wavelet3-test-coeff.dat}.
Similarly, the mRGBE-encoded coefficients can be output to
\cmd{WAVELET\_\-TEST\_\-COEFF\-OUT\_\-mRGBE}, which defaults to
\cmd{wavelet3-test-coeff-mrgbe.dat}. Analogously, the reconstructed
data can be output to \cmd{WAVELET\_TEST\_OUT} (defaulting to
\cmd{wavelet3-test-xform.dat}), while the reconstruction from
mRGBE-encoded coefficients can be output to
\cmd{WAVELET\_TEST\_OUT\_mRGBE} (defaulting to
\cmd{wavelet3-test-xform-mrgbe.dat}).
Undefining any of these macros (e.g. by commenting out their definitions
in \cmd{wavelet3.c}) disables the corresponding file output.
The wavelet transform unit test is crucial to evaluate deviations
(artefacts) incurred by thresholding and the limited precision of the mRGBE
encoding (particularly with different bitfield configurations of the
latter). Consequently, the overwhelming majority of testing was conducted
with this tool in order to troubleshoot and optimise the wavelet transform,
particularly with respect to the various boundary extension modes.
\clearpage
\subsection{mRGBE Unit Test}
\label{sec:mrgbe-test}
The \cmd{mrgbe} module contains a unit test of the mRGBE encoding
which can be built with the command
\begin{center}
\begin{minipage}{0.7\linewidth}
\begin{lstlisting}
rmake mrgbe-test
\end{lstlisting}
\end{minipage}
\end{center}
where the compilation target \cmd{mrgbe-test} enables the macro
\cmd{mRGBE\_TEST}. The unit test is invoked as follows:
\begin{center}
\begin{minipage}{0.7\linewidth}
\begin{lstlisting}
mrgbe-test <numTests>
\end{lstlisting}
\end{minipage}
\end{center}
The test initially triggers possible exceptions with invalid or marginal
encoding input (notably zero, which is handled separately in the encoding
routine), under/overflow (relative to the specified encoding range),
and empty encoding range.
It then proceeds to encode and decode \lit{numTests} random 3-tuples in
a predetermined range (defined by macros \cmd{RGBMIN} and \cmd{RGBMAX}),
and dumps the RMSE between encoded and decoded values, calculated as
the dot product of their component differences.
Figure \ref{fig:mrgbe-test} shows a sample run.
\begin{figure}[htb]
\centering
\includegraphics[width=\linewidth]{mrgbe-test}
\parbox{\linewidth}{%
\caption{%
\label{fig:mrgbe-test}
Sample output of mRGBE unit test for a 6:6:6:5:9 mRGBE bit
configuration. With 6 bits per mantissa, the RMSE averages
around 3\%.
}
}
\end{figure}
% ---------------------------------------------------------------------------
\clearpage
\section{Precomputed Contribution Evaluation with \rcontrib}
The evaluation of precomputed contributions from the photon map in
\rcontrib{} entails locating the nearest precomputed photon, paging
its compressed contributions from disk on demand, decoding the
mRGBE-encoded wavelet coefficients, and inverting the wavelet transform
to reconstruct the contributions.
The reconstructed contributions are then transferred to \rcontrib's
contribution lookup table, and can be optionally cached to accelerate
neighbouring evaluations that require the same photon again.
\subsection{Loading Precomputed Contribution Photon Maps}
\label{sec:load}
Similarly to saving the contribution photon maps in \mkpmap{}, \rcontrib{}
calls the main photon map loading routine \cmd{pmapio:loadPhotonMap()} with
the parent photon map, which passes control to a dedicated routine
\cmd{pmcontrib4:loadContribPhotonMap()}. The latter again calls
\cmd{loadPhotonMap()} to load each per-modifier child photon map. Note
that the photons and their contributions remain out-of-core, and reside on
disk. The out-of-core octree data structure, however, remains in-core to
facilitate navigating the photon map during lookups.
The binning parameters used by \mkpmap{} to precompute the contributions
are dumped to an option file \lit{<pmapfile>.opt} in the photon map's
parent directory when it is built and saved. The contents of this
file are passed to \rcontrib{} via the \opt{@} option to ensure the
binning parameters are consistent with those in the photon map. This
dimensions \rcontrib's contribution array to accommodate the precomputed
bins via the \opt{-bn} option, and also specifies the binning function
file \cmd{disk2square.cal} for a Shirley-Chiu mapping
along with its orientation parameters, via the options \opt{-f}, \opt{-b},
\opt{-e} and \opt{-p}. Although the contributions from the photons are
already prebinned, the primary ray traced by \rcontrib{} which triggers
the photon lookup (possibly via
one ambient or several specular interactions) will still be
binned, and this binning must of course be consistent with that of the
precomputed contributions.
While automatic binning parameters would clearly be more convenient, the
option file is a necessity dictated by \rcontrib's architecture.
This is because
by the time the photon map is loaded, \rcontrib's contribution arrays
have already been allocated and initialised, along with its output
streams if the \cmd{-o} option is specified, which can be assigned to
individual modifiers and even bins. Reallocating these resources once the
photon map is loaded, and the binning parameters have been extracted,
would be too complex and error prone. While this may change in the future,
it was considered safer to use the already existing concept of option
files to pass the correct parameters to \rcontrib{} at startup,
which of course then becomes the user's responsibility.%
\footnote{
This doesn't seem unreasonable as it conforms to \radiance's
``it's the user's fault'' philosophy. ;\^{})
}
\subsection{Locating and Paging Precomputed Photons}
As in \rtrace{}, the evaluation of contributions is triggered via the
ambient calculation routine \cmd{ambient:multambient()}, which in turn
calls the photon mapping ``hook'' \cmd{pmapamb:ambPmap()} (resp.
\cmd{pmap\-amb:\-amb\-Pmap\-Caustic()} for caustic photons).%
\footnote{
Caustic photons carry a flag indicating they have been specularly
scattered, and are exclusively accepted by \cmd{pmapfilt:filterPhoton()}
during lookups initiated by \cmd{ambPmapCaustic()}. Note, however,
that the dedicated caustic photon map generated with \mkpmap's
\opt{-apc} option doesn't support contributions.
}
These routines call the appropriate photon lookup routine depending on
the photon type (defined in the callback \cmd{PhotonMap.lookup}), which
in the case of contribution photons is
\cmd{pmcontrib4:getPreCompPhotonContrib()}.
\cmd{getPreCompPhotonContrib()} calls
\cmd{raytrace:raycontrib()} to obtain the cumulative contribution of the
incident ray; typically this is the last ray in a path starting with a
primary ray followed by zero or more specular scattering events, and
zero or one (if \opt{-ab} is positive) diffuse scattering events.
This is then passed to \cmd{getPreCompContribByMod()}, which is called
for each per-modifier child photon map.
\cmd{getPreCompContribByMod()} locates the single closest precomputed
contribution photon in the corresponding child photon map for the
current modifier by calling the standard lookup routine
\cmd{pmapdata:find1Photon()}. The latter, in turn, calls the
out-of-core lookup routine \cmd{pmapooc:OOC\_\-Find\-1\-Pho\-ton()}, which
pages photons from the out-of-core octree leaf file and caches them
via an instance of the out-of-core photon cache
\cite{schregle-techreport-2016}.
The contributions for the found photon are decoded by calling
\cmd{pmcontrib4:get\-Pre\-Comp\-Con\-trib\-By\-Pho\-ton()} (see below)
and passing the photon along with its index.
\cmd{getPreCompContribByMod()} subsequently scales the decoded
contributions by the incident ray's cumulative contribution which was
passed by the caller, \cmd{getPreCompPhotonContrib()}, after which
the modifier's contributions are returned.
\subsection{mRGBE Wavelet Coefficient Decoding}
\cmd{getPreCompContribByPhoton()} loads the 32-bit RGBE-encoded
detail coefficient range and approximation coefficients from the
wavelet coefficient file for a given photon, using the photon's
numeric index as file offset.
The approximation coefficients are placed in the upper left of the
(initially zeroed) wavelet coefficient matrix. The sign of each
approximation coefficient's colour channel is set
according to bit 0 of the corresponding 8-bit mantissa in the 32-bit RGBE
encoding, via a convenient macro \cmd{PMAP\_CONTRIB\_GET\_RGBE32\_SGN()}.
\cmd{getPreCompContribByPhoton()} then loads the mRGBE-encoded
detail coefficients for the given photon from the wavelet coefficient
file (where they are stored immediately after the mRGBE range and
approximation coefficients).
Miscellaneous bookkeeping such as lazy initialisation of the
primary and transposed wavelet matrices also takes place here.
The detail coefficients are loaded into a lazily allocated
decoding buffer embedded in the current child photon map's field of
type struct \cmd{PreComputedContrib}.
Once the detail coefficients and their range have been loaded, they are
decoded from the decoding buffer by calling \cmd{decodeContribs()},
which also performs the inverse wavelet transform.
\cmd{decodeContribs()} initialises the mRGBE normalisation from the
retrieved detail coefficient range by calling \cmd{mrgbe:mRGBEinit()},
and subsequently passes this to the mRGBE decoding routine
\cmd{mrgbe:\-mRGBE\-de\-code()} for each detail coefficient. The mRGBE
decoding routine returns each decoded detail coefficient as a floating
point RGB 3-tuple along with its corresponding incremental linear
coefficient index.
Given a 32-bit mRGBE-encoded coefficient consisting of the per-colour
channel mantissae $m_i$ and the shared base-2 exponent $x$,
\cmd{mRGBEdecode()} returns the corresponding floating point
RGB wavelet coefficient $d = [d_r,\: d_g,\: d_b]$:
\begin{eqnarray}
\label{eq:invmrgbe}
\overline{d} &=& \frac{\mathrm{ldexp}\left(1, -\lvert x \rvert\right)}
{m_{max}} \\
d_i &=& \mathrm{sgn_m}\left(m_i\right) \: \left(
\lvert m_i - m_{max} + \epsilon \rvert \
\frac{\overline{d}}{d_{norm,i}}\: +\: d_{min,i}
\right) \\
\mathrm{sgn_m}\left(m_i\right) &=& \left\{
\begin{array}{l r}
-1 & \quad \textrm{if } m_i < m_{max} \\
1 & \quad \textrm{if } m_i \geq m_{max} \\
\end{array}
\right. \\
m_{max} &=& 2^\textrm{MANTBITS-1},
\end{eqnarray}
where overbars denote normalised values, and $d_{norm,i}$ and $d_{min,i}$
are the mRGBE normalisation factor and range minimum for color channel $i$,
respectively. Each coefficient $d_i$ is denormalised by $d_{norm,i}$ and
offset by $d_{min,i}$, inverting the encoding in equations
\ref{eq:mrgbe}ff. The function $\mathrm{sgn_m}\left(m_i\right)$ is
a macro that returns the sign of the mRGBE mantissa $m_i$ by comparing its
value with the signed mantissa offset, $m_{max}$. This offset is
subtracted from each $m_i$ to obtain $d_i$.
The $\mathrm{ldexp}\left(m, x\right)$ function is part of the
standard C library and computes $m\, 2^x$ for a floating point mantissa
$m$ and integer exponent $x$; here $m = 1$ is passed
to obtain a pure power of 2. Note that the exponent $x$ is stored
as absolute value in the mRGBE exponent field, and is therefore negated
so that $\overline{d} \leq 1$.
Due to the limited precision of the mRGBE
encoding, the decoded value will deviate from the original. A jitter
value $\epsilon \in [0,0.5]$ can be added to break up quantisation
artefacts (aliasing). In practice, values in excess of 0.5 increase the
RMSE reported by the unit test (see section \ref{sec:mrgbe-test}).
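The following sketch traces the decoding equations \ref{eq:invmrgbe}ff
for a single coefficient; as with the encoding, the names are
hypothetical and the actual \cmd{mRGBEdecode()} differs in its handling
of the payload data field and special cases.
\begin{lstlisting}
#include <math.h>

/* Sketch of the mRGBE decoding per the equations above; MANTBITS and
   the mantissa offset MMAX = 2^(MANTBITS-1) mirror the encoding, eps
   is the optional jitter in [0, 0.5]. Illustrative only. */
enum {MANTBITS = 6, MMAX = 1 << (MANTBITS - 1)};

static void mrgbeDecodeSketch (const unsigned mant [3], unsigned expo,
                               const float dmin [3], const float dnorm [3],
                               float eps, float d [3])
{
   /* Normalised scale: dbar = 2^(-|x|) / m_max */
   const float dbar = ldexpf(1, -(int)expo) / MMAX;
   int i;

   for (i = 0; i < 3; i++) {
      /* Sign from offset mantissa: negative if below m_max */
      const float sgn = mant [i] < MMAX ? -1.f : 1.f;

      d [i] = sgn * (fabsf((float)mant [i] - MMAX + eps) * dbar / dnorm [i]
                     + dmin [i]);
   }
}
\end{lstlisting}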
The decoded detail coefficients are transferred to the wavelet coefficient
matrix at their corresponding indices, thus incrementally populating
the matrix. The matrix index is
obtained by summing the incremental values embedded in the mRGBE data
field of the consecutively stored coefficients. Since this index $k$ is
1-dimensional, it is deserialised to 2D coordinates
$i, j$ by inverting the mapping in equation \ref{eq:linearIdx}:
\begin{eqnarray}
\label{eq:invLinearIdx}
i &=& \lfloor k / m \rfloor \\
j &=& k \textrm{ mod } m,
\end{eqnarray}
where $m$ is the array dimension. Once the matrix has been populated with
all decoded coefficients, the unpopulated coefficients remain zero (as
initialised); these represent those that were thresholded during
compression.
\subsection{2D Inverse Wavelet Transform}
Once the wavelet coefficient matrix has been decoded and populated,
\cmd{decodeContribs()} calls
\cmd{wave\-let2:\-pad\-Wave\-let\-Inv\-Xform2()} to
perform a full inverse Daubechies DB2 wavelet transform to recover the
original contributions -- or more specifically, an approximation thereof
subject to compression artefacts.
The latter function therefore acts as a wrapper
for \cmd{pad\-D2\-Inv\-Step2()}, which performs one pass of the inverse
wavelet transform along a fixed (horizontal) axis.
Like its forward transforming counterpart \cmd{pad\-D2\-Step2()}
described in section \ref{sec:wavelet3.c}, this function returns an
output matrix in which the inverted coefficients have been transposed so
it can be swapped with the input matrix to perform a second pass along the
alternate axis. Consequently, each pair of invocations of the inverse
transform step constitutes a complete horizontal/vertical inverse
transform pass at a given resolution.
Given approximation and detail coefficients $s^k_{i,j}$ and $d^k_{i,j}$
at iteration $k$, the 4-tap Daubechies wavelet transform in equations
\ref{eq:d2FwdXform-s} and \ref{eq:d2FwdXform-d} is inverted as follows
to reconstruct the adjacent approximations $s^{k-1}_{i,2j}$ and
$s^{k-1}_{i,2j+1}$ at doubled resolution for the next iteration $k-1$:
\begin{eqnarray}
\label{eq:d2InvXform}
s^{k-1}_{i,2j} &=& h_2\ s^k_{i,j} \ +\ g_2\ d^k_{i,j} \ +\
h_0\ s^k_{i,j+1} \ +\ g_0\ d^k_{i,j+1}\\
s^{k-1}_{i,2j+1} &=& h_3\ s^k_{i,j} \ +\ g_3\ d^k_{i,j} \ +\
h_1\ s^k_{i,j+1} \ +\ g_1\ d^k_{i,j+1},
\end{eqnarray}
where $h_j$ and $g_j$ are the constants from equations
\ref{eq:d2-hCoeffs} and \ref{eq:d2-gCoeffs}.
At the final iteration where $k=0$, the approximation coefficients
correspond to the original signal, so that $s^0_{i,j} \approx y_{i,j}$.
Note that, for the sake of clarity, equation \ref{eq:d2InvXform}
omits the on-the-fly transposition shown in figure \ref{fig:waveletStep}
for the forward transform; in actuality, the implementation swaps the
indices $i, j$ on the RHS.
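As a minimal sketch (single colour channel, single row, boundary
handling and transposition omitted), one pass of this inverse step per
equation \ref{eq:d2InvXform} might look as follows, where \cmd{h[]} and
\cmd{g[]} are assumed to hold the constants from equations
\ref{eq:d2-hCoeffs} and \ref{eq:d2-gCoeffs}; the function name and
signature are illustrative, not those of \cmd{padD2InvStep2()}.
\begin{lstlisting}
/* Sketch of one inverse transform pass over a single row and colour
   channel per the equation above. s[] and d[] hold the approximation
   and detail coefficients at iteration k (each of length halfLen),
   out[] receives the reconstructed approximations at iteration k-1
   (length 2*halfLen). */
static void invStepRowSketch (const float *s, const float *d, float *out,
                              unsigned halfLen,
                              const float h [4], const float g [4])
{
   unsigned j;

   for (j = 0; j + 1 < halfLen; j++) {
      out [2*j]     = h [2]*s [j] + g [2]*d [j] + h [0]*s [j+1] + g [0]*d [j+1];
      out [2*j + 1] = h [3]*s [j] + g [3]*d [j] + h [1]*s [j+1] + g [1]*d [j+1];
   }
   /* The final output pair requires coefficients beyond the boundary,
      which the padded transform supplies via its padding coefficients. */
}
\end{lstlisting}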
Upon completing the inverse wavelet transform,
\cmd{decodeContribs()} exits and returns the reconstructed binned
contributions, which are subsequently transferred row-by-row via
\cmd{memcpy()} to \cmd{rcontrib}'s linear binned contribution array
for the current modifier. Having accumulated the contributions from all
modifiers, \cmd{getPreCompPhotonContrib()} finally returns and
hands control back to \rcontrib.
\subsection{Decoded Contribution Caching}
The out-of-core photon cache used by the lookup routine
\cmd{pmapooc:OOC\_\-Find\-1\-Pho\-ton()} anticipates the need to
repeatedly evaluate photons for spatially adjacent lookups,
thus reducing I/O latency incurred by paging from disk.
The cache is an instance of struct
\cmd{ooccache:OOC\_Cache} and is organised into a variable number of
entries, where each entry contains a page (a block loaded on demand
from disk) consisting of a fixed number of neighbouring photons.
A standard least-recently-used (LRU) page replacement strategy
governs the preemption of obsolete pages when the cache fills
\cite{Tanenbaum:2014:MOS:2655363}.
To hide the latency involved in decoding the photon's contributions, an
additional, per-modifier precomputed contribution cache (another
instance of struct \cmd{OOC\_Cache}) is interrogated by
\cmd{getPreCompContribByMod()} to determine whether the contributions
have already been paged and decoded.
A dedicated routine \cmd{pmcontribcache:getContribCache()}
accesses the contribution cache using the found photon's index as
caching key, and returns a flag indicating the cached status of the
photon's contributions. This in turn calls the out-of-core caching routine
\cmd{ooccache:OOC\_CacheData()}, which returns a pointer to an array of
binned contributions.
If \cmd{getContribCache()} signals that the decoded contributions are
already cached, they are returned in the array and can be directly
accumulated in \cmd{rcontrib}'s contribution bins, obviating the need
to page and decode these (again).
If the contributions have not yet been cached, a new entry is allocated
with an empty array, which must be populated by the caller.
The contribution cache uses the same LRU page replacement strategy as
the out-of-core photon cache, except that a page now contains the decoded
contributions associated with a single photon; consequently, the page
size of this \cmd{OOC\_Cache} instance is hardwired to 1 (page sizes
being specified as a number of photons).
This cache instance is lazily initialised by calling a dedicated
routine \cmd{pmcontribcache:initContribCache()}.
If no cached contributions were found for the current photon, they are
paged from disk and decoded as described in the previous sections,
and transferred to the new cache entry.
The contributions must obviously then be
weighted by the incident ray's cumulative contribution \emph{after}
the cache entry has been initialised, since the incident ray is variable.
% ---------------------------------------------------------------------------
\clearpage
\section{Compilation}
The precomputed contribution photon map is based on the out-of-core
photon map data structure, which requires enabling \cmd{PMAP\_OOC} at
compile time:
\begin{center}
\begin{minipage}{0.5\linewidth}
\begin{lstlisting}
rmake OPT+=-DPMAP_OOC
\end{lstlisting}
\end{minipage}
\end{center}
This enables \cmd{PMAP\_CONTRIB} in \cmd{pmapcontrib.h}, which in turn
enables the contribution photon mapping code and the corresponding
options in \mkpmap. Note that this functionality is absent in the
in-core kd-tree data structure. Since the latter is still the default
as of this writing, out-of-core (and therefore contribution) photon
mapping must be explicitly enabled.
In addition, verbose debugging output and sanity checks in the
contribution photon map code can be enabled as follows:
\begin{center}
\begin{minipage}{0.5\linewidth}
\begin{lstlisting}
rmake OPT+="-DPMAP_OOC -DPMAP_CONTRIB_DBG"
\end{lstlisting}
\end{minipage}
\end{center}
These sanity checks are expensive and should be disabled
in production code. Among the tests performed are:
\begin{itemize}
\item Checking for out-of-order wavelet coefficients after calling
\cmd{coeffPartition()} in \cmd{thresholdContribs()}.
\item Encoding of generated contributions in \cmd{encodeContribs()},
including a ``bump'' function similar to the initialisation
options available in the wavelet unit test (see section
\ref{sec:wavelet3-test}).
\item Checking of mRGBE encoded wavelet coefficients in
\cmd{encodeContribs()} by comparing the
decoded coefficient with the original, and triggering a
consistency error if their difference is outside tolerance.
This error is also triggered if the encoded and decoded
incremental coefficient index differs. Furthermore, a
coefficient sign reversal (in any colour channel) triggers
this error too.
\item A 10-second delay in \cmd{preComputeContrib()} after forking
child processes in multiprocessing mode (when passing the
\opt{-n} option to \mkpmap), giving the user time
to attach a debugger to a particular subprocess.
The PID of all forked subprocesses is dumped to the console
for convenience.
\item Checking for duplicate photon selection in
\cmd{preComputeContrib()}; no photon should be selected twice
for precomputation, unless the random photon index selection
is buggy.
\item Tally and output of average mRGBE deviations, similar to
the mRGBE unit test (see section \ref{sec:mrgbe-test});
\item Checking for invalid mRGBE-encoded wavelet coefficient indices
in \cmd{decodeContribs()}.
\end{itemize}
Unit tests for the contribution binning, wavelet transform and mRGBE
encoding can also be enabled at compile time. See sections
\ref{sec:pmapcontrib-test}, \ref{sec:wavelet3-test} and
\ref{sec:mrgbe-test}, respectively, for details.
% ---------------------------------------------------------------------------
%\clearpage
\chapter{Results}
\section{Wavelet Transform Colourmaps}
\label{sec:xformCmaps}
\subsection{CIE Sunny Sky}
To illustrate the effects of wavelet compression, a simple CIE sunny
sky model (without solar source) was generated with \cmd{gensky} to
serve as reference. The contributions from this sky source were
collected with \rcClassic{} in 32 $\times$ 32 = 1024 bins using a
Shirley-Chiu mapping, and reshaped into a 2D matrix. A fisheye rendering
of the sky and a colourmap of the binned contributions are shown for
reference in figure \ref{fig:skycontrib1024-cie-orig}. The corresponding
script to bin the contributions is shown in listing
\ref{lst:skycontrib-1024.sh}.
\lstinputlisting[float=h, inputpath={listings},
label=lst:skycontrib-1024.sh,
caption = {Script to bin sky contributions with \rcClassic{} using
Shirley-Chiu mapping. Note the use of the \opt{-c} option to
sample multiple rays per bin and reduce variance.
}
]
{skycontrib-1024.sh}
The 2D contributions were loaded as a datafile into the \cmd{wavelet3-test}
unit test (see section \ref{sec:wavelet3-test}) and transformed using the
padded 2D wavelet transform.
Figure \ref{fig:skycontrib1024-cie-coeffs} shows colourmaps of
the resulting wavelet coefficient matrices after thresholding, alongside
the reconstructed contributions, for a sequence of compression ratios.
The colourmaps encode absolute value as colour, with black corresponding
to zero. Black regions in the wavelet matrices indicate either unused
space to accommodate padding coefficients, or thresholded detail
coefficients. The largest detail coefficients lie in the upper left of
the matrix, towards the coarser resolutions, indicating that the signal has
predominantly low-frequency content, as expected from the CIE model.
Accordingly, these coefficients are only thresholded at very high
compression ratios.
The approximation coefficients in the extreme upper left submatrix
(shown in yellow) are largest in magnitude and never thresholded.
In the contributions reconstructed by the inverse wavelet transform,
artefacts become noticeable beyond a compression ratio of 80\%.
Owing to the low-frequency content of the contributions, the resulting
loss of detail remains tolerable up to ca. 95\%, at which point
the circumsolar region bleeds into the boundary.
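For reference, the hard-thresholding step assumed throughout this discussion
can be sketched generically as follows: the smallest detail coefficients (by
magnitude) are zeroed to meet a target compression ratio, while the
approximation coefficients are always retained. This is a generic
formulation for illustration, not the code in \cmd{thresholdContribs()}.
\begin{lstlisting}
/* Generic hard-thresholding sketch (NOT the actual thresholdContribs()):
   zero the fraction 'ratio' of detail coefficients with the smallest
   magnitude; the first numApprox (approximation) coefficients are kept. */
#include <stdlib.h>
#include <math.h>

static int cmpMag (const void *a, const void *b)
{
   const float ma = fabsf(*(const float*)a), mb = fabsf(*(const float*)b);
   return (ma > mb) - (ma < mb);
}

void thresholdCoeffs (float *coeff, size_t numCoeff, size_t numApprox,
                      float ratio)
{
   const size_t numDetail = numCoeff - numApprox,
                numZero = (size_t)(ratio * numDetail);
   float        *mag = malloc(numDetail * sizeof(float)), thresh;
   size_t       i;

   if (!mag || !numZero) {
      free(mag);
      return;
   }

   /* Find magnitude threshold from a sorted copy of the detail coeffs */
   for (i = 0; i < numDetail; i++)
      mag [i] = coeff [numApprox + i];
   qsort(mag, numDetail, sizeof(float), cmpMag);
   thresh = fabsf(mag [numZero - 1]);

   /* Zero all detail coefficients at or below the threshold */
   for (i = numApprox; i < numCoeff; i++)
      if (fabsf(coeff [i]) <= thresh)
         coeff [i] = 0;

   free(mag);
}
\end{lstlisting}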
\begin{figure}[p]
\centering
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/ciesky%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-orig-imgplot-crop%
}
\subcaption{%
\label{fig:skycontrib1024-cie-orig}
Original: fisheye rendering (left),
binned contributions (right).
}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-00-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-00-imgplot-crop%
}
\subcaption{Wavelet transform: no compression}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-50-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-50-imgplot-crop%
}
\subcaption{Wavelet transform: 50\% compression}
\end{subfigure}\\
\medskip
\caption{%
Colourmaps of wavelet coefficient matrix (left) and
reconstructed contributions (right) from a CIE sky model
for different compression ratios. Black regions in the
coefficient matrix indicate unused or thresholded coefficients.
The original sky distribution and binned contributions are shown
in \ref{fig:skycontrib1024-cie-orig}.
}
\end{figure}
\begin{figure}[p]\ContinuedFloat
\centering
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-60-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-60-imgplot-crop%
}
\subcaption{60\% Compression}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-70-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-70-imgplot-crop%
}
\subcaption{70\% Compression}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-80-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-80-imgplot-crop%
}
\subcaption{80\% Compression}
\end{subfigure}\\
\medskip
\caption{(continued).}
\end{figure}
\begin{figure}[p]\ContinuedFloat
\centering
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-90-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-90-imgplot-crop%
}
\subcaption{90\% Compression}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-95-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-95-imgplot-crop%
}
\subcaption{95\% Compression (supernova)}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-98-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-98-imgplot-crop%
}
\subcaption{98\% Compression (FUBAR)}
\end{subfigure}\\
\medskip
\caption{(continued).}
\label{fig:skycontrib1024-cie-coeffs}
\end{figure}
% ---------------------------------------------------------------------------
\subsection{HDR Sky Capture}
The procedure from the previous section was applied to an HDR sky capture
obtained from a camera with a fisheye lens, which is shown for reference
along with the corresponding binned contributions in figure
\ref{fig:skycontrib1024-hdr-orig}. Note the capture also includes
vegetation on the horizon. Figure \ref{fig:skycontrib1024-hdr-coeffs}
again shows colourmaps of the thresholded wavelet coefficient matrices
and the resulting reconstructed contributions.
This example exhibits more high-frequency content (notably in the
boundary regions due to the vegetation on the horizon); consequently,
artefacts already become objectionable at 90\% compression and above,
and the sky distribution breaks up completely above 95\%.
\begin{figure}[p]
\centering
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/hdrsky%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-orig-imgplot-crop%
}
\subcaption{%
\label{fig:skycontrib1024-hdr-orig}
Original: HDR camera capture (left),
binned contributions (right).
}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-00-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-00-imgplot-crop%
}
\subcaption{Wavelet transform: no compression}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-50-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-50-imgplot-crop%
}
\subcaption{Wavelet transform: 50\% compression}
\end{subfigure}\\
\medskip
\caption{%
Colourmaps of wavelet coefficient matrix (left) and
reconstructed contributions (right) from an HDR sky capture
for different compression ratios. Black regions in the
coefficient matrix indicate unused or thresholded coefficients.
The original sky distribution and binned contributions are shown
in \ref{fig:skycontrib1024-hdr-orig}.
}
\end{figure}
\begin{figure}[p]\ContinuedFloat
\centering
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-60-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-60-imgplot-crop%
}
\subcaption{60\% Compression}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-70-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-70-imgplot-crop%
}
\subcaption{70\% Compression}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-80-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-80-imgplot-crop%
}
\subcaption{80\% Compression}
\end{subfigure}\\
\medskip
\caption{(continued).}
\end{figure}
\begin{figure}[p]\ContinuedFloat
\centering
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-90-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-90-imgplot-crop%
}
\subcaption{90\% Compression (supernova)}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-95-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-95-imgplot-crop%
}
\subcaption{95\% Compression (FUBAR)}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-98-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-98-imgplot-crop%
}
\subcaption{98\% Compression (black hole?)}
\end{subfigure}\\
\medskip
\caption{(continued).}
\label{fig:skycontrib1024-hdr-coeffs}
\end{figure}
% ---------------------------------------------------------------------------
\section{Binned Contribution Renderings}
\label{sec:cpmapTestHDR}
Figure \ref{fig:cpmapTestHDR-rc} shows a set of per-bin fisheye
falsecolour renderings generated by \rcClassic{} for a
simple,
bilaterally lit scene. The scene contains two opposing fenestrations with
cyan tint, and 3970 suns uniformly distributed in the sky dome. The
sensor position is located in the centre of the space, at floor height.
The number of ambient bounces was set to 4 (\opt{-ab 4}) to obtain
a reasonable indirect illuminance estimate. Using 256$\times$256 primary
rays per bin, the running time on an Intel Xeon E5-2660 @ 2.60GHz
with 20 cores was ca. 1 hour. The shell script%
\footnote{
Yeah, it's \cmd{(t)csh}; we're too old-school for \cmd{bash}.
}
used to generate the renderings is shown in
listing~\ref{lst:cpmapTestHDR-rc}.
\lstinputlisting[float=h, label=lst:cpmapTestHDR-rc, inputpath={listings},
caption = {Script to generate the per-bin fisheye renderings in
figure \ref{fig:cpmapTestHDR-rc} with \rcClassic.
}
]
{rc-hdr.sh}
Figure \ref{fig:cpmapTestHDR-pmap} shows the same scene rendered with
\rcontrib{} using ca. 64000 precomputed contribution photons, whose
contributions were wavelet compressed by 80\%. The contributions were
rendered without ambient bounce (\opt{-ab -1}) and instead evaluated
directly via photon lookups at each pixel's world coordinates.
Despite the visible noise due to direct photon visualisation, there is
overall good agreement with figure \ref{fig:cpmapTestHDR-rc}, except for
the leftmost column, which corresponds to bin numbers that are even
multiples of 16. These bins lie on the boundary of the wavelet transform
domain and are therefore subject to compression artefacts, as discussed
in the next section.
While \mkpmap{} took ca. 13.5 minutes to precompute the photon map on
the same CPU, rendering with \rcontrib{} took only 14 seconds.
\lstinputlisting[float=h, label=lst:cpmapTestHDR-pmap, inputpath={listings},
caption = {Script to generate the per-bin fisheye renderings in
figure \ref{fig:cpmapTestHDR-pmap} with \rcontrib{} using
precomputed contribution photon mapping.
}
]
{cpmapTest-hdr.sh}
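To make the evaluation step concrete, the following sketch illustrates one
way the per-pixel lookup described above can be realised: the bin
contributions precomputed for the nearest photon are returned directly,
without a density estimate or ambient bounces. The \cmd{PrecompPhoton} type
and the lookup routine \cmd{findNearestPhoton()} are placeholders, not the
actual photon map interface.
\begin{lstlisting}
/* Sketch of direct per-pixel evaluation of precomputed contribution
   photons; the PrecompPhoton type and findNearestPhoton() are
   placeholders for the actual photon map lookup. */
typedef struct {
   float       pos [3];
   const float *contrib;      /* decoded per-bin contributions */
} PrecompPhoton;

const PrecompPhoton *findNearestPhoton (const float worldPos [3]);

void evalPixelContribs (const float worldPos [3], float *outContrib,
                        unsigned numBins)
{
   const PrecompPhoton *photon = findNearestPhoton(worldPos);
   unsigned b;

   /* Direct visualisation: no ambient bounces (-ab -1), hence fast but
      noisy where photon density is low */
   for (b = 0; b < numBins; b++)
      outContrib [b] = photon ? photon -> contrib [b] : 0;
}
\end{lstlisting}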
\begin{figure}[h]
\centering
\includegraphics[width=\linewidth]{%
rc-bn64-ab4-compos-false%
}
\parbox{\linewidth}{%
\caption{%
\label{fig:cpmapTestHDR-rc}
Falsecolour rendered contributions from \rcClassic{} for
each of 64 bins in the bilaterally lit test scene containing
3970 solar positions.
The sensor is located in the centre of the space on the floor,
facing the zenith. The glazings on either side of the space
are tinted cyan.
This sequence was rendered with up to 4 ambient
bounces (\opt{-ab 4}). On 20 cores,
the renderings took ca. 1 hour to complete at
256$\times$256 pixels per bin.
}
}
\end{figure}
\begin{figure}[h]
\centering
\includegraphics[width=\linewidth]{%
cpmapTest-64m-bn64-2400-comp0.8-reorient-compos-false%
}
\parbox{\linewidth}{%
\caption{%
\label{fig:cpmapTestHDR-pmap}
The bilaterally lit scene from figure \ref{fig:cpmapTestHDR-rc}
rendered with precomputed contribution photons
after 80\% wavelet compression.
This sequence was rendered with \opt{-ab -1};
consequently, no ambient bounces are performed. This greatly
accelerates computation at the expense of some noise: on 20 cores,
the entire sequence took just 14 seconds to complete at
256$\times$256 pixels per bin.
}
}
\end{figure}
\section{Wavelet Compression and Boundary Artefacts}
Figure \ref{fig:contribCompression} shows a series of 3D plots of
the contributions in the bilaterally lit scene for different
wavelet compression ratios. This example uses a first-order gradient
boundary extension and logarithmic contribution encoding.
While noise in the photon density estimate dominates at lower compression,
boundary artefacts tend to dominate at higher compression ratios. This
suggests the transform is very sensitive to thresholded padding
coefficients, and offers an explanation for the deviations in the
leftmost column in figure \ref{fig:cpmapTestHDR-pmap}.
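For clarity, the following 1D sketch shows what a first-order gradient
boundary extension means in this context: padding samples beyond the signal
boundary are extrapolated from the gradient of the two outermost samples.
This is a generic illustration of the padding scheme, not the actual
transform code.
\begin{lstlisting}
/* Generic 1D sketch of a first-order gradient boundary extension (NOT
   the actual transform code): extrapolate 'pad' samples beyond each end
   of the signal from the gradient of the two outermost samples.
   Assumes len >= 2 and a padded buffer of size len + 2*pad. */
void extendGradient1D (const float *sig, unsigned len,
                       float *padded, unsigned pad)
{
   const float gradLo = sig [1] - sig [0],
               gradHi = sig [len-1] - sig [len-2];
   unsigned    i;

   /* Copy the original signal into the centre of the padded buffer */
   for (i = 0; i < len; i++)
      padded [pad + i] = sig [i];

   /* Linearly extrapolate beyond both boundaries */
   for (i = 1; i <= pad; i++) {
      padded [pad - i]           = sig [0]     - i * gradLo;
      padded [pad + len - 1 + i] = sig [len-1] + i * gradHi;
   }
}
\end{lstlisting}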
\begin{figure}[p]
\centering
\begin{minipage}{0.49\linewidth}
\centering
\includegraphics[width=\linewidth]{%
cpmapTest-64m-bn256-3200-comp0.5%
}\\
Compression 0.5
\end{minipage}
\hfill
\begin{minipage}{0.49\linewidth}
\centering
\includegraphics[width=\linewidth]{%
cpmapTest-64m-bn256-3200-comp0.6%
}\\
Compression 0.6
\end{minipage}\\[3mm]
\begin{minipage}{0.49\linewidth}
\centering
\includegraphics[width=\linewidth]{%
cpmapTest-64m-bn256-3200-comp0.7%
}\\
Compression 0.7
\end{minipage}
\hfill
\begin{minipage}{0.49\linewidth}
\centering
\includegraphics[width=\linewidth]{%
cpmapTest-64m-bn256-3200-comp0.8%
}\\
Compression 0.8
\end{minipage}\\[3mm]
\centering
\begin{minipage}{0.49\linewidth}
\centering
\includegraphics[width=\linewidth]{%
cpmapTest-64m-bn256-3200-comp0.9%
}\\
Compression 0.9
\end{minipage}\\[3mm]
\parbox{\linewidth}{%
\caption{%
\label{fig:contribCompression}
3D plots of contributions calculated by \rcClassic{}
(blue) and the photon map (red) in the bilaterally lit scene.
The contributions are accumulated in 256 bins for a
sensor located in the centre of the floor.
The contribution photon map was compressed with ratios
of 0.5 to 0.9. In this example, compression artefacts
appear at the boundaries (corresponding to the horizon)
with 80\% compression and above. This is a known limitation
of wavelet compression if the input signal exhibits high
gradients at the boundary.
}
}
\end{figure}
% ---------------------------------------------------------------------------
%\clearpage
\setcounter{secnumdepth}{-2}
\chapter{Acknowledgements}
\setcounter{secnumdepth}{2}
This research was supported by the Swiss National Science Foundation
as part of the project ``Light Fields for Spatio-Temporal Glare
Assessment'' (\#179067).
The author would like to thank his colleagues Dr. Lars Grobe and
Dr. Stephen Wasilewski for their collaboration, for providing feedback
during weekly lunchtime meetings, and for testing the code on MacOS.
The author would also like to express special thanks to the head of
the former Competence Centre for Envelopes and Solar Energy (CC EASE),
Prof. Stephen Wittkopf.
%, who gave our crew a sleek
%vessel, until it was torpedoed by the powers that be in 2016.
Finally, the author would like to thank his colleagues at
LBNL and Ladybug Tools for supporting the release of the new
photon mapping code through automated testing on supported platforms,
notably Mingbo Peng and Taoning Wang. And of course, the author's
biggest thanks go out to Greg Ward for his enduring support and
service to the \radiance{}
{\fontfamily{jkpvos}\selectfont\textit{Publick}}
... uh, community.
{\fontfamily{jkpvos}\selectfont\Large\vspace{1cm}\noindent
\textit{\textbf{The crew of the former CC EASE bids= you Farewell
for now, and thanks= the RADIANCE Publick for their Support.
}
}
}
% ---------------------------------------------------------------------------
%\clearpage
\bibliographystyle{alpha}
\bibliography{precomp-contrib-pmap-techreport}
% ---------------------------------------------------------------------------
%\clearpage
\appendix
\chapter{Source Code Definitions}
\lstinputlisting[float=h, label=lst:codeDefs1, inputpath={listings},
caption = {Source code definitions in \cmd{pmapcontrib.h}
relevant to precomputed contribution photons.
}
]
{pmapcontrib.h}
\lstinputlisting[float=h, label=lst:codeDefs2, inputpath={listings},
caption = {Source code definitions in \cmd{wavelet2.h} and
\cmd{mrgbe.h} relevant to precomputed contribution photons.
}
]
{wavelet2-mrgbe.h}
\lstinputlisting[float=h, label=lst:codeDefs3, inputpath={listings},
caption = {Source code definitions in \cmd{pmapdata.h} relevant to
precomputed contribution photons.
}
]
{pmapdata.h}
% ---------------------------------------------------------------------------
\chapter{Software Architecture}
\label{sec:swarch}
Figure \ref{fig:swarch-mkpmap} provides an overview of the precomputed
contribution photon map's software architecture in the context of
photon map generation and precomputation with \cmd{mkpmap}.
Figure \ref{fig:swarch-rcontrib} gives a similar overview
in the context of the photon map's evaluation with \cmd{rcontrib}.
The blue blocks group relevant functions (red text) within software
modules built from C source files with the indicated names.
Red arrows correspond to function calls, which are ordered top-down.
Note that some routines, notably
\cmd{savePhotonMap()} and \cmd{loadPhotonMap()}, recurse on a parent
photon map's per-modifier child photon maps; hence the corresponding
arrows indicate a loop.
A special case are the variables \cmd{pmapcontrib:pmapContribTab} and
\cmd{pmapcontrib:pmapContribMode}, which are set by
\cmd{pmcontrib4:initPmapContribTab()} in the context of \rcontrib.
These serve as references (pointers) to \rcontrib's contribution table
\cmd{rcontrib:modconttab} and coefficient/contribution mode flag
\cmd{rcontrib:contrib}, but are unused (i.e. NULL) in the context of
\mkpmap, where no such variables exist.
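As a minimal illustration of this indirection (a sketch under the naming
used above; the actual declarations and types differ),
\cmd{initPmapContribTab()} merely records the addresses of \rcontrib's
table and mode flag, which remain NULL in \mkpmap:
\begin{lstlisting}
/* Illustrative sketch of the pmapContribTab / pmapContribMode
   indirection; the table type and signature are placeholders, not the
   actual source. */
typedef struct ModContribTab ModContribTab;  /* stand-in for rcontrib's table */

/* NULL in the context of mkpmap, where rcontrib's variables do not exist */
static ModContribTab *pmapContribTab = NULL;
static int           *pmapContribMode = NULL;

/* Called from rcontrib to register its contribution table and
   coefficient/contribution mode flag with the photon map code */
void initPmapContribTab (ModContribTab *contribTab, int *contribMode)
{
   pmapContribTab = contribTab;
   pmapContribMode = contribMode;
}
\end{lstlisting}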
\begin{sidewaysfigure}
\center
\includegraphics[width=0.95\textwidth]{contrib-swarch-mkpmap2-crop}
\caption{
\label{fig:swarch-mkpmap}
Software architecture of precomputed contribution photon map in
the context of \cmd{mkpmap}. Red arrows represent function calls,
ordered top-down.
}
\end{sidewaysfigure}
\begin{sidewaysfigure}
\center
\includegraphics[width=0.95\linewidth]{contrib-swarch-rcontrib-crop}
\caption{
\label{fig:swarch-rcontrib}
Software architecture of precomputed contribution photon map in
the context of \cmd{rcontrib}. Red arrows represent function calls,
ordered top-down.
}
\end{sidewaysfigure}
\end{document}
