head 1.13;
access;
symbols;
locks
u-no-hoo:1.13; strict;
comment @% @;
1.13
date 2023.06.06.15.51.20; author u-no-hoo; state Exp;
branches;
next 1.12;
1.12
date 2023.06.06.15.41.32; author u-no-hoo; state Exp;
branches;
next 1.11;
1.11
date 2022.07.24.13.01.08; author u-no-hoo; state Exp;
branches;
next 1.10;
1.10
date 2022.07.14.18.35.35; author u-no-hoo; state Exp;
branches;
next 1.9;
1.9
date 2022.07.13.21.02.49; author u-no-hoo; state Exp;
branches;
next 1.8;
1.8
date 2022.07.08.17.27.25; author u-no-hoo; state Exp;
branches;
next 1.7;
1.7
date 2022.07.08.17.10.27; author u-no-hoo; state Exp;
branches;
next 1.6;
1.6
date 2022.06.14.17.19.02; author u-no-hoo; state Exp;
branches;
next 1.5;
1.5
date 2022.06.07.14.03.42; author u-no-hoo; state Exp;
branches;
next 1.4;
1.4
date 2022.05.12.21.06.49; author u-no-hoo; state Exp;
branches;
next 1.3;
1.3
date 2022.05.12.20.53.18; author u-no-hoo; state Exp;
branches;
next 1.2;
1.2
date 2022.05.07.14.24.45; author u-no-hoo; state Exp;
branches;
next 1.1;
1.1
date 2022.05.07.00.27.36; author u-no-hoo; state Exp;
branches;
next ;
desc
@Technical report on the RADIANCE precomputed contribution photon map.
@
1.13
log
@Affiliation updated to CC Building Envelopes
@
text
@\documentclass[a4paper,11pt]{report}
\usepackage{color,graphicx,pslatex,rcs,gensymb,bigints,listings,rotating}
\usepackage[colorlinks,urlcolor=red,bookmarks=false,breaklinks=false]{
hyperref
}
\usepackage[skip=0.5em]{subcaption}
\usepackage[onlyrm,veryoldstyle]{kpfonts}
%\usepackage[firstpage,stamp]{draftwatermark}
\usepackage[T1]{fontenc}
% Reduce margins
\addtolength{\oddsidemargin}{-1.5cm}
\addtolength{\evensidemargin}{-1.5cm}
\addtolength{\textwidth}{3cm}
\addtolength{\topmargin}{-2cm}
\addtolength{\textheight}{3cm}
\setlength{\marginparwidth}{2cm}
\usepackage{todonotes}
\graphicspath{{figs/}}
\definecolor{lstbg}{gray}{0.9}
\definecolor{lstfg}{rgb}{0,0,0.5}
\lstset{language=C, frame=single, backgroundcolor=\color{lstbg},
% xleftmargin=5pt, xrightmargin=5pt,
% aboveskip=20pt, belowskip=20pt,
numbers=none, captionpos=b, basicstyle=\footnotesize\color{lstfg},
breaklines=true, breakatwhitespace=true, showstringspaces=false,
escapeinside={(*@@}{@@*)}
}
\renewcommand{\familydefault}{\sfdefault}
\newcommand{\radiance}{\textsc{radiance}}
\newcommand{\radClassic}{\textsc{radiance classic$^\textsf{\tiny TM}$}}
\newcommand{\var}[1]{$\mathit{#1}$}
\newcommand{\lit}[1]{\textit{#1}}
\newcommand{\opt}[1]{\textbf{#1}}
\newcommand{\cmd}[1]{\textit{\textbf{#1}}\/}
\newcommand{\realLife}{\textsc{RealLife$^\textsf{\tiny TM}$}}
\newcommand{\mkpmap}{\cmd{mkpmap}}
\newcommand{\rcontrib}{\cmd{rcontrib}}
\newcommand{\rtrace}{\cmd{rtrace}}
\newcommand{\rcClassic}{\cmd{rcontrib classic$^\textsf{\tiny TM}$}}
\RCS $Revision: 1.12 $
\RCS $Date: 2023/06/06 15:41:32 $
\fontfamily{jpk}\selectfont
\title{The \radiance{} Precomputed Contribution Photon Map\\
--- Technical Report ---\\[5mm]
{\fontfamily{jkpvos}\selectfont\Large
\textrm{
\textit{%
Being an Account of the final remarkable Enterprises=
and desperate Actions=, \&c, undertaken by the shipwreck'd Crew
of the founder'd CC~EASE,
in which is= describ'd a most fantastick Design,
viz. furnishing
a compleat computed Day-Light, by the Season,
to the RADIANCE Publick
}
}
}
}
\fontfamily{jpk}\selectfont
\author{
Roland Schregle (roland.schregle@@\{hslu.ch, gmail.com\})\\
CC Building Envelopes and Civil Engineering\\
Lucerne University of Applied Sciences and Arts
}
\date{
Revision \RCSRevision\\
\today
}
\begin{document}
\maketitle
\begin{abstract}
The \rcontrib{} tool included with the \radiance{} lighting simulation
and rendering software suite is useful for calculating daylight
coefficients for climate-based modelling, notably to assess annual
daylight availability under representative sky conditions.
This requires ``binning'' contributions from grouped sky directions
(``patches'') and sun positions. This can be a slow process as
\rcontrib{} does not cache irradiance, resulting in redundant rays
being traced for neighbouring sensor positions.
\radiance{} includes a contribution photon mapping module to bin
contributions by tracing rays from the light sources (forward
raytracing), in contrast to \radiance's standard backward raytracing
approach from the sensors (referred to here as \rcClassic). Photon
mapping is particularly efficient at simulating redirection through
complex shading devices using data-driven BSDFs, such as prismatic
films.
This technical report documents the further development of the
\radiance{} contribution photon map to support precomputed
contributions to reduce redundancy in \rcontrib.
To this end, the contribution photon map utilises wavelet
compression and an efficient coefficient encoding to compactly
represent the large volume of data the contributions incur.
The encoded coefficients are then paged on-demand from disk
and cached to reduce the in-core footprint of the contributions
in \rcontrib.
This document focuses on the implementation details of
the photon mapping software in support of this new functionality.
\end{abstract}
\tableofcontents
% ---------------------------------------------------------------------------
%\clearpage
\chapter{Introduction}
\section{Motivation}
Daylight coefficients and contributions are used in \radiance{} workflows
to simulate and analyse annual daylight availability subject to
seasonal variations. This includes accumulating contributions or
daylight coefficients (normalised contributions) from solar sources
and sky ``patches'', i.e. discrete subdivisions of the incident
hemisphere on the receiving surface. These coefficients or
contributions are then propagated in the scene to predict indirect
daylight availability on work planes, for example.
In a static scene, these daylight coefficients can be subsequently
scaled a posteriori with
discretised sky luminance vectors obtained either from empirical sky
models or climate data measured at the proposed site.
This process can be repeated as desired for a time-series of luminance
distributions without the need to recalculate the coefficients, since
these remain static along with the geometry. The resulting vector
calculations can be easily parallelised on modern computing platforms
for very large time-series, affording a high temporal resolution.
Calculating the coefficients at sufficiently high resolution in the
first place can, however, be computationally expensive.
\radiance{} uses the \rcontrib{} tool to accumulate contributions
or coefficients from light sources as well as arbitrary objects. It
accounts for the light transport paths contributed by a set of
user-specified objects (identified by their modifiers), optionally
discretising their incident directions into ``bins'', and outputs these
either to one or more files, or to the console. In the context of contributions,
the incident direction is that of the first ray emitted or scattered
from an object with a modifier whose contributions are sought.
The \radiance{} photon map was extended to support annual daylight
simulations using contribution photons when it became part of the
official \radiance{} software distribution
\cite{schregle-techreport-2015}. Its primary purpose was to efficiently
compute daylight coefficients for daylight redirecting components with
predominantly specular reflection or transmission.
Initial results using contribution
photon mapping indicated -- unsurprisingly -- that the number of
required photons to adequately predict lighting levels
scales linearly with the number of
timestamps / light sources \cite{schregle-cisbat-2015}.
This in turn motivated the development of an \textit{out-of-core} photon
map that maintains its photons entirely on disk, and
dynamically loads subsets of photons on demand
\cite{schregle-oocpmap-jbps-2016, schregle-techreport-2016}.
By loading photons on demand, the resident memory footprint is reduced to
those photons which actually contribute to the sensor points under
consideration. This enables complex annual daylight simulations with
large photon maps to be performed efficiently on commodity office PCs.
Yay.
This technical report supplements the user documentation found in the
\radiance{} photon map manual \cite{schregle-pmapManual-2022}
and \mkpmap{} and \rcontrib{} manpages. It is primarily intended
for researchers interested in extending the code, and serves as the
principal documentation of the source code.
% ---------------------------------------------------------------------------
\section{Overview}
An overview of the precomputed contribution photon map toolchain is shown
in figure \ref{fig:overview}. It follows the general \radiance{} photon
mapping workflow consisting of a \mkpmap{} preprocess to generate the
photon map, and a modified \radiance{} tool which evaluates
irradiance from the photons for a given set of virtual
sensor positions (often arranged in a grid).
\begin{figure}[p]
\centering
\includegraphics[width=\linewidth]{contribpmap-overview3-crop}
\parbox{\linewidth}{%
\caption{%
\label{fig:overview}
Overview of contribution photon mapping workflow.
Green paths denote inputs (parameters and sensor positions),
while red paths denote output. Light source contributions for
modifier \var{mod} are binned using an
$\left\lfloor\sqrt{nbins}\right\rfloor^2$
Shirley-Chiu disk-to-square mapping, and wavelet compressed
for a fraction \var{precomp} of photons by \mkpmap.
These contributions are saved along with the corresponding
photons in separate files for each modifier, grouped in a
subdirectory under the parent photon map \var{pmapfile}. The
photons and their precomputed contributions are subsequently
paged on demand, uncompressed, and cached by \rcontrib,
which passes the contributions to the standard contribution
calculation used by \rcClassic.
}
}
\end{figure}
The photon map represents
a precomputed radiance distribution which can be evaluated multiple
times under the assumption that the geometry and lighting remain static.
This aspect of photon mapping amortises the expense of precomputation,
thus the payoff scales with the number of sensor positions. Contribution
photon mapping is then significantly faster than ``classic'' \rcontrib{}
in the presence of complex geometry and specular materials, giving rise
to multiple scattering events. This effect is further amplified by the
fact that the irradiance cache is disabled in \rcontrib.
In precomputed contribution photon mapping mode, \mkpmap{} is designed
to behave similarly to \rcontrib, and accepts a subset of its parameters to
``bin'' (discretise) contributions based on their incident directions.
The contributors are identified by their modifiers with the \opt{-m}
option, but unlike \rcClassic, they are restricted to light
sources only. The binned direction is the incident direction of a
photon's first interaction with a scene object after emission from a
contributing light source. This corresponds to the \textit{photon primaries}
in the earlier contribution photon mapping implementation, although these
are no longer stored as they are no longer evaluated in \rcontrib.
The workflow in figure \ref{fig:overview} can be divided into the
following stages, each comprising distinct components as indicated
in the gray boxes:
\begin{enumerate}
\item Photon map generation with \mkpmap:
This precomputes the contribution photons and compresses them
for use by \rcontrib. Under the hood, \mkpmap{} performs
the following steps:
\begin{enumerate}
\item Distribution of \var{nphotons} photons as specified with
the \opt{-apC} option.
\item Binning of photon incident directions into \var{nbins} bins
(as specified with the \opt{-bn} option) using a Shirley-Chiu
disk-to-square mapping \cite{shirleyChiu-1997}. Technically,
this already occurs on-the-fly during photon distribution.
\item Precomputation of binned contributions for a fraction
\var{precomp} of the distributed photons, as specified with
the \opt{-apP} option (this option already served the same
purpose for the existing precomputed global photon map).
This involves performing a density estimate by locating the
\var{bwidth} nearest photons in the vicinity of a precomputed
photon, and accumulating their pre-binned contributions
in the mapped Shirley-Chiu square.
\item A wavelet transform over the binned contributions,
resulting in a set of approximation and detail coefficients.
\item Compression of wavelet detail coefficients by
thresholding, i.e. keeping only the (\var{comp})\% most
significant coefficients, as specified with the \opt{-apC}
option.
\item Compact encoding of thresholded wavelet coefficients
using a customised 32-bit \textit{mRGBE} format similar to
the RGBE format used for \radiance{} HDR files
\cite{ward-RGBE-1994}, with an additional coefficient index
to reconstruct the original bin ordering.
\item Saving of the precomputed photons and their corresponding
compressed wavelet coefficients; the previously distributed
photons are no longer needed and are discarded.
\end{enumerate}
\item The out-of-core precomputed contribution photon map on disk:
This consists of the following files:
\begin{enumerate}
\item A \emph{parent} photon map file \var{pmapfile} as
specified with the \opt{-apC} option.
This file does not actually contain any photons, and merely
serves as a header for \cmd{getinfo}.
\item An option file \var{pmapfile.opt} containing the binning
options used to precompute the contribution photons; this is
passed to \rcontrib{} via its \opt{@@} option to ensure consistent
binning when evaluating the contributions.
\item A subdirectory \var{pmapfile.rc} containing the actual
precomputed contribution photon maps.
\item Per-modifier \emph{child} photon maps \var{mod_i.pm}
for each modifier \var{mod_i} specified with the \opt{-m}
option. These use the existing out-of-core photon map format
\cite{schregle-techreport-2016} and include a companion
\var{mod_i.pm.leaf} file containing the actual photon payload.
\item Per-modifier compressed wavelet coefficients file
\var{mod_i.wvt}, organised as aggregate records of fixed size
per photon. The ordering is identical to that of the out-of-core
leaf file to facilitate on-demand paging.
\end{enumerate}
\item Evaluation of precomputed contributions with \rcontrib:
\begin{enumerate}
\item Lookup of the single closest precomputed photon at each
given sensor position, paging photons
from the out-of-core leaf file as required. This returns
a photon index $p$ corresponding to the photon's record number
in the leaf file.
\item Paging of the photon's associated mRGBE-encoded,
compressed wavelet coefficients, which are located in the
$p$-th record in the wavelet coefficient file (see the sketch
following this list).
\item mRGBE decoding of the wavelet coefficients to floating
point, plus coefficient index.
\item Expansion of compressed wavelet coefficients by populating
the coefficient matrix with the decoded coefficients in the
ordering indicated by their decoded indices.
\item Inverse wavelet transform to recover the original binned
contributions in the Shirley-Chiu square.
\item Caching of reconstructed contributions to hide latency
incurred by paging and reconstruction. This cache is
derived from that used by the existing out-of-core photon map,
and contains \var{ncached} pages (as specified with the
\opt{-aC} option), each containing a set of
fully decoded binned contributions for a single precomputed
photon.
\item Passing of reconstructed contributions to \rcontrib's
internal contribution lookup table, at which point control
is handed over to \textit{Don Gregorio} to produce the usual
binned contributions per sensor position according to the
specified output format.
\end{enumerate}
\end{enumerate}
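Since each photon's compressed wavelet coefficients occupy a record of
fixed size, locating them in the per-modifier coefficient file reduces
to simple offset arithmetic. The following minimal sketch illustrates
this; the type and function names are hypothetical, not those of the
actual implementation:
\begin{lstlisting}
#include <stdio.h>
#include <stdint.h>

typedef uint32_t mRGBE;    /* 32-bit encoded wavelet coefficient */

/* Page the fixed-size record of nCoeffs encoded coefficients for
   precomputed photon index p from the per-modifier wavelet
   coefficient file; returns 0 on success, -1 on failure. */
int pageCoeffRecord (FILE *wvtFile, unsigned long p,
                     unsigned nCoeffs, mRGBE *coeffs)
{
   if (fseek(wvtFile, (long)(p * nCoeffs * sizeof(mRGBE)), SEEK_SET))
      return -1;

   return fread(coeffs, sizeof(mRGBE), nCoeffs, wvtFile) == nCoeffs
          ? 0 : -1;
}
\end{lstlisting}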
Each stage and its constituents are detailed in section
\ref{sec:implementation}.
% ---------------------------------------------------------------------------
%\clearpage
\chapter{Proof of Concept}
\section{Rationale and Prototyping}
The design choices to implement the workflow in figure \ref{fig:overview}
were informed by the author's collective experience with wavelets.
This dates back to a first exposure during a guest lecture on Wavelet
Radiosity given by Prof. H.P. Seidel, hosted by the University of Bonn, in
the mid-90s \cite{stamminger-waveletRadiosity-1995}. Further exposure
followed as the 90s saw a flurry of activity in applying wavelets to
computer graphics, notably the development of the lifting scheme
\cite{sweldens-lifting-1996}, and the adaptation of wavelets to the
spherical topology \cite{schroeder-sphWavelets-1995}.
The latter publication inspired the author's first own application of
wavelets in developing a 4D BSDF compression scheme as part of an
``adjunct project'' (read: digression) to the development of the
original photon map extension at Fraunhofer ISE.
This unfinished work served as a proof of concept, but was shelved and
never formally published. However, it was retrospectively
presented at the \radiance{} workshop almost a decade later as part
of the short-lived%
\footnote{Extremely short-lived in fact, since this was the only
instalment!
}
lecture series, \lit{Secret Weapons of RADIANCE:
Stuff That Never Took Off} \cite{schregle-bsdfComp-2011}.
It was this initial, hands-on experience
with wavelets that made the author aware of their inherent power as
an analysis and compression tool.
Some two decades later, the author's involvement in the SNSF-funded
project \lit{Lightfields for glare assessment}
rekindled his interest in wavelets.
This was chiefly due to their use by his very talented
colleague, Stephen Wasilewski, to adaptively sample spatio-temporal
luminance distributions to represent a lightfield.
This sampling scheme was based on the
\cmd{pywavelets} Python module and led to the development of the
\lit{Raytraverse} tool \cite{wasilewski-raytraverse-2021}.
When a proposition was therefore made within the framework of the project
to extend the existing contribution photon map with precomputed
contributions, an application of wavelets was immediately apparent.
The availability of \cmd{pywavelets} as a convenient prototyping tool
quickly prompted the development of a concept and initial experimentation
to prove the feasibility of the proposition.
A detailed account of the initial concept that led to the development
of the precomputed contribution photon map can be found -- along with
choice expletives documenting frustrating dead ends for a dose of realism
-- in a separate proposal document
\cite{schregle-preCompContribPmapProposal-2021}.
It should be acknowledged that during the development of precomputed
contributions, a number of existing -- and some \emph{very} old -- bugs
were discovered in various places around the photon mapping code.
This work therefore not only extended the photon map's functionality,
but also made the code more robust in general.
% ---------------------------------------------------------------------------
\section{Initial Compression Tests with \cmd{pywavelets}}
\label{sec:waveletPoC}
An initial back-of-the-envelope calculation made it clear that the
storage required for precomputed contributions would be prohibitive
even for current office workstations and laptops; 10M precomputed
photons (a realistic number for a complex simulated environment) with
32-bit RGBE-encoded contributions in 2305~bins (corresponding to a
Reinhart~MF:4 mapping \cite{bourgeoisReinhart-2008}) would occupy
92.2~GB on disk.
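This figure follows directly from the record size:
\[
   10^7 \;\mbox{photons} \times 2305 \;\mbox{bins} \times
   4 \;\mbox{bytes} \approx 92.2\;\mbox{GB}.
\]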
Note this calculation omits the photons themselves, which would
occupy a comparatively insignificant 240~MB.
This simple example makes it clear that an effective compression scheme
is needed.
A quick statistical analysis of contributions from test scenes
immediately ruled out a na\"ive run-length encoding of
contiguous empty (zero) contributions. Instead, a wavelet compression
scheme was favoured based on prior experience.
To assess viable candidate compression techniques prior to hard-core
C coding, preliminary tests were conducted with a simple Python prototype
(\cmd{wavelet\_test.py}) using Filip Wasilewski's excellent
\cmd{pywavelets} (\cmd{pywt}) package \cite{Lee2019}.
The \rcontrib{} tool assumes a linear ordering of bins from
what is essentially a 3D domain (hemisphere of incident directions)
which, due to dimensional redundancy, can be reduced to a 2D
domain ($\theta, \phi$), and in turn serialised to a 1D domain,
as is the case with the popular Reinhart mapping.
This is inherently a compromise, since the original topology,
and therefore any multidimensional correlation in the contributions,
is then lost.
In an initial proof-of-concept, max-normalised scalar contributions from
a uniform sky in a bilaterally lit test scene with skylight were
binned using a Reinhart MF:4 mapping (2305 bins) and passed to a 1D
wavelet transform. The resulting wavelet coefficients were then
thresholded against a user-specified value to achieve what amounts to
a lossy compression, effectively setting the thresholded coefficients
to zero. Finally, the thresholded coefficients were reconstructed by
applying the inverse wavelet transform, and their deviations assessed
relative to the original.
The initial results revealed that the type of wavelet used (and
\cmd{pywt} offers many!) has a less significant impact on the
reconstruction error than the linear ordering of the input contributions.
In their original order, localised peaks in the linearised
contributions were significantly attenuated with high compression
(typically 90--98\% in these tests).
Ordering the contributions by their value to obtain a monotonically
increasing signal significantly reduced these artefacts, as it effectively
imposes correlation on the input. Similar tests were also conducted
with solar sources, which introduced higher dynamic range, more
pronounced peaks, and major artefacts without sorting (see figure
\ref{fig:solarcont-mf4}).
\begin{figure}[htb]
\centering
\includegraphics[width=0.45\linewidth]{%
solarcont-mf4-db4-90%
}
\includegraphics[width=0.45\linewidth]{%
solarcont-mf4-sort-db4-98-detail%
}\\
\parbox{0.9\linewidth}{%
\caption{%
\label{fig:solarcont-mf4}
Daubechies wavelet transform of Reinhart MF:4 mapped solar
contributions (2305 bins) after thresholding 90\% of coefficients.
In their original linear ordering (left), the binned contributions
exhibit isolated discontinuities (peaks) which are poorly
decorrelated by the
wavelet transform, resulting in obvious compression artefacts.
Ordering the contributions by increasing magnitude (right)
imposes correlation and significantly reduces these artefacts.
This reordering
however complicates the reconstruction of the contributions,
and doesn't decorrelate the contributions in their original
hemispherical domain.
}
}
\end{figure}
Other 1D bin orderings using 2D space-filling curves such as Morton codes
or Hilbert indices yielded similarly disappointing results with peaky
contributions.
In addition, sorting the contributions complicates their reconstruction,
as this requires storing the original bin order.
Furthermore, this still doesn't decorrelate the contributions
in their original 2D hemispherical domain, which results in suboptimal
compression. A 1D wavelet compression was therefore abandoned in
favour of a more elaborate 2D mapping of the hemisphere, and a wavelet
transform in this domain.
Furthermore, it was desirable not to limit the number of bins to a
power of two. This requires extending the contributions at the boundaries,
as described in section \ref{sec:boundaryExt}. Boundary extensions in
the context of wavelet transforms are poorly documented in the literature;
consequently, a detailed analysis of the \cmd{pywt} C source code was
conducted in order to obtain an insight into how the different boundary
extension modes offered by the module are implemented, notably in a
multi-dimensional context. This revealed the necessity of introducing
additional \textit{padding coefficients} at the boundaries, consequently
increasing the number of coefficients beyond the number of bins.
Because it is so comprehensive and highly
optimised, the \cmd{pywt} source code became the primary basis for the
wavelets module that was developed for \radiance{}, which is described in
section \ref{sec:wavelet3.c}.
% ---------------------------------------------------------------------------
\section{Initial Wavelet Coefficient Encoding Tests}
\label{sec:mrgbePoC}
An initial analysis of the dynamic range of normalised, scalar wavelet
coefficients generated by the test script using \cmd{pywt} indicated
that the majority of these spanned ca. 10 orders of magnitude (see
figure~\ref{fig:coeffRange}).
\begin{figure}[htb]
\centering
\includegraphics[width=0.9\linewidth]{skycont-mf4-sort-db4-coeffs}
\parbox{0.9\linewidth}{%
\caption{%
\label{fig:coeffRange}
Logarithmic plot of absolute wavelet coefficient magnitudes after
normalisation. The coefficients (magenta crosses) are clustered
in bands of increasing frequency from left to right, with the
number of coefficients doubling in consecutive bands due to the
wavelet transform's multiresolution analysis. In this example,
the dynamic range of the coefficients is limited to ca. 10 orders
of magnitude. It is also evident that the coefficients at the
bottom between $10^{-19}$ and $10^{-20}$
are negligible and can be omitted. It follows therefore that
it suffices to encode coefficients up to the green line at ca.
$4.6\times10^{-10}$; this corresponds to an effective range of
$[2^{-31}, 1]$, which can be encoded using a 5-bit binary mantissa.
In practice, thresholding the coefficients will further reduce this
dynamic range.
}
}
\end{figure}
By clamping the maximum absolute
coefficient magnitude to 1, it is possible to map the coefficients as
3-tuples to a normalised range and encode these as integer RGB
mantissae%
\footnote{Yeah, that \emph{is} the plural.}
with a common exponent, thus accounting for 3 colour channels, as is done
with \radiance's 32-bit RGBE format \cite{ward-RGBE-1994}.
Furthermore, because the coefficients are thresholded, their minimum
absolute magnitude is likewise bounded. This insight, coupled with the fact
that the coefficients in the range $10^{-19}$ to $10^{-20}$ are clearly
insignificant, indicated the viability of an encoding with limited
precision. This encoding, which would require pre-normalising the
coefficients and storing corresponding independent normalisation factors
per colour
channel, would also need to accommodate a coefficient index to indicate
which coefficients remain after thresholding. This payload data would
be incorporated into the modified RGBE encoding (or mRGBE, as we like
to call it%
\footnote{
Actually, it can stand for \{micro, modified, minimalist\} RGBE;
take your pick.
It just has to sound cool nowadays; style over substance, you know.
iRGBE would've been trendy too, but probably infringes on some
trademark from Cupertino.
}) as a dedicated field of fixed width, which of course limits the
number of bits allocated to the RGB mantissae and the exponent, thus
inherently reducing
the precision compared to \radiance's native RGBE, which allocates
8 bits to each field.
\begin{figure}[!htb]
\centering
\includegraphics[width=0.8\linewidth]{%
skycont-mf4-sort-db4-bincoeffs%
}\\
\includegraphics[width=0.8\linewidth]{%
skycont-mf4-sort-db4-bincoeffs-detail%
}\\
\parbox{0.9\linewidth}{%
\caption{%
\label{fig:mrgbeTest}
Logarithmic plot of wavelet coefficient range in figure
\ref{fig:coeffRange} (top, detail inset at bottom) with
encoded/decoded mRGBE coefficients superimposed as blue circles.
Despite the limited precision
(4 RGB mantissa bits + 1 sign, 5 exponent bits), the mRGBE
encoding correlates well with the original floating point values
within the constrained dynamic range $[2^{-31}, 1]$ covered by
the exponent.
}
}
\end{figure}
A prototype mRGBE encoding/decoding script (\cmd{bincoeff\_test.py},
importing routines from \cmd{wave\-let\_test.py}) was therefore developed in
Python to check the accuracy (read: loss of precision) of the proposed
encoding given the previous practical sample data. Figure
\ref{fig:mrgbeTest} shows initial results of an mRGBE encoding using
5 bits per RGB mantissa (including 1 bit each for the sign) and the
exponent (implicitly negative), leaving 12 bits for the payload data,
which was not used in this test. Despite the limited precision, the
encoded and decoded coefficients exhibit remarkable coherence with the
original floating point values. However, it should be noted that the
RGB colour channels are identical in this test; although highly saturated
luminance is unlikely in realistic scenarios, higher deviations were
expected when the R, G, and B channels differ. This was further
investigated and quantified in the unit test of the C module
developed on the basis of the Python prototype, which is detailed in
section \ref{sec:mrgbe.c}.
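To make the proposed encoding concrete, the following sketch packs a
coefficient 3-tuple and its payload into 32 bits using the above bit
allocation. The field layout and rounding are assumptions for
illustration and differ from the actual \cmd{mrgbe.c} module described
in section \ref{sec:mrgbe.c}:
\begin{lstlisting}
#include <math.h>
#include <stdint.h>

/* Hypothetical mRGBE layout: 12-bit payload (coefficient index),
   3 x 5-bit sign/magnitude RGB mantissae, 5-bit common exponent
   (implicitly negative, covering the range [2^-31, 1]). */
typedef uint32_t mRGBE;

mRGBE mrgbeEncode (const float rgb [3], unsigned payload)
{
   const float absMax = fmaxf(fmaxf(fabsf(rgb [0]), fabsf(rgb [1])),
                              fabsf(rgb [2]));
   mRGBE enc = payload & 0xfff;
   int e, i;

   if (absMax <= 0)
      return enc;                /* all-zero coefficient */

   frexpf(absMax, &e);           /* absMax = m * 2^e, m in [0.5, 1) */
   if (e > 0)   e = 0;           /* clamp exponent to [-31, 0] */
   if (e < -31) e = -31;

   for (i = 0; i < 3; i++) {
      /* 4-bit magnitude in [0, 15] plus sign bit */
      unsigned mag = ldexpf(fabsf(rgb [i]), -e) * 16 + 0.5f;
      if (mag > 15)
         mag = 15;
      enc |= (mag | ((rgb [i] < 0) << 4)) << (12 + 5 * i);
   }

   return enc | (mRGBE)(-e) << 27;
}

void mrgbeDecode (mRGBE enc, float rgb [3], unsigned *payload)
{
   const int e = -(int)(enc >> 27);
   int i;

   *payload = enc & 0xfff;

   for (i = 0; i < 3; i++) {
      const unsigned m = (enc >> (12 + 5 * i)) & 0x1f;
      rgb [i] = ldexpf((m & 0xf) / 16.0f, e);
      if (m & 0x10)
         rgb [i] = -rgb [i];
   }
}
\end{lstlisting}
Sign/magnitude mantissae ensure an all-zero bit pattern decodes to
exactly zero, while the common exponent follows the maximum channel,
as with RGBE.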
% ---------------------------------------------------------------------------
\section{Design Goals}
The conclusions drawn from the initial tests helped set the design goals
for the precomputed contribution photon map. These included:
\begin{itemize}
\item A fixed binning of photon incident directions to a square matrix
using the area-preserving Shirley-Chiu disk-to-square mapping
which enables decorrelating the contributions in 2D in
a subsequent wavelet transform.
\item Compression in 2D via a computationally lightweight 4-tap
(having a support of 4 samples) wavelet transform, the Daubechies
DB2 wavelet being a popular choice.
\item An arbitrary number of bins (not just powers of 2). This requires
boundary extension of the contributions during the wavelet
transform.
\item Reuse or adaptation of existing photon map code where
possible; this included modifying the existing out-of-core photon
cache, the original contribution photon distribution routines,
the precomputation routines for global photons,
photon lookup routines, out-of-core caching routines, and
routines interfacing to \rcontrib.
We doan' need to reinvent da wheel.
\item Compact encoding (mRGBE) of 3-tuple wavelet coefficients and
their indices in a 32-bit envelope in a separate wavelet
coefficient file, with at least 5 bits per RGB mantissa.
\item Thresholding of wavelet coefficients by dropping a fixed number;
this results in a fixed size of compressed coefficients per
photon, which facilitates directly accessing contribution records
in the wavelet coefficient file.
\item Binning settings in the precomputed photon map override those
in \rcontrib{} when the photon map is loaded; these are passed
via an option file generated by \mkpmap.
\item Out-of-core storage of photons and their wavelet coefficients,
with on-demand paging in \rcontrib,
as already done by the standard out-of-core photon map.
Consequently, the in-core kd-tree data structure is no longer
supported with contribution photons.
\item Caching of reconstructed contributions after mRGBE decoding and
inverse wavelet transform in \rcontrib.
\end{itemize}
% ---------------------------------------------------------------------------
%\clearpage
\chapter{Implementation}
\label{sec:implementation}
The following sections present implementation details of each
component of the precomputed contribution photon map workflow
in reference to figure \ref{fig:overview}.
Each section refers to the associated source code modules
and relevant functions. The reader is referred to
the software architecture graphs in section \ref{sec:swarch} in the
appendix
for an overview of the modules, how they are embedded within the
\rcontrib{} framework, and how they interact.
The exposition also frequently refers to data types defined in the source
code, which are summarised in listings \ref{lst:codeDefs1},
\ref{lst:codeDefs2}, and \ref{lst:codeDefs3} for the interested reader.
% ---------------------------------------------------------------------------
\section{Contribution Photon Generation \& Precomputation with
\mkpmap}
Precomputed contribution photon maps are generated with the \mkpmap{}
tool. This entails distributing (and binning) the photons,
precomputing their binned contributions by density estimation,
compressing the resulting contributions via wavelet transform (handling
potential boundary artefacts), and encoding the wavelet coefficients.
\subsection{Contribution Photon Distribution}
Contribution photons have their own photon distribution routine,
\cmd{pmcontrib2:distribPhotonContrib()}.
It differs from the standard photon distribution routine
\cmd{pmap:distribPhotons()} in that each source contributes
(approximately) the same number of photons
(i.e. $nemit \approx nphotons \: /\, nsrc$, where $nsrc$ is the number of
contributing sources).
This measure is intended to
balance the density of photons contributed by each source whose
contributions are sought (as specified with the \opt{-m} option to
\mkpmap).
As in the standard photon
distribution routine, the number of photons to emit is adjusted
to obtain the desired target photon count
by emitting a fraction of the photons in a prepass, and then
extrapolating this
in proportion to the stored photons vs. the total target count.
Since the number of photons emitted from each source no longer correlates
with its emitted flux, as is the case with \cmd{distribPhotons()},
the resulting flux per photon must be adjusted individually for
each source to compensate for bias, at the expense of increased variance.
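Schematically, the per-source prepass extrapolation amounts to the
following (hypothetical names; the actual routine additionally handles
multiple distribution passes and sources that store no photons at all):
\begin{lstlisting}
/* Per-source photon distribution with prepass extrapolation;
   a simplified sketch with hypothetical names. */
const long targetCount = nPhotons / nSrc;   /* per-source target */
long nEmitted = targetCount / 10;           /* e.g. 10% prepass */
long nStored = emitAndStore(src, nEmitted); /* hypothetical helper */

if (nStored > 0) {
   /* Extrapolate emissions from the observed stored/emitted ratio */
   const double ratio = (double)nStored / nEmitted;

   emitAndStore(src, (long)(targetCount / ratio) - nEmitted);
}
\end{lstlisting}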
% ---------------------------------------------------------------------------
\subsection{Binning of Contributions}
\label{sec:binning}
When a contribution photon is emitted, its \textit{contribution source}
is temporarily set in the photon map's \cmd{PhotonMap.lastContribSrc}
field, consisting of a 16-bit integer
source index and bin. The latter is obtained by calling
\cmd{pmapcontrib:ray2bin()} via \cmd{pmapcontrib:contribSourceBin()},
which performs a
Shirley-Chiu disk-to-square mapping (see figure \ref{fig:shirleyChiu})
of the photon's emitted direction
vector.
\begin{figure}[htb]
\centering
\includegraphics[width=0.8\linewidth]{shirley-chiu2-crop}\\
\parbox{0.9\linewidth}{%
\caption{%
\label{fig:shirleyChiu}
Shirley-Chiu disk-to-square mapping used to bin
contribution photon directions. This mapping has the
desirable property of preserving adjacency and fractional area.
The disk is obtained by projecting the
hemisphere of photon directions onto the plane defined by the
surface normal $[rNx,\: rNy,\: rNz]$. The polar angle origin
$\phi = 0$ in the disk is defined by the up vector
$[Ux,\: Uy,\: Uz]$.
}
}
\end{figure}
The disk coordinates are obtained by projecting the photon's direction
onto the disk plane as defined by its normal, $[rN_x,\: rN_y,\: rN_z]$
(see below). The mapped square coordinates $[sc_x,\: sc_y]$
are, in turn, mapped to a linear bin
\begin{equation}
b_p = sc_x \cdot scdim + sc_y,
\label{eq:linearIdx}
\end{equation}
where $scdim = \left\lfloor\sqrt{nbins}\right\rfloor$ is the dimension
of the Shirley-Chiu square,
and is derived from the number of bins specified to \mkpmap{}
with the \opt{-bn} option. If the latter is not an integer square,
it will be rounded to the nearest such number by \mkpmap.
If the Shirley-Chiu mapping fails (e.g.\ if the photon's direction
lies in the plane or is incident from the back), the bin is set to
$-1$ as invalid. This mapping is fixed for contribution photons and
cannot be modified by the user, as the resulting square topology
is immediately applicable to a 2D wavelet transform in matrix form.
\cmd{ray2bin()} evaluates the following variables to reorient the
disk-to-square mapping:
\begin{itemize}
\item{$RHS$}: 1 for right-handed coordinate system
(default), -1 for left-handed.
\item{$rNx,\: rNy,\: rNz$}: disk plane surface normal
(defines $\theta = 0$, default [0, 0, 1])
\item{$Ux,\: Uy,\: Uz$}: up vector (defines $\phi = 0$,
default [0, 1, 0])
\end{itemize}
The contribution source is set speculatively for each emitted photon,
in the anticipation that its path will lead to stored photons. If the
path spawns no photons (e.g. because the photon was backscattered
from a port and left the scene, or because the materials have zero
reflectance), the contribution source is simply
discarded and overwritten by the next emitted photon.
On the other hand, when a photon is created, the photon's contribution
source field \cmd{Pho\-ton.\-aux.\-con\-trib\-Src} is set from
\cmd{PhotonMap.lastContribSrc}, which remains constant for the photon
path until the next photon is emitted. Note that \cmd{Photon.aux} is
designated an auxiliary data field which is specific to the photon type.
It is defined as a union structure and can also store a unique photon
path ID for regular photons, and a photon's time of flight (expressed
as distance travelled) for transient photons.
The contribution sources stored with each photon are subsequently
accumulated in bins when the contributions are precomputed, as
detailed in the next section.
% ---------------------------------------------------------------------------
\subsection{Precomputation of Contribution Photons}
\label{sec:precomp}
Once the approximate target number of photons \var{nphotons} has been
reached, a fraction \var{precomp} of these (specified with the \opt{-apP}
option to \mkpmap) is drawn uniformly at random as candidates for
precomputation, thus preserving the relative distribution of photons in
the scene. The entire precomputation is wrapped by the routine
\cmd{pmapcontrib:preComputeContrib()}.
The contributions for a candidate precomputed photon
are collected by function
\cmd{pmap\-con\-trib:\-get\-Pho\-ton\-Con\-trib()}. This
routine extracts the candidate photon's emitting light source from its
contribution source field \cmd{Photon.aux.contribSrc.sourceIdx},
from which, in turn, the source modifier is obtained. If the modifier is
among those whose contributions are sought, it is passed to the
standard photon lookup routine \cmd{pmapdata:findPhotons()}, which in
turn calls the lookup routine specific to the out-of-core data structure
(sorry, no OOP-style overloading here), \cmd{pmapooc:OOC\_FindPhotons()}.
This routine performs a nearest neighbour search for the \var{bwidth}
closest photons. It also sets up a filter instance of type
\cmd{pmapfilt:OOC\_SearchFilterState}, which is passed to a callback
function \cmd{pmapfilt:filterPhoton()} called by the lookup routine
to accept/reject photons on the fly based on the filter's criteria.
This function ordinarily tests only photon normals to reject photons
incident from backfaces. However, with contribution photons, the
filtering callback further adds the constraint that the photons are
emitted from a given light source modifier. This ensures the
lookup collects only contributions for the same modifier as the candidate
precomputed photon.
If the photon search is successful, \cmd{getPhotonContrib()} performs
a modified density estimate of the photons, accumulating their
contributions (obtained from their \cmd{Photon.flux} field) in
the preassigned bins indicated by each photon's contribution source
field \cmd{Photon.aux.contribSrc.sourceBin} (this incidentally also
includes the candidate precomputed photon's own binned contribution).
The binned contributions are then divided by the area of the disk
intercepted by the search volume, with radius corresponding to the
maximum found photon distance.
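Schematically, using \radiance's \cmd{scalecolor} macro and
hypothetical variable names, this normalisation amounts to:
\begin{lstlisting}
#include <math.h>

/* Hypothetical normalisation of nBins binned RGB contributions by
   the area of the disk intercepted by the search volume, given the
   squared radius maxDist2 of the nearest-neighbour search;
   scalecolor() is RADIANCE's standard colour scaling macro. */
static void normaliseContribs (COLOR *binnedContrib, unsigned nBins,
                               double maxDist2)
{
   const double area = M_PI * maxDist2;
   unsigned b;

   for (b = 0; b < nBins; b++)
      scalecolor(binnedContrib [b], 1 / area);
}
\end{lstlisting}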
The precomputation routine allocates a lookup table (\cmd{struct LUTAB},
defined in the stock \radiance{} module \cmd{lookup.c}) of per-modifier
\emph{child} photon maps, and assigns it to the \emph{parent} photon map's
\cmd{PhotonMap.preCompContribTab} field. This enables the parent photon
map to act as a container for the per-modifier child photon maps, with
the actual precomputed photons stored in the latter. The modifier name
then acts as the LUT key, as is also the case with \rcontrib's
contribution table \cmd{rcontrib:modconttab}.
The precomputation routine also allocates a container
\cmd{PhotonMap.preCompContrib} of type \cmd{PreComputedContrib} in
each child photon map (see listing \ref{lst:codeDefs2}), which serves
as scratch space for the subsequent wavelet transform and compression.
This includes an instance of \cmd{wavelet2.h:WaveletMatrix2}, which
is a 2D array of floating point 3-tuples, corresponding to the
RGB wavelet coefficients in each colour channel.
In addition, the wavelet transform requires a second, transposed
matrix for intermediate results, which is also preallocated
and initialised to 0 prior to every transform.
Note that these data structures are allocated and dimensioned separately
per modifier (which is unique to each child photon map), as the binning
parameters may differ for each.
The preallocated data structure container of type \cmd{PreComputedContrib}
is then passed to the main compression/encoding routine
\cmd{pmapcontrib:encodeContribs()} along with the user-specified
compression ratio \var{comp}.
This routine performs the actual wavelet transform and
subsequent coefficient thresholding.
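The thresholding step is conceptually simple; the following
self-contained sketch (scalar coefficients for brevity, hypothetical
names) zeroes all but the \var{nKeep} most significant coefficients:
\begin{lstlisting}
#include <math.h>
#include <stdlib.h>

/* Descending comparison by absolute magnitude for qsort() */
static int cmpMag (const void *c1, const void *c2)
{
   const double m1 = fabs(*(const double *)c1),
                m2 = fabs(*(const double *)c2);

   return m1 < m2 ? 1 : m1 > m2 ? -1 : 0;
}

/* Lossy compression: zero all coefficients whose magnitude lies
   below that of the nKeep-th most significant one. (The actual
   encodeContribs() operates on RGB 3-tuples and additionally records
   the indices of the surviving coefficients for the mRGBE encoding.) */
void thresholdCoeffs (double *coeffs, unsigned nCoeffs, unsigned nKeep)
{
   double *sorted = malloc(nCoeffs * sizeof(double));
   unsigned i;

   if (!sorted || !nKeep || nKeep >= nCoeffs) {
      free(sorted);
      return;
   }

   for (i = 0; i < nCoeffs; i++)
      sorted [i] = coeffs [i];
   qsort(sorted, nCoeffs, sizeof(double), cmpMag);

   for (i = 0; i < nCoeffs; i++)
      if (fabs(coeffs [i]) < fabs(sorted [nKeep - 1]))
         coeffs [i] = 0;

   free(sorted);
}
\end{lstlisting}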
% ---------------------------------------------------------------------------
\subsection{2D Wavelet Transform}
\label{sec:wavelet3.c}
Wavelets are a science unto themselves and a very broad topic that is
mostly opaque for non-ma\-the\-ma\-ti\-ci\-ans and even computer
scientists.
Once one has grasped the fundamental concept of wavelets, however, the
theory is relatively intuitive, if far from simple. The reader is
referred to \cite{Graps:1995} for an excellent introduction with
notable applications.
Much like the Fourier transform, the wavelet transform essentially
represents a given input signal as a series of sums of scaled
(in terms of amplitude) basis functions that have been dilated and
translated along the signal's propagation axis.
These functions are referred to as the actual wavelets, though in
practice they aren't necessarily always undulating (see that funny
squiggly thing in figure \ref{fig:waveletFunc} for an example).
Unlike Fourier
basis functions, wavelets have a finite support, i.e. a defined
non-zero region of influence. Based on this principle, it is
possible to decompose a given signal into a set of coefficients
(the scaling terms) at different dilations and translations.
The dilations and translations can be considered sliding windows
depicting the signal at varying zoom levels, or resolutions.
Consequently, wavelets are a form of \textit{multiresolution
analysis}.
Signals can be decomposed by a wavelet transform in arbitrary
dimensions, though applications in 1 and 2 dimensions are most common
(with image processing and compression being notable applications of
the latter).
The precomputed contribution photon map decomposes the pre-binned
contributions in their native 2D domain to optimally leverage
correlation along each axis of the Shirley-Chiu square, effectively
treating it as a square matrix.
Once the contributions for each precomputed photon have been accumulated,
they are transferred as input to the preallocated wavelet matrix. In the
general case, this requires a \cmd{memcpy()} of each matrix row from the
linear contribution array, effectively reverting the linearisation
that follows the Shirley-Chiu mapping.%
\footnote{In the special case where input and output sizes of the
transform are the same, i.e. when no padding occurs, the matrix
rows (which are \textit{Iliffe} vectors, or pointers to arrays) can
simply be set up to map directly into the corresponding segments in
the linear contribution array, obviating the need to push data about.
Obviously this is much more efficient. This can be done in
conjunction with the specialised 2D wavelet transform function
\cmd{wavelet2:waveletXform2()}, which only operates on input
sizes of powers of 2 and does not generate padding coefficients.
However, due to the major restriction this imposes on the input size
(and the number of bins), this function is not used by the precomputed
contribution photon map.
}
The routine \cmd{pmapcontrib:encodeContribs()} calls the full 2D
wavelet transform wrapper routine \cmd{wavelet3:padWaveletXform2()}.
This routine, in turn, calls \cmd{padD2Step2()} to perform one pass of the
transform along a fixed (horizontal) axis. The latter function accepts
an output matrix of the same size as the input, and transposes the
resulting coefficients from the iteration ``on the fly'' as they are
written to the output matrix.
With the input and output matrices now swapped, the next iteration of
this function will perform another horizontal pass over the transposed
matrix, which is tantamount to a vertical pass over the original matrix.
With the output matrix now transposed a second time, the original
orientation is obtained, and the matrices swapped a second time.
Consequently, each pair of invocations of a transform step constitutes a
complete horizontal/vertical transform pass at a given resolution (see
figure \ref{fig:waveletStep}).
\begin{figure}[htb]
\centering
\includegraphics[width=0.7\linewidth]{Daubechies4-functions}
\parbox{0.9\linewidth}{%
\caption{\label{fig:waveletFunc}
Daubechies DB2
``4-tap'' wavelet function. The scaling function
(blue) decorrelates the input signal as approximation
coefficients. The wavelet function (red) -- itself orthogonal
to the scaling function -- decorrelates the input as detail
coefficients. Note the asymmetry of this wavelet family. This
wavelet has a support of 4 adjacent samples; it trades poorer
decorrelation for computational efficiency and reduced boundary
effects, compared to wavelets with larger supports, which
decorrelate better at higher computational expense.
}
}
\end{figure}
\begin{figure}[p]
\centering
\includegraphics[width=0.7\linewidth]{%
wavelet3-test-16x16-steps-reaspect-crop%
}\\
\parbox{\linewidth}{%
\caption{\label{fig:waveletStep}
Sample output of wavelet unit test for 16$\times$16 bins,
showing the first two resolution levels of the 2D wavelet
transform,
each consisting of transform steps over the horizontal and
vertical axes. The output of each transform step
(right of arrows)
becomes the input for the next step (left of arrows).
Starting with the original input samples $y_{i,j}$, each step
generates a set of approximation coefficients $s_{i,j}$ (red
arrows) and detail coefficients $d_{i,j}$ (green arrows). To
alternate the transform axes, the implementation transposes the
output matrix on the fly (note the reversed output coefficient
indices), consequently the transform need only be performed
along one axis, which simplifies indexing; the original matrix
orientation is then restored every two steps. After each
iteration (horizontal/vertical transform pair), the
approximation coefficients $s(\ldots (s_{i,j})\ldots ) =
(s_{i,j})^k$ in the upper left of the output matrix become the
input for the next (halved) resolution level. The sizes of the
progressively smaller output submatrices (indicated by square
brackets) include padding coefficients. Consequently, the
output matrix leaves unoccupied cells (indicated by dots) to
accommodate these additional coefficients, which accumulate in
the upper left of the matrix in subsequent resolution levels.
}
}
\end{figure}
Each transform pass reduces the number of output
coefficients, thereby yielding a coarser representation of the
original input signal, corresponding to a lower frequency band. This
is termed
\textit{multiresolution analysis} and is a key characteristic of
the wavelet transform, namely that it decorrelates the input signal
at various resolutions, or frequency bands (in filtering parlance).
The alternating transposition of the matrix effects a decorrelation
along the horizontal and vertical axes.
\begin{figure}[htb]
\centering
\includegraphics[width=0.75\linewidth]{%
wavelet3-test-16x16-full-reaspect-crop%
}
\parbox{0.9\linewidth}{%
\caption{\label{fig:waveletCoeffsFull}
Sample output of wavelet unit test for 16$\times$16 bins,
showing the wavelet coefficient matrix after a full transform.
The coloured fields identify the coefficient type (
red = approximation $s$, green = detail $d$) and the successive
transform steps, from right to left. Each iteration generates
coefficients from those of prior iterations at higher resolutions:
approximations of prior details $sd$ (lower left),
details of prior approximations $ds$ (upper right),
details of prior details $dd$ (lower right),
approximations of prior approximations $ss$ (upper left).
Each subsequent iteration then recurses into the upper left
submatrix, using the approximations $ss$ as
increasingly coarse representations of the original contributions
as input. After the final iteration, the red submatrix in the
upper left corner contains the 3$\times$3 coarsest approximations.
}
}
\end{figure}
The wavelet transform projects the input signal $y_{i,j}$ onto a pair of
orthogonal basis functions specific to each wavelet family (see figure
\ref{fig:waveletFunc} for a simple example from the Daubechies family%
\footnote{The naming convention within each wavelet family is not firmly
established. The Daubechies wavelet in figure \ref{fig:waveletFunc}
is either denoted \textit{DB2}, referring to its 2 vanishing moments,
or \textit{D4}, referring to its support of 4 (having 4 ``taps'',
in filter parlance).
The \lit{pywavelets} module adheres to the former convention,
consequently so does the module \cmd{wavelet3.c}.
}),
termed the \textit{scaling function} $\Phi_k(j)$ and the \textit{wavelet
function} $\Psi_k(j)$, where $k$ is the resolution (or dilation) of the
function, and $j$ is its position along the input signal's transform axis.
Unlike Fourier basis functions, wavelets have finite support and are
localised in space, i.e. they are only non-zero for a finite subset of
sample positions $j$. Larger supports decorrelate over more samples and
therefore offer improved compression (smaller coefficients), but at
higher computational expense.
The scaling and wavelet functions $\Phi_k(j)$ resp. $\Psi_k(j)$
decompose the $k$-th input matrix along a fixed axis $i$ (matrix row or
column) into a pair of approximation coefficients $s^k_{i,j}$ and
detail coefficients $d^k_{i,j}$. The input matrix is, itself, the set of
approximation coefficients $s^{k-1}_{i,j}$ from the previous iteration.
The set of approximation coefficients $s^k$ always represents the
signal at half the resolution compared to those from the previous
iteration, $s^{k-1}$. In the initial iteration $k=0$, the approximations
are equal to the original input signal: $s^0_{i,j} = y_{i,j}$.
For a 4-tap function as shown in figure \ref{fig:waveletFunc}, the
$k$-th iteration of the wavelet transform is as follows:
\begin{eqnarray}
\label{eq:d2FwdXform-s}
s^k_{i,j} &=& h_0\ s^{k-1}_{i,2j} \ +\ h_1\ s^{k-1}_{i,2j+1}\ +\
h_2\ s^{k-1}_{i,2j+2}\ +\ h_3\ s^{k-1}_{i,2j+3}\\
\label{eq:d2FwdXform-d}
d^k_{i,j} &=& g_0\ s^{k-1}_{i,2j} \ +\ g_1\ s^{k-1}_{i,2j+1}\ +\
g_2\ s^{k-1}_{i,2j+2}\ +\ g_3\ s^{k-1}_{i,2j+3},
\end{eqnarray}
where $h_j$ and $g_j$ are constants obtained by evaluating
$\Phi_k(j)$ and $\Psi_k(j)$ at the corresponding positions of the input
signal samples that fall within the functions' support.
For the 4-tap Daubechies wavelet, these are:
\begin{equation}
\label{eq:d2-hCoeffs}
h_0 = \frac{1 + \sqrt{3}}{4\sqrt{2}} ;\quad
h_1 = \frac{3 + \sqrt{3}}{4\sqrt{2}} ;\quad
h_2 = \frac{3 - \sqrt{3}}{4\sqrt{2}} ;\quad
h_3 = \frac{1 - \sqrt{3}}{4\sqrt{2}}
\end{equation}
\begin{equation}
\label{eq:d2-gCoeffs}
g_0 = h_3 ;\quad
g_1 = -h_2 ;\quad
g_2 = h_1 ;\quad
g_3 = -h_0 ,
\end{equation}
where the alternating signs make the wavelet filter orthogonal to the
scaling filter.
Note that the transform axis index $j$ is doubled on the RHS of
equations \ref{eq:d2FwdXform-s} and \ref{eq:d2FwdXform-d};
this is because each iteration of the
transform reduces the number of approximation coefficients compared to the
input, yielding the coarser approximation.
Note also that, for the sake of clarity, equations \ref{eq:d2FwdXform-s}
and \ref{eq:d2FwdXform-d}
omit the on-the-fly transposition shown in figure \ref{fig:waveletStep};
the actual implementation swaps the indices $i, j$ on the LHS
during assignment.
In the case of the precomputed contribution photon map, the contributions
contain RGB radiometric data. Consequently, the above decomposition
is extended to the three colour channels. Thus the coefficients
$s^k_{i,j}$ and $d^k_{i,j}$ are in fact 3-tuples.
The colour channels are treated completely independently, and no
decorrelation occurs between them.
Each subsequent iteration of the wavelet transform recurses on the
approximation coefficients of the previous iteration; in a 2D matrix,
this corresponds to the upper left subquadrant of the wavelet
coefficient matrix (see figure \ref{fig:waveletCoeffsFull}). Detail
coefficients are accumulated in the remaining subquadrants, starting at
the corners. In the 2D context, this implies that detail coefficients
are also obtained from details and approximations of the previous
iteration, and likewise, approximations from previous approximations and
details, thus fully decorrelating the input signal in every dimension.
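As a concrete illustration of equations \ref{eq:d2FwdXform-s} and
\ref{eq:d2FwdXform-d}, the following simplified 1D sketch performs a
single D4 (DB2) transform step with constant boundary extension; it
omits the padding coefficients and the on-the-fly transposition of the
actual 2D implementation:
\begin{lstlisting}
/* One forward D4 (DB2) transform step over signal s of even length n,
   yielding n/2 approximation coefficients sNext and n/2 detail
   coefficients d; samples beyond the right boundary are extended as
   constant. Simplified sketch, not the actual padD2Step2() code. */
static void d4Step1D (const double *s, unsigned n,
                      double *sNext, double *d)
{
   /* Scaling filter h and wavelet filter g, g_k = (-1)^k h_(3-k) */
   static const double h [4] = {  0.4829629131445341,
                                  0.8365163037374690,
                                  0.2241438680420134,
                                 -0.1294095225512604 },
                       g [4] = { -0.1294095225512604,
                                 -0.2241438680420134,
                                  0.8365163037374690,
                                 -0.4829629131445341 };
   unsigned i, j, k;

   for (j = 0; j < n / 2; j++) {
      sNext [j] = d [j] = 0;

      for (k = 0; k < 4; k++) {
         if ((i = 2*j + k) >= n)
            i = n - 1;              /* constant boundary extension */
         sNext [j] += h [k] * s [i];
         d [j]     += g [k] * s [i];
      }
   }
}
\end{lstlisting}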
% ---------------------------------------------------------------------------
\subsection{Wavelet Transform Boundary Extension}
\label{sec:boundaryExt}
In an ideal world (a.k.a. theory), input signals are assumed to be
infinite and life is simple. In \realLife, everything has a beginning
and an end, including life itself, and incidentally, the input to a
wavelet transform.%
\footnote{... which inevitably raises that perpetual philosophical
question, ``why bother?''
}
Because wavelets have finite support, they will partly extend beyond the
given input signal. There are many sensible (and equally less so) ways to
extend a given input, but this really depends on assumptions about
the nature of the underlying signal, i.e. whether it is periodic,
monotonic, or singular (peaky).
If the input size is a power of two, the signal can simply be wrapped
around, and the number of approximation coefficients is halved in each
iteration.%
\footnote{The specialised 2D wavelet transform function
\cmd{wavelet2:d4Step2()} only handles input sizes of powers of 2,
and is hardwired to perform a cyclic (wraparound) extension of the
input signal, without the need for padding coefficients. This is
obviously
far more efficient and compact, since the number of resulting
coefficients is the same size as the input. While not used by the
precomputed contribution photon map, this function is available for
use by other \radiance{} applications where this restriction on input
sizes is acceptable.
}
However, such a restriction on the input size may be deemed unacceptable
for many applications, including precomputed contributions, which then
warrants a boundary treatment of the input signal.
\begin{figure}[htb]
\centering
\includegraphics[width=0.45\linewidth]{boundaryext-crop}
\parbox{0.9\linewidth}{%
\caption{\label{fig:boundaryExt}
Example of boundary extension at the right edge of a signal
of length $l$, consisting of samples $y_{i,0} \ldots y_{i,l-1}$.
Common extension modes include zero (gray), symmetric/reflection
(green), constant (red), and 1st order gradient (blue).
}
}
\end{figure}
Boundary issues are handled by extrapolating
the input signal beyond the left and right boundaries. The wavelet
module \cmd{wavelet3.c} offers the following boundary extension modes
(see also the definitions in \cmd{wavelet2.h}, and figure
\ref{fig:boundaryExt} for examples):
\begin{description}
\item[WAVELET\_EXTEND\_CONST:] the input signal is assumed to be
constant beyond
the boudary, i.e. the first resp. last value is repeated as
required. This is probably the safest choice in most cases, unless
there is a significant gradient at the boundaries.
\item[WAVELET\_EXTEND\_GRAD1:] the input signal is linearly extrapolated
according to its 1st order gradient (slope).
\item[WAVELET\_EXTEND\_GRAD2:] the input signal is linearly extrapolated
according to its 2nd order gradient.
\item[WAVELET\_EXTEND\_CYCL:] the input signal is assumed to be cyclic
(i.e. periodic) and wrapped around at either boundary. This may
not always be the optimal choice.
\item[WAVELET\_EXTEND\_SYMM:] the input signal is assumed to be
symmetric,
and reflected at either boundary. This may not always be the optimal
choice.
\item[WAVELET\_EXTEND\_ZERO:] the input signal is simply set to zero
beyond
beyond
the boundary. This can lead to large coefficients if the absolute
input values are large near the boundary, which, in turn, can lead to
poor compression. This is optimal only for input signals consisting of
a singularity (i.e. an isolated peak tapering towards the boundaries).
\item[WAVELET\_EXTEND\_INV:] the input signal is reflected at the
boundary,
similarly to \cmd{WAVE\-LET\_\-EXT\-END\_\-SYMM}, but negated,
resulting in anti-symmetry.
This extension artificially boosts boundary detail coefficients
to reduce the likelihood that they will be thresholded compared to
the interior coefficients, so that they are preserved. In practice,
though, it turns out to be ineffective.
\end{description}
The boundary extension mode can be defined by setting
\cmd{WAVELET\_EXTEND\_MODE} to one of the above. If undefined, the
extension mode defaults to \cmd{WAVELET\_EXTEND\_CONST}. Several of these
modes can also be found (possibly under different names) in the
\lit{pywavelets} package, which partly inspired them.
Because this is a compile-time option, it is not user-selectable. While
this lacks flexibility, in practice most users would be overwhelmed if
they had to choose a suitable boundary extension mode, since probably
neither they (nor the author) can anticipate the nature of the resulting
input to the wavelet transform.
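To make this concrete, the following is a minimal sketch of how a few
of these modes might extrapolate a 1D signal; the enum and function
are illustrative stand-ins, \emph{not} the actual definitions in
\cmd{wavelet2.h} and \cmd{wavelet3.c}:
\begin{lstlisting}
/* Illustrative sketch of boundary extension: return the (possibly
   extrapolated) sample y[i] of a signal of length len for indices
   beyond either boundary, for a subset of the modes above; assumes
   the extension reaches at most len samples past an edge. */
typedef enum { EXT_CONST, EXT_GRAD1, EXT_CYCL, EXT_ZERO } ExtMode;

double extendSample (const double *y, int len, int i, ExtMode mode)
{
   if (i >= 0 && i < len)
      return y [i];                    /* Interior sample */

   switch (mode) {
      case EXT_CONST:                  /* Repeat edge value */
         return i < 0 ? y [0] : y [len - 1];
      case EXT_GRAD1:                  /* Linear extrapolation */
         return i < 0
            ? y [0] + i * (y [1] - y [0])
            : y [len - 1] + (i - len + 1) * (y [len - 1] - y [len - 2]);
      case EXT_CYCL:                   /* Wraparound */
         return y [(i + len) % len];
      default:                         /* EXT_ZERO */
         return 0;
   }
}
\end{lstlisting}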
% ---------------------------------------------------------------------------
\subsection{Padding Coefficients}
Boundary extension implies the generation of additional approximation and
detail coefficient pairs beyond the halved length of the input. These
\textit{padding coefficients} are generated at the boundaries,
and are essential to reduce artefacts at the edges of the reconstructed
input signal during the inverse wavelet transform.
The precomputed contribution module uses the more general 2D padded wavelet
transform module \cmd{wavelet3.c} to compress contributions, which lifts
the restriction on the input size imposed by the 2D non-padded transform
in \cmd{wavelet2.c}.
The number of padding coefficients depends on the support of the wavelet;
the larger the support, the larger the number of padding coefficients.
Furthermore, these are accumulated at every iteration of the transform,
i.e. over multiple resolutions.
This means the total number of wavelet coefficients (approximation and
details) can be significantly \emph{larger} than the input,
which makes an effective compression all the more necessary in order
for the wavelet encoding to be viable.
With padding, the number of coefficients for an input of length $l^k$
samples at iteration $k$ is:
\begin{equation}
l^{k+1} = \left\lfloor\frac{l^k + \lambda - 1}{2}\right\rfloor,
\label{eq:padLen}
\end{equation}
where $\lambda$ is the size of the wavelet's support.
Based on this, the additional space required in the wavelet coefficient
matrix for the padding coefficients can be predicted, and the matrix
dimensioned accordingly to accommodate them.
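As a concrete example, the per-axis coefficient count implied by
equation \ref{eq:padLen} can be accumulated over all iterations as in
the sketch below; the function name is illustrative, and the
termination condition assumes the transform stops once 3 approximation
coefficients remain per axis, matching the $3\times 3$ approximation
submatrix:
\begin{lstlisting}
/* Sketch: accumulate the per-axis number of wavelet coefficients
   over all iterations; lambda is the wavelet support (4 for the
   4-tap Daubechies DB2 wavelet).  Assumes iteration stops once 3
   approximation coefficients remain per axis. */
unsigned long numCoeffs (unsigned long len, unsigned lambda)
{
   unsigned long total = 0;

   while (len > 3) {
      len = (len + lambda - 1) >> 1;   /* Approx/detail pairs */
      total += len;                    /* Details kept at this level */
   }
   return total + len;                 /* Plus final approximations */
}
\end{lstlisting}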
This is why the coefficient matrix in figure
\ref{fig:waveletCoeffsFull} exhibits unoccupied (read: wasted) space.
But since nobody at Micro\$oft gives a stuff about efficiency and quality
software, why fret over a few wasted kilobytes?
\footnote{... unless you
plan on porting this to your ZX81 or KIM-1, in which case I wish you
luck!
}
To support the allocation of an array of suitable dimensions,
the function \cmd{wavelet2:padD2Step()} returns the output length
for a given input, including padding, if either the input or
output array is NULL. Similarly, the full transform function
\cmd{wavelet2:padWaveletXform2()}, which calls the former, can sum this
output length over all iterations if either its input or output array is
NULL. The function \cmd{pmapcontrib:preComputeContrib()} relies on
this functionality when allocating the wavelet coefficient matrices for
each contribution modifier.
% ---------------------------------------------------------------------------
\subsection{Wavelet Coefficient Thresholding}
Contributions are compressed by thresholding their corresponding
detail coefficients after the wavelet transform is complete. This entails
removing the (\var{comp})\% least significant such coefficients (i.e.
smallest in terms of absolute value, since coefficients can
be negative). This is demonstrated in figure \ref{fig:waveletCoeffsThresh}
for a compression of 75\%. In general, the details of details in the
lower right subquadrants will be the smallest coefficients, and therefore
most likely to be thresholded.
\begin{figure}[htb]
\centering
\includegraphics[width=0.75\linewidth]{%
wavelet3-test-16x16-thresh75-reaspect-crop%
}
\parbox{0.9\linewidth}{%
\caption{\label{fig:waveletCoeffsThresh}
The coefficient matrix from figure \ref{fig:waveletCoeffsFull}
after thresholding the 75\% of coefficients with the lowest absolute
value, implicitly setting these to zero (indicated by bracketed
dots). For contributions with a low gradient, as is typically
the case with sky luminance distributions, the lowest
coefficients will mostly be the details in the submatrices
highlighted in green. The coarsest approximation coefficients
in the upper left submatrix (highlighted in red) are essential
for reconstructing the original matrix, and are therefore never
thresholded.
}
}
\end{figure}
Once coefficients have been selected for thresholding, they can be omitted
when storing or transmitting the wavelet encoded data, thereby saving
mass storage or bandwidth.
The rationale behind thresholding is that the omission of insignificant
coefficients will not appreciably alter the reconstructed signal.
In the context of the reconstruction, the omitted coefficients are
implicitly treated as zero.
\begin{samepage}
Thresholding lies at the heart of wavelet compression, which is why a
good thresholding strategy is important.
There are many ways to threshold coefficients, such as:
\begin{enumerate}
\item using a fixed threshold (hard thresholding, possibly adapted
to the resolution),
\item attenuating coefficients if they exceed the threshold (soft
thresholding), and
\item dropping a fixed fraction of the smallest coefficients.
\end{enumerate}
\end{samepage}
Options (1) and (2) generate a variable number
of thresholded coefficients, which better adapts to the frequency content
of the original signal. The primary disadvantage of this strategy is that
it complicates the paging of coefficients from disk, as it requires a
dedicated indexing structure.
Option (3) is the crudest, since the user has no control over the
magnitude of the thresholded coefficients, and therefore over the
incurred error. It does, however, afford control over the \emph{number} of
thresholded coefficients, as it is constant. This in turn greatly
simplifies the paging mechanism (no index is required, since the record
size is fixed and known beforehand). This strategy was therefore chosen
for the precomputed contribution photon map to simplify the implementation.
(Path of least resistance, you know, guv...)
It is important to note that the coarsest level approximation coefficients
(highlighted in the upper 3$\times$3 submatrix in figure
\ref{fig:waveletCoeffsThresh}) are \emph{never} thresholded. These
are essential for the reconstruction, since
the entire matrix containing the original input signal is derived from
them by the inverse wavelet transform, in conjunction with the (thresholded)
detail coefficients. These approximation coefficients are therefore
unconditionally preserved by the thresholding.
Thresholding in the precomputed contribution photon map is performed
in a dedicated routine \cmd{pmapcontrib:thresholdContribs()}.
This transfers all non-zero \emph{detail} coefficients (i.e. omitting
the unoccupied space in the wavelet coefficient matrix left over by
padding) to a thresholding buffer, consisting of an array of
struct \cmd{PreComputedContribCoeff}. Each entry in this buffer contains
a pointer to the corresponding detail coefficient in the wavelet
coefficient matrix, and its linearised 2D matrix index, using the
mapping in equation \ref{eq:linearIdx} to linearise 2D bin indices.
The index is necessary to identify which coefficients were removed
after thresholding.
The thresholding routine ``knows'' how many non-zero detail coefficients
to expect based on the summed number of padded coefficients (equation
\ref{eq:padLen}) minus the $3\times 3 = 9$ approximation coefficients,
which are excluded from thresholding. If the wavelet transform
actually produced some zero coefficients ('appens more often than
you'd expect, guv), the thresholding buffer will contain fewer
coefficients than expected. In this case, the remaining buffer is
simply filled with as many duplicates of a zero coefficient as
required; specifically, the coefficient in the lower right corner is
chosen, since it is guaranteed to be unoccupied
(see figure \ref{fig:waveletCoeffsFull}),
and will therefore be thresholded anyway.
The actual thresholding consists of partitioning the coefficients
in the thresholding buffer so that all coefficients at positions
$[0,\ l(1-comp/100)-1]$ have larger magnitudes than those in positions
\mbox{$[l (1-comp/100),\ l-1]$}. This is performed by a separate
recursive routine with the ever so unlikely name
\cmd{pmap\-cont\-rib:\-co\-eff\-Par\-tit\-ion()}, which swaps
out-of-order buffer entries, much like quicksort. Unlike quicksort,
however, the coefficients within each partition need not be sorted,
which significantly reduces the number of recursive calls. In addition,
swapping buffer entries (i.e. pointers to coefficients and their matrix
indices) is more efficient than swapping the actual RGB floating point
coefficients.
The coefficient magnitude is evaluated as the dot product of the RGB
tuple with itself, corresponding to the squared vector magnitude.
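The partitioning can be sketched as a quickselect-style recursion;
the struct and functions below are simplified stand-ins for
\cmd{PreComputedContribCoeff} and \cmd{coeffPartition()}, \emph{not}
the actual definitions:
\begin{lstlisting}
typedef struct {
   float    *coeff;     /* -> RGB detail coeff in wavelet matrix */
   unsigned idx;        /* Linearised 2D matrix index */
} Coeff;

/* Squared magnitude as dot product over RGB */
static double coeffMag (const Coeff *c)
{
   return (double)c->coeff [0] * c->coeff [0] +
      c->coeff [1] * c->coeff [1] + c->coeff [2] * c->coeff [2];
}

/* Partition buf [lo..hi] so that the n largest-magnitude entries end
   up in buf [lo..lo+n-1]; unlike quicksort, entries within each
   partition remain unsorted, and only pointers/indices are swapped. */
static void partition (Coeff *buf, int lo, int hi, int n)
{
   const double pivot = coeffMag(buf + ((lo + hi) >> 1));
   int l = lo, h = hi;

   if (lo >= hi)
      return;
   while (l <= h) {
      while (coeffMag(buf + l) > pivot) l++;  /* Big mags to the left */
      while (coeffMag(buf + h) < pivot) h--;
      if (l <= h) {
         Coeff tmp = buf [l];  buf [l] = buf [h];  buf [h] = tmp;
         l++;  h--;
      }
   }
   if (h >= lo + n)        /* Cut lies in left partition */
      partition(buf, lo, h, n);
   else if (l < lo + n)    /* Cut lies in right partition */
      partition(buf, l, hi, n - (l - lo));
}
\end{lstlisting}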
Once the coefficients have been partitioned in the thresholding buffer,
only the most significant are kept, i.e. those in the partition
$[0,\ l(1-comp/100)-1]$. These are subsequently
sorted by their coefficient indices using \cmd{qsort()} from the
standard C library, again by swapping pointers to coefficients and their
matrix indices, instead of the RGB coefficients themselves. This results in
an array of coefficients (well, pointers to them) with monotonically
increasing matrix indices, which is required by the subsequent mRGBE
coefficient encoding on disk.
% ---------------------------------------------------------------------------
\subsection{mRGBE Wavelet Coefficient Encoding}
\label{sec:mrgbe.c}
Once the wavelet coefficients have been thresholded, they are encoded
using the modified
RGBE encoding (mRGBE) prototyped in section \ref{sec:mrgbePoC}.
The C implementation in module \cmd{mrgbe.c} as part of the contribution
photon map is essentially a direct port of the Python prototype, and
is based on the same assumptions.
The mRGBE fields and their corresponding 32-bit integer value are defined
as a union of type \cmd{mRGBE} (surprise, surprise, guv...) whose
bit field configurations are defined by the macros \cmd{MANTBITS},
\cmd{EXPBITS}, and \cmd{DATABITS}.
The default bit field configuration is shown in figure
\ref{fig:mrgbeStruct}, but can be redefined to suit applications that
require more precision at the expense of a reduced payload data range,
or vice versa. In practice, the default presents a compromise suitable
for most applications.
\begin{figure}[htb]
\centering
\includegraphics[width=0.95\linewidth]{mrgbe-crop}\\
\parbox{0.9\linewidth}{%
\caption{\label{fig:mrgbeStruct}
Structure of 32-bit mRGBE encoding for wavelet coefficients.
The encoding consists of three mantissae, one per RGB colour channel,
a common exponent (base 2), and an associated payload data field
to store the coefficient index (linearised from its 2D matrix
indices). The bits can be allocated within the 32-bit envelope
at compile-time to trade off precision, encoding range, and
payload data range. The default configuration,
\cmd{MANTBITS} = 6, \cmd{EXPBITS} = 5, \cmd{DATABITS} = 9
(abbreviated 6:6:6:5:9), balances
these conflicting requirements.
}
}
\end{figure}
The primary contribution compression/encoding function
\cmd{pmapcontrib:encodeContribs()} keeps track of the per-colour channel
range $[d_{min,i},\: d_{max,i}]$ of the wavelet coefficients' absolute
values, and passes these in a struct \cmd{mRGBERange} to the encoding
initialisation function \cmd{mrgbe:mRGBEinit()}.
This function is responsible for setting the per-colour channel
normalisation factor $d_{norm,i}$, which is returned in
\cmd{mRGBERange.norm}:
\begin{equation}
d_{norm,i} = \frac{1 - 2^{-\left(2^\textrm{EXPBITS}\right)}}{
d_{max,i} - d_{min,i}
},\quad i\in\lbrace r,g,b\rbrace,
\end{equation}
where \cmd{EXPBITS} is the number of bits allocated to the shared exponent
in the mRGBE encoding, as shown in figure \ref{fig:mrgbeStruct}.
Once the mRGBE normalisation is initialised, \cmd{encodeContribs()} calls
\cmd{mRGBEencode()} for each RGB wavelet detail coefficient
$d = [d_r,\; d_g,\; d_b]$
in the thresholding buffer, passing the initialised \cmd{mRGBERange}
instance containing the normalisation factor $d_{norm}$, to obtain the
corresponding mRGBE encoding consisting of per-colour channel mantissae
$m_i$, and a shared base-2 exponent $x$:
\begin{eqnarray}
\label{eq:mrgbe}
m_i &=& \mathrm{sgn}\,\left(d_i\right)
\left\lfloor \overline{m} \cdot \overline{d}_i\, +\, \epsilon
\right\rfloor\, +\, m_{max}, \quad i\in\lbrace r,\, g,\, b\rbrace,
\quad\epsilon \in [0, 1)\\
\overline{m} &=& m \frac{m_{max}}{\overline{d}}\\
(m, x) &=& \mathrm{frexp}\left(\overline{d}\right)\\
\overline{d} &=& \max\left(
\overline{d}_r,\; \overline{d}_g,\; \overline{d}_b
\right)\\
\overline{d}_i &=& \left(\lvert d_i\rvert\, -\, d_{min,i}\right)\,
d_{norm,i}, \quad i\in\lbrace r,\, g,\, b\rbrace\\
m_{max} &=& 2^{\textrm{MANTBITS}-1},
\end{eqnarray}
where overbars denote normalised values, and \cmd{MANTBITS} specifies
the number of bits allocated to each mantissa $m_i$ in the mRGBE encoding
(including sign bit).
Note that each coefficient $d_i$ is offset by its corresponding mRGBE
range minimum, $d_{min,i}$, before being normalised by $d_{norm,i}$ to
obtain $\overline{d}_i$. Similarly, $\overline{d}$ is the latter's
maximum over RGB.
The $\mathrm{frexp}(\overline{d})$ function is part of the
standard C library and returns a floating point mantissa $m$ and an
integer base-2 exponent $x$ such that $m 2^x = \overline{d}$. Note that
the absolute value $\lvert x \rvert$ is stored in the mRGBE
exponent field, since $\overline{d} \leq 1$ implies a consistently
negative exponent.
Note also that each mantissa $m_i$ is rounded to the nearest integer via
an optional constant $\epsilon$ (defaulting to 0), and then offset
by the signed mantissa maximum, $m_{max}$, which corresponds to half the
encoding range accommodated by \cmd{MANTBITS}.
This offset encodes the mantissa's sign, with all values below
$m_{max}$ being negative.
\cmd{mRGBEencode()} accepts each wavelet coefficient's linear index as
payload data, which increases mono\-ton\-ical\-ly since the coefficients
were previously sorted with respect to their indices.
Consequently, the coefficient index is \emph{incrementally} encoded,
i.e. as the difference to that of its immediate predecessor, starting at
0 (hence only the first coefficient index is absolute).
This incremental index encoding generally requires fewer bits to encode
than an absolute index, which reduces the likelihood of
overflowing the encoding range of the mRGBE payload data field.
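To make the arithmetic concrete, the following is a minimal sketch of
the encoding in equations \ref{eq:mrgbe}ff., including the incremental
index encoding; the function, parameters, and bitfield order are
illustrative simplifications of \cmd{mrgbe:mRGBEencode()}, with
$\epsilon = 0$ and assuming a nonzero maximum coefficient (zero input
is handled separately in the actual routine):
\begin{lstlisting}
#include <math.h>

#define MANTBITS  6
#define EXPBITS   5
#define MANTMAX   (1 << (MANTBITS - 1))   /* Signed mantissa offset */

/* Sketch of mRGBE encoding; dmin[] and dnorm[] are the per-channel
   range minima and normalisation factors set up by mRGBEinit(),
   idx/prevIdx the current and preceding linear coefficient indices. */
unsigned encodeCoeff (const float d [3], const float dmin [3],
                      const float dnorm [3],
                      unsigned idx, unsigned prevIdx)
{
   double   dn [3], dbar = 0, mbar;
   int      i, x, mant [3];

   for (i = 0; i < 3; i++) {
      /* Offset by range minimum, normalise, track max over RGB */
      dn [i] = (fabs(d [i]) - dmin [i]) * dnorm [i];
      if (dn [i] > dbar)
         dbar = dn [i];
   }
   /* mbar = m * mmax / dbar, with (m, x) = frexp(dbar) and x <= 0 */
   mbar = frexp(dbar, &x) * MANTMAX / dbar;

   for (i = 0; i < 3; i++)
      /* Truncate, apply coefficient sign, offset by MANTMAX */
      mant [i] = (d [i] < 0 ? -1 : 1) * (int)(mbar * dn [i]) + MANTMAX;

   /* Pack mantissae, |exponent|, and incremental index into 32 bits */
   return (unsigned)mant [0] | (unsigned)mant [1] << MANTBITS |
      (unsigned)mant [2] << 2 * MANTBITS |
      (unsigned)(-x) << 3 * MANTBITS |
      (idx - prevIdx) << (3 * MANTBITS + EXPBITS);
}
\end{lstlisting}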
However, it is important to realise that the likelihood of overflowing the
mRGBE data field increases with the index increments, notably when the
thresholded wavelet coefficient matrix becomes sparsely populated due to
a high compression ratio and/or number of bins. This cannot be predicted as
it depends on the distribution of the thresholded coefficients, which in
turn depends on the nature and frequency content of the wavelet transformed
contributions. If this occurs, we're up the creek without a paddle
and toss in the towel (!), aborting the contribution precomputation with
an error.
This is far from optimal, and an issue that
perhaps could have been more elegantly handled with more time and budget.
As a half-baked user-friendly gesture, \mkpmap{} will at least warn
if the total number of coefficients (including padding) exceeds the
payload data field range, i.e. $2^\textrm{DATABITS}$, indicating that,
under the aforementioned conditions, the mRGBE data field
\emph{could} theoretically overflow.
The encoding function \cmd{mRGBEencode()} sets the per-colour-channel
mantissae $m_i$ in each of the bitfields \cmd{mRGBE.\{red,green,blue\}},
and the shared exponent $x$ in \cmd{mRGBE.exp}. Furthermore, the
incremental coefficient index is set in bitfield \cmd{mRGBE.dat}.
Together, these occupy a 32-bit envelope, which can be conveniently
accessed as a scalar integer value \cmd{mRGBE.all} via the union
declaration.
Each such 32-bit mRGBE-encoded wavelet coefficient is appended to a
temporary array of struct \cmd{mRGBE} in the container
\cmd{preComputedContrib}, along with the instance of \cmd{mRGBERange}
containing the encoding normalisation and offset. These are the final
return values of the contribution compression/encoding routine
\cmd{encodeContribs()}.
Once the wavelet detail coefficients have been compressed and encoded,
\cmd{preComputeContrib()} prepends them with the
approximation coefficients in the upper $3\times 3$ submatrix
of the wavelet coefficient matrix, followed by the per-colour-channel
mRGBE range minimum and maxima, as returned in the \cmd{mRGBERange}
instance by \cmd{encodeContribs()}. Since the approximation coefficients
are not thresholded, their positions in the matrix are known; moreover,
they demand higher precision than mRGBE offers, and are consequently
encoded in \radiance's standard 32-bit RGBE format.
Similarly, the mRGBE encoding range is also encoded as RGBE to
preserve the normalisation and offset with sufficient precision,
thereby minimising the decoding error. Since the approximation
coefficients can (rather surprisingly) be negative, but 32-bit RGBE
can only encode positive values, a hack is employed whereby the least
significant bit (bit 0) of each mantissa is sacrificed as a sign bit
(see macro \cmd{PMAP\_CONTRIB\_SET\_RGBE32\_SGN()} in
\cmd{pmapcontrib.h}), resulting in a slight reduction in precision.
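The idea behind this hack can be sketched with a pair of macros; these
are illustrative stand-ins, \emph{not} the actual definitions in
\cmd{pmapcontrib.h}:
\begin{lstlisting}
/* Sketch of the RGBE sign-bit hack: sacrifice bit 0 of an 8-bit RGBE
   mantissa m to carry the sign of coefficient value v */
#define SET_RGBE32_SGN(m, v)  ((m) = ((m) & ~1) | ((v) < 0))
#define GET_RGBE32_SGN(m)     ((m) & 1 ? -1 : 1)
\end{lstlisting}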
At this point, a new photon is created with the same attributes
as the original photon selected for precomputation, by calling
\cmd{pmapdata:newPhoton()}. This function sets specific attributes
for various photon types; for precomputed contribution photons, it
sets the auxiliary data field \cmd{Photon.aux.contribSrc}
to the current light source index (passed via the photon ray's \cmd{rsrc}
field), in order to identify the precomputed photon's emitting light
source.
The RGBE encoded approximation coefficients and mRGBE range, as well as
the mRGBE encoded thresholded detail coefficients, are passed to
\cmd{newPhoton()} and (if not NULL) accumulated in a contribution
buffer \cmd{PhotonMap.contribHeapBuf}. This is analogous to the
accumulation of photons in the heap buffer \cmd{PhotonMap.heapBuf}
which already takes place in \cmd{newPhoton()}; in this case it
holds the precomputed photons associated with the contributions, in
the same order.
Once the heap buffer is full (both photon and contribution buffers
hold the same number of records), they are flushed to separate
heap files on disk via the lazily initialised file pointers,
\cmd{PhotonMap.\{heap, contribHeap\}}. This ensures that
photon map construction is consistently out-of-core.
Once \cmd{preComputeContrib()} has iterated over all precomputed
photons, and flushed the photons and contributions for each precomputed
child photon map to disk, it chucks -- uh, discards the original
photon map. Iterating over each child photon map, it calls the
contribution photon map specific build routine,
\cmd{buildPreCompContribPmap()}. This, along with the
saving routines, resides in a separate module \cmd{pmcontrib3.c},
and is described in the next section.
% ---------------------------------------------------------------------------
\subsection{Building and Saving Precomputed Contribution Photon Maps}
\label{sec:buildSave}
The top-level build routine for per-modifier child photon maps,
\cmd{buildPreCompContribPmap()}, sets up the output subdirectory
from the parent photon map's name (as specified to \mkpmap{} via
the \opt{-apC} option), and the filename of each child photon map and
its wavelet coefficient file by appending the corresponding modifier
name, as specified by the \opt{-m} option.
In addition, the routine recursively deletes any previous files
in the subdirectory using the file tree walk function \cmd{nftw()} from
the standard C library.%
\footnote{
This function supersedes \cmd{ftw()}, which
is deprecated according to POSIX.1-2008. Neither of these functions is
included in the standard Wind0z API, and therefore no cleanup
takes place on this platform. So tuff luck.
}
Each child photon map is stored on disk in a subdirectory
\lit{<pmapfile>.rc/} derived from the parent photon map's name,
\lit{<pmapfile>}. The photons themselves are stored out-of-core in a
file \lit{<pmapfile>.rc/<mod>.leaf}, while the compressed wavelet
coefficients are stored in a separate file \lit{<pmapfile>.rc/<mod>.wvt}
(see also figure \ref{fig:overview}).
In addition, all binning parameters
relevant to \rcontrib, i.e. all instances of the options \opt{-bn},
\opt{-e}, \opt{\mbox{-m}}, \opt{-M} and \opt{-p} passed to \mkpmap,
are collected from the command line and saved to an option file
\lit{<pmapfile>.opt} in the parent directory. This also includes options
to specify a Shirley-Chiu mapping via the \cmd{disk2square.cal} function
file, which is functionally equivalent to the internal contribution
photon binning routine, \cmd{pmapcontrib:ray2bin()}. For
definitions of the relevant option strings, see macros
\cmd{mkpmap:PMAP\_CONTRIB\_RCOPT\_$\mathbf{*}$}.
Instances of these option strings are appended to a cumulative string by
\mkpmap{} as it parses the command line, which is then passed to the main
photon distribution routine \cmd{pmcontrib2:distribPhotonContrib()}. The
latter assigns the option string to the generated photon map's
\cmd{PhotonMap.rcOpts} field for subsequent saving to disk.
The option file is intended to be passed to \rcontrib{} via the \opt{@@}
option to ensure the binning parameters are consistent with those in the
photon map (the technical reasons why this isn't handled automatically
are detailed in section \ref{sec:load}).
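For example, given a parent photon map \lit{contrib.pm} (an
illustrative filename), the saved binning parameters would be passed
to \rcontrib{} as follows:
\begin{center}
\begin{minipage}{0.7\linewidth}
\begin{lstlisting}
rcontrib @@contrib.pm.opt [other options] scene.oct
\end{lstlisting}
\end{minipage}
\end{center}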
Once the output filenames have been initialised,
\cmd{buildPreCompContribPmap()} calls the standard photon map building
routine, \cmd{pmapdata:buildPhotonMap()}. This consolidates the
photon and contribution heaps generated by the multiple processes
forked by the \opt{-n} option, normalises the
photon flux if applicable (skipped if daylight coefficients are enabled
with the \opt{-V} option, in which
case the photon flux is already normalised). Since the
contribution photon map requires out-of-core storage,
this function calls the specific out-of-core build routine,
\cmd{pmapooc:OOC\_BuildPhotonMap()}. With a regular photon map, this
calls \cmd{oocsort:OOC\_Sort()}, which sorts photons out-of-core
according to the Morton code derived from the photons' 3D positions
\cite{schregle-techreport-2016}. The photons are then saved to the
out-of-core octree leaf file \lit{<pmapfile>.rc/<mod>.leaf} before
the out-of-core octree structure to index the leaves is built by
calling \cmd{oocbuild:OOC\_Build()}.
In the case of a contribution photon map, an extended out-of-core
sorting function \cmd{pmcontribsort:contribPhotonSort()} is called,
which is derived from \cmd{OOC\_Sort()} and uses the same low-level
operations to access intermediate files.
Besides sorting the photons in their consolidated heap file, the
adapted routine also sorts the corresponding wavelet coefficients in
their likewise consolidated heap file. While the photons are unordered
w.r.t their Morton codes within each heap, both heaps are ordered w.r.t
each other, since they were synchronously flushed to disk by
\cmd{pmapcontrib:preComputeContrib()} during precomputation.
Thus, sorting both heaps w.r.t the photon's Morton indices maintains the
correspondence between photons and their wavelet coefficients.
Because the number of thresholded coefficients is fixed, the size of each
set of coefficients is too, which greatly simplifies this step of the
build process.
Like the original out-of-core sorting routine, the contribution photon
sorting routine employs an external mergesort \cite{Knuth:1998:ACP:280635,
Seyedafsari:2010}. This recursively subdivides the unordered
photon and contribution heap files into progressively smaller
subblocks (maintaining these out-of-core in temporary files), until they
are small enough to be quicksorted in-core (in parallel, if the
\opt{-n} option is passed to \mkpmap). The sorted blocks are
then merged into progressively larger out-of-core blocks as the recursion
unwinds. On exiting the sorting routine, the sorted photons are returned
in the out-of-core octree leaf file \lit{<pmapfile>.rc/<mod>.leaf},
while the corresponding wavelet coefficients are returned in the
wavelet file \lit{<pmapfile>.rc/<mod>.wvt}. At this point, the
out-of-core octree to index the photons in the leaf file
is built by \cmd{oocbuild:OOC\_Build()}, as with any out-of-core
photon map.
Once built, precomputed contribution photon maps are saved by
\cmd{pm\-cont\-rib3:save\-Cont\-rib\-Pho\-ton\-Map()}, which is called
for the (parent) contribution photon map by the standard
photon saving routine \cmd{pmapio:\-save\-Photon\-Map()}.
\cmd{saveContribPhotonMap()} saves the binning parameters specified
in the parent photon map's \cmd{rcOpts} string to the option file
for later use with \rcontrib. The routine subsequently iterates over the
(child) photon maps referenced in the parent's \cmd{preCompContribTab}
using the standard
lookup table iterator routine \cmd{look\-up:lu\_do\-all()},
which calls the saving routine \cmd{pmcontrib3:savePreCompContrib()}
for each per-modifier child photon map in the LUT. This latter routine
calls, in turn, the standard photon map saving routine
\cmd{pmapio:\-save\-Photon\-Map()} again, this time to save the actual
per-modifier photons.%
\footnote{
Hey, are we confused yet? And in case you're wondering,
\cmd{savePhotonMap()} will \emph{not} wind up in a recursive loop, as
each child photon map's \cmd{preCompContribTab} field will be NULL,
since, in the na\"ively ideal world of the \radiance{} photon map
at least, kids don't have kids!
}
To this end, \cmd{save\-Photon\-Map()} was modified to generate
contribution-specific information in the file header (number of coefficients,
compression rate, etc.) before
saving the photon map itself. Saving the photon map entails encoding the
indexing structure of the out-of-core octree to reference the photons
in its leaves (which physically reside on disk in the leaf file).
Once all child contribution photon maps and their parent are saved,
the precomputation is concluded and \mkpmap{} cleans up and terminates.
Woohoo!
% ---------------------------------------------------------------------------
\subsection{Logarithmic vs. Linear Encoding of Contributions}
\label{sec:logEncoding}
Contributions can optionally be encoded logarithmically during the
wavelet transform via the \cmd{PMAP\_\-CON\-TRIB\_\-LOG} compile-time
option (see \cmd{pmapcontrib.h}). If defined,
\cmd{pmap\-contrib:\-en\-code\-Con\-tribs()}
applies a natural logarithm to every RGB contribution tuple prior
to the wavelet transform.
The advantage of this encoding is that it reduces the range of the
input, and consequently that of the resulting wavelet coefficients, which
improves the precision of the subsequent mRGBE encoding.
It also elegantly handles potential negative values resulting from
compression artefacts during the inverse wavelet transform, since the
contributions will always be positive after inverting the log
encoding through exponentiation.
The downside of this logarithmic encoding is that it complicates the
boundary extension during the wavelet transform, as most extension modes
assume linear input data, particularly the gradient extension modes.
Consequently, logarithmic encoding should only be used with a ``safe''
extension mode, such as \cmd{WAVELET\_EXTEND\_CONST}. A further downside
is the increased sensitivity of the encoding to compression artefacts,
as well as to jitter due to the limited precision of the mRGBE encoding.
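A minimal sketch of the round trip, assuming strictly positive
contributions (function names are illustrative):
\begin{lstlisting}
#include <math.h>

/* Sketch: natural log encoding of an RGB contribution tuple prior
   to the forward wavelet transform, and its inversion after the
   inverse transform; assumes strictly positive contributions. */
void logEncode (float c [3])
{
   int i;

   for (i = 0; i < 3; i++)
      c [i] = logf(c [i]);
}

void logDecode (float c [3])
{
   int i;

   for (i = 0; i < 3; i++)
      c [i] = expf(c [i]);   /* Always positive after exponentiation */
}
\end{lstlisting}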
% ---------------------------------------------------------------------------
\subsection{Achtung, Baby: Sparsely Populated Bins}
\label{sec:emptyBins}
A potential anomaly with contribution photon mapping is that some bins
may not be populated during precomputation. This can happen in situations
with weak illuminance, and is difficult to predict. If the ratio of
populated (i.e. nonzero) bins is too low, bias may result from the
sparsely populated bins (see figure \ref{fig:cpmapTest-emptyBinsHDR}).
Note that this issue is solely dependent on the number of bins and
photons, not the compression ratio.
\begin{figure}[htb]
\centering
\includegraphics[width=0.7\linewidth]{cpmapTest-emptyBins}
\parbox{0.7\linewidth}{%
\caption{%
\label{fig:cpmapTest-emptyBinsHDR}
Bias caused by sparsely binned contributions. With 64 bins
(right, shown for bin 63), an average of 90\% of bins contain
nonzero contributions.
With 256 bins (left, shown for bin 255), this ratio drops to
under 50\%, leading to obvious bias, notably on the left wall
and ceiling. See section \ref{sec:cpmapTestHDR} for a
description of the scene and a composite rendering of all bins.
}
}
\end{figure}
The average fraction of empty bins
can be obtained by compiling with the
\cmd{PMAP\_\-CONT\-RIB\_\-BIN\-HIS\-TO}
macro. Plotting this against the number of bins (see figure
\ref{fig:cpmapTest-emptyBinsPlot}) reveals a pronounced
discontinuity at the point where bias sets in due to sparsely populated
bins. This suggests the solution becomes ill-posed if the average
fraction of empty bins rises above ca. 25\%.
\begin{figure}[htb]
\centering
\includegraphics[width=0.7\linewidth]{cpmapTest-64m-bn64-emptyBins}
\parbox{0.7\linewidth}{%
\caption{%
\label{fig:cpmapTest-emptyBinsPlot}
Graph of empty bin ratio (averaged over all precomputed photons)
as a function of the number of bins. This ratio rises sharply
above 9$\times$9 = 81 bins, indicating the majority of
precomputed contributions include unpopulated bins, which in
turn manifests itself as visible bias, as shown in figure
\ref{fig:cpmapTest-emptyBinsHDR}.
}
}
\end{figure}
As a consequence, the contribution photon
density estimate routine \cmd{getPhotonContrib()} issues a warning if
fewer than 50\% of bins are populated (doubling the empirical threshold
for good measure). The frequency of these warnings
indicates the need to increase the overall number of photons used
for precomputation, as well as the photon lookup bandwidth. However,
there is no guarantee this will remedy the situation, as the actual
luminance distribution may simply not cover the entire incident
hemisphere.
% ---------------------------------------------------------------------------
\clearpage
\section{Unit Tests}
The main constituent modules of the precomputed contribution photon map
contain optional unit tests which can be built at compile time. These
consist of a \cmd{main()} function which is enabled by compiler macros
to produce a standalone binary for individual testing of each module's
basic functionality.
\subsection{Contribution Binning Unit Test}
\label{sec:pmapcontrib-test}
The \cmd{pmapcontrib} module contains a unit test which can be built
with the command
\begin{center}
\begin{minipage}{0.7\linewidth}
\begin{lstlisting}
rmake pmapcontrib-test
\end{lstlisting}
\end{minipage}
\end{center}
where the compilation target \cmd{pmapcontrib-test} enables the macro
\cmd{PMAP\_\-CONT\-RIB\_\-TEST}. The test verifies
the Shirley-Chiu binning function and its orientation variables by calling
\cmd{ray2bin()} for a set of sample rays in the incident hemisphere, and
outputs the corresponding linearised bin numbers.
The unit test is invoked as follows:
\begin{center}
\begin{minipage}{0.7\linewidth}
\begin{lstlisting}
pmapcontrib-test <scdim> <nsamp> [<var>=<value>; ..]
\end{lstlisting}
\end{minipage}
\end{center}
where \lit{scdim} is the dimension of the Shirley-Chiu square, and
\lit{nsamp} is the number of sample rays. The sample rays are distributed
over the hemisphere by stratifying in $\theta$ and $\phi$. The orientation
of the Shirley-Chiu mapping can be modified by appending optional
variable assignments to the command line (see section \ref{sec:binning}
for the list of relevant variables). Figure \ref{fig:pmapcontrib-test}
shows a sample output of this unit test.
\begin{figure}[htb]
\centering
\includegraphics[width=0.8\linewidth]{pmapcontrib-test}
\parbox{0.9\linewidth}{%
\caption{%
\label{fig:pmapcontrib-test}
Output of contribution binning unit test
for an 8$\times$8 Shirley-Chiu mapping.
}
}
\end{figure}
\subsection{Wavelet Unit Test}
\label{sec:wavelet3-test}
The \cmd{wavelet3} module contains a unit test of the 2D padded wavelet
transform which can be built with the command
\begin{center}
\begin{minipage}{0.7\linewidth}
\begin{lstlisting}
rmake wavelet3-test
\end{lstlisting}
\end{minipage}
\end{center}
where the compilation target \cmd{wavelet3-test} enables the macro
\cmd{WAVE\-LET\_\-TEST\_\-2D\-PAD}.\footnote{
The \cmd{wavelet} and \cmd{wavelet2} modules also contain unit tests
of the 1D and 2D unpadded wavelet transforms, which are enabled at
compile time with the \cmd{WAVE\-LET\_\-TEST\_\-1D} and
\cmd{WAVE\-LET\_\-TEST\_\-2D} macros, respectively. These are
functionally similar to module \cmd{wavelet3}'s unit test.
}
The unit test is invoked as follows:
\begin{center}
\begin{minipage}{0.7\linewidth}
\begin{lstlisting}
wavelet3-test <scdim> [threshold] [dataFile]
\end{lstlisting}
\end{minipage}
\end{center}
where \lit{scdim} is the dimension of the Shirley-Chiu square,
\lit{threshold} is an optional (hard) threshold which sets all wavelet
coefficients below this value to zero, and \lit{dataFile} is an optional
ASCII file containing the values for an input matrix, one matrix row per
line, which must contain at least \lit{scdim} lines and as many values
per line (the excess being ignored). This is particularly useful to import
and wavelet transform actual contributions output by \rcontrib{} for
the same binning configuration.
The test allocates and initialises a 2D input matrix of the specified size
\lit{scdim} $\times$ \lit{scdim}. The matrix is either initialised from
the optional \lit{dataFile} (if specified), or with generated values as
determined by the \cmd{WAVELET\_TEST\_INIT} macro, which supports the
following settings:
\begin{enumerate}
\item Random data, with independent colour channels:
$y_{i,j} = [\xi_r,\; \xi_g,\; \xi_b]$.
\item Random data, with correlated colour channels:
$y_{i,j} = [\xi_r,\; (0.1+0.9\xi_g)\:\xi_r,\; (0.1+0.9\xi_b)\:\xi_r]$,
i.e. RGB differ by less than a factor of 10.
\item Random data, identical for all colour channels:
$y_{i,j} = [\xi_r,\; \xi_r,\; \xi_r]$.
\item Monotonically increasing along 1st axis:
$y_{i,j} = [i + 1, \cdots ,\cdots]$, where an offset of 1 avoids
taking the logarithm of zero if log encoding of coefficients is
enabled (see section \ref{sec:logEncoding}).
\item Monotonically increasing along both axes:
$y_{i,j} = [ij+1, \cdots, \cdots]$, with an offset of 1 again to
avoid taking the log of zero.
\item Monotonically increasing by linear index:
$y_{i,j} = [i \cdot scdim + j + 1, \cdots, \cdots]$,
i.e. serialised matrix rows.
\item Symmetric ``bump'' function:
$y_{i,j} = \left[
\left(1.1 - \frac{\lvert j - scdim/2 + 0.5\rvert}
{scdim/2 - 0.5}
\right)
\left(1.1 - \frac{\lvert i - scdim/2 + 0.5\rvert}
{scdim/2 - 0.5}
\right),
\cdots,
\cdots
\right]$, with an offset of 0.1 to avoid taking the log of zero.
\end{enumerate}
Note that $\xi_i \in [0,1]$ denote independent random variables per
colour channel, and ellipses indicate repeated values for the remaining
colour channels. These initialisation options are useful to compare the
reconstructed matrix against a known (simple) reference, as opposed to
a more complex distribution from the \lit{dataFile}.
Regardless of the source,
the input array is optionally log-encoded if \cmd{WAVELET\_TEST\_LOG} is
enabled.
\begin{figure}[p]
\centering
\includegraphics[width=\linewidth]{wavelet3-test}
\parbox{\linewidth}{%
\caption{%
\label{fig:wavelet3-test}
Output of the wavelet transform unit
test for a 5$\times$5 input matrix with thresholding (the
mRGBE output is omitted for the sake of brevity).
Space for padding coefficients in the matrix is indicated by
dots, while thresholded coefficients are indicated by
dots surrounded by square brackets.
}
}
\end{figure}
The unit test then allocates an output matrix and its transpose according
to the padded size returned by \cmd{padWaveletXform2()} with NULL input
arrays. It then performs a full 2D padded wavelet transform using this
function. If \lit{threshold} was specified on the command line,
the resulting wavelet detail coefficients with absolute value%
\footnote{
The scalar absolute value is obtained as the dot product of
the RGB tuple with itself.
}
below \lit{threshold} are
set to zero; similarly to \cmd{pmapcontrib:thresh\-old\-Cont\-ribs()},
the approximation coefficients in the upper left of the output matrix are
not thresholded.
The test outputs the wavelet coefficient matrix to the console,
marking the thresholded coefficients as bracketed dots (see figure
\ref{fig:wavelet3-test}).
If \cmd{WAVELET\_TEST\_mRGBE} is defined, the unit test also encodes the
(thresholded) wavelet coefficients to mRGBE, placing these in
an additional output matrix, which is also output below the original
floating point coefficients for comparison. Again, the approximation
coefficients in the upper left of the matrix are preserved.
Finally, the test inverts the wavelet transform using the (thresholded)
wavelet coefficients by calling \cmd{padWaveletInvXform2()}, and outputs
the reconstructed data along with the root mean square error (RMSE)
compared to the original input. This process is optionally repeated for
the mRGBE-encoded coefficients.
As a convenience for subsequent analysis, the (thresholded) wavelet
coefficients can also be output to a file defined by the macro
\cmd{WAVELET\_TEST\_COEFFOUT}, which defaults to
\cmd{wavelet3-test-coeff.dat}.
Similarly, the mRGBE-encoded coefficients can be output to
\cmd{WAVELET\_\-TEST\_\-COEFF\-OUT\_\-mRGBE}, which defaults to
\cmd{wavelet3-test-coeff-mrgbe.dat}. Analogously, the reconstructed
data can be output to \cmd{WAVELET\_TEST\_OUT} (defaulting to
\cmd{wavelet3-test-xform.dat}), while the reconstruction from
mRGBE-encoded coefficients can be output to
\cmd{WAVELET\_TEST\_OUT\_mRGBE} (defaulting to
\cmd{wavelet3-test-xform-mrgbe.dat}).
Undefining any of these macros (e.g. by commenting out their definitions
in \cmd{wavelet3.c}) disables the corresponding file output.
The wavelet transform unit test is crucial to evaluate deviations
(artefacts) incurred by thresholding and the limited precision of the mRGBE
encoding (particularly with different bitfield configurations of the
latter). Consequently, the overwhelming majority of testing was conducted
with this tool in order to troubleshoot and optimise the wavelet transform,
particularly with respect to the various boundary extension modes.
\clearpage
\subsection{mRGBE Unit Test}
\label{sec:mrgbe-test}
The \cmd{mrgbe} module contains a unit test of the mRGBE encoding
which can be built with the command
\begin{center}
\begin{minipage}{0.7\linewidth}
\begin{lstlisting}
rmake mrgbe-test
\end{lstlisting}
\end{minipage}
\end{center}
where the compilation target \cmd{mrgbe-test} enables the macro
\cmd{mRGBE\_TEST}. The unit test is invoked as follows:
\begin{center}
\begin{minipage}{0.7\linewidth}
\begin{lstlisting}
mrgbe-test <numTests>
\end{lstlisting}
\end{minipage}
\end{center}
The test initially triggers possible exceptions with invalid or marginal
encoding input (notably zero, which is handled separately in the encoding
routine), under/overflow (relative to the specified encoding range),
and empty encoding range.
It then proceeds to encode and decode \lit{numTests} random 3-tuples in
a predetermined range (defined by macros \cmd{RGBMIN} and \cmd{RGBMAX}),
and dumps the RMSE between encoded and decoded values, calculated as
the dot product of their component differences.
Figure \ref{fig:mrgbe-test} shows a sample run.
\begin{figure}[htb]
\centering
\includegraphics[width=\linewidth]{mrgbe-test}
\parbox{\linewidth}{%
\caption{%
\label{fig:mrgbe-test}
Sample output of mRGBE unit test for a 6:6:6:5:9 mRGBE bit
configuration. With 6 bits per mantissa, the RMSE averages
around 3\%.
}
}
\end{figure}
% ---------------------------------------------------------------------------
\clearpage
\section{Precomputed Contribution Evaluation with \rcontrib}
The evaluation of precomputed contributions from the photon map in
\rcontrib{} entails locating the nearest precomputed photon, paging
its compressed contributions from disk on demand, decoding the
mRGBE-encoded wavelet coefficients, and inverting the wavelet transform
to reconstruct the contributions.
The reconstructed contributions are then transferred to \rcontrib's
contribution lookup table, and can be optionally cached to accelerate
neighbouring evaluations that require the same photon again.
\subsection{Loading Precomputed Contribution Photon Maps}
\label{sec:load}
Similarly to saving the contribution photon maps in \mkpmap{}, \rcontrib{}
calls the main photon map loading routine \cmd{pmapio:loadPhotonMap()} with
the parent photon map, which passes control to a dedicated routine
\cmd{pmcontrib4:loadContribPhotonMap()}. The latter again calls
\cmd{loadPhotonMap()} to load each per-modifier child photon map. Note
that the photons and their contributions remain out-of-core, and reside on
disk. The out-of-core octree data structure, however, remains in-core to
facilitate navigating the photon map during lookups.
The binning parameters used by \mkpmap{} to precompute the contributions
are dumped to an option file \lit{<pmapfile>.opt} in the photon map's
parent directory when it is built and saved. The contents of this
file are passed to \rcontrib{} via the \opt{@@} option to ensure the
binning parameters are consistent with those in the photon map. This
dimensions \rcontrib's contribution array to accommodate the precomputed
bins via the \opt{-bn} option, and also specifies the binning function
file \cmd{disk2square.cal} for a Shirley-Chiu mapping
along with its orientation parameters, via the options \opt{-f}, \opt{-b},
\opt{-e} and \opt{-p}. Although the contributions from the photons are
already prebinned, the primary ray traced by \rcontrib{} which triggers
the photon lookup (possibly via
one ambient or several specular interactions) will still be
binned, and this binning must of course be consistent with that of the
precomputed contributions.
While automatic binning parameters would clearly be more convenient, the
option file is a necessity dictated by \rcontrib's architecture.
This is because
by the time the photon map is loaded, \rcontrib's contribution arrays
have already been allocated and initialised, along with its output
streams if the \opt{-o} option is specified, which can be assigned to
individual modifiers and even bins. Reallocating these resources once the
photon map is loaded, and the binning parameters have been extracted,
would be too complex and error prone. While this may change in the future,
it was considered safer to use the already existing concept of option
files to pass the correct parameters to \rcontrib{} at startup,
which of course then becomes the user's responsibility.%
\footnote{
This doesn't seem unreasonable as it conforms to \radiance's
``it's the user's fault'' philosophy. ;\^{})
}
\subsection{Locating and Paging Precomputed Photons}
As in \rtrace{}, the evaluation of contributions is triggered via the
ambient calculation routine \cmd{ambient:multambient()}, which in turn
calls the photon mapping ``hook'' \cmd{pmapamb:ambPmap()} (resp.
\cmd{pmap\-amb:\-amb\-Pmap\-Caustic()} for caustic photons).%
\footnote{
Caustic photons carry a flag indicating they have been specularly
scattered, and are exclusively accepted by \cmd{pmapfilt:filterPhoton()}
during lookups initiated by \cmd{ambPmapCaustic()}. Note, however,
that the dedicated caustic photon map generated with \mkpmap's
\opt{-apc} option doesn't support contributions.
}
These routines call the appropriate photon lookup routine depending on
the photon type (defined in the callback \cmd{PhotonMap.lookup}), which
in the case of contribution photons is
\cmd{pmcontrib4:getPreCompPhotonContrib()}.
\cmd{getPreCompPhotonContrib()} calls
\cmd{raytrace:raycontrib()} to obtain the cumulative contribution of the
incident ray; typically this is the last ray in a path starting with a
primary ray followed by zero or more specular scattering events, and
zero or one (if \opt{-ab} is positive) diffuse scattering event.
This is then passed to \cmd{getPreCompContribByMod()}, which is called
for each per-modifier child photon map.
\cmd{getPreCompContribByMod()} locates the single closest precomputed
contribution photon in the corresponding child photon map for the
current modifier by calling the standard lookup routine
\cmd{pmapdata:find1Photon()}. The latter, in turn, calls the
out-of-core lookup routine \cmd{pmapooc:OOC\_\-Find\-1\-Pho\-ton()}, which
pages photons from the out-of-core octree leaf file and caches them
via an instance of the out-of-core photon cache
\cite{schregle-techreport-2016}.
The contributions for the found photon are decoded by calling
\cmd{pmcontrib4:get\-Pre\-Comp\-Con\-trib\-By\-Pho\-ton()} (see below)
and passing the photon along with its index.
\cmd{getPreCompContribByMod()} subsequently scales the decoded
contributions by the incident ray's cumulative contribution which was
passed by the caller, \cmd{getPreCompPhotonContrib()}, after which
the modifier's contributions are returned.
\subsection{mRGBE Wavelet Coefficient Decoding}
\cmd{getPreCompContribByPhoton()} loads the 32-bit RGBE-encoded
detail coefficient range and approximation coefficients from the
wavelet coefficient file for a given photon, using the photon's
numeric index as file offset.
The approximation coefficients are placed in the upper left of the
(initially zeroed) wavelet coefficient matrix. The sign of each
approximation coefficient's colour channel is set
according to bit 0 of the corresponding 8-bit mantissa in the 32-bit RGBE
encoding, via a convenient macro \cmd{PMAP\_CONTRIB\_GET\_RGBE32\_SGN()}.
\cmd{getPreCompContribByPhoton()} then loads the mRGBE-encoded
detail coefficients for the given photon from the wavelet coefficient
file (where they are stored immediately after the mRGBE range and
approximation coefficients).
Miscellaneous bookkeeping such as lazy initialisation of the
primary and transposed wavelet matrices also takes place here.
The detail coefficients are loaded into a lazily allocated
decoding buffer embedded in the current child photon map's field of
type struct \cmd{PreComputedContrib}.
Once the detail coefficients and their range have been loaded, they are
decoded from the decoding buffer by calling \cmd{decodeContribs()},
which also performs the inverse wavelet transform.
\cmd{decodeContribs()} initialises the mRGBE normalisation from the
retrieved detail coefficient range by calling \cmd{mrgbe:mRGBEinit()},
and subsequently passes this to the mRGBE decoding routine
\cmd{mrgbe:\-mRGBE\-de\-code()} for each detail coefficient. The mRGBE
decoding routine returns each decoded detail coefficient as a floating
point RGB 3-tuple along with its corresponding incremental linear
coefficient index.
Given a 32-bit mRGBE-encoded coefficient consisting of the per-colour
channel mantissae $m_i$ and the shared base-2 exponent $x$,
\cmd{mRGBEdecode()} returns the corresponding floating point
RGB wavelet coefficient $d = [d_r,\: d_g,\: d_b]$:
\begin{eqnarray}
\label{eq:invmrgbe}
\overline{d} &=& \frac{\mathrm{ldexp}\left(1, -\lvert x \rvert\right)}
{m_{max}} \\
d_i &=& \mathrm{sgn_m}\left(m_i\right) \: \left(
\lvert m_i - m_{max} + \epsilon \rvert \
\frac{\overline{d}}{d_{norm,i}}\: +\: d_{min,i}
\right) \\
\mathrm{sgn_m}\left(m_i\right) &=& \left\{
\begin{array}{l r}
-1 & \quad \textrm{if } m_i < m_{max} \\
1 & \quad \textrm{if } m_i \geq m_{max} \\
\end{array}
\right. \\
m_{max} &=& 2^{\textrm{MANTBITS}-1},
\end{eqnarray}
where overbars denote normalised values, and $d_{norm,i}$ and $d_{min,i}$
are the mRGBE normalisation factor and range minimum for color channel $i$,
respectively. Each coefficient $d_i$ is denormalised by $d_{norm,i}$ and
offset by $d_{min,i}$, inverting the encoding in equations
\ref{eq:mrgbe}ff. The function $\mathrm{sgn_m}\left(m_i\right)$ is
a macro that returns the sign of the mRGBE mantissa $m_i$ by comparing its
value with the signed mantissa offset, $m_{max}$. This offset is
subtracted from each $m_i$ to obtain $d_i$.
The $\mathrm{ldexp}\left(m, x\right)$ function is part of the
standard C library and computes integer powers of 2 scaled by a floating
point mantissa, i.e. $\overline{d} = m 2^x$, where $m = 1$ in this case
to obtain a normalised power of 2. Note that the exponent $x$ is stored
as absolute value in the mRGBE exponent field, and is therefore negated
so that $\overline{d} \leq 1$.
Due to the limited precision of the mRGBE
encoding, the decoded value will deviate from the original. A jitter
value $\epsilon \in [0,0.5]$ can be added to break up quantisation
artefacts (aliasing). In practice, values in excess of 0.5 increase the
RMSE reported by the unit test (see section \ref{sec:mrgbe-test}).
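For concreteness, a corresponding decoding sketch; as with the earlier
encoding sketch, names and parameters are illustrative simplifications
of \cmd{mrgbe:mRGBEdecode()}:
\begin{lstlisting}
#include <math.h>

#define MANTBITS  6
#define MANTMAX   (1 << (MANTBITS - 1))   /* Signed mantissa offset */

/* Sketch of mRGBE decoding; mant[] and x are the unpacked per-channel
   mantissae and absolute exponent, dmin[]/dnorm[] the range minima
   and normalisation factors from mRGBEinit(), eps an optional jitter
   in [0, 0.5]. */
void decodeCoeff (const int mant [3], int x, const float dmin [3],
                  const float dnorm [3], double eps, float d [3])
{
   /* dbar = ldexp(1, -|x|) / mmax */
   const double dbar = ldexp(1.0, -x) / MANTMAX;
   int i;

   for (i = 0; i < 3; i++)
      /* Sign from mantissa offset, denormalise, add range minimum */
      d [i] = (mant [i] < MANTMAX ? -1 : 1) *
         (fabs(mant [i] - MANTMAX + eps) * dbar / dnorm [i] + dmin [i]);
}
\end{lstlisting}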
The decoded detail coefficients are transferred to the wavelet coefficient
matrix at their corresponding indices, thus incrementally populating
the matrix. The matrix index is
obtained by summing the incremental values embedded in the mRGBE data
field of the consecutively stored coefficients. Since this index $k$ is
1-dimensional, it is deserialised to 2D coordinates
$i, j$ by inverting the mapping in equation \ref{eq:linearIdx}:
\begin{eqnarray}
\label{eq:invLinearIdx}
i &=& \lfloor k / m \rfloor \\
j &=& k \textrm{ mod } m,
\end{eqnarray}
where $m$ is the array dimension. Once the matrix has been populated with
all decoded coefficients, the unpopulated coefficients remain zero (as
initialised); these represent those that were thresholded during
compression.
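A sketch of this transfer, assuming a row-major $m \times m$ matrix of
RGB floats (parameter names are illustrative):
\begin{lstlisting}
/* Sketch: populate a row-major m x m matrix of RGB coefficients from
   n decoded coefficients rgb[] and their incremental indices
   idxInc[], inverting the linearised index as per the equations
   above. */
void populateMatrix (float mat [] [3], unsigned m, unsigned n,
                     const float rgb [] [3], const unsigned idxInc [])
{
   unsigned c, i, j, k = 0;

   for (c = 0; c < n; c++) {
      k += idxInc [c];        /* First index is absolute */
      i = k / m;              /* Row */
      j = k % m;              /* Column */
      mat [i * m + j] [0] = rgb [c] [0];
      mat [i * m + j] [1] = rgb [c] [1];
      mat [i * m + j] [2] = rgb [c] [2];
   }
}
\end{lstlisting}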
\subsection{2D Inverse Wavelet Transform}
Once the wavelet coefficient matrix has been decoded and populated,
\cmd{decodeContribs()} calls
\cmd{wave\-let2:\-pad\-Wave\-let\-Inv\-Xform2()} to
perform a full inverse Daubechies DB2 wavelet transform to recover the
original contributions -- or more specifically, an approximation thereof
subject to compression artefacts.
The latter function therefore acts as a wrapper
for \cmd{pad\-D2\-Inv\-Step2()}, which performs one pass of the inverse
wavelet transform along a fixed (horizontal) axis.
Like its forward transforming counterpart \cmd{pad\-D2\-Step2()}
described in section \ref{sec:wavelet3.c}, this function returns an
output matrix in which the inverted coefficients have been transposed so
it can be swapped with the input matrix to perform a second pass along the
alternate axis. Consequently, each pair of invocations of the inverse
transform step constitutes a complete horizontal/vertical inverse
transform pass at a given resolution.
Given approximation and detail coefficients, $s^k_{i,j}$ and $d^k_{i,j}$
at iteration $k$, the 4-tap Daubechies wavelet transform in equations
\ref{eq:d2FwdXform-s} and \ref{eq:d2FwdXform-d} is inverted as follows
to reconstruct the adjacent approximations $s^{k-1}_{i,2j}$ and
$s^{k-1}_{i,2j+1}$ at doubled resolution for the next iteration $k-1$:
\begin{eqnarray}
\label{eq:d2InvXform}
s^{k-1}_{i,2j} &=& h_2\ s^k_{i,j} \ +\ g_2\ d^k_{i,j} \ +\
h_0\ s^k_{i,j+1} \ +\ g_0\ d^k_{i,j+1}\\
s^{k-1}_{i,2j+1} &=& h_3\ s^k_{i,j} \ +\ g_3\ d^k_{i,j} \ +\
h_1\ s^k_{i,j+1} \ +\ g_1\ d^k_{i,j+1},
\end{eqnarray}
where $h_j$ and $g_j$ are the constants from equations
\ref{eq:d2-hCoeffs} and \ref{eq:d2-gCoeffs}.
At the final iteration where $k=0$, the approximation coefficients
correspond to the original signal, so that $s^0_{i,j} \approx y_{i,j}$.
Note that, for the sake of clarity, equation \ref{eq:d2InvXform}
omits the on-the-fly transposition shown in figure \ref{fig:waveletStep}
for the forward transform; in actuality, the implementation swaps the
indices $i, j$ on the RHS.
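As a sketch, one such inverse pass along a single row might look as
follows; the filter constants are the standard Daubechies 4-tap
values, assumed to correspond to equations \ref{eq:d2-hCoeffs} and
\ref{eq:d2-gCoeffs}, while boundary handling and the on-the-fly
transposition are omitted:
\begin{lstlisting}
/* Standard 4-tap Daubechies filter constants, assumed to correspond
   to the h and g coefficients referenced above */
static const double h [4] = {
    0.4829629131445341,  0.8365163037378079,
    0.2241438680420134, -0.1294095225512604
};
static const double g [4] = {
   -0.1294095225512604, -0.2241438680420134,
    0.8365163037378079, -0.4829629131445341
};

/* Sketch of one inverse transform pass along a row: reconstruct
   approximations at doubled resolution from len approximation/detail
   coefficient pairs (s, d); boundary handling and the transposition
   performed by the actual padD2InvStep2() are omitted. */
void d2InvStep (const double *s, const double *d, unsigned len,
                double *out)
{
   unsigned j;

   for (j = 0; j + 1 < len; j++) {
      out [2*j]   = h [2]*s [j] + g [2]*d [j] +
                    h [0]*s [j+1] + g [0]*d [j+1];
      out [2*j+1] = h [3]*s [j] + g [3]*d [j] +
                    h [1]*s [j+1] + g [1]*d [j+1];
   }
}
\end{lstlisting}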
Upon completing the inverse wavelet transform,
\cmd{decodeContribs()} exits and returns the reconstructed binned
contributions, which are subsequently transferred row-by-row via
\cmd{memcpy()} to \cmd{rcontrib}'s linear binned contribution array
for the current modifier. Having accumulated the contributions from all
modifiers, \cmd{getPreCompPhotonContrib()} finally returns and
hands control back to \rcontrib.
\subsection{Decoded Contribution Caching}
The out-of-core photon cache used by the lookup routine
\cmd{pmapooc:OOC\_\-Find\-1\-Pho\-ton()} anticipates the need to
repeatedly evaluate photons for spatially adjacent lookups,
thus reducing I/O latency incurred by paging from disk.
The cache is an instance of struct
\cmd{ooccache:OOC\_Cache} and is organised into a variable number of
entries, where each entry contains a page (a block loaded on demand
from disk) consisting of a fixed number of neighbouring photons.
A standard least-recently-used (LRU) page replacement strategy
governs the preemption of obsolete pages when the cache fills
\cite{Tanenbaum:2014:MOS:2655363}.
To hide the latency involved in decoding the photon's contributions, an
additional, per-modifier precomputed contribution cache (another
instance of struct \cmd{OOC\_Cache}) is interrogated by
\cmd{getPreCompContribByMod()} to determine whether the contributions
have already been paged and decoded.
A dedicated routine \cmd{pmcontribcache:getContribCache()}
accesses the contribution cache using the found photon's index as
caching key, and returns a flag indicating the cached status of the
photon's contributions. This in turn calls the out-of-core caching routine
\cmd{ooccache:OOC\_CacheData()}, which returns a pointer to an array of
binned contributions.
If \cmd{getContribCache()} signals that the decoded contributions are
already cached, they are returned in the array and can be directly
accumulated in \cmd{rcontrib}'s contribution bins, obviating the need
to page and decode these (again).
If the contributions have not yet been cached, a new entry is allocated
with an empty array, which must be populated by the caller.
The contribution cache uses the same LRU page replacement strategy as
the out-of-core photon cache, except that a page now contains the decoded
contributions associated with a single photon; consequently, the page
size of this \cmd{OOC\_Cache} instance is hardwired to 1 (as specified in
number of photons).
This cache instance is lazily initialised by calling a dedicated
routine \cmd{pmcontribcache:initContribCache()}.
If no cached contributions were found for the current photon, they are
paged from disk and decoded as described in the previous sections,
and transferred to the new cache entry.
The contributions must obviously then be
weighted by the incident ray's cumulative contribution \emph{after}
the cache entry has been initialised, since the incident ray is variable.
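The overall flow can be summarised in the following sketch; the
signatures and the helper \cmd{pageAndDecodeContribs()} are
hypothetical simplifications of the routines described above:
\begin{lstlisting}
/* Hypothetical simplified interfaces */
int  getContribCache (unsigned long key, float **contribs);
void pageAndDecodeContribs (unsigned long photonIdx, float *contribs);

/* Sketch of the decoded contribution cache flow for a given photon */
float *cachedContribs (unsigned long photonIdx)
{
   float *contribs;

   /* Interrogate per-modifier cache, keyed by photon index; on a
      miss, a new entry with an empty array is allocated */
   if (!getContribCache(photonIdx, &contribs))
      /* Miss: page mRGBE coefficients from disk, decode, invert
         the wavelet transform, and populate the new cache entry */
      pageAndDecodeContribs(photonIdx, contribs);

   /* Weighting by the incident ray's cumulative contribution happens
      after this call, since the incident ray varies */
   return contribs;
}
\end{lstlisting}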
% ---------------------------------------------------------------------------
\clearpage
\section{Compilation}
The precomputed contribution photon map is based on the out-of-core
photon map data structure, which requires enabling \cmd{PMAP\_OOC} at
compile time:
\begin{center}
\begin{minipage}{0.5\linewidth}
\begin{lstlisting}
rmake OPT+=-DPMAP_OOC
\end{lstlisting}
\end{minipage}
\end{center}
This enables \cmd{PMAP\_CONTRIB} in \cmd{pmapcontrib.h}, which in turn
enables the contribution photon mapping code and the corresponding
options in \mkpmap. Note that this functionality is absent in the
in-core kd-tree data structure. Since the latter is still the default
as of this writing, out-of-core (and therefore contribution) photon
mapping must be explicitly enabled.
In addition, verbose debugging output and sanity checks in the
contribution photon map code can be enabled as follows:
\begin{center}
\begin{minipage}{0.5\linewidth}
\begin{lstlisting}
rmake OPT+="-DPMAP_OOC -DPMAP_CONTRIB_DBG"
\end{lstlisting}
\end{minipage}
\end{center}
These sanity checks are expensive and should be disabled
in production code. Among the tests performed are:
\begin{itemize}
\item Checking for out-of-order wavelet coefficients after calling
\cmd{coeffPartition()} in \cmd{thresholdContribs()}.
\item Encoding of generated contributions in \cmd{encodeContribs()},
including a ``bump'' function similar to the initialisation
options available in the wavelet unit test (see section
\ref{sec:wavelet3-test}).
\item Checking of mRGBE encoded wavelet coefficients in
\cmd{encodeContribs()} by comparing the
decoded coefficient with the original, and triggering a
consistency error if their difference is outside tolerance.
This error is also triggered if the encoded and decoded
incremental coefficient index differs. Furthermore, a
coefficient sign reversal (in any colour channel) triggers
this error too.
\item A 10-second delay in \cmd{preComputeContrib()} after forking
child processes in multiprocessing mode (when passing the
\opt{-n} option to \mkpmap), giving the user time
to attach a debugger to a particular subprocess.
The PIDs of all forked subprocesses are dumped to the console
      for convenience.
\item Checking for duplicate photon selection in
\cmd{preComputeContrib()}; no photon should be selected twice
for precomputation, unless the random photon index selection
is buggy.
\item Tally and output of average mRGBE deviations, similar to
      the mRGBE unit test (see section \ref{sec:mrgbe-test}).
\item Checking for invalid mRGBE-encoded wavelet coefficient indices
in \cmd{decodeContribs()}.
\end{itemize}
Unit tests for the contribution binning, wavelet transform and mRGBE
encoding can also be enabled at compile time. See sections
\ref{sec:pmapcontrib-test}, \ref{sec:wavelet3-test} and
\ref{sec:mrgbe-test}, respectively, for details.
% ---------------------------------------------------------------------------
%\clearpage
\chapter{Results}
\section{Wavelet Transform Colourmaps}
\label{sec:xformCmaps}
\subsection{CIE Sunny Sky}
To illustrate the effects of wavelet compression, a simple CIE sunny
sky model (without solar source) was generated with \cmd{gensky} to
serve as reference. The contributions from this sky source were
collected with \rcClassic{} in 32 $\times$ 32 = 1024 bins using a
Shirley-Chiu mapping, and reshaped into a 2D matrix. A fisheye rendering
of the sky and a colourmap of the binned contributions are shown for
reference in figure \ref{fig:skycontrib1024-cie-orig}. The corresponding
script to bin the contributions is shown in listing
\ref{lst:skycontrib-1024.sh}.
\lstinputlisting[float=h, inputpath={listings},
label=lst:skycontrib-1024.sh,
caption = {Script to bin sky contributions with \rcClassic{} using
Shirley-Chiu mapping. Note the use of the \opt{-c} option to
sample multiple rays per bin and reduce variance.
}
]
{skycontrib-1024.sh}
The 2D contributions were loaded as a datafile into the \cmd{wavelet3-test}
unit test (see section \ref{sec:wavelet3-test}) and transformed using the
padded 2D wavelet transform.
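For illustration, the invocation resembled the following; the synopsis
shown here is assumed, so consult section \ref{sec:wavelet3-test} for
the test's actual arguments:
\begin{center}
\begin{minipage}{0.7\linewidth}
\begin{lstlisting}
# Assumed synopsis: wavelet3-test scdim [threshold [dataFile]]
# 32x32 Shirley-Chiu square; detail coefficients below the
# threshold magnitude are zeroed before reconstruction.
wavelet3-test 32 0.01 skycontrib-2d-1024.dat
\end{lstlisting}
\end{minipage}
\end{center}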
Figure \ref{fig:skycontrib1024-cie-coeffs} shows colourmaps of
the resulting wavelet coefficient matrices after thresholding, alongside
the reconstructed contributions, for a sequence of compression ratios.
The colourmaps encode absolute value as colour, with black corresponding
to zero. Black regions in the wavelet matrices indicate either unused
space to accommodate padding coefficients, or thresholded detail
coefficients. The largest detail coefficients lie in the upper left of
the matrix, towards the coarser resolutions, indicating the signal has
predominantly low-frequency content, as expected from the CIE model.
Accordingly, these coefficients are only thresholded at very high
compression ratios.
The approximation coefficients in the extreme upper left submatrix
(shown in yellow) are largest in magnitude and never thresholded.
In the contributions reconstructed by the inverse wavelet transform,
artefacts become noticeable beyond an 80\% compression ratio.
Due to the low-frequency content of the contributions, the resulting
loss of detail is nonetheless tolerable up to ca. 95\%, at which point
the circumsolar region bleeds into the boundary.
\begin{figure}[p]
\centering
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/ciesky%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-orig-imgplot-crop%
}
\subcaption{%
\label{fig:skycontrib1024-cie-orig}
Original: fisheye rendering (left),
binned contributions (right).
}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-00-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-00-imgplot-crop%
}
\subcaption{Wavelet transform: no compression}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-50-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-50-imgplot-crop%
}
\subcaption{Wavelet transform: 50\% compression}
\end{subfigure}\\
\medskip
\caption{%
Colourmaps of wavelet coefficient matrix (left) and
reconstructed contributions (right) from a CIE sky model
for different compression ratios. Black regions in the
coefficient matrix indicate unused or thresholded coefficients.
The original sky distribution and binned contributions are shown
in \ref{fig:skycontrib1024-cie-orig}.
}
\end{figure}
\begin{figure}[p]\ContinuedFloat
\centering
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-60-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-60-imgplot-crop%
}
\subcaption{60\% Compression}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-70-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-70-imgplot-crop%
}
\subcaption{70\% Compression}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-80-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-80-imgplot-crop%
}
\subcaption{80\% Compression}
\end{subfigure}\\
\medskip
\caption{(continued).}
\end{figure}
\begin{figure}[p]\ContinuedFloat
\centering
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-90-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-90-imgplot-crop%
}
\subcaption{90\% Compression}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-95-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-95-imgplot-crop%
}
\subcaption{95\% Compression (supernova)}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-coeff-98-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-cie/skycontrib-2d-1024-xform-98-imgplot-crop%
}
\subcaption{98\% Compression (FUBAR)}
\end{subfigure}\\
\medskip
\caption{(continued).}
\label{fig:skycontrib1024-cie-coeffs}
\end{figure}
% ---------------------------------------------------------------------------
\subsection{HDR Sky Capture}
The procedure from the previous section was applied to an HDR sky capture
obtained from a camera with fisheye lens, which is shown for reference
along with the corresponding binned contributions in figure
\ref{fig:skycontrib1024-hdr-orig}. Note the capture also includes
vegetation on the horizon. Figure \ref{fig:skycontrib1024-hdr-coeffs}
again shows colourmaps of the thresholded wavelet coefficient matrices
and the resulting reconstructed contributions.
This example exhibits more high-frequency content (notably in the
boundary regions, owing to the vegetation), so
artefacts already become objectionable at 90\% compression and above,
with the sky distribution breaking up completely beyond 95\%.
\begin{figure}[p]
\centering
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/hdrsky%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-orig-imgplot-crop%
}
\subcaption{%
\label{fig:skycontrib1024-hdr-orig}
Original: HDR camera capture (left),
binned contributions (right).
}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-00-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-00-imgplot-crop%
}
\subcaption{Wavelet transform: no compression}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-50-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-50-imgplot-crop%
}
\subcaption{Wavelet transform: 50\% compression}
\end{subfigure}\\
\medskip
\caption{%
Colourmaps of wavelet coefficient matrix (left) and
reconstructed contributions (right) from an HDR sky capture
for different compression ratios. Black regions in the
coefficient matrix indicate unused or thresholded coefficients.
The original sky distribution and binned contributions are shown
in \ref{fig:skycontrib1024-hdr-orig}.
}
\end{figure}
\begin{figure}[p]\ContinuedFloat
\centering
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-60-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-60-imgplot-crop%
}
\subcaption{60\% Compression}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-70-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-70-imgplot-crop%
}
\subcaption{70\% Compression}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-80-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-80-imgplot-crop%
}
\subcaption{80\% Compression}
\end{subfigure}\\
\medskip
\caption{(continued).}
\end{figure}
\begin{figure}[p]\ContinuedFloat
\centering
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-90-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-90-imgplot-crop%
}
\subcaption{90\% Compression (supernova)}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-95-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-95-imgplot-crop%
}
\subcaption{95\% Compression (FUBAR)}
\end{subfigure}\\
\medskip
\begin{subfigure}{\textwidth}
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-coeff-98-imgplot-crop%
}
\hfill
\includegraphics[height=0.28\textheight]{%
skycontrib-hdr/skycontrib-2d-1024-xform-98-imgplot-crop%
}
\subcaption{98\% Compression (black hole?)}
\end{subfigure}\\
\medskip
\caption{(continued).}
\label{fig:skycontrib1024-hdr-coeffs}
\end{figure}
% ---------------------------------------------------------------------------
\section{Binned Contribution Renderings}
\label{sec:cpmapTestHDR}
Figure \ref{fig:cpmapTestHDR-rc} shows a set of per-bin fisheye
falsecolour renderings generated by \rcClassic{} for a
simple,
bilaterally lit scene. The scene contains two opposing fenestrations with
cyan tint, and 3970 suns uniformly distributed in the sky dome. The
sensor is located at the centre of the space, at floor height.
The number of ambient bounces was set to 4 (\opt{-ab 4}) to obtain
a reasonable indirect illuminance estimate. Using 256$\times$256 primary
rays per bin, the running time on an Intel Xeon E5-2660 @@ 2.60GHz
with 20 cores was ca. 1 hour. The shell script%
\footnote{
Yeah, it's \cmd{(t)csh}; we're too old-school for \cmd{bash}.
}
used to generate the renderings is shown in
listing~\ref{lst:cpmapTestHDR-rc}.
\lstinputlisting[float=h, label=lst:cpmapTestHDR-rc, inputpath={listings},
caption = {Script to generate the per-bin fisheye renderings in
figure \ref{fig:cpmapTestHDR-rc} with \rcClassic.
}
]
{rc-hdr.sh}
Figure \ref{fig:cpmapTestHDR-pmap} shows the same scene rendered with
\rcontrib{} using ca. 64000 precomputed contribution photons, whose
contributions were wavelet compressed by 80\%. The contributions were
rendered without ambient bounce (\opt{-ab -1}) and instead evaluated
directly via photon lookups at each pixel's world coordinates.
Despite the visible noise due to direct photon visualisation, there is
overall good agreement with figure \ref{fig:cpmapTestHDR-rc}, except for
the leftmost column, which corresponds to bin numbers that are even
multiples of 16. These bins lie on the boundary of the wavelet transform
domain and are therefore subject to compression artefacts, as discussed
in the next section.
While \mkpmap{} took ca. 13.5 minutes to precompute the photon map on
the same CPU, rendering with \rcontrib{} took only 14 seconds.
\lstinputlisting[float=h, label=lst:cpmapTestHDR-pmap, inputpath={listings},
caption = {Script to generate the per-bin fisheye renderings in
figure \ref{fig:cpmapTestHDR-pmap} with \rcontrib{} using
precomputed contribution photon mapping.
}
]
{cpmapTest-hdr.sh}
\begin{figure}[h]
\centering
\includegraphics[width=\linewidth]{%
rc-bn64-ab4-compos-false%
}
\parbox{\linewidth}{%
\caption{%
\label{fig:cpmapTestHDR-rc}
Falsecolour rendered contributions from \rcClassic{} for
each of 64 bins in the bilaterally lit test scene containing
3970 solar positions.
The sensor is located at the centre of the space on the floor,
facing the zenith. The glazings on either side of the space
are tinted cyan.
This sequence was rendered with up to 4 ambient
        bounces (\opt{-ab 4}). On 20 cores,
the renderings took ca. 1 hour to complete at
256$\times$256 pixels per bin.
}
}
\end{figure}
\begin{figure}[h]
\centering
\includegraphics[width=\linewidth]{%
cpmapTest-64m-bn64-2400-comp0.8-reorient-compos-false%
}
\parbox{\linewidth}{%
\caption{%
\label{fig:cpmapTestHDR-pmap}
The bilaterally lit scene from figure \ref{fig:cpmapTestHDR-rc}
rendered with precomputed contribution photons
after 80\% wavelet compression.
This sequence was rendered with \opt{-ab -1};
        consequently, no ambient bounces are performed. This greatly
accelerates computation at the expense of some noise: on 20 cores,
the entire sequence took just 14 seconds to complete at
256$\times$256 pixels per bin.
}
}
\end{figure}
\section{Wavelet Compression and Boundary Artefacts}
Figure \ref{fig:contribCompression} shows a series of 3D plots of
the contributions in the bilaterally lit scene for different
wavelet compression ratios. This example uses a first-order gradient
boundary extension and logarithmic contribution encoding.
While noise in the photon density estimate dominates at lower compression
ratios, boundary artefacts take over at higher ones. This
suggests the transform is very sensitive to thresholded padding
coefficients, and offers an explanation for the deviations in the
leftmost column in figure \ref{fig:cpmapTestHDR-pmap}.
\begin{figure}[p]
\centering
\begin{minipage}{0.49\linewidth}
\centering
\includegraphics[width=\linewidth]{%
cpmapTest-64m-bn256-3200-comp0.5%
}\\
Compression 0.5
\end{minipage}
\hfill
\begin{minipage}{0.49\linewidth}
\centering
\includegraphics[width=\linewidth]{%
cpmapTest-64m-bn256-3200-comp0.6%
}\\
Compression 0.6
\end{minipage}\\[3mm]
\begin{minipage}{0.49\linewidth}
\centering
\includegraphics[width=\linewidth]{%
cpmapTest-64m-bn256-3200-comp0.7%
}\\
Compression 0.7
\end{minipage}
\hfill
\begin{minipage}{0.49\linewidth}
\centering
\includegraphics[width=\linewidth]{%
cpmapTest-64m-bn256-3200-comp0.8%
}\\
Compression 0.8
\end{minipage}\\[3mm]
\centering
\begin{minipage}{0.49\linewidth}
\centering
\includegraphics[width=\linewidth]{%
cpmapTest-64m-bn256-3200-comp0.9%
}\\
Compression 0.9
\end{minipage}\\[3mm]
\parbox{\linewidth}{%
\caption{%
\label{fig:contribCompression}
3D plots of contributions calculated by \rcClassic{}
            (blue) and the photon map (red) in the bilaterally lit scene.
The contributions are accumulated in 256 bins for a
sensor located in the centre of the floor.
The contribution photon map was compressed with ratios
of 0.5 to 0.9. In this example, compression artefacts
appear at the boundaries (corresponding to the horizon)
with 80\% compression and above. This is a known limitation
of wavelet compression if the input signal exhibits high
gradients at the boundary.
}
}
\end{figure}
% ---------------------------------------------------------------------------
%\clearpage
\setcounter{secnumdepth}{-2}
\chapter{Acknowledgements}
\setcounter{secnumdepth}{2}
This research was supported by the Swiss National Science Foundation
as part of the project ``Light Fields for Spatio-Temporal Glare
Assessment'' (\#179067).
The author would like to thank his colleagues Dr. Lars Grobe and
Dr. Stephen Wasilewski for their collaboration, for providing feedback
during weekly lunchtime meetings, and for testing the code on macOS.
The author would also like to express special thanks to the head of
the former Competence Centre for Envelopes and Solar Energy (CC EASE),
Prof. Stephen Wittkopf.
%, who gave our crew a sleek
%vessel, until it was torpedoed by the powers that be in 2016.
Finally, the author would like to thank his colleagues at
LBNL and Ladybug Tools for supporting the release of the new
photon mapping code through automated testing on supported platforms,
notably Mingbo Peng and Taoning Wang. And of course, the author's
biggest thanks go out to Greg Ward for his enduring support and
service to the \radiance{}
{\fontfamily{jkpvos}\selectfont\textit{Publick}}
... uh, community.
{\fontfamily{jkpvos}\selectfont\Large\vspace{1cm}\noindent
\textit{\textbf{The crew of the former CC EASE bids= you Farewell
for now, and thanks= the RADIANCE Publick for their Support.
}
}
}
% ---------------------------------------------------------------------------
%\clearpage
\bibliographystyle{alpha}
\bibliography{precomp-contrib-pmap-techreport}
% ---------------------------------------------------------------------------
%\clearpage
\appendix
\chapter{Source Code Definitions}
\lstinputlisting[float=h, label=lst:codeDefs1, inputpath={listings},
caption = {Source code definitions in \cmd{pmapcontrib.h}
relevant to precomputed contribution photons.
}
]
{pmapcontrib.h}
\lstinputlisting[float=h, label=lst:codeDefs2, inputpath={listings},
caption = {Source code definitions in \cmd{wavelet2.h} and
\cmd{mrgbe.h} relevant to precomputed contribution photons.
}
]
{wavelet2-mrgbe.h}
\lstinputlisting[float=h, label=lst:codeDefs3, inputpath={listings},
caption = {Source code definitions in \cmd{pmapdata.h} relevant to
precomputed contribution photons.
}
]
{pmapdata.h}
% ---------------------------------------------------------------------------
\chapter{Software Architecture}
\label{sec:swarch}
Figure \ref{fig:swarch-mkpmap} provides an overview of the precomputed
contribution photon map's software architecture in the context of
photon map generation and precomputation with \cmd{mkpmap}.
Figure \ref{fig:swarch-rcontrib} gives a similar overview
in the context of the photon map's evaluation with \cmd{rcontrib}.
The blue blocks group relevant functions (red text) within software
modules built from C source files with the indicated names.
Red arrows correspond to function calls, which are ordered top-down.
Note that some routines, notably
\cmd{savePhotonMap()} and \cmd{loadPhotonMap()}, recurse on a parent
photon map's per-modifier child photon maps; hence the arrows indicate
a loop.
The variables \cmd{pmapcontrib:pmapContribTab} and
\cmd{pmapcontrib:pmapContribMode} are a special case; they are set by
\cmd{pmcontrib4:initPmapContribTab()} in the context of \rcontrib.
These serve as references (pointers) to \rcontrib's contribution table
\cmd{rcontrib:modconttab} and coefficient/contribution mode flag
\cmd{rcontrib:contrib}, but are unused (i.e. NULL) in the context of
\mkpmap, where no such variables exist.
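This indirection can be pictured as follows; the parameter types are
assumed for illustration and do not reproduce the actual declarations:
\begin{lstlisting}
/* Sketch of the indirection described above (types assumed).
   Under rcontrib these point to its contribution lookup table and
   coefficient/contribution mode flag; under mkpmap they stay NULL. */
LUTAB *pmapContribTab  = NULL;      /* --> rcontrib:modconttab */
int   *pmapContribMode = NULL;      /* --> rcontrib:contrib */

void initPmapContribTab (LUTAB *contribTab, int *contribMode)
{
   pmapContribTab  = contribTab;
   pmapContribMode = contribMode;
}
\end{lstlisting}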
\begin{sidewaysfigure}
\centering
\includegraphics[width=0.95\textwidth]{contrib-swarch-mkpmap2-crop}
\caption{
\label{fig:swarch-mkpmap}
Software architecture of precomputed contribution photon map in
the context of \cmd{mkpmap}. Red arrows represent function calls,
ordered top-down.
}
\end{sidewaysfigure}
\begin{sidewaysfigure}
\centering
\includegraphics[width=0.95\linewidth]{contrib-swarch-rcontrib-crop}
\caption{
\label{fig:swarch-rcontrib}
Software architecture of precomputed contribution photon map in
the context of \cmd{rcontrib}. Red arrows represent function calls,
ordered top-down.
}
\end{sidewaysfigure}
\end{document}
@
1.12
log
@Updated affiliation for submission to SNF
@
text
@d51 2
a52 2
\RCS $Revision: 1.11 $
\RCS $Date: 2022/07/24 13:01:08 $
d74 1
a74 1
FG Envelopes and Solar Energy\\
@
1.11
log
@Revised back-of-the-envelope calculation in section 2.2 to clarify that
the 10M precomputed photons are infact a realistic number of a _complex_
environment, not a simple one. This was initially confused with the
number of photons overall prior to precomputation.
@
text
@d32 1
a32 1
breaklines=true, breakatwhitespace=true, showstringspaces=false,
d51 2
a52 2
\RCS $Revision: 1.10 $
\RCS $Date: 2022/07/14 18:35:35 $
d61 1
a61 1
and desperate Actions=, \&c, undertaken by the shipwreck'd Crew
d63 1
a63 1
in which is= describ'd a most fantastick Design,
d73 3
a75 2
Roland Schregle (roland.schregle@@gmail.com)\\
RS SciComp
d78 1
a78 1
Revision \RCSRevision\\
d85 1
a85 1
d88 1
a88 1
and rendering software suite is useful for calculating daylight
d92 1
a92 1
(``patches'') and sun positions. This can be a slow process as
d94 4
a97 4
being traced for neighbouring sensor positions.
\radiance{} includes a contribution photon mapping module to bin
contributions by tracing rays from the light sources (forward
d100 1
a100 1
mapping is particularly efficient to simulate redirection through
d103 8
a110 8
This technical report documents the further development of the
\radiance{} contribution photon map to support precomputed
contributions to reduce redundancy in \rcontrib.
To this end, the contribution photon map utilises a wavelet
compression and efficient coefficient encoding to compactly
represent the large volume of data the contributions incur.
The encoded coefficients are then paged on-demand from disk
d113 1
a113 1
This document focuses on the implementation details of
d116 1
a116 1
d119 1
a119 1
d123 2
a124 2
\chapter{Introduction}
d126 1
a126 1
d135 3
a137 3
daylight availablity on work planes, for example.
In a static scene, these daylight coefficients can be subsequently
d148 1
a148 1
d157 1
a157 1
d160 1
a160 1
official \radiance{} software distribution
d162 2
a163 2
compute daylight coefficients for daylight redirecting components with
predominantly specular reflection or transmission.
d170 1
a170 1
map that maintains its photons entirely on disk, and
d174 1
a174 1
those photons which actually contribute to the sensor points under
d176 1
a176 1
large photon maps can be efficiently performed on commodity office PCs.
d178 1
a178 1
d180 1
a180 1
\radiance{} photon map manual \cite{schregle-pmapManual-2022}
d184 2
a185 2
d187 1
a187 1
d189 1
a189 1
d195 2
a196 2
sensor positions (often arranged in a grid).
d203 1
a203 1
Overview of contribution photon mapping workflow.
d206 1
a206 1
modifier mod are binned using an
d208 3
a210 3
Shirley-Chiu disk-to-square mapping, and wavelet compressed
for a fraction \var{precomp} of photons by \mkpmap.
These contributions are saved along with the corresponding
d213 2
a214 2
photons and their precomputed contributions are subsequently
paged on demand, uncompressed, and cached by \rcontrib,
d220 1
a220 1
d241 1
a241 1
d254 1
a254 1
(as specified with the \opt{-bn} option) using a Shirley-Chiu
d268 1
a268 1
thresholding, i.e. keeping only the (\var{comp})\% most
d271 1
a271 1
\item Compact encoding of thresholded wavelet coefficients
d280 1
a280 1
d284 2
a285 2
\item A \emph{parent} photon map file \var{pmapfile} as
specified with the \opt{-apC} option.
d294 1
a294 1
\item Per-modifier \emph{child} photon maps \var{mod_i.pm}
d296 1
a296 1
option. These use the existing out-of-core photon map format
d304 1
a304 1
d313 1
a313 1
compressed wavelet coefficients, which are located in the
d315 1
a315 1
\item mRGBE decoding of the wavelet coefficients to floating
d336 2
a337 2
Each stage and its constituents is detailed in section
d339 1
a339 1
d342 1
a342 1
d345 1
a345 1
d347 1
a347 1
d349 1
a349 1
were borne out by the author's collective experience with wavelets.
d351 1
a351 1
Radiosity held by Prof. H.P. Seidel hosted by the University of Bonn in
d354 1
a354 1
computer graphics, notably the development of the lifting scheme
d356 2
a357 2
spherical topology \cite{schroeder-sphWavelets-1995}.
d359 2
a360 2
wavelets in developing a 4D BSDF compression scheme as part of an
``adjunct project'' (read: digression) to the development of the
d366 1
a366 1
\footnote{Extremely short-lived infact, since this was the only
d368 3
a370 3
}
lecture series, \lit{Secret Weapons of RADIANCE:
Stuff That Never Took Off} \cite{schregle-bsdfComp-2011}.
d374 1
a374 1
d378 1
a378 1
This was chiefly due to their use by his very talented
d380 1
a380 1
luminance distributions to represent a lightfield.
d384 1
a384 1
d386 1
a386 1
to extend the existing contribution photon map with precomputed
d391 1
a391 1
d394 2
a395 2
choice expletives documenting frustrating dead ends for a dose of realism
-- in a separate proposal document
d397 1
a397 1
d399 1
a399 1
contributions, a number of existing -- and some \emph{very} old -- bugs
d405 3
a407 3
% ---------------------------------------------------------------------------
d414 3
a416 3
photons (a realistic number for a complex simulated environment) with
32-bit RGBE-encoded contributions in 2305~bins (corresponding to a
Reinhart~MF:4 mapping \cite{bourgeoisReinhart-2008}) would occupy
d418 1
a418 1
Note this calculation omits the storage occupied by the photons
d422 1
a422 1
d427 1
a427 1
d430 1
a430 1
(\cmd{wavelet\_test.py}) using Filip Wasilewski's excellent
d433 6
a438 6
what is essentially a 3D domain (hemisphere of incident directions)
which, due to dimensional redundancy, can be reduced to a 2D
domain ($\theta, \phi$), and in turn serialised to a 1D domain,
as is the case with the popular Reinhart mapping.
This is inherently a compromise, since the original topology,
and therefore any multidimensional correlation in the contributions,
d440 1
a440 1
d442 1
a442 1
a uniform sky in a bilaterally lit test scene with skylight were
d446 1
a446 1
a lossy compression, effectively setting the thresholded coefficients
d451 1
a451 1
The initial results revealed that the type of wavelet used (and
d454 1
a454 1
In their original order, localised peaks in the linearised
d461 1
a461 1
pronounced peaks, and major artefacts without sorting (see figure
d463 1
a463 1
d478 1
a478 1
exhibit isolated discontinuities (peaks) which are poorly
d482 1
a482 1
imposes corellation and significantly reduces these artefacts.
d485 1
a485 1
and doesn't decorellate the contributions in their original
d495 1
a495 1
as this requires storing the original bin order.
d499 1
a499 1
favour of a more elaborate 2D mapping of the hemisphere, and a wavelet
d501 1
a501 1
d506 1
a506 1
consequently, a detailed analysis of the \cmd{pywt} C source code was
d508 1
a508 1
extension modes offered by the module are implemented, notably in a
d511 1
a511 1
increasing the number of coefficients beyond the number of bins.
d519 2
a520 2
d523 1
a523 1
d525 1
a525 1
coefficients generated by the test script using \cmd{pywt} indicated
d527 2
a528 2
figure~\ref{fig:coeffRange}).
d541 1
a541 1
of magnitude. It is also evident that the coefficients at the
d552 2
a553 2
By clamping the maximum absolute
d555 1
a555 1
3-tuples to a normalised range and encode these as integer RGB
d572 2
a573 2
Actually, it can stand for \{micro, modified, minimalist\} RGBE;
take your pick.
d578 3
a580 3
number of bits allocated to the RGB mantissae and the exponent, thus
inherently reducing
the precision compared to \radiance's native RGBE, which allocates
d595 2
a596 2
\ref{fig:coeffRange} (top, detail inset at bottom) with
encoded/decoded mRGBE coefficients superimposed as blue circles.
d600 1
a600 1
within the constrained dynamic range $[2^{-31}, 1]$ covered by
d609 1
a609 1
encoding given the previous practical sample data. Figure
d611 1
a611 1
5 bits per RGB mantissa (including 1 bit each for the sign) and the
d619 1
a619 1
investigated and quantified in the unit test of the C module
d628 1
a628 1
d636 2
a637 2
\item Compression in 2D via a computationally lightweight 4-tap
(having a support of 4 samples) wavelet transform, the Daubechies
d646 1
a646 1
photon lookup routines, out-of-core caching routines, and
d650 1
a650 1
their indices in a 32-bit envelope in a separate wavelet
d660 2
a661 2
with on-demand paging in \rcontrib,
as already done by the standard out-of-core photon map.
d669 1
a669 1
% ---------------------------------------------------------------------------
d672 1
a672 1
d675 1
a675 1
d685 3
a687 3
The exposition also frequently refers to data types defined in the source
code, which are summarised in listings \ref{lst:codeDefs1},
a688 1
d690 4
a693 3
% ---------------------------------------------------------------------------
d696 1
a696 1
d702 2
a703 2
d705 1
a705 1
d710 2
a711 2
(approximately) the same number of photons
(i.e. $nemit \approx nphotons \: /\, nsrc$, where $nsrc$ is the number of
d716 1
a716 1
\mkpmap).
d722 2
a723 2
in proportion to the stored photons vs. the total target count.
d726 1
a726 1
the resulting flux per photon must be adjusted individually for
d728 2
a729 2
d731 2
a732 2
d735 1
a735 1
d740 1
a740 1
\cmd{pmapcontrib:ray2bin()} via \cmd{pmapcontrib:contribSourceBin()},
d744 2
a745 2
vector.
d753 2
a754 2
contribution photon directions. This mapping has the
desirable property of preserving adjacency and fractional area.
d756 1
a756 1
hemisphere of photon directions onto the plane defined by the
d763 1
a763 1
d765 2
a766 2
onto the disk plane as defined by its normal, $[rN_x,\: rN_y,\: rN_z]$
(see below). The mapped square coordinates $[sc_x,\: sc_y]$
d768 1
a768 1
\begin{equation}
d772 1
a772 1
where $scdim = \left\lfloor\sqrt{nbins}\right\rfloor$ is the dimension
d778 1
a778 1
lies in the plane or is incident from the back, the bin is set to
d782 2
a783 2
\cmd{ray2bin()} evaluates the following variables to reorient the
d788 1
a788 1
\item{$rNx,\: rNy,\: rNz$}: disk plane surface normal
d790 1
a790 1
\item{$Ux,\: Uy,\: Uz$}: up vector (defines $\phi = 0$,
d793 2
a794 2
The contribution source is set speculatively for each emitted photon,
d799 2
a800 2
discarded and overwritten by the next emitted photon.
d802 1
a802 1
source field \cmd{Pho\-ton.\-aux.\-con\-trib\-Src} is set from
d817 1
a817 1
d820 1
a820 1
d824 2
a825 2
precomputation, thus preserving the relative distribution of photons in
the scene. The entire precomputation is wrapped by the routine
d829 1
a829 1
are collected by function
d834 1
a834 1
among those whose contributions are sought, it is passed to the
d839 1
a839 1
closest photons. It also sets a filter of instance
d846 1
a846 1
emitted from a given light source modifier. This ensures the
d849 1
a849 1
d853 1
a853 1
the preassigned bins indicated by each photon's contribution source
d859 1
a859 1
d868 1
a868 1
d870 1
a870 1
\cmd{PhotonMap.preCompContrib} of type \cmd{PreComputedContrib} in
d875 2
a876 2
RGB wavelet coefficients in each colour channel.
In addition, the wavelet transform requires a second, transposed
d879 1
a879 1
Note that these data structures are allocated and dimensioned separately
d885 1
a885 1
compression ratio \var{comp}.
d888 1
a888 1
d891 2
a892 2
d895 1
a895 1
d897 1
a897 1
mostly opaque for non-ma\-the\-ma\-ti\-ci\-ans and even computer
d901 1
a901 1
referred to \cite{Graps:1995} for an excellent introduction with
d903 1
a903 1
d907 2
a908 2
translated along the signal's propagation axis.
These functions are referred to as the actual wavelets, though in
d910 3
a912 3
squiggly thing in figure \ref{fig:waveletFunc} for an example).
Unlike Fourier
basis functions, wavelets have a finite support, i.e. a defined
d915 2
a916 2
(the scaling terms) at different dilations and translations.
The dilations and translations can be considered sliding windows
d920 1
a920 1
d924 1
a924 1
the latter).
d929 1
a929 1
d942 1
a942 1
\cmd{wavelet2:waveletXform2()}, which only operates on input
d950 1
a950 1
wavelet transform wrapper routine \cmd{wavelet3:padWaveletXform2()}.
d956 1
a956 1
d959 1
a959 1
matrix, which is tantamount to a vertical pass over the original matrix.
d961 1
a961 1
orientation is obtained, and the matrices swapped a second time.
d965 1
a965 1
d984 1
a984 1
d991 1
a991 1
\caption{\label{fig:waveletStep}
d993 1
a993 1
showing the first two resolution levels of the 2D wavelet
d996 3
a998 3
vertical axes. The output of each transform step
(right of arrows)
becomes the input for the next step (left of arrows).
d1019 1
a1019 1
d1022 1
a1022 1
original input signal, corresponding to a lower frequency band. This
d1028 2
a1029 2
along the horizontal and vertical axes.
d1043 1
a1043 1
approximations of prior details $sd$ (lower left),
d1050 1
a1050 1
as input. After the final iteration, the red submatrix in the
d1069 1
a1069 1
function, and $j$ is its position along the input signal's transform axis.
d1072 2
a1073 2
sample positions $j$. Larger supports decorrelate over more samples and
therefore offer improved compression (smaller coefficients), but at
d1075 1
a1075 1
d1078 1
a1078 1
column) into a pair of approximation coefficients $s^k_{i,j}$ and
d1081 1
a1081 1
The set of approximation coefficients $s^k$ always represent the
d1089 1
a1089 1
s^k_{i,j} &=& h_0\ s^{k-1}_{i,2j} \ +\ h_1\ s^{k-1}_{i,2j+1}\ +\
d1092 1
a1092 1
d^k_{i,j} &=& g_0\ s^{k-1}_{i,2j} \ +\ g_1\ s^{k-1}_{i,2j+1}\ +\
d1113 1
a1113 1
d1115 1
a1115 1
equations \ref{eq:d2FwdXform-s} and \ref{eq:d2FwdXform-d};
d1119 1
a1119 1
Note also that, for the sake of clarity, equations \ref{eq:d2FwdXform-s}
d1122 1
a1122 1
the actual implementation swaps the indices $i, j$ on the LHS
d1124 1
a1124 1
d1126 2
a1127 2
contain RGB radiometric data. Consequently, the above decomposition
is extended to the three colour channels. Thus the coefficients
d1148 1
a1148 1
d1150 2
a1151 2
infinite and life is simple. In \realLife, everything has a beginning
and an end, including life itself, and incidentally, the input to a
d1153 1
a1153 1
\footnote{... which inevitably raises that perpetual philosophical
d1161 2
a1162 2
If the input size is a power of two, the signal can simply be wrapped
d1166 1
a1166 1
\cmd{wavelet2:d4Step2()} only handles input sizes of powers of 2,
d1169 1
a1169 1
obviously
d1172 1
a1172 1
precomputed contribution photon map, this function is available for
d1177 3
a1179 3
for many applications, including precomputed contributions, which then
warrants a boundary treatment of the input signal.
d1192 1
a1192 1
d1194 1
a1194 1
the input signal beyond the left and right boundaries. The wavelet
d1197 1
a1197 1
\ref{fig:boundaryExt} for examples):
d1204 1
a1204 1
\item[WAVELET\_EXTEND\_GRAD1:] the input signal is linearly extrapolated
d1206 1
a1206 1
\item[WAVELET\_EXTEND\_GRAD2:] the input signal is linearly extrapolated
d1209 1
a1209 1
(i.e. periodic) and wrapped around at either boundary. This may
d1235 1
a1235 1
d1243 3
a1245 3
% ---------------------------------------------------------------------------
d1247 1
a1247 1
d1252 1
a1252 1
input signal during the inverse wavelet transform.
d1257 1
a1257 1
d1266 1
a1266 1
d1276 1
a1276 1
it accordingly in order to accommodate the latter.
d1282 1
a1282 1
plan on porting this to your ZX81 or KIM-1, in which case I wish you
d1285 1
a1285 1
d1295 1
a1295 1
d1301 1
a1301 1
d1308 1
a1308 1
lower right subquadrants will be the smallest coefficients, and therefore
d1310 1
a1310 1
d1339 1
a1339 1
d1341 1
a1341 1
Thresholding lies at the heart of wavelet compression, which is why a
d1352 1
a1352 1
d1363 1
a1363 1
size is fixed and known beforehand). This strategy was therefore chosen
d1366 1
a1366 1
d1378 3
a1380 3
This transfers all non-zero \emph{detail} coefficients (i.e. omitting
the unoccupied space in the wavelet coefficient matrix left over by
padding) to a thresholding buffer, consisting of an array of
d1382 2
a1383 2
a pointer to the corresponding detail coefficient in the wavelet
coefficient matrix, and its linearised 2D matrix index, using the
d1386 2
a1387 2
after thresholding.
d1393 3
a1395 3
you'd expect, guv), the thresholding buffer will contain fewer
coefficients than expected. In this case, the remaining buffer is
simply filled with as many duplicates of a zero coefficient as
d1397 1
a1397 1
chosen, since it is guaranteed to be unoccupied
d1400 1
a1400 1
d1402 1
a1402 1
in the thresholding buffer so that all coefficients at positions
d1406 3
a1408 3
\cmd{pmap\-cont\-rib:\-co\-eff\-Par\-tit\-ion()}, which swaps
out-of-order buffer entries, much like quicksort. Unlike quicksort,
however, the coefficients within each partition need not be sorted,
d1413 1
a1413 1
The coefficient magnitude is evaluated as dot product over RGB,
d1415 1
a1415 1
d1417 3
a1419 3
only the most significant are kept, i.e. those in the partition
$[0,\ l(1-comp/100)-1]$. These are subsequently
sorted by their coefficient indices using \cmd{qsort()} from the
d1425 1
a1425 1
d1428 1
a1428 1
d1432 2
a1433 2
Once the wavelet coefficients have been thresholded, they are encoded
d1438 1
a1438 1
is based on the same assumptions.
d1443 1
a1443 1
The default bit field configuration is shown in figure
d1448 1
a1448 1
d1454 2
a1455 2
Structure of 32-bit mRGBE encoding for wavelet coefficients.
The encoding consists of three mantissae per RGB colour channel.
d1460 2
a1461 2
payload data range. The default configuration,
\cmd{MANTBITS} = 6, \cmd{EXPBITS} = 5, \cmd{DATABITS} = 9
d1467 1
a1467 1
d1472 1
a1472 1
initialisation function \cmd{mrgbe:mRGBEinit()}.
d1474 1
a1474 1
normalisation factor $d_{norm,i}$, which is returned in
d1483 1
a1483 1
d1485 1
a1485 1
\cmd{mRGBEencode()} for each RGB wavelet detail coefficient
d1493 1
a1493 1
m_i &=& \mathrm{sgn}\,\left(d_i\right)
d1509 1
a1509 1
Note that each coefficient $d_i$ is offset by its corresponding mRGBE
d1513 1
a1513 1
d1517 1
a1517 1
the absolute value $\lvert x \rvert$ is stored in the mRGBE
d1519 1
a1519 1
negative exponent.
d1524 2
a1525 2
encoding range accommodated by \cmd{MANTBITS}.
This offset encodes the mantissa's sign, with all values below
d1527 2
a1528 2
\cmd{mRGBEencode()} accepts each wavelet coefficient's linear index as
d1530 2
a1531 2
were previously sorted with respect to their indices.
Consequently, the coefficient index is \emph{incrementally} encoded,
d1533 2
a1534 2
0 (hence only the first coefficient index is absolute).
This incremental index encoding generally requires fewer bits to encode
d1537 2
a1538 2
However, it is important to realise that the likelihood of overflowing the
d1545 1
a1545 1
and toss in the towel (!), aborting the contribution precomputation with
d1547 1
a1547 1
This is far from optimal, and an issue that
d1554 1
a1554 1
d1560 1
a1560 1
accessed as a scalar integer value \cmd{mRGBE.all} via the union
d1562 2
a1563 2
Each such 32-bit mRGBE-encoded wavelet coefficient is appended to a
d1568 2
a1569 2
\cmd{encodeContribs()}, at which point the routine returns.
d1581 2
a1582 2
thereby minimising the decoding error. Since the approximation
coefficients can (rather suprisingly) be negative, but 32-bit RGBE
d1585 1
a1585 1
(see macro \cmd{PMAP\_CONTRIB\_SET\_RGBE32\_SGN()} in
d1587 1
a1587 1
d1596 2
a1597 2
The RGBE encoded approximation coefficients and mRGBE range, as well as
d1599 3
a1601 3
\cmd{newPhoton()} and (if not NULL) accumulated in a contribution
buffer \cmd{PhotonMap.contribHeapBuf}. This is analogous to the
accumulation of photons in the heap buffer \cmd{PhotonMap.heapBuf}
d1610 1
a1610 1
d1614 1
a1614 1
photon map. Iterating over each child photon map, it calls the
d1626 2
a1627 2
The top-level build routine for per-modifier child photon maps,
d1631 3
a1633 3
its wavelet coefficient file by appending the corresponding modifier
name, as specified by the \opt{-m} option.
In addition, the routine recursively deletes any previous files
d1642 2
a1643 2
Each child photon map is stored on disk in a subdirectory
d1645 1
a1645 1
\lit{<pmapfile>}. The photons themselves are stored out-of-core in a
d1648 2
a1649 2
(see also figure \ref{fig:overview}).
d1651 1
a1651 1
relevant to \rcontrib, i.e all instances of the options \opt{-bn},
d1656 1
a1656 1
file, which is functionally equivalent to the internal contribution
d1660 4
a1663 4
Instances of these option strings are appended to a cumulative string by
\mkpmap{} as it parses the command line, which is then passed to the main
photon distribution routine \cmd{pmcontrib2:distribPhotonContrib}. The
latter assigns the option string to the generated photon map's
d1665 6
a1670 6
The option file is intended to be passed to \rcontrib{} via the \opt{@@}
option to ensure the binning parameters are consistent with those in the
photon map (the technical reasons why this isn't handled automatically
are detailed in section \ref{sec:load}).
Once the output filenames have been initialised,
d1683 7
a1689 7
\cite{schregle-techreport-2016}. The photons are then saved to the
out-of-core octree leaf file \lit{<pmapfile>.rc/<mod>.leaf} before
the out-of-core octree structure to index the leaves is built by
calling \cmd{oocbuild:OOC\_Build()}.
In the case of a contribution photon map, an extended out-of-core
sorting function \cmd{pmcontribsort:contribPhotonSort()} is called,
d1691 3
a1693 3
operations to access intermediate files.
Besides sorting the photons in their consolidated heap file, the
adapted routine also sorts the corresponding wavelet coefficients in
d1696 1
a1696 1
each other, since they were synchronously flushed to disk by
d1699 1
a1699 1
correspondence between photons and their wavelet coefficients.
d1701 1
a1701 1
set of coefficients is too, which greatly simplifies this step of the
d1703 1
a1703 1
d1710 1
a1710 1
\opt{-n} option is passed to \mkpmap). The sorted blocks are
d1719 1
a1719 1
d1721 1
a1721 1
\cmd{pm\-cont\-rib3:save\-Cont\-rib\-Pho\-ton\-Map()}, which is called
d1726 1
a1726 1
for later use with \rcontrib. The routine subsequently iterates over the
d1728 1
a1728 1
using the standard
d1737 1
a1737 1
\cmd{savePhotonMap()} will \emph{not} wind up in a recursive loop, as
d1742 3
a1744 3
To this end, \cmd{save\-Photon\-Map()} was modified to generate
contribution-specific info in the file header (number of coefficients,
compression rate, etc), and
d1748 1
a1748 1
d1750 1
a1750 1
the precomputation is concluded and \mkpmap{} cleans up and terminates.
d1752 1
a1752 1
d1759 1
a1759 1
d1761 1
a1761 1
wavelet transform via the \cmd{PMAP\_\-CON\-TRIB\_\-LOG} compile-time
d1774 1
a1774 1
d1779 1
a1779 1
extension mode, such as \cmd{WAVELET\_EXTEND\_CONST}. A further downside
d1789 1
a1789 1
d1794 1
a1794 1
sparsely populated bins (see figure \ref{fig:cpmapTest-emptyBinsHDR}).
d1797 1
a1797 1
d1805 3
a1807 3
(right, shown for bin 63), an average of 90\% of bins contain
nonzero contributions.
With 256 bins (left, shown for bin 255), this ratio drops to
d1814 1
a1814 1
d1816 1
a1816 1
can be obtained by compiling with the
d1822 2
a1823 2
bins rises above ca. 25\%.
d1839 1
a1839 1
d1858 1
a1858 1
d1864 1
a1864 1
d1869 1
a1869 1
d1880 1
a1880 1
\cmd{PMAP\_\-CONT\-RIB\_\-TEST}. The test verifies
d1885 1
a1885 1
d1900 1
a1900 1
d1907 1
a1907 1
Output of contribution binning unit test
d1931 2
a1932 2
compile time with the \cmd{WAVE\-LET\_\-TEST\_\-1D} and
\cmd{WAVE\-LET\_\-TEST\_\-2D} macros, respectively. These are
d1943 1
a1943 1
where \lit{scdim} is the dimension of the Shirley-Chiu square,
d1951 1
a1951 1
d1953 3
a1955 3
\lit{scdim} $\times$ \lit{scdim}. The matrix is either initialised from
the optional \lit{dataFile} (if specified), or with generated values as
determined by the \cmd{WAVELET\_TEST\_INIT} macro, which supports the
d1973 1
a1973 1
$y_{i,j} = [i \cdot scdim + j + 1, \cdots, \cdots]$,
d1983 1
a1983 1
\cdots,
d1988 4
a1991 4
colour channel, and ellipses indicate repeated values for the remaining
colour channels. These initialisation options are useful to compare the
reconstructed matrix against a known (simple) reference, as opposed to
a more complex distribution from the \lit{dataFile}.
d1995 1
a1995 1
d2002 2
a2003 2
Output of the wavelet transform unit
test for a 5$\times$5 input matrix with thresholding (the
d2006 1
a2006 1
dots, while thresholded coefficients are indicated by
d2011 1
a2011 1
d2015 1
a2015 1
function. If \lit{threshold} was specified on the command line,
d2020 4
a2023 4
}
below \lit{threshold} are
set to zero; similarly to \cmd{pmapcontrib:thresh\-old\-Cont\-ribs()},
the approximation coefficients in the upper left of the output matrix are
d2028 1
a2028 1
d2034 2
a2035 2
Finally, the test inverts the wavelet transform using the (thresholded)
d2037 1
a2037 1
the reconstructed data along with the root mean square error (RMSE)
d2040 1
a2040 1
d2042 5
a2046 5
coefficients can also be output to a file defined by the macro
\cmd{WAVELET\_TEST\_COEFFOUT}, which defaults to
\cmd{wavelet3-test-coeff.dat}.
Similarly, the mRGBE-encoded coefficients can be output to
\cmd{WAVELET\_\-TEST\_\-COEFF\-OUT\_\-mRGBE}, which defaults to
d2048 4
a2051 4
data can be output to \cmd{WAVELET\_TEST\_OUT} (defaulting to
\cmd{wavelet3-test-xform.dat}), while the reconstruction from
mRGBE-encoded coefficients can be output to
\cmd{WAVELET\_TEST\_OUT\_mRGBE} (defaulting to
d2062 3
a2064 3
d2088 1
a2088 1
d2091 1
a2091 1
routine), under/overflow (relative to the specified encoding range),
d2093 1
a2093 1
d2096 2
a2097 2
and dumping the RMSE between encoded and decoded values, calculated as
dot product of their component differences.
d2099 1
a2099 1
d2119 1
a2119 1
d2122 2
a2123 2
its compressed contributions from disk on demand, decoding the
mRGBE-encoded wavelet coefficients, and inverting the wavelet transform
d2126 1
a2126 1
contribution lookup table, and can be optionally cached to accelerate
d2128 3
a2130 3
d2133 1
a2133 1
d2136 1
a2136 1
the parent photon map, which passes control to a dedicated routine
d2140 1
a2140 1
disk. The out-of-core octree data structure, however, remains in-core to
d2142 1
a2142 1
d2158 3
a2160 3
While automatic binning parameters would clearly be more convenient, the
option file is a necessity dictated by \rcontrib's architecture.
d2162 3
a2164 3
by the time the photon map is loaded, \rcontrib's contribution arrays
have already been allocated and initialised, along with its output
streams if the \cmd{-o} option is specified, which can be assigned to
d2168 2
a2169 2
it was considered safer to use the already existing concept of option
files to pass the correct parameters to \rcontrib{} at startup,
d2172 1
a2172 1
This doesn't seem unreasonable as it conforms to \radiance's
d2175 3
a2177 3
d2179 4
a2182 4
As in \rtrace{}, the evaluation of contributions is triggered via the
ambient calculation routine \cmd{ambient:multambient()}, which in turn
calls the photon mapping ``hook'' \cmd{pmapamb:ambPmap()} (resp.
d2185 4
a2188 4
Caustic photons carry a flag indicating they have been specularly
scattered, and are exclusively accepted by \cmd{pmapfilt:filterPhoton()}
during lookups initiated by \cmd{ambPmapCaustic()}. Note, however,
that the dedicated caustic photon map generated with \mkpmap's
d2193 3
a2195 3
in the case of contribution photons is
\cmd{pmcontrib4:getPreCompPhotonContrib()}.
d2198 1
a2198 1
incident ray; typically this is the last ray in a path starting with a
d2203 1
a2203 1
d2209 2
a2210 2
pages photons from the out-of-core octree leaf file and caches them
via an instance of the out-of-core photon cache
d2212 1
a2212 1
d2214 3
a2216 3
\cmd{pmcontrib4:get\-Pre\-Comp\-Con\-trib\-By\-Pho\-ton()} (see below)
and passing the photon along with its index.
\cmd{getPreCompContribByMod()} subsequently scales the decoded
d2224 4
a2227 4
\cmd{getPreCompContribByPhoton()} loads the 32-bit RGBE-encoded
detail coefficient range and approximation coefficients from the
wavelet coefficient file for a given photon, using the photon's
d2229 2
a2230 2
The approximation coefficients are placed in the upper left of the
(initially zeroed) wavelet coefficient matrix. The sign of each
d2234 6
a2239 6
\cmd{getPreCompContribByPhoton()} then loads the mRGBE-encoded
detail coefficients for the given photon from the wavelet coefficient
file (where they are stored immediately after the mRGBE range and
approximation coefficients).
Miscellaneous bookkeeping such as lazy initialisation of the
d2241 1
a2241 1
The detail coefficients are loaded into a lazily allocated
d2244 4
a2247 4
Once the detail coefficients and their range have been loaded, they are
decoded from the decoding buffer by calling \cmd{decodeContribs()},
which also performs the inverse wavelet transform.
d2249 2
a2250 2
retrieved detail coefficient range by calling \cmd{mrgbe:mRGBEinit()},
and subsequently passes this to the mRGBE decoding routine
d2252 2
a2253 2
decoding routine returns each decoded detail coefficient as a floating
point RGB 3-tuple along with its corresponding incremental linear
d2255 2
a2256 2
Given a 32-bit mRGBE-encoded coefficient consisting of the per-colour
d2265 1
a2265 1
\lvert m_i - m_{max} + \epsilon \rvert \
d2278 1
a2278 1
respectively. Each coefficient $d_i$ is denormalised by $d_{norm,i}$ and
d2282 3
a2284 3
value with the signed mantissa offset, $m_{max}$. This offset is
subtracted from each $m_i$ to obtain $d_i$.
d2289 1
a2289 1
as absolute value in the mRGBE exponent field, and is therefore negated
d2291 1
a2291 1
d2301 2
a2302 2
obtained by summing the incremental values embedded in the mRGBE data
field of the consecutively stored coefficients. Since this index $k$ is
d2312 1
a2312 1
initialised); these represent those that were thresholded during
d2314 3
a2316 3
d2318 1
a2318 1
d2320 1
a2320 1
\cmd{decodeContribs()} calls
d2323 2
a2324 2
original contributions -- or more specifically, an approximation thereof
subject to compression artefacts.
d2326 2
a2327 2
for \cmd{pad\-D2\-Inv\-Step2()}, which performs one pass of the inverse
wavelet transform along a fixed (horizontal) axis.
d2329 2
a2330 2
described in section \ref{sec:wavelet3.c}, this function returns an
output matrix in which the inverted coefficients have been transposed so
d2332 2
a2333 2
alternate axis. Consequently, each pair of invocations of the inverse
transform step constitutes a complete horizontal/vertical inverse
d2335 1
a2335 1
d2338 2
a2339 2
\ref{eq:d2FwdXform-s} and \ref{eq:d2FwdXform-d} is inverted as follows
to reconstruct the adjacent approximations $s^{k-1}_{i,2j}$ and
d2352 1
a2352 1
Note that, for the sake of clarity, equation \ref{eq:d2InvXform}
d2354 1
a2354 1
for the forward transform; in actuality, the implementation swaps the
d2356 1
a2356 1
d2368 1
a2368 1
d2370 3
a2372 3
\cmd{pmapooc:OOC\_\-Find\-1\-Pho\-ton()} anticipates the need to
repeatedly evaluate photons for spatially adjacent lookups,
thus reducing I/O latency incurred by paging from disk.
d2374 5
a2378 5
\cmd{ooccache:OOC\_Cache} and is organised into a variable number of
entries, where each entry contains a page (a block loaded on demand
from disk) consisting of a fixed number of neighbouring photons.
A standard least-recently-used (LRU) page replacement strategy
governs the preemption of obsolete pages when the cache fills
d2380 1
a2380 1
d2382 2
a2383 2
additional, per-modifier precomputed contribution cache (another
instance of struct \cmd{OOC\_Cache}) is interrogated by
d2391 3
a2393 3
binned contributions.
If \cmd{getContribCache()} signals that the decoded contributions are
already cached, they are returned in the array and can be directly
d2398 4
a2401 4
The contribution cache uses the same LRU page replacement strategy as
the out-of-core photon cache, except that a page now contains the decoded
contributions associated with a single photon; consequently, the page
d2406 5
a2410 5
If no cached contributions were found for the current photon, they are
paged from disk and decoded as described in the previous sections,
and transferred to the new cache entry.
The contributions must obviously then be
d2413 3
a2415 3
d2421 1
a2421 1
d2423 1
a2423 1
photon map data structure, which requires enabling \cmd{PMAP\_OOC} at
d2425 1
a2425 1
d2437 1
a2437 1
as of this writing, out-of-core (and therefore contribution) photon
d2439 2
a2440 2
In addition, verbose debugging output and sanity checks in the
d2442 1
a2442 1
d2459 1
a2459 1
\item Checking of mRGBE encoded wavelet coefficients in
d2461 1
a2461 1
decoded coefficient with the original, and triggering a
d2463 1
a2463 1
This error is also triggered if the encoded and decoded
d2475 1
a2475 1
for precomputation, unless the random photon index selection
d2482 1
a2482 1
d2484 2
a2485 2
encoding can also be enabled at compile time. See sections
\ref{sec:pmapcontrib-test}, \ref{sec:wavelet3-test} and
d2487 2
a2488 2
d2498 1
a2498 1
d2500 2
a2501 2
To illustrate the effects of wavelet compression, a simple CIE sunny
d2503 2
a2504 2
serve as reference. The contributions from this sky source were
collected with \rcClassic{} in 32 $\times$ 32 = 1024 bins using a
d2506 1
a2506 1
of the sky and a colourmap of the binned contributions are shown for
d2512 1
a2512 1
label=lst:skycontrib-1024.sh,
d2518 3
a2520 3
{skycontrib-1024.sh}
The 2D contributions where loaded as datafile into the \cmd{wavelet3-test}
d2522 1
a2522 1
padded 2D wavelet transform.
d2524 1
a2524 1
the resulting wavelet coefficient matrices after thresholding, alongside
d2527 2
a2528 2
to zero. Black regions in the wavelet matrices indicate either unused
space to accommodate padding coefficients, or thresholded detail
d2532 1
a2532 1
As expected, these coefficients are only thresholded at very high
d2538 1
a2538 1
The resulting loss of detail is however tolerable until ca. 95\% due
d2541 1
a2541 1
d2554 1
a2554 1
Original: fisheye rendering (left),
d2586 1
a2586 1
The original sky distribution and binned contributions are shown
d2590 1
a2590 1
d2628 1
a2628 1
d2665 1
a2665 1
\label{fig:skycontrib1024-cie-coeffs}
d2667 4
a2670 4
% ---------------------------------------------------------------------------
d2672 1
a2672 1
d2680 1
a2680 1
d2685 1
a2685 1
d2698 1
a2698 1
Original: HDR camera capture (left),
d2730 1
a2730 1
The original sky distribution and binned contributions are shown
d2734 1
a2734 1
d2772 1
a2772 1
d2809 1
a2809 1
\label{fig:skycontrib1024-hdr-coeffs}
d2811 4
a2814 4
% ---------------------------------------------------------------------------
d2817 1
a2817 1
d2819 3
a2821 3
falsecolour renderings generated by \rcClassic{} for a
simple,
bilaterally lit scene. The scene contains two opposing fenestrations with
d2826 1
a2826 1
rays per bin, the running time on an Intel Xeon E5-2660 @@ 2.60GHz
d2831 1
a2831 1
used to generate the renderings is shown in
d2835 1
a2835 1
caption = {Script to generate the per-bin fisheye renderings in
d2847 1
a2847 1
Despite the visible noise due to direct photon visualisation, there is
d2849 1
a2849 1
the leftmost column, which corresponds to bin numbers that are even
d2852 2
a2853 2
in the next section.
While \mkpmap{} took ca. 13.5 minutes to precompute the photon map on
d2858 2
a2859 2
caption = {Script to generate the per-bin fisheye renderings in
figure \ref{fig:cpmapTestHDR-pmap} with \rcontrib{} using
d2874 3
a2876 3
Falsecolour rendered contributions from \rcClassic{} for
each of 64 bins in the bilaterally lit test scene containing
3970 solar positions.
d2879 1
a2879 1
are tinted cyan.
d2887 2
a2888 2
d2916 1
a2916 1
boundary extension and logarithmic contribution encoding.
d2919 1
a2919 1
suggests the transform is very sensitive to thresholded padding
d2922 1
a2922 1
d2971 2
a2972 2
of 0.5 to 0.9. In this example, compression artefacts
appear at the boundaries (corresponding to the horizon)
d2981 3
a2983 3
% ---------------------------------------------------------------------------
d2988 1
a2988 1
d2990 1
a2990 1
as part of the project ``Light Fields for Spatio-Temporal Glare
d2992 4
a2995 4
The author would like to thank his colleagues Dr. Lars Grobe and
Dr. Stephen Wasilewski for their collaboration, for providing feedback
during weekly lunchtime meetings, and for testing the code on macOS.
d2998 3
a3000 2
Prof. Stephen Wittkopf, who gave our crew a sleek
vessel, until it was torpedoed by the powers that be in 2016.
d3002 1
a3002 1
Finally, the author would like to thank his colleagues at
d3007 1
a3007 1
service to the \radiance{}
d3012 1
a3012 1
\textit{\textbf{The crew of the former CC EASE bids= you Farewell
d3019 1
a3019 1
% ---------------------------------------------------------------------------
d3027 2
a3028 1
% ---------------------------------------------------------------------------
a3029 1
d3033 2
a3034 2
d3041 2
a3042 2
d3049 2
a3050 2
d3057 2
a3058 2
d3064 1
a3064 1
d3067 1
a3067 1
photon map generation and precomputation with \mkpmap.
d3070 4
a3073 4
The blue blocks group relevant functions (red text) within software
modules built from C source files with the indicated names.
Red arrows correspond to function calls, which are ordered top-down.
d3078 1
a3078 1
Special cases are the variables \cmd{pmapcontrib:pmapContribTab} and
d3083 1
a3083 1
\cmd{rcontrib:contrib}, but are unused (i.e. NULL) in the context of
d3091 1
a3091 1
Software architecture of the precomputed contribution photon map in
a3109 1
@
1.10
log
@Added a samepage environment in thresholding section, documented
initPmapContribTab() in SW architecture section.
@
text
@d51 2
a52 2
\RCS $Revision: 1.9 $
\RCS $Date: 2022/07/13 21:02:49 $
d413 4
a416 3
photons (a modest number for most intents and purposes) with 32-bit
RGBE-encoded contributions in 2305 bins (corresponding to a Reinhart
MF:4 mapping \cite{bourgeoisReinhart-2008}) would occupy 92 GB on disk.
d419 1
a419 1
This simple example makes it clear that some sort of compression scheme
@
1.9
log
@Replaced SW architecture sketches with finalised figures.
@
text
@d10 1
a10 1
\usepackage[firstpage,stamp]{draftwatermark}
d51 2
a52 2
\RCS $Revision: 1.8 $
\RCS $Date: 2022/07/08 17:27:25 $
d1338 12
a1349 7
Thresholding lies at the heart of wavelet compression, which is why a
good thresholding strategy is important. That said, the strategy currently
used by the precomputed contribution photon map is admittedly simple
and lame; it's one of those components that could have been improved with
more time and budget.
The strategy was selected as a compromise between ease
of implementation and efficient compression, for the reasons that follow.
a1350 9
\newpage
\noindent There are many ways to threshold coefficients, such as the
following (a minimal code sketch appears after the list):
\begin{enumerate}
\item using a fixed threshold (hard thresholding, possibly adapted
to the resolution),
\item attenuation of coefficients if they exceed the threshold (soft
thresholding), and
\item dropping a fixed fraction of the smallest coefficients.
\end{enumerate}
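
As an illustration, here is a minimal sketch of these three variants,
operating on a scalar coefficient array; the actual implementation
operates on RGB 3-tuples and tracks coefficient indices, and all names
in this sketch are illustrative:

\begin{lstlisting}
#include <math.h>
#include <stdlib.h>

/* Order coefficients by descending absolute value (for qsort()) */
static int cmpCoeffMag (const void *c1, const void *c2)
{
   const float m1 = fabsf(*(const float*)c1),
               m2 = fabsf(*(const float*)c2);

   return m1 < m2 ? 1 : m1 > m2 ? -1 : 0;
}

/* (1) Hard thresholding: zero all coefficients below fixed threshold */
void thresholdHard (float *d, unsigned n, float thresh)
{
   unsigned i;

   for (i = 0; i < n; i++)
      if (fabsf(d [i]) < thresh)
         d [i] = 0;
}

/* (2) Soft thresholding: also shrink surviving coefficients by thresh */
void thresholdSoft (float *d, unsigned n, float thresh)
{
   unsigned i;

   for (i = 0; i < n; i++)
      d [i] = d [i] > thresh ? d [i] - thresh
            : d [i] < -thresh ? d [i] + thresh : 0;
}

/* (3) Fixed-fraction thresholding: drop the smallest frac * n
   coefficients. Note this sketch reorders the array, whereas the
   actual code preserves coefficient positions via their indices. */
void thresholdFrac (float *d, unsigned n, float frac)
{
   unsigned i;

   qsort(d, n, sizeof(float), cmpCoeffMag);
   for (i = n * (1 - frac); i < n; i++)
      d [i] = 0;
}
\end{lstlisting}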
d1545 1
a1545 1
This is far from optimal, and yes guv, it's another issue that
d3070 1
d3075 7
@
1.8
log
@Revised abstract.
@
text
@d51 2
a52 2
\RCS $Revision: 1.7 $
\RCS $Date: 2022/07/08 17:10:27 $
d3073 2
a3074 2
Red arrows correspond to
intermodule calls. Note that some routines, notably
d3077 1
a3077 1
a loop. \todo{Replace sketches!}
d3081 1
a3081 1
\includegraphics[width=0.7\linewidth]{swarch-mkpmap-sketch}
d3084 3
a3086 3
Software architecture of the precomputed contribution photon map
with bindings for \cmd{mkpmap}. Arrows represent
calls between modules.
d3091 1
a3091 2
%\begin{sidewaysfigure}
\begin{figure}
d3093 1
a3093 1
\includegraphics[width=\linewidth]{swarch-rcontrib-sketch}
d3096 3
a3098 3
Software architecture of the precomputed contribution photon map
with bindings for \cmd{rcontrib}. Arrows represent
calls between modules.
d3100 1
a3100 2
\end{figure}
%\end{sidewaysfigure}
@
1.7
log
@Proofreading, documented option file.
Added colour map figures for CIE and HDR capture skies in results section.
@
text
@d51 2
a52 2
\RCS $Revision: 1.6 $
\RCS $Date: 2022/06/14 17:19:02 $
d86 27
a112 7
This technical report documents the further development of the
\radiance{} contribution photon map to support precomputed
contributions for climate-based daylight modelling. To this end,
the contribution photon map utilises wavelet compression and an
efficient coefficient encoding to compactly represent the large
volume of data the contributions incur.
This document focuses primarily on the implementation details of
@
1.6
log
@Added mRGBE decoding and inv. wavelet xform subsections to section 3.3
@
text
@d8 1
d51 2
a52 2
\RCS $Revision: 1.5 $
\RCS $Date: 2022/06/07 14:03:42 $
d78 1
a78 1
\RCSDate
d88 1
a88 1
contributions for climate-based daylight modelling. To this end,
d125 2
d133 1
a133 1
in one or more files, or on the console. In the context of contributions,
d140 7
a146 5
\cite{schregle-techreport-2015}. Initial results using contribution
photon mapping to compute daylight coefficients for DRCs with strong
redirection indicated -- unsurprisingly -- that the number of
required photons to adequately predict lighting levels (e.g. to
evaluate daylight autonomy) scales linearly with the number of
d149 3
a151 2
map that maintains its photons entirely on disk
\cite{schregle-oocpmap-jbps-2016, schregle-techreport-2016}.
d155 2
a156 1
large photon maps can be accommodated on commodity office PCs. Yay.
d178 1
a178 1
\includegraphics[width=\linewidth]{contribpmap-overview2-crop}
d203 1
a203 1
This aspect of the photon map amortises the expense of precomputation,
d210 3
a212 3
In precomputed contribution photon mapping mode \mkpmap{} is designed
to behave similarly to \rcontrib, and accepts similar parameters to
``bin'' (discretise) contributions based on their incident direction.
d214 1
a214 1
option, but unlike classic \rcontrib, they are restricted to light
d238 1
a238 1
the \var{apP} option (this option already served the same
d247 1
a247 1
thresholding, i.e. keeping only the most \var{comp}\%
d257 1
a257 1
photon are no longer needed and are discarded.
d261 1
a261 2
This consists of the following files:\todo{Include .opt file
passed to \rcontrib!}
d267 4
d305 1
a305 1
\opt{-aC} option, each containing a set of
d344 6
a349 3
of the short-lived\footnote{Extremely short-lived in fact, since this
was the only instalment!} lecture series, \lit{Secret weapons of RADIANCE:
Stuff that never took off} \cite{schregle-bsdfComp-2011}.
d361 1
a361 1
\lit{pywavelets} Python module and led to the development of the
d367 1
a367 1
The availability of \lit{pywavelets} as a convenient prototyping tool
d387 1
a387 1
\section{Initial Compression Tests with \lit{pywavelets}}
d399 1
a399 1
was needed.
d409 1
a409 1
\lit{pywavelets (pywt)} package \cite{Lee2019}.
d413 2
a414 2
domain, and in turn serialised to a 1D domain, as is the case for the
popular Reinhart mapping.
d421 1
a421 1
binned using a Reinhart MF:4 mapping (2305 bins) were passed to a 1D
d426 2
a427 2
applying the inverse wavelet transform, and their deviations from
the original checked.
d437 3
a439 3
imposes some correlation on the input. Similar tests were also conducted
with solar sources, which introduced higher dynamic range and more
pronounced peaks and artefacts without sorting (see figure
d456 2
a457 1
exhibit pronounced peaks which are poorly decorrelated by the
d460 2
a461 1
significantly reduces these artefacts. This reordering
d469 1
a469 1
Other bin orderings using 2D space-filling curves such as Morton codes
d472 3
a474 3
In addition, sorting the contributions complicates the reconstruction
of the contributions, as this requires storing the original contribution
bins. Furthermore, this still doesn't decorrelate the contributions
d477 2
a478 2
favour of a 2D mapping of the hemisphere, and a wavelet transform in
this domain.
d487 1
a487 1
multi-dimensional context. This revealed the necessity to introduce
d504 1
a504 1
that the majority of these occupied ca. 10 orders of magnitude (see
d513 1
a513 1
Logarithmic plot of absolute wavelet coefficient magnitudes after
d516 5
a520 4
number of coefficients doubling in consecutive bands. It is
clear from this plot that the dynamic range of the coefficients
is limited to ca. 10 orders of magnitude in practice, and that
the coefficients at the bottom between $10^{-19}$ and $10^{-20}$
d523 2
a524 2
$4.6\times10^{-10}$; this corresponds to an encoding range of
$[2^{-31}, 1]$, which can be covered by a 5-bit binary exponent.
d533 3
a535 1
3-tuples to a normalised range and encode these as integer RGB mantissas
d542 3
a544 2
precision. This encoding, which would require pre-normalisation of the
coefficients and storage of independent normalisation factors per colour
d548 7
a554 4
to call it\footnote{
Actually, it can stand for \{modi,mini,micro\}RGBE; take your pick.
It just has to sound cool nowadays. iRGBE would've been trendy too,
but probably infringed some obscure Apple copyright.
d556 2
a557 2
number of bits allocated to the RGB mantissae\footnote{Yeah, that
\emph{is} the plural.} and the exponent, thus inherently reducing
d598 2
a599 2
developed on the basis of the Python prototype
(see section \ref{sec:mrgbe.c}).
d615 3
a617 3
(having a support of 4 samples) wavelet transform: the Daubechies
DB2 wavelet is a popular choice.
\item Arbitrary number of bins (not just powers of 2). This requires
d635 2
a636 3
in \rcontrib{} when the photon map is loaded.\todo{
Update! Binning opts now passed to rcontrib via an option file
}
d639 1
a639 1
as used with the standard out-of-core photon map.
d661 2
a662 2
for an overview of the modules, and how they are embedded within the
\rcontrib{} framework.
d676 3
a678 3
tool. This entails distributing (and binning) the photons, and
precomputing their binned contributions by performing density estimates,
compressing the resulting contributions via a wavelet transform (handling
d685 1
a685 1
\cmd{distribPhotonContrib()}.
d687 5
a691 3
\cmd{distribPhotons()} in \cmd{pmap.c} in that each source contributes
(approximately) the same number of distributed photons
(i.e. $N_{emit} \approx N_{p} / N_{src}$). This measure is intended to
d696 4
a699 2
distribution routine, the number of photons to emit is determined
by emitting a fraction of the photons in a prepass, and extrapolating this
d702 3
a704 5
Since the number of emitted photons will now vary for each source,
it will no longer correlate with each source's emitted flux, in contrast
to the standard photon distribution routine.
This in turn implies the photon flux differs for each source.
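As a minimal worked relation (the symbols here are ad hoc, since no
notation for per-source flux is defined in this passage): each source's
photon flux is simply
\begin{displaymath}
\Phi_{photon,src} = \Phi_{src} / N_{emit,src},
\end{displaymath}
i.e. its emitted flux divided by its actual number of emitted photons.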
The resulting flux per photon is therefore adjusted individually for
d718 3
a720 2
\cmd{ray2bin()} via \cmd{contribSourceBin()}, which performs a
Shirley-Chiu disk-to-square mapping (see figure \ref{fig:shirleyChiu}
d735 1
a735 1
surface normal $[rNx, rNy, rNz]$. The polar angle origin
d737 1
a737 1
$[Ux, Uy, Uz]$.
d743 3
a745 4
onto the disk plane as defined by its normal, $[rN_x, rN_y, rN_z]$
(see below), yielding the mapped square coordinates $[sc_x, sc_y]$.
These square coordinates are in turn mapped to a linear bin
d761 1
a761 1
\cmd{ray2bin} evaluates the following variables to reorient the
d766 1
a766 1
\item{$rNx, rNy, rNz$}: surface normal
d768 1
a768 1
\item{$Ux, Uy, Uz$}: up vector (defines $\phi = 0$,
d775 3
a777 2
from a port and left the scene), the contribution source is simply
discarded and overwritten by that of the next emitted photon.
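
To illustrate the binning, the following is a minimal sketch of the
inverse (disk-to-square) Shirley-Chiu step and the subsequent bin
serialisation, assuming the incident direction has already been
projected to disk coordinates $[x, y]$ in the plane's local frame;
the actual \cmd{ray2bin()} implementation differs in detail:

\begin{lstlisting}
#include <math.h>

/* Map point (x, y) on the unit disk to square coords in [-1,1]^2
   (inverse Shirley-Chiu concentric mapping), then serialise to a
   linear bin on an n x n grid */
int disk2bin (double x, double y, unsigned n)
{
   const double r = sqrt(x*x + y*y), quadrant = M_PI / 4;
   double phi = atan2(y, x), a, b;
   int col, row;

   if (phi < -quadrant)
      phi += 2 * M_PI;           /* phi now in [-pi/4, 7pi/4) */

   if (phi < quadrant) {
      a = r;   b = phi * r / quadrant;
   }
   else if (phi < 3 * quadrant) {
      b = r;   a = -(phi - M_PI / 2) * r / quadrant;
   }
   else if (phi < 5 * quadrant) {
      a = -r;  b = -(phi - M_PI) * r / quadrant;
   }
   else {
      b = -r;  a = (phi - 3 * M_PI / 2) * r / quadrant;
   }

   /* Square coords [a, b] in [-1,1]^2 -> [0,1]^2 -> grid cell */
   col = 0.5 * (a + 1) * n;
   row = 0.5 * (b + 1) * n;

   /* Clamp the degenerate edge cases a = 1 or b = 1 */
   if (col >= (int)n)  col = n - 1;
   if (row >= (int)n)  row = n - 1;

   return row * n + col;
}
\end{lstlisting}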
d783 1
a783 1
designated an auxiliary data bit which is specific to the photon type.
d789 2
a790 1
accumulated in bins when the contributions are precomputed.
d796 1
a796 1
\subsection{Contribution Photon Precomputation}
d801 3
a803 3
option to \mkpmap) is drawn uniformly at random (thus preserving
the relative distribution of photons in the scene) as candidates for
precomputation. The entire precomputation is wrapped by the routine
d806 1
a806 1
The contributions for the candidate precomputed photon
d822 4
a825 3
incident from backfaces. With contribution photons, it only accepts
photons from the same light source modifier. This ensures the lookup
collects only contributions for the same modifier as the candidate
d832 2
a833 2
field \cmd{Photon.aux.contribSrc.sourceBin}. This incidentally also
includes the precomputed photon's own binned contribution.
d835 2
a836 2
intercepted by the search volume (with radius corresponding to the
maximum found photon distance).
d839 1
a839 1
defined in the stock \radiance{} module \cmd{lookup.c} of per-modifier
d843 2
a844 2
the actual precomputed photons stored in the latter, and the modifier name
acting as the LUT key, as is already done with \rcontrib's
d848 2
a849 2
\cmd{PhotonMap.preCompContrib} of type \cmd{preComputedContrib} in each
child photon map (see listing \ref{lst:codeDefs2}, which serves
d860 1
a860 1
The preallocated data structure container of type \cmd{preComputedContrib}
d863 2
a864 1
compression ratio. This routine performs the actual wavelet transform and
d878 3
a880 1
theory is relatively intuitive, if far from simple.
d885 1
a885 1
translated along the signal's propagation axis basis functions.
d900 4
a903 3
dimensions, though applications in 1 and 2 dimensions (e.g. for
image processing and compression) are most common. The
precomputed contribution photon map decomposes the pre-binned
d912 1
a912 1
following the Shirley-Chiu mapping.%
d915 1
a915 1
rows (which are \textit{Iliffe} vectors, or pointers to arrays), can
d922 3
a924 2
Due to the restriction on the input size, this function is not used
by the precomputed contribution photon map.
d949 1
a949 1
Daubechies D2
d957 2
a958 1
decorrelation compared to wavelets with larger supports.
d971 2
a972 1
showing the first two resolution levels of the 2D wavelet transform,
d974 3
a976 2
vertical axes. The output (right of arrows) of each transform
step becomes the input for the next step (left of arrows).
d1020 1
a1020 1
coefficients from those of prior iterations at high resolutions:
d1026 1
a1026 1
submatrix, using the approximations of approximations as
d1029 1
a1029 2
upper left corner contains the 3$\times$3 coarsest approximations
of approximations. The
d1039 1
a1039 1
is either denoted \textit{D2}, referring to its 2 vanishing moments,
d1049 4
a1052 4
localised in space, i.e. they are non-zero for a fixed set of positions
$j$. Larger supports decorrelate over more samples and therefore offer
improved compression (smaller coefficients), at higher computational
expense.
d1066 1
a1066 1
\label{eq:d2FwdXform}
d1069 1
d1076 1
a1076 18
Note that the transform axis index $j$ is doubled on the RHS of
equation \ref{eq:d2FwdXform}; this is because each iteration of the
transform reduces the number of approximation coefficients compared to the
input, yielding the coarser approximation.
Note also that, for the sake of clarity, equation \ref{eq:d2FwdXform}
omits the on-the-fly transposition shown in figure \ref{fig:waveletStep};
in actuality, the implementation swaps the indices $i, j$ on the LHS
during assignment.
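
The following sketch illustrates a single transform step along one
row, including the on-the-fly transposition, with a simple wraparound
boundary; the names and matrix layout are illustrative, and the
actual \cmd{wavelet2.c} implementation instead applies the boundary
extension described in section \ref{sec:boundaryExt}:

\begin{lstlisting}
#include <math.h>

/* One forward 4-tap Daubechies (D2) step along one row of a
   len x len matrix, writing transposed output */
void d2FwdStep (float **in, float **out, unsigned len, unsigned row)
{
   /* Smoothing (h) and detail (g) filter taps */
   const double h0 = (1 + sqrt(3)) / (4 * sqrt(2)),
                h1 = (3 + sqrt(3)) / (4 * sqrt(2)),
                h2 = (3 - sqrt(3)) / (4 * sqrt(2)),
                h3 = (1 - sqrt(3)) / (4 * sqrt(2));
   const double h [4] = {h0, h1, h2, h3},
                g [4] = {h3, -h2, h1, -h0};
   const unsigned half = len >> 1;
   unsigned j, t;

   for (j = 0; j < half; j++) {
      double s = 0, d = 0;

      for (t = 0; t < 4; t++) {
         /* Approximation (s) and detail (d) from 4 adjacent inputs;
            index wraps around at the row end for simplicity */
         const double y = in [row][(2*j + t) % len];

         s += h [t] * y;
         d += g [t] * y;
      }

      /* Transpose on assignment: row and column indices swapped */
      out [j][row]        = s;
      out [half + j][row] = d;
   }
}
\end{lstlisting}
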
In the case of the precomputed contribution photon map, the contributions
contain RGB radiometric data. Consequently, the above decomposition
is extended to the three colour channels. Thus the coefficients
$s^k_{i,j}$ and $d^k_{i,j}$ are in fact 3-tuples.
The colour channels are treated completely independently, and no
decorrelation occurs between them.
For the 4-tap Daubechies wavelet in figure \ref{fig:waveletFunc}, the
constants $h_j$ and $g_j$ are:
d1091 19
a1109 1
d1127 6
a1132 5
In an ideal world (a.k.a. theory), input signals are considered infinite
and life is simple. In \realLife, everything has a beginning and an end,
including life itself, and incidentally, the input to a wavelet
transform.\footnote{... which inevitably raises that perpetual
philosophical question, ``why bother?''
d1140 3
a1142 3
If the input size is a power of two, the signal can
be simply wrapped around, and the number of approximation coefficients
is halved in each iteration.%
d1154 3
a1156 3
However, restrictions on the input size may be unacceptable for most
applications, which then warrants a boundary treatment of the input
signal.
d1196 1
a1196 1
input values are large near the boundary, which can lead to
d1198 1
a1198 1
a singularity (i.e. an isolated peak tapering towards the boundaries).
d1205 1
a1205 1
interior coefficients, and therefore preserved. In practice, though,
d1210 3
a1212 3
extension mode defaults to \cmd{WAVELET\_EXTEND\_CONST}. Several of these
modes can also be found (albeit under different names) in the
\lit{pywavelets} package, and are partly inspired by it.
d1227 3
a1229 3
details coefficient pair beyond the halved length of the input. These
additional \textit{padding coefficients} are generated at the boundaries,
and are essential to reduce artefacts at the edges of the reconstructed
d1245 1
a1245 1
With padding, the number of coefficients for an input of size $l^k$
d1253 3
a1255 3
the matrix for the padding coefficients, and dimension the wavelet
coefficient matrix in order to accommodate the padding coefficients for
each iteration. This is why the coefficient matrix in figure
d1257 1
a1257 1
But since nobody at Micro\$oft gives a damn about efficiency and quality
d1264 3
a1266 3
To support the allocation of an array of suitable size,
the function \cmd{wavelet2:padD2Step()} returns the output length,
including padding, for a given input length if either the input or
d1282 1
a1282 1
removing the \var{comp}\% least significant such coefficients (i.e.
d1285 3
a1287 2
for a compression of 75\%. In general, the details of details will be the
smallest coefficients, and therefore most likely to be thresholded.
d1297 1
a1297 1
after thresholding the 75\% of coefficients with the lowest absolute
d1316 1
a1316 1
implicitly treated as zero.\todo{Thresholding formula?}
d1319 3
a1321 3
good thresholding strategy is important. The strategy currently used
by the precomputed contribution photon map is, however, simple but crap.
It is admittedly one of the components that could have been improved with
d1323 1
a1323 1
The crappy thresholding strategy was selected as a compromise between ease
d1326 10
a1335 5
There are many ways to threshold coefficients, such as (a) against a
fixed threshold (hard thresholding), (b) against a variable threshold
as a function of resolution (adaptive thresholding), and (c) the rather
primitive fixed fraction thresholding employed in the precomputed
contribution photon map. Options (a) and (b) generate a variable number
d1340 1
a1340 1
Option (c) is the least optimal, since the user has no control over the
d1350 1
a1350 1
(highlighted in the upper 3$\times$3 submatrix in figure
d1354 3
a1356 3
these approximations using the (thresholded) detail coefficients during
the inverse wavelet transform. These coefficients are therefore
preserved by the thresholding.
d1363 1
a1363 1
\lit{struct PreComputedContribCoeff}. Each entry in this buffer contains
d1373 9
a1381 8
which are excluded from thresholding. If the thresholding buffer
contains fewer coefficients than expected, the wavelet transform
actually produced zero coefficients ('appens more often than
you'd expect, guv). In this case, the remaining buffer is filled with
as many duplicates of a zero coefficient as required;
specifically, the coefficient in the lower right corner, which is
guaranteed to be unoccupied (see figure \ref{fig:waveletCoeffsFull}),
and will be thresholded anyway.
a1383 1
\todo{Needs figure?}
d1387 1
a1387 1
recursive routine with the completely unexpected name
d1393 1
a1393 1
indices) is more efficient than swapping the actual RGB floating point
d1399 3
a1401 3
only the most significant, namely those in the partition
$[0,\ l(1-comp/100)-1]$, are kept. These are subsequently
sorted by their coefficient indices, using \cmd{qsort()} from the
d1419 1
a1419 1
photon map is essentially a direct port of the Python prototype, and
d1421 1
a1421 1
The mRGBE fields and their corresponding 32-bit integer mapping are defined
d1423 2
a1424 1
bit field configurations are defined as macros.
d1427 3
a1429 2
require more precision or a greater payload data range. In practice, the
threshold presents a compromise suitable for most applications.
d1438 3
a1440 3
a common exponent (base 2), and an associated data field, in this
case the coefficient index (linearised from its 2D matrix
indices). The bits can be allocated within the 32-bit envelope
d1442 3
a1444 2
index range. The default configuration, \textit{MANTBITS = 6,
EXPBITS = 5, DATABITS = 9} (abbreviated 6:6:6:5:9), balances
d1451 2
a1452 2
\cmd{encodeContribs()} keeps track of the per-colour channel range
$[d_{min,i}, d_{max,i}]$ of the wavelet coefficients' absolute
d1454 1
a1454 1
initialisation function \cmd{mRGBEinit()}.
d1456 2
a1457 2
normalisation factor
$d_{norm,i}$, the latter being returned in \cmd{mRGBERange.norm}:
d1463 1
a1463 1
where EXPBITS is the number of bits allocated to the shared exponent
d1467 2
a1468 1
\cmd{mRGBEencode()} for each RGB wavelet coefficient $d = [d_r, d_g, d_b]$
d1470 3
a1472 3
instance containing the normalisation $d_{norm}$, to obtain the
mRGBE encoding consisting of per-colour channel mantissae $m_i$, and a
shared base-2 exponent $x$:
d1475 3
a1477 3
m_i &=& \mathrm{sgn}\left(d_i\right)
\left\lfloor \overline{m} \cdot \overline{d}_i + \epsilon
\right\rfloor + m_{max}, \quad i\in\lbrace r,g,b\rbrace,
d1481 5
a1485 4
\overline{d} &=& \max\left(\overline{d}_r, \overline{d}_g,
\overline{d}_b\right)\\
\overline{d}_i &=& \left(\lvert d_i\rvert - d_{min,i}\right)
d_{norm,i}, \quad i\in\lbrace r,g,b\rbrace\\
d1488 3
a1490 1
where overbars denote normalised values.
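
A compact sketch of this encoding step for a single coefficient
follows. The exponent derivation shown here,
$x = \lfloor \log_2 \overline{d} \rfloor + 1$, and all names are
assumptions for illustration; the authoritative version is
\cmd{mrgbe:mRGBEencode()}:

\begin{lstlisting}
#include <math.h>

#define MANTBITS   6
#define MRGBE_MAX  (1 << (MANTBITS - 1))   /* signed mantissa offset */

typedef struct {
   float min [3], max [3], norm [3];       /* per-channel coeff range */
} mRGBERange;

/* Encode RGB coefficient d [] as sign-offset mantissae m [] with
   shared base-2 exponent *x; returns 0 for a zero coefficient */
int mRGBEencodeSketch (const float d [3], const mRGBERange *range,
                       unsigned m [3], int *x)
{
   double dNorm [3], dMax = 0;
   int i;

   /* Normalise absolute values per colour channel to [0, 1] */
   for (i = 0; i < 3; i++) {
      dNorm [i] = (fabs(d [i]) - range -> min [i]) * range -> norm [i];
      if (dNorm [i] > dMax)
         dMax = dNorm [i];
   }

   if (dMax <= 0)
      return 0;      /* zero coefficients are handled separately */

   /* Shared exponent; implicitly negative since dMax <= 1 */
   *x = floor(log2(dMax)) + 1;

   /* Truncated, sign-offset mantissae (epsilon = 0 here) */
   for (i = 0; i < 3; i++) {
      const int mi = (d [i] < 0 ? -1 : 1) *
                     (int)((MRGBE_MAX - 1) * dNorm [i] * exp2(-*x));

      m [i] = mi + MRGBE_MAX;
   }

   return 1;
}
\end{lstlisting}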
d1499 3
a1501 2
$\lvert x \rvert$ is stored in the mRGBE's exponent field, as it
is implicitly negative since $\overline{d} \leq 1$.
d1504 1
a1504 1
an optional constant $\epsilon$, which defaults to 0), and then offset
d1506 12
a1517 11
encoding range of the mantissa. The latter encodes the mantissa's sign,
with all mantissa values below $m_{max}$ being negative.
We note that \cmd{mRGBEencode()} also accepts a wavelet coefficient's
linear index as payload data, and that these indices are monotonically
increasing, since the coefficients were previously sorted with respect to
them. The coefficient index is \emph{incrementally} encoded, i.e. as the
difference to its immediate predecessor's, starting at 0 (consequently,
the first coefficient index is the only absolute one).
This incremental index encoding obviously requires fewer bits to encode
compared to the absolute index, which in turn reduces the likelihood of
d1520 2
a1521 2
It is crucial to realise that the likelihood of overflowing the mRGBE
data field increases with the index increments, notably when the
d1523 1
a1523 1
high compression ratio and/or number of bins. This cannot be predicted as
d1526 6
a1531 5
contributions. If this occurs, we're caught with our pants down and toss
in the towel, aborting contribution precomputation with an error
-- bit naff innit, guv? And yes guv, this is another aspect that could have
been handled better, time and budget permitting.
As a half-baked user-friendly gesture, \mkpmap{} will however warn
d1534 2
a1535 2
in the worst case, the mRGBE-encoded coefficient index \emph{could}
theoretically overflow.
d1541 3
a1543 2
Together, these occupy a 32-bit envelope, which can be accessed as
a scalar integer value \cmd{mRGBE.all} via the union declaration.
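
A minimal sketch of such a declaration for the default 6:6:6:5:9
configuration (the field names are illustrative, not necessarily
those declared in \cmd{mrgbe.h}):

\begin{lstlisting}
/* 32-bit mRGBE envelope, default 6:6:6:5:9 bit allocation */
typedef union {
   struct {
      unsigned r    : 6,    /* sign-offset red mantissa   */
               g    : 6,    /* sign-offset green mantissa */
               b    : 6,    /* sign-offset blue mantissa  */
               e    : 5,    /* shared base-2 exponent |x| */
               data : 9;    /* payload: incremental coefficient index */
   };
   unsigned all;            /* scalar access to entire envelope */
} mRGBE;
\end{lstlisting}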
d1550 1
a1550 2
\cmd{encodeContribs()}, at which point the routine returns to its
caller, \cmd{preComputeContrib()}.
d1552 2
a1553 1
The latter function prepends the encoded contributions with the
d1555 1
a1555 1
of the wavelet coefficient matrix, and the per-colour-channel
d1574 2
a1575 2
specifically sets the auxiliary data field \cmd{Photon.aux.contribSrc}
to the current light source index (passed via the photon ray's \cmd{rsrc}
d1579 2
a1580 2
The RGBE encoded range and approximation coefficients, as well as the
mRGBE encoded thresholded detail coefficients, are passed to
d1595 1
a1595 1
child photon map to disk, it chucks, uh discards the original
d1607 1
d1612 1
a1612 1
the \opt{-apC} option), and the filename of each child photon map and
d1617 3
a1619 1
standard C library.\footnote{This function supersedes \cmd{ftw()}, which
d1622 1
a1622 1
takes place on this platform.
d1630 21
a1650 1
(see also figure \ref{fig:overview}).
d1655 8
a1662 7
photon and contribution heaps in multiprocessing mode, normalises the
photon flux if applicable (skipped if daylight coefficients, rather
than contributions, are specified with the \opt{-V} option, in which
case the photon flux is already normalised). In conjunction with the
contribution photon map, which requires out-of-core photon mapping,
this function calls the specific out-of-core build routine,
\cmd{pmapooc:OOC\_BuildPhotonMap()}. With regular photon maps, this
d1664 5
a1668 5
according to the Morton code derived from their 3D positions
\cite{schregle-techreport-2016}, saving these to the out-of-core
octree leaf file \lit{<pmapfile>.rc/<mod>.leaf} before proceeding to
build the out-of-core octree structure to index the leaves by calling
\cmd{oocbuild:OOC\_Build()}.
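
For illustration, a generic sketch of deriving such a Morton code by
interleaving the bits of quantised, normalised 3D coordinates follows;
the actual quantisation in the out-of-core code may differ:

\begin{lstlisting}
#include <stdint.h>

/* Spread the lower 21 bits of v, leaving 2-bit gaps between bits */
static uint64_t spreadBits3 (uint64_t v)
{
   v &= 0x1fffff;
   v = (v | v << 32) & 0x1f00000000ffff;
   v = (v | v << 16) & 0x1f0000ff0000ff;
   v = (v | v << 8)  & 0x100f00f00f00f00f;
   v = (v | v << 4)  & 0x10c30c30c30c30c3;
   v = (v | v << 2)  & 0x1249249249249249;

   return v;
}

/* 63-bit Morton code for position p [] normalised to [0, 1]^3 */
uint64_t mortonCode (const float p [3])
{
   const double q = (1 << 21) - 1;     /* 21 bits per axis */

   return spreadBits3(p [0] * q) |
          spreadBits3(p [1] * q) << 1 |
          spreadBits3(p [2] * q) << 2;
}
\end{lstlisting}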
d1677 3
a1679 1
w.r.t. their Morton codes, both heaps are ordered w.r.t. each other.
d1691 1
a1691 1
are small enough to be quicksorted in-core (in parallel, if the
d1706 5
a1710 2
\cmd{saveContribPhotonMap()} iterates over the (child) photon maps
referenced in the parent's \cmd{preCompContribTab} using the standard
d1713 1
a1713 1
for each per-modifier child photon map in the LUT. This routine
d1716 2
a1717 1
per-modifier photons.\footnote{
d1723 7
a1729 1
}
a1730 7
\cmd{save\-Photon\-Map()} was modified to generate contribution-specific
info in the header (number of coefficients, compression rate, etc.). It
then saves the photon map itself, which in out-of-core mode simply encodes
the indexing structure of the out-of-core octree to reference the photons
in the leaf file (as well as the compressed wavelet coefficients in this
case).
d1732 2
a1733 1
\mkpmap{} cleans up and terminates. Woohoo!
d1742 1
a1742 1
Contributions can be optionally logarithmically encoded during the
d1744 1
a1744 1
option (see \cmd{pmapcontrib.h}). If defined,
d1746 2
a1747 1
applies a natural logarithm to every input value to the wavelet transform
d1750 1
a1750 1
input, and consequently the resulting wavelet coefficients, which
d1753 1
a1753 1
compression artefacts during the inverse transform, since the
d1762 2
a1763 2
is the increased sensitivity of the encoding to artefacts and jitter
resulting from the limited precision of the mRGBE encoding.
d1769 1
a1769 1
\subsection{Sparsely Populated Bins}
d1775 1
a1775 1
populated (i.e. nonzero) is too low, the bias may result from the
d1777 2
d1804 1
a1804 1
bins rises above ca. 50\%.
d1816 1
a1816 1
turn manifests itself as visible bias, as shown in figure
d1824 2
a1825 1
fewer than 50\% of bins are populated. The frequency of these warnings
d1842 1
a1842 1
contain optional unit tests which can be built at compile time. These
d1850 1
d1852 11
a1862 2
The \cmd{pmapcontrib} module contains a unit test which can be enabled at
compile time by defining \cmd{PMAP\_\-CONT\-RIB\_\-TEST}. The test verifies
d1901 9
a1909 1
transform that can be enabled at compile time by defining
d1911 1
a1911 1
The \cmd{wavelet} and \cmd{wavelet2} modules contain similar unit tests
d1914 2
a1915 5
\cmd{WAVE\-LET\_\-TEST\_\-2D} macros, respectively. These are not
discussed in detail here, but are functionally similar to
module \cmd{wavelet3}'s unit test, except that \cmd{wavelet} does not
test the mRGBE encoding. (At the time, we just couldn't have been
buggered, guv...)
d1930 1
a1930 1
per line (the excess being ignored). This is particularly useful to import
d1932 1
a1932 1
the same number of bins.
d1936 1
a1936 1
the optional \lit{dataFile} (if specified), or based on fixed values as
d1940 4
a1943 4
\item Random data, independent colour channels:
$y_{i,j} = [\xi_r, \xi_g, \xi_b]$.
\item Random data, correlated colour channels:
$y_{i,j} = [\xi_r, (0.1+0.9\xi)\xi_r, (0.1+0.9\xi)\xi_r]$,
d1945 2
a1946 2
\item Random data, identical for all colour channels:
$y_{i,j} = [\xi_r, \xi_r, \xi_r]$.
d1969 3
a1971 2
Note that ellipses indicate repeated values for the remaining colour
channels. These initialisation options are useful to compare the
d1973 2
a1974 1
something more complex from the \lit{dataFile}. Regardless of the source,
d1984 3
a1986 3
Output of wavelet transform unit
test for a 5$\times$5 input matrix with thresholding (mRGBE
output omitted for the sake of brevity).
d1997 13
a2009 6
function. If a threshold parameter was specified, the resulting wavelet
coefficients whose absolute value lies below this value are set to zero;
similarly to \cmd{pmapcontrib:thresholdContribs()}, the approximation
coefficients in the upper left of the output matrix are not thresholded.
The test outputs the wavelet coefficient matrix, indicating the thresholded
coefficients as bracketed dots (see figure \ref{fig:wavelet3-test}).
d2013 1
a2013 1
an additional output matrix, which is also dumped below the original
d2017 20
a2036 5
Finally, the test inverts the (thresholded) wavelet coefficients by
calling \cmd{padWaveletInvXform2()}, dumping the reconstructed data
along with the root mean square error (RMSE) compared to the original
input. This process is optionally repeated for the mRGBE encoded
coefficients.
d2039 1
a2039 1
(artefacts) incurred by thresholding and the limited precision mRGBE
d2041 1
a2041 1
latter). Unsurprisingly, the overwhelming majority of testing was conducted
d2052 9
a2060 1
that can be enabled at compile time by defining
d2072 3
a2074 3
encoding input, notably zero (which is handled separately in the encoding
routine), payload data and encoded coefficient under/overflow (relative
to specified encoding range), and empty specified range.
d2078 3
a2080 3
comparing the latter with the former and dumping the corresponding RMSE
between the two, calculated as the dot product of the componentwise differences
for each pair of 3-tuples. Figure \ref{fig:mrgbe-test} shows a sample run.
d2088 1
a2088 1
Sample output of mRGBE unit test for a 6:6:6:5:9 mRGBE
d2107 4
a2110 3
The reconstructed contributions can then be optionally cached to
accelerate neighbouring evaluations.
d2114 1
d2122 35
a2156 2
disk. The out-of-core octree data structure remains in-core to facilitate
navigating the photon map during lookups.
d2163 10
a2172 9
ambient calculation
routine \cmd{ambient:multambient()}, which in turn calls the photon mapping
``hook'' \cmd{pmapamb:ambPmap()} (resp. \cmd{pmapamb:ambPmapCaustic()}
for caustic photons\footnote{Caustic photons carry a flag indicating
they have been specularly scattered, and will be exclusively accepted
by \cmd{pmapfilt:filterPhoton()} during lookups. Note that the dedicated
caustic photon map generated with \mkpmap's \opt{-apc} option doesn't
support contributions.
}).
d2174 1
a2174 1
the photon type (defined in the callback \cmd{PhotonMap:lookup}), which
d2182 1
a2182 1
zero or one (if \opt{-ab} is positive) diffuse scattering events.
d2198 4
a2201 4
The returned contributions are subsequently weighted by the incident ray's
cumulative contribution which was passed by the caller,
\cmd{getPreCompPhotonContrib()}, before they are returned by
\cmd{getPreCompContribByMod()}.
d2208 6
a2213 4
detail coefficient range for a given photon and its approximation
coefficients, which are
placed in the upper left of the (initially zeroed) wavelet coefficient
matrix. The sign of each approximation coefficient's colour channel is set
d2215 1
a2215 1
encoding, via a convenient macro \cmd{PMAP\_CONTRIB\_GET\_RGBE32\_SGN}.
d2218 1
a2218 1
detail coefficients for the given photon from the wave\-let coefficient
d2220 2
a2221 2
approximation coefficients), using the photon's numeric index as file
offset. Miscellaneous bookkeeping such as lazy initialisation of the
d2224 1
a2224 1
decoding buffer embedded into the current child photon map's field of
d2230 7
a2236 5
\cmd{decodeContribs()} initialises the mRGBE
normalisation from the retrieved coefficient range, and passes this to
the mRGBE decoding routine \cmd{mrgbe:mRGBEdecode()}, which returns the
decoded detail coefficient as a floating point RGB 3-tuple along with the
corresponding incremental linear coefficient index.
d2241 1
a2241 1
RGB wavelet coefficient $d = [d_r, d_g, d_b]$:
d2250 6
d2263 3
a2265 3
a macro that returns the sign of the mRGBE mantissa $m_i$ by comparing it
with the signed mantissa offset, $m_{max}$. This offset is subtracted from
the $m_i$ to obtain $d_i$.
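
A corresponding decoding sketch, reusing the illustrative types from
the encoding sketch earlier (the mantissa centring by $+0.5$ is an
assumption; see \cmd{mrgbe:mRGBEdecode()} for the actual code):

\begin{lstlisting}
/* Decode sign-offset mantissae m [] with shared exponent x back to
   an RGB coefficient d [], inverting the encoding sketch above */
void mRGBEdecodeSketch (const unsigned m [3], int x,
                        const mRGBERange *range, float d [3])
{
   int i;

   for (i = 0; i < 3; i++) {
      /* Remove sign offset; mantissae below MRGBE_MAX are negative */
      const int mi = (int)m [i] - MRGBE_MAX;
      /* Centre the mantissa and undo the shared exponent */
      const double dNorm = ((mi < 0 ? -mi : mi) + 0.5) /
                           (MRGBE_MAX - 1) * exp2(x);

      /* Undo per-channel normalisation and restore the sign */
      d [i] = (mi < 0 ? -1 : 1) *
              (dNorm / range -> norm [i] + range -> min [i]);
   }
}
\end{lstlisting}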
d2290 1
a2290 1
j &=& k \mod m,
d2304 13
a2316 3
perform a full inverse Daubechies D2 wavelet transform to recover the
original contributions (or rather, an approximation thereof subject to
compression artefacts).
d2319 4
a2322 4
at iteration $k$, the 4-tap Daubechies wavelet transform in equation
\ref{eq:d2FwdXform} is inverted as follows to reconstruct the
adjacent approximations $s^{k-1}_{i,2j}$ and $s^{k-1}_{i,2j+1}$
at doubled resolution for the next iteration $k-1$:
d2333 2
a2334 2
correspond to the original signal: $s^0_{i,j} = y_{i,j}$.
Note, for the sake of clarity, equation \ref{eq:d2InvXform}
d2342 4
a2345 2
\cmd{memcpy()} to \cmd{rcontrib}'s linear contribution bin array,
before control is handed back to the latter.
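
For reference, a 1D sketch of a single inverse step implementing
equation \ref{eq:d2InvXform} with a wraparound boundary follows; the
padded 2D implementation in \cmd{wavelet2.c} is considerably more
involved:

\begin{lstlisting}
#include <math.h>

/* One inverse D2 step: reconstruct 2 * half adjacent approximations
   y [] from half approximation (s) / detail (d) coefficient pairs */
void d2InvStep (const float *s, const float *d, float *y, unsigned half)
{
   const double h0 = (1 + sqrt(3)) / (4 * sqrt(2)),
                h1 = (3 + sqrt(3)) / (4 * sqrt(2)),
                h2 = (3 - sqrt(3)) / (4 * sqrt(2)),
                h3 = (1 - sqrt(3)) / (4 * sqrt(2));
   unsigned j;

   for (j = 0; j < half; j++) {
      /* Wraparound: predecessor of coefficient 0 is the last one */
      const unsigned jp = (j + half - 1) % half;

      y [2*j]     = h2 * s [jp] + h1 * d [jp] + h0 * s [j] + h3 * d [j];
      y [2*j + 1] = h3 * s [jp] - h0 * d [jp] + h1 * s [j] - h2 * d [j];
   }
}
\end{lstlisting}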
d2352 4
a2355 3
\cmd{pmapooc:OOC\_\-Find\-1\-Pho\-ton()} obviates the need to
repeatedly page photons from disk for spatially adjacent lookups,
thus reducing I/O latency. The cache is an instance of struct
d2360 2
a2361 1
governs the preemption of obsolete pages when the cache fills.
d2366 3
a2368 2
\cmd{getPreCompContribByMod()} before paging and decoding the
contributions. A dedicated routine \cmd{pmcontribcache:getContribCache()}
d2370 2
a2371 1
caching key. This in turn calls the out-of-core caching routine
d2373 5
a2377 3
binned contributions. If the decoded contributions are already cached,
they are returned in the array and can be directly accumulated in
\cmd{rcontrib}'s contribution bins.
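
The overall cache-or-decode control flow can be sketched as follows,
where all types and helper functions are hypothetical stand-ins for
the actual \cmd{pmcontribcache.c} and \cmd{ooccache.c} interfaces:

\begin{lstlisting}
#include <stdint.h>

/* Hypothetical cache entry holding a photon's decoded contributions */
typedef struct { uint32_t photonIdx; float *contribs; } CacheEntry;
typedef struct ContribCache ContribCache;

/* Hypothetical helpers (not the actual API) */
CacheEntry *cacheFind (ContribCache *c, uint32_t photonIdx);
CacheEntry *cachePreempt (ContribCache *c, uint32_t photonIdx);
void pageAndDecodeContribs (uint32_t photonIdx, float *contribs);

/* Return a photon's binned contributions, decoding on cache miss */
float *getContribSketch (ContribCache *cache, uint32_t photonIdx)
{
   CacheEntry *entry = cacheFind(cache, photonIdx);

   if (!entry) {
      /* Miss: preempt an obsolete entry, then page the mRGBE
         coefficients from disk, decode and inverse-transform them */
      entry = cachePreempt(cache, photonIdx);
      pageAndDecodeContribs(photonIdx, entry -> contribs);
   }

   /* Contributions are ready to accumulate in rcontrib's bins */
   return entry -> contribs;
}
\end{lstlisting}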
d2384 2
a2385 1
size of this \cmd{OOC\_Cache} instance is hardwired to 1.
d2390 3
a2392 2
paged from disk and decoded as described above, and transferred to the
new cache entry. The contributions must obviously then be
d2418 3
a2420 2
in-core kd-tree data structure. Since this is the default, out-of-core
(and contribution) photon mapping must be explicitly enabled.
d2432 1
a2432 1
These sanity checks are expensive and should therefore be disabled
d2437 1
a2437 1
\item Encoding known ``contributions'' in \cmd{encodeContribs()},
d2441 2
a2442 2
\item Checking the mRGBE encoded wavelet coefficients in
\cmd{encodeContribs()} by decoding them and comparing with the
d2446 1
a2446 1
incremental wavelet coefficient differs. Finally, a
d2450 5
a2454 3
child processes in multiprocessing mode (\mkpmap{} option
\opt{-n}) to attach a debugger to a particular process.
The PIDs of all forked processes are dumped for convenience.
d2457 2
a2458 2
for precomputation... unless the random photon index selection
were buggy.
d2461 1
a2461 1
\item Checking for invalid mRGBE-encoded wavelet coefficient indices
d2465 6
d2478 320
a2797 1
\section{Fisheye Renderings}
d2801 1
a2801 1
falsecolour renderings generated by \rcClassic{} for a
d2809 6
a2814 3
with 20 cores was ca. 1 hour. The script used to generate the renderings
is shown in listing~\ref{lst:cpmapTestHDR-rc}.
d2829 8
a2836 4
Despite the obvious noise due to direct photon visualisation, there is
good agreement with figure \ref{fig:cpmapTestHDR-rc}. While \mkpmap{}
took ca. 13.5 minutes to precompute the photon map on the same CPU,
rendering with \rcontrib{} took only 14 seconds.
d2884 1
a2884 1
accelerates computation at the expense of noise: on 20 cores,
d2893 1
a2893 1
\section{Wavelet Compression Artefacts}
d2898 1
a2898 1
boundary extension and logarithmic contribution encoding.
d2902 2
a2903 1
coefficients.
d2976 1
a2976 1
Dr. Stephen Wasilewski for their collaboration, their feedback
d3048 7
a3054 5
\cmd{mkpmap}. Figure \ref{fig:swarch-rcontrib} gives a similar overview
in the context of \cmd{rcontrib}.
The blue blocks correspond to software modules built from
C source files with the same name. Red arrows between the modules
correspond to calls between modules. Note that some routines, notably
d3057 1
a3057 2
a loop. The (major) routines exposed by a module are summarised in red
text inside the corresponding block. \todo{Replace sketches!}
d3066 1
a3066 1
the call graph between the modules.
d3079 1
a3079 1
the call graph between the modules.
@
1.5
log
@Added section 3.3 (Evaluation with rcontrib)
@
text
@d50 2
a51 2
\RCS $Revision: 1.4 $
\RCS $Date: 2022/05/12 21:06:49 $
d1033 1
a1037 1
\label{eq:d2FwdXform}
d1061 1
a1065 1
\label{eq:d2-hCoeffs}
d1068 1
a1072 1
\label{eq:d2-gCoeffs}
d1428 1
d2006 1
a2006 1
\cmd{pmcontrib4:getPreCompPhotonContrib()}.
d2008 28
d2037 79
a2116 1
\subsection{mRGBE Wavelet Coefficient Decoding}
d2119 1
d2121 6
a2126 1
\subsection{2D Inverse Wavelet Transform}
d2129 4
a2132 4
at iteration $k$, the 4-tap Daubechies wavelet transform in figure
\ref{fig:waveletFunc} can be inverted as follows to reconstruct the
approximations $s^{k-1}_{i,j}$ at doubled resolution for the next
iteration $k-1$.
d2134 1
a2138 1
\label{eq:d2InvXform}
d2148 6
d2158 38
@
1.4
log
@Censored
@
text
@d46 1
d50 2
a51 2
\RCS $Revision: 1.3 $
\RCS $Date: 2022/05/12 20:53:18 $
d654 7
d1967 40
d2111 1
d2195 70
@
1.3
log
@Commit before censure
@
text
@d49 2
a50 2
\RCS $Revision: 1.2 $
\RCS $Date: 2022/05/07 14:24:45 $
a2170 1
a2179 11
Last and \emph{certainly least}, the author acknowledges
the complacent powers that be in a particular non-EU institution of
higher education, for reasons that are known to them.
May you learn to appreciate dedicated individuals who consistently
turn out quality work within tight budgets and deadlines,
and to assume more responsibility for your associates,
particularly if they have a cross to bear.
The complex work documented here came about mainly
through dedication, elbow grease and headaches, not your patronage.
If you cannot appreciate that, it is \emph{your} loss.
d2181 2
a2182 10
\iffalse
\textit{The scatter'd crew of CC EASE
has= abandon'd the wretched Mothership.\\
We bid you Farewell for now, and thank the RADIANCE Publick for
their Support.
}
\else
\textit{\textbf{The crew of the former CC EASE bids= you Farewell
for now, and thanks= the RADIANCE Publick for their Support.
}
d2184 1
a2184 1
\fi
d2186 1
a2186 1
@
1.2
log
@Checkin before cleanup of acknowledgements
@
text
@d49 2
a50 2
\RCS $Revision: 1.1 $
\RCS $Date: 2022/05/07 00:27:36 $
d58 1
a58 1
Being an Account of the final remarkable Enterprises= %Exploits=
d62 1
a62 1
viz. furnishing %provisioning
d64 1
a64 7
%in encreas'd Quantity / Numbers
to the RADIANCE Publick%\footnote{
% Apologies to ``Capt. Charles Johnson'' (presumably a
% \textit{nom de plume} for Daniel Defoe),
% author of \textit{A General History of the Pyrates},
% published 1724.
%}
d71 2
a72 8
\iffalse
Roland Schregle (roland.schregle@@\{hslu.ch, gmail.com\})\\
FG Envelopes and Solar Energy\\
Lucerne University of Applied Sciences and Arts
\else
Roland Schregle (roland.schregle@@gmail.com)\\
RS SciComp
\fi
d253 2
a254 1
This consists of the following files:
d341 1
a341 1
project \lit{Lightfields for glare assessment} %at Hochschule Luzern
d612 3
a614 1
in \rcontrib{} when the photon map is loaded.
d1960 1
d1964 1
d1987 1
a1987 3
% ---------------------------------------------------------------------------
a2167 1
%at Hochschule Luzern,
a2170 7
\iffalse
Its nominal (but certainly not spiritual) successor, the research
group FG EASE (now insignificantly embedded in CC Building Envelopes),
is supervised by Dr. Susanne Gosztonyi, whom the
author hereby acknowledges for her patience with the occasional
squabbles within the group, despite her busy schedule.
\fi
d2180 12
a2191 72
\iffalse
Last and \emph{certainly least}, the author would like to acknowledge
the unfair treatment he received on behalf of the Institute of Civil
Engineering's management. Let's hope they finally find
their ``strategy''. Good luck and good riddance.
\fi
\iftrue
Last and \emph{certainly least}, the author acknowledges
%his disappointment with
the complacent powers that be in a particular non-EU institution of
higher education, %within HLSU,
for reasons that are known to them.
%to those responsible for making loopy decisions.
May you learn to appreciate dedicated individuals who consistently
turn out quality work within tight budgets and deadlines,
and to assume more responsibility for your associates,
particularly if they have a cross to bear.
The complex work documented here came about mainly
through dedication, elbow grease and headaches, not your patronage.
If you cannot appreciate that, it is \emph{your} loss.
\iffalse
He apologises for overachieving and raising the bar within
the department by developing and documenting highly complex yet
stable code within tight deadlines and budgets, representing the
school at international conferences, publishing high quality peer
reviewed papers,
%submitting project proposals with international partners,
being thorough and meticulous in everything he did, and working
long hours despite his poor health.
Sorry all that didn't quite fit the department's ``portfolio''.
%The author wishes the department good luck in finding its new
%identity (again), and achieving its future non-research goals.
The author therefore wishes the department good luck in
((((re)re)re)re)inventing
itself (again) on the path to infinite wisdom, and
that it achieves its future non-research goals.
% Keep up the good chatter, folks.
Let's face it, fellas: you're gonna need a bigger boat.
\fi
\iffalse
He apologises for overachieving and raising the bar within
the department by delivering highly complex, quality work
within tight deadlines and budgets, being overly thorough
and meticulous, and working long hours despite his poor health
(towards the latter of which the department expressed complete
indifference).
May the management one day learn the art of appreciation.
May it also learn to adopt
emerging technologies as a far-sighted strategy.
Finally, may it expand its narrow and antiquated mindset.
The author wishes the department good luck in finding the path
to infinite wisdom, and that it achieves its future non-research
goals in its cozy little cocoon.
Face it fellas, you're gonna need a bigger boat.
\fi
\fi
\iffalse
Citing section \ref{sec:boundaryExt}, all things have a beginning
and an end, and so too ends the author's stint at Hochschule Luzern
after 8 solid years, during which research waned and
nonsense waxed.
\iffalse
After some of the best years of his career -- followed by some
disappointing ones as research at HSLU waned and nonsense waxed --
\else
\fi
'Tis time to cruise to new shores, assemble a new crew and ply new seas.
\fi
@
1.1
log
@Initial revision
@
text
@d49 2
a50 2
\RCS $Revision$
\RCS $Date$
d352 1
a352 1
project \lit{Lightfields for glare assessment} at Hochschule Luzern
d2176 3
a2178 2
the former Competence Centre for Envelopes and Solar Energy (CC EASE)
at Hochschule Luzern, Prof. Stephen Wittkopf, who gave our crew a sleek
d2208 2
a2209 1
the complacent powers that be in a particular non-EU lab, %within HLSU,
d2256 1
a2256 1
\iftrue
@