User Details
- User Since
- Mar 14 2016, 14:57 (455 w, 1 d)
- Availability
- Available
- Organization
- epfl.ch (university)
Jan 12 2022
better outputs + new timer
env variables for phoenix
fourestey committed R11986:4037b77e1634: collapsing triple loop for better vectorization (authored by fourestey).
collapsing triple loop for better vectorization
compiler specific makefiles
Jan 9 2022
omp offload version working
Jan 7 2022
homogeneous naming convention
bug correction
Jan 5 2022
harmonized name conventions
fourestey committed R11986:1bfd752d5fcc: openacc version working with nvhpc and gcc (11) (authored by fourestey).
openacc version working with nvhpc and gcc (11)
Jan 4 2022
adding gitignore file
making omp offload version work
Jan 3 2022
fourestey committed R11986:6971d15a9d5a: new unified main for the gradz_n2n_fd4 routine (authored by fourestey).
new unified main for the gradz_n2n_fd4 routine
update with new versions
Jul 8 2019
bug correction
merge conclusion
fourestey committed R1448:a3bb68a6631c: new env files for piz daint and grand tave at CSCS (authored by fourestey).
new env files for piz daint and grand tave at CSCS
fourestey committed R1448:b539a2db5099: added cmath headers for sin/cos/exp... (authored by fourestey).
added cmath headers for sin/cos/exp...
merge conclusion
fourestey committed R1448:0c28c06bb895: new functions for chi communications and computation (authored by fourestey).
new functions for chi communications and computation
fourestey committed R1448:fa74ce9b1678: scripts moved to their specific folders (authored by fourestey).
scripts moved to their specific folders
updated version
fourestey committed R1448:32d924a41de1: moving env files to their specific folder (authored by fourestey).
moving env files to their specific folder
fourestey committed R1448:e43422c7c5f9: new separated CPU and GPU chi computation (authored by fourestey).
new separated CPU and GPU chi computation
fourestey committed R1448:994921ef9d31: putting the timers at the right place (authored by fourestey).
putting the timers at the right place
moving files to src
separating CPU and GPU version
fourestey committed R1448:52c13e877df3: changing the number of blocks to fit 2 GPUs (this won't be necessary anymore… (authored by fourestey).
changing the number of blocks to fit 2 GPUs (this won't be necessary anymore…
adding back the lenstool test
mpi checkers
fourestey committed R1448:d744bdc62471: adding the GPU and CPU separate chi versions (authored by fourestey).
adding the GPU and CPU separate chi versions
fourestey committed R1448:b74535da73d7: new routine to read input files WITHOUT allocating the arrays at the same time (authored by fourestey).
new routine to read input files WITHOUT allocating the arrays at the same time
chi GPU stuff
timer routines
chi CPU stuff
new default block sizes
fourestey committed R1448:a2852774963f: allocation is separated between CPU and GPU (authored by fourestey).
allocation is separated between CPU and GPU
moved to src
fourestey committed R1448:6f94d454bc16: adding register count (in comment at least) (authored by fourestey).
adding register count (in comment at least)
fourestey committed R1448:e9c6e90b21e1: displaying PCI bus for no reason besides showing off (authored by fourestey).
displaying PCI bus for no reason besides showing off
removing problematic image
fourestey committed R1448:b139d4f38fa2: setting image_pos_gpu to zero by hand (authored by fourestey).
setting image_pos_gpu to zero by hand
fourestey committed R1448:74e854997ee0: removing the non-barycenter chi computation to speed things up (authored by fourestey).
removing the non-barycenter chi computation to speed things up
working version
fourestey committed R1448:bd8635c16957: better debug output for image properties (authored by fourestey).
better debug output for image properties
adding delensing on the GPUs
fourestey committed R1448:3af8387f1045: working version of the delensing on the GPU (authored by fourestey).
working version of the delensing on the GPU
better debug output for multi-PE
fourestey committed R1448:dcac6b943a2a: working version of the delensing on the GPU (authored by fourestey).
working version of the delensing on the GPU
unified memory FTW
fourestey committed R1448:49fdf6f34b23: overhaul of the whole chi computation concept but separating the delensing and… (authored by fourestey).
overhaul of the whole chi computation concept but separating the delensing and…
fourestey committed R1448:6cd8c63fcaf4: user defined cpu and gpu compilation flags (authored by fourestey).
user defined cpu and gpu compilation flags
unified memory FTW
unified memory FTW
unified memory FTW
Merge branch 'master' into develop
fourestey committed R1448:6b890919eecd: overhaul of the whole chi computation concept but separating the delensing and… (authored by fourestey).
overhaul of the whole chi computation concept but separating the delensing and…
fourestey committed R1448:fbab3cb1a16f: Merge branch 'master' of ssh://c4science.ch/diffusion/1448/lenstool-hpc (authored by fourestey).
Merge branch 'master' of ssh://c4science.ch/diffusion/1448/lenstool-hpc
adding compilation flags file
general update
fourestey committed R1448:5ecfeeadf73e: updated compilation options (-Ofast in particular) (authored by fourestey).
updated compilation options (-Ofast in particular)
bug fix for multiple GPUs
better readability
preparation for a rehaul
removing binary exe
updating the compilation options
Changing the executable name
lenstool always on
removing the debug info for good
fixing up comments
adding an avx512 guard
fourestey committed R1448:d42cd80e9716: removing bug that cudafrees a host pointer (authored by fourestey).
removing bug that cudafrees a host pointer
removing the *print* functions
bug removed
removing CXXFLAGS altogether
removing the *print* functions
adding the guard for avx512f
inserting the right margins
fourestey committed R1448:ae29d9b999fd: putting the curly braces where they should be (authored by fourestey).
putting the curly braces where they should be
fourestey committed R1448:8689cd87453b: phasing out the *print* version of the gradient (authored by fourestey).
phasing out the *print* version of the gradient
fourestey committed R1448:3889e3ff8d37: much better version, the compiler is passed through the env file (authored by fourestey).
much better version, the compiler is passed through the env file
re-enabling OpenMP loop
fourestey committed R1448:d46d52721576: Merge branch 'master' of ssh://c4science.ch/diffusion/1448/lenstool-hpc (authored by fourestey).
Merge branch 'master' of ssh://c4science.ch/diffusion/1448/lenstool-hpc
updated version of the scripts
fourestey committed R1448:19917d488399: adding fast sqrt reciprocal for SIS (authored by fourestey).
adding fast sqrt reciprocal for SIS
adding Ofast