Diffusion Desuto Platform (master)


Repository with all the components of the Desuto Viewer/Annotation/Retrieval platform.

Recent Commits

Commit	Author	Details	Committed
22b8f4d8699a	roger-schaer	Remove CORS declaration for Apache, not necessary anymore	May 25 2021
4c35a59ed14d	roger-schaer	Add contact information	Apr 10 2019
247a3a06f3cd	roger-schaer	Change embedded image (Phabricator syntax)	Apr 9 2019
8cef2061d8ec	roger-schaer	Change embedded image	Apr 9 2019
bf191727e84a	roger-schaer	Fix backticks	Apr 9 2019
91033b02324e	roger-schaer	Add license information	Apr 9 2019
ff231ae1da0f	roger-schaer	Further adjustments to the README	Apr 9 2019
a3ece7891647	roger-schaer	Update README file	Apr 9 2019
a5a0e8eecfd2	roger-schaer	Remove DL features temporarily	Apr 9 2019
c661c8d22e10	roger-schaer	Adjust docker-compose files	Apr 9 2019
83f9f45ece8c	roger-schaer	Make specifying the path to ParaDISE data more flexible using an environment…	Apr 8 2019
e52d1f5c3ae8	roger-schaer	Update .env template and improve/fix Dockerfiles of ParaDISE & retrieval system	Apr 8 2019
9f8b6084f8a9	roger-schaer	Improve Dockerfile of retrieval interface	Apr 8 2019
13e431bebefd	roger-schaer	Update README	Apr 8 2019
0ac807ac9029	roger-schaer	Update docker-compose files	Apr 8 2019

README.md

Desuto & ParaDISE

This Docker Compose configuration allows building and running a functional Desuto Web Viewer and Retrieval Interface (including the ParaDISE retrieval engine, see http://paradise.khresmoi.eu for more information) from scratch on any Docker-enabled host.

This README file aims to describe the structure of the containers and any possible configuration changes that may be required when setting up the system on a new host.

Prerequisites

To run the platform, you will need to install at least:

A recent version of Docker CE : https://docs.docker.com/install/linux/docker-ce/ubuntu/
A recent version of Docker Compose : https://docs.docker.com/compose/install/
Preferably a machine with at least 10 GB of available RAM, since the visual indices are kept in memory
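
A quick pre-flight check can confirm that the tools listed above are installed before building anything. The sketch below is only an illustration: the function name `check_prerequisites` is hypothetical, and it merely looks for the `docker` and `docker-compose` executables on the PATH. It does not check versions or available RAM.

```python
import shutil


def check_prerequisites(tools=("docker", "docker-compose")):
    """Return a dict mapping each tool name to True if it is found on the PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}
```

Calling `check_prerequisites()` returns one boolean per tool, so a missing installation can be reported before any container build is attempted.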

Containers

The docker-compose.yml file is made up of the following containers:

proxy : Proxy facade for all services based on nginx
mysql : MySQL server for storing image metadata (URLs, modalities, captions, etc.)
couchdb : CouchDB server for storing image annotation data from the Web Viewer
uploads : Basic nginx instance for serving uploaded images
images : Basic nginx instance for serving images of the datasets used for retrieval
paradise-gf : Glassfish Java application server instance hosting the ParaDISE engine
retrieval : Apache instance hosting the Shambala-based retrieval interface
webviewer : Node.js server for the Web Viewer / Annotation tool
iipsrv : IIPImage server for generating the image tiles for the Web Viewer
slideprops : Python Web Service for extracting slide properties (using OpenSlide)

Volumes

The following shared volumes are declared:

upload-volume : Shared volume for uploaded images (served by the uploads container)

Ports

The following ports need to be open on the host machine to run all services correctly:

80 : Port 80 is used by the proxy facade to expose all underlying services

Environment variables

The following default environment variables are declared in the .env.template file. All current values assume that the server runs on localhost and that all services are behind the proxy container. These variables are injected into various configuration files required by the JavaScript, Java and Node.js applications. If no special setup is required on the test host, this file does not need to be modified (apart from the last 2 values, SLIDEVIEWERDATA_LOCAL_PATH and PARADISE_LOCAL_PATH) and can be copied or renamed directly to .env.

COUCHDB_ADMIN_USER : Username of the CouchDB administrator
COUCHDB_ADMIN_PASS : Password for the CouchDB administrator
COUCHDB_DB_NAME : Name of the database created for the Web Viewer annotation data
COUCHDB_PROTOCOL : Protocol used by CouchDB (http by default, https can be managed by the proxy)
COUCHDB_PORT : Default port to reach the CouchDB container (not exposed, but can be used internally by the other Docker containers)
COUCHDB_HOST : Hostname of the CouchDB server (public-facing, localhost by default)
COUCHDB_ROOT_URL : Public-facing URL of the CouchDB database
COUCHDB_BACKEND_ROOT_URL : Backend URL for the CouchDB database
PARADISE_ROOT_URL : Default URL of the ParaDISE engine
RETRIEVAL_ROOT_URL : Default URL of the Retrieval Interface
IIP_ROOT_URL : Default URL of the IIP server
VIEWER_ROOT_URL : Default URL of the Web Viewer
SLIDEPROPS_ROOT_URL : Default URL of the Slide Properties Web Service
SLIDEVIEWERDATA_LOCAL_PATH : Default path of the data for the Web Viewer (uploaded images, converted images, overlays, etc.)
PARADISE_LOCAL_PATH : Default path of the data for the ParaDISE retrieval system
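
To check that a `.env` file defines every variable listed above, a small parser along the following lines may help. This is a minimal sketch, not part of the repository: the helper names `parse_env_file` and `missing_variables` are hypothetical, and the parser assumes plain `KEY=VALUE` lines without quoting or variable expansion.

```python
from pathlib import Path

# Variable names taken from the list above.
EXPECTED_VARIABLES = [
    "COUCHDB_ADMIN_USER", "COUCHDB_ADMIN_PASS", "COUCHDB_DB_NAME",
    "COUCHDB_PROTOCOL", "COUCHDB_PORT", "COUCHDB_HOST",
    "COUCHDB_ROOT_URL", "COUCHDB_BACKEND_ROOT_URL",
    "PARADISE_ROOT_URL", "RETRIEVAL_ROOT_URL", "IIP_ROOT_URL",
    "VIEWER_ROOT_URL", "SLIDEPROPS_ROOT_URL",
    "SLIDEVIEWERDATA_LOCAL_PATH", "PARADISE_LOCAL_PATH",
]


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_variables(values):
    """Return the expected variable names that are absent or empty."""
    return [name for name in EXPECTED_VARIABLES if not values.get(name)]
```

For example, `missing_variables(parse_env_file(".env"))` returns an empty list when every variable has a value.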

Running the platform

Set up the data folders

Before running the platform, you should set up the folder structure for the Web Viewer data, as well as the retrieval system data.

Web Viewer data

The Web Viewer data is organised with the following hierarchy:

root
|-- converted
    |-- Files converted automatically by the Web Viewer into the pyramidal,
     -- Deep-Zoom-compatible TIFF format
|-- overlays
    |-- NAME_OF_UPLOADED_IMAGE.tif
        |-- feature-name.png
|-- uploaded
    |-- WSI files uploaded via the Web Viewer interface

The structure is fairly simple; only the "overlays" directory needs special attention. This folder must contain one subfolder for each WSI uploaded to the platform, named after the uploaded image file. Each of these subfolders may contain one or more PNG files, each representing an overlay for the WSI of that subfolder.
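
The hierarchy above can be created and inspected with a few lines of Python. The following is a minimal sketch under stated assumptions: the function names `create_viewer_skeleton` and `list_overlays` are hypothetical, and only the three top-level folder names come from the hierarchy described here.

```python
from pathlib import Path


def create_viewer_skeleton(root):
    """Create the converted/, overlays/ and uploaded/ folders under root."""
    root = Path(root)
    for name in ("converted", "overlays", "uploaded"):
        (root / name).mkdir(parents=True, exist_ok=True)
    return root


def list_overlays(root, wsi_name):
    """Return the sorted PNG overlay file names stored for one uploaded WSI."""
    overlay_dir = Path(root) / "overlays" / wsi_name
    if not overlay_dir.is_dir():
        return []
    return sorted(p.name for p in overlay_dir.glob("*.png"))
```

`list_overlays(root, "NAME_OF_UPLOADED_IMAGE.tif")` returns an empty list when no overlay subfolder exists for that image.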

Retrieval system data

The Retrieval system data is organised with the following hierarchy:

root
|-- all-dmli-pubmed-info.sql
    This file is an SQL script containing all the info
    of a figure dataset with captions and modalities, such
    as the PubMedCentral dataset. The columns are the following:
    -------------------------------------------------------------
    | id | url | thumbnailURL | articleURL | caption | modality |
    -------------------------------------------------------------
|-- images
    This folder contains all the patches and images that can be
    retrieved by the system, organized by dataset and split into
    several subfolders (by magnification, or other arbitrary levels)
    Example shown below:
    |-- Dataset1 (WSI for example)
        |-- 5X
            |-- XYZ.tif_idx_0__lvl_3__x2688_y2240.png
            |-- ...
        |-- 10X
        |-- ...
    |-- Dataset2 (PubMedCentral for example)
        |-- Subfolder1
            |-- Sub-subfolder1
                |-- 12-0309-F-3.jpg
                |-- ...
            |-- Sub-subfolder2
                |-- ...
        |-- ...
|-- paradise-files
    This folder contains all the files necessary for the ParaDISE
    backend to function, both for image as well as text retrieval.
    Details shown below:
    |-- conf
        This folder contains configuration files for all the visual
        indices available in the system, describing the used parameters
        for indexation, storing mechanism, retrieval settings, etc.
        Refer to the ParaDISE website (http://paradise.khresmoi.eu/)
        for more details.
        |-- pubmedcentral-config.json
        |-- ...
    |-- gt
        This folder contains the ground truth file for the classification
        algorithm used to automatically compute the image modality of a
        given image in a dataset.
        |-- train20XXGT.csv
    |-- image-lists
        This folder contains the lists of URLs/paths of images to index.
        |-- pubmedcentral-dmli.csv
        |-- ...
    |-- indices
        This folder contains CSV files with the features extracted
        from the images in each dataset
        |-- wsi-dataset-5x.csv
        |-- wsi-dataset-10x.csv
        |-- ...
    |-- lucene
        This folder contains the Lucene indices containing the caption
        and fulltext information for datasets such as PubMedCentral.
        |-- pubmedcentral-captions-2016-dmli
        |-- pubmedcentral-fulltext-2016
        |-- ...
    |-- vocabularies
        This folder contains vocabularies for "Bag-of-Words"-based
        visual feature extraction.
        |-- vocabulary_238.csv
        |-- ...
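
Before starting the containers, it can be useful to verify that a retrieval data folder follows the hierarchy above. The sketch below is illustrative only: the function name `missing_retrieval_entries` is hypothetical, and it checks only for the top-level entries named in the hierarchy, not for the dataset-specific files inside them.

```python
from pathlib import Path

# Top-level entries taken from the hierarchy above (paths relative to root).
EXPECTED_ENTRIES = (
    "all-dmli-pubmed-info.sql",
    "images",
    "paradise-files/conf",
    "paradise-files/gt",
    "paradise-files/image-lists",
    "paradise-files/indices",
    "paradise-files/lucene",
    "paradise-files/vocabularies",
)


def missing_retrieval_entries(root):
    """Return the expected entries that do not exist under root."""
    root = Path(root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]
```

An empty return value means that all expected files and folders are present.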

Define the path in the ".env" file

Whether you created the folder structure above manually or received a sample ZIP file with some images, you need to map some data folders into the Docker containers by setting the SLIDEVIEWERDATA_LOCAL_PATH and PARADISE_LOCAL_PATH variables in your ".env" file. Copy or rename the .env.template file to .env and modify the last 2 variables. Simply set them to the paths of the folders you created:

# Linux example
SLIDEVIEWERDATA_LOCAL_PATH=/home/slideviewer/slideviewerdata/
PARADISE_LOCAL_PATH=/home/slideviewer/paradise-data/

# Windows example
SLIDEVIEWERDATA_LOCAL_PATH=C:/SlideViewerData/
PARADISE_LOCAL_PATH=C:/ParaDISEData/

Once these variables are defined, you should be able to correctly build and start all the containers.
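
Copying the template and setting the two path variables can also be scripted. The following is a minimal sketch, assuming the template uses plain `KEY=VALUE` lines; the function name `write_env_from_template` and its parameters are hypothetical and not part of the repository.

```python
from pathlib import Path


def write_env_from_template(template_path, env_path, viewer_data_path, paradise_data_path):
    """Copy the template to env_path, overriding the two data-path variables."""
    overrides = {
        "SLIDEVIEWERDATA_LOCAL_PATH": viewer_data_path,
        "PARADISE_LOCAL_PATH": paradise_data_path,
    }
    output_lines = []
    seen = set()
    for line in Path(template_path).read_text(encoding="utf-8").splitlines():
        key = line.split("=", 1)[0].strip()
        if "=" in line and key in overrides:
            # Replace the template value with the caller-supplied path.
            output_lines.append(f"{key}={overrides[key]}")
            seen.add(key)
        else:
            output_lines.append(line)
    # Append any path variable that the template did not contain.
    for key, value in overrides.items():
        if key not in seen:
            output_lines.append(f"{key}={value}")
    Path(env_path).write_text("\n".join(output_lines) + "\n", encoding="utf-8")
```

All other lines of the template, including comments, are copied unchanged.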

Starting the containers

To create/start/restart all containers (in background mode), type the following command in the terminal from the folder containing the docker-compose.yml file:

docker-compose -f docker-compose.yml up -d

To stop the containers, type:

docker-compose stop

To remove all the containers (clean-up), type:

docker-compose down
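
If the platform is started from a script rather than by hand, the three commands above can be wrapped as follows. This is only a sketch: the helper names `compose_command` and `run_compose` are hypothetical, and the wrapper assumes that `docker-compose` is on the PATH and that `project_dir` is the folder containing the docker-compose.yml file.

```python
import subprocess

COMPOSE_FILE = "docker-compose.yml"


def compose_command(action):
    """Build the docker-compose argument list for 'up', 'stop' or 'down'."""
    commands = {
        "up": ["docker-compose", "-f", COMPOSE_FILE, "up", "-d"],
        "stop": ["docker-compose", "stop"],
        "down": ["docker-compose", "down"],
    }
    if action not in commands:
        raise ValueError(f"unknown action: {action!r}")
    return commands[action]


def run_compose(action, project_dir="."):
    """Run the command from project_dir and raise if it exits with an error."""
    return subprocess.run(compose_command(action), cwd=project_dir, check=True)
```

Keeping the command construction separate from its execution makes it possible to inspect the arguments without starting any containers.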

Accessing the Web Viewer

Wait about 4-5 minutes for all services to load, then open http://localhost in your browser to access the Web Viewer. (NOTE: Using Google Chrome is recommended.)
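
Instead of waiting a fixed amount of time, a script can poll the proxy until it answers. The following is a minimal sketch under stated assumptions: the function name `wait_for_url` is hypothetical, the default timeout of 300 seconds simply mirrors the 4-5 minutes mentioned above, and an HTTP 200 response from http://localhost is assumed to mean that the Web Viewer is reachable.

```python
import http.client
import time
import urllib.request


def wait_for_url(url="http://localhost", timeout=300.0, interval=5.0):
    """Poll url until it answers with HTTP 200 or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            # Refused connections, timeouts and HTTP errors all mean "not ready yet".
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The function returns `False` if the URL never answers with HTTP 200 before the timeout, which a calling script can use to report a failed start.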

You can then log in with one of the following default user accounts defined in the docker-entrypoint.sh file in the desuto-couchdb directory:

User : user, Password : userpass (Read-only access)
User : pathologist1, Password : pathologistpass
User : pathologist2, Password : pathologistpass

Limitations

The current repository contains all necessary parts for getting up and running with the viewer, annotation tool and retrieval system, but lacks some of the required data (images, indices, etc.) for a completely working system.

It also doesn't include the Deep Learning Python services for computing and extracting DL features from images. These will be added in the future.

If you would like to receive some sample data for quickly exploring the features of the platform, don't hesitate to contact the authors for help.

Contributing

Please contact the authors to ask about adding new contributions to the platform.

.env.template
.gitignore
LICENSE
README.md
desuto-couchdb/
desuto-iipsrv/
desuto-images/
desuto-proxy/
desuto-retrieval/
desuto-slideproperties/
desuto-viewer/
docker-compose.override.yml
docker-compose.yml
paradise-gf/
