Repository with all the components of the Desuto Viewer/Annotation/Retrieval platform.
Recent Commits
Commit | Author | Details | Committed
---|---|---|---
22b8f4d8699a | roger-schaer | Remove CORS declaration for Apache, not necessary anymore | May 25 2021
4c35a59ed14d | roger-schaer | Add contact information | Apr 10 2019
247a3a06f3cd | roger-schaer | Change embedded image (Phabricator syntax) | Apr 9 2019
8cef2061d8ec | roger-schaer | Change embedded image | Apr 9 2019
bf191727e84a | roger-schaer | Fix backticks | Apr 9 2019
91033b02324e | roger-schaer | Add license information | Apr 9 2019
ff231ae1da0f | roger-schaer | Further adjustments to the README | Apr 9 2019
a3ece7891647 | roger-schaer | Update README file | Apr 9 2019
a5a0e8eecfd2 | roger-schaer | Remove DL features temporarily | Apr 9 2019
c661c8d22e10 | roger-schaer | Adjust docker-compose files | Apr 9 2019
83f9f45ece8c | roger-schaer | Make specifying the path to ParaDISE data more flexible using an environment… | Apr 8 2019
e52d1f5c3ae8 | roger-schaer | Update .env template and improve/fix Dockerfiles of ParaDISE & retrieval system | Apr 8 2019
9f8b6084f8a9 | roger-schaer | Improve Dockerfile of retrieval interface | Apr 8 2019
13e431bebefd | roger-schaer | Update README | Apr 8 2019
0ac807ac9029 | roger-schaer | Update docker-compose files | Apr 8 2019
README.md
Desuto & ParaDISE
This Docker Compose configuration allows building and running a functional Desuto Web Viewer and Retrieval Interface (including the ParaDISE retrieval engine, see http://paradise.khresmoi.eu for more information) from scratch on any Docker-enabled host.
This README file aims to describe the structure of the containers and any possible configuration changes that may be required when setting up the system on a new host.
Prerequisites
To run the platform, you will need to install at least:
- A recent version of Docker CE : https://docs.docker.com/install/linux/docker-ce/ubuntu/
- A recent version of Docker Compose : https://docs.docker.com/compose/install/
- Preferably a machine with a minimum of 10GB of available RAM to store the visual indices in-memory
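The tool prerequisites above can be checked before going further. The following is a minimal sketch, not part of the repository: the function names `have_cmd` and `check_prerequisites` are hypothetical, and it only verifies that the `docker` and `docker-compose` executables are on the PATH, not their versions or the available RAM.

```shell
# Hypothetical pre-flight helpers; not part of the repository.

# have_cmd NAME: succeed if NAME is found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# check_prerequisites: report each required tool as found or missing.
# Returns 0 only when both docker and docker-compose are available.
check_prerequisites() {
    missing=0
    for tool in docker docker-compose; do
        if have_cmd "$tool"; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

After sourcing these definitions, running `check_prerequisites` prints one line per tool and returns a non-zero status if anything is missing.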
Containers
The docker-compose.yml file is made up of the following containers:
- proxy : Proxy facade for all services based on nginx
- mysql : MySQL server for storing image metadata (URLs, modalities, captions, etc.)
- couchdb : CouchDB server for storing image annotation data from the Web Viewer
- uploads : Basic nginx instance for serving uploaded images
- images : Basic nginx instance for serving images of the datasets used for retrieval
- paradise-gf : Glassfish Java application server instance hosting the ParaDISE engine
- retrieval : Apache instance hosting the Shambala-based retrieval interface
- webviewer : Node.js server for the Web Viewer / Annotation tool
- iipsrv : IIPImage server for generating the image tiles for the Web Viewer
- slideprops : Python Web Service for extracting slide properties (using Openslide)
Volumes
The following shared volumes are declared:
- upload-volume : Shared volume for uploaded images (served by the uploads container)
Ports
The following ports need to be open on the host machine to run all services correctly:
- 80 : Port 80 is used by the proxy facade to expose all underlying services
Environment variables
The following default environment variables are declared in the .env.template file. All default values assume that the server runs on localhost and that all services sit behind the proxy container. These variables are injected into the configuration files required by the JavaScript, Java and Node.js applications. If the host needs no special setup, the file does not need to be modified (apart from the last two values) and can be copied or renamed directly to .env.
- COUCHDB_ADMIN_USER : Username of the CouchDB administrator
- COUCHDB_ADMIN_PASS : Password for the CouchDB administrator
- COUCHDB_DB_NAME : Name of the database created for the Web Viewer annotation data
- COUCHDB_PROTOCOL : Protocol used by CouchDB (http by default, https can be managed by the proxy)
- COUCHDB_PORT : Default port to reach the CouchDB container (not exposed, but can be used internally by the other Docker containers)
- COUCHDB_HOST : Hostname of the CouchDB server (public-facing, localhost by default)
- COUCHDB_ROOT_URL : Public-facing URL of the CouchDB database
- COUCHDB_BACKEND_ROOT_URL : Backend URL for the CouchDB database
- PARADISE_ROOT_URL : Default URL of the ParaDISE engine
- RETRIEVAL_ROOT_URL : Default URL of the Retrieval Interface
- IIP_ROOT_URL : Default URL of the IIP server
- VIEWER_ROOT_URL : Default URL of the Web Viewer
- SLIDEPROPS_ROOT_URL : Default URL of the Slide Properties Web Service
- SLIDEVIEWERDATA_LOCAL_PATH : Default path of the data for the Web Viewer (uploaded images, converted images, overlays, etc.)
- PARADISE_LOCAL_PATH : Default path of the data for the ParaDISE retrieval system
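Since every variable listed above is expected in the .env file, a quick completeness check can catch a partially edited copy. This is a minimal sketch under two assumptions: the file uses plain `NAME=value` lines, and the helper name `missing_env_vars` is hypothetical (it does not exist in the repository).

```shell
# Variable names taken from the list above.
REQUIRED_VARS="COUCHDB_ADMIN_USER COUCHDB_ADMIN_PASS COUCHDB_DB_NAME
COUCHDB_PROTOCOL COUCHDB_PORT COUCHDB_HOST COUCHDB_ROOT_URL
COUCHDB_BACKEND_ROOT_URL PARADISE_ROOT_URL RETRIEVAL_ROOT_URL
IIP_ROOT_URL VIEWER_ROOT_URL SLIDEPROPS_ROOT_URL
SLIDEVIEWERDATA_LOCAL_PATH PARADISE_LOCAL_PATH"

# missing_env_vars FILE: print every required name that has no
# "NAME=" line in FILE (hypothetical helper, assumes NAME=value syntax).
missing_env_vars() {
    for name in $REQUIRED_VARS; do
        if ! grep -q "^${name}=" "$1"; then
            echo "$name"
        fi
    done
}
```

Calling `missing_env_vars .env` prints nothing when all variables are declared. It checks only that each name is present, not that its value is sensible.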
Running the platform
Set up the data folders
Before running the platform, you should set up the folder structure for the Web Viewer data, as well as the retrieval system data.
Web Viewer data
The Web Viewer data is organised with the following hierarchy:
    root
    |-- converted
        |-- Files converted automatically by the Web Viewer into the pyramidal,
        -- Deep-Zoom-compatible TIFF format
    |-- overlays
        |-- NAME_OF_UPLOADED_IMAGE.tif
            |-- feature-name.png
    |-- uploaded
        |-- WSI files uploaded via the Web Viewer interface
The structure is fairly simple; only the "overlays" directory needs special attention. This folder must contain one subfolder for each WSI uploaded to the platform, named after the uploaded image file. Each of these subfolders may contain one or more PNG files, each representing an overlay for the WSI of that subfolder.
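The Web Viewer hierarchy described above can be created in one go. The sketch below is illustrative only: `make_viewer_dirs` and `add_overlay_dir` are hypothetical helper names, and the root path is whatever folder you later assign to SLIDEVIEWERDATA_LOCAL_PATH.

```shell
# make_viewer_dirs ROOT: create the three top-level Web Viewer folders.
make_viewer_dirs() {
    for sub in converted overlays uploaded; do
        mkdir -p "$1/$sub" || return 1
    done
}

# add_overlay_dir ROOT WSI_NAME: create the overlay subfolder for one WSI.
# WSI_NAME is the file name of the uploaded image; the PNG overlays
# for that image are then placed inside this subfolder.
add_overlay_dir() {
    mkdir -p "$1/overlays/$2"
}
```

For example, `make_viewer_dirs /home/slideviewer/slideviewerdata` followed by `add_overlay_dir /home/slideviewer/slideviewerdata NAME_OF_UPLOADED_IMAGE.tif` produces the layout shown in the hierarchy.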
Retrieval system data
The Retrieval system data is organised with the following hierarchy:
    root
    |-- all-dmli-pubmed-info.sql
            This file is an SQL script containing all the info of a figure
            dataset with captions and modalities, such as the PubMedCentral
            dataset. The columns are the following:
            -------------------------------------------------------------
            | id | url | thumbnailURL | articleURL | caption | modality |
            -------------------------------------------------------------
    |-- images
            This folder contains all the patches and images that can be
            retrieved by the system, organized by dataset and split into
            several subfolders (by magnification, or other arbitrary levels).
            Example shown below:
        |-- Dataset1 (WSI for example)
            |-- 5X
                |-- XYZ.tif_idx_0__lvl_3__x2688_y2240.png
                |-- ...
            |-- 10X
                |-- ...
        |-- Dataset2 (PubMedCentral for example)
            |-- Subfolder1
                |-- Sub-subfolder1
                    |-- 12-0309-F-3.jpg
                    |-- ...
                |-- Sub-subfolder2
                    |-- ...
            |-- ...
    |-- paradise-files
            This folder contains all the files necessary for the ParaDISE
            backend to function, both for image as well as text retrieval.
            Details shown below:
        |-- conf
                This folder contains configuration files for all the visual
                indices available in the system, describing the used
                parameters for indexation, storing mechanism, retrieval
                settings, etc. Refer to the ParaDISE website
                (http://paradise.khresmoi.eu/) for more details.
            |-- pubmedcentral-config.json
            |-- ...
        |-- gt
                This folder contains the ground truth file for the
                classification algorithm used to automatically compute the
                image modality of a given image in a dataset.
            |-- train20XXGT.csv
        |-- image-lists
                This folder contains the lists of URLs/paths of images to index.
            |-- pubmedcentral-dmli.csv
            |-- ...
        |-- indices
                This folder contains CSV files with the features extracted
                from the images in each dataset.
            |-- wsi-dataset-5x.csv
            |-- wsi-dataset-10x.csv
            |-- ...
        |-- lucene
                This folder contains the Lucene indices containing the caption
                and fulltext information for datasets such as PubMedCentral.
            |-- pubmedcentral-captions-2016-dmli
            |-- pubmedcentral-fulltext-2016
            |-- ...
        |-- vocabularies
                This folder contains vocabularies for "Bag-of-Words"-based
                visual feature extraction.
            |-- vocabulary_238.csv
            |-- ...
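Because the retrieval data has several required subfolders, it can help to verify the layout before starting the containers. This is a rough sketch that assumes the folder and file names are exactly those shown in the hierarchy above; the helper name `missing_retrieval_paths` is hypothetical, and it checks only that the entries exist, not their contents.

```shell
# missing_retrieval_paths ROOT: print every expected entry of the
# retrieval data hierarchy that is absent under ROOT.
missing_retrieval_paths() {
    # The SQL script with the figure metadata sits directly under the root.
    [ -f "$1/all-dmli-pubmed-info.sql" ] || echo "all-dmli-pubmed-info.sql"
    # Folders expected by the hierarchy described above.
    for sub in images \
               paradise-files/conf \
               paradise-files/gt \
               paradise-files/image-lists \
               paradise-files/indices \
               paradise-files/lucene \
               paradise-files/vocabularies; do
        [ -d "$1/$sub" ] || echo "$sub"
    done
}
```

An empty output from `missing_retrieval_paths /home/slideviewer/paradise-data` means every expected entry is present.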
Define the path in the ".env" file
Whether you created the folder structure above manually or received a sample ZIP file with some images, you need to map some data folders into the Docker containers by setting the SLIDEVIEWERDATA_LOCAL_PATH and PARADISE_LOCAL_PATH variables in your ".env" file. Copy or rename the .env.template file to .env and modify the last 2 variables. Simply set them to the paths of the folders you created:
    # Linux example
    SLIDEVIEWERDATA_LOCAL_PATH=/home/slideviewer/slideviewerdata/
    PARADISE_LOCAL_PATH=/home/slideviewer/paradise-data/

    # Windows example
    SLIDEVIEWERDATA_LOCAL_PATH=C:/SlideViewerData/
    PARADISE_LOCAL_PATH=C:/ParaDISEData/
Once these variables are defined, you should be able to correctly build and start all the containers.
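Setting the two path variables can also be scripted instead of editing the file by hand. The helper below is a minimal sketch, assuming plain `NAME=value` lines in the .env file; `set_env_var` is a hypothetical name, not something shipped with the repository. It replaces an existing line for the given variable or appends one if none exists.

```shell
# set_env_var FILE KEY VALUE: set KEY=VALUE in FILE, replacing an
# existing "KEY=" line or appending a new one (hypothetical helper).
set_env_var() {
    env_file=$1
    env_key=$2
    env_value=$3
    env_tmp="$env_file.tmp.$$"
    # Pass key and value through the environment so that awk does not
    # interpret backslashes or other special characters in them.
    ENV_KEY=$env_key ENV_VALUE=$env_value awk '
        BEGIN { prefix = ENVIRON["ENV_KEY"] "="; done = 0 }
        index($0, prefix) == 1 { print prefix ENVIRON["ENV_VALUE"]; done = 1; next }
        { print }
        END { if (!done) print prefix ENVIRON["ENV_VALUE"] }
    ' "$env_file" > "$env_tmp" && mv "$env_tmp" "$env_file"
}
```

A typical sequence would be `cp .env.template .env`, then `set_env_var .env SLIDEVIEWERDATA_LOCAL_PATH /home/slideviewer/slideviewerdata/` and `set_env_var .env PARADISE_LOCAL_PATH /home/slideviewer/paradise-data/`.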
Starting the containers
To create/start/restart all containers (in background mode), type the following command in the terminal from the folder containing the docker-compose.yml file:
docker-compose -f docker-compose.yml up -d
To stop the containers, type:
docker-compose stop
To remove all the containers (clean-up), type:
docker-compose down
Accessing the Web Viewer
Wait about 4-5 minutes for all services to load correctly, then open http://localhost in your browser to access the Web Viewer. (NOTE: Using Google Chrome is recommended.)
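Rather than waiting a fixed time, the start-up can be polled until the proxy answers. The following is a generic retry helper, offered as a sketch: `retry` is a hypothetical function name, and it is an assumption that the proxy only starts answering on port 80 once the services behind it are usable.

```shell
# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds,
# sleeping DELAY seconds between tries; fail after ATTEMPTS tries.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}
```

Assuming curl is installed, `retry 60 5 curl -fsS -o /dev/null http://localhost` polls the proxy every 5 seconds for up to roughly five minutes and returns as soon as a request succeeds.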
You can then log in with one of the following default user accounts defined in the docker-entrypoint.sh file in the desuto-couchdb directory:
- User : user, Password : userpass (Read-only access)
- User : pathologist1, Password : pathologistpass
- User : pathologist2, Password : pathologistpass
Limitations
The current repository contains all necessary parts for getting up and running with the viewer, annotation tool and retrieval system, but lacks some of the required data (images, indices, etc.) for a completely working system.
It also doesn't include the Deep Learning Python services for computing and extracting DL features from images. These will be added in the future.
If you would like to receive some sample data for quickly exploring the features of the platform, don't hesitate to contact the authors for help.
Meta
Available under the Apache License 2.0. See LICENSE for more information.
Roger Schaer - roger.schaer@hevs.ch
Sebastian Otalora - juan.otaloramontenegro@hevs.ch
Contributing
Please contact the authors to ask about adding new contributions to the platform.