diff --git a/README.md b/README.md
index dfd5fb4..ff84318 100644
--- a/README.md
+++ b/README.md
@@ -1,227 +1,227 @@
# HDTorch

HDTorch is a PyTorch-based hyperdimensional (HD) computing library for HD learning. It includes custom CUDA extensions for speeding up hypervector operations, namely bit-(un)packing and bit-array summation in the horizontal/vertical dimensions.

In the paper [HDTorch: Accelerating Hyperdimensional Computing with GP-GPUs for Design Space Exploration (ICCAD 2022)](https://arxiv.org/abs/2206.04746), we demonstrate HDTorch's utility by analyzing four HDC benchmark datasets in terms of accuracy, runtime, and memory consumption, utilizing both classical and online HD training methodologies.

## Installation

-Torchhd is hosted on PyPi and can be installed via the following command:
+HDTorch is hosted on PyPi and can be installed via the following command:

```
pip install hdtorch
```

## Basics of Hyperdimensional computing (HDC)

HD computing is a machine learning strategy whose defining feature is its representation of data points as long ('hyper') vectors, which enables learning by 'accumulation' of vectors belonging to the same class. HD computing relies on two conditions: first, any two randomly generated HD vectors are with high probability orthogonal, and second, a vector generated by vector accumulation will be more similar to its component vectors than to vectors not of its class.

Binary and bipolar vectors are two common flavors of HD vector, consisting of the values 0/1 and -1/1, respectively. In practice, ternary (-1,0,1) or integer/float vectors are sometimes used; however, this library focuses on binary and bipolar vectors.

The typical HD workflow consists of several steps:

1. Initialize basis vectors in memory that will be used to encode features. They represent the basic units we need, such as class vectors. If we have more complex data, such as EEG data where we also have channels, we can have basis vectors for each of the channels too.
2. Data (feature) values have to be discretized into several bins. Each of these values has its own basis vector, initialized in the previous step.
3. Discretized features are encoded to HD vectors, so that each sample of features is instead represented by a single HD vector.
4. Learning is performed using all encoded data samples. Several approaches to learning are possible, but the simplest, classic approach is to accumulate all HD vectors representing samples of the same class. After accumulation and normalization to regain binary vectors, the resulting vectors are called the 'model' vectors of the classes. A more complex form of training is 'online' training, which differs in that the class vectors are updated after every data point: each vector is weighted by its similarity to the target class before being accumulated into that class.
5. Inference is performed by first encoding a test sample to an HD vector, then comparing it with the learned 'model' vectors. Comparison can be done via various metrics such as cosine, dot, or Hamming similarity, but for binary vectors, Hamming is the most memory- and computation-friendly. The label of the most similar 'model' vector is given as the prediction.

## Generating Hypervectors

Encoding data such as a set of features to HD vectors can be done in several ways, but in most of them the first step is generating basis hypervectors that are further combined into the final HD vector representation of the original data.
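To build intuition for the near-orthogonality property mentioned above, here is a minimal plain-PyTorch check (our own sketch, independent of HDTorch's API) showing that two random binary hypervectors differ in roughly half of their bits:

```
import torch

D = 10000
# Two independently generated random binary hypervectors
a = torch.randint(0, 2, (D,), dtype=torch.int8)
b = torch.randint(0, 2, (D,), dtype=torch.int8)

# Normalized Hamming distance; values near 0.5 mean the vectors are
# nearly orthogonal in the HD-computing sense
print((a != b).float().mean().item())  # ~0.5 for D=10000
```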
Here we provide several ways to initialize basis hypervectors, since the data they represent can have different structures and relationships. For example, if we want to represent different categories with no inter-relationship, we can generate each HD vector randomly and independently. In contrast, values with inter-relationships may be mapped such that the distance between values is proportional to the distance between the corresponding vectors. The options for generating a set of basis vectors that our code currently supports are:

* 'random' - every vector is randomly and independently generated.
* 'sandwich' - every two neighboring vectors share half of their values, while the rest of the vector is random. Thus, vectors are 50% similar to their immediate neighbors but not to vectors further away.
* 'scale' - alternatively called 'level' initialization, where the distance between the values the vectors represent is mapped to the similarity between those vectors.
* 'scaleWithRadius' - similar to 'scale' initialization, but only for vectors closer than the given 'radius' distance. Thus, vectors closer than 'radius' are similar in proportion to their distance, while beyond this 'radius' they are orthogonal.

Example of basis vector generation:

```
import hdtorch

# Generate 5 random hypervectors with dimension 10000 (not packed, on 'cuda')
vecs = hdtorch.HDmodel.generateBasisHDVectors('random',5,10000,0,'cuda')

# Generate 20 hypervectors with dimension 500 (not packed, on 'cuda') in which two vectors are similar in inverse proportion to their distance. Bits that differ between neighboring vectors are chosen in increasing order (instead of randomly), and the whole vector is eventually flipped. If the factor at the end of the name were e.g. 2, only half of the total vector would be flipped.
vecs = hdtorch.HDmodel.generateBasisHDVectors('scaleNoRand1',20,500,0,'cuda')

# Generate 1000 hypervectors with dimension 10000 (not packed, on 'cuda') that are similar in proportion to their distance up to the surrounding 10 vectors, with all vectors further than that nearly orthogonal
vecs = hdtorch.HDmodel.generateBasisHDVectors('scaleWithRadius10',1000,10000,0,'cuda')
```

## Custom CUDA functions

To significantly lower computation time and memory usage when operating on hypervectors, we implemented custom CUDA functions for packing, unpacking, and manipulating them. Their usage is shown below:

```
import hdtorch

# Generate 5 random HD vectors of dimension 10000
vecs = hdtorch.HDmodel.generateBasisHDVectors('random',5,10000,0,'cuda')

# Compress vectors into an array of dimension [5,313], dtype=int32 (CUDA accelerated, 8x memory reduction). Dimension 313 is a result of ceil(10000/32)
packed_vecs = hdtorch.pack(vecs)

# Decompress vectors into an array of dimension [5,10000], dtype=int8 (CUDA accelerated)
unpacked_vecs = hdtorch.unpack(packed_vecs, 10000)
```

Next, as encoding and learning in HDC are based on bitwise summation of vectors in the horizontal and vertical dimensions, we also implement these functions for packed vectors, which further reduces computation time for encoding and training. These CUDA-based functions are used as follows:

```
import hdtorch

# Generate 5 random HD vectors of dimension 10000
vecs = hdtorch.HDmodel.generateBasisHDVectors('random',5,10000,0,'cuda')

# Compress vectors into an array of dimension [5,313], dtype=int32 (CUDA accelerated, 8x memory reduction)
packed_vecs = hdtorch.pack(vecs)

# Horizontal summation of packed vectors (CUDA accelerated); the result is an array of dimension [5]
h_count = hdtorch.hcount(packed_vecs)

# Vertical summation of packed vectors (CUDA accelerated); the result is an array of dimension [10000]
v_count = hdtorch.vcount(packed_vecs,10000)
```
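As a sanity check, hcount and vcount should agree with plain summation over the unpacked vectors. The following is our own minimal sketch built on the calls above (we compare values only, since the exact return dtypes may differ):

```
import hdtorch

vecs = hdtorch.HDmodel.generateBasisHDVectors('random',5,10000,0,'cuda')
packed_vecs = hdtorch.pack(vecs)

# Horizontal: per-vector bit count should match summing the unpacked bits
assert bool((hdtorch.hcount(packed_vecs) == vecs.sum(1)).all())

# Vertical: per-dimension bit count should match the column-wise sum
assert bool((hdtorch.vcount(packed_vecs, 10000) == vecs.sum(0)).all())
```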
## Data encoding

In order to learn from training data or infer test data labels, the data has to be encoded to HD vectors. This means that instead of having data in the form of a 2D matrix [numSampl, numFeat], where each column is one feature, we represent it as a 2D matrix of corresponding HD vectors [numSampl, D]: for every sample, numFeat features are encoded into one D-dimensional hypervector.

There are many proposed encoding algorithms, but the most typical is what we call 'FeatXORVal', where each feature has an ID vector and n value vectors, n being the range of values into which data samples are discretized. Data is encoded by binding each feature's ID vector to the value vector corresponding to the feature's discretized value, typically via the XOR function. Finally, the bound vectors of all features are bundled, generally via bitwise summation and normalization by the number of summed vectors to regain binary vectors. This method is demonstrated in the code below:

```
import torch
import hdtorch

numFeat = 30
D = 10000
numSegmentationLevels = 20

# Initialize data (100 samples with 30 features, having values between 0 and 255)
features = torch.randint(0,256,(100, numFeat)).to(device='cuda')

# Initialize basis vectors
featureIDs = hdtorch.HDmodel.generateBasisHDVectors('random',numFeat,D,0,'cuda')  # randomly generated feature ID vectors, one for each of the 30 features, with D=10000, not packed
featureVals = hdtorch.HDmodel.generateBasisHDVectors('scaleNoRand1',numSegmentationLevels,D,0,'cuda')  # feature value vectors generated with the 'scale' method, 20 possible values, with D=10000, not packed

# Normalize and discretize data
minFeat = torch.min(features, dim=0)[0]
maxFeat = torch.max(features, dim=0)[0]
featuresNorm = hdtorch.HDutil.normalizeAndDiscretizeData(features, minFeat, maxFeat, numSegmentationLevels)

# Encode features using the 'FeatXORVal' approach
(encodedData, _) = hdtorch.HDencoding.EncodeDataToVectors(featuresNorm, featureIDs, featureVals, 'binary', 0, 'FeatXORVal', D)

# or using e.g. the 'FeatPermute' approach
(encodedData, _) = hdtorch.HDencoding.EncodeDataToVectors(featuresNorm, featureIDs, featureVals, 'binary', 0, 'FeatPermute', D)
```
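For intuition about what 'FeatXORVal' computes, here is a simplified plain-PyTorch sketch of the bind-then-bundle idea (our own illustration with hypothetical names, not HDTorch's optimized implementation):

```
import torch

D, numFeat, numLevels = 10000, 30, 20
ids  = torch.randint(0, 2, (numFeat, D), dtype=torch.int8)    # one ID vector per feature
vals = torch.randint(0, 2, (numLevels, D), dtype=torch.int8)  # one vector per discretized value

def feat_xor_val(sample):                  # sample: [numFeat] discretized values in [0, numLevels)
    bound = ids ^ vals[sample]             # bind: XOR each feature's ID with its value vector
    return (bound.sum(0) > numFeat // 2).to(torch.int8)  # bundle: bitwise majority over features

sample = torch.randint(0, numLevels, (numFeat,))
hd_vec = feat_xor_val(sample)              # one D-dimensional binary hypervector for the sample
```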
## HD computing learning and inference

Finally, to use HD vectors to perform learning and inference, we show the whole process on a training and inference example using MNIST data:

```
import torch
import hdtorch
from torchvision import datasets
import torchvision.transforms as transforms

# Setting various parameters
class HDParams():
    HDFlavor = 'binary'  # 'binary' (0,1) or 'bipol' (-1,1)
    D = 10000  # dimension of hypervectors
    numFeat = 784
    numClasses = 10
    device = 'cuda'  # device to use ('cpu', 'cuda')
    packed = True
    numSegmentationLevels = 20  # number of discretization levels to which data is discretized
    similarityType = 'hamming'  # 'hamming' or 'cosine'; similarity measure used for comparing HD vectors
    levelVecType = 'random'  # 'random','sandwich','scaleNoRand1','scaleNoRand2','scaleRand1','scaleRand2',... 'scaleWithRadius3'; defines how HD vectors are initialized
    IDVecType = 'random'
    encodingStrat = 'FeatXORVal'  # 'FeatXORVal', 'FeatAppend' or 'FeatPermute'; defines how HD vectors are encoded

hdParams = HDParams()
batchSize = 1000  # learn in batches

# Loading MNIST dataset
print("Loading MNIST dataset")
t = transforms.Compose([transforms.ToTensor(), transforms.ConvertImageDtype(torch.uint8)])
kwargs = {'num_workers': 1, 'pin_memory': True} if HDParams.device == 'cuda' else {}
dataTrain = datasets.MNIST(root='./data', train=True, transform=t, download=True)
dataTest = datasets.MNIST(root='./data', train=False, transform=t, download=True)
trainLoader = torch.utils.data.DataLoader(dataset=dataTrain, batch_size=batchSize, shuffle=True, **kwargs)
testLoader = torch.utils.data.DataLoader(dataset=dataTest, batch_size=batchSize, shuffle=False, **kwargs)

# Calculate min and max values on the train set - they will also be used to normalize the test set
minFeat = trainLoader.dataset.data.view(-1,784).min(0)[0].to(HDParams.device)
maxFeat = trainLoader.dataset.data.view(-1,784).max(0)[0].to(HDParams.device)

# Initialize HD classifier
HDModel = hdtorch.HD_classifier(HDParams)

# Training HD model in batches
print("Training Model")
for x,(data,labels) in enumerate(trainLoader):
    print(f'Training batch {x}')
    data = data.to(HDParams.device).view(-1,784)
    data = hdtorch.HDutil.normalizeAndDiscretizeData(data, minFeat, maxFeat, HDParams.numSegmentationLevels)
    HDModel.trainModelVecOnData(data, labels.to(HDParams.device))

# Testing performance
print("Testing Model")
for x,(data,labels) in enumerate(testLoader):
    data = data.to(HDParams.device).view(-1,784)
    data = hdtorch.HDutil.normalizeAndDiscretizeData(data, minFeat, maxFeat, HDParams.numSegmentationLevels)
    (testPredictions, testDistances) = HDModel.givePrediction(data)
    acc_test = (testPredictions == labels.to(HDParams.device)).sum().item()/len(labels)
    print(f'Batch {x}: Acc: {acc_test}')
```
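To report a single number over the whole test set rather than per-batch accuracies, the test loop can accumulate counts across batches; a small sketch reusing only the calls shown above:

```
# Aggregate accuracy over all test batches
correct, total = 0, 0
for data, labels in testLoader:
    data = data.to(HDParams.device).view(-1,784)
    data = hdtorch.HDutil.normalizeAndDiscretizeData(data, minFeat, maxFeat, HDParams.numSegmentationLevels)
    (predictions, _) = HDModel.givePrediction(data)
    correct += (predictions == labels.to(HDParams.device)).sum().item()
    total += len(labels)
print(f'Overall test accuracy: {correct/total:.4f}')
```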
## Documentation

More documentation on HDTorch's individual features can be found on its [Read the Docs page](https://hdtorch.readthedocs.io/en/latest/).

### License

This library is [MIT licensed](https://github.com/hyperdimensional-computing/torchhd/blob/main/LICENSE).

## Citations

If you use this work in your own research, we would appreciate a citation of the following paper:

```
@INPROCEEDINGS{iccad2022,
  author={Simon, William Andrew and Pale, Una and Teijeiro, Tomas and Atienza, David},
  booktitle={2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)},
  title={HDTorch: Accelerating Hyperdimensional Computing with GP-GPUs for Design Space Exploration},
  year={2022}}
```
\ No newline at end of file
diff --git a/hdtorch/__init__.py b/hdtorch/__init__.py
index 21dfb22..54a61af 100644
--- a/hdtorch/__init__.py
+++ b/hdtorch/__init__.py
@@ -1,54 +1,55 @@
import os
+from importlib import abc
from pathlib import Path
from torch.utils.cpp_extension import load

_hdtorchcuda = load(
    name='hdtorchcuda',
    extra_cflags=['-O3'],
    is_python_module=True,
    sources=[
        os.path.join(Path(__file__).parent, 'cuda', 'hdtorch.cpp'),
        os.path.join(Path(__file__).parent, 'cuda', 'hdtorch_cu.cu')
    ]
)

import hdtorch.HDutil as HDutil
import hdtorch.HDmodel as HDmodel
import hdtorch.HDencoding as HDencoding
import hdtorch.HDVecGenerators as HDVecGenerators

from hdtorch.version import __version__
from hdtorch.HDutil import (
    ham_sim,
    cos_dist,
    cos_sim,
    dot_sim,
    xor_bipolar,
    rotateVec,
    normalizeAndDiscretizeData
)
from hdtorch.HDmodel import (
    HD_classifier
)

__all__ = [
    "HDutil",
    "HDmodel",
    "HDencoding",
    "HDVecGenerators",
    "ham_sim",
    "cos_dist",
    "cos_sim",
    "dot_sim",
    "xor_bipolar",
    "rotateVec",
    "normalizeAndDiscretizeData",
    "HD_classifier"
]

pack = _hdtorchcuda.pack
unpack = _hdtorchcuda.unpack
hcount = _hdtorchcuda.hcount
vcount = _hdtorchcuda.vcount
_hcount = _hdtorchcuda._hcount
diff --git a/hdtorch/cuda/hdtorch_cu.cpp b/hdtorch/cuda/hdtorch_cu.cpp
deleted file mode 120000
index de1f414..0000000
--- a/hdtorch/cuda/hdtorch_cu.cpp
+++ /dev/null
@@ -1 +0,0 @@
-hdtorch_cu.cu
\ No newline at end of file
diff --git a/hdtorch/cuda/setup.py.back b/hdtorch/cuda/setup.py.back
deleted file mode 100644
index e6854fb..0000000
--- a/hdtorch/cuda/setup.py.back
+++ /dev/null
@@ -1,9 +0,0 @@
-from distutils.core import setup, Extension
-from torch.utils.cpp_extension import BuildExtension, CUDAExtension
-
-module1 = CUDAExtension('hdtorch', sources = ['hdtorch.cpp', 'hdtorch_cu.cu'],
-                        extra_compile_args=["-O3"])
-
-setup(name='hdtorch',
-      ext_modules=[module1],
-      cmdclass={'build_ext': BuildExtension})
\ No newline at end of file
diff --git a/hdtorch/pylint.txt b/hdtorch/pylint.txt
deleted file mode 100644
index ac1e046..0000000
--- a/hdtorch/pylint.txt
+++ /dev/null
@@ -1,23 +0,0 @@
-************* Module hdtorch.HDencoding
-HDencoding.py:12:0: C0301: Line too long (102/100) (line-too-long)
-HDencoding.py:27:0: C0303: Trailing whitespace (trailing-whitespace)
-HDencoding.py:48:0: C0301: Line too long (102/100) (line-too-long)
-HDencoding.py:133:0: C0301: Line too long (103/100) (line-too-long)
-HDencoding.py:1:0: C0103: Module name "HDencoding" doesn't conform to snake_case naming style (invalid-name)
-HDencoding.py:12:49: C0103: Argument name "HD_flavor" doesn't conform to snake_case naming style (invalid-name)
-HDencoding.py:12:0: R0913: Too many arguments (6/5) (too-many-arguments)
-HDencoding.py:12:60: W0613: Unused argument 'basis_feat_vecs' (unused-argument)
-HDencoding.py:48:50: C0103: Argument name "HD_flavor" doesn't conform to snake_case naming style (invalid-name)
-HDencoding.py:48:0: R0913: Too many arguments (6/5) (too-many-arguments)
-HDencoding.py:69:15: E0602: Undefined variable 'rotateVec' (undefined-variable)
-HDencoding.py:77:4: C0103: Variable name "f" doesn't conform to snake_case naming style (invalid-name)
-HDencoding.py:48:78: W0613: Unused argument 'basis_val_vecs' (unused-argument)
-HDencoding.py:88:46: C0103: Argument name "HD_flavor" doesn't conform to snake_case naming style (invalid-name)
-HDencoding.py:88:0: R0913: Too many arguments (6/5) (too-many-arguments)
-HDencoding.py:123:4: C0103: Variable name "f" doesn't conform to snake_case naming style (invalid-name)
-HDencoding.py:133:66: C0103: Argument name "HD_flavor" doesn't conform to snake_case naming style (invalid-name)
-HDencoding.py:133:0: R0913: Too many arguments (7/5) (too-many-arguments)
-
--------------------------------------------------------------------
-Your code has been rated at 5.42/10 (previous run: 6.04/10, -0.62)
-
diff --git a/hdtorch/version.py b/hdtorch/version.py
index d538f87..6849410 100644
--- a/hdtorch/version.py
+++ b/hdtorch/version.py
@@ -1 +1 @@
-__version__ = "1.0.0"
\ No newline at end of file
+__version__ = "1.1.0"
diff --git a/pyproject.toml b/pyproject.toml
index 8a12bf8..13ec755 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,19 +1,20 @@
[tool.poetry]
name = "HDTorch"
version = "1.1.0"
license = "MIT"
description = "HDTorch: Accelerating Hyperdimensional Computing with GP-GPUs for Design Space Exploration"
readme = "README.md"
repository = "https://c4science.ch/source/hdtorch/"
+documentation = "https://hdtorch.readthedocs.io/en/latest/"
authors = ["wasimon "]

[tool.poetry.dependencies]
python = "^3.6"

[tool.poetry.dev-dependencies]
pytest = "5.2"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
diff --git a/setup.py b/setup.py
index a149fb1..793cf4b 100644
--- a/setup.py
+++ b/setup.py
@@ -1,31 +1,31 @@
"""A setuptools based setup module.
See:
https://packaging.python.org/guides/distributing-packages-using-setuptools/
https://github.com/pypa/sampleproject
"""
from setuptools import setup, find_packages

# Read the version without importing any dependencies
version = {}
with open("hdtorch/version.py") as f:
    exec(f.read(), version)

setup(
-    name="hdtorch",  # use torch-hd on PyPi to install torchhd, torchhd is too similar according to PyPi
+    name="hdtorch",
    version=version["__version__"],
    description="HDTorch is a pytorch based HD Computing library with Hypervector Extensions",
    long_description=open("README.md").read(),
    long_description_content_type="text/markdown",
    url="https://c4science.ch/source/hdtorch/",
    license="MIT",
    install_requires=[
        "torch",
        "numpy",
    ],
    packages=find_packages(exclude=["docs"]),
    python_requires=">=3.8",
    project_urls={
        "Source": "https://c4science.ch/source/hdtorch/",
        "Documentation": "https://hdtorch.readthedocs.io",
    },
)
\ No newline at end of file