07_oacct_sherpa_publishers.md
No OneTemporary
Actions

Subscribers

None

File Metadata

Created: Wed, Feb 19, 00:00

07_oacct_sherpa_publishers.md
View Options

	# Projet Open Access Compliance Check Tool (OACCT)

	Projet P5 de la bibliothèque de l'EPFL en collaboration avec les bibliothèques des Universités de Genève, Lausanne et Berne : https://www.swissuniversities.ch/themen/digitalisierung/p-5-wissenschaftliche-information/projekte/swiss-mooc-service-1-1-1-1

	Ce notebook permet d'extraire les données choisis parmis les sources obtenues par API et les traiter pour les rendre exploitables dans l'application OACCT.

	Auteur : Pablo Iriarte, Université de Genève (pablo.iriarte@unige.ch)
	Date de dernière mise à jour : 16.07.2021

	## Table Journals Publishers : ajout des informations de Sherpa


	```python
	import pandas as pd
	import csv
	import json
	import numpy as np
	```


	```python
	publishers_issn = pd.read_csv('sample/publishers_brut.tsv', encoding='utf-8', header=0, sep='\t')
	publishers_issn
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>publisher_id</th>
	<th>name</th>
	<th>id</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>Revue_Médicale_Suisse</td>
	<td>Revue Médicale Suisse</td>
	<td>1</td>
	</tr>
	<tr>
	<td>1</td>
	<td>American_Physical_Society</td>
	<td>American Physical Society</td>
	<td>2</td>
	</tr>
	<tr>
	<td>2</td>
	<td>Public_Library_of_Science</td>
	<td>Public Library of Science</td>
	<td>3</td>
	</tr>
	<tr>
	<td>3</td>
	<td>The_Global_Studies_Institute_de_l’Université_d...</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>4</td>
	</tr>
	<tr>
	<td>4</td>
	<td>Universitat_de_València,_Departamento_de_Teorí...</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>5</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>376</td>
	<td>Tipografia_La_Commerciale</td>
	<td>Tipografia La Commerciale</td>
	<td>377</td>
	</tr>
	<tr>
	<td>377</td>
	<td>Red.:_Prof._Dr._F._Cavalli,_Istituto_oncologic...</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>378</td>
	</tr>
	<tr>
	<td>378</td>
	<td>Excerpta_Medica</td>
	<td>Excerpta Medica</td>
	<td>379</td>
	</tr>
	<tr>
	<td>379</td>
	<td>Generative_Grammar_Group_of_the_Department_of_...</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>380</td>
	</tr>
	<tr>
	<td>380</td>
	<td>999999</td>
	<td>UNKNOWN</td>
	<td>999999</td>
	</tr>
	</tbody>
	</table>
	<p>381 rows × 3 columns</p>
	</div>




	```python
	# import ids
	publisher_ids = pd.read_csv('sample/journals_publishers_ids.tsv', encoding='utf-8', header=0, sep='\t')
	publisher_ids
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>id</th>
	<th>publisher</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>1</td>
	<td>1</td>
	</tr>
	<tr>
	<td>1</td>
	<td>2</td>
	<td>2</td>
	</tr>
	<tr>
	<td>2</td>
	<td>3</td>
	<td>3</td>
	</tr>
	<tr>
	<td>3</td>
	<td>4</td>
	<td>4</td>
	</tr>
	<tr>
	<td>4</td>
	<td>4</td>
	<td>5</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>940</td>
	<td>997</td>
	<td>47</td>
	</tr>
	<tr>
	<td>941</td>
	<td>998</td>
	<td>75</td>
	</tr>
	<tr>
	<td>942</td>
	<td>999</td>
	<td>8</td>
	</tr>
	<tr>
	<td>943</td>
	<td>1000</td>
	<td>119</td>
	</tr>
	<tr>
	<td>944</td>
	<td>1001</td>
	<td>217</td>
	</tr>
	</tbody>
	</table>
	<p>945 rows × 2 columns</p>
	</div>




	```python
	# renommage id
	publisher_ids = publisher_ids.rename(columns = {'id': 'journal'})
	publisher_ids = publisher_ids.rename(columns = {'publisher': 'id'})
	```


	```python
	# dédoublonage par publisher id
	publisher_ids_dedup = publisher_ids.drop_duplicates(subset='id')
	publisher_ids_dedup
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>journal</th>
	<th>id</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>1</td>
	<td>1</td>
	</tr>
	<tr>
	<td>1</td>
	<td>2</td>
	<td>2</td>
	</tr>
	<tr>
	<td>2</td>
	<td>3</td>
	<td>3</td>
	</tr>
	<tr>
	<td>3</td>
	<td>4</td>
	<td>4</td>
	</tr>
	<tr>
	<td>4</td>
	<td>4</td>
	<td>5</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>929</td>
	<td>987</td>
	<td>376</td>
	</tr>
	<tr>
	<td>930</td>
	<td>987</td>
	<td>377</td>
	</tr>
	<tr>
	<td>932</td>
	<td>989</td>
	<td>378</td>
	</tr>
	<tr>
	<td>934</td>
	<td>991</td>
	<td>379</td>
	</tr>
	<tr>
	<td>937</td>
	<td>994</td>
	<td>380</td>
	</tr>
	</tbody>
	</table>
	<p>380 rows × 2 columns</p>
	</div>




	```python
	# merge avec journals
	publisher = pd.merge(publishers_issn, publisher_ids_dedup, on='id', how='left')
	publisher
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>publisher_id</th>
	<th>name</th>
	<th>id</th>
	<th>journal</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>Revue_Médicale_Suisse</td>
	<td>Revue Médicale Suisse</td>
	<td>1</td>
	<td>1.0</td>
	</tr>
	<tr>
	<td>1</td>
	<td>American_Physical_Society</td>
	<td>American Physical Society</td>
	<td>2</td>
	<td>2.0</td>
	</tr>
	<tr>
	<td>2</td>
	<td>Public_Library_of_Science</td>
	<td>Public Library of Science</td>
	<td>3</td>
	<td>3.0</td>
	</tr>
	<tr>
	<td>3</td>
	<td>The_Global_Studies_Institute_de_l’Université_d...</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>4</td>
	<td>4.0</td>
	</tr>
	<tr>
	<td>4</td>
	<td>Universitat_de_València,_Departamento_de_Teorí...</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>5</td>
	<td>4.0</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>376</td>
	<td>Tipografia_La_Commerciale</td>
	<td>Tipografia La Commerciale</td>
	<td>377</td>
	<td>987.0</td>
	</tr>
	<tr>
	<td>377</td>
	<td>Red.:_Prof._Dr._F._Cavalli,_Istituto_oncologic...</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>378</td>
	<td>989.0</td>
	</tr>
	<tr>
	<td>378</td>
	<td>Excerpta_Medica</td>
	<td>Excerpta Medica</td>
	<td>379</td>
	<td>991.0</td>
	</tr>
	<tr>
	<td>379</td>
	<td>Generative_Grammar_Group_of_the_Department_of_...</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>380</td>
	<td>994.0</td>
	</tr>
	<tr>
	<td>380</td>
	<td>999999</td>
	<td>UNKNOWN</td>
	<td>999999</td>
	<td>NaN</td>
	</tr>
	</tbody>
	</table>
	<p>381 rows × 4 columns</p>
	</div>




	```python
	# ajout des valeurs de sherpa
	publisher_sherpa = pd.read_csv('sample/publisher_sherpa.tsv', encoding='utf-8', header=0, sep='\t')
	publisher_sherpa
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>journal</th>
	<th>publisher_id</th>
	<th>name</th>
	<th>country</th>
	<th>type</th>
	<th>url</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>532</td>
	<td>45</td>
	<td>John Wiley and Sons</td>
	<td>gb</td>
	<td>former_publisher</td>
	<td>http://www.wiley.com/</td>
	</tr>
	<tr>
	<td>1</td>
	<td>498</td>
	<td>4</td>
	<td>American Chemical Society</td>
	<td>us</td>
	<td>society_publisher</td>
	<td>http://pubs.acs.org/</td>
	</tr>
	<tr>
	<td>2</td>
	<td>789</td>
	<td>126</td>
	<td>Acoustical Society of America</td>
	<td>us</td>
	<td>society_publisher</td>
	<td>http://acousticalsociety.org/</td>
	</tr>
	<tr>
	<td>3</td>
	<td>166</td>
	<td>3291</td>
	<td>Springer</td>
	<td>gb</td>
	<td>commercial_publisher</td>
	<td>https://www.springernature.com/gp/products/jou...</td>
	</tr>
	<tr>
	<td>4</td>
	<td>807</td>
	<td>3291</td>
	<td>Springer</td>
	<td>gb</td>
	<td>commercial_publisher</td>
	<td>https://www.springernature.com/gp/products/jou...</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>803</td>
	<td>870</td>
	<td>10</td>
	<td>American Physical Society</td>
	<td>us</td>
	<td>society_publisher</td>
	<td>http://www.aps.org/</td>
	</tr>
	<tr>
	<td>804</td>
	<td>41</td>
	<td>10</td>
	<td>American Physical Society</td>
	<td>us</td>
	<td>society_publisher</td>
	<td>http://www.aps.org/</td>
	</tr>
	<tr>
	<td>805</td>
	<td>80</td>
	<td>10</td>
	<td>American Physical Society</td>
	<td>us</td>
	<td>society_publisher</td>
	<td>http://www.aps.org/</td>
	</tr>
	<tr>
	<td>806</td>
	<td>533</td>
	<td>10</td>
	<td>American Physical Society</td>
	<td>us</td>
	<td>society_publisher</td>
	<td>http://www.aps.org/</td>
	</tr>
	<tr>
	<td>807</td>
	<td>608</td>
	<td>10</td>
	<td>American Physical Society</td>
	<td>us</td>
	<td>society_publisher</td>
	<td>http://www.aps.org/</td>
	</tr>
	</tbody>
	</table>
	<p>808 rows × 6 columns</p>
	</div>




	```python
	# renommage ids
	publisher_sherpa = publisher_sherpa.rename(columns = {'publisher_id': 'publisher_id_sherpa', 'url': 'website_sherpa', 'country': 'iso_code'})
	```


	```python
	# merge avec ids journals
	publisher = pd.merge(publisher, publisher_sherpa, on='journal', how='left')
	publisher
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>publisher_id</th>
	<th>name_x</th>
	<th>id</th>
	<th>journal</th>
	<th>publisher_id_sherpa</th>
	<th>name_y</th>
	<th>iso_code</th>
	<th>type</th>
	<th>website_sherpa</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>Revue_Médicale_Suisse</td>
	<td>Revue Médicale Suisse</td>
	<td>1</td>
	<td>1.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>1</td>
	<td>American_Physical_Society</td>
	<td>American Physical Society</td>
	<td>2</td>
	<td>2.0</td>
	<td>10.0</td>
	<td>American Physical Society</td>
	<td>us</td>
	<td>society_publisher</td>
	<td>http://www.aps.org/</td>
	</tr>
	<tr>
	<td>2</td>
	<td>Public_Library_of_Science</td>
	<td>Public Library of Science</td>
	<td>3</td>
	<td>3.0</td>
	<td>112.0</td>
	<td>Public Library of Science</td>
	<td>us</td>
	<td>commercial_publisher</td>
	<td>http://www.plos.org/</td>
	</tr>
	<tr>
	<td>3</td>
	<td>The_Global_Studies_Institute_de_l’Université_d...</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>4</td>
	<td>4.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>4</td>
	<td>Universitat_de_València,_Departamento_de_Teorí...</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>5</td>
	<td>4.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>376</td>
	<td>Tipografia_La_Commerciale</td>
	<td>Tipografia La Commerciale</td>
	<td>377</td>
	<td>987.0</td>
	<td>3291.0</td>
	<td>Springer</td>
	<td>gb</td>
	<td>commercial_publisher</td>
	<td>https://www.springernature.com/gp/products/jou...</td>
	</tr>
	<tr>
	<td>377</td>
	<td>Red.:_Prof._Dr._F._Cavalli,_Istituto_oncologic...</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>378</td>
	<td>989.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>378</td>
	<td>Excerpta_Medica</td>
	<td>Excerpta Medica</td>
	<td>379</td>
	<td>991.0</td>
	<td>30.0</td>
	<td>Elsevier</td>
	<td>us</td>
	<td>commercial_publisher</td>
	<td>http://www.elsevier.com/</td>
	</tr>
	<tr>
	<td>379</td>
	<td>Generative_Grammar_Group_of_the_Department_of_...</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>380</td>
	<td>994.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>380</td>
	<td>999999</td>
	<td>UNKNOWN</td>
	<td>999999</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	</tbody>
	</table>
	<p>381 rows × 9 columns</p>
	</div>




	```python
	# renommage names
	publisher = publisher.rename(columns = {'name_x': 'name_issn', 'name_y': 'name_sherpa'})
	```


	```python
	# ajout des informations à partir des revues
	publisher_journals = pd.read_csv('sample/journals_publishers_brut.tsv', encoding='utf-8', header=0, sep='\t', usecols=['id', 'url'])
	publisher_journals
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>id</th>
	<th>url</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>1</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>1</td>
	<td>2</td>
	<td>http://prl.aps.org/</td>
	</tr>
	<tr>
	<td>2</td>
	<td>3</td>
	<td>http://www.plosone.org/</td>
	</tr>
	<tr>
	<td>3</td>
	<td>4</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>4</td>
	<td>5</td>
	<td>http://ojps.aip.org/prbo/</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>906</td>
	<td>997</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>907</td>
	<td>998</td>
	<td>http://www.jpedsurg.org</td>
	</tr>
	<tr>
	<td>908</td>
	<td>999</td>
	<td>http://www.springerlink.com/content/100451</td>
	</tr>
	<tr>
	<td>909</td>
	<td>1000</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>910</td>
	<td>1001</td>
	<td>https://www.physiology.org/journal/jappl</td>
	</tr>
	</tbody>
	</table>
	<p>911 rows × 2 columns</p>
	</div>




	```python
	# renommage id
	publisher_journals = publisher_journals.rename(columns = {'id': 'journal'})
	```


	```python
	# merge avec ids journals
	publisher = pd.merge(publisher, publisher_journals, on='journal', how='left')
	publisher
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>publisher_id</th>
	<th>name_issn</th>
	<th>id</th>
	<th>journal</th>
	<th>publisher_id_sherpa</th>
	<th>name_sherpa</th>
	<th>iso_code</th>
	<th>type</th>
	<th>website_sherpa</th>
	<th>url</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>Revue_Médicale_Suisse</td>
	<td>Revue Médicale Suisse</td>
	<td>1</td>
	<td>1.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>1</td>
	<td>American_Physical_Society</td>
	<td>American Physical Society</td>
	<td>2</td>
	<td>2.0</td>
	<td>10.0</td>
	<td>American Physical Society</td>
	<td>us</td>
	<td>society_publisher</td>
	<td>http://www.aps.org/</td>
	<td>http://prl.aps.org/</td>
	</tr>
	<tr>
	<td>2</td>
	<td>Public_Library_of_Science</td>
	<td>Public Library of Science</td>
	<td>3</td>
	<td>3.0</td>
	<td>112.0</td>
	<td>Public Library of Science</td>
	<td>us</td>
	<td>commercial_publisher</td>
	<td>http://www.plos.org/</td>
	<td>http://www.plosone.org/</td>
	</tr>
	<tr>
	<td>3</td>
	<td>The_Global_Studies_Institute_de_l’Université_d...</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>4</td>
	<td>4.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>4</td>
	<td>Universitat_de_València,_Departamento_de_Teorí...</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>5</td>
	<td>4.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>376</td>
	<td>Tipografia_La_Commerciale</td>
	<td>Tipografia La Commerciale</td>
	<td>377</td>
	<td>987.0</td>
	<td>3291.0</td>
	<td>Springer</td>
	<td>gb</td>
	<td>commercial_publisher</td>
	<td>https://www.springernature.com/gp/products/jou...</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>377</td>
	<td>Red.:_Prof._Dr._F._Cavalli,_Istituto_oncologic...</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>378</td>
	<td>989.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>378</td>
	<td>Excerpta_Medica</td>
	<td>Excerpta Medica</td>
	<td>379</td>
	<td>991.0</td>
	<td>30.0</td>
	<td>Elsevier</td>
	<td>us</td>
	<td>commercial_publisher</td>
	<td>http://www.elsevier.com/</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>379</td>
	<td>Generative_Grammar_Group_of_the_Department_of_...</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>380</td>
	<td>994.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>380</td>
	<td>999999</td>
	<td>UNKNOWN</td>
	<td>999999</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	</tbody>
	</table>
	<p>381 rows × 10 columns</p>
	</div>




	```python
	# renommage names
	del publisher['publisher_id']
	del publisher['publisher_id_sherpa']
	del publisher['type']
	publisher = publisher.rename(columns = {'url' : 'website_issn_journal'})
	publisher
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>name_issn</th>
	<th>id</th>
	<th>journal</th>
	<th>name_sherpa</th>
	<th>iso_code</th>
	<th>website_sherpa</th>
	<th>website_issn_journal</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>Revue Médicale Suisse</td>
	<td>1</td>
	<td>1.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>1</td>
	<td>American Physical Society</td>
	<td>2</td>
	<td>2.0</td>
	<td>American Physical Society</td>
	<td>us</td>
	<td>http://www.aps.org/</td>
	<td>http://prl.aps.org/</td>
	</tr>
	<tr>
	<td>2</td>
	<td>Public Library of Science</td>
	<td>3</td>
	<td>3.0</td>
	<td>Public Library of Science</td>
	<td>us</td>
	<td>http://www.plos.org/</td>
	<td>http://www.plosone.org/</td>
	</tr>
	<tr>
	<td>3</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>4</td>
	<td>4.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>4</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>5</td>
	<td>4.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>376</td>
	<td>Tipografia La Commerciale</td>
	<td>377</td>
	<td>987.0</td>
	<td>Springer</td>
	<td>gb</td>
	<td>https://www.springernature.com/gp/products/jou...</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>377</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>378</td>
	<td>989.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>378</td>
	<td>Excerpta Medica</td>
	<td>379</td>
	<td>991.0</td>
	<td>Elsevier</td>
	<td>us</td>
	<td>http://www.elsevier.com/</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>379</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>380</td>
	<td>994.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>380</td>
	<td>UNKNOWN</td>
	<td>999999</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	</tr>
	</tbody>
	</table>
	<p>381 rows × 7 columns</p>
	</div>




	```python
	# ajout des champs vides des vides et int
	publisher['city'] = ''
	publisher['state'] = ''
	publisher['oa_policies'] = ''
	publisher['starting_year'] = 0
	publisher
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>name_issn</th>
	<th>id</th>
	<th>journal</th>
	<th>name_sherpa</th>
	<th>iso_code</th>
	<th>website_sherpa</th>
	<th>website_issn_journal</th>
	<th>city</th>
	<th>state</th>
	<th>oa_policies</th>
	<th>starting_year</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>Revue Médicale Suisse</td>
	<td>1</td>
	<td>1.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>1</td>
	<td>American Physical Society</td>
	<td>2</td>
	<td>2.0</td>
	<td>American Physical Society</td>
	<td>us</td>
	<td>http://www.aps.org/</td>
	<td>http://prl.aps.org/</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>2</td>
	<td>Public Library of Science</td>
	<td>3</td>
	<td>3.0</td>
	<td>Public Library of Science</td>
	<td>us</td>
	<td>http://www.plos.org/</td>
	<td>http://www.plosone.org/</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>3</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>4</td>
	<td>4.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>4</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>5</td>
	<td>4.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>376</td>
	<td>Tipografia La Commerciale</td>
	<td>377</td>
	<td>987.0</td>
	<td>Springer</td>
	<td>gb</td>
	<td>https://www.springernature.com/gp/products/jou...</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>377</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>378</td>
	<td>989.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>378</td>
	<td>Excerpta Medica</td>
	<td>379</td>
	<td>991.0</td>
	<td>Elsevier</td>
	<td>us</td>
	<td>http://www.elsevier.com/</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>379</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>380</td>
	<td>994.0</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>380</td>
	<td>UNKNOWN</td>
	<td>999999</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	</tbody>
	</table>
	<p>381 rows × 11 columns</p>
	</div>




	```python
	# iso_code en majuscules
	publisher['iso_code'] = publisher['iso_code'].str.upper()
	# ajout de la valeur pour unknown
	publisher['iso_code'] = publisher['iso_code'].fillna('__')
	publisher
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>name_issn</th>
	<th>id</th>
	<th>journal</th>
	<th>name_sherpa</th>
	<th>iso_code</th>
	<th>website_sherpa</th>
	<th>website_issn_journal</th>
	<th>city</th>
	<th>state</th>
	<th>oa_policies</th>
	<th>starting_year</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>Revue Médicale Suisse</td>
	<td>1</td>
	<td>1.0</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>1</td>
	<td>American Physical Society</td>
	<td>2</td>
	<td>2.0</td>
	<td>American Physical Society</td>
	<td>US</td>
	<td>http://www.aps.org/</td>
	<td>http://prl.aps.org/</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>2</td>
	<td>Public Library of Science</td>
	<td>3</td>
	<td>3.0</td>
	<td>Public Library of Science</td>
	<td>US</td>
	<td>http://www.plos.org/</td>
	<td>http://www.plosone.org/</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>3</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>4</td>
	<td>4.0</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>4</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>5</td>
	<td>4.0</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>376</td>
	<td>Tipografia La Commerciale</td>
	<td>377</td>
	<td>987.0</td>
	<td>Springer</td>
	<td>GB</td>
	<td>https://www.springernature.com/gp/products/jou...</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>377</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>378</td>
	<td>989.0</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>378</td>
	<td>Excerpta Medica</td>
	<td>379</td>
	<td>991.0</td>
	<td>Elsevier</td>
	<td>US</td>
	<td>http://www.elsevier.com/</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>379</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>380</td>
	<td>994.0</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	<tr>
	<td>380</td>
	<td>UNKNOWN</td>
	<td>999999</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	</tr>
	</tbody>
	</table>
	<p>381 rows × 11 columns</p>
	</div>




	```python
	# merge avec countries
	country = pd.read_csv('sample/country.tsv', usecols=('iso_code', 'id'), encoding='utf-8', header=0, sep='\t')
	country
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>iso_code</th>
	<th>id</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>AF</td>
	<td>1</td>
	</tr>
	<tr>
	<td>1</td>
	<td>AL</td>
	<td>2</td>
	</tr>
	<tr>
	<td>2</td>
	<td>DZ</td>
	<td>3</td>
	</tr>
	<tr>
	<td>3</td>
	<td>AS</td>
	<td>4</td>
	</tr>
	<tr>
	<td>4</td>
	<td>AD</td>
	<td>5</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>246</td>
	<td>ZM</td>
	<td>247</td>
	</tr>
	<tr>
	<td>247</td>
	<td>ZW</td>
	<td>248</td>
	</tr>
	<tr>
	<td>248</td>
	<td>AX</td>
	<td>249</td>
	</tr>
	<tr>
	<td>249</td>
	<td>OI</td>
	<td>250</td>
	</tr>
	<tr>
	<td>250</td>
	<td>__</td>
	<td>999999</td>
	</tr>
	</tbody>
	</table>
	<p>251 rows × 2 columns</p>
	</div>




	```python
	country = country.rename(columns={'id': 'country'})
	country
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>iso_code</th>
	<th>country</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>AF</td>
	<td>1</td>
	</tr>
	<tr>
	<td>1</td>
	<td>AL</td>
	<td>2</td>
	</tr>
	<tr>
	<td>2</td>
	<td>DZ</td>
	<td>3</td>
	</tr>
	<tr>
	<td>3</td>
	<td>AS</td>
	<td>4</td>
	</tr>
	<tr>
	<td>4</td>
	<td>AD</td>
	<td>5</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>246</td>
	<td>ZM</td>
	<td>247</td>
	</tr>
	<tr>
	<td>247</td>
	<td>ZW</td>
	<td>248</td>
	</tr>
	<tr>
	<td>248</td>
	<td>AX</td>
	<td>249</td>
	</tr>
	<tr>
	<td>249</td>
	<td>OI</td>
	<td>250</td>
	</tr>
	<tr>
	<td>250</td>
	<td>__</td>
	<td>999999</td>
	</tr>
	</tbody>
	</table>
	<p>251 rows × 2 columns</p>
	</div>




	```python
	publisher = pd.merge(publisher, country, on='iso_code', how='left')
	publisher
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>name_issn</th>
	<th>id</th>
	<th>journal</th>
	<th>name_sherpa</th>
	<th>iso_code</th>
	<th>website_sherpa</th>
	<th>website_issn_journal</th>
	<th>city</th>
	<th>state</th>
	<th>oa_policies</th>
	<th>starting_year</th>
	<th>country</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>Revue Médicale Suisse</td>
	<td>1</td>
	<td>1.0</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>999999</td>
	</tr>
	<tr>
	<td>1</td>
	<td>American Physical Society</td>
	<td>2</td>
	<td>2.0</td>
	<td>American Physical Society</td>
	<td>US</td>
	<td>http://www.aps.org/</td>
	<td>http://prl.aps.org/</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>236</td>
	</tr>
	<tr>
	<td>2</td>
	<td>Public Library of Science</td>
	<td>3</td>
	<td>3.0</td>
	<td>Public Library of Science</td>
	<td>US</td>
	<td>http://www.plos.org/</td>
	<td>http://www.plosone.org/</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>236</td>
	</tr>
	<tr>
	<td>3</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>4</td>
	<td>4.0</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>999999</td>
	</tr>
	<tr>
	<td>4</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>5</td>
	<td>4.0</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>999999</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>376</td>
	<td>Tipografia La Commerciale</td>
	<td>377</td>
	<td>987.0</td>
	<td>Springer</td>
	<td>GB</td>
	<td>https://www.springernature.com/gp/products/jou...</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>234</td>
	</tr>
	<tr>
	<td>377</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>378</td>
	<td>989.0</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>999999</td>
	</tr>
	<tr>
	<td>378</td>
	<td>Excerpta Medica</td>
	<td>379</td>
	<td>991.0</td>
	<td>Elsevier</td>
	<td>US</td>
	<td>http://www.elsevier.com/</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>236</td>
	</tr>
	<tr>
	<td>379</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>380</td>
	<td>994.0</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>999999</td>
	</tr>
	<tr>
	<td>380</td>
	<td>UNKNOWN</td>
	<td>999999</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>999999</td>
	</tr>
	</tbody>
	</table>
	<p>381 rows × 12 columns</p>
	</div>




	```python
	# garder sherpa puis issn.org
	publisher.loc[publisher['name_sherpa'].notna(), 'name'] = publisher['name_sherpa']
	publisher.loc[publisher['name_sherpa'].isna(), 'name'] = publisher['name_issn']
	publisher.loc[publisher['website_sherpa'].notna(), 'website'] = publisher['website_sherpa']
	publisher.loc[publisher['website_sherpa'].isna(), 'website'] = publisher['website_issn_journal']
	publisher
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>name_issn</th>
	<th>id</th>
	<th>journal</th>
	<th>name_sherpa</th>
	<th>iso_code</th>
	<th>website_sherpa</th>
	<th>website_issn_journal</th>
	<th>city</th>
	<th>state</th>
	<th>oa_policies</th>
	<th>starting_year</th>
	<th>country</th>
	<th>name</th>
	<th>website</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>Revue Médicale Suisse</td>
	<td>1</td>
	<td>1.0</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>999999</td>
	<td>Revue Médicale Suisse</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>1</td>
	<td>American Physical Society</td>
	<td>2</td>
	<td>2.0</td>
	<td>American Physical Society</td>
	<td>US</td>
	<td>http://www.aps.org/</td>
	<td>http://prl.aps.org/</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>236</td>
	<td>American Physical Society</td>
	<td>http://www.aps.org/</td>
	</tr>
	<tr>
	<td>2</td>
	<td>Public Library of Science</td>
	<td>3</td>
	<td>3.0</td>
	<td>Public Library of Science</td>
	<td>US</td>
	<td>http://www.plos.org/</td>
	<td>http://www.plosone.org/</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>236</td>
	<td>Public Library of Science</td>
	<td>http://www.plos.org/</td>
	</tr>
	<tr>
	<td>3</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>4</td>
	<td>4.0</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>999999</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>4</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>5</td>
	<td>4.0</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>999999</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>376</td>
	<td>Tipografia La Commerciale</td>
	<td>377</td>
	<td>987.0</td>
	<td>Springer</td>
	<td>GB</td>
	<td>https://www.springernature.com/gp/products/jou...</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>234</td>
	<td>Springer</td>
	<td>https://www.springernature.com/gp/products/jou...</td>
	</tr>
	<tr>
	<td>377</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>378</td>
	<td>989.0</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>999999</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>378</td>
	<td>Excerpta Medica</td>
	<td>379</td>
	<td>991.0</td>
	<td>Elsevier</td>
	<td>US</td>
	<td>http://www.elsevier.com/</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>236</td>
	<td>Elsevier</td>
	<td>http://www.elsevier.com/</td>
	</tr>
	<tr>
	<td>379</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>380</td>
	<td>994.0</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>999999</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>380</td>
	<td>UNKNOWN</td>
	<td>999999</td>
	<td>NaN</td>
	<td>NaN</td>
	<td>__</td>
	<td>NaN</td>
	<td>NaN</td>
	<td></td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>999999</td>
	<td>UNKNOWN</td>
	<td>NaN</td>
	</tr>
	</tbody>
	</table>
	<p>381 rows × 14 columns</p>
	</div>




	```python
	# garder les champs utiles pour l'éditeur
	publisher_export = publisher[['id', 'name', 'country', 'city', 'state', 'starting_year', 'website', 'oa_policies']]
	```


	```python
	# supprimer les doublons
	publisher_export = publisher_export.drop_duplicates(subset='id')
	publisher_export
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>id</th>
	<th>name</th>
	<th>country</th>
	<th>city</th>
	<th>state</th>
	<th>starting_year</th>
	<th>website</th>
	<th>oa_policies</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>1</td>
	<td>Revue Médicale Suisse</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>NaN</td>
	<td></td>
	</tr>
	<tr>
	<td>1</td>
	<td>2</td>
	<td>American Physical Society</td>
	<td>236</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>http://www.aps.org/</td>
	<td></td>
	</tr>
	<tr>
	<td>2</td>
	<td>3</td>
	<td>Public Library of Science</td>
	<td>236</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>http://www.plos.org/</td>
	<td></td>
	</tr>
	<tr>
	<td>3</td>
	<td>4</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>NaN</td>
	<td></td>
	</tr>
	<tr>
	<td>4</td>
	<td>5</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>NaN</td>
	<td></td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>376</td>
	<td>377</td>
	<td>Springer</td>
	<td>234</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>https://www.springernature.com/gp/products/jou...</td>
	<td></td>
	</tr>
	<tr>
	<td>377</td>
	<td>378</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>NaN</td>
	<td></td>
	</tr>
	<tr>
	<td>378</td>
	<td>379</td>
	<td>Elsevier</td>
	<td>236</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>http://www.elsevier.com/</td>
	<td></td>
	</tr>
	<tr>
	<td>379</td>
	<td>380</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>NaN</td>
	<td></td>
	</tr>
	<tr>
	<td>380</td>
	<td>999999</td>
	<td>UNKNOWN</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>NaN</td>
	<td></td>
	</tr>
	</tbody>
	</table>
	<p>381 rows × 8 columns</p>
	</div>




	```python
	# remplacement des vides et id à int
	publisher_export['website'] = publisher_export['website'].fillna('')
	publisher_export
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>id</th>
	<th>name</th>
	<th>country</th>
	<th>city</th>
	<th>state</th>
	<th>starting_year</th>
	<th>website</th>
	<th>oa_policies</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>1</td>
	<td>Revue Médicale Suisse</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	</tr>
	<tr>
	<td>1</td>
	<td>2</td>
	<td>American Physical Society</td>
	<td>236</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>http://www.aps.org/</td>
	<td></td>
	</tr>
	<tr>
	<td>2</td>
	<td>3</td>
	<td>Public Library of Science</td>
	<td>236</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>http://www.plos.org/</td>
	<td></td>
	</tr>
	<tr>
	<td>3</td>
	<td>4</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	</tr>
	<tr>
	<td>4</td>
	<td>5</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>376</td>
	<td>377</td>
	<td>Springer</td>
	<td>234</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>https://www.springernature.com/gp/products/jou...</td>
	<td></td>
	</tr>
	<tr>
	<td>377</td>
	<td>378</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	</tr>
	<tr>
	<td>378</td>
	<td>379</td>
	<td>Elsevier</td>
	<td>236</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>http://www.elsevier.com/</td>
	<td></td>
	</tr>
	<tr>
	<td>379</td>
	<td>380</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	</tr>
	<tr>
	<td>380</td>
	<td>999999</td>
	<td>UNKNOWN</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	</tr>
	</tbody>
	</table>
	<p>381 rows × 8 columns</p>
	</div>




	```python
	# merge pour avoir les titres
	publisher_ids_dedup = pd.merge(publisher_ids_dedup, publisher_export[['id', 'name']], on='id', how='left')
	publisher_ids_dedup
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>journal</th>
	<th>id</th>
	<th>name</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>1</td>
	<td>1</td>
	<td>Revue Médicale Suisse</td>
	</tr>
	<tr>
	<td>1</td>
	<td>2</td>
	<td>2</td>
	<td>American Physical Society</td>
	</tr>
	<tr>
	<td>2</td>
	<td>3</td>
	<td>3</td>
	<td>Public Library of Science</td>
	</tr>
	<tr>
	<td>3</td>
	<td>4</td>
	<td>4</td>
	<td>The Global Studies Institute de l’Université d...</td>
	</tr>
	<tr>
	<td>4</td>
	<td>4</td>
	<td>5</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>375</td>
	<td>987</td>
	<td>376</td>
	<td>Springer</td>
	</tr>
	<tr>
	<td>376</td>
	<td>987</td>
	<td>377</td>
	<td>Springer</td>
	</tr>
	<tr>
	<td>377</td>
	<td>989</td>
	<td>378</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	</tr>
	<tr>
	<td>378</td>
	<td>991</td>
	<td>379</td>
	<td>Elsevier</td>
	</tr>
	<tr>
	<td>379</td>
	<td>994</td>
	<td>380</td>
	<td>Generative Grammar Group of the Department of ...</td>
	</tr>
	</tbody>
	</table>
	<p>380 rows × 3 columns</p>
	</div>




	```python
	# garder les ids avant le dédoublonage pour la correction du publisher_ids_dedup
	publisher_ids_dedup = publisher_ids_dedup.rename(columns = {'id': 'publisher_av_dedup'})
	publisher_ids_dedup
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>journal</th>
	<th>publisher_av_dedup</th>
	<th>name</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>1</td>
	<td>1</td>
	<td>Revue Médicale Suisse</td>
	</tr>
	<tr>
	<td>1</td>
	<td>2</td>
	<td>2</td>
	<td>American Physical Society</td>
	</tr>
	<tr>
	<td>2</td>
	<td>3</td>
	<td>3</td>
	<td>Public Library of Science</td>
	</tr>
	<tr>
	<td>3</td>
	<td>4</td>
	<td>4</td>
	<td>The Global Studies Institute de l’Université d...</td>
	</tr>
	<tr>
	<td>4</td>
	<td>4</td>
	<td>5</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>375</td>
	<td>987</td>
	<td>376</td>
	<td>Springer</td>
	</tr>
	<tr>
	<td>376</td>
	<td>987</td>
	<td>377</td>
	<td>Springer</td>
	</tr>
	<tr>
	<td>377</td>
	<td>989</td>
	<td>378</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	</tr>
	<tr>
	<td>378</td>
	<td>991</td>
	<td>379</td>
	<td>Elsevier</td>
	</tr>
	<tr>
	<td>379</td>
	<td>994</td>
	<td>380</td>
	<td>Generative Grammar Group of the Department of ...</td>
	</tr>
	</tbody>
	</table>
	<p>380 rows × 3 columns</p>
	</div>




	```python
	publisher_export_dedup = publisher_export.drop_duplicates(subset='name')
	publisher_export_dedup
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>id</th>
	<th>name</th>
	<th>country</th>
	<th>city</th>
	<th>state</th>
	<th>starting_year</th>
	<th>website</th>
	<th>oa_policies</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>1</td>
	<td>Revue Médicale Suisse</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	</tr>
	<tr>
	<td>1</td>
	<td>2</td>
	<td>American Physical Society</td>
	<td>236</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>http://www.aps.org/</td>
	<td></td>
	</tr>
	<tr>
	<td>2</td>
	<td>3</td>
	<td>Public Library of Science</td>
	<td>236</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>http://www.plos.org/</td>
	<td></td>
	</tr>
	<tr>
	<td>3</td>
	<td>4</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	</tr>
	<tr>
	<td>4</td>
	<td>5</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>371</td>
	<td>372</td>
	<td>[American Medical Association]</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>http://archneur.jamanetwork.com/issues.aspx</td>
	<td></td>
	</tr>
	<tr>
	<td>374</td>
	<td>375</td>
	<td>Société botanique de Genève</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	</tr>
	<tr>
	<td>377</td>
	<td>378</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	</tr>
	<tr>
	<td>379</td>
	<td>380</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	</tr>
	<tr>
	<td>380</td>
	<td>999999</td>
	<td>UNKNOWN</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	</tr>
	</tbody>
	</table>
	<p>196 rows × 8 columns</p>
	</div>




	```python
	del publisher_export_dedup['id']
	# convertir l'index en id
	publisher_export_dedup = publisher_export_dedup.reset_index()
	# ajout de l'id avec l'index + 1
	publisher_export_dedup['id'] = publisher_export_dedup['index'] + 1
	del publisher_export_dedup['index']
	publisher_export_dedup
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>name</th>
	<th>country</th>
	<th>city</th>
	<th>state</th>
	<th>starting_year</th>
	<th>website</th>
	<th>oa_policies</th>
	<th>id</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>Revue Médicale Suisse</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	<td>1</td>
	</tr>
	<tr>
	<td>1</td>
	<td>American Physical Society</td>
	<td>236</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>http://www.aps.org/</td>
	<td></td>
	<td>2</td>
	</tr>
	<tr>
	<td>2</td>
	<td>Public Library of Science</td>
	<td>236</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>http://www.plos.org/</td>
	<td></td>
	<td>3</td>
	</tr>
	<tr>
	<td>3</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	<td>4</td>
	</tr>
	<tr>
	<td>4</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	<td>5</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>191</td>
	<td>[American Medical Association]</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>http://archneur.jamanetwork.com/issues.aspx</td>
	<td></td>
	<td>372</td>
	</tr>
	<tr>
	<td>192</td>
	<td>Société botanique de Genève</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	<td>375</td>
	</tr>
	<tr>
	<td>193</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	<td>378</td>
	</tr>
	<tr>
	<td>194</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	<td>380</td>
	</tr>
	<tr>
	<td>195</td>
	<td>UNKNOWN</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	<td>381</td>
	</tr>
	</tbody>
	</table>
	<p>196 rows × 8 columns</p>
	</div>




	```python
	del publisher_export_dedup['id']
	# convertir l'index en id
	publisher_export_dedup = publisher_export_dedup.reset_index()
	# ajout de l'id avec l'index + 1
	publisher_export_dedup['id'] = publisher_export_dedup['index'] + 1
	del publisher_export_dedup['index']
	publisher_export_dedup
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>name</th>
	<th>country</th>
	<th>city</th>
	<th>state</th>
	<th>starting_year</th>
	<th>website</th>
	<th>oa_policies</th>
	<th>id</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>Revue Médicale Suisse</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	<td>1</td>
	</tr>
	<tr>
	<td>1</td>
	<td>American Physical Society</td>
	<td>236</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>http://www.aps.org/</td>
	<td></td>
	<td>2</td>
	</tr>
	<tr>
	<td>2</td>
	<td>Public Library of Science</td>
	<td>236</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>http://www.plos.org/</td>
	<td></td>
	<td>3</td>
	</tr>
	<tr>
	<td>3</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	<td>4</td>
	</tr>
	<tr>
	<td>4</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	<td>5</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>191</td>
	<td>[American Medical Association]</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td>http://archneur.jamanetwork.com/issues.aspx</td>
	<td></td>
	<td>192</td>
	</tr>
	<tr>
	<td>192</td>
	<td>Société botanique de Genève</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	<td>193</td>
	</tr>
	<tr>
	<td>193</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	<td>194</td>
	</tr>
	<tr>
	<td>194</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	<td>195</td>
	</tr>
	<tr>
	<td>195</td>
	<td>UNKNOWN</td>
	<td>999999</td>
	<td></td>
	<td></td>
	<td>0</td>
	<td></td>
	<td></td>
	<td>196</td>
	</tr>
	</tbody>
	</table>
	<p>196 rows × 8 columns</p>
	</div>




	```python
	# merge avec les ids d'avant Sherpa
	publisher_ids_dedup = pd.merge(publisher_ids_dedup, publisher_export_dedup[['id', 'name']], on='name', how='left')
	publisher_ids_dedup = publisher_ids_dedup.rename(columns = {'id': 'publisher'})
	publisher_ids_dedup = publisher_ids_dedup.rename(columns = {'journal': 'id'})
	publisher_ids_dedup
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>id</th>
	<th>publisher_av_dedup</th>
	<th>name</th>
	<th>publisher</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>1</td>
	<td>1</td>
	<td>Revue Médicale Suisse</td>
	<td>1</td>
	</tr>
	<tr>
	<td>1</td>
	<td>2</td>
	<td>2</td>
	<td>American Physical Society</td>
	<td>2</td>
	</tr>
	<tr>
	<td>2</td>
	<td>3</td>
	<td>3</td>
	<td>Public Library of Science</td>
	<td>3</td>
	</tr>
	<tr>
	<td>3</td>
	<td>4</td>
	<td>4</td>
	<td>The Global Studies Institute de l’Université d...</td>
	<td>4</td>
	</tr>
	<tr>
	<td>4</td>
	<td>4</td>
	<td>5</td>
	<td>Universitat de València, Departamento de Teorí...</td>
	<td>5</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>375</td>
	<td>987</td>
	<td>376</td>
	<td>Springer</td>
	<td>45</td>
	</tr>
	<tr>
	<td>376</td>
	<td>987</td>
	<td>377</td>
	<td>Springer</td>
	<td>45</td>
	</tr>
	<tr>
	<td>377</td>
	<td>989</td>
	<td>378</td>
	<td>Red.: Prof. Dr. F. Cavalli, Istituto oncologic...</td>
	<td>194</td>
	</tr>
	<tr>
	<td>378</td>
	<td>991</td>
	<td>379</td>
	<td>Elsevier</td>
	<td>11</td>
	</tr>
	<tr>
	<td>379</td>
	<td>994</td>
	<td>380</td>
	<td>Generative Grammar Group of the Department of ...</td>
	<td>195</td>
	</tr>
	</tbody>
	</table>
	<p>380 rows × 4 columns</p>
	</div>




	```python
	# concat valeurs avec même id
	del publisher_ids_dedup['publisher_av_dedup']
	del publisher_ids_dedup['name']
	publisher_ids_dedup['publisher'] = publisher_ids_dedup['publisher'].astype(str)
	publisher_ids_dedup_grouped = publisher_ids_dedup.groupby('id').agg({'publisher': lambda x: ', '.join(x)})
	publisher_ids_dedup_grouped
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>publisher</th>
	</tr>
	<tr>
	<th>id</th>
	<th></th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>1</td>
	<td>1</td>
	</tr>
	<tr>
	<td>2</td>
	<td>2</td>
	</tr>
	<tr>
	<td>3</td>
	<td>3</td>
	</tr>
	<tr>
	<td>4</td>
	<td>4, 5</td>
	</tr>
	<tr>
	<td>5</td>
	<td>2</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>986</td>
	<td>193</td>
	</tr>
	<tr>
	<td>987</td>
	<td>45, 45</td>
	</tr>
	<tr>
	<td>989</td>
	<td>194</td>
	</tr>
	<tr>
	<td>991</td>
	<td>11</td>
	</tr>
	<tr>
	<td>994</td>
	<td>195</td>
	</tr>
	</tbody>
	</table>
	<p>366 rows × 1 columns</p>
	</div>




	```python
	# modifs dans les journaux
	journal = pd.read_csv('sample/journal_fin_sherpa.tsv', encoding='utf-8', header=0, sep='\t')
	journal
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>id</th>
	<th>name</th>
	<th>name_short_iso_4</th>
	<th>starting_year</th>
	<th>end_year</th>
	<th>website</th>
	<th>country</th>
	<th>language</th>
	<th>publisher</th>
	<th>doaj_seal</th>
	<th>doaj_status</th>
	<th>lockss</th>
	<th>portico</th>
	<th>nlch</th>
	<th>qoam_av_score</th>
	<th>oa_status</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>1</td>
	<td>Revue médicale suisse</td>
	<td>Rev. méd. suisse</td>
	<td>2005</td>
	<td>9999</td>
	<td>NaN</td>
	<td>215</td>
	<td>138</td>
	<td>1</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>NaN</td>
	<td>1</td>
	</tr>
	<tr>
	<td>1</td>
	<td>2</td>
	<td>Physical Review Letters</td>
	<td>Phys. rev. lett. (Print)</td>
	<td>1958</td>
	<td>9999</td>
	<td>http://prl.aps.org/</td>
	<td>236</td>
	<td>124</td>
	<td>2</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>0</td>
	<td>NaN</td>
	<td>2</td>
	</tr>
	<tr>
	<td>2</td>
	<td>3</td>
	<td>PLoS ONE</td>
	<td>NaN</td>
	<td>2006</td>
	<td>9999</td>
	<td>http://www.plosone.org/</td>
	<td>236</td>
	<td>124</td>
	<td>3</td>
	<td>1</td>
	<td>1</td>
	<td>1</td>
	<td>0</td>
	<td>0</td>
	<td>4.035714</td>
	<td>5</td>
	</tr>
	<tr>
	<td>3</td>
	<td>4</td>
	<td>EU-topías</td>
	<td>EU-topías</td>
	<td>2011</td>
	<td>9999</td>
	<td>NaN</td>
	<td>209</td>
	<td>124, 138, 402, 292</td>
	<td>4, 5</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>NaN</td>
	<td>1</td>
	</tr>
	<tr>
	<td>4</td>
	<td>5</td>
	<td>Physical review B: Condensed matter and materi...</td>
	<td>Phys. rev., B, Condens. matter mater. phys.</td>
	<td>1998</td>
	<td>2015</td>
	<td>http://journals.aps.org/prb/</td>
	<td>236</td>
	<td>124</td>
	<td>6</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>0</td>
	<td>NaN</td>
	<td>2</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>906</td>
	<td>997</td>
	<td>Smart Materials and Structures</td>
	<td>Smart mater. struct. (Print)</td>
	<td>1992</td>
	<td>9999</td>
	<td>http://iopscience.iop.org/0964-1726</td>
	<td>234</td>
	<td>124</td>
	<td>47</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>0</td>
	<td>NaN</td>
	<td>2</td>
	</tr>
	<tr>
	<td>907</td>
	<td>998</td>
	<td>Journal of Pediatric Surgery</td>
	<td>J. pediatr. surg. (Print)</td>
	<td>1966</td>
	<td>9999</td>
	<td>http://www.jpedsurg.org/</td>
	<td>236</td>
	<td>124</td>
	<td>75</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>0</td>
	<td>NaN</td>
	<td>2</td>
	</tr>
	<tr>
	<td>908</td>
	<td>999</td>
	<td>Probability Theory and Related Fields</td>
	<td>Probab. theory relat. fields (Internet)</td>
	<td>uuuu</td>
	<td>9999</td>
	<td>http://www.springerlink.com/content/100451/?p=...</td>
	<td>83</td>
	<td>124</td>
	<td>8</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>1</td>
	<td>1</td>
	<td>NaN</td>
	<td>2</td>
	</tr>
	<tr>
	<td>909</td>
	<td>1000</td>
	<td>Renewable Energy</td>
	<td>Renew. energy</td>
	<td>1991</td>
	<td>9999</td>
	<td>http://www.elsevier.com/wps/product/cws_home/9...</td>
	<td>234</td>
	<td>124</td>
	<td>119</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>0</td>
	<td>NaN</td>
	<td>2</td>
	</tr>
	<tr>
	<td>910</td>
	<td>1001</td>
	<td>Journal of applied physiology: respiratory, en...</td>
	<td>J. appl. physiol.: respir., environ. exercise ...</td>
	<td>1977</td>
	<td>1984</td>
	<td>https://www.physiology.org/journal/jappl</td>
	<td>236</td>
	<td>124</td>
	<td>217</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>NaN</td>
	<td>1</td>
	</tr>
	</tbody>
	</table>
	<p>911 rows × 16 columns</p>
	</div>




	```python
	# merge avec les journaux journal_fin_sherpa
	journal = pd.merge(journal, publisher_ids_dedup_grouped, on='id', how='left')
	journal
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>id</th>
	<th>name</th>
	<th>name_short_iso_4</th>
	<th>starting_year</th>
	<th>end_year</th>
	<th>website</th>
	<th>country</th>
	<th>language</th>
	<th>publisher_x</th>
	<th>doaj_seal</th>
	<th>doaj_status</th>
	<th>lockss</th>
	<th>portico</th>
	<th>nlch</th>
	<th>qoam_av_score</th>
	<th>oa_status</th>
	<th>publisher_y</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>1</td>
	<td>Revue médicale suisse</td>
	<td>Rev. méd. suisse</td>
	<td>2005</td>
	<td>9999</td>
	<td>NaN</td>
	<td>215</td>
	<td>138</td>
	<td>1</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>NaN</td>
	<td>1</td>
	<td>1</td>
	</tr>
	<tr>
	<td>1</td>
	<td>2</td>
	<td>Physical Review Letters</td>
	<td>Phys. rev. lett. (Print)</td>
	<td>1958</td>
	<td>9999</td>
	<td>http://prl.aps.org/</td>
	<td>236</td>
	<td>124</td>
	<td>2</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>0</td>
	<td>NaN</td>
	<td>2</td>
	<td>2</td>
	</tr>
	<tr>
	<td>2</td>
	<td>3</td>
	<td>PLoS ONE</td>
	<td>NaN</td>
	<td>2006</td>
	<td>9999</td>
	<td>http://www.plosone.org/</td>
	<td>236</td>
	<td>124</td>
	<td>3</td>
	<td>1</td>
	<td>1</td>
	<td>1</td>
	<td>0</td>
	<td>0</td>
	<td>4.035714</td>
	<td>5</td>
	<td>3</td>
	</tr>
	<tr>
	<td>3</td>
	<td>4</td>
	<td>EU-topías</td>
	<td>EU-topías</td>
	<td>2011</td>
	<td>9999</td>
	<td>NaN</td>
	<td>209</td>
	<td>124, 138, 402, 292</td>
	<td>4, 5</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>NaN</td>
	<td>1</td>
	<td>4, 5</td>
	</tr>
	<tr>
	<td>4</td>
	<td>5</td>
	<td>Physical review B: Condensed matter and materi...</td>
	<td>Phys. rev., B, Condens. matter mater. phys.</td>
	<td>1998</td>
	<td>2015</td>
	<td>http://journals.aps.org/prb/</td>
	<td>236</td>
	<td>124</td>
	<td>6</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>0</td>
	<td>NaN</td>
	<td>2</td>
	<td>2</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>906</td>
	<td>997</td>
	<td>Smart Materials and Structures</td>
	<td>Smart mater. struct. (Print)</td>
	<td>1992</td>
	<td>9999</td>
	<td>http://iopscience.iop.org/0964-1726</td>
	<td>234</td>
	<td>124</td>
	<td>47</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>0</td>
	<td>NaN</td>
	<td>2</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>907</td>
	<td>998</td>
	<td>Journal of Pediatric Surgery</td>
	<td>J. pediatr. surg. (Print)</td>
	<td>1966</td>
	<td>9999</td>
	<td>http://www.jpedsurg.org/</td>
	<td>236</td>
	<td>124</td>
	<td>75</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>0</td>
	<td>NaN</td>
	<td>2</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>908</td>
	<td>999</td>
	<td>Probability Theory and Related Fields</td>
	<td>Probab. theory relat. fields (Internet)</td>
	<td>uuuu</td>
	<td>9999</td>
	<td>http://www.springerlink.com/content/100451/?p=...</td>
	<td>83</td>
	<td>124</td>
	<td>8</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>1</td>
	<td>1</td>
	<td>NaN</td>
	<td>2</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>909</td>
	<td>1000</td>
	<td>Renewable Energy</td>
	<td>Renew. energy</td>
	<td>1991</td>
	<td>9999</td>
	<td>http://www.elsevier.com/wps/product/cws_home/9...</td>
	<td>234</td>
	<td>124</td>
	<td>119</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>0</td>
	<td>NaN</td>
	<td>2</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>910</td>
	<td>1001</td>
	<td>Journal of applied physiology: respiratory, en...</td>
	<td>J. appl. physiol.: respir., environ. exercise ...</td>
	<td>1977</td>
	<td>1984</td>
	<td>https://www.physiology.org/journal/jappl</td>
	<td>236</td>
	<td>124</td>
	<td>217</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>NaN</td>
	<td>1</td>
	<td>NaN</td>
	</tr>
	</tbody>
	</table>
	<p>911 rows × 17 columns</p>
	</div>




	```python
	del journal['publisher_x']
	journal = journal.rename(columns = {'publisher_y': 'publisher'})
	journal
	```




	<div>
	<style scoped>
	.dataframe tbody tr th:only-of-type {
	vertical-align: middle;
	}

	.dataframe tbody tr th {
	vertical-align: top;
	}

	.dataframe thead th {
	text-align: right;
	}
	</style>
	<table border="1" class="dataframe">
	<thead>
	<tr style="text-align: right;">
	<th></th>
	<th>id</th>
	<th>name</th>
	<th>name_short_iso_4</th>
	<th>starting_year</th>
	<th>end_year</th>
	<th>website</th>
	<th>country</th>
	<th>language</th>
	<th>doaj_seal</th>
	<th>doaj_status</th>
	<th>lockss</th>
	<th>portico</th>
	<th>nlch</th>
	<th>qoam_av_score</th>
	<th>oa_status</th>
	<th>publisher</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>0</td>
	<td>1</td>
	<td>Revue médicale suisse</td>
	<td>Rev. méd. suisse</td>
	<td>2005</td>
	<td>9999</td>
	<td>NaN</td>
	<td>215</td>
	<td>138</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>NaN</td>
	<td>1</td>
	<td>1</td>
	</tr>
	<tr>
	<td>1</td>
	<td>2</td>
	<td>Physical Review Letters</td>
	<td>Phys. rev. lett. (Print)</td>
	<td>1958</td>
	<td>9999</td>
	<td>http://prl.aps.org/</td>
	<td>236</td>
	<td>124</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>0</td>
	<td>NaN</td>
	<td>2</td>
	<td>2</td>
	</tr>
	<tr>
	<td>2</td>
	<td>3</td>
	<td>PLoS ONE</td>
	<td>NaN</td>
	<td>2006</td>
	<td>9999</td>
	<td>http://www.plosone.org/</td>
	<td>236</td>
	<td>124</td>
	<td>1</td>
	<td>1</td>
	<td>1</td>
	<td>0</td>
	<td>0</td>
	<td>4.035714</td>
	<td>5</td>
	<td>3</td>
	</tr>
	<tr>
	<td>3</td>
	<td>4</td>
	<td>EU-topías</td>
	<td>EU-topías</td>
	<td>2011</td>
	<td>9999</td>
	<td>NaN</td>
	<td>209</td>
	<td>124, 138, 402, 292</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>NaN</td>
	<td>1</td>
	<td>4, 5</td>
	</tr>
	<tr>
	<td>4</td>
	<td>5</td>
	<td>Physical review B: Condensed matter and materi...</td>
	<td>Phys. rev., B, Condens. matter mater. phys.</td>
	<td>1998</td>
	<td>2015</td>
	<td>http://journals.aps.org/prb/</td>
	<td>236</td>
	<td>124</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>0</td>
	<td>NaN</td>
	<td>2</td>
	<td>2</td>
	</tr>
	<tr>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	<td>...</td>
	</tr>
	<tr>
	<td>906</td>
	<td>997</td>
	<td>Smart Materials and Structures</td>
	<td>Smart mater. struct. (Print)</td>
	<td>1992</td>
	<td>9999</td>
	<td>http://iopscience.iop.org/0964-1726</td>
	<td>234</td>
	<td>124</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>0</td>
	<td>NaN</td>
	<td>2</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>907</td>
	<td>998</td>
	<td>Journal of Pediatric Surgery</td>
	<td>J. pediatr. surg. (Print)</td>
	<td>1966</td>
	<td>9999</td>
	<td>http://www.jpedsurg.org/</td>
	<td>236</td>
	<td>124</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>0</td>
	<td>NaN</td>
	<td>2</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>908</td>
	<td>999</td>
	<td>Probability Theory and Related Fields</td>
	<td>Probab. theory relat. fields (Internet)</td>
	<td>uuuu</td>
	<td>9999</td>
	<td>http://www.springerlink.com/content/100451/?p=...</td>
	<td>83</td>
	<td>124</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>1</td>
	<td>1</td>
	<td>NaN</td>
	<td>2</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>909</td>
	<td>1000</td>
	<td>Renewable Energy</td>
	<td>Renew. energy</td>
	<td>1991</td>
	<td>9999</td>
	<td>http://www.elsevier.com/wps/product/cws_home/9...</td>
	<td>234</td>
	<td>124</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>1</td>
	<td>0</td>
	<td>NaN</td>
	<td>2</td>
	<td>NaN</td>
	</tr>
	<tr>
	<td>910</td>
	<td>1001</td>
	<td>Journal of applied physiology: respiratory, en...</td>
	<td>J. appl. physiol.: respir., environ. exercise ...</td>
	<td>1977</td>
	<td>1984</td>
	<td>https://www.physiology.org/journal/jappl</td>
	<td>236</td>
	<td>124</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>0</td>
	<td>NaN</td>
	<td>1</td>
	<td>NaN</td>
	</tr>
	</tbody>
	</table>
	<p>911 rows × 16 columns</p>
	</div>




	```python
	# esport JSON publisher
	result = journal.to_json(orient='records', force_ascii=False)
	parsed = json.loads(result)
	with open('sample/journal.json', 'w', encoding='utf-8') as file:
	json.dump(parsed, file, indent=2, ensure_ascii=False)
	```


	```python
	# export csv
	journal.to_csv('sample/journal.tsv', sep='\t', encoding='utf-8', index=False)
	```


	```python
	# export excel
	journal.to_excel('sample/journal.xlsx', index=False)
	```


	```python
	# esport JSON publisher
	result = publisher_export_dedup.to_json(orient='records', force_ascii=False)
	parsed = json.loads(result)
	with open('sample/publisher.json', 'w', encoding='utf-8') as file:
	json.dump(parsed, file, indent=2, ensure_ascii=False)
	```


	```python
	# export csv
	publisher_export_dedup.to_csv('sample/publisher.tsv', sep='\t', encoding='utf-8', index=False)
	```


	```python
	# export excel
	publisher_export_dedup.to_excel('sample/publisher.xlsx', index=False)
	```


	```python

	```

07_oacct_sherpa_publishers.mdNo OneTemporaryActions

File Metadata

07_oacct_sherpa_publishers.mdView Options

Event Timeline

07_oacct_sherpa_publishers.md
No OneTemporary
Actions

07_oacct_sherpa_publishers.md
View Options